Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×

System Upgrade on Tue, May 28th, 2024 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at customercare@wspc.com for any enquiries.

SEARCH GUIDE  Download Search Tip PDF File

  • articleNo Access

    Fault Detection for Message Passing Systems

    Many crucial dependable and secure services including atomic commitment, consensus and group membership, and middleware services (such as replica, communication and transaction services) use fault detectors. Through the use of fault detectors, the overlying service can be exempted from failure treatment and synchronization requirements. Fault detection is essential for proving that the services carried out are correct.

    In this paper, we first identify the necessary conditions to detect faults in a message passing system where multiple disjoint paths exist between each pair of endpoints. We then present the first fault detection protocol capable of detecting message meta-data modification in the presence of various message interferences in addition to other faults including omission faults, message replay and spurious messages using disjoint paths, where paths with faults are not known a priori. In addition, it authenticates message origins allowing Sybil attacks to be detected, identifies faulty paths, and classifies faults in the presence of multiple messages sent by various system processes. We establish the completeness and soundness properties of the proposed algorithm, i.e., it detects each considered fault and each detected fault is an actual fault, respectively. We also show that our algorithm does not yield a significant packet size and delay overheads. The algorithm shows the viability of the use of disjoint paths in fault detection.

  • articleNo Access

    ADAPTIVE FAULT-TOLERANT ROUTING IN STAR NETWORKS

    We take advantage of the hierarchical structure of the star graph network to obtain an efficient method for constructing node-disjoint paths between arbitrary pairs of nodes in the network. A distributed fault-tolerant routing algorithm for the star network based on this construction method is then presented and evaluated. The proposed algorithm adapts the routing decisions in response to node failures. Node failure and repair conditions may arise dynamically (at any time) provided that the total number of faulty nodes at any given time is less than the node-connectivity n - 1 of the n-star. When a message is blocked due to faulty components, the source of the message is warned and requested to switch to a different node-disjoint path. The methods used to identify the paths, to propagate failure information back to source nodes, and to switch from a routing path to another incur little communication and computation overhead. We show that if the node failures occur 'reasonably' apart in time, then all messages will be routed on paths of length δ + ε where δ is the minimum distance between the source and the destination and ε is 0, 2, or 4. In the unlikely case where more failures occur in a 'short period', the algorithm still delivers all messages but via possibly longer paths.

  • articleNo Access

    NEARLY OPTIMAL NODE-TO-SET PARALLEL ROUTING IN OTIS NETWORKS

    The node-to-set parallel routing problem for a k-connected network Γ is as follows: given a node s and k other nodes {t1, t2, … , tk} in Γ, find k node-disjoint paths connecting s and ti, for 1 ≤ i ≤ k. From the viewpoint of applications in synthesizing fast and resilient collective communication operations, it is desirable to make the parallel paths as short as possible. Building such paths is a nontrivial problem for a general network. Optical transpose interconnection system (OTIS, also known as swapped) networks, a class of hierarchical structures built of n identical n-node factor networks, are known to be maximally fault-tolerant for any connected factor network, implying that they have maximal connectivity. We propose a general algorithm for the node-to-set parallel routing problem in OTIS/swapped networks that yields paths of length no greater than D + 4 in O(Δ2 + Δf(n)) time, where D and Δ represent the diameter and degree of the OTIS network and O(f(n)) is the time complexity of a shortest-path routing algorithm for the n-node factor network. Our node-to-set routing algorithm is shown to have optimal time complexity for certain OTIS networks of practical interest, including OTIS-Mesh and OTIS-Hypercube.