Loading [MathJax]/jax/output/CommonHTML/jax.js
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×

System Upgrade on Tue, May 28th, 2024 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at customercare@wspc.com for any enquiries.

SEARCH GUIDE  Download Search Tip PDF File

  Bestsellers

  • articleNo Access

    AN ADAPTIVE FAULT-TOLERANT WORMHOLE ROUTING ALGORITHM FOR HYPERCUBES

    In this paper, we present an adaptive fault-tolerant wormhole routing algorithm for hypercubes by using 4 virtual networks. Each node is identified to be in one of the four states: safe, ordinarily unsafe, strongly unsafe, and faulty. Based on the concept of unsafe nodes, we design a routing algorithm for hypercubes that can tolerate at least n-1 faulty nodes and can route a message via a path of length no more than the Hamming distance between the source and destination plus four. Previous algorithms which achieve the same fault tolerant ability need at least 5 virtual channels per physical channel. Simulation results show that our algorithm outperforms previous known results.

  • articleNo Access

    EXPERIENCES WITH HYPERCUBE OPERATING SYSTEM INSTRUMENTATION

    The difficulties in conceptualizing the interactions among a large number of processors make it difficult both to identify the sources of inefficiencies and to determine how a parallel program could be made more efficient. This paper describes an instrumentation system that can trace the execution of distributed memory parallel programs by recording the occurrence of parallel program events. The resulting event traces can be used to compile summary statistics that provide a global view of program performance. In addition, visualization tools permit the graphic display of event traces. Visual presentation of performance data is particularly useful, indeed, necessary for large-scale parallel computers; the enormous volume of performance data mandates visual display.

  • articleNo Access

    PARALLEL ITERATIVE SOLUTION OF LARGE LINEAR SYSTEMS ON HYPERCUBES

    Solving large, linear systems is among the most important and most frequently encountered problems in computational mathematics and computer science. This paper presents efficient parallel Jacobi and Gauss-Seidel algorithms, in spite of the apparent inherent sequentiality of the latter, for the iterative solution of large linear systems on hypercube machines. To evaluate their performance, expressions for the speedup factor of the algorithms are derived. The results show that the hypercubes are highly effective in solving large systems of dense linear algebraic equations. Finally, the suitability of the hypercubes for solving sparse linear systems is discussed.

  • articleNo Access

    GLOBAL SYNCHRONIZATION ALGORITHMS FOR THE INTEL IPSC/860

    In a distributed memory multicomputer that has no global clock, global processor synchronization can only be achieved through software. Global synchronization is used in many applications, including tridiagonal systems solvers, CFD codes, and sequence comparison algorithms. For the Intel iPSC/860 in particular, global synchronization can also be used to ensure the most effective use of the communication network. Two global synchronization algorithms are considered for the iPSC/860: The gsync primitive provided by Intel and the RDS algorithm. Based on the communication model presented here, it is shown that gsync sometimes leaves the processors more poorly synchronized than they were to begin with. It is also shown that interrupts from the node operating system can cause gsync to contend for communication ports with the application code. The RDS algorithm does not have these shortcomings and costs only slightly more than the other algorithms. Measurements of the cost of message shift operations preceded by global synchronization confirm that the RDS algorithm always synchronizes the nodes more precisely than gsync.

  • articleNo Access

    A New Bidirectional Cholesky Factorization Algorithm for Parallel Solution of Sparse Symmetric Positive Definite Systems

    In this paper, we consider the problem of solving sparse linear systems occurring in finite difference applications (or N × N grid problems, N being the size of the linear system). We propose a new algorithm for the problem which is based on the Cholesky factorization, a symmetric variant of Gaussian elimination tailored to symmetric positive definite systems. The algorithm employs a new technique called bidirectional factorization to produce the complete solution vector by solving only one triangular system against two triangular systems in the existing Cholesky factorization after the factorization phase. The effectiveness of the new algorithm is demonstrated by comparing its performance with that of the existing Cholesky factorization for solving regular finite difference grid problems on hypercube multiprocessors.

  • articleNo Access

    A NOTE ON SCHEDULING PARALLEL UNIT JOBS ON HYPERCUBES

    We study the problem of scheduling independent unit-time parallel jobs on hypercubes. A parallel job has to be scheduled between its release time and deadline on a subcube of processors. The objective is to maximize the number of early jobs. Jobs' intervals of feasibility have to be nested. We provide a polynomial time algorithm for the problem.

  • articleNo Access

    Fault-Free Hamiltonian Cycles in Balanced Hypercubes with Conditional Edge Faults

    The balanced hypercube, BHn, is a variant of hypercube Qn. Zhou et al. [Inform. Sci.300 (2015) 20–27] proposed an interesting problem that whether there is a fault-free Hamiltonian cycle in BHn with each vertex incident to at least two fault-free edges. In this paper, we consider this problem and show that each fault-free edge lies on a fault-free Hamiltonian cycle in BHn after no more than 4n5 faulty edges occur if each vertex is incident with at least two fault-free edges for all n2. Our result is optimal with respect to the maximum number of tolerated edge faults.

  • articleNo Access

    The Non-inclusive Diagnosability of Hypercubes under the MM* Model

    Diagnosability is an important factor in multiple-processor systems defined as the maximum number of faulty nodes that a system can recognize. In this paper, we propose a new form of diagnosability called non-inclusive diagnosability that requires all faulty sets to be non-inclusive. Furthermore, we study the non-inclusive diagnosability of hypercubes under the MM* model for n1.

  • articleNo Access

    A Note on the Dimensionality of Modified Knödel Graphs

    We show that the edges of the modified Knödel graph can be grouped into dimensions which are similar to the dimensions of hypercubes. In particular, routing, broadcasting and gossiping, can be done easily in modified Knödel graphs using these dimensions.

  • articleNo Access

    A DYNAMIC SCHEDULING COMMUNICATION PROTOCOL AND ITS ANALYSIS FOR HYPERCUBE NETWORKS

    We propose a new protocol for one-to-one communication in multiprocessor networks, which we call the Dynamic Scheduling Communication (or DSC) protocol. In the DSC protocol, the capacity of a link is partitioned into two channels: a data channel, used to transmit packets, and a control channel used to make reservations. We initially describe the DSC protocol and the data structures needed to implement it for a general network topology. We then analyze the steady-state throughput of the DSC protocol for random node-to-node communication in a hypercube topology. The analytical results obtained are in very close agreement with corresponding simulation results. For the hypercube topology, and under the same set of assumptions on the node architecture and the routing algorithm used, the DSC protocol is found to achieve higher throughput than packet switching, provided that the size of the network is sufficiently large. We also investigate the relationship between the achievable throughput and the fraction of network capacity dedicated to the control channel, and present a method to select this fraction so as to optimize throughput.

  • articleNo Access

    DEADLOCK-FREE FAULT-TOLERANT MULTICAST ROUTING IN HYPERCUBES

    In this paper, an algorithm is presented to embedd a ring in an n-dimensional hypercube with at most n-1 faults; subsequently, the embedded ring is used for efficient deadlock-free multicast routing between all the fault-free nodes. Simulation results are presented to compare the performance of the algorithm to that of the routing algorithm proposed in [9] for a fault-free hypercube to study the performance degradation because of faults. A modification to the algorithm of [9] is also presented for improved performance.

  • articleNo Access

    QoS ROUTING IN HYPERCUBE MULTICOMPUTERS

    In this paper, we present a coding method to capture QoS information in hypercube multicomputers. The coding method is based on a localized algorithm where each node interacts with its neighbors to gather QoS information. Specifically, each node maintains a QoS vector where the k-th element represents the guaranteed QoS performance to a destination that is k hops away. The localized algorithm exhibits desirable properties of self-stabilizing, self-optimizing, and self-healing. Simulation results show that this coding method provides a good approximation of the minimum QoS value to a k-hop destination and, at the same time, uses a relatively small number of packets to propagate a change in link state (QoS value) compared with the classical distributed Bellman-Ford method.

  • articleNo Access

    ON THE COMPUTATIONAL COMPLEXITY OF ROUTING IN FAULTY K-ARY N-CUBES AND HYPERCUBES

    We equate a routing algorithm in a (faulty) interconnection network whose underlying graph is a k-ary n-cube or a hypercube, that attempts to route a packet from a fixed source node to a fixed destination node, with the sub-digraph of (healthy) links potentially usable by this routing algorithm as it attempts to route the packet. This gives rise to a naturally defined problem, parameterized by this routing algorithm, relating to whether a packet can be routed from a given source node to a given destination node in one of our interconnection networks in which there are (possibly exponentially many) faulty links. We show that there exist such problems that are PSPACE-complete (all are solvable in PSPACE) but that there are (existing and popular) routing algorithms for which the computational complexity of the corresponding problem is significantly easier (yet still computationally intractable).

  • articleNo Access

    MAPPING BINARY PRECEDENCE TREES TO HYPERCUBES

    This paper addresses the problem of mapping the modules of a task precedence graph to the processing elements of a parallel computer. The goal of the mapping is to minimize the total execution time of the task, sum of the computation and communications time, within a processor network of limited size. We consider the special instance where the precedence graph is a binary precedence tree. We present a linear time procedure for mapping an almost full binary precedence tree of n nodes to any p processor hypercube, with unit dilation cost and optimal execution time. The computation time under the mapping is seen to be O(n/p) and the communications time is seen to be log p.

  • articleNo Access

    EFFICIENT BROADCAST ON HYPERCUBES WITH WORMHOLE AND E-CUBE ROUTINGS

    We consider the problem of broadcasting on an n-dimensional hypercube with wormhole e-cube routing, intermediate reception capability, and one-port communication. We give an algorithm, optimal to within a multiplicative constant, that broadcasts in this model in Θ(n/log2(n+1)) routing steps. We also give routing algorithms that achieve tight time bounds for n-cubes where n ≤ 6.

  • articleNo Access

    TIGHT BOUNDS ON THE NUMBER OF l-NODES IN A FAULTY HYPERCUBE

    The spanning binomial tree is one of most frequently used spanning tree structures to implement various parallel applications in multiprocessor systems, such as hypercubes. In this paper, we define an l-node as a root node of an incomplete spanning binomial tree of a hypercube, which is defined as a connected subtree of a spanning binomial tree with the same root node that connects all the nonfaulty nodes in the hypercube. We show that in an n-dimensional hypercube with m faulty nodes there are at least 2n − 2ml-nodes. This implies that at least half of the nodes of the hypercube are l-nodes if the number of faulty nodes is less than the dimension of the hypercube.

  • articleNo Access

    APPLICATIONS OF EVOLUTIONARY STRATEGIES TO FINE-GRAINED TASK SCHEDULING

    Embedding task graphs onto hypercubes is a difficult problem. When the embedding is one-to-one, schedule length is strongly influenced by dilation. Therefore, it is desirable to find low dilation embeddings. This paper describes a heuristic embedding technique based upon evolutionary strategies. The technique has been extensively investigated using task graphs which are trees, forests, and butterflies. In all cases the technique has found low dilation embeddings.

  • articleNo Access

    Messy Broadcasting

    In this note, we continue the study of messy broadcasting. We obtain exact values for the messy broadcasting time of complete graphs, paths, cycles, and complete d-ary trees. For hypercubes, we obtain exact values for messy broadcasting time under two of the models and present upper and lower bounds for the third model. We compare these times with (regular) broadcasting times in these graphs. We also present some simple bounds for arbitrary graphs which we use to compare the messy broadcasting times of cube-connected cycles, shuffle-exchange graphs, butterfly graphs, and DeBruijn graphs with their (regular) broadcasting times.

  • articleNo Access

    Neighbourhood Gossiping in Hypercubes

    In the neighbourhood gossiping problem, each node of a network starts with a unique message and must learn the messages of all of its neighbours. In this paper, we prove upper and lower bounds for neighbourhood gossiping in hypercubes under the single-port half-duplex and single-port full-duplex communication models.

  • articleNo Access

    IMAGE ANALYSIS ON MASSIVELY PARALLEL COMPUTERS: AN ARCHITECTURAL POINT OF VIEW

    Finding a parallel architecture adapted to a given class of algorithms is a central problem for architects. This paper presents a methodology to realize it, and provides an illustration using image analysis. First, we show a set of common basic operations that can be used to solve most image analysis problems. Then these movements are translated to fit some natural communications in a given architecture. The considered data movements (global operations on connected pixel sets) can express a large class of algorithms. Their implementation on exemplary massively parallel architectures (arrays, hypercubes, pyramids) is discussed.