Processing math: 100%
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×

System Upgrade on Tue, May 28th, 2024 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at customercare@wspc.com for any enquiries.

SEARCH GUIDE  Download Search Tip PDF File

  Bestsellers

  • articleNo Access

    EMBEDDING HAMILTONIAN CYCLES, LINEAR ARRAYS AND RINGS IN A FAULTY SUPERCUBE

    We consider the problem of finding Hamiltonian cycles, linear arrays and rings of a faulty supercube, if any. The proof of the existence of Hamiltonian cycles in hypercubes is easy due to the fact they are symmetric graphs. Since the supercube is asymmetric, the proof of the existence of Hamiltonian cycles is not trivial. We show that for any supercube SN, where N is the number of nodes in the supercube, there exists a Hamiltonian cycle. This implies that for any r such that 3≤r≤N, there exists a cycle of r nodes in a supercube. There are embedding algorithms proposed in this paper. The embedding algorithms show a ring with any number of nodes which can be embedded in a faulty supercube with load 1, congestion 1 and dilation 4 such that O(n2-(⌊log2 m⌋)2) faults can be tolerated, where n is the dimension of the supercube and m is the number of nodes of the ring.

  • articleNo Access

    PRODUCT OF QUASI-GROUP GRAPHS AS INTERCONNECTION NETWORK TOPOLOGIES

    The class of hypercubes is one of the most important and popular topologies for interconnection networks of multicomputer systems. This class includes the binary hypercube and generalized hypercube. Based on the observation that these two graphs can be constructed using a graph theoretic operation known as the product of graphs, we propose a new method for generating large symmetric graphs for networks of multicomputer systems. This method is essentially algebraic in nature, and makes use of the product of a class of graphs known as quasi-group graphs. We call the graphs we obtain PQG graphs. Because these graphs are constructed by an algebraic operation, it simplifies the analysis of their performance. Many of the well-known topologies can in fact be expressed as PQG graphs; this makes the method a very general one. We also investigate the problem of routing in a PQG graph, and propose a hardware implementation of the routing algorithm to reduce the delay in the routing of messages. We then apply our results to the Petersen graph and show that the product of such graphs has vastly superior topological properties than hypercubes of the same degree.

  • articleNo Access

    FLO52 ON HYPERCUBES: PERFORMANCE AND MODELLING OF A MULTIGRID EULER CODE

    This paper evaluates the performance of hypercube machines on a computational fluid dynamics problem. Our evaluation focuses on a prototype of a class of widely used fluid dynamics codes, FLO52, written by Antony Jameson, which solves the two-dimensional steady Euler equations describing flow around an airfoil. In this paper, we give a description of FLO52, its hypercube mapping, and the code modifications to increase machine utilization. Results from two hypercube computers (a 16 node iPSC/2, and a 512 node NCUBE/ten) are presented and compared. In addition, we develop a mathematical model of the execution time as a function of several machine and algorithm parameters. This model accurately predicts the actual run times obtained. Predictions about future hypercubes are made using this timing model.

  • articleNo Access

    A MODEL AND IMPLEMENTATION OF MULTIGRID FOR MASSIVELY PARALLEL COMPUTERS

    Multigrid is a fast and popular iterative method used to solve linear partial differential equations. However, because the solution of very small problems is inherent in the multigrid iteration, it is difficult to implement efficiently on a massively parallel computer. In this paper, we present a model for the efficiency of multigrid on a parallel computer that depends only on the efficiency of the smoother at each level. This model can be used to verify that it is indeed difficult to obtain extremely high efficiencies (>95%), but that it is relatively easy to obtain moderately high. efficiencies (70% to 85%). The model also indicates that moderately high efficiencies can still be achieved as the number of dimensions in the partial differential equation increases. We also present an implementation of the multigrid v-cycle that has achieved 84% efficiency on the 1,024-processor NCUBE/ten.

  • articleNo Access

    ADAPTIVE ROUTING FOR HYPERCUBE MULTIPROCESSORS: A PERFORMANCE STUDY

    Through simulation, we studied the performance of three adaptive “wormhole” routing strategies, and compared them with static routing. Since adaptive routing is susceptible to deadlock, an abort-and-retry strategy was used to prevent it from arising. The impact of packetization of long messages and buffering at message destinations were also studied. Results are presented and analyzed for a variety of hardware configurations and traffic conditions. The combination of adaptive routing, abort-and-retry, and buffering at the destination is shown to achieve excellent performance for modest cost.

  • articleNo Access

    AN EFFICIENT PARALLEL ALGORITHM FOR MATRIX-VECTOR MULTIPLICATION

    The multiplication of a vector by a matrix is the kernel operation in many algorithms used in scientific computation. A fast and efficient parallel algorithm for this calculation is therefore desirable. This paper describes a parallel matrix-vector multiplication algorithm which is particularly well suited to dense matrices or matrices with an irregular sparsity pattern. Such matrices can arise from discretizing partial differential equations on irregular grids or from problems exhibiting nearly random connectivity between data structures. The communication cost of the algorithm is independent of the matrix sparsity pattern and is shown to scale as formula for an n×n matrix on p processors. The algorithm’s performance is demonstrated by using it within the well known NAS conjugate gradient benchmark. This resulted in the fastest run times achieved to date on both the 1024 node nCUBE 2 and the 128 node Intel iPSC/860. Additional improvements to the algorithm which are possible when integrating it with the conjugate gradient algorithm are also discussed.

  • articleNo Access

    PARALLEL AND PIPELINED PARALLEL CONSECUTIVE SUMS ON A HYPERCUBE WITH APPLICATION TO RAY CASTING

    Communication penalty for parallel computation is related to message startup time and speed of data transmission between the host and processing elements (PEs). We propose two algorithms in this paper to show that the first factor can be alleviated by reducing the number of messages and the second by making the host-PE communication concurrent with computation on the PE array.

    The algorithms perform 2n consecutive sums of 2n numbers each on a hypercube of degree n. The first algorithm leaves one sum on each processor. It takes n steps to complete the sums and reduces the number of messages generated by a PE from 2n to n. The second algorithm sends all the sums back to the host as the sums are generated one by one. It takes 2n+n−1 steps to complete the sums in a pipeline so that one sum is completed every step after the initial (n−1) steps.

    We apply our second algorithm to the front-to-back composition for ray casting. For large number of rays, the efficiency and speedup of our algorithm are close to theoretically optimal values.

  • articleNo Access

    ON IMPROVING THE PERFORMANCE OF TREE MACHINES

    In this paper we introduce a class of trees, called generalized compressed trees. Generalized compressed trees can be derived from complete binary trees by performing certain ‘contraction’ operations. A generalized compressed tree CT of height h has approximately 25% fewer nodes than a complete binary tree T of height h. We show that these trees have smaller (up to a 74% reduction) 2-dimensional and 3-dimensional VLSI layouts than the complete binary trees. We also show that algorithms initially designed for T can be simulated by CT with at most a constant slow-down. In particular, algorithms having non-pipelined computation structure and originally designed for T can be simulated by CT with no slow-down.

  • articleNo Access

    Mapping Pipelined Divided–Difference Computations into Hypercubes

    In numerical computations, the method of divided differences is a very important technique for polynomial approximations. Consider a pipelined divided–difference computation for approximating an nth degree polynomial. This paper first presents a method to transform the computational structure of divided differences into the pyramid tree with formula nodes. Based on graph embedding technique, without any extra communication delay, the pipelined divided–difference computation can be performed in a (2k + 1)-dimensional fault–free hypercube for n + 1 = 2k + t, k > 0, and 0 < t < 2k; the pipelined divided-difference computation can be further performed in a (2k + 2)-dimensional faulty hypercube to tolerate arbitrary (k - 1) faulty nodes/links. To the best of our knowledge, this is the first time such mapping methods are being proposed in the literature.

  • articleNo Access

    FAULT-TOLERANT CHARACTERISTICS AND TOPOLOGICAL PROPERTIES OF A HIERARCHICAL NETWORK OF HYPERCUBES

    We analyse the fault-tolerant parameters and topological properties of a hierarchical network of hypercubes. We take a close look at the Extended Hypercube (EH) and the Hyperweave (HW) architectures and also compare them with other popular architectures. These two architectures have low diameter and constant degree of connectivity making it possible to expand these networks without affecting the existing configuration. A scheme for incrementally expanding this network is also presented. We also look at the performance of the ASCEND/DESCEND class of algorithms on these architectures.

  • articleNo Access

    SIMULATION OF CYCLES IN THE IEH GRAPH

    The Incrementally Extensible Hypercube (IEH) is a novel interconnection network derived from the hypercube. Unlike the hypercube, the IEH graph is incrementally extensible, that is, it can be constructed for any number of nodes. In addition, it has optimal fault tolerance and its diameter is logarithmic in the number of nodes and the difference of the maximum and the minimum degree of a node in the graph is formula (i.e., the graph is almost regular). In this paper, we show that almost the entire IEH graph, except for those with N =2n-1 nodes for all formula, has a Hamiltonian cycle; if an IEH graph has N=2n-1 nodes then it has only a Hamiltonian path, not cycle. These results enable us to obtain the good embedding of rings and linear arrays into the IEH graph.

  • articleNo Access

    A METHOD FOR EVALUATING THE EXPECTED LOAD OF DYNAMIC TREE EMBEDDINGS IN HYPERCUBES

    A key issue in performing tree structured parallel computations is to distribute process components of a parallel program over processors in a parallel computer at run time such that both the maximum load and dilation are minimized. The main contribution of this paper is the application of recurrence relations in studying the performance of a dynamic tree embedding algorithm in hypercubes. We develop recurrence relations that characterize the expected load in randomized tree embeddings where, a tree grows by letting its nodes to take random walks of short distance. By using these recurrence relations, we are able to calculate the expected load on each processor. Therefore, for constant dilation embeddings, we are able to evaluate expected loads numerically and analytically. The applicability of recurrence relations is due to the recursive structure of trees and the fact that embeddings of the subtrees of a process node are independent of each other. Our methodology does not depend on the hypercube topology. Hence, it can be applied to studying dynamic tree growing in other networks.

  • articleNo Access

    EFFICIENT ALGORITHMS FOR SELECTION AND SORTING OF LARGE DISTRIBUTED FILES ON DE BRUIJN AND HYPERCUBE STRUCTURES

    In this paper we show the power of sampling techniques in designing efficient distributed algorithms. In particular, we apply sampling techniques in the design of selection algorithms on the hypercube and de Bruijn networks, and show that the message complexity of selecting an item from a set (file) is less sensitive to the cardinality of the set (file). Given a file with n keys, our algorithm performs a selection on a p-node de Bruijn network or hypercube using only O(p loglog n) messages and suffering a delay of O(τ log p loglog n), with high probability. Our selection scheme outperforms the existing approaches in terms of both message complexity and communication delay. Because of the lesser sensitivity of message complexity and communication delay of our algorithms to the file size, our distributed selection schemes are very attractive in applications where very large database systems are involved. Using our selection algorithms, we also show that both quicksort-based sorting scheme and enumeration sorting scheme can be developed for sorting large distributed files on the hypercube and de Bruijn networks. Both of our sorting algorithms outperform the existing distributed sorting schemes in terms of both message complexity and communication delay.

  • articleNo Access

    ON HIERARCHICAL CONFIGURATION OF DISTRIBUTED SYSTEMS ON MESH AND HYPERCUBE

    We study hierarchical configuration of distributed systems for achieving optimized system performance. A distributed system consists of a collection of local processes which are distributed over a network of processors, and work in a cooperative manner to fulfill various tasks. A hierarchical approach is to group and organize the distributed processes into a logical hierarchy of multiple levels, so as to coordinate the local computation/control activities to improve the overall system performance. It has been proposed as an effective way to solve various problems in distributed computing, such as distributed monitoring, resource scheduling, and network routing. The optimization problem considered in this paper is concerned with finding an optimal hierarchical partition of the processors, so that the total traffic flow over the network is minimized. The problem in its general form has been known to be NP-hard. Therefore, we just focus on distributed computing jobs which require collecting and processing information from all processors. By limiting levels of the hierarchy to two, we will establish the analytically optimal hierarchical configurations for two popular interconnection networks: mesh and hypercube. Based on analytical results, partitioning algorithms are proposed to achieve minimal communication cost (network traffic flow). We will also present and discuss heuristic algorithms for multiple-level hierarchical partitions.

  • articleNo Access

    Component Edge Connectivity of Hypercubes

    Fault tolerance is an important issue in interconnection networks, and the traditional edge connectivity is an important measure to evaluate the robustness of an interconnection network. The component edge connectivity is a generalization of the traditional edge connectivity. The g-component edge connectivity cλg(G) of a non-complete graph G is the minimum number of edges whose deletion results in a graph with at least g components. Let g be an integer and g=si=02ti be the decomposition of g such that t0=[log2g], and ti=[log2(gi1r=02tr)] for i1. In this note, we determine the g-component edge connectivity of the hypercube Qn, cλg(Qn)=ng(si=0ti2ti1+si=0i2ti) for g2[n2]+1,n7. Moreover, we classify the corresponding optimal solutions.

  • articleNo Access

    k-Fibonacci Cubes: A Family of Subgraphs of Fibonacci Cubes

    Hypercubes and Fibonacci cubes are classical models for interconnection networks with interesting graph theoretic properties. We consider k-Fibonacci cubes, which we obtain as subgraphs of Fibonacci cubes by eliminating certain edges during the fundamental recursion phase of their construction. These graphs have the same number of vertices as Fibonacci cubes, but their edge sets are determined by a parameter k. We obtain properties of k-Fibonacci cubes including the number of edges, the average degree of a vertex, the degree sequence and the number of hypercubes they contain.

  • articleNo Access

    Hamiltonian Cycle Embeddings in Faulty Hypercubes Under the Forbidden Faulty Set Model

    In this paper, we study the fault-tolerant capability of hypercubes with respect to the hamiltonian property based on the concept of forbidden faulty sets. We show, with the assumption that each vertex is incident with at least three fault-free edges, that an n-dimensional hypercube contains a fault-free hamiltonian cycle, even if there are up to (4n13) edge faults. Moreover, we give an example to show that the result is optimal with respect to the number of edge faults tolerated.

  • articleNo Access

    Subgraph-based Strong Menger Connectivity of Hypercube and Exchanged Hypercube

    Large scale multiprocessor systems or multicomputer systems, taking interconnection networks as underlying topologies, have been widely used in the big data era. Fault tolerance is becoming an essential attribute in multiprocessor systems as the number of processors is getting larger. A connected graph G is called strong Menger (edge) connected if, for any two distinct vertices u and v, there are min{dG(u),dG(v)} vertex (edge)-disjoint paths between them. Exchanged hypercube EH(s,t), as a variant of hypercube Qn, remains lots of preferable fault tolerant properties of hypercube. In this paper, we show that QnQk(1kn1) and EH(s,t)Qk(2kmin{s,t}) are strong Menger (edge) connected, respectively. Moreover, as a by-product, for dual cube Dn=EH(n1,n1), one popular generalization of hypercube, DnQk is also showed to be strong Menger (edge) connected, where 1kn1, n3.

  • articleNo Access

    Alternate Lucas Cubes

    We introduce alternate Lucas cubes, a new family of graphs designed as an alternative for the well known Lucas cubes. These interconnection networks are subgraphs of Fibonacci cubes and have a useful fundamental decomposition similar to the one for Fibonacci cubes. The vertices of alternate Lucas cubes are constructed from binary strings that are encodings of Lucas representation of integers. As well as ordinary hypercubes, Fibonacci cubes and Lucas cubes, alternate Lucas cubes have several interesting structural and enumerative properties. In this paper we study some of these properties. Specifically, we give the fundamental decomposition giving the recursive structure, determine the number of edges, number of vertices by weight, the distribution of the degrees; as well as the properties of induced hypercubes, q-cube polynomials and maximal hypercube polynomials. We also obtain the irregularity polynomials of this family of graphs, determine the conditions for Hamiltonicity, and calculate metric properties such as the radius, diameter, and the center.

  • articleNo Access

    Network Orientation

    This paper analyses how the symmetry of a processor network influences the existence of a solution for the network orientation problem. The orientation of hypercubes and tori is the problem of assigning labels to each link of each processor, in such a way that a sense of direction is given to the network. In this paper the problem of network orientation for these two topologies is studied under the assumption that the network contains a single leader, under the assumption that the processors possess unique identities, and under the assumption that the network is anonymous. The distinction between these three models is considered fundamental in distributed computing.

    It is shown that orientations can be computed by deterministic algorithms only when either a leader or unique identities are available. Orientations can be computed for anonymous networks by randomized algorithms, but only when the number of processors is known. When the number of processors is not known, even randomized algorithms cannot compute orientations for anonymous processor networks.

    Lower bounds on the message complexity of orientation and algorithms achieving these bounds are given.