Loading [MathJax]/jax/output/CommonHTML/jax.js
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×

System Upgrade on Tue, May 28th, 2024 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at customercare@wspc.com for any enquiries.

SEARCH GUIDE  Download Search Tip PDF File

Bestsellers

Handbook of Machine Learning
Handbook of Machine Learning

Volume 1: Foundation of Artificial Intelligence
by Tshilidzi Marwala
Handbook on Computational Intelligence
Handbook on Computational Intelligence

In 2 Volumes
edited by Plamen Parvanov Angelov

 

  • articleNo Access

    CACHE MISS ANALYSIS OF 2D STENCIL CODES WITH TILED TIME LOOP

    Stencil codes such as the Jacobi, Gauß-Seidel, and red-black Gauß-Seidel kernels are among the most time-consuming routines in many scientific and engineering applications. The performance of these codes critically depends on an efficient usage of caches, and can be improved by tiling. Several tiling schemes have been suggested in the literature; this paper gives an overview and comparison. Then, in the main part, we prove a lower bound on the number of cold and capacity misses. Finally, we analyze a particular tiling scheme, and show that it is off the lower bound by a factor of at most ten. Our results show up limitations to the speedup that can be gained by future research.

  • articleNo Access

    POLLY — PERFORMING POLYHEDRAL OPTIMIZATIONS ON A LOW-LEVEL INTERMEDIATE REPRESENTATION

    The polyhedral model for loop parallelization has proved to be an effective tool for advanced optimization and automatic parallelization of programs in higher-level languages. Yet, to integrate such optimizations seamlessly into production compilers, they must be performed on the compiler's internal, low-level, intermediate representation (IR). With Polly, we present an infrastructure for polyhedral optimizations on such an IR. We describe the detection of program parts amenable to a polyhedral optimization (so-called static control parts), their translation to a Z-polyhedral representation, optimizations on this representation and the generation of optimized IR code. Furthermore, we define an interface for connecting external optimizers and present a novel way of using the parallelism they introduce to generate SIMD and OpenMP code. To evaluate Polly, we compile the PolyBench 2.0 benchmarks fully automatically with PLuTo as external optimizer and parallelizer. We can report on significant speedups.

  • articleNo Access

    The Relation Between Diamond Tiling and Hexagonal Tiling

    Iterative stencil computations are important in scientific computing and more also in the embedded and mobile domain. Recent publications have shown that tiling schemes that ensure concurrent start provide efficient ways to execute these kernels. Diamond tiling and hybrid-hexagonal tiling are two tiling schemes that enable concurrent start. Both have different advantages: diamond tiling has been integrated in a general purpose optimization framework and uses a cost function to choose among tiling hyperplanes, whereas the greater flexibility with tile sizes for hybrid-hexagonal tiling has been exploited for effective generation of GPU code.

    In this paper we undertake a comparative study of these two tiling approaches and propose a hybrid approach that combines them. We analyze the effects of tile size and wavefront choices on tile-level parallelism, and formulate constraints for optimal diamond tile shapes. We then extend, for the case of two dimensions, the diamond tiling formulation into a hexagonal tiling one, which offers both the flexibility of hexagonal tiling and the generality of the original diamond tiling implementation. We also show how to compute tile sizes that maximize the compute-to-communication ratio, and apply this result to compare the best achievable ratio and the associated synchronization overhead for diamond and hexagonal tiling.

  • articleNo Access

    Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling

    Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a first step that detects sets of permutable nested loops. All algorithms developed so far consider the statements of the loop body as a single block, in other words, they are not able to take advantage of the structure of dependences between different statements. In this paper, we overcame this limitation by showing how the structure of the reduced dependence graph can be taken into account for detecting more permutable loops. Our method combines graph retiming techniques and graph scheduling techniques. It can be viewed as an extension of Wolf and Lam's algorithm to the case of loops with multiple statements. Loan independent dependences play a particular role in our study, and we show how the way we handle them can be useful for fine-grain loop parallelization as well.

  • articleNo Access

    Compile-Time Cache Performance Prediction and Its Application to Tiling

    Tiling has been used by parallelizing compilers to define fine-grain parallel tasks and to optimize cache performance. In this paper we present a novel compile-time technique, called miss-driven cache simulation, for determining loop tile sizes that achieve the highest cache hit-rate. The widening disparity between a processor's peak instruction rate and main memory access time in modern computer systems makes this kind of optimization increasingly important for overall program efficiency.

    Our simulation technique generates only those references of a loop nest that may generate a cache memory miss and processes them on an architecturally accurate cache model at compile-time. Processing only a small portion of the memory reference trace of a program yields simulation speeds in the millions of memory references per second on workstations, with statistics of misses per reference and inter-reference interference counts gathered at the same time. These simulation speeds and statistics allow for the accurate analysis of the impact of cache optimizations at compile-time.

    We discuss the results of applying this method to guide loop tiling for such commonly used computational kernels as matrix multiplication and Jacobi iteration for various cache parameters.

  • articleNo Access

    On Tiling as a Loop Transformation

    This paper is a follow-up Irigoin and Triolet's earlier work and our recent work on tiling. In this paper, tiling is discussed in terms of its effects on the dependences between tiles, the dependences within a tile and the required dependence test for legality. A necessary and sufficient condition is given for enforcing the data dependences of the program, while Irigion and Triolet's atomic tile constraint is only sufficient. A condition is identified under which both Irigoin and Triolet's and our constraints are equivalent. The results of this paper are discussed in terms of their impact on dependence abstractions suitable for legality test and on tiling to optimise a certain given goal.

  • articleNo Access

    SIROMONEY ARRAY GRAMMARS AND APPLICATIONS

    The Siromoney matrix model is a simple and elegant model for describing two-dimensional digital picture languages. The notion of attaching indices to nonterminals in a generative grammar, introduced and investigated by Aho. is considered in the vertical phase of a Siromoney matrix grammar (SMG). The advantage of this study is that the new model retains the simplicity and elegance of SMG but increases the generative power and enables us to describe pictures not generable by SMG. Besides certain closure properties and hierarchy results. applications of these two-dimensional grammars to describe tilings, polyominoes, distorted patterns and parquet deformations are studied.

  • articleNo Access

    BASIC PUZZLE LANGUAGES

    The emptiness problem for non-overlapping Basic puzzle grammars is shown to be decidable. An alternate proof of the decidability of the non-overlapping feature for basic puzzle grammars is given. Hierarchy among the various classes of puzzle languages is also established.

  • articleNo Access

    COMPUTATIONAL GEOMETRY COLUMN 47

    A remarkable theorem is described: "It is possible to tile the plane with non-overlapping squares using exactly one square of each integral dimension". Thus, one can "square the plane".

  • articleNo Access

    ON THE MATABILITY OF POLYGONS

    Interpolating a piecewise-linear triangulated surface between two polygons lying in parallel planes has attracted a lot of attention in the literature over the past three decades. This problem is the simplest variant of interpolation between parallel slices, which may contain multiple polygons with unrestricted geometries and/or topologies. Its solution has important applications to medical imaging, digitization of objects, geographical information systems, and more. Practically all currently-known methods for surface reconstruction from parallel slices assume a priori the existence of a non-self-intersecting triangulated surface defined over the vertices of the two polygons, which connects them. Gitlin et al. were the first to specify a nonmatable pair of polygons. In this paper we provide proof of the nonmatability of a “simpler” pair of polygons, which is less complex than the example given by Gitlin et al. Furthermore, we provide a family of polygon pairs with unbounded complexity, which we believe to be nonmatable. We also give a few sufficient conditions for polygon matability.

  • articleNo Access

    POSSIBILITIES AND IMPOSSIBILITIES IN SQUARE-TILING

    A set of natural numbers tiles the plane if a square-tiling of the plane exists using exactly one square of sidelength n for every n in the set. From Ref. 8 we know that ℕ itself tiles the plane. From that and Ref. 9 we know that the set of even numbers tiles the plane while the set of odd numbers doesn't. In this paper we explore the nature of this property. We show, for example, that neither tiling nor non-tiling is preserved by superset. We show that a set with one or three odd numbers may tile the plane—but a set with two odd numbers can't.

    We find examples of both tiling and non-tiling sets that can be partitioned into tiling sets, non-tiling sets or a combination. We show that any set growing faster than the Fibonacci numbers cannot tile the plane.

  • articleNo Access

    Tiling the Plane with a Set of Ten Polyominoes

    There exists a linear algorithm to decide whether a polyomino tessellates the plane by translation only. On the other hand, the problem of deciding whether a set of 11 or more polyominoes can tile the plane by translation is undecidable. We narrow the gap between decidable and undecidable by showing that it remains undecidable for a set of 10 polyominoes, which partially solves a conjecture posed by Ollinger.

  • chapterNo Access

    BASIC PUZZLE LANGUAGES

    The emptiness problem for non-overlapping Basic puzzle grammars is shown to be decidable. An alternate proof of the decidability of the non-overlapping feature for basic puzzle grammars is given. Hierarchy among the various classes of puzzle languages is also established.

  • chapterNo Access

    SIROMONEY ARRAY GRAMMARS AND APPLICATIONS

    The Siromoney matrix model is a simple and elegant model for describing two-dimensional digital picture languages. The notion of attaching indices to nonterminals in a generative grammar, introduced and investigated by Aho, is considered in the vertical phase of a Siromoney matrix grammar (SMG). The advantage of this study is that the new model retains the simplicity and elegance of SMG but increases the generative power and enables us to describe pictures not generable by SMG. Besides certain closure properties and hierarchy results, applications of these two-dimensional grammars to describe tilings, polyominoes, distorted patterns and parquet deformations are studied.