Many applications requiring high-speed evaluation of polynomials and exponentials of polynomials can now be implemented very efficiently in hardware, owing to advances in VLSI technology. Several fast algorithms have recently been proposed for the efficient evaluation of polynomials and exponentials of polynomials at equispaced arguments on uniprocessor systems. In this paper, we consider the problem of organizing this evaluation on VLSI chips in the form of systolic arrays. We present linear fault-tolerant systolic arrays that can evaluate polynomials and exponentials of polynomials of any degree at a large number of equispaced points. The main advantage of these organizations is that the interconnections between the processing elements are very regular and simple, which makes them well suited to VLSI implementation.
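One classic route to fast evaluation at equispaced arguments, and the kind of computation a linear array can pipeline, is the forward-difference method: after a short setup, each new value of a degree-d polynomial costs only d additions. The sketch below is illustrative of that general technique, not the paper's specific array design; all names are our own.

```python
# Sketch: evaluating a degree-d polynomial at equispaced points x0 + k*h
# using forward differences -- each new value costs only d additions,
# which is what makes linear systolic evaluation attractive.
# (Illustrative code, not the paper's array organization.)

def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[i] is the coefficient of x**i."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def equispaced_values(coeffs, x0, h, n_points):
    d = len(coeffs) - 1
    # Seed the table with the first d+1 exact values, then difference it.
    diffs = [poly_eval(coeffs, x0 + k * h) for k in range(d + 1)]
    for level in range(1, d + 1):
        for i in range(d, level - 1, -1):
            diffs[i] -= diffs[i - 1]
    # diffs[i] now holds the i-th forward difference at x0.
    out = []
    for _ in range(n_points):
        out.append(diffs[0])
        for i in range(d):          # d additions per new point
            diffs[i] += diffs[i + 1]
    return out

# p(x) = 1 + 2x + 3x^2 at x = 0, 1, 2, 3
print(equispaced_values([1, 2, 3], x0=0, h=1, n_points=4))  # [1, 6, 17, 34]
```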
Starting with a system of uniform recurrence equations (source UREs) for a systolic design, we provide a method that allows a specification of control signals by another system of UREs (control UREs), which has no notion of time and space. Necessary and sufficient conditions for the correctness of the control UREs are stated. The standard space-time mapping technique is extended so that systolic arrays with a description of both data and control flow are directly synthesised from the source and control UREs. Optimisations of control signals are discussed. An example is provided.
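As a concrete illustration of the starting point, a matrix-vector product can be written as a system of source UREs in which every dependence is a constant offset; the following is a minimal sketch of that URE form (our own example, not the paper's), evaluated directly over its domain.

```python
# Sketch: matrix-vector product y = A x written as uniform recurrence
# equations (UREs), the kind of specification space-time mapping starts
# from.  X pipelines the input vector, Y accumulates partial sums; every
# dependence is a constant offset (uniform).  Illustrative example only.

def mat_vec_ure(A, x):
    n = len(A)
    X = {}  # X(i, j): copy of x[j], pipelined along i
    Y = {}  # Y(i, j): partial sum of row i after column j
    for i in range(n):
        for j in range(n):
            X[i, j] = x[j] if i == 0 else X[i - 1, j]      # X(i,j) = X(i-1,j)
            prev = Y[i, j - 1] if j > 0 else 0
            Y[i, j] = prev + A[i][j] * X[i, j]             # Y(i,j) = Y(i,j-1) + A(i,j)*X(i,j)
    return [Y[i, n - 1] for i in range(n)]

A = [[1, 2], [3, 4]]
x = [5, 6]
print(mat_vec_ure(A, x))  # [17, 39]
```

A space-time mapping then assigns each domain point (i, j) a processor and a time step; the control UREs of the paper describe, in the same recurrence style, which operation each cell performs when.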
In this paper, we derive array processors synthesized from uniform but non-local DAGs. We introduce limited broadcast facilities that can be adjusted to cope with current integration constraints. Our target problem is the Algebraic Path Problem, whose implementation on systolic architectures has been studied by numerous authors. We show how to compare, classify, and improve several proposed architectures.
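The Algebraic Path Problem is the triple-loop Floyd-Warshall/Gauss-Jordan iteration taken over an arbitrary semiring, which is exactly the computation such arrays realize. A generic sketch (illustrative only; with nonnegative weights and a zero diagonal, the plain iteration below suffices and the semiring star operation is trivial):

```python
# Sketch: the Algebraic Path Problem as a generic triple loop over a
# semiring (add, mul).  Choosing (min, +) gives all-pairs shortest paths;
# (or, and) gives transitive closure.  Illustrative kernel, not a
# particular systolic architecture.

def algebraic_path(weights, add, mul):
    """d[i][j] = 'sum' (add) over paths i->j of the 'product' (mul)
    of edge weights."""
    n = len(weights)
    d = [row[:] for row in weights]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = add(d[i][j], mul(d[i][k], d[k][j]))
    return d

INF = float("inf")
w = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
# min-plus instance: all-pairs shortest paths
print(algebraic_path(w, min, lambda a, b: a + b))
# [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```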
In this paper, we propose a general method for mapping Affine Recurrence Equations (AREs) onto a Multi-Rate Array (MRA). We show that the mapping is valid if and only if it maps the problem domain onto the null space of the matrix (D − I) along the range space of this matrix (D is the dependency matrix, I is the identity matrix). We also prove that in the resulting array, variables flow at different speeds.
We are interested in methods which compute the inverse of a triangular matrix A of order n by solving the n linear systems Ax = eᵢ, i = 1, …, n, where eᵢ is the i-th element of the canonical basis of Rⁿ. More precisely, we consider the dependence graph associated with algorithms where the entries of matrix A are read only once and used in pipeline for the solution of these systems. We exhibit a new scheduling which induces an algorithm with time complexity T* = 2n − 1. The number n²/8 + O(n) of processors required by this scheduling improves the best previously known bound n²/6 + O(n), and is quite close to the lower bound n²/8.5 + O(n).
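The underlying computation is simple: each column i of A⁻¹ is the solution of Ax = eᵢ by forward substitution. A minimal sketch of that computation (whose dependence graph is what the paper schedules; this is not the scheduled algorithm itself):

```python
# Sketch: inverting a lower-triangular matrix A of order n by solving the
# n systems A x = e_i with forward substitution.  Each solved system gives
# one column of the inverse.  Illustrative of the computation, not of the
# paper's pipelined scheduling.

def tri_inverse(A):
    n = len(A)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):               # solve A x = e_i for column i
        x = [0.0] * n
        for r in range(n):           # forward substitution
            s = 1.0 if r == i else 0.0
            for c in range(r):
                s -= A[r][c] * x[c]
            x[r] = s / A[r][r]
        for r in range(n):
            inv[r][i] = x[r]
    return inv

A = [[2.0, 0.0], [1.0, 4.0]]
print(tri_inverse(A))  # [[0.5, 0.0], [-0.125, 0.25]]
```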
A new type of m-ary systolic array, called the reversible systolic array, is introduced in this paper. Realizations and computations of m-ary quantum systolic architectures for the new type of array are also introduced. A systolic array is an example of a single-instruction multiple-data (SIMD) machine in which each processing element (PE) performs a single simple operation. Systolic devices provide inexpensive but massive computational power; they are cost-effective, high-performance, special-purpose systems with a wide range of applications, such as solving regular, compute-bound problems involving repetitive operations on large arrays of data. As in the classical case, information in a reversible or quantum systolic circuit flows between cells in a pipelined fashion, and communication with the outside world occurs only at the boundary cells. Since the basic PEs used in the construction of arithmetic systolic arrays are add–multiply cells, the results introduced in this paper are general and apply to a very wide range of add–multiply-based systolic arrays. Since reduced power consumption is a major requirement for circuit design in future technologies such as quantum computing, reversibility will be a main feature of these technologies. Consequently, the new systolic circuits can play an important role in the design of future circuits that consume minimal power. It is also shown that the new systolic arrays maintain a high level of regularity while exhibiting the new fundamental bijectivity (reversibility) and quantum-superposition properties. These new properties will be essential in performing super-fast arithmetic-intensive computations that are fundamental to several future applications, such as multi-dimensional quantum signal processing (QSP).
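The reversibility claim for add–multiply cells is easy to see: a cell that maps (a, b, c) to (a, b, c + a·b mod m) passes a and b through unchanged, so c is always recoverable. A small sketch verifying this bijectivity exhaustively (our own check, not the paper's quantum realization):

```python
# Sketch: the add-multiply cell (a, b, c) -> (a, b, (c + a*b) mod m) is
# bijective for any radix m, since a and b pass through and c can be
# recovered as (c' - a*b) mod m.  This is the property that makes
# add-multiply-based systolic arrays reversible.  Illustrative check only.

from itertools import product

def mac_cell(a, b, c, m):
    return (a, b, (c + a * b) % m)

def mac_cell_inverse(a, b, c2, m):
    return (a, b, (c2 - a * b) % m)

m = 3  # ternary digits
states = list(product(range(m), repeat=3))
images = [mac_cell(a, b, c, m) for (a, b, c) in states]
assert len(set(images)) == len(states)          # bijective: no collisions
for s in states:                                # the inverse really inverts
    assert mac_cell_inverse(*mac_cell(*s, m), m) == s
print("reversible for m =", m)
```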
The implementation of regular iterative algorithms (RIAs) from important scientific fields such as image processing, computer arithmetic, and cryptography on processor-array architectures has been extensively studied over the last three decades. Numerous design methodologies and tools have been proposed, mostly targeting custom very large scale integration (VLSI) chips. The advent of field-programmable gate arrays (FPGAs) has attracted many researchers to carry previously acquired knowledge and experience in designing VLSI chips over to this new technology. This paper addresses the implementation of regular algorithms on FPGAs and presents a novel design tool for the implementation of RIAs, formulated as dependence graphs (DGs), on systolic arrays. Furthermore, a platform scheme for the hardware realization of the systolic arrays is proposed.
This paper analyzes the design automation of embedded Systolic Array Processors (SAPs) on large-scale Field Programmable Gate Array (FPGA) devices. SAPs are hardware implementations of a class of iterative, high-level language algorithms for applications in which processing speed is the principal design goal. Embedding SAPs onto FPGAs is a complex process. The optimization phase in this process reduces the SAP significantly, so that the embedded design occupies less FPGA area without any loss in final performance. The present paper examines the effect of Projection Vectors (PVs) and Task Scheduling Vectors (TSVs) on the optimization process. Two optimization approaches are examined: technology mapping using the FlowMap and Flowpack algorithms, and optimization via logic synthesis using the Xilinx Synthesis Tool. The multiplication of matrices, with entries of up to 32-bit integers, has been taken as the sample space for the experiments conducted. The results confirm that the selection of PV and TSV greatly affects the number of input/output signal connections of the FPGA, while the selection of an optimization approach affects the final number of logic resources occupied on the targeted device.
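In such methodologies, a TSV assigns each dependence-graph node a time step and a PV determines which nodes share a processing element. A minimal sketch for the matrix-multiply DG with textbook choices of PV and TSV (these particular vectors are our assumption, not necessarily those used in the paper's experiments):

```python
# Sketch: mapping matrix-multiply dependence-graph nodes (i, j, k) to a
# PE and a time step.  Time = TSV . node; the PE index is the node
# projected along the projection vector (PV).  The PV/TSV below are
# common textbook choices, used here purely for illustration.

def map_node(node, tsv, proj_basis):
    """proj_basis: rows spanning a complement of the PV; their products
    with the node give the PE coordinates."""
    t = sum(s * x for s, x in zip(tsv, node))
    pe = tuple(sum(r * x for r, x in zip(row, node)) for row in proj_basis)
    return pe, t

tsv = (1, 1, 1)                 # schedule: t = i + j + k
proj = [(1, 0, 0), (0, 1, 0)]   # PV = (0, 0, 1): drop k, PE = (i, j)
n = 2
for i in range(n):
    for j in range(n):
        for k in range(n):
            pe, t = map_node((i, j, k), tsv, proj)
            print(f"node ({i},{j},{k}) -> PE {pe} at t={t}")
```

Validity requires TSV · PV ≠ 0, so that the nodes collapsed onto one PE receive distinct time steps; here (1, 1, 1) · (0, 0, 1) = 1.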
The objective of this paper is to discuss systolic arrays (SAs) suitable for implementing regular three-nested-loop algorithms, which make highly dependable computation possible on the SAs obtained in this way. This is done by considering the different possible values of the processor flow period for SAs synthesized from adaptable algorithms. The algorithm for multiplying two matrices is one typical adaptable algorithm; the obtained results are illustrated at the end of this paper by examples of the multiplication of two rectangular matrices realized with the so-called flowing and hexagonal two-dimensional (2D) SAs of the planar SA group.
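The planar data flow for matrix multiplication can be made concrete with a cycle-by-cycle simulation: rows of A stream in from the left, columns of B from the top, each skewed by one cycle, and PE (i, j) accumulates C[i][j]. The sketch below illustrates that generic 2D flow, not the paper's specific flowing or hexagonal designs:

```python
# Sketch: toy cycle-accurate simulation of a 2D systolic array multiplying
# two n x n matrices.  A flows rightward, B flows downward, both skewed by
# one cycle per row/column; C stays resident in the PEs.  Illustrative of
# planar-SA data flow only.

def systolic_matmul(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    a_reg = [[0] * n for _ in range(n)]  # value held by PE (i, j) this cycle
    b_reg = [[0] * n for _ in range(n)]
    for t in range(3 * n - 2):           # enough cycles to drain the array
        for i in reversed(range(n)):     # reversed order = simultaneous shift
            for j in reversed(range(n)):
                a_in = a_reg[i][j - 1] if j > 0 else (
                    A[i][t - i] if 0 <= t - i < n else 0)  # skewed row i
                b_in = b_reg[i - 1][j] if i > 0 else (
                    B[t - j][j] if 0 <= t - j < n else 0)  # skewed column j
                C[i][j] += a_in * b_in
                a_reg[i][j], b_reg[i][j] = a_in, b_in
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))  # [[19, 22], [43, 50]]
```

Element A[i][k] reaches PE (i, j) at cycle i + k + j, exactly when B[k][j] arrives there, so each PE forms precisely the products its C entry needs; the total of 3n − 2 cycles is the flow-period cost of the skewing.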