Keyword: Prefix Computation : Search


Results: 1 - 4 of 4


DATA MOVEMENT TECHNIQUES ON RECONFIGURABLE MESHES, WITH APPLICATIONS
- STEPHAN OLARIU,
- JAMES L. SCHWING, and
- JINGYUAN ZHANG
International Journal of High Speed Computing, 01 Jun 1994
Data movement operations are central to many efficient algorithms for parallel machines constructed as interconnection networks of processors. The purpose of this work is to present a number of basic data movement techniques for reconfigurable meshes. These include computing the prefix maxima of a sequence of real numbers and computing the prefix sums of a binary sequence. The data movement techniques presented in this paper have far-reaching applications. Among them, we present a new sorting algorithm and an efficient algorithm to find all maximal elements of a planar set of points.
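The two primitives named in this abstract, and the maximal-elements application, can be illustrated with a short sequential sketch. This is only a reference definition of what the operations compute, not the reconfigurable-mesh algorithms of the paper; the function names are hypothetical, and `maximal_points` assumes that all x-coordinates and all y-coordinates are distinct.

```python
from itertools import accumulate


def prefix_maxima(xs):
    """Return m where m[i] = max(xs[0], ..., xs[i])."""
    return list(accumulate(xs, max))


def prefix_sums(bits):
    """Return s where s[i] = bits[0] + ... + bits[i]."""
    return list(accumulate(bits))


def maximal_points(points):
    """Return the points not dominated by any other point.

    Assumption (not from the source): coordinates are distinct, so a point
    is maximal exactly when its y exceeds every y seen at a larger x.
    """
    # Scan from the largest x downward.
    ordered = sorted(points, key=lambda p: p[0], reverse=True)
    ys = [p[1] for p in ordered]
    best = prefix_maxima(ys)
    result = []
    for i, point in enumerate(ordered):
        # The rightmost point is always maximal; any other point must beat
        # the running maximum of the y-values to its right.
        if i == 0 or ys[i] > best[i - 1]:
            result.append(point)
    return result
```

For example, `prefix_maxima([3, 1, 4, 1, 5, 9, 2, 6])` gives `[3, 3, 4, 4, 5, 9, 9, 9]`, and `prefix_sums([1, 0, 1, 1, 0, 1])` gives `[1, 1, 2, 3, 3, 4]`.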

AN IMPROVED PARALLEL PREFIX ALGORITHM ON OTIS-MESH
- PRASANTA K. JANA and
- BHABANI P. SINHA
Parallel Processing Letters, 01 Dec 2006
Wang and Sahni [4] reported two parallel algorithms for N-point prefix computation on an N-processor OTIS-Mesh optoelectronic computer. The overall time complexity of their first algorithm, for both the SIMD and MIMD models, was shown to be $(8N^{1/4} - 1)$ electronic moves and 2 OTIS moves. Their second algorithm reduced this to $(7N^{1/4} - 1)$ electronic moves and 2 OTIS moves. We present here an improved parallel algorithm for N-point prefix computation on an N-processor OTIS-Mesh, which needs $(5.5N^{1/4} + 3)$ electronic moves and 2 OTIS moves. Our algorithm is based on the general scheme of the parallel prefix algorithm proposed in [4], but its data distribution and local prefix computation are similar to those of [1].
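As a rough illustration of the "local prefix computation" idea, the sketch below runs a generic three-phase blocked prefix computation sequentially: scan each block, scan the block totals, then add the carried total into each later block. It is not the OTIS-Mesh algorithm of the paper or of [4]; the names `blocked_prefix`, `electronic_moves`, and `root4` (standing for $N^{1/4}$) are hypothetical. The second function simply evaluates the three electronic-move counts quoted in the abstract.

```python
from itertools import accumulate
import operator


def blocked_prefix(xs, block_size, op=operator.add):
    """Inclusive prefix computation over xs using fixed-size blocks.

    op must be associative; block_size must be at least 1.
    """
    blocks = [xs[i:i + block_size] for i in range(0, len(xs), block_size)]
    # Phase 1: independent local scan inside every block.
    local = [list(accumulate(block, op)) for block in blocks]
    # Phase 2: scan of the per-block totals (last entry of each local scan).
    carries = list(accumulate((scan[-1] for scan in local), op))
    # Phase 3: combine each block with the total of all preceding blocks.
    out = list(local[0]) if local else []
    for j in range(1, len(local)):
        carry = carries[j - 1]
        out.extend(op(carry, value) for value in local[j])
    return out


def electronic_moves(root4):
    """Electronic-move counts from the abstract, with root4 = N**(1/4)."""
    return {
        "wang_sahni_first": 8 * root4 - 1,
        "wang_sahni_second": 7 * root4 - 1,
        "jana_sinha": 5.5 * root4 + 3,
    }
```

With $N = 256$ processors, `electronic_moves(4)` evaluates to 31, 27, and 25 electronic moves respectively; each algorithm additionally uses 2 OTIS moves.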

OPTIMIZATION OF LINKED LIST PREFIX COMPUTATIONS ON MULTITHREADED GPUS USING CUDA
- ZHENG WEI and
- JOSEPH JAJA
Parallel Processing Letters, 01 Dec 2012
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on the multithreaded GPUs Tesla C1060, Tesla C2050, and GTX480 using CUDA. Prefix computations on linked structures generally involve highly irregular, fine-grain memory accesses that are typical of many computations on linked lists, trees, and graphs. While the current generation of GPUs provides substantial computational power and extremely high-bandwidth memory access, it may appear at first to be primarily geared toward streamed, highly data-parallel computations. In this paper, we introduce an optimized multithreaded GPU algorithm for prefix computations through a randomization process that reduces the problem to a large number of fine-grain computations. We map these fine-grain computations onto multithreaded GPUs in such a way that the processing cost per element is shown to be close to the best possible. Our experimental results show scalability for list sizes ranging from 1M nodes to 256M nodes, and improve significantly on recently published parallel implementations of list ranking, including implementations on the Cell Processor, the MTA-8, and the NVIDIA GT200 and Fermi series. They also compare favorably to the performance of the best-known CUDA algorithm for the scan operation on the Tesla C1060 and GTX480.
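The idea of using randomization to break a list into many small independent pieces can be sketched as follows. This is a sequential Python simulation of a generic random-splitter scheme, not the authors' CUDA implementation. The array layout (`succ`, `val`, `head`), the function name, and the `num_splitters` and `seed` parameters are assumptions made for illustration, and every node is assumed to be reachable from `head`.

```python
import random


def list_prefix_sums(succ, val, head, num_splitters, seed=0):
    """Prefix sums along a linked list stored in arrays.

    succ[i] is the index of the node after i, or -1 at the tail;
    val[i] is the value at node i. Returns result[i] = sum of the
    values from the head up to and including node i.
    """
    n = len(succ)
    if n == 0:
        return []
    rng = random.Random(seed)
    others = [i for i in range(n) if i != head]
    # The head is always a splitter, so every node belongs to some sublist.
    splitters = [head] + rng.sample(others, min(num_splitters, len(others)))
    is_splitter = [False] * n
    for s in splitters:
        is_splitter[s] = True

    local = [0] * n        # prefix sum within the node's own sublist
    owner = [-1] * n       # splitter that starts the node's sublist
    sub_total = {}         # total of each sublist
    next_splitter = {}     # splitter that follows each sublist, or -1

    # Phase 1: walk each sublist independently (the parallel part on a GPU).
    for s in splitters:
        running = 0
        node = s
        while True:
            running += val[node]
            local[node] = running
            owner[node] = s
            node = succ[node]
            if node == -1 or is_splitter[node]:
                break
        sub_total[s] = running
        next_splitter[s] = node

    # Phase 2: exclusive scan over the much shorter list of splitters.
    offset = {}
    running = 0
    s = head
    while s != -1:
        offset[s] = running
        running += sub_total[s]
        s = next_splitter[s]

    # Phase 3: add each sublist's offset to its local prefix sums.
    return [local[i] + offset[owner[i]] for i in range(n)]
```

The result does not depend on which splitters are chosen; the random choice only affects how evenly the work is divided among the independent walks in phase 1.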

Fast Parallel Algorithm for Prefix Computation in Multi-Mesh Architecture
- Amit Datta,
- Mallika De, and
- Bhabani P. Sinha
Parallel Processing Letters, 01 Dec 2017
A parallel algorithm for prefix computation on $N$ data elements mapped on a Multi Mesh (MM) network of $N = n^{4}$ processing elements is presented here. The time required by the proposed algorithm is significantly less than that of any existing algorithm for prefix computation on mesh-like architectures, owing to the specific interconnection pattern used in the MM network. The proposed technique requires $O(N^{1/4})$ time for data communication and $O(\log N^{1/4})$ time for computation when mapped on an MM network consisting of $N^{1/2}$ meshes, each of size $N^{1/4} \times N^{1/4}$. The data communication time of the proposed algorithm is less than that of the prefix sum algorithm proposed for the extended Multi Mesh. To be precise, the proposed algorithm requires a data communication time of only $7.5 N^{1/4} \tau$, instead of $(13 N^{1/4} - 5) \tau$. Moreover, the proposed parallel algorithm does not need any of the extra inter-block links used in the extended Multi Mesh.
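The two communication-time expressions can be compared numerically. The sketch below only evaluates the formulas quoted in the abstract, in units of $\tau$, using $n = N^{1/4}$ (so $N = n^{4}$); the function names are hypothetical.

```python
def comm_time_extended_mm(n):
    """Communication time (13 n - 5), in units of tau, with n = N**(1/4)."""
    return 13 * n - 5


def comm_time_proposed(n):
    """Communication time 7.5 n, in units of tau, with n = N**(1/4)."""
    return 7.5 * n


def compare(n):
    """Return (N, extended Multi Mesh time, proposed time) for N = n**4."""
    return n ** 4, comm_time_extended_mm(n), comm_time_proposed(n)
```

For $n = 4$ ($N = 256$) this gives $47\tau$ against $30\tau$. Since $13n - 5 > 7.5n$ whenever $5.5n > 5$, the proposed bound is the smaller of the two for every $n \geq 1$.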