SpGEMM (general sparse matrix-matrix multiplication) is a core kernel in algebraic multigrid methods, graph algorithms, and linear equation solvers. Because some sparse matrices are highly non-uniform, existing parallel SpGEMM algorithms suffer from load imbalance, which degrades computational efficiency. This paper proposes a new algorithm, SPMSD (SpGEMM Based on Minimum Standard Deviation), built on a hash table and a partition strategy. First, the intermediate results of the matrix product are divided into multiple blocks by a new partition strategy that minimizes the standard deviation among blocks. Second, the input matrix is transformed according to the result of the partition. Finally, SPMSD performs the parallel SpGEMM computation, exploiting the fast insertion and access of the hash table, and controls the insertion and merging of intermediate results by offsets to avoid costly atomic operations. Experiments indicate that SPMSD executes up to 7.4x faster than the existing cuSPARSE library. Compared with the Out-of-Core method, SPMSD improves computational performance by 1.2x while decreasing memory usage by 0.19x.
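To make the partitioning and hash-table ideas concrete, here is a minimal Python sketch under stated assumptions: CSR inputs, a greedy contiguous splitter standing in for the paper's minimum-standard-deviation partition, and a plain dict standing in for the GPU hash table. All function names and the toy matrices are hypothetical, not the paper's exact SPMSD implementation.

    from statistics import pstdev

    def row_intermediate_counts(a_indptr, a_indices, b_indptr):
        # counts[i] = sum over nonzeros A[i,k] of nnz(B[k,:]) for CSR inputs;
        # this is the per-row count of intermediate products of C = A * B.
        counts = []
        for i in range(len(a_indptr) - 1):
            work = 0
            for idx in range(a_indptr[i], a_indptr[i + 1]):
                k = a_indices[idx]
                work += b_indptr[k + 1] - b_indptr[k]
            counts.append(work)
        return counts

    def partition_rows(counts, num_blocks):
        # Greedily cut rows into contiguous blocks of roughly equal work
        # (an illustrative stand-in for the minimum-stdev partition).
        target = sum(counts) / num_blocks
        blocks, current, acc = [], [], 0
        for row, c in enumerate(counts):
            current.append(row)
            acc += c
            if acc >= target and len(blocks) < num_blocks - 1:
                blocks.append(current)
                current, acc = [], 0
        blocks.append(current)
        return blocks

    def accumulate_row(i, a_indptr, a_indices, a_data,
                       b_indptr, b_indices, b_data):
        # Hash-table accumulation of one output row (a dict plays the
        # hash table; insertions and merges need no atomic operations
        # because each row is owned by one worker).
        acc = {}
        for idx in range(a_indptr[i], a_indptr[i + 1]):
            k, val = a_indices[idx], a_data[idx]
            for jdx in range(b_indptr[k], b_indptr[k + 1]):
                j = b_indices[jdx]
                acc[j] = acc.get(j, 0.0) + val * b_data[jdx]
        return acc

    # Toy 4x3 A and 3x3 B in CSR form.
    a_indptr, a_indices, a_data = [0, 2, 3, 5, 6], [0, 2, 1, 0, 1, 2], [1.0] * 6
    b_indptr, b_indices, b_data = [0, 2, 3, 6], [0, 1, 2, 0, 1, 2], [1.0] * 6
    counts = row_intermediate_counts(a_indptr, a_indices, b_indptr)  # [5, 1, 3, 3]
    blocks = partition_rows(counts, 2)                               # [[0, 1], [2, 3]]
    sums = [sum(counts[r] for r in blk) for blk in blocks]
    print(blocks, sums, "stdev:", pstdev(sums))                      # stdev: 0.0
    print(accumulate_row(0, a_indptr, a_indices, a_data,
                         b_indptr, b_indices, b_data))               # {0: 2.0, 1: 2.0, 2: 1.0}

Balancing the per-block totals of intermediate products, rather than nonzero counts, is what lets blocks with very different row sparsity patterns finish in similar time.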
In parallel systems, a number of joins from one or more queries can be executed either serially or in parallel. While serial execution assigns all processors to execute each join one after another, parallel execution distributes the joins to clusters of processors and executes them concurrently. However, data skew may cause load imbalance among processors executing the same join, and some clusters may be overloaded with more expensive joins. As a result, the completion time will be much longer than expected. In this paper, we propose an algorithm to further minimize the completion time of multiple concurrently executed joins. The algorithm decomposes all the joins to be executed concurrently into a set of tasks ordered by decreasing task size; these tasks are dynamically acquired by available processors during execution. Our performance study shows that the proposed algorithm outperforms previously proposed approaches, especially when the number of processors increases and the relations are large and highly skewed.
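As a rough illustration of this task-acquisition strategy, the Python sketch below simulates idle processors repeatedly grabbing the largest remaining task; when all task costs are known up front, this behaves like classic longest-processing-time (LPT) scheduling. The task sizes and function names are made up for illustration and are not from the paper.

    import heapq

    def schedule(task_sizes, num_procs):
        # Assign tasks, largest first, to whichever processor frees up
        # first; this models idle workers pulling from a sorted task list.
        finish = [(0.0, p) for p in range(num_procs)]  # (busy-until, proc id)
        heapq.heapify(finish)
        assignment = {p: [] for p in range(num_procs)}
        for size in sorted(task_sizes, reverse=True):
            busy_until, p = heapq.heappop(finish)
            assignment[p].append(size)
            heapq.heappush(finish, (busy_until + size, p))
        makespan = max(t for t, _ in finish)
        return assignment, makespan

    # Hypothetical task sizes from decomposing several skewed joins.
    tasks = [90, 70, 40, 35, 20, 15, 10, 5]
    assignment, makespan = schedule(tasks, 3)
    print(assignment, "completion time:", makespan)

A real implementation would estimate task sizes from relation statistics and let processors pull tasks from a shared queue at run time, so exact costs need not be known in advance; sorting by decreasing size keeps the small tasks available at the end to smooth out imbalance.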