Keyword: Pipelining : Search

Anywhere

Advanced Search

SEARCH GUIDE

Results: 1 - 4of4

Follow results:

refine search

Filters

per page:

Sort: Most Cited

Context for search term 1Search term 1*

All Dates

LastSelect static range

Custom Range

Select starting monthSelect starting year

Select ending monthSelect ending year

Advanced

Search name	Searched On	Run search
Keyword: Pipelining (4)	28 Mar 2025	Run
Keyword: Document Categorization (3)	28 Mar 2025	Run
Keyword: Accounting Transparency (1)	28 Mar 2025	Run
Keyword: Boundary Contour Method (1)	28 Mar 2025	Run
Keyword: Requirement Engineering (5)	28 Mar 2025	Run

articleNo Access
HIGH-PERFORMANCE MATHEMATICAL FUNCTIONS FOR SINGLE-CORE ARCHITECTURES
- MD. HAIDAR SHARIF
Journal of Circuits, Systems and Computers01 Apr 2014
Preview Abstract
Nowadays high-performance computing (HPC) architectures are designed to resolve assorted sophisticated scientific as well as engineering problems across an ever intensifying number of HPC and professional workloads. Application and computation of key trigonometric functions sine and cosine are in all spheres of our daily life, yet fairly time consuming task in high-performance numerical simulations. In this paper, we have delivered a detailed deliberation of how the micro-architecture of single-core Itanium^® and Alpha 21264/21364 processors as well as the manual optimization techniques improve the computing performance of several mathematical functions. On describing the detailed algorithm and its execution pattern on the processor, we have confirmed that the processor micro-architecture side by side manual optimization techniques ameliorate computing performance significantly as compared to not only the standard math library's built-in functions with compiler optimizing options but also Intel^® Itanium^® library's highly optimized mathematical functions.
articleNo Access
A Fault Tolerant Parallelism Approach for Implementing High-Throughput Pipelined Advanced Encryption Standard
- Hadi Mardani Kamali and
- Shaahin Hessabi
Journal of Circuits, Systems and Computers21 Jun 2016
Preview Abstract
Advanced Encryption Standard (AES) is the most popular symmetric encryption method, which encrypts streams of data by using symmetric keys. The current preferable AES architectures employ effective methods to achieve two important goals: protection against power analysis attacks and high-throughput. Based on a different architectural point of view, we implement a particular parallel architecture for the latter goal, which is capable of implementing a more efficient pipelining in field-programmable gate array (FPGA). In this regard, all intermediate registers which have a role for unrolling the main loop will be removed. Also, instead of unrolling the main loop of AES algorithm, we implement pipelining structure by replicating nonpipelined AES architectures and using an auto-assigner mechanism for each AES block. By implementing the new pipelined architecture, we achieve two valuable advantages: (a) solving single point of failure problem when one of the replicated parts is faulty and (b) deploying the proposed design as a fault tolerant AES architecture. In addition, we put emphasis on area optimization for all four AES main functions to reduce the overhead associated with AES block replication. The simulation results show that the maximum frequency of our proposed AES architecture is 675.62MHz, and for AES128 the throughput is 86.5Gbps which is 30.9% better than its closest existing competitor.
articleNo Access
Improved Synthesis of Generalized Parallel Counters on FPGAs Using Only LUTs
- Burhan Khurshid
Journal of Circuits, Systems and Computers23 Aug 2017
Preview Abstract
Generalized parallel counters (GPCs) are frequently used to construct high speed compressor trees on field programmable gate arrays (FPGAs). The introduction of fast carry-chain in FPGAs has greatly improved the performance of these elements. Evidently, a large number of GPCs have been proposed in literature that use a combination of look-up tables (LUTs) and carry-chains. In this paper, we take an alternate approach and try to eliminate the carry-chain from the GPC structure. We present a heuristic that aims at synthesizing GPCs on FPGAS using only the general LUT fabric. The resultant GPCs are then easily pipelined by placing registers at the output node of each LUT. We have used our heuristic on various GPCs reported in prior work. Our heuristic successfully eliminates the carry-chain from the GPC structure with an increase in LUT count in some GPCs. Experimentation using Xilinx FPGAs shows that filter systems constructed using our GPCs show an improvement in speed and power performance and a comparable area performance.
articleNo Access
Design and Implementation of Face Detection Architecture for Heterogeneous System-on-Chip
- Nidhi Panda and
- Supratim Gupta
Journal of Circuits, Systems and Computers13 Aug 2022
Preview Abstract
The seminal work of Viola and Jones for automatic face detection is widely used in many human–computer interaction and computer vision applications. On analyzing the existing face detection architectures, we observed that integral image calculation, feature computation in cascaded classifier, and recursive scanning of image with sliding window at multiple scales are the major reasons which increase the memory and time complexity of the algorithm. Therefore, in this paper, we have proposed a hardware–software co-design of Viola–Jones face detector for System-on-Chip (SoC). In the proposed architecture, integral image computation and cascaded classifier sub-modules are implemented on the hardware — Programmable Logic FPGA (PL-FPGA), while the image scaling and nonmaximum suppression sub-modules are implemented on the software — Processing System ARM (PS-ARM). Concepts of pipelining, folding, and parallel processing are effectively utilized to produce an optimum design architecture. The proposed architecture has been tested on PYNQ-Z1 board. The implementation results in a processing speed of $95$ fps with PL and PS clocks of $100$ MHz and $650$ MHz, respectively, for an image of QVGA resolution. Results analysis demonstrates that the proposed architecture has minimum resource requirement as compared to state-of-the-art implementations, which facilitates and promotes the usage of resource-constrained low-cost ZYNQ SoC for face detection.