
  • articleNo Access

    Automatic Data Layout Transformations in the ExaStencils Code Generator

    Performance optimizations should focus not only on the computations of an application, but also on the internal data layout. A well-known problem is whether a struct of arrays or an array of structs results in higher performance for a particular application. Even though switching from one to the other is fairly simple to implement, testing both layouts can become laborious and error-prone. Additionally, there are more complex data layout transformations, such as color splitting for multi-color kernels in the domain of stencil codes, that are difficult to implement by hand. As a remedy, we propose new flexible layout transformation statements for our domain-specific language ExaSlang that support arbitrary affine transformations. Since our code generator applies them automatically to the generated code, these statements enable the simple adaptation of the data layout without the need for any other modifications of the application code. This constitutes a big advance in the ease of testing and evaluating different memory layout schemes in order to identify the best.
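
The difference between the two layouts named in this abstract can be pictured as two affine index mappings over the same flat buffer. The sketch below is only an illustration of that idea, not ExaSlang syntax or the ExaStencils generator; the function names and the tiny data set are assumptions made for this example.

```python
# Illustrative only: array-of-structs (AoS) versus struct-of-arrays (SoA)
# expressed as affine index mappings. All names here are hypothetical.

def aos_index(i, c, n, ncomp):
    """Flat offset of component c of element i when elements are stored contiguously."""
    return i * ncomp + c


def soa_index(i, c, n, ncomp):
    """Flat offset of component c of element i when components are stored contiguously."""
    return c * n + i


def aos_to_soa(flat, n, ncomp):
    """Permute a flat AoS buffer with n elements of ncomp components into SoA order."""
    out = [None] * (n * ncomp)
    for i in range(n):
        for c in range(ncomp):
            out[soa_index(i, c, n, ncomp)] = flat[aos_index(i, c, n, ncomp)]
    return out
```

For three elements with two components each, `aos_to_soa([0, 10, 1, 11, 2, 12], n=3, ncomp=2)` groups the first components ahead of the second ones. Application code that always goes through an index function of this kind does not change when the layout does, which is the property the abstract attributes to generated code.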

  • articleNo Access

    Reconfigurable Hardware Generation of Multigrid Solvers with Conjugate Gradient Coarse-Grid Solution

    Field-programmable gate arrays (FPGAs) are an increasingly popular accelerator technology, and not only in the field of high-performance computing (HPC). However, they use a completely different programming paradigm and tool set compared to central processing units (CPUs) or even graphics processing units (GPUs), adding extra development steps and requiring special knowledge, which hinders widespread use in scientific computing. To bridge this programmability gap, domain-specific languages (DSLs) are a popular choice to generate low-level implementations from an abstract algorithm description. In this work, we demonstrate our approach for the generation of numerical solver implementations based on the multigrid method for FPGAs from the same code base that is also used to generate code for CPUs using a hybrid parallelization of MPI and OpenMP. Our approach yields a hardware design that can compute up to 11 V-cycles per second with an input grid size of 4096×4096 and a solution on the coarsest level using the conjugate gradient (CG) method on a mid-range FPGA, beating vectorized, multi-threaded execution on an Intel Xeon processor.
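
For readers unfamiliar with the conjugate gradient method mentioned for the coarse-grid solve, the following is a minimal textbook sketch in plain Python. It is not the generated FPGA or CPU code from the paper; the 1D Poisson operator `poisson_matvec`, the tolerance, and the iteration limit are assumptions chosen only to keep the example self-contained.

```python
# Textbook conjugate gradient for a symmetric positive definite operator.
# Hypothetical helper names; not taken from the paper's code generator.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def poisson_matvec(v):
    """Apply the 1D Poisson stencil (-1, 2, -1) with zero boundary values."""
    n = len(v)
    return [
        2.0 * v[i] - (v[i - 1] if i > 0 else 0.0) - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b, where matvec(v) returns A v, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

Passing the operator as a function rather than a stored matrix mirrors how stencil codes usually apply an operator, although that design choice is made here for brevity and is not a claim about the paper's implementation.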

  • articleNo Access

    CONSTRUCTION OF DO LOOPS FROM SYSTEMS OF AFFINE CONSTRAINTS

    Most parallelization techniques for DO loop nests are based on reindexation. Reindexation yields a new iteration space, which is a convex integer polyhedron defined by a set of affine constraints. Parallel code generation thus needs to scan all the integer points of this convex polyhedron, thereby requiring the construction of a new DO loop nest. We detail an algorithm for this purpose, which relies on a parametrized version of the Dual Simplex. We show how the resulting loop nest and especially the loop bounds can be kept simple, thus reducing the control overhead of parallelization to a minimum.

  • articleNo Access

    Communication Generation for Block-Cyclic Distributions

    Data-parallel languages such as High Performance Fortran, Vienna Fortran and Fortran D include directives such as alignment and distribution that describe how data and computation are mapped onto the processors in a distributed-memory multiprocessor. A compiler for HPF that generates code for each processor has to compute the sequence of local memory addresses accessed by each processor and the sequence of sends and receives for a given processor to access non-local data. In this paper, we present a novel approach for the generation of communication sets that exploits a pattern of send-receive index pairs. In addition, we present an algorithm for code generation. Experimental results demonstrate the viability of this technique.
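
As background for the local-address computation this abstract refers to, a block-cyclic distribution deals out fixed-size blocks of a global index range to processors in round-robin order. The sketch below shows only that standard mapping; it does not reproduce the paper's communication-set algorithm, and the function and parameter names are assumptions.

```python
# Standard block-cyclic mapping from a global index to (owner, local index).
# Hypothetical names; zero-based indices are assumed throughout.

def owner_and_local(g, block, nprocs):
    """Return the owning processor and the local address of global index g."""
    blk = g // block                      # which block the element falls into
    proc = blk % nprocs                   # blocks are dealt round-robin
    local = (blk // nprocs) * block + g % block
    return proc, local


def local_indices(proc, n, block, nprocs):
    """List the global indices in range(n) that are owned by processor proc."""
    return [g for g in range(n) if owner_and_local(g, block, nprocs)[0] == proc]
```

With ten elements, a block size of 2 and two processors, processor 0 owns global indices 0, 1, 4, 5, 8 and 9. A compiler would derive such sequences in closed form instead of filtering the whole range as `local_indices` does here.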

  • articleNo Access

    On Tiling as a Loop Transformation

    This paper is a follow-up to Irigoin and Triolet's earlier work and our recent work on tiling. In this paper, tiling is discussed in terms of its effects on the dependences between tiles, the dependences within a tile and the required dependence test for legality. A necessary and sufficient condition is given for enforcing the data dependences of the program, while Irigoin and Triolet's atomic tile constraint is only sufficient. A condition is identified under which both Irigoin and Triolet's and our constraints are equivalent. The results of this paper are discussed in terms of their impact on dependence abstractions suitable for legality test and on tiling to optimise a certain given goal.
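
To make the transformation itself concrete, the sketch below tiles a rectangular two-dimensional iteration space and records the visiting order. It assumes rectangular tiles and performs no dependence test, so it illustrates only the reordering whose legality the paper analyzes; the function names are hypothetical.

```python
# Rectangular tiling of a 2D loop nest, recording the order of iterations.
# No legality check is done; names and tile shapes are assumptions.

def untiled(n, m):
    """Iteration order of the original loop nest."""
    return [(i, j) for i in range(n) for j in range(m)]


def tiled(n, m, ti, tj):
    """Iteration order after tiling with ti-by-tj tiles."""
    order = []
    for ii in range(0, n, ti):                       # loops over tiles
        for jj in range(0, m, tj):
            for i in range(ii, min(ii + ti, n)):     # loops within one tile
                for j in range(jj, min(jj + tj, m)):
                    order.append((i, j))
    return order
```

Both functions visit the same set of points, and only the order differs. Whether the new order preserves the program's data dependences is the question the legality conditions in the abstract address.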

  • articleNo Access

    RESOURCE-FOCUSED TOOLCHAIN FOR RAPID PROTOTYPING OF EMBEDDED SYSTEMS

    This paper introduces the RaPTEX toolchain and its use for rapid prototyping and evaluation of embedded communication systems. This toolchain is unique for several reasons. First, by using static code analysis techniques, it is able to predict both the typical case and bounds for resource usage, such as computational, memory (both static and dynamic), and energy requirements. Second, it provides a graphical user interface with configurable software building blocks which allows easy creation and customization of protocol stacks. Third, it targets low-cost, low-energy hardware, allowing the creation of low-cost systems. We demonstrate the RaPTEX toolchain by evaluating different design options for an experimental ultrasonic communication system for biotelemetry in extremely shallow waters. The power, size, mass, and cost constraints of this application make it critical to pack as much processing into the available resources as possible. The RaPTEX toolchain analyzes resource use, enabling the system to safely operate closer to the edge of the resource envelope. The toolchain also helps users with the rapid prototyping of communication protocols by providing users with quick feedback on resource requirements. We demonstrate the use and output of the toolchain. We compare the accuracy of its predictions against measurements of the real hardware.

  • articleNo Access

    AUTOMATIC PARALLEL CODE GENERATION FOR NUFFT DATA TRANSLATION ON MULTICORES

    The nonuniform FFT (NuFFT) is widely used in many applications. Focusing on the most time-consuming part of the NuFFT computation, the data translation step, in this paper, we develop an automatic parallel code generation tool for data translation targeting emerging multicores. The key components of this tool are two scalable parallelization strategies, namely, the source-driven parallelization and the target-driven parallelization. Both these strategies employ equally sized geometric tiling and binning to improve data locality while trying to balance workloads across the cores through dynamic task allocation. They differ in the partitioning and scheduling schemes used to guarantee mutual exclusion in data updates. This tool also consists of a code generator and a code optimizer for the data translation. We evaluated our tool on a commercial multicore machine for both 2D and 3D inputs under different sample distributions with large data set sizes. The results indicate that both parallelization strategies have good scalability as the number of cores and the number of dimensions of data space increase. In particular, the target-driven parallelization outperforms the other when samples are nonuniformly distributed. The experiments also show that our code optimizations can bring about 32%–43% performance improvement to the data translation step of NuFFT.

  • articleNo Access

    Linking Design Model with Code

    With the growth in size and complexity of modern computer systems, the need for improving quality at all stages of software development has become a critical issue. Current software production is largely dependent on manual code development. Apart from the slow development process, the errors introduced by programmers contribute a substantial portion of the defects in the final software product. Model-driven engineering (MDE), despite having many advantages, is often overlooked by programmers due to a lack of proper understanding and training in the matter. This paper investigates the advantages and disadvantages of MDE and looks at research results showing the adoption rates of design models. It analyzes different tools used for automated code generation and presents the reasons that led to technical decisions such as the programming language or design model used. In light of the findings, an educational tool, namely Lorini, was developed to provide automated code generation from design models. The implemented tool consists of a plug-in for the Astah framework aimed at teaching Java programming to students through UML diagrams. It features instantaneous code generation from three types of UML diagrams, code-diagram matching, a feedback panel for error displays and on-the-fly compilation and execution of the resulting program. We also explore the possibility of generating assertion constraints from the design model and using them to verify the implementation. Evaluation of the tool indicated it to be successful, with unique educational features, and intuitive to use.

  • articleNo Access

    Native Code Generation as a Service

    With the widespread use of mobile applications in daily life, it has become crucial for enterprise software companies to quickly develop these applications for multiple platforms. Cross-platform mobile application development is one of the most adopted solutions for rapid development. Since most of these solutions do not generate native code for the underlying platform, the artefacts generally do not satisfy the requirements defined at the beginning of the project. This study designed and implemented a native code generation framework called Nativator built as a cloud service. The framework, which is capable of producing native code for iOS and Android platforms using web-based user interfaces, was implemented based on an open source compiler platform called “Roslyn”. Four case studies were performed to analyze the execution performance of the applications built with the proposed framework. The experimental results demonstrated that the execution performance of the applications built with Nativator is comparable with the applications generated via the state-of-the-art mobile application development framework called Xamarin. Because this framework was implemented as a cloud service, it has several advantages over traditional approaches such as access from anywhere, no installation and flexible and more resources from cloud infrastructure.

  • articleNo Access

    Code Generation with Hybrid of Structural and Semantic Features Retrieval

    Due to the growing need for faster software delivery, code generation has attracted more and more attention, since it could improve code maintainability by providing suggestions for coding. In the model of generating program source code from natural language (NL), the most effective method is to generate an intermediate architecture (such as an Abstract Syntax Tree) combined with a deep learning model. However, these models have the following drawbacks: (1) The structural information of the data is underutilized and the correlation between samples is not considered. (2) They lack the ability to memorize large and complex structures, so that complex code cannot be generated correctly. To address these issues, we propose the HRCODE model, a code generation architecture based on Hybrid of structural and semantic features Retrieval CODE model. We transform the NL description into an intermediate structure with structural features. Then, the NL and the intermediate structure are embedded into a vector through weight mixing, and we calculate the similarity score between each vector to retrieve the most relevant samples. Finally, the new input is brought into the PLBART model to generate code. Experiments show that HRCODE is at least 4.7% higher than the state-of-the-art models in the ACC metric and at least 10.3% higher in the BLEU-4 score. We have released our code at https://github.com/jesokang/HRCODE.

  • articleNo Access

    Generation of C++ Code from Isabelle/HOL Specification

    Automatic code generation plays an important role in ensuring the reliability and correctness of software programs. Reliable programs can be obtained automatically from verified program specifications by code generators. The target languages of the existing code generators are mainly functional languages, which are less widely used than C/C++. Because C/C++ is widely used in industry and in many fundamental software facilities, and because verifying the correctness of C/C++ programs is difficult and cumbersome, this paper provides an automatic conversion framework that generates C++ implementations from verified Isabelle/HOL specifications. The framework is characterized by combining the verification convenience of Isabelle/HOL and the efficiency of C++. Since the correctness of the functional Isabelle/HOL specification can be guaranteed by interactive proofs, the correctness of the generated C++ implementation can also be maintained.

  • articleNo Access

    CodeGen-Search: A Code Generation Model Incorporating Similar Sample Information

    Code generation has a positive significance in supporting software development, reducing labor intensity, and improving development efficiency. Some scholars use similar code information to enhance the quality of code generation. However, to improve the efficiency and accuracy of programming in daily development tasks, developers often search for similar samples as references. They get the code’s syntactic structure and semantic information from similar samples to assist in programming development. Inspired by this, we argue that similar samples are helpful for code generation. This paper proposes a CodeGen-Search model to improve code generation quality by incorporating similar samples. To fully utilize the information of similar samples, the model adopts the “pre-training + fine-tuning” pattern. The model uses a minimum edit distance algorithm to find some similar samples with natural language (NL), and uses different encoders to extract the features of the NL and the code in similar samples. Experimental results show that our model efficiently improves the quality of the generated code. Compared to the state-of-the-art model, the CodeGen-Search model improves the BLEU by 1.5%, the Rough by 0.8% on the HS dataset, and the StrAcc by 0.5% on the ATIS dataset.
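
The retrieval step described in this abstract relies on a minimum edit distance between natural-language descriptions. The sketch below uses the classic Levenshtein dynamic program over whitespace-separated tokens to pick the closest samples; the tokenization, the `most_similar` helper and the example strings are assumptions made for illustration and are not taken from the CodeGen-Search implementation.

```python
# Levenshtein distance and a naive nearest-sample lookup.
# Hypothetical helpers; the paper's actual retrieval pipeline may differ.

def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))        # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,              # delete x
                curr[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),   # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def most_similar(query, corpus, k=1):
    """Return the k corpus descriptions closest to query in token edit distance."""
    q = query.split()
    return sorted(corpus, key=lambda s: edit_distance(q, s.split()))[:k]
```

The function accepts any two sequences, so the same routine works on characters or on tokens. Sorting the whole corpus is adequate for a small illustration, whereas a real system would likely need an index or a cheaper pre-filter.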

  • articleNo Access

    Assessing the Use of GitHub Copilot on Students of Engineering of Information Systems

    This study examines the impact of AI programming assistants like GitHub Copilot and ChatGPT on software engineering efficiency, an area that has seen limited empirical research. We experimentally evaluated the performance of programmers (n=16) in Python coding tasks with and without AI assistance, measuring time-to-completion and feature implementation. Results indicate that participants utilizing AI assistance completed tasks significantly faster (p = 0.033) and implemented more required features (p = 0.012) compared to those relying solely on unaided coding. These findings offer empirical insights into the integration of AI tools in software development workflows, highlighting their potential to enhance efficiency without compromising code quality or completeness, with implications for organizational pipelines and practitioner skills. Responses to exit surveys suggest that participants without AI tool assistance encountered frustrations related to code recall, time constraints, and problem-solving, while assisted participants reported no negative experiences, focusing instead on successful completion of tasks within the allotted time.

  • articleNo Access

    MAPPING REFERENCE CODE TO IRREGULAR DSPS WITHIN THE RETARGETABLE, OPTIMIZING COMPILER COGEN(T)

    Generating high quality code for embedded processors is made difficult by irregular architectures and highly encoded parallel instructions. Rather than dealing with the target machine at every stage of the compilation, a promising new methodology employs generic algorithms to optimize code for an idealized abstraction of the true target machine. This code, called reference code, is then mapped to the real instruction set by enhanced genetic algorithms. One perturbs the original schedule to find a number of alternative (parallel) instruction sequences, and the other evolves feasible register assignments, if possible, for each sequence. This paper describes the strategy for mapping idealized code into actual code. The COGEN(T) system employs this methodology to produce good code for different commercial DSPs and ASIPs.

  • articleNo Access

    A MEMETIC ALGORITHM FOR PERFORMING MEMORY ASSIGNMENT IN DUAL-BANK DSPS

    To increase memory bandwidth, many programmable Digital-Signal Processors (DSPs) employ two on-chip data memories. This architectural feature supports higher memory bandwidth by allowing multiple data memory accesses to occur in parallel. Exploiting dual memory banks, however, is a challenging problem for compilers. This, in part, is due to the instruction-level parallelism, small numbers of registers, and highly specialized register capabilities of most DSPs. In this paper, we present a new methodology based on a Memetic Algorithm (MA) for assigning data to dual-bank memories. Our approach is global, and integrates several important issues in memory assignment within a single model. Special effort is made to identify those data objects that could potentially benefit from an assignment to a specific memory, or perhaps duplication in both memories. Our computational results show that the MA is able to achieve a 54% reduction in the number of memory cycles and a reduction in the range of 7%–42% in the total number of cycles when tested with well-known DSP kernels and applications. Our computational results also show that, when compared with the Genetic Algorithm in Ref. 3, the memetic algorithm is able to find solutions that, on average, have 7%–20% less cost, with the biggest improvements being found for larger problem instances.