To combat PCB design reverse engineering and protect intellectual property, an obfuscation method is developed. Noise-generating traces are placed alongside the true circuit, separated from it by false non-conductive vias that are visually indistinguishable from regular ones. If the false vias are copied as real conductive vias in illegitimate replications, signals are distorted, leading to poor performance. The deliberate placement of the generators and their separating false vias is tested using a machine learning algorithm (MLA) that employs a signal integrity database to determine the most vulnerable traces in the PCB based on their physical properties and interconnections.
We report the reconstruction of the topology of the gene regulatory network in human tissues. The results show that the connectivity of the regulatory gene network is characterized by a scale-free distribution. This result supports the hypothesis that scale-free networks may represent the common blueprint for gene regulatory networks.
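As a brief reminder of what a scale-free topology implies (the standard textbook form, not an exponent fitted in this study), the probability that a gene has k regulatory connections follows a power law,
\[ P(k) \sim k^{-\gamma}, \qquad \gamma \ \text{typically between } 2 \text{ and } 3, \]
in contrast to the sharply decaying Poisson degree distribution of a random network.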
It has long been of interest to control the transfer of population between specified quantum states while protecting the coherence of the system at the same time. In this paper, we investigate a scheme to improve the strategy of state transfer for open quantum systems using no-knowledge measurement-based feedback control and reverse engineering. In order to ensure that the system can process information effectively, we first design the control pulse in advance from the perspective of population and coherence and then verify it through numerical simulations. The research results show that, based on the designed control pulse, we can indeed drive the system from any initial state to the desired target state, and the coherence of the system can be effectively protected during the state transfer.
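For context, the dynamics of such an open quantum system are commonly modeled by a master equation in Lindblad form (the generic form is shown below as an assumed baseline; the feedback-modified equation actually used in the paper may differ):
\[ \dot{\rho} = -\frac{i}{\hbar}\,[H(t),\rho] + \sum_k \gamma_k \Big( L_k \rho L_k^{\dagger} - \tfrac{1}{2}\{ L_k^{\dagger} L_k, \rho \} \Big), \]
where the designed control pulse enters through the Hamiltonian H(t), the operators L_k describe coupling to the environment, and the quality of the transfer is judged by the target-state population and the surviving coherences of the state ρ.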
This research revolves around the ventilation system of range hoods and discusses the optimization of the flow field from the perspectives of reverse engineering (RE), the Taguchi method (TM), and computer-aided engineering (CAE). These were integrated to develop an impeller system with an optimized air discharge volume. Analyses show that the arc length of the blades and the deflection angle of the impellers are the most prominent factors affecting the volume of the air discharge. A maximum air discharge volume was achieved during the experiment with 54 blades and a deflection angle of 65°, which is an enhancement of 17.5% in comparison with that of the original design. In an environment with shortened product life cycles, the results from this research are expected to improve the air discharge efficiency of impellers and effectively reduce the time needed for the development of ventilation systems.
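To illustrate the Taguchi side of such an optimization, the sketch below computes the larger-the-better signal-to-noise ratio used to rank factor settings; the run data and factor levels are invented for illustration and are not the study's measurements.

    import math

    def sn_larger_is_better(values):
        # Taguchi larger-the-better S/N ratio: -10 * log10( mean(1 / y^2) )
        return -10.0 * math.log10(sum(1.0 / (y * y) for y in values) / len(values))

    # Hypothetical runs: (number of blades, deflection angle in degrees) -> repeated air-volume readings
    runs = {
        (48, 55): [10.2, 10.4],
        (54, 65): [12.1, 12.3],
        (60, 75): [11.0, 10.8],
    }

    # The setting with the highest S/N ratio is preferred under the larger-the-better criterion
    for setting, measurements in sorted(runs.items(), key=lambda kv: -sn_larger_is_better(kv[1])):
        print(setting, round(sn_larger_is_better(measurements), 2))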
In this paper we present a method and the related tool for analysing Web site code in order to automatically reconstruct the underlying logical interaction design. Such design is represented through task models that describe how activities should be performed to reach users' goals. The models also include a specification of the objects that should be manipulated to accomplish such tasks. We also discuss how the result of this reverse engineering process can be provided as input to a number of tools for various purposes (model analysis, usability evaluation, user interface redesign for different interactive platforms).
Comprehension of an object-oriented (OO) system, its design and use of OO features such as aggregation, generalisation and other forms of association is a difficult task to undertake without the original design documentation for reference. In this paper, we describe the collection of high-level class metrics from the UML design documentation of five industrial-sized C++ systems. Two of the systems studied were libraries of reusable classes. Three hypotheses were tested between these high-level features and low-level class features, namely the number of methods and attributes in each class, across the five systems. A further two conjectures were then investigated to determine features of key classes in a system and to investigate any differences between library-based systems and the other systems studied in terms of coupling.
Results indicated that, for the three application-based systems, no clear patterns emerged for the hypotheses relating to generalisation. There was, however, a clear, statistically significant positive relationship in all three systems between aggregation, other types of association, and the number of methods and attributes in a class. Key classes in the three application-based systems tended to contain large numbers of methods, attributes, and associations, significant amounts of aggregation, but little inheritance. No consistent, identifiable key features could be found in the two library-based systems; both showed a distinct lack of any form of coupling (including inheritance) other than through the C++ friend facility.
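As a sketch of the kind of statistical test involved (with invented per-class counts, not the study's data), one could correlate aggregation relationships against method counts per class:

    from scipy.stats import spearmanr

    # Hypothetical per-class metrics: number of aggregation relationships and number of methods
    aggregations = [0, 1, 3, 5, 2, 7, 4]
    methods = [4, 6, 12, 18, 9, 25, 14]

    rho, p_value = spearmanr(aggregations, methods)
    # A positive rho with a small p-value would support the reported association
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")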
For software of nontrivial size and complexity, it is not feasible to perform architecture reconstruction manually. Therefore, it is essential for the software architecture miner, who recovers the architecture from existing software, to have a well-defined architecture reconstruction process that incorporates as much tool support as possible at the appropriate steps. Some software architecture reconstruction frameworks exist, but they do not provide guidelines on how to systematically utilize tools to produce architecture views for a reconstruction purpose. In this paper, we propose a framework for tool-based software architecture reconstruction. This framework consists of a generic process for software architecture reconstruction and the steps to derive from it a concrete tool-based process to be used for actual architecture reconstruction. The architecture miner can use this framework to analyze source code, both to modify it and to reconstruct the software architecture from it.
The quality of a software system highly depends on its architectural design. High-quality software systems typically apply expert design experience that has been captured as design patterns. As demonstrated solutions to recurring problems, design patterns help to reuse expert experience in software system design. They have been extensively applied in industry. Mining instances of design patterns from the source code of software systems can assist in understanding the systems and in re-engineering them. More importantly, it also helps to trace back to the original design decisions, which are typically missing in legacy systems. This paper presents a review of current techniques and tools for mining design patterns from the source code or design of software systems. We classify the different approaches and analyze their results in a comparative study. We also examine the disparity among the discovery results of different approaches and analyze possible reasons for it.
Developing and maintaining reliable object-oriented software requires a precise understanding of how individual classes must be used. Unfortunately, for many systems, especially those that are large, the available documentation is inadequate. Developers are left with incomplete information concerning the allowable set of call sequences that each class can accommodate. Techniques for reverse engineering this information and presenting it to developers in an intellectually scalable manner are critical.
In this paper, we present four contributions to address this challenge. First, we describe a runtime trace collection system for large C++ applications. Second, we present a methodology for reverse engineering interface protocols from collected trace data. Third, we present a scalable, tunable algorithm for generating compact specifications of these protocols. Finally, we present a detailed case study involving the Mozilla Necko library. We consider popular applications in common use constructed using this library. The results are promising both in terms of the performance of the approach and the utility of the identified protocols.
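A minimal sketch of the idea of mining call-sequence protocols from traces is shown below; it is not the paper's algorithm, and the method names and traces are hypothetical.

    from collections import defaultdict

    # Hypothetical per-object method-call traces collected at runtime
    traces = [
        ["open", "read", "read", "close"],
        ["open", "write", "close"],
    ]

    # Build a successor map: which call may directly follow which (a crude protocol specification)
    allowed_next = defaultdict(set)
    for trace in traces:
        for current_call, next_call in zip(trace, trace[1:]):
            allowed_next[current_call].add(next_call)

    def violates_protocol(sequence):
        # Flag any consecutive pair of calls never observed in the training traces
        return any(b not in allowed_next[a] for a, b in zip(sequence, sequence[1:]))

    print(violates_protocol(["open", "read", "close"]))   # False: every adjacent pair was observed
    print(violates_protocol(["read", "open"]))            # True: "open" never followed "read"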
One of the most powerful tools in the hacker's reverse engineering arsenal is the virtual machine. These systems provide a simple mechanism for executing code in an environment in which the program can be carefully monitored and controlled, allowing attackers to subvert copy protection and access trade secrets. One of the challenges for anti-reverse engineering tools is how to protect software within such an untrustworthy environment. From the perspective of a running program, detecting an emulated environment is not trivial: the attacker can emulate the result of different operations with arbitrarily high fidelity. This paper demonstrates a mechanism that is able to detect even carefully constructed virtual environments by focusing on the stochastic variation of system call timings. A statistical technique for detecting emulated environments is presented, which uses a model of normal system call behavior to successfully identify two commonly used virtual environments under realistic conditions.
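The sketch below illustrates the general flavor of timing-based detection, not the paper's statistical model: it repeatedly times a cheap system call and compares the observed distribution against a baseline with a two-sample test (the threshold and sample size are arbitrary assumptions).

    import os
    import time
    from scipy.stats import ks_2samp

    def sample_syscall_timings(n=2000):
        # Time a cheap system call repeatedly; emulation tends to perturb this timing distribution
        samples = []
        for _ in range(n):
            start = time.perf_counter_ns()
            os.getpid()
            samples.append(time.perf_counter_ns() - start)
        return samples

    baseline = sample_syscall_timings()   # in practice, recorded beforehand on known-native hardware
    observed = sample_syscall_timings()   # measured in the environment under test

    statistic, p_value = ks_2samp(baseline, observed)
    print("possible emulation" if p_value < 0.01 else "timings consistent with baseline")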
To better understand and exploit the knowledge necessary to comprehend and evolve an existing system, different models can be extracted from it. Models represent the extracted information at various abstraction levels and are useful to document, maintain, and reengineer the system. The Knowledge Discovery Metamodel (KDM) has been defined by the Object Management Group as a meta-model supporting a large share of reverse engineering activities. Its specification was also adopted by ISO in 2012. This paper explores and describes alternative meta-models proposed in the literature to support reverse engineering, program comprehension, and software evolution activities. We focus on the similarities and differences of the alternative meta-models with respect to KDM, trying to understand the potential for reciprocal information interchange. We describe KDM and five other meta-models, plus their extensions available in the literature and their diffusion in the reverse engineering community. We also investigate the approaches using KDM and the five meta-models. In the paper, we underline the limited reuse of models for reverse engineering and identify potential directions for future research to enhance the existing models and ease the exchange of information among them.
A substantial effort is generally required to understand the APIs of application frameworks. High-quality API documentation may alleviate this effort, but the production of such documentation still poses a major challenge for modern frameworks. To facilitate the production of framework instantiation documentation, we hypothesize that the framework code itself and the code of existing instantiations provide useful information. However, given the size and complexity of existing code, automated approaches are required to assist documentation production. Our goal is to assess an automated approach for constructing relevant documentation for framework instantiation based on source code analysis of the framework itself and of existing instantiations. The criterion for deciding whether the documentation is relevant is to compare it with traditional framework documentation, considering the time spent and correctness during instantiation activities, information usefulness, complexity of the activity, navigation, satisfaction, information localization, and clarity. We propose an automated approach for constructing such documentation based on source code analysis of the framework itself and of existing instantiations. The proposed approach generates documentation in a cookbook style, where the recipes are programming activities that use the necessary API elements, driven by the framework features. We performed an empirical study consisting of three experiments with 44 human subjects executing real framework instantiations, aimed at comparing the use of the proposed cookbooks to traditional manual framework documentation (the baseline). Our empirical assessment shows that the generated cookbooks performed better or, at least, with no significant difference when compared to the traditional documentation, evidencing the effectiveness of the approach.
In Model Driven Software Engineering (MDSE), the Action Language for Foundational UML (ALF) is a new standard for specifying the structure and behavior of a system textually. When existing systems must be updated or transformed to meet advancing business needs or changes in the underlying technology, this standard can play a vital role in reverse engineering a system for a technology change. In this paper, we propose an ALF-based reverse engineering approach for transforming object-oriented systems. Our work is the first attempt to use ALF in reverse engineering. Using a case study of significant size, an ATM system developed in C++, we validate the feasibility of our approach. To support our approach with a computer application, we created a tool, CPP2ALF, which converts C++ code to srcML using a third-party srcML tool and then generates ALF code from the resulting srcML representation.
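A minimal sketch of such a pipeline is given below, assuming the srcml command-line tool is installed and on the PATH; the input file name and the printed ALF stub are illustrative only and do not reproduce CPP2ALF's actual output.

    import subprocess
    import xml.etree.ElementTree as ET

    # Step 1: let the third-party srcML tool convert C++ into srcML XML
    subprocess.run(["srcml", "Atm.cpp", "-o", "Atm.xml"], check=True)

    # Step 2: walk the srcML tree and extract class names as a starting point for ALF generation
    tree = ET.parse("Atm.xml")
    for element in tree.iter():
        if element.tag.endswith("}class"):                 # srcML tags are namespace-qualified
            name = element.find("./{*}name")               # wildcard namespace (Python 3.8+)
            if name is not None and name.text:
                # A CPP2ALF-style tool would emit a full ALF definition here; we print a placeholder
                print(f"class {name.text} {{ /* ALF body generated from srcML */ }}")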
Binary code analysis is vital when source code is unavailable, as in malware analysis and software vulnerability mining. Its first step is often function identification. Most function identification methods are based on function prologs/epilogs. However, functions may not have standard prologs/epilogs. To identify these functions, other methods are needed. One approach is to identify return instructions first and then identify the start of a function. Currently, a multi-layer perceptron model is used to identify and validate a return instruction at a specific location. On this basis, a new approach is proposed to improve accuracy and provide more details. Specifically, a candidate return instruction is classified into three classes: (1) a false return instruction, (2) a true return instruction within a function but not its last instruction, and (3) a true return instruction at the end of a function. The evaluation is performed on 5782 real-world binaries. Meanwhile, common classifiers including a fully connected neural network, a Two-layer Bidirectional Recurrent Neural Network (TBRNN), a Two-layer Bidirectional Gated Recurrent Unit (TBGRU), a Two-layer Bidirectional Long Short-Term Memory network (TBLSTM), Decision Tree, Random Forest, XGBoost, and Support Vector Machine (SVM) are evaluated on the same data set. The results show that TBLSTM achieves an accuracy of 99.78%, higher than that of the other classifiers in the evaluation, including the state-of-the-art tool IDA Pro 7.7.
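For concreteness, a two-layer bidirectional LSTM classifier over a raw byte window around a candidate return instruction could look like the PyTorch sketch below; the hyperparameters, window size, and pooling choice are assumptions and may differ from the evaluated TBLSTM.

    import torch
    import torch.nn as nn

    class ReturnClassifier(nn.Module):
        # Two-layer bidirectional LSTM over a byte window, with a 3-way classification head
        def __init__(self, emb_dim=32, hidden=64):
            super().__init__()
            self.embed = nn.Embedding(256, emb_dim)            # one embedding per possible byte value
            self.lstm = nn.LSTM(emb_dim, hidden, num_layers=2,
                                bidirectional=True, batch_first=True)
            self.head = nn.Linear(2 * hidden, 3)               # classes: false / inner / end-of-function return

        def forward(self, byte_window):                        # byte_window: (batch, window_len) integer tensor
            out, _ = self.lstm(self.embed(byte_window))
            return self.head(out[:, -1, :])                    # logits taken from the final time step

    model = ReturnClassifier()
    dummy = torch.randint(0, 256, (4, 64))                     # 4 windows of 64 bytes around candidate returns
    print(model(dummy).shape)                                  # torch.Size([4, 3])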
As software is increasingly used to control safety-critical systems, correctness becomes paramount. Formal methods provide many benefits in the forward engineering aspect of software development. Reverse engineering is the process of constructing a high-level representation of a system from existing lower-level instantiations of that system. Reverse engineering of program code into formal specifications makes the benefits of formal methods available in projects where formal methods may not have previously been used, thus facilitating the maintenance of safety-critical systems.
Understanding the behavior of distributed applications is a very challenging task due to the complexity of these applications. To manage this complexity, the top-down use of suitable abstraction hierarchies is frequently proposed. Given the complexity of distributed applications, manually deriving such abstraction hierarchies is not realistic. The execution of distributed applications is typically analyzed using an event-based approach. This paper discusses a tool that automatically groups primitive events into abstract events to derive a hierarchy of abstract events. Ideally, these abstractions should reveal logical units of an application and their relations. To explore the derived abstraction hierarchies, an existing prototype visualization tool was modified to provide abstract visualizations. A user can navigate through these abstraction hierarchies, displaying an execution at various levels of abstraction. Examples of such abstract visualizations are given and discussed. In general, the automatically derived abstractions represent meaningful parts of the application: they can be interpreted in terms of the application domain. While the abstraction tool does not necessarily derive the best possible abstraction hierarchies in all cases, it performs the bulk of the work and provides good initial abstractions that can subsequently be refined manually.
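As a toy illustration of grouping primitive events into abstract events (a deliberately crude first-level rule, not the tool's actual abstraction algorithm), consecutive events of the same process can be collapsed into a single abstract event:

    from itertools import groupby

    # Hypothetical primitive event stream: (process, action) pairs in execution order
    events = [
        ("P1", "send"), ("P2", "recv"), ("P2", "compute"), ("P2", "compute"),
        ("P2", "send"), ("P1", "recv"), ("P1", "compute"),
    ]

    # Collapse consecutive events of the same process into one abstract event
    abstract_events = [(process, [action for _, action in run])
                       for process, run in groupby(events, key=lambda e: e[0])]
    print(abstract_events)
    # [('P1', ['send']), ('P2', ['recv', 'compute', 'compute', 'send']), ('P1', ['recv', 'compute'])]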
Querying source code interactively for information is a critical task in reverse engineering of software. However, current source code query systems succeed in handling only small subsets of the wide range of queries possible on code, trading generality and expressive power for ease of implementation and practicality. We attribute this to the absence of clean formalisms for modeling and querying source code. In this paper, we present an algebraic framework (Source Code Algebra or SCA) that forms the basis of our source code query system. The benefits of using SCA include the integration of structural and flow information into a single source code data model, the ability to process high-level source code queries (command-line, graphical, relational, or pattern-based) by expressing them as equivalent SCA expressions, the use of SCA itself as a powerful low-level source code query language, and opportunities for query optimization. We present the SCA’s data model and operators and show that a variety of source code queries can be easily expressed using them. An algebraic model of source code addresses the issues of conceptual integrity, expressive power, and performance of a source code query system within a unified framework.
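The toy sketch below conveys the spirit of querying code algebraically, with selection and projection as composable operators over a relation of code entities; the data model and operator names are invented for illustration and are much simpler than the actual SCA.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Function:
        name: str
        calls: tuple     # names of called functions
        lines: int

    # A toy "source code relation"
    functions = [
        Function("parse", ("lex", "report_error"), 120),
        Function("lex", (), 80),
        Function("report_error", ("log",), 15),
    ]

    # Algebra-style building blocks: selection and projection as composable operators
    def select(predicate):
        return lambda relation: [t for t in relation if predicate(t)]

    def project(attribute):
        return lambda relation: [getattr(t, attribute) for t in relation]

    # Query: names of functions longer than 100 lines that call report_error
    query = lambda rel: project("name")(select(lambda f: f.lines > 100 and "report_error" in f.calls)(rel))
    print(query(functions))   # ['parse']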
Program understanding can be enhanced using reverse engineering technologies. The understanding process is heavily dependent on both individuals and their specific cognitive abilities, and on the set of facilities provided by the program understanding environment. Unfortunately, most reverse engineering tools provide a fixed palette of extraction, selection, and organization techniques. This paper describes a programmable approach to reverse engineering. The approach uses a scripting language that enables users to write their own routines for common reverse engineering activities, such as graph layout, metrics, and subsystem decomposition, thereby extending the capabilities of the reverse engineering toolset to better suit their needs. A programmable environment supported by this approach subsumes existing reverse engineering systems by being able to simulate facets of each one.
A technique is presented for reverse engineering graphical user interfaces (GUIs) from X toolkit source code. Two independent graphical representations are automatically generated to assist GUI programmers in the development, testing, maintenance, and reengineering of X-based GUI source code. This capability to generate both structural and behavioral views has the potential to provide major improvements in the comprehensibility of X source code. Whereas generating widget instance trees to describe the structure of an X interface is common, the automatic generation of dialogue state diagrams to describe the behavior of an X interface is unique to our technique. The intent of this paper is to provide insight into the functional details of our automated reverse engineering process for the benefit of other reverse engineering researchers and programming tool developers.
Integrating application domain knowledge into reverse engineering is an important step toward overcoming the shortcomings of conventional reverse engineering approaches, which are based exclusively on information derivable from source code. In this paper, we show the basic concepts of a program transformation process from a conventional to an object-oriented architecture that incorporates extraneous higher-level knowledge. We discuss to what degree this knowledge might stem from general domain knowledge and to what extent it needs to be introduced as application-dependent knowledge by a human expert. The paper discusses these issues in the context of the architectural transformation of legacy systems to an object-oriented architecture.