Reducing the variance between the expectation and execution of software processes is an essential activity in software development, and causal analysis is a conventional means of detecting problems in the software process. However, identifying the problems of software development may require significant effort. Defect prevention stops problems before they occur, thus lowering the effort required for defect detection and correction. A prediction model, built from the actions already performed, is a conventional means of predicting problems in subsequent process actions. This study proposes a novel approach that applies intertransaction association rule mining techniques to the records of performed actions in order to discover the patterns that are likely to cause high-severity defects. The discovered patterns can then be applied to predict which subsequent actions may result in high-severity defects.
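As an illustration, here is a minimal sketch of the windowed pattern-mining idea, assuming hypothetical record fields `action` and `high_severity`: a sliding window turns a time-ordered action log into extended transactions, and action combinations are scored by how often they precede a high-severity defect. This is a simplification of intertransaction association rule mining, not the paper's exact algorithm.

```python
from collections import Counter
from itertools import combinations

def mine_intertransaction_patterns(records, window=3, min_support=0.1):
    """Slide a window over time-ordered action records, treat each window as
    one extended transaction, and count action combinations whose window ends
    in a high-severity defect."""
    pattern_counts, defect_counts = Counter(), Counter()
    n_windows = max(len(records) - window + 1, 1)
    for i in range(n_windows):
        span = records[i:i + window]
        actions = tuple(sorted({r["action"] for r in span}))
        for size in range(1, len(actions) + 1):
            for pattern in combinations(actions, size):
                pattern_counts[pattern] += 1
                if span[-1]["high_severity"]:
                    defect_counts[pattern] += 1
    # Keep frequent patterns; report the fraction of windows ending in a defect.
    return {p: defect_counts[p] / c
            for p, c in pattern_counts.items()
            if c / n_windows >= min_support}
```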
Real-world software systems are becoming larger, more complex, and much more unpredictable. Software systems face many risks in their life cycles. Software practitioners strive to improve software quality by constructing defect prediction models using metric (feature) selection techniques. Finding faulty components in a software system can lead to a more reliable final system and reduce development and maintenance costs. This paper presents an empirical study of six commonly used filter-based software metric rankers and our proposed ensemble technique, which aggregates the rank ordering of the features (by mean or median), applied to three large software projects using five commonly used learners. Classification accuracy was evaluated in terms of the AUC (Area Under the ROC (Receiver Operating Characteristic) Curve) performance metric. Results demonstrate that the ensemble technique performed better overall than any individual ranker and was also more robust. The empirical study also shows that variations among rankers, learners, and software projects significantly impacted the classification outcomes, and that the ensemble method can smooth out these performance variations.
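The rank-aggregation step itself is simple; the following sketch combines the orderings produced by several rankers by mean (or median) position. The function name and input format are illustrative, not from the paper.

```python
import numpy as np

def ensemble_rank(rankings, use_median=False):
    """rankings: list of orderings of the same feature names, each sorted from
    most to least important. Returns features sorted by aggregate rank."""
    features = rankings[0]
    # Rows: features; columns: the position each ranker assigned to that feature.
    rank_matrix = np.array([[ordering.index(f) for ordering in rankings]
                            for f in features])
    agg = np.median(rank_matrix, axis=1) if use_median else rank_matrix.mean(axis=1)
    return [f for _, f in sorted(zip(agg, features))]

# Example: three rankers ordering the same three metrics.
print(ensemble_rank([["loc", "cc", "fanout"],
                     ["cc", "loc", "fanout"],
                     ["loc", "fanout", "cc"]]))  # ['loc', 'cc', 'fanout']
```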
Software defect prediction is an acknowledged approach for achieving better product quality and for better utilizing the resources needed for that purpose. One known method for predicting the number of defects is to apply case-based reasoning (CBR). In this paper, different attribute weighting techniques for CBR-based defect prediction are analyzed. One of the weighting techniques used in this work, Sensitivity Analysis based on Neural Networks (SANN), derives weights from a sensitivity analysis of each attribute's impact within a trained neural network. Neural networks are applicable when there are non-linear and complicated relationships among the attributes. Since weighting plays a key role in the CBR model, the choice of weight calculation method can substantially affect the results. The results of SANN are compared with those obtained using uniform weights and weights derived from Multiple Linear Regression (MLR).
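At its core, CBR prediction with attribute weights is a weighted nearest-neighbor lookup. The sketch below shows that core under the assumption of a weighted Euclidean distance; the weight vector could come from SANN, from MLR coefficients, or be uniform.

```python
import numpy as np

def cbr_predict(train_X, train_y, query, weights, k=3):
    """Predict a module's defect count as the mean over its k most similar
    past modules under a weighted Euclidean distance."""
    dists = np.sqrt((weights * (train_X - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return train_y[nearest].mean()

# Toy usage with uniform weights over three metrics.
X = np.array([[10.0, 2.0, 1.0], [50.0, 8.0, 3.0], [12.0, 3.0, 1.0]])
y = np.array([0.0, 5.0, 1.0])
print(cbr_predict(X, y, np.array([11.0, 2.5, 1.0]), np.ones(3), k=2))  # 0.5
```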
The accuracy of the overall method is evaluated for each of the three weighting techniques over five data sets, comprising about 5000 modules from NASA. Two quality measures are applied: Average Absolute Error (AAE) and Average Relative Error (ARE). In addition to varying the weighting technique, the impact of varying the number of nearest neighbors is studied.
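For reference, a common formulation of these two measures over $n$ modules with actual defect counts $y_i$ and predictions $\hat{y}_i$ is shown below; the exact denominator convention may differ in the paper (the $+1$ is often added to guard against zero actual values):

$$\mathrm{AAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|, \qquad \mathrm{ARE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|\hat{y}_i - y_i\right|}{y_i + 1}$$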
The three main results of the empirical analysis are: (i) in the majority of cases, SANN achieves the most accurate results; (ii) uniform weighting performs better than the MLR-based weighting heuristic; and (iii) there is no significant preference pattern for defining the number of similar objects used for prediction in CBR.
Recent research has shown the value of social metrics for defect prediction. Yet many repositories lack the information required for a social analysis. So, what other means exist to infer how developers interact around their code? One option is static code metrics, which have already demonstrated their usefulness in analyzing change in evolving software systems. But do they also help in defect prediction? To address this question we selected a set of static code metrics to determine which classes are most "active" (i.e., the classes where developers spend much time interacting with each other's design and implementation decisions) in 33 open-source Java systems that lack details about individual developers. In particular, we assessed the merit of these activity-centric measures in the context of "inspection optimization", a technique for reading the fewest lines of code in order to find the most defects. For the task of inspection optimization these activity measures perform nearly as well as (usually within 4% of) a theoretical upper bound on the performance of any set of measures. As a result, we argue that activity-centric static code metrics are an excellent predictor for defects.
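Inspection optimization can be evaluated with a simple cumulative curve: sort classes by the activity measure and track the fraction of defects found per fraction of code read. The sketch below assumes each class is given as a hypothetical (activity_score, loc, defects) triple.

```python
def inspection_curve(classes):
    """classes: iterable of (activity_score, loc, defects) triples.
    Rank classes by the activity measure, then report cumulative
    (fraction of LOC read, fraction of defects found) points."""
    ranked = sorted(classes, key=lambda c: c[0], reverse=True)
    total_loc = sum(c[1] for c in ranked)
    total_defects = sum(c[2] for c in ranked)
    loc_read = defects_found = 0
    curve = []
    for _, loc, defects in ranked:
        loc_read += loc
        defects_found += defects
        curve.append((loc_read / total_loc, defects_found / total_defects))
    return curve

# Here the top-ranked class holds 75% of the defects in 20% of the code.
print(inspection_curve([(0.9, 200, 6), (0.5, 300, 1), (0.1, 500, 1)]))
```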
The basic measurements for software quality control and management are the various project and software metrics collected at various stages of the software development life cycle. The software metrics may not all be relevant for predicting the fault proneness of software components, modules, or releases, which creates the need for feature (software metric) selection. The goal of feature selection is to find a minimum subset of attributes that can characterize the underlying data as well as, or even better than, the original data when all available features are considered. As an example of inter-disciplinary research (between data science and software engineering), this study is unique in presenting a large comparative study of wrapper-based feature (or attribute) selection techniques for building defect predictors. In this paper, we investigated thirty wrapper-based feature selection methods to remove irrelevant and redundant software metrics used for building defect predictors. These thirty wrappers vary based on the choice of search method (Best First or Greedy Stepwise), learner (Naïve Bayes, Support Vector Machine, and Logistic Regression), and performance metric (Overall Accuracy, Area Under ROC (Receiver Operating Characteristic) Curve, Area Under the Precision-Recall Curve, Best Geometric Mean, and Best Arithmetic Mean) used in the defect prediction model evaluation process. The models are trained using the three learners and evaluated using the five performance metrics. The case study is based on software metrics and defect data collected from a real-world software project.
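To make the wrapper idea concrete, here is a minimal sketch of forward greedy-stepwise selection using scikit-learn, with Naïve Bayes as the learner and AUC as the evaluation metric (two of the study's choices); the stopping rule and cross-validation setup are assumptions, not the paper's exact protocol.

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def greedy_stepwise_wrapper(X, y, learner=None, scoring="roc_auc", cv=5):
    """Forward selection over the columns of a NumPy array X: repeatedly add
    the metric that most improves cross-validated AUC; stop when none helps."""
    learner = learner or GaussianNB()
    remaining = list(range(X.shape[1]))
    selected, best = [], 0.0
    while remaining:
        scores = {f: cross_val_score(learner, X[:, selected + [f]], y,
                                     scoring=scoring, cv=cv).mean()
                  for f in remaining}
        f, score = max(scores.items(), key=lambda kv: kv[1])
        if score <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best
```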
The results demonstrate that Best Arithmetic Mean is the best performance metric to use within the wrapper. Naïve Bayes performed significantly better than Logistic Regression and Support Vector Machine as a wrapper learner on the slightly and less imbalanced datasets. We also recommend Greedy Stepwise as the search method for wrappers. Moreover, compared to models built with the full datasets, the performance of defect prediction models can be improved when metric subsets are selected through a wrapper-based subset selector.
Cross-project defect prediction trains a prediction model using historical data from source projects and applies the model to target projects. Most previous efforts assumed that the cross-project data share the same metric set, meaning that both the metrics used and the size of the metric set are identical. However, this assumption may not hold in practical scenarios. In addition, software defect datasets have a class-imbalance problem, which increases the difficulty for the learner to predict defects. In this paper, we advance canonical correlation analysis by deriving a joint feature space for associating cross-project data. We also propose a novel support vector machine algorithm that incorporates the correlation transfer information into classifier design for cross-project prediction. Moreover, we take different misclassification costs into consideration, inclining the classifier to label a module as defective and thereby alleviating the impact of imbalanced data. The experimental results show that our method is more effective compared to state-of-the-art methods.
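The cost-sensitive step can be illustrated independently of the correlation transfer machinery. The sketch below assumes source and target data have already been mapped into a joint feature space (the arrays here are synthetic stand-ins) and uses scikit-learn's class weighting to penalize missed defective modules more heavily; the paper's own SVM formulation additionally incorporates correlation transfer information.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_joint_source = rng.normal(size=(100, 5))        # stand-in joint-space features
y_source = (rng.random(100) < 0.2).astype(int)    # ~20% defective: imbalanced
X_joint_target = rng.normal(size=(30, 5))

# A higher weight on class 1 makes missing a defective module costlier than a
# false alarm, inclining the classifier toward predicting "defective".
clf = SVC(kernel="rbf", class_weight={0: 1.0, 1: 10.0})
clf.fit(X_joint_source, y_source)
predictions = clf.predict(X_joint_target)
```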
Defect prediction aims to estimate software reliability by learning from historical defect data. Cross-company defect prediction (CCDP) is a practical approach that trains a prediction model on one or multiple projects of a source company and then applies the model to the target company. Unfortunately, large amounts of irrelevant cross-company (CC) data usually make it difficult to build a CCDP model with high performance. To address this issue, this paper proposes a data filtering method based on agglomerative clustering (DFAC) for CCDP. First, DFAC combines within-company (WC) instances and CC instances and groups them with an agglomerative clustering algorithm. Second, DFAC selects the subclusters that contain at least one WC instance and collects the CC instances in those subclusters into a new CC dataset. Experimental results on 15 public PROMISE datasets show that, compared with existing data filtering methods, DFAC increases pd, reduces pf, and achieves a higher G-measure.
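The two-step filter maps directly onto a few lines of scikit-learn; the sketch below is an assumption-laden rendering (the cluster count and default linkage are illustrative, not the paper's settings).

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def dfac_filter(wc_X, cc_X, n_clusters=10):
    """Step 1: cluster WC and CC instances together.
    Step 2: keep only CC instances whose cluster contains >= 1 WC instance."""
    X = np.vstack([wc_X, cc_X])
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X)
    wc_clusters = set(labels[:len(wc_X)])
    cc_labels = labels[len(wc_X):]
    return cc_X[np.isin(cc_labels, list(wc_clusters))]
```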
We present a study of 600 Java software networks with the aim of characterizing the relationship between their defectiveness and community metrics. We analyze the community structure of such networks, defined as their topological division into subnetworks of densely connected nodes. A high density of connections represents a higher level of cooperation between classes, so a well-defined division into communities could indicate that the software system has been designed in a modular fashion, with all its functionalities well separated. We show how the community structure can be an indicator of well-written, high-quality code by retrieving the communities of the analyzed systems and by scoring their division into communities with the standard network metric known as modularity. We found that the software systems with the highest modularity possess the majority of bugs, and we tested whether this result is related to some confounding effect. We found two power laws relating the maximum defect density to two different metrics: the number of detected communities inside a software network and the clustering coefficient. We also found a linear correlation between the clustering coefficient and the number of communities. Our results can be used to make predictive hypotheses about the software defectiveness of future releases of the analyzed systems.
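Community detection and modularity scoring of this kind are readily reproduced with networkx; the sketch below uses a built-in toy graph as a stand-in for a real class-dependency network, and greedy modularity maximization as one of several possible detection algorithms.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# Stand-in for a software network whose nodes are classes and whose
# edges are static dependencies between them.
G = nx.karate_club_graph()

communities = greedy_modularity_communities(G)
Q = modularity(G, communities)
print(f"{len(communities)} communities, modularity Q = {Q:.3f}")
```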
Software metrics (features or attributes) are collected during the software development cycle. Metric selection is one of the most important preprocessing steps in the process of building defect prediction models and may improve the final prediction result. However, the addition or removal of program modules (instances or samples) can alter the subsets chosen by a feature selection technique, rendering the previously selected feature sets invalid. Very limited research has been done considering both stability (or robustness) and defect prediction model performance together in the software engineering domain, despite the importance of both aspects when choosing a feature selection technique. In this paper, we test the stability and classification performance of eighteen feature selection techniques as the magnitude of change to the datasets and the size of the selected feature subsets are varied. All experiments were conducted on sixteen datasets from three real-world software projects. The experimental results demonstrate that Gain Ratio shows the least stability while two different versions of ReliefF show the most stability, followed by the PRC- and AUC-based threshold-based feature selection techniques. Results also show that the signal-to-noise ranker performed moderately in terms of robustness and was the best ranker in terms of model performance. Finally, we conclude that while stability and classification performance are correlated for some rankers, this is not true for others, and therefore performance according to one scheme (stability or model performance) cannot be used to predict performance according to the other.
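Stability itself can be quantified by comparing the feature subsets selected from perturbed versions of the data; a Jaccard-style index, one of several used in the literature (the paper's exact measure may differ), is sketched below.

```python
from itertools import combinations

def jaccard_stability(subsets):
    """subsets: list of sets of selected feature names, one per perturbed
    dataset. Returns the average pairwise Jaccard similarity; 1.0 means
    the selector always picks the same features (perfectly stable)."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Example: three runs of a selector under instance perturbation.
print(jaccard_stability([{"loc", "cc"}, {"loc", "cc"}, {"loc", "fanout"}]))
```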
In order to predict software defects, this paper proposes a software defect prediction method based on complex network analysis. Most existing evaluation methods are based on undirected, unweighted networks, failing to reflect the real situation of complex software. First, the proposed method abstracts the software system as a directed, weighted network at class granularity. Then, based on the PageRank algorithm, the paper proposes the KeyNodeRank algorithm to calculate node importance in the global network; node importance can then be used to predict defects in the software system. Experimental results show that the proposed method achieves higher accuracy in predicting software defects. This is significant for locating software defects, testing, improving software quality, and software maintenance.
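For a sense of the underlying mechanism, plain PageRank over a directed, weighted class network is sketched below with networkx; note that this is standard PageRank, not the paper's KeyNodeRank variant, and the edge weights are invented for illustration.

```python
import networkx as nx

# Toy class-dependency network; weights might encode call or coupling
# frequency (an assumption, since the paper's weighting scheme may differ).
G = nx.DiGraph()
G.add_weighted_edges_from([
    ("ClassA", "ClassB", 3.0),
    ("ClassB", "ClassC", 1.0),
    ("ClassC", "ClassA", 2.0),
    ("ClassD", "ClassB", 4.0),
])

importance = nx.pagerank(G, alpha=0.85, weight="weight")
# Classes with the highest scores would be flagged as most defect-prone.
for cls, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(cls, round(score, 3))
```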
Fuzzy classification plays an important role in predicting defects of software modules. In this paper, a fuzzy measure (FM) is used to improve prediction accuracy and capability by capturing all possible interactions among metrics, and the Choquet integral (CI) is applied for classification in n-dimensional space, automatically searching for the lowest misclassification rate based on distance. To implement the model, the unknown parameters must also be determined, which is done using a genetic algorithm (GA) on the training data. The proposed model is tested on four NASA software projects. The results indicate that the predictive performance of the proposed model is better than that of other prediction models.
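The discrete Choquet integral at the heart of such a classifier is compact to state; the sketch below assumes the inputs are already normalized to [0, 1] and that the fuzzy measure is supplied for every "tail" coalition the computation visits (in the paper, those values are what the GA learns).

```python
def choquet_integral(x, mu):
    """Discrete Choquet integral of inputs x (dict: metric -> value in [0, 1])
    with respect to a fuzzy measure mu (dict: frozenset of metrics -> weight).
    Sort inputs ascending and accumulate each value increment times the
    measure of the coalition of metrics still at or above that value."""
    items = sorted(x.items(), key=lambda kv: kv[1])
    names = [name for name, _ in items]
    total, prev = 0.0, 0.0
    for i, (_, value) in enumerate(items):
        total += (value - prev) * mu[frozenset(names[i:])]
        prev = value
    return total

# Two interacting metrics: the full coalition has measure 1.0, the stronger
# metric alone 0.7, so the integral rewards their joint agreement.
x = {"loc": 0.4, "complexity": 0.9}
mu = {frozenset({"loc", "complexity"}): 1.0, frozenset({"complexity"}): 0.7}
print(choquet_integral(x, mu))  # 0.4*1.0 + (0.9 - 0.4)*0.7 = 0.75
```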