  • Article (No Access)

    FEEDFORWARD NEURAL NETWORK MODELS FOR HANDLING CLASS OVERLAP AND CLASS IMBALANCE

    This paper proposes a framework for training feedforward neural network models capable of handling class overlap and imbalance by minimizing an error function that compensates for such imperfections of the training set. A special case of the proposed error function can be used for training variance-controlled neural networks (VCNNs), which are developed to handle class overlap by minimizing an error function involving the class-specific variance (CSV) computed at their outputs. Another special case of the proposed error function can be used for training class-balancing neural networks (CBNNs), which are developed to handle class imbalance by relying on class-specific correction (CSC). VCNNs and CBNNs are compared with conventional feedforward neural networks (FFNNs), quantum neural networks (QNNs), and resampling techniques. The properties of VCNNs and CBNNs are illustrated by experiments on artificial data. Various experiments involving real-world data reveal the advantages offered by VCNNs and CBNNs in the presence of class overlap and class imbalance.
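
    As a rough illustration of the idea described in this abstract, the sketch below combines a per-class averaged squared error (so that a small class counts as much as a large one) with an optional penalty on the variance of the outputs within each class. This is not the error function from the paper, which the abstract does not state; the function name `class_aware_error`, the parameter `var_weight`, and the exact form of both terms are assumptions made for illustration only.

```python
import numpy as np

def class_aware_error(outputs, targets, labels, var_weight=0.0):
    """Hypothetical error: mean over classes of (MSE_c + var_weight * Var_c).

    Averaging per class (instead of per sample) keeps a rare class from being
    swamped by a frequent one; the variance term penalizes outputs that are
    spread out within a class. Both choices are illustrative assumptions.
    """
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    labels = np.asarray(labels)
    per_class = []
    for c in np.unique(labels):
        mask = labels == c
        mse_c = np.mean((outputs[mask] - targets[mask]) ** 2)
        # Variance of the network outputs over the samples of class c.
        var_c = np.mean(np.var(outputs[mask], axis=0))
        per_class.append(mse_c + var_weight * var_c)
    return float(np.mean(per_class))

# Three majority samples predicted correctly, one minority sample missed.
labels = np.array([0, 0, 0, 1])
targets = np.array([0.0, 0.0, 0.0, 1.0])
outputs = np.array([0.0, 0.0, 0.0, 0.0])
plain_mse = float(np.mean((outputs - targets) ** 2))
balanced = class_aware_error(outputs, targets, labels)
```

    In this toy case the plain mean squared error is 0.25, while the per-class average is 0.5, because the single missed minority sample accounts for half of the total rather than a quarter.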

  • Article (No Access)

    Improved Overlap-based Undersampling for Imbalanced Dataset Classification with Application to Epilepsy and Parkinson’s Disease

    Classification of imbalanced datasets has attracted substantial research interest over the past decades. Imbalanced datasets are common in domains such as health, finance, and security. Most solutions for handling imbalanced datasets focus on the class distribution problem and aim to provide more balanced datasets by means of resampling. However, the existing literature shows that class overlap has a greater negative impact on the learning process than class distribution. In this paper, we propose overlap-based undersampling methods that maximize the visibility of minority class instances in the overlapping region. This is achieved by using soft clustering and an elimination threshold that adapts to the degree of overlap to identify and eliminate negative instances in the overlapping region. For more accurate clustering and detection of overlapped negative instances, the presence of the minority class in the borderline areas is emphasized by means of oversampling. Extensive experiments were carried out using simulated and real-world datasets covering a wide range of imbalance and overlap scenarios, including extreme cases. Results show a significant improvement in sensitivity and performance competitive with well-established and state-of-the-art methods.
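
    The general idea of removing negative (majority) instances from the overlapping region can be sketched as follows. This is a simplified stand-in, not the method from the paper: it computes fuzzy-c-means-style memberships against two fixed centers (the class means) instead of running an iterative soft clustering, it uses a fixed rather than an overlap-adaptive threshold, and it omits the borderline oversampling step. The names `soft_memberships` and `overlap_undersample` and the default `threshold=0.5` are assumptions.

```python
import numpy as np

def soft_memberships(X, centers, m=2.0, eps=1e-12):
    """Fuzzy-c-means-style membership of each row of X in each center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def overlap_undersample(X, y, minority=1, threshold=0.5):
    """Drop majority instances whose minority-side membership exceeds threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Assumption: class means serve as the two soft-cluster centers.
    centers = np.stack([X[y != minority].mean(axis=0),
                        X[y == minority].mean(axis=0)])
    u_minority = soft_memberships(X, centers)[:, 1]
    drop = (y != minority) & (u_minority > threshold)
    return X[~drop], y[~drop]

# One majority point (x = 9) sits in the middle of the minority class.
X = np.array([[0.0], [1.0], [2.0], [9.0], [8.0], [10.0]])
y = np.array([0, 0, 0, 0, 1, 1])
X_res, y_res = overlap_undersample(X, y)
```

    Only majority instances are ever removed, so every minority instance survives; in the toy data the majority point at x = 9 is dropped and the other five points are kept.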

  • Article (No Access)

    Response to Discussion on “Improved Overlap-Based Undersampling for Imbalanced Dataset Classification with Application to Epilepsy and Parkinson’s Disease”

    In the paper “Improved Overlap-Based Undersampling for Imbalanced Dataset Classification with Application to Epilepsy and Parkinson’s Disease,” the authors introduced two new methods that address the class overlap problem in imbalanced datasets. The methods identify and remove potentially overlapped majority class instances. Extensive evaluations were carried out on 136 datasets and compared against several state-of-the-art methods. Results showed performance competitive with those methods, and statistical tests showed a significant improvement in classification results. Fernández raised a discussion of the paper concerning the behavioral analysis of class overlap and the validation of the methods. This article responds to that discussion, providing detailed clarification and supporting evidence for each of the points raised.

  • Article (No Access)

    An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction

    The class overlap problem refers to instances from different categories overlapping heavily in the feature space. This issue is one of the challenges in improving the performance of software defect prediction (SDP). To date, studies on the impact of class overlap on SDP have mainly focused on within-project and cross-project defect prediction. Moreover, existing methods for cleaning class overlap instances are not suitable for cross-version defect prediction. In this paper, we propose a method for cleaning class overlap instances based on the Ratio of K-nearest neighbors with the Same Label (RKSL). This method removes instances with an abnormal neighbor ratio from the training set. Based on the RKSL method, we investigate the impact of class overlap on the performance and interpretability of cross-version defect prediction models. The experimental results show that class overlap can significantly affect the performance of cross-version defect prediction models. The RKSL method can handle the class overlap problem in defect datasets, but it may affect the interpretability of models. Through an analysis of feature changes, we find that cleaning class overlap instances can help models identify more important features.
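
    A minimal sketch of the neighbor-ratio idea: for each training instance, compute the fraction of its k nearest neighbors that share its label, then discard instances whose ratio is too low. The abstract does not say how an "abnormal" ratio is defined, so the fixed cutoff `threshold=0.5`, the Euclidean distance, and the function names `rksl_ratios` and `rksl_clean` are all assumptions for illustration.

```python
import numpy as np

def rksl_ratios(X, y, k=5):
    """Fraction of each instance's k nearest neighbors that share its label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)               # an instance is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (y[nearest] == y[:, None]).mean(axis=1)

def rksl_clean(X, y, k=5, threshold=0.5):
    """Keep instances whose same-label neighbor ratio is at least threshold.

    The cutoff is a hypothetical stand-in for the paper's abnormality rule.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = rksl_ratios(X, y, k) >= threshold
    return X[keep], y[keep]

# Two well-separated groups, plus one label-1 point inside the label-0 group.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
              [10, 10], [10, 11], [11, 10], [11, 11]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])
ratios = rksl_ratios(X, y, k=3)
X_clean, y_clean = rksl_clean(X, y, k=3, threshold=0.5)
```

    Here the label-1 point at (0.5, 0.5) has no same-label neighbors among its three nearest and is removed, while the eight remaining instances are kept.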