
  Bestsellers

  • Article (No Access)

    dSubSign: Classification of Instance-Feature Data Using Discriminative Subgraphs as Class Signatures

    Applications like identifying customers from their peculiar purchase patterns require class-wise discriminative feature subsets, called class signatures, for classification. If classifiers such as KNN or SVM, which must work with the complete feature set, are applied to such applications, the full feature set may introduce classification errors. A decision tree classifier generates class-wise prominent feature subsets and can therefore be employed for such applications. However, all of these classifiers fail to model the relationships between the features of vector data. We therefore propose to model features and their interrelationships as graphs. Graphs occur naturally in protein molecules, chemical compounds, etc., for which several graph classifiers exist; multivariate data, however, do not exhibit a natural graph structure. The proposed work thus focuses on (1) modeling multivariate data as graphs and (2) obtaining class-wise prominent subgraph signatures, which are then used to train classifiers such as SVM for decision making. The proposed method, dSubSign, can also classify multivariate data with missing values without performing imputation or case deletion. Performance analysis on both real-world and synthetic datasets shows that the accuracy of dSubSign is higher than or comparable to that of existing methods.
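
    The abstract does not spell out how a row of instance-feature data becomes a graph, so the following Python sketch shows one plausible mapping: nodes for observed features, edges weighted by co-activation. The function name row_to_graph, the threshold parameter, and the purchase-pattern example are illustrative assumptions, not dSubSign's actual construction. Note how missing entries (NaN) simply drop out of the graph, consistent with the claim that no imputation or case deletion is needed.

        import numpy as np

        def row_to_graph(row, feature_names, threshold=0.0):
            """One plausible instance-to-graph mapping (an assumption, not
            the paper's): nodes are features with observed values above
            `threshold`; every pair of active features gets an edge
            weighted by the product of their values."""
            active = [(name, v) for name, v in zip(feature_names, row)
                      if not np.isnan(v) and v > threshold]  # NaN entries drop out
            nodes = [name for name, _ in active]
            edges = {}
            for i in range(len(active)):
                for j in range(i + 1, len(active)):
                    (a, va), (b, vb) = active[i], active[j]
                    edges[(a, b)] = va * vb
            return nodes, edges

        # Example: a purchase-pattern row with one missing value (NaN).
        names = ["bread", "milk", "beer", "nappies"]
        row = np.array([1.0, 0.5, np.nan, 2.0])
        print(row_to_graph(row, names))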

  • Article (No Access)

    USING COLLABORATIVE FILTERING FOR DEALING WITH MISSING VALUES IN NUCLEAR SAFEGUARDS EVALUATION

    Nuclear safeguards evaluation aims to verify that countries are not misusing nuclear programs for nuclear weapons purposes. Experts of the International Atomic Energy Agency (IAEA) carry out an evaluation process in which several hundred indicators are assessed according to information obtained from different sources, such as State declarations, on-site inspections, IAEA non-safeguards databases and other open sources. These assessments are synthesized hierarchically to obtain a global assessment. Much of the information related to nuclear safeguards, and many of its sources, is vague, imprecise and ill-defined, and the fuzzy linguistic approach has provided good results in dealing with such uncertainties in this type of problem. However, a new challenge in nuclear safeguards evaluation has attracted the attention of researchers: owing to the complexity and vagueness of the sources of information obtained by IAEA experts and the huge number of indicators involved, experts commonly cannot assess all of them, so missing values appear in the evaluation and can bias the safeguards results. This paper proposes a model based on collaborative filtering (CF) techniques to impute the missing values, and provides a trust measure that indicates the reliability of a nuclear safeguards evaluation carried out with imputed values.
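
    As a rough illustration of CF-based imputation in this setting, the sketch below fills one missing indicator assessment from the most similar rows, using cosine similarity over co-assessed indicators. The function cf_impute, the neighbourhood size k, and the trust measure (mean neighbour similarity) are assumptions for the sketch, not the paper's model.

        import numpy as np

        def cf_impute(R, i, j, k=3):
            # Impute missing entry R[i, j] from the k most similar rows that
            # assessed indicator j; similarity = cosine over co-assessed columns.
            sims = []
            for u in range(R.shape[0]):
                if u == i or np.isnan(R[u, j]):
                    continue
                mask = ~np.isnan(R[i]) & ~np.isnan(R[u])   # co-assessed indicators
                if not mask.any():
                    continue
                a, b = R[i, mask], R[u, mask]
                s = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
                if s > 0:
                    sims.append((s, u))
            top = sorted(sims, reverse=True)[:k]
            if not top:
                return np.nan, 0.0                         # no usable neighbours
            w = np.array([s for s, _ in top])
            v = np.array([R[u, j] for _, u in top])
            # Weighted average as the imputed value; mean similarity as a
            # naive stand-in for the paper's trust measure.
            return float(w @ v / w.sum()), float(w.mean())

        # Rows = assessed cases, columns = safeguards indicators on a [0, 1] scale.
        R = np.array([[0.9, 0.8, np.nan],
                      [0.8, 0.7, 0.6],
                      [0.2, 0.3, 0.1]])
        print(cf_impute(R, i=0, j=2))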

  • Article (No Access)

    Missing Values and Learning of Fuzzy Rules

    In this paper a technique is proposed for tolerating missing values in a fuzzy-rule-based classification system. The presented method is mathematically sound yet easy and efficient to implement. Three possible applications of this methodology are outlined: the classification of patterns with an incomplete feature vector, the completion of the input vector when a certain class is desired, and the training or automatic construction of a fuzzy rule set from incomplete training data. In contrast to a static replacement of the missing values, here the evolving model itself is used to predict the most plausible values for the missing attributes. Benchmark datasets are used to demonstrate the capability of the presented approach in a fuzzy learning environment.
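
    One classic way to tolerate a missing attribute in fuzzy rule evaluation is to treat it as fully possible, i.e. give it membership degree 1 in every fuzzy set so that it never blocks a rule. The sketch below illustrates that idea; the triangular membership functions and the temperature/humidity rule are invented for the example, and the paper's exact scheme may differ.

        def triangular(x, a, b, c):
            """Triangular membership function with peak at b."""
            if x is None:          # missing attribute: assume full possibility,
                return 1.0         # i.e. the value could lie anywhere in the set
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x < b else (c - x) / (c - b)

        def rule_activation(sample, antecedents):
            """Activation of one fuzzy rule = min over attribute memberships.
            `antecedents` maps attribute name -> (a, b, c) triangle parameters."""
            return min(triangular(sample.get(attr), *tri)
                       for attr, tri in antecedents.items())

        rule = {"temperature": (15, 25, 35), "humidity": (40, 60, 80)}
        # Missing humidity does not block the rule:
        print(rule_activation({"temperature": 22, "humidity": None}, rule))  # 0.7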

  • Article (No Access)

    Possibility Clustering Algorithm for Incomplete Data Based on a Deep Computing Model

    Clustering is an essential part of data analytics, including in Wireless Sensor Networks (WSNs), but it becomes problematic when data are insufficient, unavailable, or compromised. A solution is proposed here to tackle the instability of clusters caused by missing values. The underlying theory determines whether to incorporate an entity into a group when the evidence is neither clear nor sufficiently probable. A main issue is identifying the requirements for three forms of decision: including an object in a cluster, excluding an object from a cluster, or deferring the decision to include or exclude. Current studies do not adequately discuss threshold identification and implicitly use fixed threshold values. This work explores a game-theory-based Possibility Clustering Algorithm for Incomplete Data (PCA-ID) framework to address this problem. Specifically, a game is described in which the thresholds are determined by a balance between the clusters' precision and generality, and the thresholds thus calculated are used to make grouping decisions for uncertain objects. Experimental findings on deep learning datasets show that PCA-ID increases overall clustering quality considerably while maintaining precision comparable to similar systems.
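
    The three forms of decision the abstract describes correspond to a three-way assignment driven by two thresholds. The sketch below shows only that decision step; the fixed alpha and beta are placeholders for the game-theoretic equilibrium thresholds that PCA-ID actually computes.

        def three_way_assign(membership, alpha=0.7, beta=0.3):
            """Three-way (possibility) decision for one object and one cluster:
            include when membership >= alpha, exclude when membership <= beta,
            otherwise defer. In PCA-ID, alpha and beta would come from a
            game-theoretic trade-off between precision and generality; the
            fixed values here are placeholders."""
            if membership >= alpha:
                return "include"
            if membership <= beta:
                return "exclude"
            return "defer"

        for m in (0.9, 0.5, 0.1):
            print(m, three_way_assign(m))   # include, defer, exclude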

  • Article (No Access)

    Data discretization impact on deep learning for missing value imputation of continuous data

    In many areas of data analysis, for example machine learning, missing data is a common problem. Missing values must be addressed because they can adversely affect the accuracy and effectiveness of predictive models. This research investigates how data discretization affects deep learning methods for filling in the missing values of datasets with continuous features. The authors provide a method for imputing missing values using deep neural networks (DNNs), called extravagant expectation maximization-deep neural network (EEM-DNN). The approach first discretizes continuous features into separate intervals, which is justified by treating missing value imputation as a classification task in which the missing values constitute a distinct class. A DNN designed explicitly for imputation is then trained on the discretized data; expectation-maximization concepts are incorporated into the network architecture so that the network iteratively improves its imputation predictions. Comprehensive experiments on several datasets from different fields gauge the efficacy of the suggested strategy, comparing EEM-DNN with other imputation approaches, including traditional imputation techniques and deep learning methods without data discretization. The findings show that data discretization significantly enhances imputation accuracy: in terms of imputation accuracy and prediction performance on downstream tasks, EEM-DNN consistently outperforms the alternatives. The study also examines whether different discretization techniques affect the overall imputation process, finding that the trade-off between bias and variance in the imputed data depends on the discretization method selected. This highlights the importance of choosing a discretization approach suited to the properties of the dataset.
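
    The core idea, recasting imputation of a continuous feature as classification over discretized bins, can be sketched as follows. scikit-learn's KBinsDiscretizer and MLPClassifier serve as generic stand-ins for the paper's DNN, the EM-style iterative refinement is omitted, and the synthetic data and bin count are arbitrary.

        import numpy as np
        from sklearn.neural_network import MLPClassifier
        from sklearn.preprocessing import KBinsDiscretizer

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 3))                   # observed predictors
        y = X.sum(axis=1) + 0.1 * rng.normal(size=500)  # feature to impute

        # 1. Discretize the continuous feature into ordinal bins, turning
        #    imputation into a classification task (EEM-DNN's central idea).
        disc = KBinsDiscretizer(n_bins=8, encode="ordinal", strategy="quantile")
        bins = disc.fit_transform(y.reshape(-1, 1)).ravel().astype(int)

        # 2. Hide 20% of the values; train a small network to predict the bin.
        missing = rng.random(500) < 0.2
        clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                            random_state=0)
        clf.fit(X[~missing], bins[~missing])

        # 3. Impute each hidden value with the midpoint of its predicted bin.
        edges = disc.bin_edges_[0]
        pred = clf.predict(X[missing])
        imputed = (edges[pred] + edges[pred + 1]) / 2
        print("mean absolute imputation error:",
              np.abs(imputed - y[missing]).mean())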

  • Article (No Access)

    NOVEL ENSEMBLE TECHNIQUES FOR REGRESSION WITH MISSING DATA

    In this paper, we consider the problem of missing data and develop an ensemble-network model for handling it. The proposed method exploits the inherent uncertainty of the missing records to generate diverse training sets for the ensemble's networks. Specifically, we generate the missing values from their probability distribution function and repeat this procedure many times, creating a number of complete data sets. A network is trained on each of these data sets, yielding an ensemble of networks. Several variants are proposed, and we show analytically that one of them is superior to the conventional mean-substitution approach in the limit of a large training set. Simulation results confirm the general superiority of the proposed methods over conventional approaches.
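
    The procedure described, drawing the missing values from their distribution, building several completed data sets, and training one model per set, can be sketched in a few lines. Plain linear least squares stands in for the paper's networks, and the per-feature Gaussian used to draw the missing values is an assumption for the sketch.

        import numpy as np

        rng = np.random.default_rng(1)
        n, d, M = 200, 3, 10                      # M = ensemble size
        X = rng.normal(size=(n, d))
        y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
        X[rng.random(X.shape) < 0.15] = np.nan    # 15% of values missing

        mu = np.nanmean(X, axis=0)                # per-feature Gaussian estimates
        sd = np.nanstd(X, axis=0)

        models = []
        for _ in range(M):
            # Draw each missing entry from its feature's estimated Gaussian,
            # giving a different completed training set per ensemble member
            # (least squares stands in for the paper's networks).
            Xc = np.where(np.isnan(X), rng.normal(mu, sd, size=X.shape), X)
            w, *_ = np.linalg.lstsq(Xc, y, rcond=None)
            models.append(w)

        x_new = np.array([0.3, -1.0, 2.0])
        preds = [x_new @ w for w in models]
        print("ensemble mean:", np.mean(preds), "spread:", np.std(preds))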

  • Article (No Access)

    Missing value imputation in time series using Singular Spectrum Analysis

    This paper introduces a new algorithm for gap filling in univariate time series using singular spectrum analysis (SSA). In this algorithm, the data before the missing values and the data after them (in reverse order) are treated as two separate time series. Using the recurrent SSA forecasting algorithm, two estimates of the missing values are then obtained: one from the data before the gap and one from the data after it. Finally, using bootstrap resampling and a weighting scheme based on sample variances, these two estimates are combined into a single estimate of the missing values.
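
    One natural reading of "a weighting scheme based on sample variances" is inverse-variance weighting of the two forecasts, which the sketch below makes concrete. The forecast values and variances are assumed given here; in the paper, the variances would come from bootstrap resampling of the recurrent SSA forecasts.

        import numpy as np

        def combine_estimates(forward, backward, var_f, var_b):
            """Combine the forecast from the data before the gap (`forward`)
            with the reverse-order forecast from the data after it
            (`backward`) using inverse-variance weights: the more reliable
            (lower-variance) side gets the larger weight."""
            w_f = (1 / var_f) / (1 / var_f + 1 / var_b)
            return w_f * forward + (1 - w_f) * backward

        # Gap of three points: the less variable side dominates the estimate.
        fwd = np.array([2.1, 2.3, 2.6])
        bwd = np.array([1.9, 2.2, 2.7])
        print(combine_estimates(fwd, bwd, var_f=0.04, var_b=0.16))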

  • Chapter (No Access)

    THE ISSUE OF MISSING VALUES, THEIR PRESENCE AND MANAGEMENT: A RELEVANT DEMONSTRATION OF DATA ANALYSIS IN MARKETING USING CaRBS

    Missing values are often alleged to be an encumbrance on the effectiveness of data analysis. Whether or not their presence can be explained may be the issue; at the very least it should be acknowledged. This study discusses the extant issues surrounding the presence of missing values in data analysis, with particular attention to their management, including imputation. Following this discussion, the nascent Classification and Ranking Belief Simplex (CaRBS) system for data analysis (object classification) is presented, which has the distinction of not requiring any a priori management of the missing values present. Instead, they are treated as ignorant values and retained in the analysis, a facet of CaRBS associated with the notion of uncertain reasoning. A problem concerning the classification of standard and economy food products is considered, with knowledge of their nutrient levels used to discern them. The visualisation of the intermediate and final results offered by the CaRBS system clearly demonstrates the effects of the presence of missing values within an object classification context.
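
    CaRBS rests on evidential (Dempster-Shafer style) reasoning, where a missing value can be treated as total ignorance rather than being imputed or deleted. The toy sketch below shows why that works: combining any body of evidence with a vacuous mass function leaves it unchanged. The two-class frame {standard, economy} mirrors the food-product example, but the mass values and the use of plain Dempster combination are illustrative assumptions, not CaRBS's actual evidence construction.

        def dempster(m1, m2):
            """Dempster's rule on the frame {'S','E'} ('S' = standard,
            'E' = economy), with 'SE' denoting total ignorance."""
            focal = ("S", "E", "SE")
            inter = {("S","S"):"S", ("S","SE"):"S", ("SE","S"):"S",
                     ("E","E"):"E", ("E","SE"):"E", ("SE","E"):"E",
                     ("SE","SE"):"SE", ("S","E"):None, ("E","S"):None}
            m = {"S": 0.0, "E": 0.0, "SE": 0.0}
            conflict = 0.0
            for a in focal:
                for b in focal:
                    p = m1[a] * m2[b]
                    if inter[(a, b)] is None:
                        conflict += p          # contradictory evidence
                    else:
                        m[inter[(a, b)]] += p
            return {k: v / (1 - conflict) for k, v in m.items()}

        # One nutrient supports 'standard'; a missing nutrient contributes
        # pure ignorance (mass 1 on 'SE'), leaving the combined belief
        # unchanged -- no imputation or deletion needed.
        salt  = {"S": 0.6, "E": 0.1, "SE": 0.3}
        fibre = {"S": 0.0, "E": 0.0, "SE": 1.0}   # missing value
        print(dempster(salt, fibre))              # identical to `salt`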

  • Chapter (No Access)

    CHARACTERIZING AND COMPLETING NON-RANDOM MISSING VALUES

    Many methods deal with missing values, mainly focusing on their completion. However, they complete all missing values indifferently, regardless of origin, i.e., they assume that all missing values occur randomly in a dataset. In this paper, we show that many missing values do not stem from randomness. Using the relationships within the data, we define four types of missing values, characterizing each missing value individually. We claim that such a local characterization enables techniques that deal with missing values according to their origins. We then show how this typology is suited to completing missing values: it accounts for the non-random appearance of missing values and suggests values for their completion.
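
    The paper's four-type characterization cannot be reconstructed from the abstract alone, but the idea of using relationships within the data to judge whether missingness is random can be illustrated with a crude diagnostic: compare another variable's distribution where a column is missing versus where it is observed. The function below and its synthetic example are generic illustrations, not the authors' method.

        import numpy as np

        def missingness_link(X, col, other):
            """Standardized gap between `other`'s mean where `col` is
            missing and where it is observed. A large gap suggests the
            missingness of `col` is related to `other`, i.e. not missing
            completely at random."""
            miss = np.isnan(X[:, col])
            a, b = X[miss, other], X[~miss, other]
            pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
            return (a.mean() - b.mean()) / (pooled + 1e-12)

        rng = np.random.default_rng(2)
        X = rng.normal(size=(1000, 2))
        X[X[:, 1] > 1.0, 0] = np.nan    # column 0 missing when column 1 is high
        print(missingness_link(X, col=0, other=1))  # clearly non-zero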