In one-class classification one tries to describe a class of target data and to distinguish it from all other possible outlier objects. Obvious applications are areas where outliers are very diverse, or very difficult or expensive to measure, such as machine diagnostics or medical applications. In order to distinguish well between the target objects and the outliers, a good representation of the data is essential. The performance of many one-class classifiers depends critically on the scaling of the data and is often harmed by data distributions in (nonlinear) subspaces. This paper presents a simple preprocessing method which actively tries to map the data to a spherically symmetric cluster and is almost insensitive to data distributed in subspaces. It uses techniques from kernel PCA to rescale the data in a kernel feature space to unit variance. The transformed data can then be described very well by the Support Vector Data Description, which essentially fits a hypersphere around the data. The paper presents the method and some preliminary experimental results.
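A minimal sketch of the pipeline this abstract describes, assuming an RBF kernel and using scikit-learn's OneClassSVM (which, with an RBF kernel, is equivalent to SVDD) as a stand-in for the Support Vector Data Description; the toy data and all hyperparameters are illustrative, not taken from the paper:

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # toy target class in a linear subspace

# Map the data to a kernel feature space and rescale each retained component to unit variance.
kpca = KernelPCA(n_components=10, kernel="rbf", gamma=0.1)
Z = kpca.fit_transform(X_train)
scale = np.maximum(Z.std(axis=0), 1e-12)  # guard against degenerate components
Z_white = Z / scale

# SVDD fits a hypersphere around the data; OneClassSVM with an RBF kernel is an equivalent stand-in.
svdd = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(Z_white)

X_test = rng.normal(size=(20, 5))
Z_test = kpca.transform(X_test) / scale
print(svdd.predict(Z_test))  # +1 = accepted as target, -1 = rejected as outlier
```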
Since in the feature space each eigenvector is a linear combination of all the samples in the training set, the computational efficiency of KPCA-based feature extraction falls as the training set grows. In this paper, we propose a novel KPCA-based feature extraction method that assumes an eigenvector can be expressed approximately as a linear combination of a subset of the training samples ("nodes"). The new method selects maximally dissimilar samples as nodes, which allows the eigenvectors to retain the maximum amount of information about the training set. By using the feature-space distance between training samples to evaluate their dissimilarity, we devised a very simple and quite efficient algorithm to identify the nodes and to produce the sparse KPCA. The experimental results show that the proposed method also obtains high classification accuracy.
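A hedged sketch of the node-selection idea: greedy farthest-point selection using the kernel-induced feature-space distance, followed by an eigendecomposition over the selected nodes only. The exact selection rule, the omission of kernel centering, and all parameter values are assumptions for illustration, not the paper's algorithm as published:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def feature_space_dist2(X, Y, gamma):
    # ||phi(x) - phi(y)||^2 = k(x,x) - 2 k(x,y) + k(y,y); for an RBF kernel k(x,x) = k(y,y) = 1.
    return 2.0 - 2.0 * rbf_kernel(X, Y, gamma=gamma)

def select_nodes(X, n_nodes, gamma):
    # Each new node maximizes its minimum feature-space distance to the nodes chosen so far.
    idx = [0]
    d_min = feature_space_dist2(X, X[[0]], gamma).ravel()
    for _ in range(n_nodes - 1):
        nxt = int(np.argmax(d_min))
        idx.append(nxt)
        d_min = np.minimum(d_min, feature_space_dist2(X, X[[nxt]], gamma).ravel())
    return np.array(idx)

def sparse_kpca(X, n_nodes=50, n_components=10, gamma=0.1):
    # Eigendecompose the kernel matrix over the nodes only (centering omitted for brevity).
    nodes = X[select_nodes(X, n_nodes, gamma)]
    w, v = np.linalg.eigh(rbf_kernel(nodes, nodes, gamma=gamma))
    order = np.argsort(w)[::-1][:n_components]
    alphas = v[:, order] / np.sqrt(np.maximum(w[order], 1e-12))  # unit-norm feature-space eigenvectors
    return nodes, alphas

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
nodes, alphas = sparse_kpca(X)
features = rbf_kernel(X, nodes, gamma=0.1) @ alphas  # sparse KPCA projections for all samples
print(features.shape)
```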
In many-body physics, renormalization techniques are used to extract aspects of a statistical or quantum state that are relevant at large scale, or for low-energy experiments. Recent works have proposed that these features can be formally identified as those perturbations of the state whose distinguishability most resists coarse-graining. Here, we examine whether this same strategy can be used to identify important features of an unlabeled dataset. This approach indeed results in a technique very similar to kernel PCA (principal component analysis), but with a kernel function that is automatically adapted to the data, or "learned". We test this approach on handwritten digits, and find that the most relevant features are significantly better for classification than those obtained from a simple Gaussian kernel.
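The learned-kernel construction itself is not reproduced here; the following is only a sketch of the simple Gaussian-kernel baseline the abstract compares against, applied to scikit-learn's handwritten-digit dataset, with an illustrative gamma and an illustrative downstream classifier:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X / 16.0, y, random_state=0)

# Gaussian-kernel PCA features feeding a linear classifier; gamma and n_components are illustrative.
baseline = make_pipeline(
    KernelPCA(n_components=30, kernel="rbf", gamma=0.02),
    LogisticRegression(max_iter=1000),
)
baseline.fit(X_tr, y_tr)
print("test accuracy:", baseline.score(X_te, y_te))
```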
The ability to perceive and recognize complex environments is very important for a real autonomous robot. A new scene analysis method for mobile robots, based on multi-sonar-ranger data fusion and kernel principal component analysis (kernel-PCA), is put forward. The principles of classification by principal component analysis (PCA), kernel-PCA, and the BP neural network (NN) approach, which extract the eigenvectors corresponding to the k largest eigenvalues, are introduced briefly. The details of applying PCA, kernel-PCA, and the BP NN method to corridor scene analysis and classification for mobile robots based on sonar data are then discussed, and the experimental results of those methods are given. In addition, a corridor-scene classifier based on a BP NN is discussed. The experimental results of PCA, kernel-PCA, and the BP-NN-based methods are compared, and the robustness of those methods is also analyzed. The following conclusions are drawn: in corridor scene classification, the kernel-PCA method has an advantage over ordinary PCA, and the approaches based on BP NNs can also achieve satisfactory results; the robustness of kernel-PCA is better than that of the BP-NN-based methods.
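A hedged sketch of the kernel-PCA plus BP-neural-network pipeline on synthetic stand-in sonar data; the real sonar ranges, scene classes, and network architecture from the paper are not reproduced, and all values below are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_per_class, n_sonars = 100, 16
# Toy stand-in for corridor scenes: each class has a characteristic sonar range profile plus noise.
profiles = rng.uniform(0.5, 4.0, size=(4, n_sonars))  # 4 hypothetical scene types
X = np.vstack([p + 0.2 * rng.normal(size=(n_per_class, n_sonars)) for p in profiles])
y = np.repeat(np.arange(4), n_per_class)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Kernel-PCA keeps the components with the k largest eigenvalues; an MLP trained by
# backpropagation stands in for the BP neural network classifier.
model = make_pipeline(
    KernelPCA(n_components=8, kernel="rbf", gamma=0.1),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```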