Image anomaly detection is an application-driven problem where the aim is to identify novel samples that differ significantly from the normal ones. We propose the Pyramidal Image Anomaly DEtector (PIADE), a deep reconstruction-based pyramidal approach in which image features are extracted at different scale levels to better capture the peculiarities that help discriminate between normal and anomalous data. The features are dynamically routed to a reconstruction layer, and anomalies are identified by comparing the input image with its reconstruction. Unlike similar approaches, the comparison uses structural similarity and perceptual loss rather than a trivial pixel-by-pixel comparison. The proposed method performed on par with or better than state-of-the-art methods when tested on publicly available datasets such as CIFAR10, COIL-100 and MVTec.
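As a rough illustration of scoring an image against its reconstruction with structural similarity instead of a pixel-by-pixel difference, the sketch below computes a single-window (global) SSIM over flattened grayscale images and uses 1 - SSIM as the anomaly score. This is a simplification and not the PIADE implementation: the function names `global_ssim` and `anomaly_score` are hypothetical, the constants follow the commonly used SSIM defaults (k1 = 0.01, k2 = 0.03), and the perceptual-loss term mentioned in the abstract is omitted.

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two flattened grayscale images.

    Assumption: one global window instead of the usual sliding window.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    # Stabilising constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def anomaly_score(image, reconstruction):
    """Higher score means the reconstruction is structurally less similar."""
    return 1.0 - global_ssim(image, reconstruction)
```

A model trained only on normal images would be expected to reconstruct them well (score near 0) and to reconstruct anomalous images poorly (larger score); the decision threshold is application specific.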
The detection of damage in engineering structures by means of changes in their vibration response is called structural health monitoring (SHM). It is a promising field but presents fundamental challenges. Accurate theoretical models of the structure are generally infeasible, so data-based approaches are required. Moreover, only data from the undamaged condition are usually available, so the approach needs to be framed as novelty detection. Data are acquired from a network of sensors to measure local changes in the operating condition of the structures. In order to distinguish changes produced by damage from those caused by environmental conditions, several physically meaningful features have been proposed, most of them in the frequency domain. Nevertheless, multiple measurement locations and the absence of a principled criterion to select among the potentially damage-sensitive features increase data dimensionality. Since high dimensionality affects the effectiveness of damage detection, we evaluate the effect of a dimensionality reduction approach on the diagnostic accuracy of damage detection.
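The abstract does not name the dimensionality reduction approach it evaluates. As one possible illustration of the general idea, the sketch below assumes PCA fitted on undamaged-condition features only, followed by a squared Mahalanobis distance in the reduced space as the novelty score. The function names and the choice of PCA are assumptions made for this example, not the authors' method.

```python
import numpy as np


def fit_pca_novelty(train, n_components):
    """Fit PCA on undamaged-condition features (rows are observations)."""
    mean = train.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(train - mean, full_matrices=False)
    components = vt[:n_components]
    # Sample variance of the training data along each retained direction.
    variances = s[:n_components] ** 2 / (len(train) - 1)
    return mean, components, variances


def novelty_score(x, model):
    """Squared Mahalanobis distance of x in the reduced space."""
    mean, components, variances = model
    z = (x - mean) @ components.T
    return np.sum(z ** 2 / variances, axis=-1)
```

A threshold can then be taken from the training scores, for example a high percentile, and new measurements that exceed it are flagged as potential damage. Changes that lie entirely outside the retained directions are invisible to this score, which is one reason the number of components affects diagnostic accuracy.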
In one-class classification one tries to describe a class of target data and to distinguish it from all other possible outlier objects. Obvious applications are areas where outliers are very diverse or very difficult or expensive to measure, such as machine diagnostics or medical applications. To distinguish well between the target objects and the outliers, a good representation of the data is essential. The performance of many one-class classifiers depends critically on the scaling of the data and is often harmed by data distributed in (nonlinear) subspaces. This paper presents a simple preprocessing method that actively tries to map the data to a spherically symmetric cluster and is almost insensitive to data distributed in subspaces. It uses techniques from Kernel PCA to rescale the data in a kernel feature space to unit variance. The transformed data can then be described very well by the Support Vector Data Description, which essentially fits a hypersphere around the data. The paper presents the methods and some preliminary experimental results.
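The rescaling step described above can be sketched as kernel PCA followed by dividing each retained component by its standard deviation, so that the training data have identity covariance in the reduced kernel feature space. The code below is a minimal sketch under assumptions of our own: an RBF kernel with a hypothetical `gamma` parameter, a fixed number of retained components, and illustrative function names. It covers only the whitening step and does not implement the Support Vector Data Description.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def kernel_whiten(x, gamma=1.0, n_components=2):
    """Map training data to unit variance along the top kernel PCA directions."""
    n = x.shape[0]
    k = rbf_kernel(x, x, gamma)
    # Center the kernel matrix in feature space.
    h = np.eye(n) - np.ones((n, n)) / n
    kc = h @ k @ h
    eigvals, eigvecs = np.linalg.eigh(kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, alpha = eigvals[order], eigvecs[:, order]
    if lam.min() <= 1e-10:
        raise ValueError("retained eigenvalues must be positive")
    # Kernel PCA scores are kc @ alpha / sqrt(lam), with variance lam / n.
    # Dividing by sqrt(lam / n) rescales every component to unit variance.
    return kc @ alpha * (np.sqrt(n) / lam)
```

After this mapping a hypersphere is a much better fit to the target class than in the original space, because no direction dominates the distance to the center.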
Detecting anomalous patterns in data is a relevant task in many practical applications, such as defective item detection in industrial inspection systems, cancer identification in medical images, or attacker detection in network intrusion detection systems. This paper focuses on the detection of anomalous images, that is, images that visually deviate from a reference set of regular data. While anomaly detection has been widely studied in the context of classical machine learning, the application of modern deep learning techniques in this field is still limited. We propose a capsule-based network for anomaly detection in an extremely imbalanced, fully supervised context: we assume that anomaly samples are available, but their number is limited compared to regular data. By using a variant of the standard CapsNet architecture, we achieved state-of-the-art results on the MNIST, F-MNIST and K-MNIST datasets.
Because of the scarcity and diversity of outliers, it is very difficult to design a robust outlier detector. In this paper, we first propose to use the maximum margin criterion to sift unknown outliers, which demonstrates superior performance. However, the resultant learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. Therefore, we adapt the recently developed label generating technique, which efficiently solves a convex relaxation of the MIP problem of outlier detection. Specifically, we propose an effective procedure to find a largely violated labeling vector for identifying rare outliers from abundant normal patterns, and its convergence is also presented. Then, a set of largely violated labeling vectors is combined by multiple kernel learning methods to robustly detect outliers. In addition, to further enhance the efficacy of our outlier detector, we explore the use of the maximum volume criterion to measure the quality of separation between outliers and normal patterns. This criterion can be easily incorporated into our proposed framework by introducing an additional regularization. Comprehensive experiments on toy and real-world data sets verify that the outlier detectors using the two proposed criteria outperform existing outlier detection methods. Furthermore, our models are employed to detect corporate credit risk and demonstrate excellent performance.
This paper describes a new concept of "cluster outlier-ness". In order to quantify it, we propose a relative isolation score named group outlier factor (GOF). GOF is a score that is computed during a clustering process using self-organizing maps. The main difference between GOF and existing methods is that being an outlier is associated not with a single pattern but with a cluster. Thus, an outlier factor (OF) with respect to each cluster is computed for each new sample and compared to the GOF score associated with each cluster. OF is used as a novelty detection classifier. This approach makes it possible to identify meaningful outlier clusters and to detect novel patterns that previous approaches could not find. Experimental results and comparison studies show that the use of GOF appreciably improves the results in terms of cluster-outlier and novelty detection.
The spread of real-time applications has led to a huge amount of data shared between users. This vast volume of data rapidly evolving over time is referred to as a data stream. Clustering and processing such data poses many challenges to the data mining community. Indeed, traditional data mining techniques become infeasible for mining such a continuous flow of data, where characteristics, features, and concepts change rapidly over time. This paper presents a novel method for data stream clustering. In this context, major challenges of data stream processing are addressed, namely infinite length, concept drift, novelty detection, and feature evolution. To handle these issues, the proposed method uses the Artificial Immune System (AIS) meta-heuristic. The latter has been widely used for data mining tasks and has the adaptability required by data stream clustering algorithms. Our method, called AIS-Clus, is able to detect novel concepts using the performance of the learning process of the AIS meta-heuristic. Furthermore, AIS-Clus has the ability to adapt its model to handle concept drift and feature evolution for textual data streams. Experiments were performed on textual datasets, with efficient and promising results.
The Self-Organizing Map (SOM) is one of the most popular neural network methods. It is a powerful tool for the visualization and analysis of high-dimensional data in various engineering applications. The SOM maps the data onto a two-dimensional grid, which may be used as a basis for various kinds of visual approaches to clustering, correlation and novelty detection. In this chapter, we present novel methods that enhance SOM-based visualization in correlation hunting and novelty detection. These methods are applied to two industrial case studies: analysis of hot rolling of steel and of a continuous pulp process. Research software for fast development of SOM-based tools is briefly described.