In data cleaning, detecting and correcting corrupt, inaccurate, or irrelevant records in a record set is a tedious task. Outlier detection plays a significant role in data cleaning by removing the outliers that exist in the data. Considerable effort has gone into removing outliers, and one promising approach is to customize clustering models. Accordingly, this paper proposes a new outlier detection model based on enhanced k-means with outlier removal (E-KMOR), which assigns all outliers to a group naturally during the clustering process. To decide which points are outliers, a new intra-cluster distance evaluation is employed. The main contribution of this paper is the optimal selection of cluster centroids through a newly proposed hybrid optimization algorithm, termed the particle updated lion algorithm (PU-LA), which combines the concepts of the lion algorithm (LA) and particle swarm optimization (PSO). The proposed method is therefore named E-KMOR-PU-LA. Finally, the efficacy of the proposed E-KMOR-PU-LA model is demonstrated through a comparative analysis against conventional models in terms of runtime and accuracy.
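The idea of collecting outliers into their own group during clustering can be illustrated with a minimal sketch. This is not the E-KMOR-PU-LA method: the abstract does not specify the intra-cluster distance evaluation or the PU-LA centroid search, so the sketch assumes a plain distance threshold (`gamma` times the mean inlier distance) and takes the initial centroids as given. The function name, the `gamma` parameter, and the `OUTLIER` label are all hypothetical.

```python
import math

OUTLIER = -1  # hypothetical label for the shared outlier group


def kmeans_with_outlier_group(points, init_centers, gamma=3.0, max_iter=50):
    """Plain k-means in which far-away points go to a separate outlier group.

    Assumption (not from the paper): a point is an outlier when its distance
    to the nearest centroid exceeds gamma times the mean distance of the
    current inliers. gamma should be >= 1 so that some inlier always remains.
    """
    centers = [tuple(float(x) for x in c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        # Find the nearest centroid and its distance for every point.
        nearest = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
            nearest.append((j, dists[j]))

        # Threshold from the points that were inliers in the previous pass
        # (all points on the first pass).
        inlier_d = [d for i, (j, d) in enumerate(nearest)
                    if labels is None or labels[i] != OUTLIER]
        threshold = gamma * sum(inlier_d) / len(inlier_d)

        new_labels = [OUTLIER if d > threshold else j for j, d in nearest]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels

        # Recompute each centroid from its inlier members only.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers
```

Because outliers are excluded when centroids are recomputed, a single distant point no longer drags a centroid toward it, which is the motivation for treating outliers as a group inside the clustering loop rather than in a separate preprocessing step.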
The Savitzky–Golay (S-G) filter is a method of local polynomial regression, and iterative S-G filtering can be used to smooth out random noise and cloud-induced outliers in NDVI time series. The iteration produces a continuous approximation to the upper envelope of the NDVI time series. In this paper, the optimum length of the S-G filter was estimated based on Stein’s unbiased risk estimator theory when S-G filtering was conducted iteratively, and the reconstruction result is presented. Reconstruction experiments on simulated data and MODIS NDVI time series for the years 2010–2014 showed that the optimum-length S-G filter can outperform the fixed-bandwidth S-G filter.
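A rough sketch of iterative S-G filtering toward the upper envelope is given below. It makes several assumptions that are not taken from the paper: the window half-width `m` is fixed by hand (the paper instead estimates the optimum length from Stein's unbiased risk estimator, which is not reproduced here), the polynomial is quadratic, the edges are handled by reflection, and the iteration count is arbitrary. All function names are hypothetical.

```python
def sg_weights(m):
    """Quadratic (or cubic) S-G smoothing weights for a window of 2*m + 1 points."""
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3.0 * (3 * m * m + 3 * m - 1 - 5 * i * i) / denom
            for i in range(-m, m + 1)]


def sg_smooth(series, m):
    """Apply the S-G smoothing weights, reflecting the series at both ends."""
    series = list(series)
    weights = sg_weights(m)
    # Reflection padding (an assumption): m mirrored samples on each side.
    padded = series[m:0:-1] + series + series[-2:-m - 2:-1]
    return [sum(w * padded[i + k] for k, w in enumerate(weights))
            for i in range(len(series))]


def upper_envelope(ndvi, m, iterations=5):
    """Iteratively pull the fitted curve toward the upper envelope.

    After each pass, samples that fall below the smoothed curve (treated
    here as cloud-contaminated) are replaced by the smoothed value, while
    samples above the curve are kept.
    """
    current = list(ndvi)
    for _ in range(iterations):
        smooth = sg_smooth(current, m)
        current = [max(c, s) for c, s in zip(current, smooth)]
    return sg_smooth(current, m)
```

For example, `upper_envelope(series, m=2)` applied to a flat series with one sharp negative dip returns a curve that sits well above the dip, since low values are progressively replaced while high values are retained.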