LINEAR AND NON-LINEAR DIMENSION REDUCTION ALGORITHMS AND THEIR APPLICATIONS IN DOCUMENT CLUSTERING
This paper presents three linear and non-linear dimension reduction algorithms, LSI, Isomap, and SIE, for processing high-dimensional data in natural language documents. The document data processed by each of the three algorithms is then clustered with the bisecting K-means clustering algorithm. Experimental results show that the SIE and LSI algorithms perform about equally well in document clustering and that both outperform the benchmark, whereas the Isomap algorithm performs worse than the benchmark.
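As a rough illustration of the clustering stage, the following is a minimal sketch of bisecting K-means in plain Python. The abstract does not say which cluster is split at each step or how the two-way split is initialised, so both choices here are assumptions: the cluster with the largest sum of squared errors is split, and the 2-means step starts from a deterministic farthest-point initialisation. The function names and the toy 2-D vectors, which stand in for documents after dimension reduction, are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sse(points: List[Vector]) -> float:
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points: List[Vector], iters: int = 20) -> Tuple[list, list]:
    """Split points in two with plain 2-means.

    Assumed initialisation: the first centre is the point farthest from
    the centroid, the second is the point farthest from the first centre.
    """
    c = centroid(points)
    c1 = max(points, key=lambda p: sq_dist(p, c))
    c2 = max(points, key=lambda p: sq_dist(p, c1))
    centres = [list(c1), list(c2)]
    left, right = [], []
    for _ in range(iters):
        left, right = [], []
        for p in points:
            near_first = sq_dist(p, centres[0]) <= sq_dist(p, centres[1])
            (left if near_first else right).append(p)
        if not left or not right:
            break  # degenerate split, e.g. all points identical
        new_centres = [centroid(left), centroid(right)]
        if new_centres == centres:
            break  # assignments no longer change
        centres = new_centres
    return left, right


def bisecting_kmeans(points: List[Vector], k: int) -> List[list]:
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        splittable = [i for i, c in enumerate(clusters) if len(c) > 1]
        if not splittable:
            break
        idx = max(splittable, key=lambda i: sse(clusters[i]))
        left, right = two_means(clusters[idx])
        if not left or not right:
            break  # nothing left that can be split further
        clusters[idx:idx + 1] = [left, right]
    return clusters


# Hypothetical toy data: three well-separated groups of 2-D vectors.
docs = [(0, 0), (0, 1), (1, 0),
        (10, 10), (10, 11), (11, 10),
        (20, 0), (20, 1), (21, 0)]
for cluster in bisecting_kmeans(docs, 3):
    print(sorted(cluster))
```

In a pipeline like the one described above, the input vectors would be the low-dimensional document representations produced by LSI, Isomap, or SIE rather than hand-written points.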