Abstract
Distance metric learning and nonlinear dimensionality reduction are intrinsically related, since both are different perspectives on the same fundamental problem: learning compact and meaningful data representations for classification and visualization. In this paper, we propose a graph-based generalization of the Semi-Supervised Dimensionality Reduction (SSDR) algorithm that uses stochastic distances (Kullback-Leibler, Bhattacharyya and Cauchy-Schwarz divergences) to compute the similarity between local multivariate Gaussian distributions along the K Nearest Neighbors (KNN) graph built from the samples in the input high-dimensional space. Two variants of the proposed method are presented: one that uses only a small fraction (10%) of the labeled samples, and another that additionally uses a clustering method (Gaussian Mixture Models) to estimate labels for the samples along the minimum spanning tree of the KNN graph, incorporating more information into the process. Experimental results on several real datasets show that the proposed method improves the classification accuracy of several supervised classifiers as well as the quality of the obtained clusters (measured by Silhouette Coefficients) in comparison to the regular SSDR algorithm, making it a viable alternative for pattern classification problems.
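For illustration, the sketch below shows how a stochastic distance between local Gaussian distributions could be used as an edge weight on the KNN graph, using the closed-form Kullback-Leibler divergence between two multivariate Gaussians. This is a minimal, hedged example, not the authors' implementation: the function names, the symmetrization of the divergence, the ridge regularization of the covariance, and the patch size are all assumptions made for clarity.

```python
import numpy as np
from numpy.linalg import inv, slogdet

def kl_gaussians(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence D_KL(N(mu0, cov0) || N(mu1, cov1))."""
    d = mu0.shape[0]
    cov1_inv = inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = slogdet(cov0)
    _, logdet1 = slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + logdet1 - logdet0)

def symmetrized_kl(mu0, cov0, mu1, cov1):
    """Symmetrized KL, a natural choice for an (undirected) graph edge weight."""
    return 0.5 * (kl_gaussians(mu0, cov0, mu1, cov1)
                  + kl_gaussians(mu1, cov1, mu0, cov0))

# Illustrative usage: each sample's local Gaussian is estimated from its KNN patch
# (the sample plus its k nearest neighbors); a small ridge keeps the covariance
# well-conditioned. The patch data here is synthetic.
rng = np.random.default_rng(0)
patch_i = rng.standard_normal((10, 5))   # sample i and its 9 nearest neighbors
patch_j = rng.standard_normal((10, 5))   # sample j and its 9 nearest neighbors

mu_i, cov_i = patch_i.mean(axis=0), np.cov(patch_i, rowvar=False) + 1e-6 * np.eye(5)
mu_j, cov_j = patch_j.mean(axis=0), np.cov(patch_j, rowvar=False) + 1e-6 * np.eye(5)

edge_weight = symmetrized_kl(mu_i, cov_i, mu_j, cov_j)
print(edge_weight)
```

The Bhattacharyya and Cauchy-Schwarz divergences mentioned in the abstract also have closed forms for Gaussians and could be substituted for the KL term above without changing the overall structure of the computation.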