A UNIFICATION OF COMPONENT ANALYSIS METHODS
Over the last century, Component Analysis (CA) methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), k-means and Spectral Clustering (SC) have been extensively used as a feature extraction step for modeling, classification, visualization and clustering. CA techniques are appealing because many can be formulated as eigen-problems, offering great potential for learning linear and non-linear representations of data without local minima. However, the eigen-formulation often conceals important analytic and computational drawbacks of CA techniques, such as the difficulty of solving generalized eigen-problems with rank-deficient matrices, the lack of an intuitive interpretation of normalization factors, and the obscured relationships between CA methods.
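To make the rank-deficiency issue concrete, consider LDA in the small-sample setting, where the feature dimension exceeds the number of training samples. The following sketch (a minimal numpy/scipy illustration of ours, not code from this chapter; the variable names and data are purely illustrative) builds the within- and between-class scatter matrices and shows that the generalized eigen-problem defining LDA is singular as stated:

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Small-sample setting: feature dimension d exceeds the total number of
# samples n, as is common in vision and bioinformatics applications.
d, n_per_class = 50, 10
classes = [rng.normal(loc=c, size=(n_per_class, d)) for c in (0, 3)]

# Within-class scatter S_w and between-class scatter S_b for two-class LDA.
mu = np.vstack(classes).mean(axis=0)
S_w = sum((Xc - Xc.mean(0)).T @ (Xc - Xc.mean(0)) for Xc in classes)
S_b = sum(n_per_class * np.outer(Xc.mean(0) - mu, Xc.mean(0) - mu)
          for Xc in classes)

# With n = 20 samples in d = 50 dimensions, rank(S_w) <= n - 2 = 18,
# so S_w is singular and the generalized eigen-problem
# S_b v = lambda S_w v cannot be solved without regularization.
print("rank(S_w) =", np.linalg.matrix_rank(S_w), "out of", d)
try:
    eigh(S_b, S_w)  # scipy requires S_w to be positive definite
except np.linalg.LinAlgError as err:
    print("generalized eigensolver failed:", err)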
This chapter proposes a unified framework that formulates many CA methods as a single least-squares estimation problem. We show that PCA, LDA, CCA, k-means, spectral graph methods and their kernel extensions each correspond to a particular instance of least-squares weighted kernel reduced rank regression (LS-WKRRR). The least-squares formulation allows a better understanding of normalization factors, provides a clean framework for understanding the commonalities and differences between CA methods, yields efficient optimization algorithms for many of them, suggests straightforward derivations of on-line learning methods, and simplifies the generalization of CA techniques. In addition, we derive weighted generalizations of PCA, LDA, SC and CCA (including kernel extensions).
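As a concrete illustration of the least-squares view, the sketch below (a minimal numpy example of ours, not the chapter's algorithm) treats PCA as the simplest member of this family: the unweighted, linear-kernel case, in which the data are regressed onto themselves through a rank-k bottleneck. Alternating the two least-squares updates for the objective min ||X - BC||_F^2 (with a free coefficient matrix C standing in for the regression term) recovers the same subspace as the top eigenvectors of the covariance matrix:

import numpy as np

rng = np.random.default_rng(0)

# Synthetic centered data with a dominant rank-2 subspace plus noise.
d, n, k = 10, 500, 2
X = rng.normal(size=(d, k)) @ rng.normal(size=(k, n)) \
    + 0.1 * rng.normal(size=(d, n))
X -= X.mean(axis=1, keepdims=True)

# Alternating least squares for the reduced-rank objective
#   min_{B, C} ||X - B C||_F^2,   B: d x k,   C: k x n.
# Each step is an ordinary least-squares solve for one factor
# with the other held fixed.
B = rng.normal(size=(d, k))
for _ in range(100):
    C = np.linalg.pinv(B) @ X   # fit coefficients for the current basis
    B = X @ np.linalg.pinv(C)   # refit the basis for those coefficients

# The span of B should coincide with the top-k principal subspace.
Q, _ = np.linalg.qr(B)                # orthonormal basis of span(B)
_, eigvecs = np.linalg.eigh(X @ X.T)  # covariance eigenvectors, ascending
P = eigvecs[:, -k:]                   # top-k principal directions
cosines = np.linalg.svd(Q.T @ P, compute_uv=False)
print("cosines of principal angles:", cosines)  # both approximately 1.0

Each update above is an ordinary least-squares solve, which is precisely what makes weighted, kernel and on-line variants natural within the same formulation.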