Kernel Treelets
Abstract
A new method for hierarchical clustering of data points is presented. It combines treelets, a particular multiresolution decomposition of data, with a mapping into a reproducing kernel Hilbert space. The proposed approach, called kernel treelets (KT), uses this mapping to go from a hierarchical clustering over attributes (the natural output of treelets) to a hierarchical clustering over data points. In effect, KT replaces the correlation coefficient matrix used in treelets with a symmetric positive semi-definite matrix that is efficiently constructed from a symmetric positive semi-definite kernel function. Unlike most clustering methods, which require numeric data, KT can be applied to more general data and yields a multiresolution sequence of orthonormal bases on the data directly in feature space. The effectiveness and potential of KT in clustering analysis are illustrated with examples.
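The central substitution can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `gram_matrix`, `rbf`, `rotate`, and `kernel_treelet_merges`, the choice of a Gaussian RBF kernel, and the toy points are all hypothetical. The sketch builds a kernel matrix from a kernel function, then repeatedly merges the most similar pair of active coordinates with a Jacobi rotation, in the style of treelets, and records the merge order as a hierarchy over data points. It does not track the orthonormal bases.

```python
import math

def gram_matrix(items, kernel):
    """Kernel (Gram) matrix K[a][b] = kernel(items[a], items[b])."""
    return [[kernel(x, y) for y in items] for x in items]

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel (assumed here), symmetric and PSD on numeric tuples."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def rotate(A, i, j):
    """Apply, in place, the Jacobi rotation that zeroes A[i][j].

    Using atan2 for the angle leaves the larger variance at index i.
    """
    theta = 0.5 * math.atan2(2.0 * A[i][j], A[i][i] - A[j][j])
    c, s = math.cos(theta), math.sin(theta)
    for row in A:  # update columns i and j: A <- A J
        row[i], row[j] = c * row[i] + s * row[j], -s * row[i] + c * row[j]
    # update rows i and j: A <- J^T A
    A[i], A[j] = ([c * p + s * q for p, q in zip(A[i], A[j])],
                  [-s * p + c * q for p, q in zip(A[i], A[j])])

def similarity(A, p, q):
    """Correlation-like similarity between coordinates p and q."""
    return abs(A[p][q]) / math.sqrt(A[p][p] * A[q][q])

def kernel_treelet_merges(K):
    """Greedy treelet-style merging driven by a symmetric PSD matrix K.

    Returns one (kept, dropped) index pair per level of the hierarchy.
    """
    A = [list(row) for row in K]  # work on a copy of K
    active = list(range(len(A)))
    merges = []
    while len(active) > 1:
        # Choose the most similar pair among the active coordinates.
        i, j = max(
            ((p, q) for n, p in enumerate(active) for q in active[n + 1:]),
            key=lambda pq: similarity(A, pq[0], pq[1]),
        )
        rotate(A, i, j)   # decorrelate the pair
        active.remove(j)  # the "difference" coordinate leaves the active set
        merges.append((i, j))
    return merges

# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.3, 5.0)]
K = gram_matrix(points, rbf)
merges = kernel_treelet_merges(K)
print(merges)
```

On these four points the two nearby pairs are merged first and the two resulting groups are joined last. Because `gram_matrix` only calls `kernel(x, y)`, any symmetric positive semi-definite kernel defined on non-numeric items could be passed in place of `rbf`.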