World Scientific

UNSUPERVISED CLASSIFICATION OF TREE STRUCTURED OBJECTS

    https://doi.org/10.1142/9789814271820_0018
    Cited by: 1 (Source: Crossref)
    Abstract:

    Recent developments in medical image analysis, phylogenetics and proteomics motivate the statistical analysis of populations of tree-structured data objects. In this context, unsupervised classification of trees arises as a challenging new area that depends on the careful development of a novel mathematical framework. The discussion will center on statistical aspects of clustering in a framework where the tree data to be clustered have been sampled from some unknown probability distribution. Following Ref. 12, we will try to verify two conditions: appropriateness, meaning that the clustering of the data set should reveal some structure of the underlying data rather than model artifacts due to the random sampling process; and steadiness, meaning that the more sample points we have, the more reliable the clustering should be. We will argue for steadiness and reliability by showing an extension of the convergence properties of a class of non-parametric clustering algorithms, namely k-means, defined on different metric spaces of trees. We will explore the appropriateness of the clustering outputs of k-means on a real data set from proteomics, and we will comment on the results from Ref. 1 on three real data sets of phylogenetic trees.
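
To make the idea of k-means on a metric space of trees concrete, the following is a minimal Python sketch and not the chapter's method. Everything specific in it is an assumption made for illustration: trees are encoded as sets of level-order node indices of a binary tree (root 1, children of node i are 2i and 2i+1), the distance is the size of the symmetric difference of two node sets, and each cluster center is restricted to be a sample point minimizing the sum of squared distances (a medoid-style stand-in for a Fréchet mean). The names `Tree`, `tree_distance`, `restricted_frechet_mean` and `metric_kmeans` are hypothetical.

```python
import random
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Assumption: a tree is encoded as the set of its level-order node indices.
Tree = FrozenSet[int]
Distance = Callable[[Tree, Tree], float]


def tree_distance(a: Tree, b: Tree) -> int:
    """Number of nodes present in exactly one of the two trees (a metric on finite sets)."""
    return len(a ^ b)


def restricted_frechet_mean(cluster: Sequence[Tree], dist: Distance) -> Tree:
    """Sample point of the cluster minimizing the sum of squared distances to its members."""
    return min(cluster, key=lambda c: sum(dist(c, x) ** 2 for x in cluster))


def assign(data: Sequence[Tree], centers: Sequence[Tree], dist: Distance) -> List[int]:
    """Label each tree with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(x, centers[j])) for x in data]


def metric_kmeans(
    data: Sequence[Tree], k: int, dist: Distance, n_iter: int = 50, seed: int = 0
) -> Tuple[List[Tree], List[int]]:
    """Lloyd-style iteration using only pairwise distances between trees."""
    rng = random.Random(seed)
    centers = rng.sample(list(data), k)  # initial centers drawn from the sample
    for _ in range(n_iter):
        labels = assign(data, centers, dist)
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(restricted_frechet_mean(members, dist) if members else centers[j])
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, assign(data, centers, dist)


if __name__ == "__main__":
    # Toy sample: three left-leaning and three right-leaning binary trees.
    trees = [frozenset(s) for s in (
        {1, 2, 4}, {1, 2, 4, 5}, {1, 2, 4, 8},
        {1, 3, 7}, {1, 3, 6, 7}, {1, 3, 6, 7, 14},
    )]
    centers, labels = metric_kmeans(trees, k=2, dist=tree_distance)
    print(labels)
    print([sorted(c) for c in centers])
```

Restricting centers to sample points keeps the sketch independent of how a mean is computed in any particular tree space; with a metric for which a Fréchet mean can be computed directly, `restricted_frechet_mean` could be swapped for that computation without changing the rest of the loop.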