World Scientific

Feature Word Vector Based on Short Text Clustering

    https://doi.org/10.1142/9789813146426_0061
    Cited by: 1 (Source: Crossref)
    Abstract:

    A short text clustering algorithm based on feature word vectors is proposed in this paper to address the poor clustering of short texts caused by their sparse features and rapid updates. First, a formula for feature word extraction based on part-of-speech (POS) weighting is defined and used to extract the feature words that represent each short text. Second, word vectors that capture the semantics of the feature words are obtained by training the Continuous Skip-gram Model on a large-scale corpus. Finally, Word Mover’s Distance (WMD) is used to compute the similarity between short texts, and these similarities drive a hierarchical clustering algorithm. Evaluation on four test datasets shows that the proposed algorithm significantly outperforms traditional clustering algorithms, with a mean F value that is on average 55.43% higher than that of the second-best method.
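The three stages described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy 2-D vectors in `VECS` stand in for trained skip-gram embeddings, the `POS_WEIGHT` table and the `feature_words` scoring rule are hypothetical (the abstract does not give the actual weighting formula), words are given uniform mass in the WMD computation, and average linkage is an arbitrary choice of hierarchical clustering criterion.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.optimize import linprog
from scipy.spatial.distance import squareform

# Toy 2-D vectors (assumption: stand-ins for skip-gram embeddings).
VECS = {
    "cat": np.array([1.0, 0.0]),
    "kitten": np.array([0.9, 0.1]),
    "dog": np.array([0.8, 0.3]),
    "stock": np.array([0.0, 1.0]),
    "market": np.array([0.1, 0.9]),
    "shares": np.array([0.2, 1.0]),
}

# Hypothetical POS weights; the paper's formula is not stated in the abstract.
POS_WEIGHT = {"NOUN": 1.0, "VERB": 0.6, "ADJ": 0.4}


def feature_words(tagged, top_k=2):
    """Score each word by summed POS weight and keep the top_k with a vector."""
    scores = {}
    for word, pos in tagged:
        if word in VECS:
            scores[word] = scores.get(word, 0.0) + POS_WEIGHT.get(pos, 0.0)
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def wmd(words_a, words_b):
    """Word Mover's Distance with uniform word masses, solved as a transport LP."""
    n, m = len(words_a), len(words_b)
    cost = np.array(
        [[np.linalg.norm(VECS[a] - VECS[b]) for b in words_b] for a in words_a]
    )
    # Flow matrix T is flattened row-major: T[i, j] lives at index i * m + j.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0  # mass leaving word i of text A
    for j in range(m):
        a_eq[n + j, j::m] = 1.0  # mass arriving at word j of text B
    b_eq = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return float(res.fun)


def cluster(docs, n_clusters):
    """Agglomerative clustering of POS-tagged short texts on pairwise WMD."""
    feats = [feature_words(d) for d in docs]
    k = len(feats)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = wmd(feats[i], feats[j])
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Four toy short texts, already tokenized and POS-tagged (assumed input format).
DOCS = [
    [("cat", "NOUN"), ("kitten", "NOUN"), ("cute", "ADJ")],
    [("dog", "NOUN"), ("cat", "NOUN"), ("runs", "VERB")],
    [("stock", "NOUN"), ("market", "NOUN"), ("falls", "VERB")],
    [("shares", "NOUN"), ("market", "NOUN"), ("rise", "VERB")],
]
labels = cluster(DOCS, 2)
print(labels)
```

In this toy setup the two animal-related texts end up in one cluster and the two finance-related texts in the other. In practice the vectors would come from a skip-gram model trained on a large corpus, and a dedicated optimal-transport solver would likely replace the generic linear program for speed.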