World Scientific

INTELLIGENT NLP-DRIVEN TEXT CLASSIFICATION

    https://doi.org/10.1142/S0218213002000952
    Cited by: 3 (Source: Crossref)

    Information Retrieval (IR) and NLP-driven Information Extraction (IE) are complementary activities. IR helps locate specific documents within a huge search space (localization), while IE supports the localization of specific information within a document (extraction or explanation). In application scenarios both capabilities are usually needed. IE is important here because it can enrich IR inferences with motivating information. Work on Web-based IR suggests that embedding linguistic information (e.g. sense distinctions) at a suitable level within traditional quantitative approaches (e.g. query expansion as in [26]) is a promising approach. Which linguistic level is best suited to which IR mechanism is the representational problem posed by the current stage of research, and it is the central concern of this paper. A traditional method for efficient text categorization (TC) is presented here. The original features of the proposed model are a self-adapting parameterized weighting model and the use of linguistic information. The key idea is the integration of NLP methods within a robust and efficient TC framework. This makes it possible to combine the benefits of large-scale, efficient IR with a richer expressivity closer to that of IE. In this paper we capitalize on the systematic benchmarking resources available in TC to derive extensive empirical evidence about the above representational problem. The positive experimental results confirm that the proposed TC framework is a viable approach to intelligent text categorization on a large scale.