POLYNOMIAL NETWORKS VERSUS OTHER TECHNIQUES IN TEXT CATEGORIZATION
Abstract
Many techniques and algorithms for automatic text categorization have been devised and proposed in the literature. However, there is still much room for researchers in this area to improve existing algorithms or devise new techniques for text categorization (TC). Polynomial Networks (PNs) have never been used before in TC. This can be attributed to the huge datasets used in TC, as well as to the technique itself, which has high computational demands. In this paper, we propose and investigate the use of PNs in TC. The proposed PN classifier achieved competitive classification performance in our experiments. More importantly, this high performance is achieved with one-shot (non-iterative) training and using only 0.25%–0.5% of the corpora features. Experiments are conducted on two benchmark TC datasets: Reuters-21578 and the 20 Newsgroups. Five well-known classifiers are evaluated on the same data and feature subsets: the state-of-the-art Support Vector Machines (SVM), Logistic Regression (LR), k-Nearest Neighbors (kNN), Naive Bayes (NB), and Radial Basis Function (RBF) networks.
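To illustrate the idea of one-shot (non-iterative) training mentioned above, the following is a minimal sketch, not the authors' implementation: documents are represented by a small set of selected features, each feature vector is expanded with second-order polynomial terms, and the output weights are obtained in closed form by a regularized least-squares solution. All names and parameters below (poly_expand, train_pn, the ridge term) are illustrative assumptions, not taken from the paper.

    import numpy as np
    from itertools import combinations_with_replacement

    def poly_expand(X, degree=2):
        """Expand each row of X with all monomials up to the given degree."""
        n_samples, n_feats = X.shape
        cols = [np.ones(n_samples)]                           # bias term
        for d in range(1, degree + 1):
            for idx in combinations_with_replacement(range(n_feats), d):
                cols.append(np.prod(X[:, list(idx)], axis=1)) # product of selected features
        return np.column_stack(cols)

    def train_pn(X, y, n_classes, degree=2, ridge=1e-3):
        """One-shot training: solve W = (P^T P + rI)^(-1) P^T T in closed form."""
        P = poly_expand(X, degree)
        T = np.eye(n_classes)[y]                              # one-hot class targets
        A = P.T @ P + ridge * np.eye(P.shape[1])              # small ridge for numerical stability
        return np.linalg.solve(A, P.T @ T)

    def predict_pn(W, X, degree=2):
        """Assign each document to the class with the largest polynomial output."""
        return np.argmax(poly_expand(X, degree) @ W, axis=1)

    # Example with random data standing in for reduced document feature vectors.
    rng = np.random.default_rng(0)
    X_train, y_train = rng.random((200, 10)), rng.integers(0, 4, 200)
    W = train_pn(X_train, y_train, n_classes=4)
    print(predict_pn(W, X_train)[:10])

Because the weights are computed in a single linear-algebra step rather than by iterative optimization, training cost is governed by the size of the expanded feature vector, which is why aggressive feature selection (a fraction of a percent of the corpus features, as reported in the abstract) is essential.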