An Approach for Arabic Text Categorization Using Association Rule Mining
Abstract
Text Categorization (TC) has become one of the major techniques for organizing and managing online information. Several studies have proposed so-called associative classification for databases, but few of them apply it to classifying text documents into predefined categories based on their contents. This paper proposes a new approach to Arabic text categorization that discovers association rules and uses them to build a classification model. An Apriori-based algorithm is employed for association rule mining. To validate the proposed approach, several experiments were conducted on a collection of Arabic documents. Three methods of classifying with association rules were compared in terms of classification accuracy: ordered decision list, weighted rules, and majority voting. The results showed that majority voting was the best method in most experiments, achieving an accuracy of up to 87%, whereas the weighted rules method was the worst in all experiments. Overall, the experiments showed that association rule mining is a suitable method for building good classification models for categorizing Arabic text.
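To make the three compared methods concrete, the following is a minimal sketch of how each could assign a category once class association rules have been mined. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the `Rule` structure, the use of confidence as the ranking and weighting criterion, the tie handling, and the toy rules and terms are all hypothetical, and the abstract does not specify these details.

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical class association rule: terms -> category."""
    terms: frozenset   # antecedent: terms that must all occur in the document
    category: str      # consequent: predicted category
    confidence: float  # rule confidence as produced by the mining step


def matching(rules, doc_terms):
    """Return the rules whose antecedent is fully contained in the document."""
    return [r for r in rules if r.terms <= doc_terms]


def ordered_decision_list(rules, doc_terms, default: Optional[str] = None):
    """Assumed variant: the matching rule with the highest confidence decides."""
    for r in sorted(rules, key=lambda r: r.confidence, reverse=True):
        if r.terms <= doc_terms:
            return r.category
    return default


def weighted_rules(rules, doc_terms, default: Optional[str] = None):
    """Assumed variant: sum confidences of matching rules per category."""
    scores = defaultdict(float)
    for r in matching(rules, doc_terms):
        scores[r.category] += r.confidence
    return max(scores, key=scores.get) if scores else default


def majority_voting(rules, doc_terms, default: Optional[str] = None):
    """Assumed variant: each matching rule casts one vote for its category."""
    votes = Counter(r.category for r in matching(rules, doc_terms))
    return votes.most_common(1)[0][0] if votes else default


# Toy rules with English placeholder terms (illustrative only, not from the paper).
rules = [
    Rule(frozenset({"match", "goal"}), "sports", 0.95),
    Rule(frozenset({"goal"}), "sports", 0.70),
    Rule(frozenset({"market"}), "economy", 0.60),
    Rule(frozenset({"bank"}), "economy", 0.55),
    Rule(frozenset({"price"}), "economy", 0.40),
]
doc = frozenset({"match", "goal", "market", "bank", "price"})

by_list = ordered_decision_list(rules, doc)
by_weight = weighted_rules(rules, doc)
by_vote = majority_voting(rules, doc)
```

On this toy document the methods can disagree: the single strongest rule and the summed confidences favor one category, while the raw count of matching rules favors the other. This is the kind of difference the accuracy comparison measures.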