  • Article (No Access)

    COMBINATION OF MULTIPLE FEATURE SELECTION METHODS FOR TEXT CATEGORIZATION BY USING COMBINATORIAL FUSION ANALYSIS AND RANK-SCORE CHARACTERISTIC

    Effective feature selection methods are important for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the corpus. Extensive research has been done to improve the performance of individual feature selection methods. However, it remains a challenge to devise a single feature selection method that outperforms other methods in most cases. In this paper, we explore the possibility of improving the overall performance by combining multiple individual feature selection methods. In particular, we propose a method of combining multiple feature selection methods by using an information fusion paradigm called Combinatorial Fusion Analysis (CFA). A rank-score function and its associated graph, called the rank-score graph, are adopted to measure the diversity of different feature selection methods. Our experimental results demonstrate that a combination of multiple feature selection methods can outperform a single method only if each individual feature selection method has unique scoring behavior and relatively high performance. Moreover, it is shown that the rank-score function and rank-score graph are useful for selecting a combination of feature selection methods.
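
    The abstract does not give formulas, so the following is only a minimal sketch of how a rank-score function might be computed and compared. It assumes that the rank-score function maps rank k to the normalized score of the term at rank k, and that diversity between two methods is the mean squared gap between their rank-score functions. The min-max normalization, the distance formula, and the names `rank_score_function` and `rank_score_diversity` are illustrative assumptions, not taken from the paper.

```python
def rank_score_function(scores):
    """Return min-max normalized scores in descending order.

    Entry k-1 of the result is the normalized score held by the item at
    rank k (rank 1 = highest score). Normalization is an assumption made
    here so that different methods are comparable on a [0, 1] scale.
    """
    lo, hi = min(scores), max(scores)
    span = hi - lo
    if span == 0:
        # All scores equal: the method does not discriminate between items.
        normalized = [0.0 for _ in scores]
    else:
        normalized = [(s - lo) / span for s in scores]
    return sorted(normalized, reverse=True)


def rank_score_diversity(scores_a, scores_b):
    """Mean squared difference between two rank-score functions.

    Hypothetical diversity measure: 0.0 means the two methods distribute
    their scores over ranks identically; larger values mean more
    different scoring behavior.
    """
    f_a = rank_score_function(scores_a)
    f_b = rank_score_function(scores_b)
    if len(f_a) != len(f_b):
        raise ValueError("both methods must score the same number of items")
    return sum((a - b) ** 2 for a, b in zip(f_a, f_b)) / len(f_a)


if __name__ == "__main__":
    # Made-up scores that two feature selection methods assign to four terms.
    method_a = [2.0, 10.0, 0.0, 6.0]
    method_b = [0.9, 1.0, 0.0, 0.8]
    print(rank_score_function(method_a))
    print(rank_score_diversity(method_a, method_b))
```

    Plotting each method's rank-score function against rank gives a rank-score graph. Under these assumptions, curves that lie far apart indicate methods with different scoring behavior.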

  • Article (No Access)

    COMPARING SYSTEM SELECTION METHODS FOR THE COMBINATORIAL FUSION OF MULTIPLE RETRIEVAL SYSTEMS

    Combining multiple information retrieval (IR) systems has been shown to improve performance over individual systems. However, it remains a challenging problem to determine when and how a set of individual systems should be combined. In this paper, we investigate these issues using combinatorial fusion analysis and five data sets provided by TREC 2, 3, 4, 5, and 6. In particular, we compare the performance of combining six IR systems selected by random choice vs. by performance measurement from these five TREC data sets. Two experiments are conducted: (1) combination of two systems, with the outcome assessed in terms of performance ratio and cognitive diversity, and (2) combinatorial fusion of t systems, t = 2 to 6, using both score and rank combinations, together with an exploration of the effect of diversity on the performance outcome. Both experiments demonstrate that combining two or more systems improves performance more significantly when the systems are selected by performance evaluation than when they are selected by random choice. Our work provides a distinctive method of system selection for the combination of multiple retrieval systems.
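
    The score and rank combinations mentioned in the abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes every system scores the same set of documents, that scores are already normalized to a common scale, and that combinations are unweighted averages. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def to_ranks(scores):
    """Map each document to its rank (1 = best) given a doc -> score dict.

    Ties are broken by document id so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return {doc: i + 1 for i, doc in enumerate(ordered)}


def score_combination(systems):
    """Rank documents by their average score across systems (higher is better)."""
    docs = systems[0].keys()
    avg = {d: sum(s[d] for s in systems) / len(systems) for d in docs}
    return sorted(docs, key=lambda d: (-avg[d], d))


def rank_combination(systems):
    """Rank documents by their average rank across systems (lower is better)."""
    rank_maps = [to_ranks(s) for s in systems]
    docs = systems[0].keys()
    avg = {d: sum(r[d] for r in rank_maps) / len(rank_maps) for d in docs}
    return sorted(docs, key=lambda d: (avg[d], d))


def enumerate_fusions(system_names):
    """List every subset of t systems for t = 2 up to the number of systems."""
    n = len(system_names)
    return [c for t in range(2, n + 1) for c in combinations(system_names, t)]


if __name__ == "__main__":
    # Toy scores for three documents from two hypothetical systems.
    sys_a = {"d1": 0.9, "d2": 0.5, "d3": 0.1}
    sys_b = {"d1": 0.2, "d2": 0.8, "d3": 0.7}
    print(score_combination([sys_a, sys_b]))  # ['d2', 'd1', 'd3']
    print(rank_combination([sys_a, sys_b]))   # ['d2', 'd1', 'd3']
    # Six systems yield 15 + 20 + 15 + 6 + 1 = 57 candidate combinations.
    print(len(enumerate_fusions(["A", "B", "C", "D", "E", "F"])))  # 57
```

    With six systems, enumerating t = 2 to 6 gives 57 combinations, each of which can be evaluated under both score and rank combination.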