
  • articleNo Access

    Research on the Application of Decision Tree and Correlation Analysis Algorithm in College Students’ Physical Fitness Analysis

    With the advent of the big data era, data-driven decision-making and analysis are increasingly valued in many fields. In education in particular, how to use big data technology to better understand student needs, optimize the education process, and improve education quality has become an important research topic. This paper studies the application of decision tree (DT) and correlation analysis algorithms to the analysis of college students’ physical fitness, in order to provide a scientific basis for improving the physical health of college students. Big data and data mining (DM) methods are used to extract the rules contained in the data, so as to directly support decision-making in physical fitness testing and analysis. The results show that training on the training set achieves good classification accuracy, and that by optimizing the tree depth the accuracy can exceed 85.033%. Using DM technology as a carrier, this paper uncovers the rules underlying college students’ physical fitness data and extracts previously unknown, implicit, and potentially useful information and knowledge.

  • articleNo Access

    The Influence of Football on Mental Health Quality Based on Data Mining Algorithm

    Data mining technology can uncover hidden rules in data, which is a great advantage when solving problems. As data mining technology matures, it is increasingly applied in teaching. This paper aims to analyze the specific impact of campus football activities on students’ mental health quality through the application of data mining algorithms, especially decision tree and association rule algorithms. First, the development status of campus football is described, and research on data mining algorithms is summarized. The paper then establishes an analysis model of the influence of campus football on students’ mental quality and uses decision tree and association rule algorithms to analyze students’ sports quality and mental health state. The algorithm scans the database only once; after new frequent itemsets are generated, the database continuously shrinks, and subsequent operations do not require another scan, which occupies less space and reduces time complexity. Higher accuracy can be achieved when the pruning severity is set to 30–60, and it is set to 40 in the design. Tests show that the accuracy of the designed algorithm has improved and meets the needs. The designed analysis of students’ sports quality and mental health state based on the data mining algorithm can provide effective data for decision-making on campus football activities.

  • articleNo Access

    Design of University Archives Business Data Push System Based on B/S Structure

    To address the inaccurate results and poor real-time performance of college archives business data push, a college archives business data push system based on the B/S structure is designed using ASP.NET, ADO.NET, and other technologies. The system consists of a database management module, a management and maintenance module, and a query module. The database management module uses an archive business data classification method optimized on the basis of the initial cluster centers to classify and manage large-scale archive data. The management and maintenance module handles the encryption and decryption of archive business data through key management technology and key verification. After the user enters a valid key to log in to the system, the query module uses the association rule-based archive business data push method to extract from the database the archive data that satisfies the strong weighted association rule conditions, feeds it back to the front-end browser interface, and completes the archive business data push. In testing, the average accuracy rate of this system over multiple archive data pushes is greater than 0.95, and the maximum time consumption for archive business data push is reduced by 748 ms.

  • articleNo Access

    MINING NON-REDUNDANT ASSOCIATION RULES BASED ON CONCISE BASES

    Association rule mining has many achievements in the area of knowledge discovery. However, the quality of the extracted association rules has not drawn adequate attention from researchers in the data mining community. One big concern with the quality of association rule mining is the size of the extracted rule set. As a matter of fact, very often tens of thousands of association rules are extracted, among which many are redundant and thus useless. In this paper, we first analyze the redundancy problem in association rules and then propose a reliable exact association rule basis from which more concise non-redundant rules can be extracted. We prove that the redundancy eliminated using the proposed reliable association rule basis does not reduce the belief in the extracted rules. Moreover, this paper proposes a level-wise approach for efficiently extracting closed itemsets and minimal generators, a key issue in closure-based association rule mining.
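
The two notions named at the end of this abstract can be illustrated with a brute-force sketch. It is not the paper's level-wise algorithm; the function names and the toy transactions are assumptions made for illustration. An itemset is closed when it equals the intersection of all transactions that contain it, and it is a minimal generator when every proper subset has strictly higher support.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`.

    Assumption: an itemset that occurs in no transaction is returned unchanged.
    """
    covering = [set(t) for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset(set.intersection(*covering))


def is_minimal_generator(itemset, transactions):
    """True if removing any single item strictly increases the support.

    Checking subsets one item smaller is enough, because support can only
    grow when items are removed.
    """
    count = support_count(itemset, transactions)
    return all(
        support_count(itemset - {item}, transactions) > count
        for item in itemset
    )


# Hypothetical toy database: every transaction with "b" also contains "a".
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ab"), frozenset("c")]

b = frozenset("b")
ab = frozenset("ab")
print(closure(b, transactions))               # closure of {b} is {a, b}
print(is_minimal_generator(b, transactions))  # {b} is a minimal generator
print(is_minimal_generator(ab, transactions)) # {a, b} is not ({b} has equal support)
```

In this toy data the generator {b} and its closure {a, b} yield the exact rule b ⇒ a, which is the kind of rule a closure-based basis is built from.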

  • articleNo Access

    EFFICIENT SHAPE-BASED IMAGE RETRIEVAL BASED ON GRAY RELATIONAL ANALYSIS AND ASSOCIATION RULES

    An improved shape-based image retrieval strategy based on gray relational analysis and association rules is proposed. The choice of a suitable object representation and retrieval scheme is essential for efficient retrieval. In addition, a two-stage relevance feedback mechanism based on the GM(1, N) method and association rules is incorporated to improve the retrieval accuracy. The GM(1, N) method is used to build the re-query example for subsequent retrievals. The retrieval log files stored on the server are used for offline mining of association rules. The association rules mined from users' retrieval history can further reveal users' image searching behavior. The effectiveness of the proposed model is demonstrated on the FISH dataset.

  • articleNo Access

    Disease Risk Rule Analysis of the New Rural Cooperative Medical System in Beijing, China

    Objective: With data drawn from Beijing’s New Rural Cooperative Medical System (NRCMS), the rule characteristics of disease risks are mined in terms of risk factors and risk measurements, aiming to discover valuable knowledge within the vast amounts of Beijing’s NRCMS data and provide administrators with a more scientific basis for decision making. Methodology: The association rule algorithm is utilized to recover both potentially valuable knowledge and decision-making information from Beijing’s NRCMS data. Results: The main objects of healthcare in Beijing from 2012 to 2014 include: circulatory diseases in patients 41 years of age or older, pediatric respiratory disease prevention, reproductive healthcare for women of childbearing age, and the prevention and treatment of diabetes in female patients; in county-level hospitals with a relatively low average level of consumption, injuries still resulted in high expenses; the primary post-NRCMS reimbursement level-3 and level-4 high-risk groups were patients of 41–65 years of age. Conclusion: According to the ranking of rule supports, the highest-support rule in Beijing concerns circulatory diseases of middle-aged patients, especially patients hospitalized in county-level medical institutions; the second-highest support comes from the utilization of fertility services by women of childbearing age. Suggestions: It is recommended that the rule support rankings be used to actively implement, with emphasis, major prevention and healthcare services; lower disease risk factors; control NRCMS reimbursement costs; promote the sustainable development of NRCMS; adopt a classified management of mined rules; and establish a decision-making knowledge base.

  • articleNo Access

    A New Intelligent Learning Diagnosis Method Constructed Based on Concept Map

    With the rapid development of Internet technology, online learning and online education are becoming more and more popular. Intelligent learning diagnosis has become an effective means of guaranteeing the quality of online learning and a research hotspot in education informatization. A concept map is an intuitive visual tool that can reveal the concepts poorly mastered by students and provide useful clues for identifying students' learning disabilities. This paper proposes a learning diagnosis method constructed based on the concept map. First, it groups learners, then uses direct hashing and pruning (DHP) to generate association rules between the different concepts, and finally uses the DHP algorithm to construct the concept map automatically, in order to discover the concepts poorly mastered by students and realize the automatic diagnosis of learning problems. Case studies conducted in college mathematics classes at some universities in Lianyungang have verified the effectiveness of our method.

  • articleNo Access

    Efficient Mining of Data Streams Using Associative Classification Approach

    Data stream associative classification poses many challenges to the data mining community. In this paper, we address four major challenges, namely, infinite length, extraction of knowledge with a single scan, processing time, and accuracy. Since data streams are infinite in length, it is impractical to store and use all the historical data for training. Mining such streaming data for knowledge acquisition is both a unique opportunity and a difficult task. A streaming algorithm must scan the data once and extract knowledge. In mining data streams, processing time and accuracy have become two important aspects. In this paper, we propose PSTMiner, which considers the nature of data streams and provides an efficient classifier for predicting the class label of real data streams. It has greater potential when compared with many existing classification techniques. Additionally, we propose a compact novel tree structure called PSTree (Prefix Streaming Tree) for storing data. Extensive experiments conducted on 24 real datasets from the UCI repository and synthetic datasets from MOA (Massive Online Analysis) show that PSTMiner is consistent. Empirical results show that the performance of PSTMiner is highly competitive in terms of accuracy and processing time when compared with other approaches under the windowed streaming model.

  • articleNo Access

    DECISION TREES FOR PERNICIOUS PAGES DETECTION

    An application framework to perform web usage analysis using advanced data mining methodology is presented. The investigation proposes decision trees for web user behavior analysis. This includes prediction of future user actions and of the typical pages leading to browsing termination. The widely known decision tree package C4.5 was used in this study. In the new area of web log mining, decision trees showed reasonable computational performance and accuracy. Experiments showed that it is possible to predict future user actions with reasonable misclassification error as well as to find combinations of sequential pages resulting in browsing termination. In addition, decision trees generate human-understandable rules which can be analyzed further for web site improvement.

  • articleNo Access

    DISTRIBUTED MINING OF ASSOCIATION RULES BASED ON REDUCING THE SUPPORT THRESHOLD

    One of the most important data mining problems is learning association rules of the form "90% of the customers that purchase product x also purchase product y". Discovering association rules from huge volumes of data requires substantial processing power. In this paper we present an efficient distributed algorithm for mining association rules that reduces the time complexity by a magnitude that renders it suitable for scaling up to very large data sets. The proposed algorithm is based on partitioning the initial data set into subsets and processing each subset in parallel. By reducing the support threshold while processing the subsets, the proposed algorithm can maintain the set of association rules that would be extracted by applying an association rule mining algorithm to all the data. These claims are confirmed by empirical tests that we present, which also demonstrate the utility of the method.
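
The idea in this abstract can be sketched as follows. This is a minimal brute-force illustration, not the paper's distributed algorithm: the function names, the `local_factor` parameter that lowers the threshold inside each subset, and the toy baskets are all assumptions. Confidence is the quantity behind a statement such as "90% of the customers that purchase x also purchase y".

```python
from itertools import combinations


def count(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(antecedent, consequent, transactions):
    """Fraction of transactions with `antecedent` that also hold `consequent`."""
    return count(antecedent | consequent, transactions) / count(antecedent, transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute force: itemsets of up to `max_size` items with relative support >= min_support."""
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = count(itemset, transactions) / len(transactions)
            if support >= min_support:
                found[itemset] = support
    return found


def partitioned_candidates(transactions, min_support, parts, local_factor):
    """Mine each subset with a reduced threshold and pool the local results.

    `local_factor` (<= 1) is a hypothetical knob; the abstract does not say
    how the reduced threshold is chosen.
    """
    chunk = -(-len(transactions) // parts)  # ceiling division
    candidates = set()
    for start in range(0, len(transactions), chunk):
        subset = transactions[start:start + chunk]
        candidates |= set(frequent_itemsets(subset, min_support * local_factor))
    return candidates


baskets = [frozenset(b) for b in ("xy", "xy", "xyz", "x", "yz", "z")]
global_frequent = frequent_itemsets(baskets, 0.5)
candidates = partitioned_candidates(baskets, 0.5, parts=2, local_factor=0.8)
rule_confidence = confidence(frozenset("x"), frozenset("y"), baskets)
print(rule_confidence)
print(set(global_frequent) <= candidates)
```

Each subset could be handed to a separate worker; the sketch runs them sequentially to stay self-contained. With `local_factor` at most 1, every itemset that is frequent in the whole data set is frequent in at least one subset, so the pooled candidates cover the global result.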

  • articleNo Access

    Comprehensive Survey on Privacy Preserving Association Rule Mining: Models, Approaches, Techniques and Algorithms

    In recent years, a new research area known as privacy preserving data mining (PPDM) has emerged and captured the attention of many researchers interested in preventing the privacy violations that may occur during data mining. In this paper, we provide a review of studies on PPDM in the context of association rules (PPARM). This paper systematically defines the scope of this survey and determines the PPARM models. The problems of each model are formally described, and we discuss the relevant approaches, techniques and algorithms that have been proposed in the literature. A profile of each model and the accompanying algorithms are provided with a comparison of the PPARM models.

  • articleNo Access

    Mining Undominated Association Rules Through Interestingness Measures

    The increasing growth of databases raises an urgent need for more accurate methods to better understand the stored data. In this scope, association rules have been extensively used for the analysis and comprehension of huge amounts of data. However, the number of generated rules is too large to be efficiently analyzed and explored in any further process. To overcome this obstacle, an efficient selection of rules has to be performed. Since selection is necessarily based on evaluation, many interestingness measures have been proposed. However, the abundance of these measures gave rise to a new problem, namely the heterogeneity of the evaluation results, which creates confusion for the decision maker. In this respect, we propose a novel approach to discover interesting association rules without favoring or excluding any measure, by adopting the notion of dominance between association rules. Our approach bypasses the problem of measure heterogeneity and unveils a compromise between their evaluations. Interestingly enough, the proposed approach also avoids another non-trivial problem, namely the specification of threshold values. Extensive experiments carried out on benchmark datasets show the benefits of the introduced approach.
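
One common way to read "dominance" is the Pareto sense: a rule dominates another if it is at least as good on every measure and strictly better on at least one. The sketch below uses that reading as an assumption; the paper's exact definition may differ, and the rule names and measure values are invented for illustration.

```python
def dominates(a, b):
    """True if measure vector `a` is >= `b` everywhere and > `b` somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def undominated(rules):
    """Names of rules that no other rule dominates (no threshold needed)."""
    return {
        name
        for name, measures in rules.items()
        if not any(
            dominates(other_measures, measures)
            for other, other_measures in rules.items()
            if other != name
        )
    }


# Hypothetical rules scored on (support, confidence, lift).
rules = {
    "r1": (0.6, 0.9, 1.2),
    "r2": (0.5, 0.8, 1.1),  # worse than r1 on every measure
    "r3": (0.7, 0.7, 1.0),  # better support than r1, worse elsewhere
}
selected = undominated(rules)
print(sorted(selected))
```

Here r2 is discarded because r1 beats it on all three measures, while r1 and r3 both survive because neither is better everywhere. No single measure is favored and no cut-off value has to be chosen.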

  • articleNo Access

    Incremental Updates of Discovered Multi-Level Association Rules

    Updating the single- and multi-level association rules discovered in large databases is inherently costly. The straightforward approach of re-running the discovery algorithm on the entire updated database to re-discover the association rules is not cost-effective. An incremental algorithm, FUP, has been proposed for updating discovered single-level association rules. In this study, we show that the incremental technique in FUP can be generalized to other data mining systems. An efficient algorithm, MLUp, is proposed for updating discovered multi-level association rules. Our performance study shows that MLUp outperforms a representative mining algorithm such as ML-T2 in updating discovered multi-level association rules.
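
The saving that an incremental update offers over re-running discovery can be sketched for single items. This is a simplified illustration of the general idea, not FUP or MLUp as published; the function name and the toy data are assumptions. Items that were already frequent only need their counts in the new transactions, and an item that was not frequent before can become frequent only if it is frequent in the new transactions, so only those few items require a look at the old database.

```python
from collections import Counter


def update_frequent_items(old_counts, old_db, increment, min_support):
    """Update frequent single items after `increment` is appended to `old_db`.

    `old_counts` holds the support counts of the items that were frequent
    in `old_db` (the result of the earlier mining run).
    """
    new_size = len(old_db) + len(increment)
    threshold = min_support * new_size
    inc_counts = Counter(item for t in increment for item in t)

    updated = {}
    # Previously frequent items: old count plus count in the increment.
    for item, old_count in old_counts.items():
        total = old_count + inc_counts.get(item, 0)
        if total >= threshold:
            updated[item] = total

    # Previously infrequent items: worth rechecking only if frequent in the increment.
    for item, inc_count in inc_counts.items():
        if item in old_counts or inc_count < min_support * len(increment):
            continue
        total = inc_count + sum(1 for t in old_db if item in t)
        if total >= threshold:
            updated[item] = total
    return updated


old_db = [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"b"}]
old_counts = {"a": 3, "b": 3}  # frequent at min_support = 0.5 in old_db
increment = [{"c"}, {"a", "c"}, {"c"}, {"c"}]

updated = update_frequent_items(old_counts, old_db, increment, 0.5)
print(updated)
```

After the update, "b" drops out (3 of 8 transactions), "a" stays (4 of 8), and "c" enters (5 of 8), which matches a full recount of the combined data.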

  • articleNo Access

    MINING ASSOCIATION RULES FROM MARKET BASKET DATA USING SHARE MEASURES AND CHARACTERIZED ITEMSETS

    We propose the share-confidence framework for knowledge discovery from databases, which addresses the problem of mining characterized association rules from market basket data (i.e., itemsets). Our goal is to not only discover the buying patterns of customers, but also to discover customer profiles by partitioning customers into distinct classes. We present a new algorithm for classifying itemsets based upon characteristic attributes extracted from census or lifestyle data. Our algorithm combines the Apriori algorithm for discovering association rules between items in large databases, and the AOG algorithm for attribute-oriented generalization in large databases. We show how characterized itemsets can be generalized according to concept hierarchies associated with the characteristic attributes. Finally, we present experimental results that demonstrate the utility of the share-confidence framework.

  • articleNo Access

    CLOSED SET BASED DISCOVERY OF MAXIMAL COVERING RULES

    Many knowledge discovery tasks consist in mining databases. Nevertheless, there are cases in which a user is not allowed to access the database and can deal only with a provided fraction of knowledge. Still, the user hopes to find new interesting relationships. Surprisingly, a small number of patterns can be augmented into new knowledge so considerably that its analysis may become infeasible. In the article, we offer a method of inferring the concise lossless and sound representation of association rules in the form of maximal covering rules from a concise lossless representation of all derivable patterns. The respective algorithm is offered as well.

  • articleNo Access

    AN ALTERNATIVE APPROACH TO DISCOVER GRADUAL DEPENDENCIES

    In this paper we propose a new definition of gradual dependence as a special kind of association rule. We propose a way to adapt existing association rule mining algorithms for the new task of mining such dependencies, and we discuss its complexity. Some experiments on a real database illustrate the usefulness of the approach.

  • articleNo Access

    FUSING FUZZY ASSOCIATION RULE-BASED CLASSIFIERS USING SUGENO INTEGRAL WITH ORDERED WEIGHTED AVERAGING OPERATORS

    The time or space complexity of a single classifier may increase considerably if all features are taken into account. Thus, it is reasonable to train a single classifier on partial features. A set of multiple classifiers can then be generated, and the outputs of the different classifiers are subsequently aggregated. The aim of this paper is to propose a classification system with a heuristic fusion scheme in which multiple fuzzy association rule-based classifiers with partial features are combined, and to show the feasibility and effectiveness of fusing multiple classifiers through the Sugeno integral extended by ordered weighted averaging operators. Computer simulations on the iris data and the appendicitis data show that, in comparison with the plain Sugeno integral, the overall classification accuracy rate can be improved by the Sugeno integral with ordered weighted averaging operators. The experimental results further demonstrate that the proposed method performs well in comparison with other fuzzy or non-fuzzy classification methods.
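
The two aggregation operators named in this abstract can be sketched separately. How the paper combines them is not stated in the abstract, so the code below shows only the standard building blocks; the classifier names, scores, weights, and fuzzy measure values are assumptions made for illustration.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def sugeno_integral(scores, measure):
    """Discrete Sugeno integral of `scores` with respect to a fuzzy measure.

    `scores` maps each source to a value in [0, 1]; `measure` maps coalitions
    (frozensets of sources) to their importance in [0, 1]. Only the nested
    coalitions of top-ranked sources are looked up.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = 0.0
    for i in range(1, len(ranked) + 1):
        coalition = frozenset(ranked[:i])
        best = max(best, min(scores[ranked[i - 1]], measure[coalition]))
    return best


# Hypothetical outputs of three classifiers for one class label.
scores = {"c1": 0.9, "c2": 0.6, "c3": 0.3}

# Hypothetical monotone fuzzy measure on the coalitions that are needed.
measure = {
    frozenset({"c1"}): 0.4,
    frozenset({"c1", "c2"}): 0.7,
    frozenset({"c1", "c2", "c3"}): 1.0,
}

fused = sugeno_integral(scores, measure)
averaged = owa(scores.values(), (0.5, 0.3, 0.2))
print(fused, averaged)
```

The Sugeno integral takes the best trade-off between how high a group of classifiers scores and how much weight that group carries, while the OWA operator weights scores by their rank rather than by which classifier produced them.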

  • articleNo Access

    PATTERN EXTRACTION FROM BAG DATABASES

    Many databases in real life involve pairs of the form (item,quantity). This kind of data can be characterized using the theory of bags. We present a general framework for extracting useful knowledge from bag databases using different types of patterns. Here we present different approaches for the task of discovering fuzzy association rules and gradual dependencies in bag databases.

  • articleNo Access

    STUDYING INTEREST MEASURES FOR ASSOCIATION RULES THROUGH A LOGICAL MODEL

    Many papers have addressed the task of proposing a set of convenient axioms that a good rule interestingness measure should fulfil. We provide a new study of the principles proposed until now by means of the logic model proposed by Hájek et al. [14] In this model association rules can be viewed as general relations of two itemsets quantified by means of a convenient quantifier [28]. Moreover, we propose and justify the addition of two new principles to the three proposed by Piatetsky-Shapiro [27]. We also use the logic approach for studying the relation between the different classes of quantifiers and these axioms. We define new classes of quantifiers according to the notions of strong and very strong rules, and we present a quantifier based on the certainty factor measure [3, 17], studying its most salient features.
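
For readers unfamiliar with the certainty factor, a minimal sketch of the measure as it is usually defined for a rule A ⇒ C is given below; the function name and argument names are assumptions, and the quantifier built on it in the paper is not reproduced. The measure compares the rule's confidence with the support of the consequent, and it is zero when A and C are statistically independent, which is the first of Piatetsky-Shapiro's three principles.

```python
def certainty_factor(supp_a, supp_c, supp_ac):
    """Certainty factor of the rule A => C, in [-1, 1].

    supp_a, supp_c: relative supports of antecedent and consequent.
    supp_ac: relative support of the two together.
    """
    conf = supp_ac / supp_a
    if conf > supp_c:
        # Knowing A raises belief in C: scale by the room left above supp(C).
        return (conf - supp_c) / (1 - supp_c)
    if conf < supp_c:
        # Knowing A lowers belief in C: scale by supp(C).
        return (conf - supp_c) / supp_c
    return 0.0


independent = certainty_factor(0.5, 0.5, 0.25)  # supp(AC) = supp(A) * supp(C)
always = certainty_factor(0.5, 0.5, 0.5)        # C holds whenever A holds
never = certainty_factor(0.5, 0.5, 0.0)         # C never holds with A
print(independent, always, never)
```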

  • articleNo Access

    NEW APPROACHES FOR DISCOVERING EXCEPTION AND ANOMALOUS RULES

    Mining association rules is a well-known framework for extracting useful knowledge from databases. Despite the proven applicability of association rules, there exist other approaches that also search for novel and useful information, such as peculiarities, infrequent rules, exceptions, or anomalous rules. The common feature of these proposals is the low support of such rules, so efficient algorithms are needed to extract them.

    The principal objective of this paper is to provide a unified framework for dealing with such rules. To this end, we take advantage of an existing logic approach called GUHA. This model was first presented in the mid-1960s by Hájek et al. and has been developed further by Rauch and others in the last decade.

    Following this line, this paper also makes several contributions. First, it provides a deep analysis of the semantics and formulation of exception and anomalous rules. Second, we define the so-called double rules as a new type of rule which, in conjunction with exceptions and anomalies, describes in more detail the relationship between two sets of items. Third, we give new approaches for mining them and propose an algorithm with reasonably good performance.