  • Article (No Access)

    User click fraud detection method based on Top-Rank-k frequent pattern mining

    A user click fraud detection method based on the Top-Rank-k frequent pattern mining algorithm is presented to address the click fraud problem in current online advertising. First, the method combines the click frequency of event samples to calculate the real evaluation score of the click stream, derives the click-stream density function and the evaluation score expression under multi-dimensional variables, and then obtains the time complexity of the next user's click fraud process. Second, a click fraud detection algorithm is designed around Top-Rank-k frequent patterns and used to identify the users who commit click fraud. The results show that the method is efficient and correct, and is superior to other similar algorithms.
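As a rough illustration of what "Top-Rank-k" means, the sketch below assumes the usual definition: a pattern qualifies when its support is among the k highest distinct support values. It enumerates itemsets by brute force, which is not the algorithm of the paper, and the click sessions and the `max_len` cap are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_rank_k(transactions, k, max_len=3):
    """Return {pattern: support} for patterns whose support rank is <= k.

    Brute-force enumeration, intended only to illustrate the definition.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, min(max_len, len(items)) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    # Rank is taken over distinct support values, highest first.
    top_supports = set(sorted(set(counts.values()), reverse=True)[:k])
    return {p: s for p, s in counts.items() if s in top_supports}


# Hypothetical click sessions: each list holds the ads clicked in one session.
clicks = [
    ["ad1", "ad2"],
    ["ad1", "ad2", "ad3"],
    ["ad1", "ad3"],
    ["ad1", "ad2"],
]
result = top_rank_k(clicks, k=2)
print(result)
```

With k=2, every pattern tied at the second-highest support value is kept, so the result can hold more than two patterns.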

  • Article (No Access)

    Revealing the Gap Between Skills of Students and the Evolving Skills Required by the Industry of Information and Communication Technology

    Along with the fast development of information and communication technology (ICT), the job skills required by ICT industries are also evolving very rapidly. It becomes difficult for ICT students to assess the gap between their own skills and these evolving skills. Even though schools perform periodic curriculum evaluations, the time gap between evaluations lets the curriculum go out of date easily, since it cannot keep up with the large and quick changes occurring in the industry. We propose novel solutions by introducing measures and visualization tools to reveal this skill gap. Using evolutionary-based data mining, the skillsets mastered by students were collected from their study reports, while the frequent skillsets required by the industry were mined from job adverts; based on these skillsets, the skill coverage of the students was approximated. The proposed solutions were then tested on data obtained from an Indonesian higher education institution, since Indonesia implements a competence-based curriculum in its education system. Experiments show that the proposed approaches not only reveal and visualize the gap, but also monitor changes in the skill requirements, which helps school administrators when updating the curriculum.

  • Article (No Access)

    MINING FREQUENT TEMPORAL PATTERNS IN INTERVAL SEQUENCES

    Recently a new type of data source came into the focus of knowledge discovery from temporal data: interval sequences. In contrast to event sequences, interval sequences contain labeled events with a temporal extension. However, existing algorithms for mining patterns from interval sequences proved to be far from satisfying our needs. In brief, we missed an approach that, at the same time, defines support as the number of pattern instances, allows input data that consists of more than one sequence, implements time constraints on a pattern instance, and counts multiple instances of a pattern within one interval sequence. In this paper we propose a new support definition which incorporates these properties. We also describe FSMSet, an algorithm that employs the new support definition, and demonstrate its performance on field data from the automotive business.

  • Article (No Access)

    A Novel Model for Mining Frequent Patterns Based on Embedded Granular Computing

    For mining frequent patterns, the Apriori mining model is very expensive because it reads the database repeatedly, while the highly condensed data structure of the FP-growth mining model costs more memory. To avoid the disadvantages of these models, this paper proposes a novel data mining model for discovering frequent patterns, based on embedded granular computing, which differs from both the Apriori model and the FP-growth model. The model adopts the divide-and-conquer strategy of granular computing and can adaptively construct different hierarchical granules. To form the data mining model, an embedded granular computing model is also proposed in this paper. When used to discover frequent patterns, the granular computing model has three benefits. First, it avoids reading the database repeatedly by constructing the extended information granule, which also reduces the amount of support computation. Second, it reduces memory requirements through the attribute granule, where the search space compresses the memory used by the data structure and makes candidate generation relatively simple. Third, it divides an overly large computing task into several easier operations via the attribute granule; that is, the embedded granular computing model shrinks the search space from one super-state to several sub-states. All experimental results show that the data mining model based on embedded granular computing is more reasonable and efficient than these classical models for mining frequent patterns on different types of datasets. In addition, a further discussion describes the performance trend of the model through a group of experiments.
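To make the cost attributed to Apriori concrete, the following minimal sketch of the classical level-wise Apriori procedure counts how many support-counting passes it makes over the data. It is a baseline illustration only, not the granular computing model proposed in the paper; the toy database and the `min_support` value are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise Apriori; returns (frequent itemsets, number of counting passes)."""
    transactions = [frozenset(t) for t in transactions]
    frequent = {}
    scans = 0
    size = 1
    candidates = {frozenset([item]) for t in transactions for item in t}
    while candidates:
        # One full pass over the database per candidate size.
        scans += 1
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent, scans


# Toy transaction database (hypothetical).
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent, scans = apriori(db, min_support=3)
print(len(frequent), "frequent itemsets found in", scans, "passes")
```

The number of passes grows with the size of the longest candidate itemset, which is the repeated-reading cost the abstract refers to.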

  • Article (No Access)

    A Novel Approach for Mining Time and Space Proximity-based Frequent Sequential Patterns from Trajectory Data

    Trajectory data have been regarded as a rich source of hidden patterns that provide a deeper understanding of the underlying moving objects. Several studies focus on extracting repetitive, frequent and group patterns. Conventional algorithms defined for sequential pattern mining problems are not directly applicable to trajectory data. Space-partitioning strategies have been proposed that capture space proximity first and then time proximity to discover the knowledge in the data. Our proposal addresses time proximity first by identifying trajectories which meet at a minimum of K time stamps in sequence. A novel tree structure is proposed to ease the process. Our method investigates space proximity using Mahalanobis distance (MD). We use the Manhattan distance to form prior knowledge that helps the supervised learning-based MD derive clusters of trajectories along the true spreads of the objects. With the help of a minsup threshold, clusters of frequent trajectories are found, and in sequence they form K-length sequential patterns. Illustrative examples are provided to compare the MD metric with the Euclidean distance metric. A synthetic dataset is generated, and results are presented for various parameters such as the number of objects, minsup, the K value, the number of hops in any trajectory, and computational time. Experiments are also carried out on an available real-time dataset, a taxi dataset. Sequential patterns prove to be valuable knowledge for understanding the dynamics of moving objects and for recommending movements in constrained networks.
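The difference between the Mahalanobis and Euclidean metrics can be seen in a small two-dimensional sketch. It estimates the covariance directly from the sample, which is a simplification and not the supervised learning-based MD described in the abstract; the point set and the two query points are hypothetical.

```python
import math


def mean_and_covariance(points):
    """Sample mean and 2x2 sample covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(point, mean, cov):
    """Mahalanobis distance of a 2D point from the mean under covariance cov."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form with the closed-form inverse of a 2x2 symmetric matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)


# Hypothetical positions spread widely along x and narrowly along y.
points = [(-4.0, 0.0), (4.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
mean, cov = mean_and_covariance(points)

along = (2.0, 0.0)   # offset along the direction of large spread
across = (0.0, 2.0)  # offset across it

print("Euclidean:  ", math.dist(along, mean), math.dist(across, mean))
print("Mahalanobis:", mahalanobis(along, mean, cov), mahalanobis(across, mean, cov))
```

Both query points are equally far from the mean in Euclidean terms, but the Mahalanobis distance treats the point lying across the narrow spread as much farther away, which is why it can follow the shape of a cluster.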

  • Article (Open Access)

    Efficient Mining of Non-Redundant Periodic Frequent Patterns

    Periodic frequent patterns are frequent patterns which occur at periodic intervals in databases. They are useful in decision making where event occurrence intervals are vital. Traditional algorithms for discovering periodic frequent patterns, however, often report a large number of such patterns, most of which are often redundant as their periodic occurrences can be derived from other periodic frequent patterns. Using such redundant periodic frequent patterns in decision making would often be detrimental, if not trivial. This paper addresses the challenge of eliminating redundant periodic frequent patterns by employing the concept of deduction rules in mining and reporting only the set of non-redundant periodic frequent patterns. It subsequently proposes and develops a Non-redundant Periodic Frequent Pattern Miner (NPFPM) to achieve this purpose. Experimental analysis on benchmark datasets shows that NPFPM is efficient and can effectively prune the set of redundant periodic frequent patterns.
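A short sketch can illustrate what it means for a pattern to occur at periodic intervals. It assumes a definition that is common in the periodic frequent pattern literature but is not stated in the abstract: transactions are numbered from 1, the periods of a pattern are the gaps between its consecutive occurrences (including the gap from the start of the database to the first occurrence and from the last occurrence to the end), and a pattern is periodic frequent when its support reaches `min_sup` and its largest period does not exceed `max_per`. The database and both thresholds are hypothetical, and the sketch does not implement NPFPM or its deduction rules.

```python
def periods(transactions, pattern):
    """Return (support, list of periods) of a pattern in an ordered database."""
    pattern = set(pattern)
    # 1-based positions of the transactions that contain the pattern.
    hits = [i for i, t in enumerate(transactions, start=1) if pattern <= set(t)]
    boundaries = [0] + hits + [len(transactions)]
    gaps = [b - a for a, b in zip(boundaries, boundaries[1:])]
    return len(hits), gaps


def is_periodic_frequent(transactions, pattern, min_sup, max_per):
    """True if the pattern is frequent and never absent for longer than max_per."""
    support, gaps = periods(transactions, pattern)
    return support >= min_sup and max(gaps) <= max_per


# Hypothetical ordered transaction database.
db = [{"a", "b"}, {"c"}, {"a", "b", "c"}, {"a"}, {"a", "b"}, {"c"}]

print(periods(db, {"a", "b"}))
print(is_periodic_frequent(db, {"a", "b"}, min_sup=3, max_per=2))
print(is_periodic_frequent(db, {"c"}, min_sup=3, max_per=2))
```

Here both patterns have the same support, but only the first recurs regularly enough to satisfy the period threshold.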