World Scientific

Efficient Mining of Robust Closed Weighted Sequential Patterns Without Information Loss

    https://doi.org/10.1142/S0218213015500074
    Cited by: 17 (Source: Crossref)

    Sequential pattern mining has become one of the most important topics in data mining. It has broad applications, such as analyzing customer purchase data, Web access patterns, network traffic data, and DNA sequences. Previous studies have concentrated on reducing redundancy among the discovered sequential patterns and on finding meaningful patterns in huge datasets. Closed sequential pattern mining and weighted sequential pattern mining are the two main approaches to these tasks: closed sequential pattern mining finds representative sequential patterns that carry exactly the same knowledge as the complete set of frequent sequential patterns, and weighted sequential pattern mining discovers important sequential patterns by considering the importance of each sequential pattern. In this paper, we study the problem of mining robust closed weighted sequential patterns from large sequence databases by integrating the two paradigms. We first show that the joining order of the weight constraints and the closure property in sequential pattern mining leads to different sets of results. Based on our analysis of joining orders, we propose robust closed weighted sequential pattern mining and show how to discover representative important sequential patterns without information loss. Through performance tests, we show that our approach performs well in terms of efficiency, effectiveness, memory usage, and scalability.
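The claim that the joining order changes the result can be illustrated with a toy example. The following is a minimal brute-force sketch, not the paper's algorithm. It assumes a weighted support defined as support times the average item weight, which the abstract does not specify, and all function names, weights, and thresholds are hypothetical.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequent_patterns(db, min_sup):
    """Brute-force enumeration of all subsequences with support >= min_sup."""
    candidates = set()
    for seq in db:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))
    supports = {p: sum(is_subsequence(p, s) for s in db) for p in candidates}
    return {p: s for p, s in supports.items() if s >= min_sup}


def weighted_support(pattern, support, weights):
    """ASSUMPTION: support scaled by the average weight of the pattern's items."""
    return support * sum(weights[i] for i in pattern) / len(pattern)


def weight_filter(patterns, weights, min_wsup):
    """Keep patterns whose (assumed) weighted support reaches min_wsup."""
    return {p: s for p, s in patterns.items()
            if weighted_support(p, s, weights) >= min_wsup}


def closed_only(patterns):
    """Keep patterns with no longer super-pattern of equal support in the set."""
    return {p: s for p, s in patterns.items()
            if not any(len(q) > len(p) and t == s and is_subsequence(p, q)
                       for q, t in patterns.items())}


# Hypothetical toy data: item "c" has a low weight.
db = [["a", "b", "c"], ["a", "b", "c"]]
weights = {"a": 0.9, "b": 0.8, "c": 0.1}
freq = frequent_patterns(db, min_sup=2)

# Order 1: closure first, then the weight constraint.
closure_then_weight = set(weight_filter(closed_only(freq), weights, 1.5))
# Order 2: weight constraint first, then closure.
weight_then_closure = set(closed_only(weight_filter(freq, weights, 1.5)))

print(closure_then_weight)   # set()
print(weight_then_closure)   # {('a', 'b')}
```

Under these assumptions, applying closure first keeps only ("a", "b", "c"), which then fails the weight threshold, so the heavily weighted pattern ("a", "b") is lost. Applying the weight constraint first retains it. This is the kind of information loss that the choice of joining order can introduce.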