World Scientific

Website Structure Improvement Based on the Combination of Selected Web Structure and Web Usage Mining Methods

    https://doi.org/10.1142/S0219622018500402
    Cited by: 5 (Source: Crossref)

    Web mining methods and techniques can help solve typical issues of contemporary websites, contribute to more effective personalization, improve a website’s structure, and reorganize its web pages. However, only a few papers have tried to combine web structure mining and web usage mining (WUM) methods with this aim. This paper investigates whether and how a combination of selected web structure and WUM methods can identify misplaced web pages and contribute to improving the website structure. It analyzes the relationship between the estimated importance of a web page from the web page creator’s point of view, obtained with a web structure mining method based on PageRank, and visitors’ real perception of the importance of that page, obtained with a WUM method based on sequence pattern analysis, which eliminates the problem of repeated visits to the same web page during one session. The results show that the expected probability of access to an individual web page correlates with the observed visit rate obtained from the log files using the WUM method. Furthermore, the website can be improved by subsequently applying residual analysis to the obtained results. The applicability of the proposed combination of web structure and WUM methods is demonstrated in two case studies from different application domains of the contemporary web. In both cases, the web pages that are underestimated or overestimated by the web page creators are successfully identified.
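The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the link graph, the sessions, and the function names `pagerank`, `visit_rates`, and `residuals` are hypothetical. The visit rate simply counts each page at most once per session as a stand-in for the sequence pattern analysis, and the residual is a plain difference between observed and expected values rather than the paper's residual analysis.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


def visit_rates(sessions):
    """Normalized visit rate per page, counting a page at most once per session."""
    counts = {}
    for session in sessions:
        for page in set(session):  # set() drops repeated visits in a session
            counts[page] = counts.get(page, 0) + 1
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def residuals(expected, observed):
    """Observed minus expected; positive means visited more than the structure suggests."""
    return {p: observed.get(p, 0.0) - expected[p] for p in expected}


# Hypothetical site structure (assumption, not data from the paper).
links = {
    "home": ["courses", "news"],
    "courses": ["home", "contact"],
    "news": ["home"],
    "contact": ["home"],
}

# Hypothetical sessions reconstructed from a server log (assumption).
sessions = [
    ["home", "contact"],
    ["home", "courses", "contact", "contact"],
    ["contact"],
    ["home", "news"],
]

expected = pagerank(links)
observed = visit_rates(sessions)
res = residuals(expected, observed)

for page in sorted(res, key=res.get, reverse=True):
    print(f"{page:8s} expected={expected[page]:.3f} "
          f"observed={observed[page]:.3f} residual={res[page]:+.3f}")
```

In this toy data the page `contact` has the largest positive residual, so under these assumptions it would be a candidate for a page that the site structure underestimates. Pages with strongly negative residuals would be candidates for overestimated pages.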