Collaborative Recommendations using Hierarchical Clustering based on K-d Trees and Quadtrees

Joydeep Das

http://orcid.org/0000-0002-6317-0029

The Heritage Academy, Kolkata, West Bengal, India

E-mail Address: joydeep.das@heritageit.edu

Subhashis Majumder

Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, West Bengal, India

Prosenjit Gupta

Department of Computer Science, NIIT University, Neemrana, Rajasthan, India

Kalyani Mali

Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India

https://doi.org/10.1142/S0218488519500284

Cited by: 20 (Source: Crossref)

Abstract

Most e-commerce sites implement Recommender Systems (RS) to help users navigate a large search space and to assist their decision making by suggesting products that they may like. Collaborative Filtering (CF) is the most successful and widely used algorithm in the domain of RS. However, due to the exponential growth of the web in both content and number of users, CF-based RS face serious scalability issues. To alleviate this problem, we propose a clustering-based CF approach using two hierarchical space-partitioning data structures: the K-d tree and the Quadtree. We partition the user space of the system on the basis of user location and then use the resulting clusters to predict the ratings of a target user. Since the CF-based recommendation algorithm is applied separately to each cluster rather than to the entire rating data, the runtime of the algorithm drops substantially. We further measure spatial autocorrelation indices in the clusters to justify our clustering method. Our objective, however, is not only to reduce the runtime but also to maintain acceptable recommendation quality. The proposed method addresses this requirement and provides scalability, processing very large datasets with the same computing resources. Moreover, our clustering scheme is independent of the underlying CF algorithm. Results from extensive experiments show that our hierarchical clustering-based recommendation approach reduces the runtime of standard CF algorithms by about 88%, 82%, 79% and 85% for the MovieLens-100K, MovieLens-1M, Book-Crossing and TripAdvisor datasets, respectively, while maintaining good recommendation quality.
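
The idea of partitioning users by location and then predicting within a single cluster can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `kd_partition` and `predict`, the median split on alternating axes, the leaf-size threshold, and the toy data are all assumptions made for illustration, and a plain mean of peer ratings stands in for whichever CF algorithm would actually be applied inside each cluster.

```python
from statistics import mean


def kd_partition(users, max_leaf_size, depth=0):
    """Split (user_id, (x, y)) pairs into leaf clusters, K-d tree style.

    Assumption: split at the median, alternating between the x and y axes.
    """
    if max_leaf_size < 1:
        raise ValueError("max_leaf_size must be at least 1")
    if len(users) <= max_leaf_size:
        return [[user_id for user_id, _ in users]]
    axis = depth % 2  # 0 -> x coordinate, 1 -> y coordinate
    ordered = sorted(users, key=lambda user: user[1][axis])
    mid = len(ordered) // 2
    return (kd_partition(ordered[:mid], max_leaf_size, depth + 1)
            + kd_partition(ordered[mid:], max_leaf_size, depth + 1))


def predict(target, item, cluster, ratings):
    """Predict a rating using only peers from the target's own cluster.

    Assumption: a plain mean is a placeholder for a real CF algorithm.
    Returns None when no peer in the cluster has rated the item.
    """
    peer_ratings = [
        ratings[user][item]
        for user in cluster
        if user != target and item in ratings.get(user, {})
    ]
    return mean(peer_ratings) if peer_ratings else None


# Hypothetical toy data: two groups of users that are far apart.
users = [
    ("u1", (0.0, 0.0)), ("u2", (1.0, 0.5)), ("u3", (0.5, 1.0)),
    ("u4", (9.0, 9.0)), ("u5", (10.0, 9.5)), ("u6", (9.5, 10.0)),
]
ratings = {"u2": {"m1": 4}, "u3": {"m1": 5}, "u4": {"m1": 1}}

clusters = kd_partition(users, max_leaf_size=3)
own_cluster = next(c for c in clusters if "u1" in c)
print(clusters)
print(predict("u1", "m1", own_cluster, ratings))
```

Because the prediction step only reads ratings from the target user's cluster, its cost depends on the cluster size rather than on the total number of users, which is the source of the runtime reduction described above. A Quadtree variant would split each region into four quadrants instead of two halves.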

Keywords: