Cleaned Similarity for Better Memory-Based Recommenders

05/17/2019
by Farhan Khawar, et al.

Memory-based collaborative filtering methods such as user or item k-nearest neighbors (kNN) are a simple yet effective solution to the recommendation problem. The backbone of these methods is the estimation of the empirical similarity between users or items. In this paper, we analyze the spectral properties of the Pearson and cosine similarity estimators, and we use tools from random matrix theory to argue that both suffer from noise and eigenvalue spreading. We argue that, unlike the Pearson correlation, the cosine similarity naturally possesses the desirable property of eigenvalue shrinkage for large eigenvalues. However, due to its zero-mean assumption, it overestimates the largest eigenvalues. We quantify this overestimation and present a simple re-scaling and noise-cleaning scheme. This results in better performance of the memory-based methods compared to their vanilla counterparts.
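
To make the zero-mean assumption concrete, the following is a minimal sketch (not the paper's code) that compares the two estimators on a pair of toy interaction vectors. The function names `cosine` and `pearson` and the data `item_a`, `item_b` are illustrative assumptions. The only difference between the two estimators is that Pearson subtracts each vector's mean before normalizing, whereas cosine treats the raw vectors as if they were already centered.

```python
import math

def cosine(a, b):
    """Cosine similarity of two raw vectors (no mean-centering)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pearson(a, b):
    """Pearson correlation: cosine similarity after subtracting each vector's mean."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])

# Toy implicit-feedback columns for two hypothetical items (1 = interaction).
item_a = [1, 0, 1, 1]
item_b = [1, 1, 0, 1]

print(round(cosine(item_a, item_b), 3))   # 0.667
print(round(pearson(item_a, item_b), 3))  # -0.333
```

On this toy pair the uncentered estimate is clearly positive while the centered one is negative, which shows how strongly the nonzero mean can dominate the cosine value. It is only a pairwise illustration and does not reproduce the spectral analysis or the cleaning scheme described in the abstract.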

research
10/31/2014

A Latent Source Model for Online Collaborative Filtering

Despite the prevalence of collaborative filtering in recommendation syst...
research
10/13/2022

Sapling Similarity outperforms other local similarity metrics in collaborative filtering

Many bipartite networks describe systems where a link represents a relat...
research
07/11/2018

The importance of being dissimilar in Recommendation

Similarity measures play a fundamental role in memory-based nearest neig...
research
11/24/2021

Combinations of Jaccard with Numerical Measures for Collaborative Filtering Enhancement: Current Work and Future Proposal

Collaborative filtering (CF) is an important approach for recommendation...
research
10/27/2020

Source Enumeration via RMT Estimator Based on Linear Shrinkage Estimation of Noise Eigenvalues Using Relatively Few Samples

Estimating the number of signals embedded in noise is a fundamental prob...
research
06/05/2017

SimDex: Exploiting Model Similarity in Exact Matrix Factorization Recommendations

We present SimDex, a new technique for serving exact top-K recommendatio...
