Given Users Recommendations Based on Reviews on Yelp
In this project, we focus on NLP-based hybrid recommendation systems, using data from Yelp. Our hybrid recommendation system has two major components: the first embeds the reviews with the BERT and word2vec models; the second implements an item-based collaborative filtering algorithm that computes the similarity between reviews within each restaurant category. With these similarity scores, we can recommend to users the restaurants that best match their recorded reviews. The code is split into several stages: sample selection and data cleaning, processing, embedding, similarity computation, and prediction and error computation. Because the data is large, each stage writes one or more JSON files as checkpoints, which reduces memory pressure and lets the stages pass data to one another without being run together.
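The similarity and prediction stages described above can be illustrated with a minimal sketch. It assumes review embeddings are already available as one fixed-length vector per restaurant, uses cosine similarity as the similarity measure, and predicts a rating as a similarity-weighted average of the user's existing ratings. The function names, the choice of cosine similarity, and the toy data are assumptions made for illustration; they are not taken from the project's code.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T

def predict_rating(user_ratings, sim, target):
    """Similarity-weighted average of the user's ratings on other items.

    user_ratings: dict mapping item index -> rating (hypothetical layout)
    sim: item-item similarity matrix
    target: index of the item whose rating is being predicted
    """
    rated = [i for i in user_ratings if i != target]
    # Negative similarities are clipped to zero so they carry no weight.
    weights = np.array([max(sim[target, i], 0.0) for i in rated])
    if weights.sum() == 0:
        return None  # no similar rated item to base a prediction on
    ratings = np.array([user_ratings[i] for i in rated])
    return float(weights @ ratings / weights.sum())

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy embeddings: restaurants 0 and 1 are identical, restaurant 2 is orthogonal.
toy_embeddings = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
toy_sim = cosine_similarity_matrix(toy_embeddings)
toy_prediction = predict_rating({1: 5.0, 2: 1.0}, toy_sim, target=0)
print(toy_prediction, rmse([toy_prediction], [4.0]))
```

In a pipeline like the one described, the embedding matrix would be loaded from the JSON file written by the embedding stage, and the similarity matrix would in turn be saved for the prediction stage.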