This project surveys neighborhood-based collaborative filtering techniques and evaluates their effectiveness in a movie recommendation system. The similarity metrics surveyed are:
- Pearson Correlation Coefficient
- Spearman Rank Correlation Coefficient
- Mean-Squared Distance
- Cosine Similarity
Run all of the following commands from the root directory of the repository. ALGORITHM is a placeholder for the name of the chosen algorithm.
Install the dependencies:
pip install -r requirements.txt
Train a model:
python Code/runner.py --mode train --algorithm ALGORITHM --model-file ALGORITHM.model --data Data/ratings.csv
Generate predictions with a trained model (the default for --num-neighbors is 5):
python Code/runner.py --mode test --algorithm ALGORITHM --model-file ALGORITHM.model --num-neighbors NUM_NEIGHBORS --data Data/ratings.csv --predictions-file ALGORITHM.predictions
Evaluate the predictions:
python Code/eval.py Data/ratings.csv ALGORITHM.predictions
Run the tests:
python test.py
Supported values for --algorithm:
- pearson
- spearman
- mean_squared
- cosine
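The --num-neighbors option points to a k-nearest-neighbor style of prediction. The sketch below shows one common way such a predictor can work: find the users most similar to the target user among those who rated the item, then take a similarity-weighted average of their ratings. It is an assumption about the general technique, not the logic in Code/runner.py. The names `predict_rating` and `cosine`, the nested-dict ratings layout, and the rule of ignoring neighbors with non-positive similarity are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length rating lists."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def predict_rating(target_user, item, ratings, similarity=cosine, num_neighbors=5):
    """Predict target_user's rating for item from the most similar users.

    ratings maps user -> {item: rating}. Returns None when no usable
    neighbor has rated the item.
    """
    own = ratings[target_user]
    candidates = []
    for other, other_ratings in ratings.items():
        if other == target_user or item not in other_ratings:
            continue
        # Compare the two users only on items both have rated.
        shared = [i for i in own if i in other_ratings]
        if len(shared) < 2:
            continue
        a = [own[i] for i in shared]
        b = [other_ratings[i] for i in shared]
        candidates.append((similarity(a, b), other_ratings[item]))

    # Keep the num_neighbors most similar users, dropping non-positive weights.
    candidates.sort(key=lambda c: c[0], reverse=True)
    neighbors = [(s, r) for s, r in candidates[:num_neighbors] if s > 0]
    if not neighbors:
        return None
    total_weight = sum(s for s, _ in neighbors)
    return sum(s * r for s, r in neighbors) / total_weight


if __name__ == "__main__":
    # Hypothetical toy data: three users, three movies.
    toy = {
        "u1": {"a": 5, "b": 3},
        "u2": {"a": 5, "b": 3, "c": 4},
        "u3": {"a": 1, "b": 5, "c": 2},
    }
    print(predict_rating("u1", "c", toy))
```

Swapping in a different similarity function changes only how neighbors are ranked and weighted; a distance such as mean-squared distance would first need to be turned into a similarity.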
For a detailed understanding of the study, see the writeup attached to this repo.
Project tasks:
- Establish data pipeline
- Set up prediction/testing environment
- Write evaluation script
- Write/test Pearson Correlation Coefficient
- Write/test Spearman Rank Correlation Coefficient
- Write/test Mean-Squared Distance
- Write/test Cosine Similarity
- Write final analysis