---
license: mit
tags:
- recommendation-system
- collaborative-filtering
- matrix-factorization
- movie-recommendations
- movielens
- machine-learning
library_name: scikit-learn
---

# DataSynthis_ML_JobTask

A movie recommendation system that uses collaborative filtering and matrix factorization on the MovieLens 100k dataset.

## Model Description

This model provides personalized movie recommendations using two algorithms:

- **Collaborative Filtering (CF)**: Item-based similarity using cosine similarity
- **Matrix Factorization (SVD)**: Singular Value Decomposition for dimensionality reduction

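To make the item-based approach concrete, here is a minimal NumPy sketch of cosine similarity between movie columns and a similarity-weighted score for unrated movies. The toy matrix and the helper names `item_cosine_similarity` and `score_unseen_items` are invented for illustration; the packaged model may compute and weight similarities differently.

```python
import numpy as np

# Toy user-item matrix (rows: users, columns: movies); 0 means "not rated".
# The values are invented for illustration only.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)


def item_cosine_similarity(matrix):
    """Cosine similarity between every pair of item (column) vectors."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for items nobody rated
    normalized = matrix / norms
    return normalized.T @ normalized


def score_unseen_items(matrix, similarity, user_index):
    """Similarity-weighted average of the user's own ratings, per item."""
    user_ratings = matrix[user_index]
    rated = user_ratings > 0
    weights = similarity[:, rated]
    denom = np.abs(weights).sum(axis=1)
    denom[denom == 0] = 1.0
    scores = weights @ user_ratings[rated] / denom
    scores[rated] = -np.inf  # never recommend a movie the user already rated
    return scores


similarity = item_cosine_similarity(ratings)
scores = score_unseen_items(ratings, similarity, user_index=0)
best_item = int(np.argmax(scores))  # column index of the top unrated movie
```

Because the score is a weighted average of ratings the user has already given, it stays on the same 1-5 scale, which is part of what makes this method easy to interpret.
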
## Dataset

- **MovieLens 100k**: 100,000 ratings from 943 users on 1,682 movies
- **User ID Range**: 1-943
- **Movie Count**: 1,682 unique movies
- **Rating Scale**: 1-5 stars

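The figures above imply a sparse user-item matrix: 100,000 ratings over 943 × 1,682 cells is roughly 6.3% filled. The sketch below builds such a matrix from rows in the tab-separated layout of the standard MovieLens 100k `u.data` file (user id, item id, rating, timestamp). The three sample rows are made up, and `build_rating_matrix` is a hypothetical helper; how this repository actually loads the data is not documented here.

```python
import csv
import io

import numpy as np

N_USERS, N_MOVIES = 943, 1682  # sizes quoted in the dataset summary

# Three made-up rows in the u.data layout: user_id, item_id, rating, timestamp.
sample = "1\t50\t5\t881250949\n1\t181\t4\t881250950\n2\t50\t3\t881250951\n"


def build_rating_matrix(text, n_users=N_USERS, n_movies=N_MOVIES):
    """Return a dense (users x movies) matrix; 0 marks a missing rating."""
    matrix = np.zeros((n_users, n_movies), dtype=np.float32)
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for user_id, item_id, rating, _timestamp in reader:
        # IDs in the file are 1-based, array indices are 0-based.
        matrix[int(user_id) - 1, int(item_id) - 1] = float(rating)
    return matrix


matrix = build_rating_matrix(sample)

# Fraction of cells that would be filled by the full 100,000 ratings.
full_density = 100_000 / (N_USERS * N_MOVIES)
```
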
## Usage

### Python

```python
from model import predict

# Get recommendations using SVD (default)
recommendations = predict(user_id=1, n_recommendations=10, method="svd")

# Get recommendations using collaborative filtering
recommendations = predict(user_id=1, n_recommendations=10, method="cf")

print(recommendations)
```

### Parameters

- **user_id** (int): User ID between 1-943 (required)
- **n_recommendations** (int): Number of recommendations between 1-20 (default: 10)
- **method** (str): "svd" for matrix factorization or "cf" for collaborative filtering (default: "svd")

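Callers may want to check arguments against these ranges before calling `predict`. The following is a small sketch of such a check; `validate_request` is a hypothetical helper and not part of this repository, and the exceptions `predict` itself raises on bad input are not documented here.

```python
VALID_METHODS = ("svd", "cf")


def _require_int(name, value):
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} must be an int, got {type(value).__name__}")


def validate_request(user_id, n_recommendations=10, method="svd"):
    """Check arguments against the documented ranges and return them as a dict."""
    _require_int("user_id", user_id)
    _require_int("n_recommendations", n_recommendations)
    if not 1 <= user_id <= 943:
        raise ValueError("user_id must be between 1 and 943")
    if not 1 <= n_recommendations <= 20:
        raise ValueError("n_recommendations must be between 1 and 20")
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}")
    return {
        "user_id": user_id,
        "n_recommendations": n_recommendations,
        "method": method,
    }


request = validate_request(1, n_recommendations=5, method="cf")
```

The returned dictionary can be unpacked into the call, for example `predict(**request)`, assuming `predict` accepts these keyword names as shown in the usage example.
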
### Output

Returns a list of dictionaries with movie recommendations:

```json
[
  {
    "movie_id": 50,
    "title": "Star Wars (1977)",
    "predicted_rating": 4.5
  },
  {
    "movie_id": 181,
    "title": "Return of the Jedi (1983)",
    "predicted_rating": 4.3
  }
]
```

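As an illustration of working with this structure, the sketch below parses the sample output above, ranks the entries by `predicted_rating`, and formats them for display. It assumes only the three keys shown; the 4.4 threshold is an arbitrary value chosen for the example.

```python
import json

# The sample output from this section, as a JSON string.
raw = """
[
  {"movie_id": 50, "title": "Star Wars (1977)", "predicted_rating": 4.5},
  {"movie_id": 181, "title": "Return of the Jedi (1983)", "predicted_rating": 4.3}
]
"""

recommendations = json.loads(raw)

# Rank by predicted rating, highest first.
ranked = sorted(recommendations, key=lambda rec: rec["predicted_rating"], reverse=True)

# Human-readable lines such as "1. Star Wars (1977) (4.5)".
lines = [
    f'{rank}. {rec["title"]} ({rec["predicted_rating"]:.1f})'
    for rank, rec in enumerate(ranked, start=1)
]

# Keep only confident predictions; the 4.4 cut-off is arbitrary.
strong_ids = [rec["movie_id"] for rec in ranked if rec["predicted_rating"] >= 4.4]
```
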
## Model Performance

- **SVD Method**: Fast predictions with good accuracy using 20 components
- **Collaborative Filtering**: More interpretable, based on item similarity
- **Cold Start Handling**: Graceful error handling for unknown users

## Technical Details

- **Framework**: Scikit-learn
- **Algorithms**: TruncatedSVD, Cosine Similarity
- **Data Processing**: Pandas for efficient matrix operations
- **Memory Efficient**: Optimized for large-scale recommendation tasks

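The idea behind the SVD method can be sketched as a rank-k approximation of the rating matrix, where the reconstructed entries act as predicted scores. The example below uses `numpy.linalg.svd` as a stand-in for scikit-learn's `TruncatedSVD`, on a random toy matrix. Whether the packaged model reconstructs scores this way is an assumption, and `low_rank_scores` and `top_unrated` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random toy data: 30 users x 40 movies, roughly 30% of cells rated (1-5).
ratings = rng.integers(1, 6, size=(30, 40)).astype(float)
ratings[rng.random((30, 40)) >= 0.3] = 0.0
ratings[0, :10] = 0.0  # make sure user 0 has unrated movies to recommend


def low_rank_scores(matrix, n_components):
    """Best rank-k approximation of the matrix in the least-squares sense."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = n_components
    return (u[:, :k] * s[:k]) @ vt[:k]


def top_unrated(matrix, scores, user_index, n):
    """Indices of the n highest-scoring movies the user has not rated."""
    candidates = np.where(matrix[user_index] == 0)[0]
    order = np.argsort(scores[user_index, candidates])[::-1]
    return candidates[order[:n]].tolist()


approx = low_rank_scores(ratings, n_components=20)  # 20 components, as above
top = top_unrated(ratings, approx, user_index=0, n=5)
```

Fewer components give a smoother, more generalized reconstruction, while more components track the observed ratings more closely. With as many components as the matrix has rank, the reconstruction simply reproduces the input.
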
## Installation

```bash
pip install pandas numpy scikit-learn
```

## Training

The model is pre-trained on the MovieLens 100k dataset. To retrain:

```python
from model import MovieRecommender

model = MovieRecommender()
model.load_data()
model.train()
model.save_model("movie_recommender.pkl")
```

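The `.pkl` extension suggests the model is serialized with Python's `pickle` module, although that is an assumption about `save_model`. The sketch below shows a pickle round trip on a plain dictionary standing in for trained state. The `save_state` and `load_state` helpers and the dictionary contents are hypothetical, with placeholder values borrowed from the sample output.

```python
import pickle
import tempfile
from pathlib import Path


def save_state(state, path):
    """Serialize a picklable object to disk."""
    with open(path, "wb") as handle:
        pickle.dump(state, handle)


def load_state(path):
    """Load a pickled object. Only unpickle files from sources you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Placeholder "trained state": movie_id -> score.
state = {"n_components": 20, "scores": {50: 4.5, 181: 4.3}}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "movie_recommender.pkl"
    save_state(state, path)
    restored = load_state(path)
```

Unpickling can execute arbitrary code, so a downloaded `.pkl` file should be loaded only when its origin is trusted.
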
## Citation

```bibtex
@misc{datasynthis_ml_jobtask,
  title={DataSynthis ML JobTask: Movie Recommendation System},
  author={tasdid25},
  year={2025},
  url={https://huggingface.co/tasdid25/DataSynthis_ML_JobTask}
}
```

## License

MIT License - see LICENSE file for details.