---
license: mit
tags:
- recommendation-system
- collaborative-filtering
- matrix-factorization
- movie-recommendations
- movielens
- machine-learning
library_name: scikit-learn
---
# DataSynthis_ML_JobTask
A movie recommendation system that applies collaborative filtering and matrix factorization to the MovieLens 100k dataset.
## Model Description
This model provides personalized movie recommendations using two algorithms:
- **Collaborative Filtering (CF)**: Item-based similarity using cosine similarity
- **Matrix Factorization (SVD)**: Singular Value Decomposition for dimensionality reduction
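As a rough illustration of the item-based approach, the sketch below computes cosine similarity between item columns and scores items for one user as a similarity-weighted average of that user's own ratings. It uses NumPy only and a made-up 4x4 matrix; the function names `item_cosine_similarity` and `predict_item_based` are hypothetical and are not taken from this repository's `model.py`.

```python
import numpy as np

def item_cosine_similarity(ratings: np.ndarray) -> np.ndarray:
    """Cosine similarity between item columns of a user x item matrix (0 = unrated)."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items nobody rated
    normalized = ratings / norms
    return normalized.T @ normalized

def predict_item_based(ratings: np.ndarray, sim: np.ndarray, user: int) -> np.ndarray:
    """Score every item as a similarity-weighted average of the user's ratings."""
    user_ratings = ratings[user]
    rated = user_ratings > 0
    weights = sim[:, rated]  # similarity of each item to the items this user rated
    denom = np.abs(weights).sum(axis=1)
    denom[denom == 0] = 1.0
    return weights @ user_ratings[rated] / denom

# Toy data (made up for illustration): 4 users x 4 items, 0 means "not rated"
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

sim = item_cosine_similarity(ratings)
scores = predict_item_based(ratings, sim, user=0)
print(scores)
```

In this toy matrix, items 0 and 1 are rated alike by the same users, so they end up far more similar to each other than to item 2.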
## Dataset
- **MovieLens 100k**: 100,000 ratings from 943 users on 1,682 movies
- **User ID Range**: 1-943
- **Movie Count**: 1,682 unique movies
- **Rating Scale**: 1-5 stars
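With 100,000 ratings spread over 943 x 1,682 cells, only about 6.3% of the user-movie matrix is filled. The sketch below shows one way to turn rating rows into such a matrix, assuming the tab-separated `user id, item id, rating, timestamp` layout of the MovieLens 100k `u.data` file. The helper name `build_rating_matrix` and the three sample rows are invented for illustration, and this repository's own loading code may differ.

```python
import numpy as np

N_USERS, N_MOVIES = 943, 1682  # sizes quoted in the dataset summary

def build_rating_matrix(lines, n_users=N_USERS, n_movies=N_MOVIES):
    """Parse 'user<TAB>item<TAB>rating<TAB>timestamp' rows into a user x movie matrix."""
    matrix = np.zeros((n_users, n_movies), dtype=np.float32)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user_id, movie_id, rating, _timestamp = line.split("\t")
        # IDs in the file are 1-based; matrix indices are 0-based
        matrix[int(user_id) - 1, int(movie_id) - 1] = float(rating)
    return matrix

# Made-up rows in the u.data layout (timestamps set to 0)
sample_rows = [
    "1\t50\t5\t0",
    "1\t181\t4\t0",
    "943\t1682\t2\t0",
]

matrix = build_rating_matrix(sample_rows)
print(matrix.shape, np.count_nonzero(matrix))
```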
## Usage
### Python
```python
from model import predict
# Get recommendations using SVD (default)
recommendations = predict(user_id=1, n_recommendations=10, method="svd")
# Get recommendations using collaborative filtering
recommendations = predict(user_id=1, n_recommendations=10, method="cf")
print(recommendations)
```
### Parameters
- **user_id** (int): User ID from 1 to 943 (required)
- **n_recommendations** (int): Number of recommendations, from 1 to 20 (default: 10)
- **method** (str): "svd" for matrix factorization or "cf" for collaborative filtering (default: "svd")
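The documented ranges can be checked before calling `predict`. The helper below is a hypothetical sketch of such a check; `validate_args` is not part of this repository, and whether `predict` performs its own validation is not stated here.

```python
VALID_METHODS = ("svd", "cf")
MAX_USER_ID = 943
MAX_RECOMMENDATIONS = 20

def validate_args(user_id, n_recommendations=10, method="svd"):
    """Hypothetical helper: check arguments against the documented ranges."""
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(user_id, bool) or not isinstance(user_id, int):
        raise ValueError("user_id must be an int")
    if not 1 <= user_id <= MAX_USER_ID:
        raise ValueError(f"user_id must be in 1..{MAX_USER_ID}, got {user_id}")
    if isinstance(n_recommendations, bool) or not isinstance(n_recommendations, int):
        raise ValueError("n_recommendations must be an int")
    if not 1 <= n_recommendations <= MAX_RECOMMENDATIONS:
        raise ValueError(
            f"n_recommendations must be in 1..{MAX_RECOMMENDATIONS}, got {n_recommendations}"
        )
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    return user_id, n_recommendations, method

checked = validate_args(1, n_recommendations=5, method="cf")
print(checked)
```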
### Output
Returns a list of dictionaries with movie recommendations:
```json
[
  {
    "movie_id": 50,
    "title": "Star Wars (1977)",
    "predicted_rating": 4.5
  },
  {
    "movie_id": 181,
    "title": "Return of the Jedi (1983)",
    "predicted_rating": 4.3
  }
]
```
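A result in this shape is straightforward to post-process. The snippet below is a small sketch that sorts the sample output by predicted rating and formats one line per movie. The list is copied from the sample above rather than produced by calling `predict`.

```python
# Sample output copied from the JSON example (not a live prediction)
recommendations = [
    {"movie_id": 181, "title": "Return of the Jedi (1983)", "predicted_rating": 4.3},
    {"movie_id": 50, "title": "Star Wars (1977)", "predicted_rating": 4.5},
]

# Highest predicted rating first
ranked = sorted(recommendations, key=lambda r: r["predicted_rating"], reverse=True)
lines = [f'{r["title"]}: {r["predicted_rating"]:.1f}' for r in ranked]

for line in lines:
    print(line)
```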
## Model Performance
- **SVD Method**: Fast predictions with good accuracy using 20 components
- **Collaborative Filtering**: More interpretable, based on item similarity
- **Cold Start Handling**: Graceful error handling for unknown users
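No accuracy figures are listed here. One common way to compare the two methods would be root-mean-square error (RMSE) on held-out ratings. The sketch below shows the calculation on made-up numbers. These values are placeholders, not results from this model, and the function name `rmse` is illustrative.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length rating sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Placeholder values for illustration only
actual = [5, 3, 4, 1]
predicted = [4.5, 3.5, 4.0, 2.0]

score = rmse(actual, predicted)
print(f"RMSE: {score:.4f}")
```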
## Technical Details
- **Framework**: Scikit-learn
- **Algorithms**: TruncatedSVD, Cosine Similarity
- **Data Processing**: Pandas for efficient matrix operations
- **Memory Efficient**: Optimized for large-scale recommendation tasks
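To show what the factorization step does, the sketch below builds a rank-`k` approximation of a toy rating matrix. It is an illustration under stated assumptions: it calls `numpy.linalg.svd` and keeps the top `k` singular values instead of using scikit-learn's `TruncatedSVD`, it treats unrated cells as zeros, and it uses `k=2` on a made-up 4x4 matrix rather than the 20 components mentioned above. The function name `low_rank_approximation` is hypothetical.

```python
import numpy as np

def low_rank_approximation(ratings: np.ndarray, k: int) -> np.ndarray:
    """Reconstruct the matrix from its top-k singular values and vectors."""
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Toy data (made up for illustration): 4 users x 4 items, 0 means "not rated"
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

approx = low_rank_approximation(ratings, k=2)

# Reconstruction error (Frobenius norm) does not grow as k increases
error_k1 = np.linalg.norm(ratings - low_rank_approximation(ratings, k=1))
error_k2 = np.linalg.norm(ratings - approx)
print(np.round(approx, 2))
print(error_k1, error_k2)
```

The entries of `approx` in positions that were zero in `ratings` act as predicted scores for unrated items.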
## Installation
```bash
pip install pandas numpy scikit-learn
```
## Training
The model is pre-trained on the MovieLens 100k dataset. To retrain:
```python
from model import MovieRecommender
model = MovieRecommender()
model.load_data()
model.train()
model.save_model("movie_recommender.pkl")
```
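The `.pkl` extension suggests the model is serialized with Python's `pickle` module, though that is an assumption about `save_model`. The sketch below shows a generic pickle round trip on a placeholder dictionary, not on the actual `MovieRecommender` object. Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile

# Placeholder state standing in for a trained model (illustrative values)
state = {"n_components": 20, "method": "svd", "movie_ids": [50, 181]}

path = os.path.join(tempfile.mkdtemp(), "movie_recommender.pkl")

# Save
with open(path, "wb") as fh:
    pickle.dump(state, fh)

# Load
with open(path, "rb") as fh:
    restored = pickle.load(fh)

print(restored)
```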
## Citation
```bibtex
@misc{datasynthis_ml_jobtask,
title={DataSynthis ML JobTask: Movie Recommendation System},
author={tasdid25},
year={2025},
url={https://huggingface.co/tasdid25/DataSynthis_ML_JobTask}
}
```
## License
MIT License - see LICENSE file for details.