This work is an addition to the paper "SynEvaRec: A Framework for Evaluating Recommender Systems on Synthetic Data Classes" by Provalov et al.
This study proposes a novel method for evaluating and comparing recommender systems using synthetic user and item data together with parametric synthetic user-item response (rating) functions. The method compares recommender systems on classes of synthetic data, in contrast to the usual practice of comparing them on particular real or synthetic datasets. Working with classes, in particular, allows one to manage the effects of the No Free Lunch theorem for recommender systems. Furthermore, we implement the method as a flexible framework (called SynEvaRec) for conducting comparison experiments under different scenarios of synthetic data behaviour. Our experimental study shows that SynEvaRec helps to determine scenarios (e.g. in terms of data classes) in which one recommender system is preferable to another in terms of recommendation quality. Moreover, the results turn out to be rather stable over several synthetic dataset instances based on the same real-world dataset, indicating the robustness of our method. The datasets, the framework implementation and the results related to our study are publicly available on GitHub.
We use two datasets in this study (see /datasets):
- the Restaurants dataset
- the Books dataset
Both datasets require pre-processing, which is included in the corresponding notebooks ('restaurants_main.ipynb' for the Restaurants dataset, 'books_main.ipynb' for the Books dataset).
All of the synthetic data can be generated with our code using the Jupyter notebooks with experiments (/notebooks).
Example of Restaurants data generation:

```python
# Fit a synthetic data generator on the real Restaurants dataset...
syn_data_generator_rests = fit_syn_generator_rests(rests_df)
# ...and sample a synthetic dataset of 100 rows from it.
syn_rests_df = syn_data_generator_rests.sample(100)
```
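To illustrate the generator interface used above, here is a minimal sketch of a fit-and-sample generator. Note the assumptions: `SimpleSynGenerator` and `fit_simple_syn_generator` are hypothetical stand-ins for illustration only, and the column-wise bootstrap is a deliberate simplification (the framework's actual generators model the joint user-item data distribution).

```python
import numpy as np
import pandas as pd


class SimpleSynGenerator:
    """Toy generator: resamples each column independently from the
    empirical distribution of the fitted DataFrame. Illustration only;
    it breaks cross-column correlations that real generators preserve."""

    def __init__(self, df: pd.DataFrame):
        self._df = df.reset_index(drop=True)

    def sample(self, n: int, seed: int = 0) -> pd.DataFrame:
        rng = np.random.default_rng(seed)
        data = {}
        for col in self._df.columns:
            # Bootstrap n values from this column's observed values.
            idx = rng.integers(0, len(self._df), size=n)
            data[col] = self._df[col].to_numpy()[idx]
        return pd.DataFrame(data)


def fit_simple_syn_generator(df: pd.DataFrame) -> SimpleSynGenerator:
    """Hypothetical stand-in for fit_syn_generator_rests; the name and
    interface are assumptions, not the repository's actual API."""
    return SimpleSynGenerator(df)
```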
To run the experiments for a certain dataset, use the notebooks in /notebooks.
You can currently evaluate the quality of the following recommender systems:
- NMF
- SVD
- kNN
If you want to extend this set with your own model, add the desired model manually to /modules/models.py. If you want to extend it with an existing model, add its training and testing procedure to /modules/trainers.py.
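As a sketch of what adding your own model might look like, here is a hypothetical global-mean baseline predictor. The class name, the `(user, item, rating)` triple format, and the `fit`/`predict` interface are all illustrative assumptions; check /modules/models.py and /modules/trainers.py for the interface the framework actually expects.

```python
import numpy as np


class MeanBaselineModel:
    """Hypothetical example model: predicts the global mean rating for
    every (user, item) pair. Interface is an assumption, not the
    repository's actual API."""

    def fit(self, ratings):
        # `ratings` is assumed to be an iterable of (user, item, rating).
        self.global_mean_ = float(np.mean([r for _, _, r in ratings]))
        return self

    def predict(self, user, item):
        # Ignores user and item: every prediction is the training mean.
        return self.global_mean_
```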
The evaluation of recommender system quality on the parametric data is implemented in /modules/evaluator.py.
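The paper compares recommender systems by RMSE of predicted ratings. A minimal sketch of such an evaluation step is below; the `predict(u, i)` callable and the `(user, item, rating)` triple format are illustrative assumptions, not the evaluator's actual interface.

```python
import numpy as np


def rmse(predict, test_triples):
    """Root-mean-squared error of a rating predictor over held-out
    (user, item, rating) triples."""
    errs = np.array([predict(u, i) - r for u, i, r in test_triples])
    return float(np.sqrt(np.mean(errs ** 2)))


# Example: a constant predictor evaluated on two held-out ratings;
# the errors are 1.0 and -1.0, so the RMSE is 1.0.
holdout = [("u1", "i1", 3.0), ("u2", "i2", 5.0)]
score = rmse(lambda u, i: 4.0, holdout)
```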
(Figure caption) The least RMSE values (the vertical axis) for the chosen target RSs for different …
Python 3.8 and all of the required packages are needed. To install the dependencies:

```shell
pip install poetry
poetry shell
poetry install
```
```bibtex
@inproceedings{provalov2021synevarec,
  title={SynEvaRec: A Framework for Evaluating Recommender Systems on Synthetic Data Classes},
  author={Provalov, Vladimir and Stavinova, Elizaveta and Chunaev, Petr},
  booktitle={2021 International Conference on Data Mining Workshops (ICDMW)},
  pages={55--64},
  year={2021},
  organization={IEEE}
}
```