Doing Data Right: How Lessons Learned Working with Conventional Data should Inform the Future of Synthetic Data for Recommender Systems

10/07/2021
by Manel Slokom, et al.

We present a case that the newly emerging field of synthetic data in the area of recommender systems should prioritize "doing data right". We consider this catchphrase to have two aspects: first, we should not repeat the mistakes of the past, and second, we should explore the full scope of opportunities presented by synthetic data as we move into the future. We argue that explicit attention to dataset design and description will help to avoid past mistakes with dataset bias and evaluation. To fully exploit the opportunities of synthetic data, we point out that researchers can investigate new areas, such as using data synthesis to support reproducibility by making data open as well as FAIR, and to push forward our understanding of data minimization.

research
08/15/2023

Impression-Aware Recommender Systems

Novel data sources bring new opportunities to improve the quality of rec...
research
08/09/2020

Partially Synthetic Data for Recommender Systems: Prediction Performance and Preference Hiding

This paper demonstrates the potential of statistical disclosure control ...
research
04/07/2023

Beyond Privacy: Navigating the Opportunities and Challenges of Synthetic Data

Generating synthetic data through generative models is gaining interest ...
research
06/22/2022

Synthetic Data-Based Simulators for Recommender Systems: A Survey

This survey aims at providing a comprehensive overview of the recent tre...
research
04/14/2023

EvalRS 2023. Well-Rounded Recommender Systems For Real-World Deployments

EvalRS aims to bring together practitioners from industry and academia t...
research
08/31/2020

Beyond Our Behavior: The GDPR and Humanistic Personalization

Personalization should take the human person seriously. This requires a ...
research
05/26/2022

Sequential Nature of Recommender Systems Disrupts the Evaluation Process

Datasets are often generated in a sequential manner, where the previous ...
