A new paper titled “Synthetic social data: trials and tribulations” challenges the idea that AI can replace human participants in social science research.
Authored by Guido Ivetta (FAMAF, Universidad Nacional de Córdoba) alongside Laura Moradbakhti and Rafael A. Calvo (Imperial College London), the study compares six major Large Language Models (LLMs) against real human data from the World Values Survey across the UK, Argentina, USA, and China.
Key Findings:
- Machine Bias is Real: 94.4% of the AI-generated responses differed statistically from the human benchmark, strong evidence that current models do not accurately reflect human values (see the illustrative sketch after this list).
- Humans Win: A random sample of just two human respondents gave a more accurate estimate of population views than the AI models in over 60% of cases.
- Quality over Quantity: The authors conclude that “machine bias” is a far greater threat to validity than the “sampling bias” of small human surveys.
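To make the idea of a "statistically different" response concrete, here is a minimal sketch of how such a comparison could be run for a single survey item. The post does not specify the paper's actual statistical procedure, so this uses an assumed chi-square test of independence, and the counts (`human_counts`, `llm_counts`) are hypothetical, not data from the study.

```python
# Illustrative sketch only: the paper's exact test is not described in this post.
# Compares the answer distribution of one hypothetical WVS-style item (options 1-4)
# between human respondents and LLM-simulated respondents.
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical response counts for answer options 1..4 (not real data).
human_counts = np.array([120, 340, 290, 250])  # human benchmark sample
llm_counts   = np.array([ 40, 610, 200, 150])  # LLM-generated "respondents"

# 2 x 4 contingency table: rows = source (human, LLM), columns = answer option.
table = np.vstack([human_counts, llm_counts])
chi2, p_value, dof, expected = chi2_contingency(table)

print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.4g}")
if p_value < 0.05:
    print("LLM responses differ significantly from the human benchmark on this item.")
else:
    print("No significant difference detected for this item.")
```

Repeating a check like this across many items, models, and countries is one way a headline figure such as "94.4% of responses differ" could be produced; the paper itself should be consulted for the authors' actual methodology.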
The research sends a clear message to the field: despite the logistical challenges, there is currently no valid substitute for data gathered from actual humans.
More information can be found here: https://arxiv.org/abs/2510.19952
