Can AI serve as a substitute for human subjects in software engineering research?

Marco Gerosa, Bianca Trinkenreich, Igor Steinmacher, Anita Sarma

Research output: Contribution to journal › Article › peer-review


Research within sociotechnical domains, such as software engineering, fundamentally requires the human perspective. Nevertheless, traditional qualitative data collection methods suffer from difficulties in participant recruitment, scaling, and labor intensity. This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI), especially large language models (LLMs) like ChatGPT and multimodal foundation models. We explore the potential of AI-generated synthetic text as an alternative source of qualitative data, discussing how LLMs can replicate human responses and behaviors in research settings. We discuss AI applications in emulating humans in interviews, focus groups, surveys, observational studies, and user evaluations. We discuss open problems and research opportunities to implement this vision. In the future, an integrated approach where both AI and human-generated data coexist will likely yield the most effective outcomes.

Original language: English (US)
Article number: 13
Journal: Automated Software Engineering
Issue number: 1
State: Published - May 2024


Keywords

  • Foundation models
  • Large language models
  • Qualitative research
  • Software engineering

ASJC Scopus subject areas

  • Software

