Can ChatGPT emulate humans in software engineering surveys?

Igor Steinmacher, Jacob Mc Auley Penney, Katia Romero Felizardo, Alessandro F. Garcia, Marco A. Gerosa

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Context: There is a growing belief in the literature that large language models (LLMs), such as ChatGPT, can mimic human behavior in surveys. Gap: While the literature has shown promising results in social sciences and market research, there is scant evidence of their effectiveness in technical fields like software engineering. Objective: Inspired by previous work, this paper explores ChatGPT's ability to replicate findings from prior software engineering research. Given the frequent use of surveys in this field, if LLMs can accurately emulate human responses, this technique could address common methodological challenges like recruitment difficulties, representational shortcomings, and respondent fatigue. Method: We prompted ChatGPT to reflect the behavior of a 'mega-persona' representing the demographic distribution of interest. We replicated surveys from 2019 to 2023 from leading SE conferences, examining ChatGPT's proficiency in mimicking responses from diverse demographics. Results: Our findings reveal that ChatGPT can successfully replicate the outcomes of some studies, but in others, the results were not significantly better than a random baseline. Conclusions: This paper reports our results so far and discusses the challenges and potential research opportunities in leveraging LLMs for representing humans in software engineering surveys.

Original language: English (US)
Title of host publication: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024
Publisher: IEEE Computer Society
Pages: 414-419
Number of pages: 6
ISBN (Electronic): 9798400710476
State: Published - Oct 24 2024
Event: 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024 - Barcelona, Spain
Duration: Oct 24 2024 - Oct 25 2024

Publication series

Name: International Symposium on Empirical Software Engineering and Measurement
ISSN (Print): 1949-3770
ISSN (Electronic): 1949-3789

Conference

Conference: 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024
Country/Territory: Spain
City: Barcelona
Period: 10/24/24 - 10/25/24

Keywords

  • Generative AI
  • Mega-Personas
  • Replication Study
  • Survey

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
