TY - JOUR
T1 - Can AI serve as a substitute for human subjects in software engineering research?
AU - Gerosa, Marco
AU - Trinkenreich, Bianca
AU - Steinmacher, Igor
AU - Sarma, Anita
N1 - Publisher Copyright: © 2024, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2024/5
Y1 - 2024/5
AB - Research within sociotechnical domains, such as software engineering, fundamentally requires the human perspective. Nevertheless, traditional qualitative data collection methods suffer from difficulties in participant recruitment, scaling, and labor intensity. This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI), especially large language models (LLMs) like ChatGPT and multimodal foundation models. We explore the potential of AI-generated synthetic text as an alternative source of qualitative data, discussing how LLMs can replicate human responses and behaviors in research settings. We discuss AI applications in emulating humans in interviews, focus groups, surveys, observational studies, and user evaluations. We discuss open problems and research opportunities to implement this vision. In the future, an integrated approach where both AI and human-generated data coexist will likely yield the most effective outcomes.
KW - Foundation models
KW - Large language models
KW - Qualitative research
KW - Software engineering
UR - http://www.scopus.com/inward/record.url?scp=85182480046&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85182480046&partnerID=8YFLogxK
U2 - 10.1007/s10515-023-00409-6
DO - 10.1007/s10515-023-00409-6
M3 - Article
AN - SCOPUS:85182480046
SN - 0928-8910
VL - 31
JO - Automated Software Engineering
JF - Automated Software Engineering
IS - 1
M1 - 13
ER -