ChatGPT application in Systematic Literature Reviews in Software Engineering: an evaluation of its accuracy to support the selection activity

Katia Romero Felizardo, Márcia Sampaio Lima, Anderson Deizepe, Tayana Uchôa Conte, Igor Steinmacher

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Context: The Systematic Literature Review (SLR) process involves searching, selecting, and synthesizing relevant literature on a specific research topic for evidence-based decision-making in Software Engineering (SE). Because the SLR process is time-consuming, tool support is essential. Gap: ChatGPT is a significant advancement in Natural Language Processing (NLP), and it can potentially accelerate time-consuming and error-prone activities, such as the selection activity of the SLR process. Therefore, having a tool to assist in the selection process appears beneficial, and we argue that ChatGPT can facilitate the analysis of large sets of studies, saving time and effort. Objective: We aim to evaluate the accuracy (i.e., the proportion of studies correctly classified) of using ChatGPT-4.0 in SLRs in SE, particularly to support the first selection stage, which is based on the title, abstract, and keywords. Method: We assessed the accuracy of using ChatGPT for the first-stage selection of studies to be included in two SLRs (SLR1 and SLR2), in contrast to the conventional method of reading the title and abstract. Results: The accuracy of ChatGPT supporting the initial selection activity was 75.3% (SLR1 - 101 correct selections: 48 inclusions and 53 exclusions; 33 incorrect selections: 17 inclusions and 16 exclusions) and 86.1% (SLR2 - 386 correct selections: 113 inclusions and 273 exclusions; 62 incorrect selections: 27 inclusions and 35 exclusions). Conclusions: Our accuracy results indicate that it is not advisable to completely outsource the selection process to ChatGPT. However, it could be valuable as a support tool, aiding novice researchers or even experienced ones when they are in doubt.
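
The accuracy figures in the Results can be re-derived from the counts given in the abstract. The sketch below is only an illustration of that arithmetic, not the authors' analysis code: the class name `SelectionCounts` and the helper `truncate_percent` are hypothetical, and the definition accuracy = correct selections / all selections is taken from the abstract's "studies correctly classified". The raw ratios work out to about 75.37% and 86.16%, so the reported 75.3% and 86.1% appear to be truncated rather than rounded to one decimal; that is an inference from the numbers, not something the source states.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionCounts:
    """First-stage selection counts as listed in the abstract (hypothetical container)."""
    correct_inclusions: int
    correct_exclusions: int
    incorrect_inclusions: int
    incorrect_exclusions: int

    @property
    def correct(self) -> int:
        return self.correct_inclusions + self.correct_exclusions

    @property
    def total(self) -> int:
        return self.correct + self.incorrect_inclusions + self.incorrect_exclusions

    def accuracy(self) -> float:
        # Accuracy = correctly classified studies / all classified studies.
        return self.correct / self.total


def truncate_percent(fraction: float, decimals: int = 1) -> float:
    """Express a fraction as a percentage, truncated (not rounded) to `decimals` places."""
    factor = 10 ** decimals
    return math.floor(fraction * 100 * factor) / factor


# Counts copied from the Results sentence of the abstract.
slr1 = SelectionCounts(48, 53, 17, 16)
slr2 = SelectionCounts(113, 273, 27, 35)

for name, counts in (("SLR1", slr1), ("SLR2", slr2)):
    print(
        f"{name}: {counts.correct}/{counts.total} correct, "
        f"accuracy {counts.accuracy():.4f}, "
        f"truncated {truncate_percent(counts.accuracy())}%"
    )
```

Under these assumptions, SLR1 has 101 correct selections out of 134 and SLR2 has 386 out of 448, which matches the totals implied by the abstract.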

Original language: English (US)
Title of host publication: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024
Publisher: IEEE Computer Society
Pages: 25-36
Number of pages: 12
ISBN (Electronic): 9798400710476
DOIs
State: Published - Oct 24 2024
Event: 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024 - Barcelona, Spain
Duration: Oct 24 2024 to Oct 25 2024

Publication series

Name: International Symposium on Empirical Software Engineering and Measurement
ISSN (Print): 1949-3770
ISSN (Electronic): 1949-3789

Conference

Conference: 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024
Country/Territory: Spain
City: Barcelona
Period: 10/24/24 to 10/25/24

Keywords

  • ChatGPT
  • Selection of studies
  • Software Engineering
  • Systematic literature review

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
