TY - GEN
T1 - ChatGPT application in Systematic Literature Reviews in Software Engineering
T2 - 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024
AU - Felizardo, Katia Romero
AU - Lima, Márcia Sampaio
AU - Deizepe, Anderson
AU - Conte, Tayana Uchôa
AU - Steinmacher, Igor
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/10/24
Y1 - 2024/10/24
N2 - Context: The Systematic Literature Review (SLR) process involves searching, selecting, and synthesizing relevant literature on a specific research topic for evidence-based decision-making in Software Engineering (SE). Because the SLR process is time-consuming, tool support is essential. Gap: ChatGPT is a significant advancement in Natural Language Processing (NLP), and it can potentially accelerate time-consuming and error-prone activities, such as the selection activity of the SLR process. Therefore, having a tool to assist in the selection process appears beneficial, and we argue that ChatGPT can facilitate the analysis of large sets of studies, saving time and effort. Objective: We aim to evaluate the accuracy (i.e., the share of studies correctly classified) of using ChatGPT-4.0 in SLRs in SE, particularly to support the first selection stage, which is based on the title, abstract, and keywords. Method: We assessed the accuracy of using ChatGPT for the first-stage selection of studies to be included in two SLRs (SLR1 and SLR2), in contrast to the conventional method of reading the title and abstract. Results: The accuracy of ChatGPT in supporting the initial selection activity was 75.3% (SLR1 - 101 correct selections: 48 inclusions and 53 exclusions; 33 incorrect selections: 17 inclusions and 16 exclusions) and 86.1% (SLR2 - 386 correct selections: 113 inclusions and 273 exclusions; 62 incorrect selections: 27 inclusions and 35 exclusions). Conclusions: Our accuracy results indicate that it is not advisable to completely outsource the selection process to ChatGPT. However, it could be valuable as a support tool, aiding novice researchers or even experienced ones when they are in doubt.
AB - Context: The Systematic Literature Review (SLR) process involves searching, selecting, and synthesizing relevant literature on a specific research topic for evidence-based decision-making in Software Engineering (SE). Because the SLR process is time-consuming, tool support is essential. Gap: ChatGPT is a significant advancement in Natural Language Processing (NLP), and it can potentially accelerate time-consuming and error-prone activities, such as the selection activity of the SLR process. Therefore, having a tool to assist in the selection process appears beneficial, and we argue that ChatGPT can facilitate the analysis of large sets of studies, saving time and effort. Objective: We aim to evaluate the accuracy (i.e., the share of studies correctly classified) of using ChatGPT-4.0 in SLRs in SE, particularly to support the first selection stage, which is based on the title, abstract, and keywords. Method: We assessed the accuracy of using ChatGPT for the first-stage selection of studies to be included in two SLRs (SLR1 and SLR2), in contrast to the conventional method of reading the title and abstract. Results: The accuracy of ChatGPT in supporting the initial selection activity was 75.3% (SLR1 - 101 correct selections: 48 inclusions and 53 exclusions; 33 incorrect selections: 17 inclusions and 16 exclusions) and 86.1% (SLR2 - 386 correct selections: 113 inclusions and 273 exclusions; 62 incorrect selections: 27 inclusions and 35 exclusions). Conclusions: Our accuracy results indicate that it is not advisable to completely outsource the selection process to ChatGPT. However, it could be valuable as a support tool, aiding novice researchers or even experienced ones when they are in doubt.
KW - ChatGPT
KW - Selection of studies
KW - Software Engineering
KW - Systematic literature review
UR - http://www.scopus.com/inward/record.url?scp=85210578268&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85210578268&partnerID=8YFLogxK
U2 - 10.1145/3674805.3686666
DO - 10.1145/3674805.3686666
M3 - Conference contribution
AN - SCOPUS:85210578268
T3 - International Symposium on Empirical Software Engineering and Measurement
SP - 25
EP - 36
BT - Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2024
PB - IEEE Computer Society
Y2 - 24 October 2024 through 25 October 2024
ER -