TY - GEN
T1 - Predicting Change Propagation from Repository Information
AU - Wiese, Igor Scaliante
AU - Ré, Reginaldo
AU - Steinmacher, Igor
AU - Kuroda, Rodrigo Takashi
AU - Oliva, Gustavo Ansaldi
AU - Gerosa, Marco Aurélio
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/11
Y1 - 2015/11/11
N2 - Change propagation occurs when a change in an artifact leads to changes in other artifacts. Previous research has used the frequency of past changes between artifacts and different types of artifact coupling to build prediction models of change propagation. To improve the accuracy of the prediction, we explored the combination of different data from software development repositories, such as change requests, communication data, and artifact modifications. This information can capture different dimensions of software development, which can lead to improvements in the accuracy of the models. We conducted an empirical study in four open source projects, namely Cassandra, Camel, Hadoop, and Lucene. Classifiers were constructed for each pair of artifacts that change together to predict whether change propagation between the two files occurs in a given change request. The models obtained an average area under the curve (AUC) of 0.849. Furthermore, the sensitivity (recall) obtained is almost 4 times higher (57.06% vs. 15.70%) when comparing our models to a baseline model built using association rules. With a reduced number of false positives, the models could be used in practice to help developers during software evolution.
AB - Change propagation occurs when a change in an artifact leads to changes in other artifacts. Previous research has used the frequency of past changes between artifacts and different types of artifact coupling to build prediction models of change propagation. To improve the accuracy of the prediction, we explored the combination of different data from software development repositories, such as change requests, communication data, and artifact modifications. This information can capture different dimensions of software development, which can lead to improvements in the accuracy of the models. We conducted an empirical study in four open source projects, namely Cassandra, Camel, Hadoop, and Lucene. Classifiers were constructed for each pair of artifacts that change together to predict whether change propagation between the two files occurs in a given change request. The models obtained an average area under the curve (AUC) of 0.849. Furthermore, the sensitivity (recall) obtained is almost 4 times higher (57.06% vs. 15.70%) when comparing our models to a baseline model built using association rules. With a reduced number of false positives, the models could be used in practice to help developers during software evolution.
KW - change coupling
KW - change propagation
KW - co-change
KW - prediction models
UR - http://www.scopus.com/inward/record.url?scp=84962215073&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962215073&partnerID=8YFLogxK
U2 - 10.1109/SBES.2015.21
DO - 10.1109/SBES.2015.21
M3 - Conference contribution
AN - SCOPUS:84962215073
T3 - Proceedings - 29th Brazilian Symposium on Software Engineering, SBES 2015
SP - 100
EP - 109
BT - Proceedings - 29th Brazilian Symposium on Software Engineering, SBES 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 29th Brazilian Symposium on Software Engineering, SBES 2015
Y2 - 21 September 2015 through 25 September 2015
ER -