TY - GEN
T1 - How hard does mutation analysis have to be, anyway?
AU - Gopinath, Rahul
AU - Alipour, Amin
AU - Ahmed, Iftekhar
AU - Jensen, Carlos
AU - Groce, Alex
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2016/1/13
Y1 - 2016/1/13
AB - Mutation analysis is considered the best method for measuring the adequacy of test suites. However, the number of test runs required for a full mutation analysis grows faster than project size, making full analysis infeasible for real-world software projects, which often have more than a million lines of code. It is for projects of this size, however, that developers most need a method for evaluating the efficacy of a test suite. Various strategies have been proposed to deal with the explosion of mutants. However, these strategies at best reduce the number of mutants required to a fraction of all mutants, which still grows with program size. Running even 5% of all mutants of a 2 MLOC program, for example, usually requires analyzing over 100,000 mutants. Similarly, while various approaches have been proposed to tackle equivalent mutants, none completely eliminates the problem, and the fraction of equivalent mutants remaining is hard to estimate, often requiring manual analysis of equivalence. In this paper, we provide both theoretical analysis and empirical evidence that a small constant sample of mutants yields statistically similar results to running a full mutation analysis, regardless of the size of the program or the similarity between mutants. We show that a similar approach, using a constant sample of inputs, can estimate the degree of stubbornness of the remaining mutants to a high degree of statistical confidence, and we provide a mutation analysis framework for Python that incorporates the analysis of mutant stubbornness.
UR - http://www.scopus.com/inward/record.url?scp=84964910237&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84964910237&partnerID=8YFLogxK
U2 - 10.1109/ISSRE.2015.7381815
DO - 10.1109/ISSRE.2015.7381815
M3 - Conference contribution
AN - SCOPUS:84964910237
T3 - 2015 IEEE 26th International Symposium on Software Reliability Engineering, ISSRE 2015
SP - 216
EP - 227
BT - 2015 IEEE 26th International Symposium on Software Reliability Engineering, ISSRE 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th IEEE International Symposium on Software Reliability Engineering, ISSRE 2015
Y2 - 2 November 2015 through 5 November 2015
ER -