TY - GEN
T1 - Measuring Effectiveness of Mutant Sets
AU - Gopinath, Rahul
AU - Alipour, Amin
AU - Ahmed, Iftekhar
AU - Jensen, Carlos
AU - Groce, Alex
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/8/1
Y1 - 2016/8/1
N2 - Redundancy in mutants, where multiple mutants end up producing the same semantic variant of a program, is a major problem in mutation analysis. Hence, a measure of effectiveness that accounts for redundancy is an essential tool for evaluating mutation tools, new operators, and reduction techniques. Previous research suggests using the size of the disjoint mutant set as an effectiveness measure. We start from a simple premise: test suites need to be judged both on the number of unique variations in specifications they detect (as a variation measure) and on how good they are at detecting hard-to-find faults (as a measure of thoroughness). Hence, any set of mutants should be judged by how well it supports these measurements. We show that the disjoint mutant set has two major inadequacies - the single variant assumption and the large test suite assumption - when used as a measure of effectiveness in variation. These stem from its reliance on minimal test suites. We show that when used to emulate hard-to-find bugs (as a measure of thoroughness), the disjoint mutant set discards useful mutants. We propose two alternatives: one measures variation and is not vulnerable to either the single variant assumption or the large test suite assumption, and the other measures thoroughness. We provide a benchmark of these measures using diverse tools.
KW - Empirical Analysis
KW - Mutation Analysis
KW - Software Testing
KW - Theoretical Analysis
UR - http://www.scopus.com/inward/record.url?scp=84992184039&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84992184039&partnerID=8YFLogxK
U2 - 10.1109/ICSTW.2016.45
DO - 10.1109/ICSTW.2016.45
M3 - Conference contribution
AN - SCOPUS:84992184039
T3 - Proceedings - 2016 IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2016
SP - 132
EP - 141
BT - Proceedings - 2016 IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2016
Y2 - 10 April 2016 through 15 April 2016
ER -