Can testedness be effectively measured?

Iftekhar Ahmed, Rahul Gopinath, Caius Brindescu, Alex Groce, Carlos Jensen

Research output: Chapter in Book/Report/Conference proceedingConference contribution

37 Scopus citations

Abstract

Among the major questions that a practicing tester faces are deciding where to focus additional testing effort, and decid-ing when to stop testing. Test the least-Tested code, and stop when all code is well-Tested, is a reasonable answer. Many measures of "testedness" have been proposed; unfortunately, we do not know whether these are truly effective. In this paper we propose a novel evaluation of two of the most important and widely-used measures of test suite qual-ity. The first measure is statement coverage, the simplest and best-known code coverage measure. The second mea-sure is mutation score, a supposedly more powerful, though expensive, measure. We evaluate these measures using the actual criteria of interest: if a program element is (by these measures) well tested at a given point in time, it should require fewer fu-ture bug-fixes than a "poorly tested" element. If not, then it seems likely that we are not effectively measuring tested-ness. Using a large number of open source Java programs from Github and Apache, we show that both statement cov-erage and mutation score have only a weak negative corre-lation with bug-fixes. Despite the lack of strong correlation, there are statistically and practically significant differences between program elements for various binary criteria. Pro-gram elements (other than classes) covered by any test case see about half as many bug-fixes as those not covered, and a similar line can be drawn for mutation score thresholds. Our results have important implications for both software engineering practice and research evaluation.

Original languageEnglish (US)
Title of host publicationFSE 2016 - Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering
EditorsZhendong Su, Thomas Zimmermann, Jane Cleland-Huang
PublisherAssociation for Computing Machinery
Pages547-558
Number of pages12
ISBN (Electronic)9781450342186
DOIs
StatePublished - Nov 1 2016
Externally publishedYes
Event24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016 - Seattle, United States
Duration: Nov 13 2016Nov 18 2016

Publication series

NameProceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering
Volume13-18-November-2016

Conference

Conference24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016
Country/TerritoryUnited States
CitySeattle
Period11/13/1611/18/16

Keywords

  • Coverage criteria
  • Mutation testing
  • Sta-Tistical analysis
  • Test suite evaluation

ASJC Scopus subject areas

  • Software

Fingerprint

Dive into the research topics of 'Can testedness be effectively measured?'. Together they form a unique fingerprint.

Cite this