Facilitating bioinformatics reproducibility with QIIME 2 Provenance Replay

Christopher R. Keefe, Matthew R. Dillon, Elizabeth Gehret, Chloe Herman, Mary Jewell, Colin V. Wood, Evan Bolyen, J. Gregory Caporaso

Research output: Contribution to journal › Article › peer-review


Study reproducibility is essential to corroborate, build on, and learn from the results of scientific research but is notoriously challenging in bioinformatics, which often involves large data sets and complex analytic workflows involving many different tools. Additionally, many biologists are not trained in how to effectively record their bioinformatics analysis steps to ensure reproducibility, so critical information is often missing. Software tools used in bioinformatics can automate provenance tracking of the results they generate, removing most barriers to bioinformatics reproducibility. Here we present an implementation of that idea, Provenance Replay, a tool for generating new executable code from results generated with the QIIME 2 bioinformatics platform, and discuss considerations for bioinformatics developers who wish to implement similar functionality in their software.

Original language: English (US)
Article number: e1011676
Journal: PLoS Computational Biology
Issue number: 11
State: Published - Nov 2023

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Ecology
  • Molecular Biology
  • Genetics
  • Cellular and Molecular Neuroscience
  • Computational Theory and Mathematics

