Phylomark, a tool to identify conserved phylogenetic markers from whole-genome alignments

Jason W. Sahl, Malcolm N. Matalka, David A. Rasko

Research output: Contribution to journal › Article › peer-review

39 Scopus citations

Abstract

The sequencing and analysis of multiple housekeeping genes have been routinely used to phylogenetically compare closely related bacterial isolates. Recent studies using whole-genome alignment (WGA) and phylogenetics from >100 Escherichia coli genomes have demonstrated that tree topologies from WGA and multilocus sequence typing (MLST) markers differ significantly. A nonrepresentative phylogeny can lead to incorrect conclusions regarding important evolutionary relationships. In this study, the Phylomark algorithm was developed to identify a minimal number of useful phylogenetic markers that recapitulate the WGA phylogeny. To test the algorithm, we used a set of diverse draft and complete E. coli genomes. The algorithm identified more than 100,000 potential markers of different fragment lengths (500 to 900 nucleotides). Three molecular markers were ultimately chosen to determine the phylogeny based on a low Robinson-Foulds (RF) distance compared to the WGA phylogeny. A phylogenetic analysis demonstrated that a more representative phylogeny was inferred for a concatenation of these markers compared to all other MLST schemes for E. coli. As a functional test of the algorithm, the three markers (genomic guided E. coli markers, or GIG-EM) were amplified and sequenced from a set of environmental E. coli strains (ECOR collection) and informatically extracted from a set of 78 diarrheagenic E. coli strains (DECA collection). For both the 40-genome test set and the DECA collection, the GIG-EM system outperformed other E. coli MLST systems in recapitulating the WGA phylogeny. This algorithm can be employed to determine the minimal marker set for any organism for which sufficient genome sequence data are available.
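The marker selection described in the abstract rests on comparing each candidate marker's tree with the WGA tree by Robinson-Foulds distance. The following is a minimal sketch of that comparison, not the Phylomark implementation: it assumes trees are given as nested tuples of leaf names, counts the non-trivial bipartitions found in one tree but not the other, and picks the candidate with the lowest distance. The function names, the tuple encoding, and the toy five-leaf trees are all hypothetical and chosen only for illustration.

```python
def leaf_set(tree):
    """Return the set of leaf names under a nested-tuple tree (assumed encoding)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) induced by internal edges."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        rest = all_leaves - clade
        # Skip trivial splits that separate a single leaf or nothing at all.
        if len(clade) > 1 and len(rest) > 1:
            # Store the split as an unordered pair so orientation does not matter.
            found.add(frozenset((clade, rest)))
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits present in exactly one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Toy reference tree standing in for a WGA phylogeny (hypothetical data).
wga_tree = ((("A", "B"), "C"), ("D", "E"))

# Toy trees standing in for phylogenies inferred from candidate markers.
candidate_trees = {
    "marker_1": ((("A", "C"), "B"), ("D", "E")),
    "marker_2": ((("A", "B"), "C"), ("D", "E")),
    "marker_3": ((("A", "D"), "B"), ("C", "E")),
}

distances = {name: rf_distance(wga_tree, t) for name, t in candidate_trees.items()}
best_marker = min(distances, key=distances.get)
```

In practice the trees would be inferred from alignments and compared with an established phylogenetics library; the sketch only shows why a low RF distance indicates that a marker tree shares most of its branching pattern with the reference.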

Original language: English (US)
Pages (from-to): 4884-4892
Number of pages: 9
Journal: Applied and Environmental Microbiology
Volume: 78
Issue number: 14
State: Published - Jul 2012
Externally published: Yes

ASJC Scopus subject areas

  • Biotechnology
  • Food Science
  • Applied Microbiology and Biotechnology
  • Ecology
