Data from: When outgroups fail; phylogenomics of rooting the emerging pathogen, Coxiella burnetii

  • Talima Pearson (Northern Arizona University) (Contributor)
  • James S. Beckstrom-Sternberg (Contributor)
  • Matthew O'Neill (Contributor)
  • Paul Keim (Northern Arizona University) (Contributor)
  • Rachael A. Priestley (Contributor)
  • Sarah Schaack (Contributor)
  • Stephen M. Beckstrom-Sternberg (Northern Arizona University) (Contributor)
  • Mia D. Champion (Contributor)
  • Heidie M. Hornstra (Contributor)
  • Gilbert J. Kersh (Contributor)
  • J. Schupp (Contributor)
  • Jason W. Sahl (Contributor)
  • Robert F. Massung (Contributor)
  • James E. Samuel (Contributor)



Rooting phylogenies is critical for understanding evolution, yet the importance, intricacies and difficulties of rooting are often overlooked. For rooting, polymorphic characters among the group of interest (ingroup) must be compared to those of a relative (outgroup) that diverged before the last common ancestor (LCA) of the ingroup. Problems arise if an outgroup does not exist, is unknown, or is so distant that few characters are shared, in which case duplicated genes originating before the LCA can be used as proxy outgroups to root diverse phylogenies. Here, we describe a genome-wide expansion of this technique that can be used to solve problems at the other end of the evolutionary scale: where ingroup individuals are all very closely related to each other, but the next closest relative is very distant. We used shared orthologous single nucleotide polymorphisms (SNPs) from 10 whole-genome sequences of Coxiella burnetii, the causative agent of Q fever in humans, to create a robust but unrooted phylogeny. To maximize the number of characters informative about the rooting, we searched entire genomes for polymorphic duplicated regions where orthologs of each paralog could be identified so that the paralogs could be used to root the tree. Recent radiations, such as those of emerging pathogens, often pose rooting challenges due to a lack of ingroup variation and large genomic differences with known outgroups. Using a phylogenomic approach, we created a robust, rooted phylogeny for C. burnetii.

Supplementary Information S1. SNP matrix of 11,286 orthologous SNPs discovered among ten C. burnetii genomes.

Supporting Information S2. Maximum parsimony tree using seven parsimony informative (synapomorphic) SNPs with loci shared with closest outgroup species. Only one SNP locus (out of 11,386 total SNPs among C. burnetii genomes) was present in all genomes. The remaining six loci were found by relaxing the requirement that all loci are shared among all C. burnetii genomes as they were not present in the Q177 genome. All seven SNP loci were present in Pseudomonas syringae and Legionella pneumophila. Four of the seven loci were present in Rickettsiella grylli. Consistency index = 1.0. Numbers on branches indicate bootstrap support percentages from 1000 bootstrap replicates.

Supporting Information S3. Rooted maximum parsimony tree of C. burnetii MST genotypes (1-34 and Dugway) after Hornstra et al. (Hornstra, Priestley, et al. 2011). Whole genome sequences are mapped onto the tree and the major genomic groups of C. burnetii are identified.
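The description of S2 refers to parsimony informative SNPs. As a rough illustration only, the sketch below flags such sites in a small SNP matrix, using the common definition that a site is parsimony informative when at least two character states each occur in at least two taxa. The function names, the genome labels (`g1` to `g4`), the toy sequences, and the set of missing-data symbols are all assumptions made for this example; they are not taken from the deposited files, and this is not the authors' pipeline.

```python
from collections import Counter

# Symbols treated as missing data; an assumption for this sketch.
MISSING = {"-", "N", "?"}


def is_parsimony_informative(column):
    """Return True if at least two states each occur in at least two taxa."""
    counts = Counter(state for state in column if state not in MISSING)
    states_seen_twice = sum(1 for n in counts.values() if n >= 2)
    return states_seen_twice >= 2


def informative_sites(matrix):
    """Return the 0-based indices of parsimony informative columns.

    `matrix` maps a (hypothetical) genome name to its string of SNP states;
    all strings are assumed to have the same length.
    """
    names = list(matrix)
    n_sites = len(matrix[names[0]])
    return [
        i
        for i in range(n_sites)
        if is_parsimony_informative([matrix[name][i] for name in names])
    ]


if __name__ == "__main__":
    # Toy matrix with made-up genome labels and states.
    toy = {"g1": "AACA", "g2": "AACG", "g3": "GTCG", "g4": "GTTG"}
    print(informative_sites(toy))
```

In the toy matrix, only the first two columns split the four genomes two against two; the last two columns each carry a state seen in a single genome, so they would not help resolve a parsimony tree.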
Date made available: Sep 1 2013
