Abstract
We describe an approach for genotyping bacterial strains from low-coverage genome datasets, including metagenomic data from complex samples. Sequence reads from an unknown sample are aligned to a reference genome, and the allele states at known SNP positions are then determined. The Whole Genome Focused Array SNP Typing (WG-FAST) pipeline can identify unknown strains with much less read data than is needed for genome assembly. To test WG-FAST, we resampled SNPs from real samples to understand the relationship between low-coverage metagenomic data and accurate phylogenetic placement. WG-FAST can be downloaded from https://github.com/jasonsahl/wgfast.
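The two ideas in the abstract, reading allele states at known SNP positions and resampling SNPs to mimic low coverage, can be illustrated with a minimal sketch. This is not WG-FAST's code or interface: the function names `call_known_snps` and `subsample_calls`, the dictionary-based pileup input, the majority-base rule, and the use of `N` for uncalled positions are all assumptions made for illustration.

```python
import random
from collections import Counter


def call_known_snps(pileup, known_positions, min_depth=1):
    """Return {position: base} for each known SNP position.

    `pileup` is a hypothetical mapping of reference position to the list
    of read bases aligned there. Positions with fewer than `min_depth`
    bases (or none at all) are reported as "N".
    """
    calls = {}
    for pos in known_positions:
        bases = pileup.get(pos, [])
        # Require at least one base so the majority call is defined.
        if len(bases) < max(min_depth, 1):
            calls[pos] = "N"
            continue
        # Assumption: take the most frequent base as the allele state.
        base, _count = Counter(bases).most_common(1)[0]
        calls[pos] = base
    return calls


def subsample_calls(calls, n_keep, seed=0):
    """Keep `n_keep` randomly chosen called positions and mask the rest.

    Masked positions are set to "N", imitating a sample in which only a
    fraction of the known SNPs is covered by reads.
    """
    rng = random.Random(seed)
    called = sorted(pos for pos, base in calls.items() if base != "N")
    keep = set(rng.sample(called, min(n_keep, len(called))))
    return {pos: (base if pos in keep else "N") for pos, base in calls.items()}


if __name__ == "__main__":
    pileup = {10: ["A", "A", "G"], 25: ["T"], 40: []}
    calls = call_known_snps(pileup, [10, 25, 40, 99])
    print(calls)
    print(subsample_calls(calls, n_keep=1, seed=1))
```

Repeating the subsampling step across a range of `n_keep` values and seeds, then placing each masked profile on a reference phylogeny, is one way to explore how many called SNPs are needed for stable placement. The placement step itself is outside this sketch.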
Original language | English (US) |
---|---|
Article number | 52 |
Journal | Genome Medicine |
Volume | 7 |
Issue number | 1 |
State | Published - Jun 9 2015 |
ASJC Scopus subject areas
- Molecular Medicine
- Molecular Biology
- Genetics
- Genetics (clinical)