TY - JOUR
T1 - The simple fool's guide to population genomics via RNA-Seq
T2 - An introduction to high-throughput sequencing data analysis
AU - De Wit, Pierre
AU - Pespeni, Melissa H.
AU - Ladner, Jason T.
AU - Barshis, Daniel J.
AU - Seneca, François
AU - Jaris, Hannah
AU - Therkildsen, Nina Overgaard
AU - Morikawa, Megan
AU - Palumbi, Stephen R.
PY - 2012/11
Y1 - 2012/11
N2 - High-throughput sequencing technologies are currently revolutionizing the field of biology and medicine, yet bioinformatic challenges in analysing very large data sets have slowed the adoption of these technologies by the community of population biologists. We introduce the 'Simple Fool's Guide to Population Genomics via RNA-seq' (SFG), a document intended to serve as an easy-to-follow protocol, walking a user through one example of high-throughput sequencing data analysis of nonmodel organisms. It is by no means an exhaustive protocol, but rather serves as an introduction to the bioinformatic methods used in population genomics, enabling a user to gain familiarity with basic analysis steps. The SFG consists of two parts. This document summarizes the steps needed and lays out the basic themes for each and a simple approach to follow. The second document is the full SFG, publicly available at http://sfg.stanford.edu, that includes detailed protocols for data processing and analysis, along with a repository of custom-made scripts and sample files. Steps included in the SFG range from tissue collection to de novo assembly, blast annotation, alignment, gene expression, functional enrichment, SNP detection, principal components and FST outlier analyses. Although the technical aspects of population genomics are changing very quickly, our hope is that this document will help population biologists with little to no background in high-throughput sequencing and bioinformatics to more quickly adopt these new techniques.
KW - Bioinformatics
KW - De novo assembly
KW - Gene expression
KW - Population genomics
KW - RNA-Seq
KW - SNP detection
UR - http://www.scopus.com/inward/record.url?scp=84867503105&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867503105&partnerID=8YFLogxK
U2 - 10.1111/1755-0998.12003
DO - 10.1111/1755-0998.12003
M3 - Article
C2 - 22931062
AN - SCOPUS:84867503105
SN - 1755-098X
VL - 12
SP - 1058
EP - 1067
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
IS - 6
ER -