André Altmann, Peter Weber, Daniel Bader, Michael Preuss, Elisabeth B Binder, Bertram Müller-Myhsok.
Abstract
High-throughput DNA sequencing (HTS) is of increasing importance in the life sciences. One of its most prominent applications is the sequencing of whole genomes or of targeted regions of the genome, such as all exonic regions (i.e., the exome). Here, the objective is the identification of genetic variants such as single nucleotide polymorphisms (SNPs). Extracting SNPs from the raw genetic sequences involves many processing steps and the application of a diverse set of tools. We review the essential building blocks for a pipeline that calls SNPs from raw HTS data. The pipeline includes quality control, mapping of short reads to the reference genome, and visualization and post-processing of the alignment, including base quality recalibration. The final steps of the pipeline are the SNP calling procedure and the filtering of SNP candidates. The steps of this pipeline are accompanied by an analysis of a publicly available whole-exome sequencing dataset. To this end, we employ several alignment programs and SNP calling routines to highlight that the choice of tools significantly affects the final results.
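The last pipeline step named above, filtering of SNP candidates, can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes the candidates are tab-separated VCF records, that read depth is stored under the `DP` key of the INFO column, and that a call is kept only if it is a single-base substitution passing a quality and a depth cutoff. The function names and the thresholds `MIN_QUAL` and `MIN_DEPTH` are hypothetical and chosen only for illustration.

```python
# Hypothetical cutoffs for illustration; not taken from the paper.
MIN_QUAL = 30.0
MIN_DEPTH = 10


def parse_info(info):
    """Turn a VCF INFO string such as 'DP=14;AF=0.5' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def is_snp(ref, alt):
    """True if REF and every ALT allele are single nucleotides."""
    return len(ref) == 1 and all(
        len(a) == 1 and a in "ACGT" for a in alt.split(",")
    )


def filter_snps(vcf_lines, min_qual=MIN_QUAL, min_depth=MIN_DEPTH):
    """Keep SNP records whose QUAL and DP reach the given cutoffs."""
    kept = []
    for line in vcf_lines:
        # Skip blank lines and header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, qual, _filter, info = cols[:8]
        if not is_snp(ref, alt):
            continue
        if qual == "." or float(qual) < min_qual:
            continue
        # A record without a DP entry is treated as depth 0 (assumption).
        depth = int(parse_info(info).get("DP", 0))
        if depth < min_depth:
            continue
        kept.append((chrom, int(pos), ref, alt))
    return kept


# Made-up example records, not data from the study.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50.0\t.\tDP=20",
    "1\t2000\t.\tC\tT\t12.5\t.\tDP=25",   # fails the quality cutoff
    "1\t3000\t.\tG\tA\t60.0\t.\tDP=4",    # fails the depth cutoff
    "1\t4000\t.\tAT\tA\t70.0\t.\tDP=30",  # deletion, not a SNP
]

if __name__ == "__main__":
    print(filter_snps(example))
```

Loosening the cutoffs, for example `filter_snps(example, min_qual=10, min_depth=1)`, retains more candidates, which gives a small-scale impression of how filtering choices alone can change the final call set.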
Year: 2012 PMID: 22886560 DOI: 10.1007/s00439-012-1213-z
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132