A thesaurus of genetic variation for interrogation of repetitive genomic regions.

Claudia Kerzendorfer¹, Tomasz Konopka², Sebastian M B Nijman³.

Abstract

Detecting genetic variation is one of the main applications of high-throughput sequencing, but it remains challenging wherever the alignment of short reads is ambiguous. Current state-of-the-art variant calling approaches avoid such regions, arguing that it is necessary to sacrifice detection sensitivity to limit false discovery. We developed a method that links candidate variant positions within repetitive genomic regions into clusters. The technique relies on a resource, a thesaurus of genetic variation, that enumerates genomic regions with similar sequence. The resource is computationally intensive to generate, but once compiled it can be applied efficiently to annotate and prioritize variants in repetitive regions. We show that thesaurus annotation can reduce the rate of false variant calls due to low mappability by up to three orders of magnitude. We apply the technique to whole-genome datasets and establish that variants called in low-mappability regions and annotated using the thesaurus can be experimentally validated. We then extend the analysis to a large panel of exomes to show that the annotation technique opens possibilities to study variation in hitherto hidden and under-studied parts of the genome.
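
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thesaurus tuple layout, the function names, and the assumption that similar regions align without gaps (so that an offset in one region maps to the same offset in its partner) are all hypothetical and chosen only for illustration. Candidate variant positions that the thesaurus links to one another are merged with a union-find structure.

```python
from collections import defaultdict

# Hypothetical thesaurus entry layout (an assumption, not the published format):
# (chrom_a, start_a, end_a, chrom_b, start_b, end_b), 1-based inclusive,
# meaning region A has sequence similar to region B.


def find(parent, x):
    """Return the representative of x, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def cluster_variants(variants, thesaurus):
    """Group candidate variant positions that thesaurus entries link together.

    variants: list of (chrom, pos) tuples.
    Assumes ungapped similarity, so offsets carry over between linked regions.
    """
    parent = {v: v for v in variants}
    for chrom_a, start_a, end_a, chrom_b, start_b, end_b in thesaurus:
        for chrom, pos in variants:
            if chrom == chrom_a and start_a <= pos <= end_a:
                # Map the position into the similar region by its offset.
                partner = (chrom_b, start_b + (pos - start_a))
                if partner in parent:
                    union(parent, (chrom, pos), partner)
    clusters = defaultdict(list)
    for v in variants:
        clusters[find(parent, v)].append(v)
    return sorted(sorted(members) for members in clusters.values())


# Toy data, invented for illustration only.
toy_variants = [("chr1", 105), ("chr5", 2005), ("chr2", 50)]
toy_thesaurus = [("chr1", 100, 200, "chr5", 2000, 2100)]
toy_clusters = cluster_variants(toy_variants, toy_thesaurus)
```

In this toy input the two candidates on chr1 and chr5 fall at the same offset of a linked region pair and end up in one cluster, while the chr2 candidate remains alone. A real resource would also have to handle gapped alignments and a genome-scale number of entries, which this sketch does not attempt.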

Year: 2015 PMID: 25820428 PMCID: PMC4446415 DOI: 10.1093/nar/gkv178

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
