Optimized filtering reduces the error rate in detecting genomic variants by short-read sequencing.

Joke Reumers, Peter De Rijk, Hui Zhao, Anthony Liekens, Dominiek Smeets, John Cleary, Peter Van Loo, Maarten Van Den Bossche, Kirsten Catthoor, Bernard Sabbe, Evelyn Despierre, Ignace Vergote, Brian Hilbush, Diether Lambrechts, Jurgen Del-Favero.

Abstract

Distinguishing single-nucleotide variants (SNVs) from errors in whole-genome sequences remains challenging. Here we describe a set of filters, together with a freely accessible software tool, that selectively reduce error rates and thereby facilitate variant detection in data from two short-read sequencing technologies, Complete Genomics and Illumina. By sequencing the nearly identical genomes from monozygotic twins and considering shared SNVs as 'true variants' and discordant SNVs as 'errors', we optimized thresholds for 12 individual filters and assessed which of the 1,048 filter combinations were effective in terms of sensitivity and specificity. Cumulative application of all effective filters reduced the error rate by 290-fold, facilitating the identification of genetic differences between monozygotic twins. We also applied an adapted, less stringent set of filters to reliably identify somatic mutations in a highly rearranged tumor and to identify variants in the NA19240 HapMap genome relative to a reference set of SNVs.
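The twin-based evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' software tool: the `Call` fields, the function names, the coverage threshold of 10, and the toy data are all hypothetical, and the rule that a site passes only if every call present at it passes is an assumption made for illustration. Calls shared by both twins are treated as "true variants", calls seen in only one twin as "errors", and a candidate filter is scored by the fraction of shared calls it keeps (sensitivity) and the fraction of discordant calls it removes (used here as a simple stand-in for specificity).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

# A site is identified by (chromosome, position, alternate allele).
Site = Tuple[str, int, str]


@dataclass(frozen=True)
class Call:
    """Hypothetical per-call annotations; real callers report many more."""
    coverage: int
    quality: float


def label_calls(twin_a: Dict[Site, Call],
                twin_b: Dict[Site, Call]) -> Tuple[Set[Site], Set[Site]]:
    """Split sites into shared ('true variants') and discordant ('errors')."""
    shared = set(twin_a) & set(twin_b)
    discordant = set(twin_a) ^ set(twin_b)
    return shared, discordant


def evaluate_filter(twin_a: Dict[Site, Call],
                    twin_b: Dict[Site, Call],
                    keep: Callable[[Call], bool]) -> Tuple[float, float]:
    """Return (sensitivity, fraction of discordant calls removed)."""
    shared, discordant = label_calls(twin_a, twin_b)

    def passes(site: Site) -> bool:
        # Assumption: a site passes only if every call present at it passes.
        return all(keep(calls[site])
                   for calls in (twin_a, twin_b) if site in calls)

    shared_kept = sum(passes(s) for s in shared)
    discordant_removed = sum(not passes(s) for s in discordant)
    sensitivity = shared_kept / len(shared) if shared else 0.0
    errors_removed = discordant_removed / len(discordant) if discordant else 0.0
    return sensitivity, errors_removed


# Toy data, invented for illustration only.
twin_a = {
    ("chr1", 100, "A"): Call(30, 60.0),
    ("chr1", 200, "T"): Call(25, 50.0),
    ("chr2", 50, "G"): Call(5, 20.0),    # seen in twin A only
    ("chr3", 10, "C"): Call(40, 70.0),
}
twin_b = {
    ("chr1", 100, "A"): Call(28, 55.0),
    ("chr1", 200, "T"): Call(8, 45.0),
    ("chr2", 75, "C"): Call(35, 65.0),   # seen in twin B only
    ("chr3", 10, "C"): Call(33, 62.0),
}

# Candidate filter: require at least 10 reads of coverage (arbitrary threshold).
sensitivity, errors_removed = evaluate_filter(
    twin_a, twin_b, lambda call: call.coverage >= 10)
print(f"sensitivity={sensitivity:.2f}, errors removed={errors_removed:.2f}")
```

On this toy data the coverage filter keeps two of the three shared calls and removes one of the two discordant calls. Repeating the evaluation over a range of thresholds, or over combinations of several such filters, would correspond to the kind of threshold optimization the abstract describes.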

Year: 2011    PMID: 22178994    DOI: 10.1038/nbt.2053

Source DB: PubMed    Journal: Nat Biotechnol    ISSN: 1087-0156    Impact factor: 54.908


