
Next-generation human genetics.

Abstract

The field of human genetics is being reshaped by exome and genome sequencing. Several lessons are evident from observing the rapid development of this area over the past 2 years, and these may be instructive with respect to what we should expect from 'next-generation human genetics' in the next few years.

Year: 2011 PMID: 21920048 PMCID: PMC3308046 DOI: 10.1186/gb-2011-12-9-408

Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583

In 2005, two publications introduced methods for massively parallel DNA sequencing [1,2], marking the beginning of a dizzying free-fall in sequencing costs that continues today with no obvious end in sight. To enable the flexible application of these 'next-generation' technologies in the context of human genetics, our group and others have developed new methods for the parallel and programmable capture of complex subsets of the human genome at a cost and scale that is commensurate with the power of new sequencing technologies [3]. These methods facilitate the next-generation sequencing of specific subsets of the genome in many individuals for the same cost as whole-genome sequencing of a single individual. An effective compromise between the competing goals of genome-wide comprehensiveness and cost control was realized in the concept of 'exome sequencing', that is, the capture and sequencing of the approximately 1% of the human genome that is protein coding [4,5]. The contents of this special issue of Genome Biology, as well as over 200 other publications since 2009 whose abstracts contain the term 'exome', confirm the success of exome sequencing as a new and effective technological paradigm within human genetics. Exome sequencing has proven useful for identifying the molecular defects underlying single gene disorders, as well as some genetically heterogeneous disorders; for identifying genes that are recurrently mutated in various cancers; and for new insights with respect to human evolution and population genetics. Furthermore, even though exome sequencing only became broadly accessible in late 2009, well over 10,000 exomes have been sequenced to date. Consequently, what has been published thus far is likely to represent only a small fraction of the collective body of work in progress that applies exome sequencing in diverse contexts.
Today, the cost of whole-genome sequencing has fallen to a few thousand dollars, and exome sequencing is being declared in some quarters to be obsolete at the very moment when it has seemingly become pervasive. There is likely to be some truth to this. As the cost of whole-genome sequencing is falling to a level where it is broadly accessible, and as the cost differential between exome and genome sequencing is diminishing as well, there inevitably will be less motivation to bother with exome enrichment. However, although the 'exome versus genome' tension is of great practical relevance, I worry that it may distract us from other lessons that are evident from observing the rapid development of this field over the past 2 years. I attempt to summarize a few of these below, as they may be instructive with respect to what we should expect from 'next-generation human genetics' in the next few years.

High-yield genetics

Exome sequencing identifies approximately 20,000 variants [4], and genome sequencing identifies approximately 4,000,000 variants [6], per individual sequenced. New technologies have altered the nature of the starting point, but the fundamental problem for human geneticists remains the same: how to narrow to the single or few variants that are causal for a phenotype of interest. To date, nearly all successful studies applying exome sequencing to identify disease genes have adopted one of three paradigms for reducing search space. (1) For solving Mendelian disorders, a straightforward strategy initially proposed by our group involves exome sequencing of a small number of affected individuals, filtering of common variants by comparison to public SNP databases or unrelated controls, and prioritization of genes containing apparently rare, protein-altering variants in all or most affected individuals [4]. The major advantage of this approach is that it can be independent of linkage analysis, that is, it enables the identification of the molecular basis of a Mendelian disorder without requiring access to pedigrees of sufficient size to properly map the locus, or any pedigrees, for that matter (though pedigree information can still be useful, especially for genetically heterogeneous disorders [7,8]). For recessive disorders, particularly those occurring in consanguineous families, exome sequencing of just a single individual (that is, n = 2 in terms of affected chromosomes) followed by filtering of common variants may be sufficient to narrow to one or a few candidate genes [9]. (2) An alternative strategy involves exome sequencing of parent-child trios to identify the (approximately) one de novo coding mutation occurring per generation [10]. This may be particularly effective for Mendelian disorders where a dominant mode of transmission is suspected and proband(s) with unaffected parents are available. 
More notably, however, this paradigm is being successfully applied to approach complex neuropsychiatric disorders, including intellectual disability [10], autism [11] and schizophrenia [12]. Although mutations in hundreds of genes may contribute to each of these genetically and phenotypically heterogeneous disorders, the fact that de novo, large-effect coding mutations appear to underlie a sizable proportion of sporadic cases provides a highly efficient means for identifying candidate genes. (3) For cancer, a straightforward approach involves the pairwise comparison of exome sequences of tumor and normal tissue from the same individual to distinguish the handful of somatic coding mutations from a large background of inherited variants. Exome sequencing of relatively modest numbers of matched tumor-normal pairs can yield the identification of novel, recurrent driver mutations for specific types of cancer [13,14]. A shared and compelling aspect of each of these strategies is that they represent 'high-yield genetics', that is, the unambiguous identification of a novel disease gene(s) with exome sequencing of a relatively small number of samples and a correspondingly modest investment of resources. There is clearly a lot of low-hanging fruit still to be had, and further decreasing costs and increasing analytical sophistication will only increase the productivity of these paradigms. Furthermore, as the broader field shifts from sequencing exomes to sequencing genomes, these same strategies may prove to be the most 'high yield' for ascertaining the contribution of non-coding mutations to Mendelian disorders as well as to at least some common diseases, for example, neuropsychiatric disorders and cancer.
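
The three search-space reductions described above can each be written as a simple filter over variant calls. What follows is a minimal sketch under stated assumptions, not the pipeline used in any of the cited studies: the record layout `(gene, variant_id, is_protein_altering)`, the function names, the `min_fraction` parameter and all sample data are hypothetical, and real analyses must also deal with genotype quality, coverage and annotation.

```python
from collections import defaultdict

# Hypothetical record layout: (gene, variant_id, is_protein_altering).
# Every name and data value here is illustrative only.


def candidate_genes(affected, common_variants, min_fraction=1.0):
    """Paradigm 1: genes carrying a rare, protein-altering variant in all
    (min_fraction=1.0) or most (min_fraction<1.0) affected individuals."""
    carriers = defaultdict(set)
    for sample, variants in affected.items():
        for gene, variant_id, protein_altering in variants:
            # Drop common variants (e.g. seen in public SNP databases or
            # unrelated controls) and variants that do not alter protein.
            if protein_altering and variant_id not in common_variants:
                carriers[gene].add(sample)
    needed = min_fraction * len(affected)
    return sorted(g for g, samples in carriers.items() if len(samples) >= needed)


def de_novo_variants(child, mother, father):
    """Paradigm 2: variants called in the child but in neither parent."""
    return child - mother - father


def somatic_variants(tumor, normal):
    """Paradigm 3: variants called in tumor but not in matched normal tissue."""
    return tumor - normal


affected = {
    "patient_A": [("GENE1", "v1", True), ("GENE2", "v2", True),
                  ("GENE3", "v3", False), ("GENE5", "v10", True)],
    "patient_B": [("GENE1", "v4", True), ("GENE2", "v2", True),
                  ("GENE5", "v11", True)],
    "patient_C": [("GENE1", "v1", True), ("GENE4", "v5", True)],
}
common = {"v2"}  # stand-in for a public SNP database

print(candidate_genes(affected, common))                    # ['GENE1']
print(candidate_genes(affected, common, min_fraction=0.6))  # ['GENE1', 'GENE5']
print(sorted(de_novo_variants({"v1", "v6"}, {"v1"}, {"v7"})))
print(sorted(somatic_variants({"v1", "v8", "v9"}, {"v1"})))
```

Note that the first filter groups by gene rather than by variant, so affected individuals carrying different rare alleles in the same gene still point to the same candidate.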

Power to the people

Hundreds of independent research groups have successfully implemented exome sequencing in the past 2 years. At least five factors contributed to this being possible: (1) the widespread purchase of next-generation sequencing instruments since 2005; (2) the availability of excellent open-source software for data analysis, for example, bwa [15] and samtools [16]; (3) the rapid development and commercialization of effective reagents for exome capture, for example, Agilent SureSelect and NimbleGen SeqCap; (4) a relatively low cost per sample (that is, capture reagents and one sequencing lane) such that the entry point cost for exome sequencing was historically much more accessible than that of genome sequencing; (5) the fact that such a large number of groups have samples on hand on which they are highly motivated to perform exome sequencing. Why does this broad base of participation matter? First, the learning curve for new technologies can be substantial. As a consequence of the perceived effectiveness, simplicity and affordability of exome sequencing, a much larger group of researchers has engaged and become competent with next-generation sequencing than might otherwise have been the case. Second, the field itself benefits tremendously from this 'democratization' of access and participation, in the sense that much of the innovation and nearly all of the discoveries have come from small groups working with next-generation sequencing for the first time. Notably, there are very few discoveries made by whole-genome sequencing to date that could not have been made more cost-effectively by exome sequencing. However, far fewer groups have thus far taken on whole-genome sequencing, and it is possible that broader participation - in terms of the researchers and their samples - remains the missing ingredient.

Challenges and opportunities

Even with the rapid maturation of this field, there are a number of areas that are still, to varying degrees, a work-in-progress; these are described as follows. (1) Exome sequencing fails to solve a substantial proportion of presumably Mendelian phenotypes, even in model organisms where the genetics are crystal clear [17]. If we are to conceive of solving all of the Mendelian disorders for which the causative gene(s) remains unknown, understanding the basis of these failures will be critical. Analogously, there are types of cancer where exome sequencing has not been that successful, due perhaps to marked genetic heterogeneity or the fact that many of the underlying driver mutations may be structural or non-coding. (2) There is tremendous interest in understanding the contribution of rare variation to the genetic basis of common diseases. Many such studies have been initiated using exome sequencing, but are still ongoing as they require large sample sizes to achieve power. These studies will set the stage for understanding the contribution of all rare variants, coding and non-coding, to these same diseases via whole-genome sequencing. (3) The discrete prioritization of all protein-altering variation over all other variation has clearly proven useful, but is undeniably crude. As we shift from exomes to genomes, we incur a 100-fold increase in noise for an unknown gain in signal. We are desperately in need of more sophisticated methods for assigning more appropriate 'priors' to coding and non-coding variants alike. (4) To date, attempts to interpret 'personal exomes' or 'personal genomes' for clinically relevant facts have been mostly disappointing. If we are to be successful in deploying these tools in a clinical setting, we have a very long way to go in terms of predicting phenotype from genotype. We are only a few years into an incredible trajectory in which exome sequencing and genome sequencing are reshaping the landscape of human genetics. 
For some problems, it is clear that these technologies were exactly what was needed, and the application of high-yield paradigms by diverse research groups is leading to a plethora of rapid discoveries. For other problems, the removal of one rate-limiting step has only given way to a new rate-limiting step, and we are likely to have our work cut out for us for the foreseeable future.
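
The scale of the exome-to-genome shift raised in point (3) above can be made concrete with the figures quoted earlier in the text. The sketch below is simple arithmetic, not a statistical model: the assumption of exactly one causal variant per individual and the uniform prior are illustrative simplifications, and the constant and function names are hypothetical.

```python
# Figures quoted in the text; everything else is an illustrative assumption.
EXOME_VARIANTS = 20_000       # approximate variants per exome
GENOME_VARIANTS = 4_000_000   # approximate variants per genome
EXOME_PERCENT_OF_GENOME = 1   # approximate protein-coding share, in percent


def uniform_prior(n_candidates):
    """Chance that any one variant is causal if exactly one causal variant
    exists and nothing distinguishes the candidates (an assumption)."""
    return 1.0 / n_candidates


# Growth of the search space measured by sequence covered.
fold_by_sequence = 100 / EXOME_PERCENT_OF_GENOME
# Growth of the search space measured by variants called per individual.
fold_by_variants = GENOME_VARIANTS / EXOME_VARIANTS

print(f"fold increase by sequence: {fold_by_sequence:.0f}")
print(f"fold increase by variant count: {fold_by_variants:.0f}")
print(f"uniform prior, exome:  {uniform_prior(EXOME_VARIANTS):.1e}")
print(f"uniform prior, genome: {uniform_prior(GENOME_VARIANTS):.1e}")
```

Measured by sequence covered the increase is 100-fold, and measured by the per-individual variant counts it is 200-fold. Under either measure a flat prior leaves each variant with a vanishingly small chance of being causal, which is why more informative per-variant priors matter.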

Abbreviations

SNP: single-nucleotide polymorphism.

Competing interests

The author declares that they have no competing interests.

References (17 in total)

Review 1. Target-enrichment strategies for next-generation sequencing.

Authors: Lira Mamanova; Alison J Coffey; Carol E Scott; Iwanka Kozarewa; Emily H Turner; Akash Kumar; Eleanor Howard; Jay Shendure; Daniel J Turner
Journal: Nat Methods Date: 2010-02 Impact factor: 28.547

2. A de novo paradigm for mental retardation.

Authors: Lisenka E L M Vissers; Joep de Ligt; Christian Gilissen; Irene Janssen; Marloes Steehouwer; Petra de Vries; Bart van Lier; Peer Arts; Nienke Wieskamp; Marisol del Rosario; Bregje W M van Bon; Alexander Hoischen; Bert B A de Vries; Han G Brunner; Joris A Veltman
Journal: Nat Genet Date: 2010-11-14 Impact factor: 38.330

3. Accurate multiplex polony sequencing of an evolved bacterial genome.

Authors: Jay Shendure; Gregory J Porreca; Nikos B Reppas; Xiaoxia Lin; John P McCutcheon; Abraham M Rosenbaum; Michael D Wang; Kun Zhang; Robi D Mitra; George M Church
Journal: Science Date: 2005-08-04 Impact factor: 47.728

4. Genome sequencing in microfabricated high-density picolitre reactors.

Authors: Marcel Margulies; Michael Egholm; William E Altman; Said Attiya; Joel S Bader; Lisa A Bemben; Jan Berka; Michael S Braverman; Yi-Ju Chen; Zhoutao Chen; Scott B Dewell; Lei Du; Joseph M Fierro; Xavier V Gomes; Brian C Godwin; Wen He; Scott Helgesen; Chun Heen Ho; Chun He Ho; Gerard P Irzyk; Szilveszter C Jando; Maria L I Alenquer; Thomas P Jarvie; Kshama B Jirage; Jong-Bum Kim; James R Knight; Janna R Lanza; John H Leamon; Steven M Lefkowitz; Ming Lei; Jing Li; Kenton L Lohman; Hong Lu; Vinod B Makhijani; Keith E McDade; Michael P McKenna; Eugene W Myers; Elizabeth Nickerson; John R Nobile; Ramona Plant; Bernard P Puc; Michael T Ronan; George T Roth; Gary J Sarkis; Jan Fredrik Simons; John W Simpson; Maithreyan Srinivasan; Karrie R Tartaro; Alexander Tomasz; Kari A Vogt; Greg A Volkmer; Shally H Wang; Yong Wang; Michael P Weiner; Pengguang Yu; Richard F Begley; Jonathan M Rothberg
Journal: Nature Date: 2005-07-31 Impact factor: 49.962

5. Exome sequencing identifies GRIN2A as frequently mutated in melanoma.

Authors: Xiaomu Wei; Vijay Walia; Jimmy C Lin; Jamie K Teer; Todd D Prickett; Jared Gartner; Sean Davis; Katherine Stemke-Hale; Michael A Davies; Jeffrey E Gershenwald; William Robinson; Steven Robinson; Steven A Rosenberg; Yardena Samuels
Journal: Nat Genet Date: 2011-04-15 Impact factor: 38.330

6. Genome-wide in situ exon capture for selective resequencing.

Authors: Emily Hodges; Zhenyu Xuan; Vivekanand Balija; Melissa Kramer; Michael N Molla; Steven W Smith; Christina M Middle; Matthew J Rodesch; Thomas J Albert; Gregory J Hannon; W Richard McCombie
Journal: Nat Genet Date: 2007-11-04 Impact factor: 38.330

7. Exome sequencing reveals VCP mutations as a cause of familial ALS.

Authors: Janel O Johnson; Jessica Mandrioli; Michael Benatar; Yevgeniya Abramzon; Vivianna M Van Deerlin; John Q Trojanowski; J Raphael Gibbs; Maura Brunetti; Susan Gronka; Joanne Wuu; Jinhui Ding; Leo McCluskey; Maria Martinez-Lage; Dana Falcone; Dena G Hernandez; Sampath Arepalli; Sean Chong; Jennifer C Schymick; Jeffrey Rothstein; Francesco Landi; Yong-Dong Wang; Andrea Calvo; Gabriele Mora; Mario Sabatelli; Maria Rosaria Monsurrò; Stefania Battistini; Fabrizio Salvi; Rossella Spataro; Patrizia Sola; Giuseppe Borghero; Giuliana Galassi; Sonja W Scholz; J Paul Taylor; Gabriella Restagno; Adriano Chiò; Bryan J Traynor
Journal: Neuron Date: 2010-12-09 Impact factor: 17.173

8. Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma.

Authors: Ignacio Varela; Patrick Tarpey; Keiran Raine; Dachuan Huang; Choon Kiat Ong; Philip Stephens; Helen Davies; David Jones; Meng-Lay Lin; Jon Teague; Graham Bignell; Adam Butler; Juok Cho; Gillian L Dalgliesh; Danushka Galappaththige; Chris Greenman; Claire Hardy; Mingming Jia; Calli Latimer; King Wai Lau; John Marshall; Stuart McLaren; Andrew Menzies; Laura Mudie; Lucy Stebbings; David A Largaespada; L F A Wessels; Stephane Richard; Richard J Kahnoski; John Anema; David A Tuveson; Pedro A Perez-Mancera; Ville Mustonen; Andrej Fischer; David J Adams; Alistair Rust; Waraporn Chan-on; Chutima Subimerb; Karl Dykema; Kyle Furge; Peter J Campbell; Bin Tean Teh; Michael R Stratton; P Andrew Futreal
Journal: Nature Date: 2011-01-19 Impact factor: 49.962

9. Targeted capture and massively parallel sequencing of 12 human exomes.

Authors: Sarah B Ng; Emily H Turner; Peggy D Robertson; Steven D Flygare; Abigail W Bigham; Choli Lee; Tristan Shaffer; Michelle Wong; Arindam Bhattacharjee; Evan E Eichler; Michael Bamshad; Deborah A Nickerson; Jay Shendure
Journal: Nature Date: 2009-08-16 Impact factor: 49.962

10. Accurate whole human genome sequencing using reversible terminator chemistry.

Authors: David R Bentley; Shankar Balasubramanian; Harold P Swerdlow; Geoffrey P Smith; John Milton; Clive G Brown; Kevin P Hall; Dirk J Evers; Colin L Barnes; Helen R Bignell; Jonathan M Boutell; Jason Bryant; Richard J Carter; R Keira Cheetham; Anthony J Cox; Darren J Ellis; Michael R Flatbush; Niall A Gormley; Sean J Humphray; Leslie J Irving; Mirian S Karbelashvili; Scott M Kirk; Heng Li; Xiaohai Liu; Klaus S Maisinger; Lisa J Murray; Bojan Obradovic; Tobias Ost; Michael L Parkinson; Mark R Pratt; Isabelle M J Rasolonjatovo; Mark T Reed; Roberto Rigatti; Chiara Rodighiero; Mark T Ross; Andrea Sabot; Subramanian V Sankar; Aylwyn Scally; Gary P Schroth; Mark E Smith; Vincent P Smith; Anastassia Spiridou; Peta E Torrance; Svilen S Tzonev; Eric H Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D Alam; Carole Anastasi; Ify C Aniebo; David M D Bailey; Iain R Bancarz; Saibal Banerjee; Selena G Barbour; Primo A Baybayan; Vincent A Benoit; Kevin F Benson; Claire Bevis; Phillip J Black; Asha Boodhun; Joe S Brennan; John A Bridgham; Rob C Brown; Andrew A Brown; Dale H Buermann; Abass A Bundu; James C Burrows; Nigel P Carter; Nestor Castillo; Maria Chiara E Catenazzi; Simon Chang; R Neil Cooley; Natasha R Crake; Olubunmi O Dada; Konstantinos D Diakoumakos; Belen Dominguez-Fernandez; David J Earnshaw; Ugonna C Egbujor; David W Elmore; Sergey S Etchin; Mark R Ewan; Milan Fedurco; Louise J Fraser; Karin V Fuentes Fajardo; W Scott Furey; David George; Kimberley J Gietzen; Colin P Goddard; George S Golda; Philip A Granieri; David E Green; David L Gustafson; Nancy F Hansen; Kevin Harnish; Christian D Haudenschild; Narinder I Heyer; Matthew M Hims; Johnny T Ho; Adrian M Horgan; Katya Hoschler; Steve Hurwitz; Denis V Ivanov; Maria Q Johnson; Terena James; T A Huw Jones; Gyoung-Dong Kang; Tzvetana H Kerelska; Alan D Kersey; Irina Khrebtukova; Alex P Kindwall; Zoya Kingsbury; Paula I Kokko-Gonzales; Anil Kumar; Marc A Laurent; Cynthia T Lawley; Sarah E Lee; Xavier Lee; 
Arnold K Liao; Jennifer A Loch; Mitch Lok; Shujun Luo; Radhika M Mammen; John W Martin; Patrick G McCauley; Paul McNitt; Parul Mehta; Keith W Moon; Joe W Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M Novo; Michael J O'Neill; Mark A Osborne; Andrew Osnowski; Omead Ostadan; Lambros L Paraschos; Lea Pickering; Andrew C Pike; Alger C Pike; D Chris Pinkard; Daniel P Pliskin; Joe Podhasky; Victor J Quijano; Come Raczy; Vicki H Rae; Stephen R Rawlings; Ana Chiva Rodriguez; Phyllida M Roe; John Rogers; Maria C Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K Roth; Natalie J Rourke; Silke T Ruediger; Eli Rusman; Raquel M Sanches-Kuiper; Martin R Schenker; Josefina M Seoane; Richard J Shaw; Mitch K Shiver; Steven W Short; Ning L Sizto; Johannes P Sluis; Melanie A Smith; Jean Ernest Sohna Sohna; Eric J Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L Tregidgo; Gerardo Turcatti; Stephanie Vandevondele; Yuli Verhovsky; Selene M Virk; Suzanne Wakelin; Gregory C Walcott; Jingwen Wang; Graham J Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C Mullikin; Matthew E Hurles; Nick J McCooke; John S West; Frank L Oaks; Peter L Lundberg; David Klenerman; Richard Durbin; Anthony J Smith
Journal: Nature Date: 2008-11-06 Impact factor: 49.962
