Jenny C Taylor1,2, Hilary C Martin2, Stefano Lise2, John Broxholme2, Jean-Baptiste Cazier2, Andy Rimmer2, Alexander Kanapin2, Gerton Lunter2, Simon Fiddy2, Chris Allan2, A Radu Aricescu2, Moustafa Attar2, Christian Babbs3, Jennifer Becq4, David Beeson5, Celeste Bento6, Patricia Bignell7, Edward Blair8, Veronica J Buckle3, Katherine Bull2,9, Ondrej Cais10, Holger Cario11, Helen Chapel12, Richard R Copley1,2, Richard Cornall9, Jude Craft1,2, Karin Dahan13,14, Emma E Davenport2, Calliope Dendrou15, Olivier Devuyst16, Aimée L Fenwick17, Jonathan Flint2, Lars Fugger15, Rodney D Gilbert18, Anne Goriely17, Angie Green2, Ingo H Greger10, Russell Grocock4, Anja V Gruszczyk17, Robert Hastings19, Edouard Hatton2, Doug Higgs3, Adrian Hill2,20, Chris Holmes2,21, Malcolm Howard1,2, Linda Hughes2, Peter Humburg2, David Johnson22, Fredrik Karpe23, Zoya Kingsbury4, Usha Kini8, Julian C Knight2, Jonathan Krohn2, Sarah Lamble2, Craig Langman24, Lorne Lonie2, Joshua Luck17, Davis McCarthy2, Simon J McGowan17, Mary Frances McMullin25, Kerry A Miller17, Lisa Murray4, Andrea H Németh26, M Andrew Nesbit27, David Nutt28, Elizabeth Ormondroyd19, Annette Bang Oturai29, Alistair Pagnamenta1,2, Smita Y Patel12, Melanie Percy30, Nayia Petousi31, Paolo Piazza2, Sian E Piret27, Guadalupe Polanco-Echeverry2, Niko Popitsch1,2, Fiona Powrie32, Chris Pugh31, Lynn Quek3, Peter A Robbins33, Kathryn Robson3, Alexandra Russo34, Natasha Sahgal2, Pauline A van Schouwenburg12, Anna Schuh1,35, Earl Silverman36, Alison Simmons15,32, Per Soelberg Sørensen29, Elizabeth Sweeney37, John Taylor1,38, Rajesh V Thakker27, Ian Tomlinson1,2, Amy Trebes2, Stephen Rf Twigg17, Holm H Uhlig32, Paresh Vyas3, Tim Vyse39, Steven A Wall22, Hugh Watkins19, Michael P Whyte40, Lorna Witty2, Ben Wright2, Chris Yau2, David Buck2, Sean Humphray4, Peter J Ratcliffe31, John I Bell41, Andrew Om Wilkie17, David Bentley4, Peter Donnelly2,21, Gilean McVean2.
Abstract
To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis but also highlight many outstanding challenges.
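The filtering step mentioned in the abstract (checking candidates against both a local and an external database) can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function name `filter_rare`, the `chrom:pos:ref>alt` key format, the dictionary inputs and the 0.5% frequency cutoff are all assumptions chosen for illustration.

```python
def filter_rare(variant_keys, local_freqs, external_freqs, max_freq=0.005):
    """Keep variants that are rare in BOTH frequency sources.

    variant_keys   : iterable of hypothetical "chrom:pos:ref>alt" strings
    local_freqs    : dict mapping key -> allele frequency in a local database
    external_freqs : dict mapping key -> allele frequency in an external database
    max_freq       : assumed cutoff; the abstract does not state the one used
    """
    kept = []
    for key in variant_keys:
        # A variant absent from a database is treated as frequency 0.
        local = local_freqs.get(key, 0.0)
        external = external_freqs.get(key, 0.0)
        if local <= max_freq and external <= max_freq:
            kept.append(key)
    return kept


calls = ["1:100:A>G", "2:200:C>T", "X:300:G>A"]
local = {"2:200:C>T": 0.2}          # common in the local cohort, so dropped
external = {"X:300:G>A": 0.0006}    # rare externally, so kept
print(filter_rare(calls, local, external))
```

Requiring rarity in both sources means a variant that is common only in the local cohort (for example, a recurrent sequencing artefact) is still removed even though an external database has never seen it.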
Year: 2015 PMID: 25985138 PMCID: PMC4601524 DOI: 10.1038/ng.3304
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Figure 1. Overview of projects and results
For each disorder, the number of independent cases (bars) studied is shown alongside information about the nature of the disorder: familial disorders (category 1, light green triangles), severe single-generation disorders suspected to be caused by de novo or recessive mutations (category 2, dark green), unrelated sporadic disorders (category 3, light blue) and extreme cases of common complex diseases (category 4, dark blue). The proportion of cases with each outcome class A-E is also shown (see Online Methods): pathogenic variant in novel gene for disorder (A, red circles), pathogenic variant in gene for related disorder (B, brown), pathogenic variant in known gene for disorder (C, pink), candidate pathogenic variant with validation studies underway (D, orange) and no single candidate variant, or negative results for validation of top candidate/s (E, blue). Size of points is proportional to outcome fraction. Disorders are ranked by fraction of cases with confirmed pathogenic variants (class A to C).
Figure 2. The burden of variants of unknown significance
(a) Histograms of the number of previously unreported coding variants at conserved positions in different sets of candidate genes (Tiers 1, 1+2 and 1+2+3 for columns left to right) for early-onset epilepsy, under different inheritance models, across 216 WGS500 samples. (b) Histogram of the number of previously unreported coding variants at conserved positions in known X-linked mental retardation genes (XLMR), for the 99 male WGS500 samples. The candidate genes were chosen by high-throughput searches (Online Methods). Sample identifiers indicate individuals with the disorder in question. Sample names in green text indicate that the variant is not likely to be pathogenic (since it does not fit a plausible inheritance model or is less functionally compelling than another candidate); blue text indicates that the variant is thought to be causal (see Supplementary Table 6). OTH: Ohtahara syndrome; EOE: nonsyndromic early-onset epilepsy; MR: mental retardation. See Supplementary Fig. 4 for the analysis of craniosynostosis.
Summary of conditions for which pathogenic genes were identified (class A, B or C)
| Disease | Project category | Result class | Gene | Coding consequence | Inheritance (Zygosity) | Variant |
|---|---|---|---|---|---|---|
| | 1.1 | C | | splicing | D (het) | NM_001177598:c.13+1G>C |
| | 2.2 | C | | splicing | D (het) | NM_002295.4:c.-34+5G>C |
| | 1.2 | B | | nonsense | AR (hom) | NM_006946:c.1881G>A:p.C627* |
| | 3 | C | | missense | | |
| | 1.2 | A | | missense | AR (hom) | NM_001130010:c.533T>A:p.L178Q |
| | 3 | B | | missense | AR (hom) | NM_033087:c.203T>G:p.V68G |
| | 2.1 | A | | nonsense | DN (het) | NM_003412.3:c.1163C>A:p.S388* |
| | 2.1 | B | | missense | DN (het) | NM_031407.6:c.329G>A:p.R110Q |
| | 1.1 | A | | noncoding | D (het) | NM_000799.2:c.-136G>A |
| | 1.1 | A | | noncoding | D (het) | NM_000799.2:c.-136G>A |
| | 3 | C | | missense | D (het) | NM_001724:c.269G>A:p.R90H |
| | 1.3 | C | | noncoding | XL (hemi) | deletion of chrX:139,502,946-139,504,327, |
| | 1.1 | C | | missense | D (het) | NM_000388:c.2299G>C:p.E767Q |
| | 1.4 | C | | missense | D (het) | NM_001008389:c.410G>A:p.C137Y |
| | 1.1 | C | | missense (inframe insertion/deletion) | D (het) | NM_001008389:c.279_289del:p.93_97del; |
| | 1.1 | C | | nonsense | | NM_000256:c.1303C>T:p.Q435* |
| | 4 | A | | missense | | |
| | 1.4 | C | | - | D (het) | |
| | 1.1 | C | | missense | D (het) | NM_000218:c.1195_1196insC:p.A399fs |
| | 2.1 | C | | missense | XL (hemi) | |
| | 2.1 | A | | splicing | SR (hom) | NM_004204:c.690-2A>G |
| | 2.1 | B | | missense | UPID (hom) | NM_020822:c.2896G>A:p.A966T |
| | 2.1 | C | | missense | DN (het) | NM_004518:c.827C>T:p.T276I |
| | 2.1 | C | | missense | DN (het) | NM_001040143:c.5558A>G:p.H1853R |
| | 1.1 | A | | missense | D (het) | NM_002691:c.1433G>A:p.S478N |
| | 1.1 | A | | missense | D (het) | NM_002691:c.1433G>A:p.S478N |
| | 4 | A | | missense | D (het) | NM_006231:c.1270C>G:p.L424V |
| | 4 | C | | missense and nonsense | CR (het; het) | NM_000179:c.2315G>A:p.R772Q |
| | 4 | C | | frameshift | AR (hom) | NM_004329.2:c.142_143insT:p.Thr49Asnfs*2 2 |
| | 4 | C | | splicing | D (het) | NM_001127511:c.251-2A>G |
| | 3 | A | | nonsense | DN (het) | NM_207037.1:c.1283T>G;p.L428* |
| | 3 | A | | splicing | DN (het) | NM_207037.1:c.1035+3G>C |
| | 2.1 | A | | synonymous (splicing) and missense | CR (het; het) | NM_001178010.2:c.318C>T;p.V106=; |
Each line represents a separate case or family, so if the same gene is reported on two lines, this signifies that the gene is thought to be pathogenic in both cases. Some genes have two mutations in the same affected individual, likely representing compound heterozygous inheritance, which is indicated in the Inheritance column.
D (het): dominant – affected individual/s heterozygous; AR (hom): autosomal recessive – affected individual/s homozygous; DN (het): de novo – affected individual/s heterozygous; XL (hemi): X-linked recessive – affected male/s hemizygous, affected female/s homozygous; UPID (hom): uniparental isodisomy – affected individual homozygous; CR (het; het): compound recessive – affected individual/s heterozygous for two different variants in the same gene.
Causal variant discovered independently of WGS500.
Details will be reported in an independent publication.
Form of inheritance not clear. See Supplementary Table 8.
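To make the inheritance abbreviations above concrete, here is a minimal Python sketch that assigns one of three of the labels to a single autosomal variant in a parent-offspring trio. It is an illustration under simplifying assumptions (full penetrance, error-free genotypes, genotypes coded as alternate-allele counts 0/1/2), not the classification procedure used in the study; the function name `classify_trio` and its arguments are hypothetical, and the XL, UPID and CR models are deliberately left out.

```python
def classify_trio(proband, mother, father, affected_parents=frozenset()):
    """Return an inheritance label for one autosomal variant, or None.

    proband, mother, father : alternate-allele counts (0, 1 or 2)
    affected_parents        : subset of {"mother", "father"} who have the disorder
    """
    # De novo: heterozygous in the child, absent from both parents.
    if proband == 1 and mother == 0 and father == 0:
        return "DN (het)"
    # Autosomal recessive: homozygous child, both parents carry the allele.
    if proband == 2 and mother >= 1 and father >= 1:
        return "AR (hom)"
    # Dominant: heterozygous child, and every carrier parent is affected.
    if proband == 1:
        carriers = {name for name, count in (("mother", mother), ("father", father))
                    if count >= 1}
        if carriers and carriers <= set(affected_parents):
            return "D (het)"
    return None


print(classify_trio(1, 0, 0))                # DN (het)
print(classify_trio(2, 1, 1))                # AR (hom)
print(classify_trio(1, 1, 0, {"mother"}))    # D (het)
print(classify_trio(1, 1, 0))                # None: carrier mother is unaffected
```

The last call shows why familial transmission is a useful filter: a heterozygous variant inherited from an unaffected parent fits none of the three models and is returned as `None`.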
Figure 3. Identification of de novo HUWE1 mutation associated with severe craniosynostosis
(a) Upper panel, the proband (CRS_4659; female, aged 6 months) presented with an abnormal skull shape. Lower panel, three-dimensional CT scan aged 5 months shows multisuture synostosis with multiple craniolacunae. (b) Family pedigree showing dideoxy sequence chromatograms with de novo G>A mutation of the X-linked HUWE1 gene in the proband (red arrow). Schematic X chromosomes are annotated from top to bottom with the HUWE1 alleles, haplotype of AA/CC polymorphisms located 1.15 kb away from mutation and used to deduce paternal origin, and androgen receptor (AR) trinucleotide repeat allele size (allele sizes in CRS_4654 and CRS_5215 are in brackets to emphasize that phase is unknown relative to other parts of the two X chromosomes). Note that the HUWE1 mutation abolishes a HpaII restriction site. (c) Analysis of X-inactivation in whole blood samples at AR locus. For each individual, AR alleles are indicated by arrows in the upper panel, while the lower panel shows proportions of methylated alleles and percentage representation of the more highly inactivated X chromosome. (d) Exclusive expression of cDNA from the HUWE1 mutant allele in both fibroblast (Fib) and Epstein Barr virus (EBV)-transformed lymphoblastoid cells from the proband. Arrows highlight absence of expression of the normal allele in either cell type. Product sizes (bp) from different alleles are shown on the right. WT: wild-type, Mut: mutant. (e) X chromosome ideogram showing eight de novo mutations identified. Where known, the parental allele on which the variant arose is indicated.
Figure 4. Candidate pathogenic noncoding variants
(a) Multi-species alignment of a region of the 5′ UTR of EPO in which a variant was identified at a conserved position (red text) in two families with erythrocytosis. (b) Erythrocytosis pedigrees studied, showing affected individuals (shaded grey), those sequenced (red borders), and genotypes of all individuals for whom we had DNA. We had no information about the father of PAR09 (dotted box). (c) Summary of read mapping in an individual with hypoparathyroidism showing evidence for an interstitial insertion-deletion event in which a ~50 kb region of chromosome 2p25.3 (top panel) has been duplicated and inserted into chromosome X, resulting in a 1.4 kb deletion 81.5 kb downstream of SOX3 (bottom panel). Yellow reads: mate maps to chrX; red reads: mate maps to chr2; grey reads: read and mate map to the same chromosome; white reads: read has mapping quality 0. (d) Pedigree showing segregation of the complex variant within the affected pedigree, with PCR validation below. M: mutation; N: normal. Primers 2SPF and XSPR flank the distal breakpoint of the deletion-insertion and are shown in Supplementary Figure 8. Primers XSPF and XSPR detect the normal allele. The mutation was not seen in 150 alleles from 100 unrelated normocalcemic individuals (50 males and 50 females, including N1 and N2, who are shown).
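The figure of 150 alleles from 100 control individuals in panel (d) follows from the locus being on the X chromosome: each male contributes one X allele and each female two. A small Python check, with a hypothetical helper name and assuming a standard XY/XX karyotype for every control:

```python
def x_chromosome_alleles(males, females):
    """Count X-linked alleles, assuming one X per male and two per female."""
    return males * 1 + females * 2


print(x_chromosome_alleles(males=50, females=50))  # 150
```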
Incidental findings with potentially actionable consequences
| Incidental finding condition | Gene | AA change | UK10K | EVS_EA | Comments |
|---|---|---|---|---|---|
| Section I | | | | | |
| | | NM_001943:c.2397T>G:p.Y799* | absent | absent | Stop gain mutation, not previously reported, but mutation class considered pathogenic |
| | | NM_001943:c.2554G>T:p.E852* | absent | absent | As above |
| | | NM_000059:c.7558C>T:p.R2520* | absent | 0.0001 | Stop gain mutation; 5 independent R2520* in affected patients |
| | | NM_000218:c.877C>T:p.R293C | absent | absent | 2 independent reports in literature: i) 4/2500 independent cases from FAMILION cohort referred for LQT genetic testing |
| Section II | | | | | |
| | | NM_000218:c.1189C>T:p.R397W | absent | 0.0006 | 3 independent reports in literature, including 3/2500 independent cases referred for long QT testing |
| | | NM_000540:c.5036G>A:p.R1679H | 0.0006 | 0.0014 | Variant observed in single subject with complication, and positive functional testing |
Variants deemed to be reportable and clinically actionable are listed in Section I of the table. Those for which the evidence was not considered sufficient to be clinically actionable, or which are uncertain, are listed in Section II.
AA: amino acid; VUS: variant of unknown significance; UK10K: frequency in the UK10K twin cohort; EVS: Exome Variant Server; EVS_EA: frequency in European Americans in the EVS; HGMD: Human Gene Mutation Database; UMD: Universal Mutation Database (see URLs).