
100,000 Genomes Pilot on Rare-Disease Diagnosis in Health Care - Preliminary Report.

Damian Smedley1, Katherine R Smith1, Antonio Martin1, Ellen A Thomas1, Ellen M McDonagh1, Valentina Cipriani1, Jamie M Ellingford1, Gavin Arno1, Arianna Tucci1, Jana Vandrovcova1, Georgia Chan1, Hywel J Williams1, Thiloka Ratnaike1, Wei Wei1, Kathleen Stirrups1, Kristina Ibanez1, Loukas Moutsianas1, Matthias Wielscher1, Anna Need1, Michael R Barnes1, Letizia Vestito1, James Buchanan1, Sarah Wordsworth1, Sofie Ashford1, Karola Rehmström1, Emily Li1, Gavin Fuller1, Philip Twiss1, Olivera Spasic-Boskovic1, Sally Halsall1, R Andres Floto1, Kenneth Poole1, Annette Wagner1, Sarju G Mehta1, Mark Gurnell1, Nigel Burrows1, Roger James1, Christopher Penkett1, Eleanor Dewhurst1, Stefan Gräf1, Rutendo Mapeta1, Mary Kasanicki1, Andrea Haworth1, Helen Savage1, Melanie Babcock1, Martin G Reese1, Mark Bale1, Emma Baple1, Christopher Boustred1, Helen Brittain1, Anna de Burca1, Marta Bleda1, Andrew Devereau1, Dina Halai1, Eik Haraldsdottir1, Zerin Hyder1, Dalia Kasperaviciute1, Christine Patch1, Dimitris Polychronopoulos1, Angela Matchan1, Razvan Sultana1, Mina Ryten1, Ana L T Tavares1, Carolyn Tregidgo1, Clare Turnbull1, Matthew Welland1, Suzanne Wood1, Catherine Snow1, Eleanor Williams1, Sarah Leigh1, Rebecca E Foulger1, Louise C Daugherty1, Olivia Niblock1, Ivone U S Leong1, Caroline F Wright1, Jim Davies1, Charles Crichton1, James Welch1, Kerrie Woods1, Lara Abulhoul1, Paul Aurora1, Detlef Bockenhauer1, Alexander Broomfield1, Maureen A Cleary1, Tanya Lam1, Mehul Dattani1, Emma Footitt1, Vijeya Ganesan1, Stephanie Grunewald1, Sandrine Compeyrot-Lacassagne1, Francesco Muntoni1, Clarissa Pilkington1, Rosaline Quinlivan1, Nikhil Thapar1, Colin Wallis1, Lucy R Wedderburn1, Austen Worth1, Teofila Bueser1, Cecilia Compton1, Charu Deshpande1, Hiva Fassihi1, Eshika Haque1, Louise Izatt1, Dragana Josifova1, Shehla Mohammed1, Leema Robert1, Sarah Rose1, Deborah Ruddy1, Robert Sarkany1, Genevieve Say1, Adam C Shaw1, Agata Wolejko1, Bishoy Habib1, Gavin Burns1, Sarah Hunter1, Russell J 
Grocock1, Sean J Humphray1, Peter N Robinson1, Melissa Haendel1, Michael A Simpson1, Siddharth Banka1, Jill Clayton-Smith1, Sofia Douzgou1, Georgina Hall1, Huw B Thomas1, Raymond T O'Keefe1, Michel Michaelides1, Anthony T Moore1, Sam Malka1, Nikolas Pontikos1, Andrew C Browning1, Volker Straub1, Gráinne S Gorman1, Rita Horvath1, Richard Quinton1, Andrew M Schaefer1, Patrick Yu-Wai-Man1, Doug M Turnbull1, Robert McFarland1, Robert W Taylor1, Emer O'Connor1, Janice Yip1, Katrina Newland1, Huw R Morris1, James Polke1, Nicholas W Wood1, Carolyn Campbell1, Carme Camps1, Kate Gibson1, Nils Koelling1, Tracy Lester1, Andrea H Németh1, Claire Palles1, Smita Patel1, Noemi B A Roy1, Arjune Sen1, John Taylor1, Pilar Cacheiro1, Julius O Jacobsen1, Eleanor G Seaby1, Val Davison1, Lyn Chitty1, Angela Douglas1, Kikkeri Naresh1, Dom McMullan1, Sian Ellard1, I Karen Temple1, Andrew D Mumford1, Gill Wilson1, Phil Beales1, Maria Bitner-Glindzicz1, Graeme Black1, John R Bradley1, Paul Brennan1, John Burn1, Patrick F Chinnery1, Perry Elliott1, Frances Flinter1, Henry Houlden1, Melita Irving1, William Newman1, Shamima Rahman1, John A Sayer1, Jenny C Taylor1, Andrew R Webster1, Andrew O M Wilkie1, Willem H Ouwehand1, F Lucy Raymond1, John Chisholm1, Sue Hill1, David Bentley1, Richard H Scott1, Tom Fowler1, Augusto Rendon1, Mark Caulfield1.   

Abstract

BACKGROUND: The U.K. 100,000 Genomes Project is in the process of investigating the role of genome sequencing in patients with undiagnosed rare diseases after usual care and the alignment of this research with health care implementation in the U.K. National Health Service. Other parts of this project focus on patients with cancer and infection.
METHODS: We conducted a pilot study involving 4660 participants from 2183 families, among whom 161 disorders covering a broad spectrum of rare diseases were present. We collected data on clinical features with the use of Human Phenotype Ontology terms, undertook genome sequencing, applied automated variant prioritization on the basis of applied virtual gene panels and phenotypes, and identified novel pathogenic variants through research analysis.
RESULTS: Diagnostic yields varied among family structures and were highest in family trios (both parents and a proband) and families with larger pedigrees. Diagnostic yields were much higher for disorders likely to have a monogenic cause (35%) than for disorders likely to have a complex cause (11%). Diagnostic yields for intellectual disability, hearing disorders, and vision disorders ranged from 40 to 55%. We made genetic diagnoses in 25% of the probands. A total of 14% of the diagnoses were made by means of the combination of research and automated approaches, which was critical for cases in which we found etiologic noncoding, structural, and mitochondrial genome variants and coding variants poorly covered by exome sequencing. Cohortwide burden testing across 57,000 genomes enabled the discovery of three new disease genes and 19 new associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for clinical decision making for the patients or their relatives.
CONCLUSIONS: Our pilot study of genome sequencing in a national health care system showed an increase in diagnostic yield across a range of rare diseases. (Funded by the National Institute for Health Research and others.).
Copyright © 2021 Massachusetts Medical Society.

Year:  2021        PMID: 34758253      PMCID: PMC7613219          DOI: 10.1056/NEJMoa2035790

Source DB:  PubMed          Journal:  N Engl J Med        ISSN: 0028-4793            Impact factor:   176.079


Rare disease is a worldwide healthcare challenge with approximately 10,000 disorders affecting 6% of the population in Western societies.[1,2] Over 80% of rare diseases have a genetic component and these conditions are disabling and expensive to manage. One-third of children with a rare disease die before their fifth birthday.[1] The adoption of next generation sequencing has improved rare disease diagnostic rates over the past decade.[3-5] However, the majority of rare disease patients remain without a molecular diagnosis following standard diagnostic testing.[3-5] To address this, the UK Government launched the 100,000 Genomes Project (100KGP) in 2013 to apply whole genome sequencing (WGS) to rare disease, cancer and infection in national healthcare.[6] To assess the impact of this WGS approach on the genetic diagnosis of rare disease in the UK’s National Health Service (NHS), we carried out a pilot study in which we enrolled families and undertook detailed clinical phenotyping of the proband.[4] We collected electronic health records from all participants in a multi-petabyte research environment.[5] When necessary, we carried out wet bench orthogonal tests and in silico approaches.

Methods

Patients

Following ethical approval, consenting participants (identified by healthcare professionals and researchers) with a broad range of rare diseases and no diagnosis after usual care in the NHS (which ranged from no available test through to approved tests that did not include genome sequencing) were recruited by nine English hospitals and consented through the National Institute for Health Research (NIHR) BioResource for Rare Diseases. To test the broad applicability of genome sequencing, participants were eligible if they had a rare disease (defined in the UK as a disorder affecting 1 in 2000 people or fewer), were likely to have a single-gene or oligogenic aetiology, and had no genomic diagnosis. Data on prior proband testing were collected where possible, including single-gene tests, karyotyping, single nucleotide polymorphism (SNP) arrays, next generation sequencing panels, and exomes. Probands and, where feasible, parents and/or other family members were enrolled by multiple clinical specialties in the NHS. Standardized baseline clinical data were recorded using the Human Phenotype Ontology (HPO)[7] against disease-specific data models[8] and whole blood was drawn for DNA extraction. The participants are followed over their life course using electronic health records (all hospital episodes, registries and cause of death).

Genome Sequencing

Genome sequencing[9] was performed by Illumina Laboratory Sciences, Cambridge, UK, using the Illumina TruSeq DNA PCR-Free sample preparation kit on a HiSeq 2500 sequencer, generating a mean depth of 32× (range 27× to 54×) and greater than 15× for at least 95% of the reference human genome. WGS reads were aligned to the Genome Reference Consortium human genome build 37 (GRCh37) using Isaac Genome Alignment Software. Family-based variant calling of single nucleotide variants (SNVs) and insertions/deletions (indels) for chromosomes 1 to 22, X, and the mitochondrial genome (mean coverage 2814×, range 142× to 16581×) was performed using the Platypus variant caller.[10]

The Diagnostic Pipeline

We constructed an automated analytical pipeline to filter the genome down to rare, segregating and predicted damaging candidate variants in coding regions. To limit the possibility of overlooking or inefficiently prioritizing diagnoses, we focussed initially on virtual gene panels based on both the recruited clinical indication/disease and submitted HPO terms (applied virtual panels). To decide which genes have sufficient evidence of causation to be included in these virtual gene panels, we used our PanelApp software to enable expert, crowd-sourced review and curation of genes with diagnostic-grade evidence for each of our disease categories, e.g. evidence in at least three unrelated families.[11] Loss of function (LoF) or de novo, protein altering variants affecting genes in the applied virtual panels were classified as tier 1, other variant types such as missense variants affecting these genes were classified as tier 2, and all other filtered variants were classified as tier 3 (Figure S1 in the Supplementary Appendix). To further reduce the possibility of missing or inefficiently prioritizing diagnoses, we ran Exomiser,[12] a phenotype-based approach that looks across all genes in the genome for a diagnosis. Exomiser prioritizes rare, segregating, predicted pathogenic variants in genes where the patient phenotypes match previous reference knowledge from human disease or model organism databases. The ontology-driven phenotype matching can detect patients with an atypical profile for a disease. Decision support systems and clinical genetics teams provided by Congenica Ltd and Fabric Genomics[13,14] assisted us in variant prioritization and return of candidate variants to the 13 NHS Genomic Medicine Centres (GMCs). These variants were reviewed by NHS clinical scientists and clinicians using the American College of Medical Genetics and Genomics guidelines and a diagnostic report was issued for each proband.[15] Final clinical outcomes included whether a genetic diagnosis was obtained, the variant(s) involved, whether they explained all or some of the phenotypes, and whether an intervention was deployed. The pilot participants were recruited and sequenced from 2014 to 2016, while the infrastructure to collect, QC, process and return data was being established. Results were returned to the GMCs from May 2016 to April 2019. In our post-pilot phase with an established pipeline, we now return results to the GMCs within 6 weeks of sample collection.
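
The tiering rules described in this section can be illustrated with a short Python sketch. It is not the project's pipeline: the `Variant` record, its field names and the consequence labels are hypothetical, and the sketch assumes each variant has already passed the rarity, segregation and predicted-damage filters.

```python
from dataclasses import dataclass

# Hypothetical consequence labels, chosen for illustration only.
LOSS_OF_FUNCTION = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor"}
PROTEIN_ALTERING = LOSS_OF_FUNCTION | {"missense", "inframe_indel"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    de_novo: bool = False


def assign_tier(variant: Variant, applied_panel_genes: set) -> int:
    """Return tier 1, 2 or 3 for a variant that has already passed filtering."""
    if variant.gene not in applied_panel_genes:
        return 3  # filtered variant outside the applied virtual panels
    if variant.consequence in LOSS_OF_FUNCTION:
        return 1  # loss of function in a panel gene
    if variant.de_novo and variant.consequence in PROTEIN_ALTERING:
        return 1  # de novo, protein-altering variant in a panel gene
    return 2  # other variant types (e.g. inherited missense) in a panel gene


# Hypothetical applied panel and candidate variants.
example_panel = {"TCN2", "CTPS1"}
examples = [
    Variant("TCN2", "frameshift"),
    Variant("CTPS1", "missense"),
    Variant("EBF3", "missense", de_novo=True),
]
for v in examples:
    print(v.gene, v.consequence, "tier", assign_tier(v, example_panel))
```

In this sketch a de novo missense variant in a gene outside the applied panels falls to tier 3, which is the kind of candidate that a genome-wide, phenotype-driven tool such as Exomiser is meant to recover.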

Novel Pathogenic Variants

Researchers investigated coding and non-coding regions for novel diagnoses in genes matching the patients’ phenotypes, including the presence of de novo variants in highly constrained coding regions[16] with 95% confidence. We used a novel methodology for mitochondrial DNA that accounts for heteroplasmy,[17] Genomiser,[18] and ExpansionHunter for simple tandem repeat expansions.[19] Finally, we employed a novel random forest method to analyse Canvas[20] and Manta[21] calls and identify potentially pathogenic copy number and structural variants. Gene-based burden testing to detect enrichment of rare, predicted pathogenic, segregating variants in novel genes in specific disease cohorts relative to controls was performed on the pilot genomes as well as additional genomes from the rest of the 100KGP to increase power (57,002 genomes; see Supplementary Methods). The pilot genomic and clinical data are freely accessible to members of a Genomics England Clinical Interpretation Partnership (GeCIP) domain (https://www.genomicsengland.co.uk/about-gecip/).

Statistical Analysis

Testing was performed using the R (version 3.6.0) and Stata (version 16) statistical packages. Further detail on individual methods is given in the Supplementary Appendix.

Results

We enrolled 4660 participants (2183 probands and 2477 family members) from 161 broad categories across rare disease (Table 1), with neurologic, ophthalmologic and tumor syndromes commonly represented. Participants were recruited with varying numbers of affected and unaffected family members. We aimed, with varying degrees of success, to recruit trios or larger family structures to facilitate more effective variant prioritization. Of the probands with multiple bowel polyps whom we recruited, 93% were singletons. In contrast, 12% of probands with intellectual disability were singletons. Adult probands were more commonly enrolled than pediatric probands (age at recruitment 18 years or younger) (74% vs. 26%), in line with the general population (79% vs. 21%; 2011 census of England and Wales). The preponderance of adults is unusual compared to previous sequencing projects and reflects an eligibility criterion: probands had already undergone usual care, which in many cases involved standard genetic testing (mostly single-gene or panel-based). A lower percentage of female than male probands was recruited across most disease categories, especially among pediatric cases, where the difference was significant (232 female vs. 339 male; P<0.001, based on the expected female proportion of 51% from the 2011 census of England and Wales). The increased susceptibility of males to recessive X-linked conditions may account for this sex bias: over 6% of total diagnoses involved variants on the X chromosome (which represents approximately 5% of the genome). The inferred ancestry of the probands (see Supplementary Appendix) was in line with that expected from the population (86% white, 7.5% Asian, 3.3% black, 2.2% mixed, 1% other; 2011 census of England and Wales). However, significantly more pediatric probands than adult probands were of South Asian ancestry (16% vs. 4%, P<0.001); our results indicated potential consanguinity in 43% of pediatric South Asian probands and 1% of the other pediatric probands (Table 1).
Table 1

Demographics (including inferred ancestry) of the 100,000 Genomes Project pilot.

Variable | All probands (N=2183) | Paediatric (age at recruitment <=18) probands (N=571) | Adult (age at recruitment >18) probands (N=1612)
Sex — no. (%)
Male | 1138 (52) | 339 (16) | 799 (37)
Female | 1045 (48) | 232 (11) | 813 (37)
Total | 2183 (100) | 571 (26) | 1612 (74)
Median (IQR) age in years at recruitment | 35 (18-54) | 9 (5-14) | 45 (31-60)
Race or ethnic group — no. (%), % consanguinity suggested in record
African | 50 (2), 0 | 25 (4), 0 | 25 (2), 0
Admixed American | 26 (1), 23 | 12 (2), 25 | 14 (1), 21
East Asian | 8 (<1), 0 | 2 (<1), 0 | 6 (<1), 0
European | 1931 (88), <1 | 438 (77), <1 | 1493 (93), <1
South Asian | 163 (7), 36 | 93 (16), 43 | 70 (4), 25
Not determined | 5 (<1), 0 | 1 (<1), 0 | 4 (<1), 0
Total | 2183 (100), 3 | 571 (26), 8 | 1612 (74), 2
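
The sex imbalance among pediatric probands (232 female vs. 339 male, against an expected female proportion of 51%) can be checked with a simple calculation. The source does not state which test produced the reported P value, so the sketch below assumes an exact two-sided binomial test and uses only the Python standard library.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials, computed in log space."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)


def two_sided_binomial_p(k: int, n: int, p: float) -> float:
    """Exact two-sided P value: total probability of outcomes no more likely than k."""
    observed = binomial_pmf(k, n, p)
    tolerance = observed * (1 + 1e-7)  # guard against floating-point ties
    total = sum(
        pmf for pmf in (binomial_pmf(i, n, p) for i in range(n + 1))
        if pmf <= tolerance
    )
    return min(1.0, total)


female, male = 232, 339  # pediatric probands, from Table 1
expected_female_proportion = 0.51  # 2011 census of England and Wales
p_value = two_sided_binomial_p(female, female + male, expected_female_proportion)
print(f"observed female proportion: {female / (female + male):.3f}")
print(f"two-sided P value: {p_value:.2e}")
```

Under this assumed test the P value falls well below 0.001, consistent with the significance reported in the text.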

Clinical Data and Sequencing

We collected HPO terms for each participant (median of 4 present terms, range 1-61 and median of 4 absent terms (phenotypes not exhibited by the proband), range 0-144). We then carried out genome sequencing followed by quality assurance to check coverage, sequence quality, presence of repeat sample submissions or sample swaps, and consistency with reported family structures (see Supplementary Appendix).

The Diagnostic Yield

We obtained genetic diagnoses for 25% of probands and deposited the genotypes into the ClinVar repository (accession numbers XXXX to YYYY). Of these diagnoses, 60% were made on the basis of coding SNV/indels in the applied virtual panels, 26% from coding SNV/indels affecting well-established disease genes outside the virtual panels using phenotype-based prioritization and/or expert review by the clinicians, Congenica Ltd, or Fabric Genomics, and 14% from genome-wide, phenotype-agnostic research analysis looking beyond SNV/indels, coding regions, and disease genes in the virtual panels (Figure 1). Following international guidelines[15] a further 10% of probands were classified with variants of unknown significance in genes consistent with the phenotype by clinical review at the site, but with further functional validation required. Fewer candidate variants were returned after filtering in larger family structures (Table 3), making it easier to identify causative variants, in turn leading to higher diagnostic rates for trios, quads and more complex family structures (Figure 2a), even within a disorder e.g. for hereditary ataxia the diagnostic rate increased from 21% for singletons to 32% for trios (Table S4 in the Supplementary Materials).
Figure 1

Overview of the diagnostic and research pipeline and source of diagnoses. Results were returned to the Genomic Medicine Centres (GMCs) of the recruiting hospitals on 2183 pilot probands. Of these, 25% received a positive diagnosis and 10% had variant(s) of unknown significance (VUS) in genes consistent with the phenotype according to clinical geneticists at the recruiting site, but with further functional validation required. The remaining 65% received a negative report at the time but will be reanalysed. The numbers and sources of these positive diagnoses are shown at each stage of the automated diagnostic pipeline and of the additional research undertaken where a clear diagnosis was not immediately obvious.

Table 3

Number of candidate variants returned to the NHS per case by automated virtual panel-based analysis pipeline. Duos refer strictly to parent-child pairs and trios to both parents and a child in a family. Values shown are median (IQR).

Measure | All family structures | Singletons | Duos | Trios | Other family structures
Variants after filtering | 221 (49-288) | 292 (258-327) | 149 (117-213) | 29 (17-136) | 22.5 (9-71)
In virtual panels | 1 (0-2) | 1 (0-2) | 1 (0-3) | 1 (0-2) | 0 (0-1)
Figure 2

Diagnoses in the rare disease pilot.

(a) Percentage diagnostic yield for all samples and sub-divided by family structure or whether likely monogenic (35% yield) vs more complex aetiologies (11% yield) with the numbers of probands shown on bars, (b) Percentage diagnostic yield by disease area (numbers of closed probands shown on bars), (c) Percentage diagnostic yield for probands with/without prior genetics testing and broken down by most extensive testing type: chromosomal (karyotyping, arrayCGH, SNP arrays), targeted single gene tests, NGS panels or WES (numbers of closed probands shown on bars) (d) Performance of virtual panel-based and Exomiser prioritization for identifying the diagnoses. Virtual disease panel only: a single panel for the recruited disease category. Applied panels - all applied virtual panels used in the pipeline including the recruited disease associated panel as well as 0 or more additionally selected panels based on the patient phenotypes (HPO terms). Proportion of diagnoses detected are in blue (sensitivity) along with proportion of prioritized variants leading to a positive diagnosis in orange (positive predictive value). Proportions are also shown on bars. Here, diagnosed variant(s) are true positives and other returned candidate variants are false positives.

Unsurprisingly, we obtained a higher diagnostic yield for diseases that were considered more likely to have a monogenic cause (Table S4 in the Supplementary Appendix) than for those we considered more likely to have a complex etiology (35% vs 11%) (Figure 2a). Likely monogenic diseases were defined as those present in OMIM and for which genetic testing is part of the standard diagnostic workup, based on the consensus blinded review of three clinical geneticists. Diagnostic yield was highly variable by disease (Figure 2b, Table S3 in the Supplementary Appendix), ranging from 40-55% for intellectual disability and various vision and hearing disorders to 6% for tumor syndromes. We obtained data on the presence or absence of prior genetic testing for a subset (1177) of the participants. The number of tests per proband ranged from 0 to 16 with a median of 1 (IQR 0-2), and approximately half of the probands in this subset had been tested at least once. The overall diagnostic uplift from genome sequencing in this subset was 32%, with only a slight difference depending on whether prior testing had been performed (33%) or not (31%). However, many of these prior tests were not recent. The diagnostic yield provided by genome sequencing ranged from 28% to 45% depending on the type of prior testing (Figure 2c, Table S5 in the Supplementary Appendix), which, for the most part, involved targeted single-gene and panel testing (Table S6 in the Supplementary Appendix).

Diagnostic Pipeline

The aim of the automated diagnostic pipeline is to identify a few potentially causative candidate variants from the millions in a whole genome, through removal of extremely unlikely candidates (filtering) and identification of the most likely in the remainder (prioritization). This allows the GMCs to efficiently perform manual, clinical interpretation and issue a diagnostic report. The virtual panel-based pipeline identified 322 (66%) of the 490 SNV/indel-based diagnoses from the genomes, with a high positive predictive value given the millions of variants in the whole genomes: of the 1041 returned candidate variants, 291 (28%) proved to be diagnostic. We re-ran this analysis in December 2019 to assess the impact of using updated versions of the virtual panels containing the latest disease gene discoveries, improved virtual panel selection based on the patient’s phenotype, and advances in variant filtering strategies, e.g. allowing for incomplete penetrance where suspected. This increased the number of genetic diagnoses detected from 322 to 377 (77%) with a positive predictive value of 15% (Figure 2d), demonstrating effective filtering and prioritization of the variants, with a median of only 1 (IQR 0-2) candidate variant in panels returned to the clinicians at the GMCs per case (Table 3). Ongoing evolution of the virtual panels with new disease genes is expected to continue increasing the yield from this approach. Phenotype-based prioritization using Exomiser detected 77%, 86%, and 88% of these diagnoses in the top, top 3 and top 5 ranked candidates, respectively (Figure 2d). Exomiser and the virtual panels were complementary, with 92% of these diagnoses recalled when the two were used in combination (last blue bar in Figure 2d). Precision phenotyping of our patients was essential both for Exomiser and for the selection of additional virtual panels, without which only 54% of these diagnoses would have been prioritized in the recruited disease virtual panel and presented to the GMCs as a likely candidate (first blue bar in Figure 2d).
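
The sensitivity and positive predictive value quoted for the virtual panel-based pipeline follow directly from the reported counts. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative. The 15% positive predictive value of the December 2019 re-run cannot be recomputed this way, because the text does not give the number of candidate variants returned in that re-run.

```python
def proportion(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, rejecting an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Counts reported in the text.
snv_indel_diagnoses = 490      # all SNV/indel-based diagnoses
detected_initial = 322         # diagnoses found by the original panel-based run
detected_rerun = 377           # diagnoses found by the December 2019 re-run
returned_candidates = 1041     # candidate variants returned by the original run
diagnostic_candidates = 291    # returned candidates that proved diagnostic

sensitivity_initial = proportion(detected_initial, snv_indel_diagnoses)
sensitivity_rerun = proportion(detected_rerun, snv_indel_diagnoses)
ppv_initial = proportion(diagnostic_candidates, returned_candidates)

print(f"sensitivity, original run: {sensitivity_initial:.0%}")  # 66%
print(f"sensitivity, re-run:       {sensitivity_rerun:.0%}")    # 77%
print(f"PPV, original run:         {ppv_initial:.0%}")          # 28%
```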

Research-based Diagnoses

Fourteen percent of the genetic diagnoses required research outside the diagnostic pipeline (Figure 1). This research involved comparisons with the genome sequences and clinical data in our research environment, with validation using wet bench orthogonal tests and in silico approaches (Table S7 in the Supplementary Appendix). Additional diagnoses were made by screening for the presence of de novo variants in highly constrained coding regions.[16] These diagnoses included a de novo EBF3 missense variant in a patient with hereditary ataxia. Mitochondrial genome analysis, taking into account heteroplasmy, detected 4 new diagnoses (as well as the 9 that had already been detected by the main pipeline). Twelve probands had intronic splicing variants prioritized by Exomiser due to the known pathogenic status of these variants in ClinVar.[23] Nine novel non-coding diagnoses involving previously undescribed variants required exploration of the whole genome and in vitro functional validation via reverse transcription polymerase chain reaction, mini-gene, or luciferase assays.[24-26] Here, unsolved probands were queried for non-coding variants affecting genes in the applied virtual panels, either alone or in compound heterozygosity with loss-of-function variants. These were identified using either Genomiser or, for retinal disorder probands, systematic analysis of the untranslated regions, promoters or introns. A further 43 probands had phenotypes that were fully or partially explained by structural variants or by simple tandem repeat expansions in the genes HTT or FXN in probands with hereditary spastic paraplegia.

Novel Disease Gene Associations

We performed burden testing to discover novel Mendelian disease gene associations and potential genetic diagnoses for unsolved probands; 828 significant disease-gene associations (q value < 0.1) were identified, including 249 known and 579 novel genes (novel with respect to their association with disease), with only 0.03 ± 0.2 (range 0-3) associations from 10,000 permutations in which cases and controls were assigned randomly. Twenty-two candidates represent the most likely new, fully penetrant, Mendelian disease genes (Table S8 in the Supplementary Appendix and ClinVar accession numbers SCV001759972 - SCV001760540), three of which have recently been independently confirmed: UBAP1 in hereditary spastic paraplegia,[27] FOXJ1 in non-CF bronchiectasis,[28] and SORD in Charcot-Marie-Tooth disease.[29] Diagnostic reports were issued for three probands with these genes (Figure 1) and we are investigating others in GeneMatcher and by functional validation studies in model organisms.
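
The text reports q values for the burden-testing associations but leaves the details to the Supplementary Methods. As a sketch of how q values can be obtained from per-gene P values, the snippet below applies the Benjamini-Hochberg procedure. This is an assumed, commonly used method and not necessarily the one used in the study; the gene names and P values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that q values stay monotonic.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Made-up per-gene burden-test P values for a hypothetical disease cohort.
gene_p_values = {"GENE_A": 0.0001, "GENE_B": 0.03, "GENE_C": 0.04, "GENE_D": 0.6}
q_by_gene = dict(zip(gene_p_values, benjamini_hochberg(list(gene_p_values.values()))))

# Apply the q < 0.1 threshold quoted in the text.
significant = [gene for gene, q in q_by_gene.items() if q < 0.1]
for gene, q in q_by_gene.items():
    print(f"{gene}: q = {q:.4f}")
print("significant at q < 0.1:", significant)
```

A permutation check like the one described in the text would repeat the whole test with case and control labels shuffled and count how many associations still pass the threshold.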

Diagnostic Sequelae

These findings ended long diagnostic odysseys for some patients and their families (the median duration of the odyssey was 75 months and the number of hospital visits was 68; Table S1 in the Supplementary Appendix); we speculate that they will mitigate NHS resource costs (183,273 episodes of hospital care costing £87 million for affected participants; Table S3 in the Supplementary Appendix). In addition, 134 (25%) of the 533 genetic diagnoses were reported by clinicians to be of immediate clinical actionability, with only 11 (2%) described as having no benefit. As of now, the remainder of the diagnoses are of unknown utility. Healthcare benefits included 4 diagnoses leading to a suggested change in medication, 26 suggesting additional surveillance for the proband or relatives, 13 allowing clinical trial eligibility, 59 informing future reproductive choices, and 32 with other benefits (Table S9 in the Supplementary Appendix). In several specific probands, diagnoses have had important clinical actionability. In a 36-year-old male with suspected choroideraemia, we detected a novel CHM promoter variant causing loss of gene expression[26] and offering eligibility for a gene-replacement trial. A male neonate proband presented with severe infection and transient neurologic symptoms immediately after birth and died at 4 months with no diagnosis but healthcare costs of approximately £80,000 (Table S10 in the Supplementary Appendix). A diagnosis of transcobalamin 2 deficiency due to a homozygous frameshift in TCN2 was made from this study, which enabled predictive testing to be offered to the younger brother within one week of birth. The younger child, who received a positive result, received weekly hydroxocobalamin injections to prevent metabolic decompensation. A 10-year-old girl was admitted to intensive care with life-threatening chicken pox. She had endured a diagnostic odyssey over seven years at a total cost of £356,571 across 307 secondary care episodes (Table S11 in the Supplementary Appendix). We were able to diagnose CTPS1 deficiency due to a homozygous, known pathogenic splice acceptor variant. The diagnosis enabled a curative bone marrow transplant (cost £70,000), and predictive testing of her siblings showed no further family members to be at risk. One proband had waited until his sixth decade for a genomic diagnosis of an INF2 mutation causing focal segmental glomerulosclerosis. His father, brother and uncle had all died of renal failure. He had received two kidney transplants, had transmitted the condition to his daughter and was concerned about whether his 15-year-old granddaughter, who was under surveillance, was at risk. After he received his genetic diagnosis, the granddaughter was tested, found to be negative, and discharged from regular medical surveillance.

Discussion

Our findings demonstrate a substantial uplift in genomic diagnoses achieved for patients by genome sequencing across a broad spectrum of rare disease. The enhanced diagnostic benefit was observed regardless of whether participants had undergone prior genetic testing (33% in those who had received testing and 31% in those who had not). For 25% of those who received a genetic diagnosis, there was immediate clinical actionability. Standardizing procedures, from enrolment of patients to the return of NHS-validated results to clinicians, was critical to our success. For example, clinical data collection using disease-specific data models and HPO terms enabled diagnoses, confirming the value of standardization through ontologies and clinical annotation in precision medicine.[30] These additional diagnoses, beyond the 264 (49% of total diagnoses) observed in the single disease virtual panel, came from Exomiser and additional, applied virtual panels. Combining research, decision support, and clinical validation and assessment yielded an additional 72 diagnoses. Diagnostic yield was influenced by family structure, and for disorders with a likely Mendelian inheritance and a single gene etiology our yield increased to 35%: ophthalmological, metabolic and neurologic disorders yielded the greatest percentage of diagnoses. The scale of our dataset enabled cohort-wide burden testing, which identified numerous novel disease–gene associations including three that have now been confirmed and 19 with compelling evidence that are likely to be confirmed in independent datasets. Of the diseases we diagnosed through genome sequencing, 13% were caused by mutations in non-coding sequence or mitochondrial genomes, tandem repeat expansions in Huntington disease, and a wide range of structural variants with nucleotide resolution of breakpoints using a novel random forest method. An additional 2% of diagnoses involved coding variants in regions of low coverage on exome sequencing. Our results provide new evidence of the value of genome sequencing and mirror previous studies where 53% of participants who received new diagnoses from genome sequencing had previously received testing by exome sequencing.[5] Previous studies have demonstrated how next-generation sequencing can reveal diagnoses with yields of between 25% and 29% from exome sequencing in persons who had received no prior genetic testing.[32-34] The Undiagnosed Disease Network reported a 26% yield from a mixture of exome and genome sequence analysis of 382 patients[5] and another genome sequencing study gave a 42% yield in 50 families with intellectual disability in whom prior testing had previously been carried out.[35] We obtained similar results with a broad range of disorders (161) with unmet diagnostic need. Our approach is limited to diagnoses that are readily made through short-read genome sequencing. Fully phased, long-read sequencing better detects structural variation and delivers sequence from parts of the genome that are poorly captured by short-read sequencing.[31] This pilot has underpinned the case for genome sequencing in the diagnosis of certain specific rare diseases in the new NHS National Genomic Test Directory.[36] For patients in the National Health Service with specific disorders, such as intellectual disability, genome sequencing will now be the first-line test (Table S12 in the Supplementary Appendix), and the NHS in England, through a new National Genomic Medicine Service, is in the process of sequencing 500,000 whole genomes in rare disease and cancer in healthcare. We hope our findings will assist other health systems in considering the role of genome sequencing in the care of patients with rare diseases.

Disclosure forms provided by the authors are available with the full text of this article at NEJM.org.
Table 2

Clinical features of the 100,000 Genomes Project pilot

Primary symptoms, no. (%) | All Families | Singletons | Duos | Trios | Larger families
Cardiovascular | 147 (7) | 56 (3) | 24 (1) | 49 (2) | 18 (1)
Ciliopathies | 69 (3) | 34 (2) | 14 (1) | 16 (1) | 5 (<1)
Dermatological | 38 (2) | 9 (<1) | 5 (<1) | 22 (1) | 2 (<1)
Dysmorphic and congenital abnormalities | 20 (1) | 10 (<1) | 2 (<1) | 7 (<1) | 1 (<1)
Endocrine | 87 (4) | 57 (3) | 14 (1) | 12 (1) | 4 (<1)
Gastroenterological | 32 (1) | 18 (1) | 14 (1)
Growth | 3 (<1) | 3 (<1)
Haematological and immunological | 5 (<1) | 2 (<1) | 3 (<1)
Haematological | 7 (<1) | 3 (<1) | 2 (<1) | 2 (<1)
Hearing and ear | 35 (2) | 6 (<1) | 5 (<1) | 17 (1) | 7 (<1)
Metabolic | 93 (4) | 24 (1) | 12 (1) | 48 (2) | 9 (<1)
Intellectual disability (ID) | 130 (6) | 10 (<1) | 24 (1) | 78 (4) | 18 (1)
Neurology and neurodevelopmental (excl. ID) | 521 (24) | 193 (9) | 93 (4) | 194 (9) | 41 (2)
Ophthalmological | 348 (16) | 74 (3) | 62 (3) | 199 (9) | 13 (1)
Renal and urinary tract | 176 (8) | 125 (6) | 21 (1) | 26 (1) | 4 (<1)
Respiratory | 2 (<1) | 1 (<1) | 1 (<1)
Rheumatological | 48 (2) | 14 (1) | 6 (<1) | 25 (1) | 3 (<1)
Skeletal | 62 (3) | 15 (1) | 11 (1) | 23 (1) | 13 (1)
Tumour syndromes | 293 (13) | 231 (11) | 31 (1) | 27 (1) | 4 (<1)
Other | 67 (3) | 17 (1) | 12 (1) | 34 (2) | 4 (<1)
Total | 2183 (100) | 881 (40) | 343 (16) | 797 (37) | 162 (7)
  29 in total

Review 1.  The burden of rare diseases.

Authors:  Carlos R Ferreira
Journal:  Am J Med Genet A       Date:  2019-03-18       Impact factor: 2.802

2.  Single-base substitutions in the CHM promoter as a cause of choroideremia.

Authors:  Alina Radziwon; Gavin Arno; Dianna K Wheaton; Ellen M McDonagh; Emma L Baple; Kaylie Webb-Jones; David G Birch; Andrew R Webster; Ian M MacDonald
Journal:  Hum Mutat       Date:  2017-03-24       Impact factor: 4.878

3.  Molecular findings among patients referred for clinical whole-exome sequencing.

Authors:  Yaping Yang; Donna M Muzny; Fan Xia; Zhiyv Niu; Richard Person; Yan Ding; Patricia Ward; Alicia Braxton; Min Wang; Christian Buhay; Narayanan Veeraraghavan; Alicia Hawes; Theodore Chiang; Magalie Leduc; Joke Beuten; Jing Zhang; Weimin He; Jennifer Scull; Alecia Willis; Megan Landsverk; William J Craigen; Mir Reza Bekheirnia; Asbjorg Stray-Pedersen; Pengfei Liu; Shu Wen; Wendy Alcaraz; Hong Cui; Magdalena Walkiewicz; Jeffrey Reid; Matthew Bainbridge; Ankita Patel; Eric Boerwinkle; Arthur L Beaudet; James R Lupski; Sharon E Plon; Richard A Gibbs; Christine M Eng
Journal:  JAMA       Date:  2014-11-12       Impact factor: 56.272

4.  Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease.

Authors:  Keren J Carss; Gavin Arno; Marie Erwood; Jonathan Stephens; Alba Sanchis-Juan; Sarah Hull; Karyn Megy; Detelina Grozeva; Eleanor Dewhurst; Samantha Malka; Vincent Plagnol; Christopher Penkett; Kathleen Stirrups; Roberta Rizzo; Genevieve Wright; Dragana Josifova; Maria Bitner-Glindzicz; Richard H Scott; Emma Clement; Louise Allen; Ruth Armstrong; Angela F Brady; Jenny Carmichael; Manali Chitre; Robert H H Henderson; Jane Hurst; Robert E MacLaren; Elaine Murphy; Joan Paterson; Elisabeth Rosser; Dorothy A Thompson; Emma Wakeling; Willem H Ouwehand; Michel Michaelides; Anthony T Moore; Andrew R Webster; F Lucy Raymond
Journal:  Am J Hum Genet       Date:  2016-12-29       Impact factor: 11.025

5.  Proband-only medical exome sequencing as a cost-effective first-tier genetic diagnostic test for patients without prior molecular tests and clinical diagnosis in a developing country: the China experience.

Authors:  Xuyun Hu; Niu Li; Yufei Xu; Guoqiang Li; Tingting Yu; Ru-En Yao; Lijun Fu; Jiwen Wang; Lei Yin; Yong Yin; Ying Wang; Xingming Jin; Xiumin Wang; Jian Wang; Yiping Shen
Journal:  Genet Med       Date:  2017-11-02       Impact factor: 8.822

6.  Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.

Authors:  Sue Richards; Nazneen Aziz; Sherri Bale; David Bick; Soma Das; Julie Gastier-Foster; Wayne W Grody; Madhuri Hegde; Elaine Lyon; Elaine Spector; Karl Voelkerding; Heidi L Rehm
Journal:  Genet Med       Date:  2015-03-05       Impact factor: 8.822

7.  A clinical utility study of exome sequencing versus conventional genetic testing in pediatric neurology.

Authors:  Lisenka E L M Vissers; Kirsten J M van Nimwegen; Jolanda H Schieving; Erik-Jan Kamsteeg; Tjitske Kleefstra; Helger G Yntema; Rolph Pfundt; Gert Jan van der Wilt; Lotte Krabbenborg; Han G Brunner; Simone van der Burg; Janneke Grutters; Joris A Veltman; Michèl A A P Willemsen
Journal:  Genet Med       Date:  2017-03-23       Impact factor: 8.822

8.  Detection of long repeat expansions from PCR-free whole-genome sequence data.

Authors:  Egor Dolzhenko; Joke J F A van Vugt; Richard J Shaw; Mitchell A Bekritsky; Marka van Blitterswijk; Giuseppe Narzisi; Subramanian S Ajay; Vani Rajan; Bryan R Lajoie; Nathan H Johnson; Zoya Kingsbury; Sean J Humphray; Raymond D Schellevis; William J Brands; Matt Baker; Rosa Rademakers; Maarten Kooyman; Gijs H P Tazelaar; Michael A van Es; Russell McLaughlin; William Sproviero; Aleksey Shatunov; Ashley Jones; Ahmad Al Khleifat; Alan Pittman; Sarah Morgan; Orla Hardiman; Ammar Al-Chalabi; Chris Shaw; Bradley Smith; Edmund J Neo; Karen Morrison; Pamela J Shaw; Catherine Reeves; Lara Winterkorn; Nancy S Wexler; David E Housman; Christopher W Ng; Alina L Li; Ryan J Taft; Leonard H van den Berg; David R Bentley; Jan H Veldink; Michael A Eberle
Journal:  Genome Res       Date:  2017-09-08       Impact factor: 9.438

9.  Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications.

Authors:  Andy Rimmer; Hang Phan; Iain Mathieson; Zamin Iqbal; Stephen R F Twigg; Andrew O M Wilkie; Gil McVean; Gerton Lunter
Journal:  Nat Genet       Date:  2014-07-13       Impact factor: 38.330

10.  Accurate whole human genome sequencing using reversible terminator chemistry.

Authors:  David R Bentley; Shankar Balasubramanian; Harold P Swerdlow; Geoffrey P Smith; John Milton; Clive G Brown; Kevin P Hall; Dirk J Evers; Colin L Barnes; Helen R Bignell; Jonathan M Boutell; Jason Bryant; Richard J Carter; R Keira Cheetham; Anthony J Cox; Darren J Ellis; Michael R Flatbush; Niall A Gormley; Sean J Humphray; Leslie J Irving; Mirian S Karbelashvili; Scott M Kirk; Heng Li; Xiaohai Liu; Klaus S Maisinger; Lisa J Murray; Bojan Obradovic; Tobias Ost; Michael L Parkinson; Mark R Pratt; Isabelle M J Rasolonjatovo; Mark T Reed; Roberto Rigatti; Chiara Rodighiero; Mark T Ross; Andrea Sabot; Subramanian V Sankar; Aylwyn Scally; Gary P Schroth; Mark E Smith; Vincent P Smith; Anastassia Spiridou; Peta E Torrance; Svilen S Tzonev; Eric H Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D Alam; Carole Anastasi; Ify C Aniebo; David M D Bailey; Iain R Bancarz; Saibal Banerjee; Selena G Barbour; Primo A Baybayan; Vincent A Benoit; Kevin F Benson; Claire Bevis; Phillip J Black; Asha Boodhun; Joe S Brennan; John A Bridgham; Rob C Brown; Andrew A Brown; Dale H Buermann; Abass A Bundu; James C Burrows; Nigel P Carter; Nestor Castillo; Maria Chiara E Catenazzi; Simon Chang; R Neil Cooley; Natasha R Crake; Olubunmi O Dada; Konstantinos D Diakoumakos; Belen Dominguez-Fernandez; David J Earnshaw; Ugonna C Egbujor; David W Elmore; Sergey S Etchin; Mark R Ewan; Milan Fedurco; Louise J Fraser; Karin V Fuentes Fajardo; W Scott Furey; David George; Kimberley J Gietzen; Colin P Goddard; George S Golda; Philip A Granieri; David E Green; David L Gustafson; Nancy F Hansen; Kevin Harnish; Christian D Haudenschild; Narinder I Heyer; Matthew M Hims; Johnny T Ho; Adrian M Horgan; Katya Hoschler; Steve Hurwitz; Denis V Ivanov; Maria Q Johnson; Terena James; T A Huw Jones; Gyoung-Dong Kang; Tzvetana H Kerelska; Alan D Kersey; Irina Khrebtukova; Alex P Kindwall; Zoya Kingsbury; Paula I Kokko-Gonzales; Anil Kumar; Marc A Laurent; Cynthia T Lawley; Sarah E Lee; Xavier Lee; 
Arnold K Liao; Jennifer A Loch; Mitch Lok; Shujun Luo; Radhika M Mammen; John W Martin; Patrick G McCauley; Paul McNitt; Parul Mehta; Keith W Moon; Joe W Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M Novo; Michael J O'Neill; Mark A Osborne; Andrew Osnowski; Omead Ostadan; Lambros L Paraschos; Lea Pickering; Andrew C Pike; Alger C Pike; D Chris Pinkard; Daniel P Pliskin; Joe Podhasky; Victor J Quijano; Come Raczy; Vicki H Rae; Stephen R Rawlings; Ana Chiva Rodriguez; Phyllida M Roe; John Rogers; Maria C Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K Roth; Natalie J Rourke; Silke T Ruediger; Eli Rusman; Raquel M Sanches-Kuiper; Martin R Schenker; Josefina M Seoane; Richard J Shaw; Mitch K Shiver; Steven W Short; Ning L Sizto; Johannes P Sluis; Melanie A Smith; Jean Ernest Sohna Sohna; Eric J Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L Tregidgo; Gerardo Turcatti; Stephanie Vandevondele; Yuli Verhovsky; Selene M Virk; Suzanne Wakelin; Gregory C Walcott; Jingwen Wang; Graham J Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C Mullikin; Matthew E Hurles; Nick J McCooke; John S West; Frank L Oaks; Peter L Lundberg; David Klenerman; Richard Durbin; Anthony J Smith
Journal:  Nature       Date:  2008-11-06       Impact factor: 49.962
