Caroline F Wright, Tomas W Fitzgerald, Wendy D Jones, Stephen Clayton, Jeremy F McRae, Margriet van Kogelenberg, Daniel A King, Kirsty Ambridge, Daniel M Barrett, Tanya Bayzetinova, A Paul Bevan, Eugene Bragin, Eleni A Chatzimichali, Susan Gribble, Philip Jones, Netravathi Krishnappa, Laura E Mason, Ray Miller, Katherine I Morley, Vijaya Parthiban, Elena Prigmore, Diana Rajan, Alejandro Sifrim, G Jawahar Swaminathan, Adrian R Tivey, Anna Middleton, Michael Parker, Nigel P Carter, Jeffrey C Barrett, Matthew E Hurles, David R FitzPatrick, Helen V Firth.
Abstract
BACKGROUND: Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount.
Year: 2014 PMID: 25529582 PMCID: PMC4392068 DOI: 10.1016/S0140-6736(14)61705-0
Source DB: PubMed Journal: Lancet ISSN: 0140-6736 Impact factor: 79.321
Figure 1: Study workflow
SNV=single nucleotide variant. Indel=insertion or deletion. CNV=copy number variant. UPD=uniparental disomy.
Figure 2: Variant filtering logic for clinical reporting within the study
Genomic variants were filtered on the basis of six factors, of which the first five were automated and the final one was done manually: (1) frequency, the prevalence of the variant in the general population (MAF ≤1%); (2) function, the most severe predicted functional consequence, such as LOF, defined by specific sequence ontology terms (transcript ablation, splice donor variant, splice acceptor variant, stop-gained, frameshift variant, stop-lost, initiator codon variant, in-frame insertion, in-frame deletion, missense variant, transcript amplification, and coding sequence variant); (3) location, the genomic location compared with the DDG2P list of published genes; (4) variant type, the genotype (eg, heterozygous or homozygous) and loss or gain for small CNVs (losses were only considered when they contained entire genes in which LOF or dominant negative mutations had been previously reported, and gains were only considered when they overlapped genes in which increased gene dosage mutations had been previously reported); (5) inheritance, with the aspects of the pipeline that depend on inheritance information derived from parental data shaded in the figure; and (6) phenotype, for which the patient phenotype was manually compared against published phenotypes for a particular gene. MAF=minor allele frequency. CNV=copy number variant. LOF=loss of function. DDG2P=Developmental Disorders Genotype-to-Phenotype database.
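The automated part of this filtering logic can be sketched in Python. This is a minimal illustration under stated assumptions, not the study's pipeline: the `Variant` record, the `"het"`/`"hom"` genotype encoding, the `ddg2p_modes` mapping, and the gene names are all hypothetical, and the unphased "two heterozygous variants in one gene" rule is only a stand-in for compound heterozygosity. CNV handling, inheritance filtering (factor 5), and manual phenotype review (factor 6) are left out.

```python
from collections import Counter
from dataclasses import dataclass

# Sequence ontology terms listed in the Figure 2 caption
# (the underscore spelling is an assumption of this sketch).
REPORTABLE_CONSEQUENCES = {
    "transcript_ablation", "splice_donor_variant", "splice_acceptor_variant",
    "stop_gained", "frameshift_variant", "stop_lost", "initiator_codon_variant",
    "inframe_insertion", "inframe_deletion", "missense_variant",
    "transcript_amplification", "coding_sequence_variant",
}
MAX_MAF = 0.01  # factor 1: minor allele frequency <= 1%


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one SNV or indel call in a proband."""
    gene: str
    maf: float         # population minor allele frequency
    consequence: str   # most severe predicted consequence
    genotype: str      # "het" or "hom" (hypothetical encoding)


def flag_variants(variants, ddg2p_modes):
    """Apply factors 1-4 to SNVs and indels.

    ddg2p_modes maps gene symbol -> "monoallelic" or "biallelic"
    (a hypothetical simplification of the gene list annotations).
    """
    # Factors 1-3: rare, functional, and located in a listed gene.
    candidates = [
        v for v in variants
        if v.maf <= MAX_MAF
        and v.consequence in REPORTABLE_CONSEQUENCES
        and v.gene in ddg2p_modes
    ]
    # Factor 4: the genotype must fit the gene's mode of inheritance.
    het_counts = Counter(v.gene for v in candidates if v.genotype == "het")
    flagged = []
    for v in candidates:
        if ddg2p_modes[v.gene] == "monoallelic":
            flagged.append(v)
        elif v.genotype == "hom" or het_counts[v.gene] >= 2:
            # Unphased proxy for compound heterozygosity (assumption).
            flagged.append(v)
    return flagged


modes = {"GENE_A": "monoallelic", "GENE_B": "biallelic"}
calls = [
    Variant("GENE_A", 0.0001, "missense_variant", "het"),   # passes all four factors
    Variant("GENE_A", 0.05, "stop_gained", "het"),          # too common
    Variant("GENE_B", 0.001, "frameshift_variant", "het"),  # lone het in a biallelic gene
    Variant("GENE_C", 0.0, "stop_gained", "het"),           # gene not in the list
]
flagged = flag_variants(calls, modes)
```

Keeping the cheap per-variant checks (frequency, consequence, gene membership) ahead of the per-gene genotype check means the heterozygote count is taken only over variants that could be reported.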
Changes to DDG2P over time
| Total reportable genes | 819 | 875 | 1075 | 1128 |
| Genes added | .. | 60 | 201 | 60 |
| Genes removed | .. | 4 | 1 | 7 |
In addition to genes being added or removed, annotations for existing genes can also change (eg, to include multiple modes or mechanisms). The November, 2013 version was used for the analysis presented here and includes 1128 reportable genes. DDG2P=Developmental Disorders Genotype-to-Phenotype database.
DDG2P also contains non-reportable categories when there is insufficient evidence associating a gene and developmental disorder (appendix 1).
The selection of variants for reporting is based on the strongest available evidence of gene function and no variants yet reported have been retracted because of changes in the DDG2P list.
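The gene counts in the table can be checked for internal consistency: each total should equal the previous total plus genes added minus genes removed. The short Python sketch below does that roll-forward using only the numbers printed in the table; the variable and function names are illustrative.

```python
# Values copied from the "Changes to DDG2P over time" table.
totals = [819, 875, 1075, 1128]   # total reportable genes per version
added = [60, 201, 60]             # genes added relative to the previous version
removed = [4, 1, 7]               # genes removed relative to the previous version


def roll_forward(start, added, removed):
    """Recompute each version's total from the first total and the changes."""
    running = start
    computed = []
    for gained, lost in zip(added, removed):
        running += gained - lost
        computed.append(running)
    return computed


computed = roll_forward(totals[0], added, removed)
consistent = computed == totals[1:]
```

With these inputs the recomputed totals match the printed ones, ending at the 1128 reportable genes of the November, 2013 version.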
Figure 3: Representation of phenotypic diversity in cohort
Our patient cohort represents children with a wide range of severe undiagnosed developmental disorders ascertained clinically across the UK.
Figure 4: Analysis of flagged variants in all 1133 children excluding (red) and including (blue) filtering on the basis of parental genotypes and affected status (using the November, 2013 version of DDG2P)
(A) Histogram of the number of flagged single nucleotide variants and insertion-deletions in 1133 children with and without parental data. (B) Mean number of flagged variants per child with and without parental data for families where neither, one, or both parents are affected by a developmental phenotype, subdivided by DDG2P genetic mechanism. Note that compound heterozygous variants are counted once. Red=proband-only analysis. Blue=family-trio analysis with parental genotype data. Filled=autosomal dominant DDG2P genes. Vertical stripes=autosomal recessive DDG2P genes. Horizontal stripes=X-linked DDG2P genes. DDG2P=Developmental Disorders Genotype-to-Phenotype database.
Total number of single nucleotide variants and insertion-deletions flagged by the clinical reporting workflow in all 1133 probands (in the November, 2013 version of DDG2P) compared with the number of variants that would have been flagged in the same probands in the absence of parental data
| | Inherited (with parental data) | De novo (with parental data) | Without parental data |
|---|---|---|---|
| Autosomal dominant | 473 | 193 | 9529 |
| Autosomal recessive (homozygotes) | 83 | 0 | 191 |
| Autosomal recessive (compound heterozygotes) | 341 | 6 | 1071 |
| X-linked dominant | 38 | 21 | 269 |
| X-linked recessive | 322 | 15 | 387 |
| Total | 1257 | 235 | 11 447 |
Inherited variants in autosomal dominant DDG2P genes account for most of the difference; in these genes, only de novo variants and those inherited from an affected parent are likely to be of clinical interest. Around 90% of flagged variants were predicted to be missense point mutations. DDG2P=Developmental Disorders Genotype-to-Phenotype database.
Before secondary validation by targeted Sanger sequencing.
Two or more likely pathogenic variants in different copies of the same gene, counted once per compound variant.
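The effect of parental data can be made concrete with a little arithmetic on the table above. The Python sketch below uses only the printed counts and the cohort size of 1133 probands; the derived quantities (fold reduction, per-child means, and the share of the difference attributable to autosomal dominant genes) are computed here for illustration and are not figures quoted in the text.

```python
N_PROBANDS = 1133

# (inherited, de novo, flagged without parental data), copied from the table.
flagged = {
    "Autosomal dominant": (473, 193, 9529),
    "Autosomal recessive (homozygotes)": (83, 0, 191),
    "Autosomal recessive (compound heterozygotes)": (341, 6, 1071),
    "X-linked dominant": (38, 21, 269),
    "X-linked recessive": (322, 15, 387),
}

# Totals with parental data (inherited + de novo) and without.
trio_total = sum(inh + dn for inh, dn, _ in flagged.values())
proband_only_total = sum(alone for _, _, alone in flagged.values())

# How many times more variants are flagged when parents are unavailable.
fold_reduction = proband_only_total / trio_total

# Mean number of flagged variants per child under each analysis.
per_child_trio = trio_total / N_PROBANDS
per_child_proband_only = proband_only_total / N_PROBANDS

# Share of the overall difference that comes from autosomal dominant genes.
ad_inh, ad_dn, ad_alone = flagged["Autosomal dominant"]
ad_share = (ad_alone - (ad_inh + ad_dn)) / (proband_only_total - trio_total)
```

On these counts, the trio analysis flags roughly one variant per child against about ten without parental data, and close to nine-tenths of the gap arises in autosomal dominant genes.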
Likely diagnoses in the first 1133 families
| | Flagged variants (SNVs and indels, CNVs) | Likely diagnoses | Predictive value | Diagnostic yield |
|---|---|---|---|---|
| Autosomal dominant | | | | |
| De novo | 242 (193, 49) | 184 | 75% | 16% |
| Inherited | 528 (473, 55) | 27 | 5% | 2% |
| Autosomal recessive | | | | |
| De novo | 6 (6, 0) | 0 | .. | .. |
| Inherited | 425 (424, 1) | 52 | 13% | 5% |
| X-linked | | | | |
| De novo | 41 (36, 5) | 31 | 75% | 3% |
| Inherited | 371 (360, 11) | 23 | 6% | 2% |
| Uncertain inheritance | 83 (0, 83) | 0 | .. | .. |
| Chromosomal events | | | | |
| Uniparental disomy | .. | 6 | .. | 0·5% |
| Mosaicism | .. | 5 | .. | 0·5% |
| Total | 1696 | 328 | 19% | 27% |
See appendix 2 for details of individual genes and phenotype classes. The predictive value is the probability that a flagged variant was reported as likely diagnostic (reported or reviewed), and the diagnostic yield is the contribution of that type of variant to the overall diagnostic yield. Note that three pairs of siblings and two pairs of monozygotic twins received the same diagnosis. Only 14% of reported variants were present in the public Human Gene Mutation Database or Leiden Open Variation Database; 84% of flagged variants present in these databases were not reported, because they did not appear to be relevant to the child's phenotype. SNV=single nucleotide variant. CNV=copy number variant.
17 probands received two contributory pathogenic variants.
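The two ratios defined in the footnote can be recomputed from the counts in the table, taking the first count in each row as flagged variants and the second as likely diagnoses. The Python sketch below is illustrative only. Its values agree with the printed percentages to within about one percentage point, and the small differences may reflect rounding or counting details not visible in this excerpt. Subtracting the 17 probands with two contributory variants before computing the overall yield is an assumption that happens to reproduce the printed 27%.

```python
N_FAMILIES = 1133

# (flagged variants, likely diagnoses), copied from the table.
rows = {
    "Autosomal dominant, de novo": (242, 184),
    "Autosomal dominant, inherited": (528, 27),
    "Autosomal recessive, de novo": (6, 0),
    "Autosomal recessive, inherited": (425, 52),
    "X-linked, de novo": (41, 31),
    "X-linked, inherited": (371, 23),
    "Uncertain inheritance": (83, 0),
}
chromosomal_diagnoses = 6 + 5  # uniparental disomy + mosaicism


def predictive_value(flagged, diagnoses):
    """Probability that a flagged variant was reported as likely diagnostic."""
    return diagnoses / flagged


def diagnostic_yield(diagnoses, n_families=N_FAMILIES):
    """Contribution of one variant class to the cohort-wide yield."""
    return diagnoses / n_families


summary = {
    label: (predictive_value(f, d), diagnostic_yield(d))
    for label, (f, d) in rows.items()
}

total_flagged = sum(f for f, _ in rows.values())
total_diagnoses = sum(d for _, d in rows.values()) + chromosomal_diagnoses
overall_predictive_value = total_diagnoses / total_flagged

# Assumption: count each proband once, so remove one variant for each of the
# 17 probands who received two contributory pathogenic variants.
PROBANDS_WITH_TWO_VARIANTS = 17
overall_yield = (total_diagnoses - PROBANDS_WITH_TWO_VARIANTS) / N_FAMILIES
```

The contrast between de novo and inherited classes is the point of the calculation: about three-quarters of flagged de novo variants in dominant and X-linked genes were reported, against a few percent of inherited ones.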
Figure 5: Genetic diagnoses associated with broad phenotype categories
Circos-style plot representing the genetic heterogeneity within developmental disorders. The plot shows individual diagnoses in known Developmental Disorders Genotype-to-Phenotype database genes and links the genomic location of each gene with some key phenotypes in each child. Phenotypes are listed outside the widest arc of the circle, chromosome numbers are indicated outside the smaller arc, and individual gene names are listed inside. Links are coloured by phenotype group. See appendix 2 for details of the diagnoses. ID=intellectual disability. CHD=congenital heart defect. ASD=autism spectrum disorders. Deaf=hearing impairment. Cleft=oral cleft. VI=visual impairment. MC=microcephalic dwarfism. PD=polydactyly.