Abstract
Despite three decades of successful, predominantly phenotype-driven discovery of the genetic causes of monogenic disorders, up to half of children with severe developmental disorders of probable genetic origin remain without a genetic diagnosis. Particularly challenging are those disorders rare enough to have eluded recognition as a discrete clinical entity, those with highly variable clinical manifestations, and those that are difficult to distinguish from other, very similar, disorders. Here we demonstrate the power of using an unbiased genotype-driven approach to identify subsets of patients with similar disorders. By studying 1,133 children with severe, undiagnosed developmental disorders, and their parents, using a combination of exome sequencing and array-based detection of chromosomal rearrangements, we discovered 12 novel genes associated with developmental disorders. These newly implicated genes increase by 10% (from 28% to 31%) the proportion of children that could be diagnosed. Clustering of missense mutations in six of these newly implicated genes suggests that normal development is being perturbed by an activating or dominant-negative mechanism. Our findings demonstrate the value of adopting a comprehensive strategy, both genome-wide and nationwide, to elucidate the underlying causes of rare genetic disorders.
Year: 2014 PMID: 25533962 PMCID: PMC5955210 DOI: 10.1038/nature14135
Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 69.504
EDF1 Characteristics of the families
A. Gestation Adjusted Decimal Age at Last Clinical Assessment. Histogram showing the distribution of the gestation adjusted decimal age at last clinical assessment across the 1133 probands. The dashed red line shows the median age. B. Frequency of HPO Term Usage. Bar plot showing, for each used HPO term, the number of times it was observed across the 1133 proband patient records. C. Projection PCA plot of the 1133 probands. PCA plot of 1133 DDD probands projected onto a PCA analysis using 4 different HapMap populations from the 1000 genomes project. Black: African, Red: European, Green: East Asian, Blue: South Asian and the 1133 DDD probands are represented by orange triangles. D. Self Declared and Genetically Defined Consanguinity. Overlaid histogram showing the distribution of kinship coefficients from KING comparing parental samples for each trio. Green: Trios where consanguinity was not entered in the patient record on DECIPHER. Red: Trios where consanguinity was declared in the patient record on DECIPHER.
EDF2 Number of Validated de novo SNVs and indels per Proband
Bar plot showing the distribution of the observed number of validated SNVs and indels per proband sample, and the expected distribution assuming a Poisson distribution with the same mean as the observed distribution.
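The expected distribution described in this caption can be sketched as follows. This is a minimal illustration, assuming only that the expected bar heights come from a Poisson distribution whose mean is set to the observed mean; the function name `poisson_expected` and the toy counts are hypothetical and are not DDD data.

```python
import math
from collections import Counter

def poisson_expected(counts):
    """Expected number of probands carrying k events, for k = 0..max(counts),
    under a Poisson distribution with the same mean as the observed counts."""
    n = len(counts)
    mean = sum(counts) / n
    return {
        k: n * math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(max(counts) + 1)
    }

# Hypothetical per-proband counts, for illustration only (not DDD data).
toy_counts = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]
observed = Counter(toy_counts)
expected = poisson_expected(toy_counts)

# Print observed and expected counts side by side, one row per value of k.
for k in sorted(expected):
    print(k, observed.get(k, 0), round(expected[k], 2))
```

Comparing the two columns bar by bar is the same comparison the plot makes visually.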
Breakdown of diagnoses by mode and by sex
| Mode | Female (%) | Male (%) | Total (%) |
|---|---|---|---|
| De novo mutation | 124 (22.5%) | 80 (13.7%) | 204 (18.0%) |
| Autosomal Dominant | 9 (1.6%) | 11 (1.9%) | 20 (1.8%) |
| Autosomal Recessive | 20 (3.6%) | 26 (4.5%) | 46 (4.1%) |
| X-linked Inherited | 1 (0.2%) | 19 (3.3%) | 20 (1.8%) |
| UPD/Mosaicism | 4 (0.7%) | 6 (1.0%) | 10 (0.9%) |
| Composite | 9 (1.6%) | 8 (1.4%) | 17 (1.5%) |
Inherited from an affected parent
EDF3 Number of Diagnoses per Gene
Histogram showing the number of diagnoses per gene for genes with at least two diagnoses from different proband samples.
EDF4 Burden of Large CNVs in 1133 DDD Proband Samples
Plot comparing the frequency of rare CNVs in three sample groups against CNV size. The Y-axis is on a log scale. Red: DDD probands who have not had previous microarray-based genetic testing; Purple: DDD probands who have had negative previous microarray-based genetic testing; Green: DDD controls.
EDF5 Expected and observed numbers of de novo mutations
The expected and observed numbers of mutations of different functional consequences in three mutually exclusive sets of genes are shown, along with the p value from an assessment of a statistical excess of observed mutations. The three classes of genes are described in the main text.
Figure 1 Excess of recurrently mutated genes
Each panel shows the observed number of recurrently mutated genes (diamond) and the distribution of the number of recurrently mutated genes in 10,000 simulations (box indicates interquartile range, whiskers indicate 95% confidence interval) under a model of no gene-specific enrichment of mutations: a. all protein-altering mutations in all DDD children and undiagnosed DDD children, b. all LoF mutations in all DDD children and undiagnosed DDD children. Each diamond is annotated with the median excess of recurrently mutated genes, with 95% confidence intervals in brackets. P value of observed excess is <0.0001 for all four tests.
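The simulation logic in this caption can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation. It assumes the null model places each mutation in a gene with probability proportional to a per-gene weight (for example, a mutation rate), and that a gene is recurrently mutated when it carries at least two mutations. The function names, the uniform weights, and the toy sizes are all hypothetical.

```python
import random
from collections import Counter

def count_recurrent(gene_hits):
    """Number of genes hit by two or more mutations."""
    return sum(1 for c in Counter(gene_hits).values() if c >= 2)

def simulate_null(weights, n_mutations, n_sims, seed=0):
    """Recurrent-gene counts from n_sims simulated datasets, each placing
    n_mutations into genes with probability proportional to `weights`."""
    rng = random.Random(seed)
    genes = range(len(weights))
    return [
        count_recurrent(rng.choices(genes, weights=weights, k=n_mutations))
        for _ in range(n_sims)
    ]

def empirical_p(observed, null_counts):
    """Fraction of simulations with at least as many recurrent genes as observed."""
    return sum(c >= observed for c in null_counts) / len(null_counts)

# Toy setup (hypothetical): 1,000 equally mutable genes, 50 mutations.
null = simulate_null([1.0] * 1000, n_mutations=50, n_sims=1000)
print(empirical_p(10, null))
```

With 10,000 simulations, an observed count that no simulated dataset reaches can only be reported as p < 0.0001, which matches the form of the p value quoted in the caption.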
Figure 2 Gene-specific significance of enrichment for DNMs
The –log10(p) value of testing for mutation enrichment is plotted only for each gene with at least one mutation in DDD children. On the X-axis is the p value of the most significant test in the DDD dataset, and on the Y-axis is the minimal p value from the significance testing in the meta-analysis dataset. Red indicates genes already known to be associated with DDs (in DDG2P). Only genes with a p value of less than 0.05/18,272 (red lines) are labeled.
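The significance threshold in this caption is a Bonferroni-style correction: 0.05 divided by the 18,272 genes tested. The short sketch below computes the threshold and where it falls on a –log10(p) axis; the helper name `is_genome_wide_significant` is hypothetical.

```python
import math

N_GENES = 18_272  # number of genes tested, as stated in the caption
ALPHA = 0.05

threshold = ALPHA / N_GENES              # per-gene significance threshold
line_position = -math.log10(threshold)   # position of the threshold on a -log10(p) axis

def is_genome_wide_significant(p_value, threshold=threshold):
    """True if a gene's p value passes the corrected threshold."""
    return p_value < threshold

print(f"threshold = {threshold:.3e}, -log10 = {line_position:.2f}")
```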
Novel genes with compelling evidence for a role in DD
| Evidence | Gene | Patients with mutations (DDD) | Patients with mutations (meta-analysis) | P Value | Test | Mutation Clustering | Predicted Haploinsufficiency |
|---|---|---|---|---|---|---|---|
| | | 3 (3,0) | 5 (5,0) | 4.10E-12 | Meta | Yes | 14.7% |
| | | 4 (4,0) | 5 (5,0) | 6.01E-12 | DDD | Yes | 19.7% |
| | | 4 (0,4) | 5 (0,5) | 4.59E-11 | Meta | No | 9.8% |
| | | 2 (0,2) | 5 (0,5) | 4.31E-10 | Meta | No | 30.0% |
| | | 3 (3,0) | 3 (3,0) | 2.03E-08 | DDD | Yes | 23.5% |
| | | 4 (3,1) | 5 (3,2) | 2.26E-07 | DDD | No | 12.7% |
| | | 2 (0,2) | 3 (0,3) | 4.58E-07 | Meta | No | 52.9% |
| | | 3 (3,0) | 4 (3,1) | 1.03E-06 | DDD | Yes | 0.6% |
| | | 3 (1,2) | 3 (1,2) | 1.14E-06 | DDD | No | 9.4% |
| | | 3 (3,0) | 5 (5,0) | 1.43E-06 | Meta | No | 13.5% |
| | | 2 (2,0) | 7 (7,0) | 5.16E-06 | Meta | Yes | 25.7% |
| | | 2 (2,0) | 2 (2,0) | 1.08E-05 | DDD | Yes | 37.7% |
The table summarises the 12 genes with compelling evidence to be novel DD genes. The number of unrelated patients with independent functional or LoF mutations in the DDD cohort or the wider meta-analysis dataset including DDD patients is listed. The p value reported is the minimum p value from the testing of the DDD dataset and the meta-analysis dataset. The dataset that gave this minimal p value is also reported. Mutations are considered to be clustered if the p value of clustering of functional SNVs is less than 0.01. Predicted haploinsufficiency is reported as a percentile of all genes in the genome, with ~0% being highly likely to be haploinsufficient and 100% very unlikely to be haploinsufficient, based on the prediction score described in Huang et al 26 updated to enable predictions for a higher fraction of genes in the genome. During submission, a paper was published online describing a novel DD caused by mutations in ADNP 27.
Novel genes with suggestive evidence for a role in DD
Six genes with suggestive evidence to be novel DD genes. The number of unrelated patients with independent functional or LoF mutations in the DDD cohort or the wider meta-analysis dataset including DDD patients is listed. The p value reported is the minimum p value from the testing of the DDD dataset and the meta-analysis dataset. The dataset that gave this minimal p value is also reported. Mutations are considered to be clustered if the p value of clustering of functional SNVs is less than 0.01. Predicted haploinsufficiency is reported as a percentile of all genes in the genome, with ~0% being highly likely to be haploinsufficient and 100% very unlikely to be haploinsufficient, based on the prediction score described in Huang et al 26 updated to enable predictions for a higher fraction of genes in the genome. NAA10 is already known to cause an X-linked recessive DD in males, but here we identified missense mutations in females, suggesting a different, X-linked dominant, disorder.
| Evidence | Gene | Patients with mutations (DDD) | Patients with mutations (meta-analysis) | P Value | Test | Mutation Clustering | Predicted Haploinsufficiency |
|---|---|---|---|---|---|---|---|
| | | 1 (0,1) | 3 (0,3) | 1.64E-06 | Meta | No | 7.5% |
| | | 3 (1,2) | 3 (1,2) | 4.84E-06 | DDD | No | 0.2% |
| | | 2 (2,0) | 3 (3,0) | 8.28E-06 | Meta | No | 34.1% |
| | | 3 (1,2) | 4 (2,2) | 2.13E-05 | Meta | No | 3.8% |
| | | 3 (1,2) | 3 (1,2) | 5.14E-05 | DDD | No | 3.8% |
| | | 2 (0,2) | 2 (0,2) | 7.91E-05 | DDD | No | 19.0% |
Figure 3 Five novel genes with clustered mutations
The domains (blue), post-translational modifications, and mutation locations (red stars) are shown for five proteins with highly clustered de novo mutations in unrelated children with severe, undiagnosed DDs. For two proteins (COL4A3BP and PCGF2) where all observed mutations are identical, photos are shown to highlight the facial similarities of patients carrying the same mutation.
Biallelic Loss of function and damaging functional variants
Rare (MAF < 5%) biallelic loss-of-function and damaging functional variants in uninherited diplotypes and probands. ‘Likely dominant probands’ refers to probands with a reported de novo mutation or affected parents, and ‘other probands’ to all remaining probands. ‘DDG2P Biallelic’ refers to confirmed and probable DDG2P genes with a biallelic mode of inheritance. See Supplemental methods for details of variant processing.
| Biallelic Variant Types | Untransmitted Diplotypes (n=1080) | Likely Dominant Probands (n=270) | Other Probands (n=810) |
|---|---|---|---|
| LoF/LoF (Genome-wide) | 110 | 17 | 86 |
| LoF/Dam (Genome-wide) | 87 | 21 | 71 |
| Dam/Dam (Genome-wide) | 312 | 90 | 264 |
| LoF/LoF (DDG2P Biallelic) | 1 | 1 | 3 |
| LoF/Dam (DDG2P Biallelic) | 2 | 0 | 6 |
| Dam/Dam (DDG2P Biallelic) | 26 | 7 | 25 |
Figure 4 Candidate gene Loss of Function modeling in zebrafish reveals enrichment for developmentally important proteins
a, Examples of developmental phenotypes: Knockdown of pkn2a results in reduced cartilaginous jaw structures (black arrows), knockdown of fryl results in cardiac and craniofacial defects (white arrowheads and arrows, respectively), while knockdown of psmd3 results in smaller ear primordia (red arrows), and mis-patterned CNS neurons (compare red double arrows and brackets). b, Knockdown outcomes of 7 genes with variants present in microcephaly patients: Interocular measurements of brightfield images from control and LoF embryos reveal significant decreases in head size. A neuronal antibody stain (anti-HuC/D, green channel) labels the brains of control and morphant zebrafish. Measurements taken across the widest extent of the midbrain identify significant reductions in brain size, likely underlying the concomitant head size reductions seen in brightfield. In b, tables show average percentage reduction in head and brain width, and p-values of a t-test.
Zebrafish modeling identifies 21 developmentally important candidate genes
This table summarises the 21 genes whose knockdown results in developmental phenotypes in zebrafish. The “# patients” column indicates how many patients were identified as carrying variants in these genes. Split numbers indicate the breakdown of variant types (e.g., for BTBD9, 2/1 is two patients carrying biallelic LoF variants and one carrying a de novo missense variant). A summary of the patient phenotypes is listed, as well as the relevant phenotypes observed in zebrafish knockdown experiments. Phenotypic concordance categories indicate the degree of overlap between the zebrafish phenotyping and the patient phenotypes. Weak concordance is typically the result of severe, multisystem phenotypes in zebrafish. See Supplemental Materials for more detailed phenotype information.
| Gene | # patients | Variant | Patient phenotypes | Phenotypic concordance | Relevant knockdown phenotypes |
|---|---|---|---|---|---|
| | 2/1 | Biallelic LoF/ | Seizures, microcephaly, hypertonia | Strong | Reduced head size, brain volume |
| | 1/2 | | CNS and craniofacial defects | Strong | Abnormal head shape |
| | 1/3 | | Moderately short stature, microcephaly, CNS defects | Strong | Reduced head size, brain volume |
| | 1 | | CNS and craniofacial defects, seizures, microcephaly, hypertelorism | Strong | Reduced head size, brain volume |
| | 1 | | Short stature, craniofacial and cardiac defects | Strong | Cardiac defects, reduced axis length |
| | 1 | | CNS, cardiac, ear, and craniofacial defects, growth retardation | Strong | Cardiac, craniofacial cartilage, and growth defects |
| | 1 | | Microcephaly, muscular hypotonia, seizures, growth abnormality | Strong | Reduced head size and neural defects |
| | 1 | Biallelic LoF | Seizures, microcephaly, CNS defects | Strong | Reduced head size, brain volume |
| | 1 | | Seizures, CNS and cardiac defects, poor motor coordination | Strong | Reduced head size, cardiac defects, abnormal locomotion |
| | 2 | Biallelic LoF | Microcephaly, CNS and ear defects | Strong | Reduced head size, brain volume, neural defects |
| | 2 | | Microcephaly, muscle defects, seizures | Strong | Reduced head size and neural defects |
| | 1 | Biallelic LoF | CNS defects, growth retardation | Moderate | Growth reduction |
| | 1 | | CNS defects, hypertonia, seizures | Moderate | Abnormal musculature, CNS and locomotion |
| | 1 | | Short stature, hypotonia, hypotelorism | Moderate | Abnormal locomotion |
| | 1/1 | | Short stature, cardiac, CNS, ear and craniofacial defects | Weak | Multisystem |
| | 1 | | CNS defects, seizures | Weak | Multisystem |
| | 1 | | CNS and craniofacial defects | Weak | Multisystem |
| | 1 | | CNS and ear defects, hypotonia, growth retardation | Weak | Multisystem, with otic and growth defects |
| | 2 | | CNS and ear defects, hypotonia, seizures, coloboma | Weak | Multisystem |
| | 1/1 | | CNS defects, hypotonia | Weak | Multisystem |
| | 1 | | CNS defects, hypertelorism | None | None |
EDF6 Haploinsufficiency analyses
A. Saturation analysis for detecting haploinsufficient DD genes. A boxplot showing the distribution of statistical power to detect a significant enrichment of LoF mutations across 18,272 genes in the genome, for different numbers of trios studied, from 1,000 trios to 12,000 trios. B. Distribution of haploinsufficiency scores in selected sets of de novo mutations. Violin plot of haploinsufficiency scores in five sets of de novo mutations: Silent - all synonymous mutations, Diagnostic - mutations in known DD genes in diagnosed individuals, Undiagnosed_Func - all functional mutations in undiagnosed individuals, Undiagnosed_LoF - all LoF mutations in undiagnosed individuals, Undiagnosed_recur - mutations in genes with recurrent functional mutations in undiagnosed individuals. P values for a Mann-Whitney test comparing each of the latter four distributions to that observed for the silent (synonymous) variants are plotted at the top of each violin.