Rebecca Truty, Karen Ouyang, Susan Rojahn, Sarah Garcia, Alexandre Colavin, Barbara Hamlington, Mary Freivogel, Robert L Nussbaum, Keith Nykamp, Swaroop Aradhya.
Abstract
The complexities of gene expression pose challenges for the clinical interpretation of splicing variants. To better understand splicing variants and their contribution to hereditary disease, we evaluated their prevalence, clinical classifications, and associations with diseases, inheritance, and functional characteristics in a 689,321-person clinical cohort and two large public datasets. In the clinical cohort, splicing variants represented 13% of all variants classified as pathogenic (P), likely pathogenic (LP), or variants of uncertain significance (VUSs). Most splicing variants were outside essential splice sites and were classified as VUSs. Among all individuals tested, 5.4% had a splicing VUS. If RNA analysis were to contribute supporting evidence to variant interpretation, we estimated that splicing VUSs would be reclassified in 1.7% of individuals in our cohort. This would result in a clinically significant result (i.e., P/LP) in 0.1% of individuals overall because most reclassifications would change VUSs to likely benign. In ClinVar, splicing VUSs were 4.8% of reported variants and could benefit from RNA analysis. In the Genome Aggregation Database (gnomAD), splicing variants comprised 9.4% of variants in protein-coding genes; most were rare, precluding unambiguous classification as benign. Splicing variants were depleted in genes associated with dominant inheritance and haploinsufficiency, although some genes had rare variants at essential splice sites or had common splicing variants that were most likely compatible with normal gene function. Overall, we describe the contribution of splicing variants to hereditary disease, the potential utility of RNA analysis for reclassifying splicing VUSs, and how natural variation may confound clinical interpretation of splicing variants.
Keywords: RNA analysis; functional studies; gene panel; genetic testing; in silico prediction; next-generation sequencing; splice site; splicing; variant classification; variants of uncertain significance
Year: 2021; PMID: 33743207; PMCID: PMC8059334; DOI: 10.1016/j.ajhg.2021.03.006
Source DB: PubMed; Journal: Am J Hum Genet; ISSN: 0002-9297; Impact factor: 11.025
Number and proportion of splicing variants in Invitae, ClinVar, and gnomAD data
| Variant class | Invitae—observed P/LP/VUS variants (N = 466,736), No. (%) | Invitae—unique P/LP/VUS variants (N = 149,139), No. (%) | Invitae—patients (N = 689,321), No. (%) | ClinVar P/LP/VUS variants (N = 229,329), No. (%) | gnomAD variants in protein-coding genes |
|---|---|---|---|---|---|
| Splicing variants | 60,807 (13.0) | 22,344 (15.0) | 52,047 (7.6) | 23,041 (10.1) | 825,992 (9.4) |
| Splicing VUSs | 42,534 (9.1) | 16,965 (11.4) | 37,064 (5.4) | 11,116 (4.8) | N/A |
| Splicing VUSs that RNA analysis may reclassify | 13,281 (2.8) | 5,200 (3.5) | 12,013 (1.7) | N/A | N/A |
| Missense variants | 337,649 (72.3) | 110,774 (74.3) | 219,515 (31.8) | 115,571 (50.4) | 5,152,451 (58.6) |
| Truncating variants | 64,472 (13.8) | 16,806 (11.3) | 58,815 (8.5) | 35,985 (15.7) | 396,944 (4.5) |
Columns do not add up to 100% because some variants that fit multiple categories are counted more than once, while other variants (e.g., copy number variants and in-frame indels) are only represented in the total N. ClinVar data include submissions with conflicting interpretations and exclude Invitae submissions. gnomAD, Genome Aggregation Database; P/LP, pathogenic/likely pathogenic; VUS, variant of uncertain significance.
Includes missense variants, truncating variants, silent variants, in-frame indels, alterations to start and stop codons, and splicing variants (to ±8 bp intronic) in all canonical coding transcripts.
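As a quick arithmetic check, the percentages in the three Invitae columns can be recomputed from the counts and column totals reported in the table. The following is a minimal sketch; the names `TOTALS`, `SPLICING_ROWS`, `percent`, and `table_percentages` are illustrative and not from the original study.

```python
# Column totals (N) copied from the table header; dictionary keys are illustrative.
TOTALS = {
    "observed": 466_736,   # Invitae observed P/LP/VUS variants
    "unique": 149_139,     # Invitae unique P/LP/VUS variants
    "patients": 689_321,   # Invitae patients
}

# Counts copied from the three splicing rows of the table.
SPLICING_ROWS = {
    "Splicing variants": {"observed": 60_807, "unique": 22_344, "patients": 52_047},
    "Splicing VUSs": {"observed": 42_534, "unique": 16_965, "patients": 37_064},
    "Splicing VUSs that RNA analysis may reclassify": {
        "observed": 13_281, "unique": 5_200, "patients": 12_013,
    },
}


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


def table_percentages() -> dict:
    """Recompute the percentage for every splicing row and Invitae column."""
    return {
        row: {col: percent(n, TOTALS[col]) for col, n in cols.items()}
        for row, cols in SPLICING_ROWS.items()
    }


if __name__ == "__main__":
    for row, cols in table_percentages().items():
        print(row, cols)
```

The recomputed patient-level values for splicing VUSs and for splicing VUSs that RNA analysis may reclassify correspond to the 5.4% and 1.7% figures quoted in the abstract.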
Figure 1. Clinically classified splicing variants in a large clinical cohort
Number of splicing variants at the indicated exonic or intronic positions among 689,321 individuals tested for a variety of inherited diseases. All exonic splicing variants are grouped together; the intronic splicing variants are grouped by distance from the intron-exon junction in base pairs (bp). Note that intronic variants more than 10 bp from the intron-exon junction may not be detected because of the reportable range of the sequencing assay; therefore, splice variants beyond ±10 bp intronic are most likely underrepresented.
P/LP, pathogenic/likely pathogenic; VUS, variant(s) of uncertain significance. Colors within each bar indicate the number classified as P/LP (blue) or VUSs (green).
Figure 2. Distribution of variant types and their clinical classifications in a clinical cohort of 689,321 individuals tested for genetic disease
(A–C) Number and proportion of variants by type and clinical classification among (A) all observed variants, (B) unique variants, and (C) patients. Splicing variants are shown both as a group and split into ESS and non-ESS variants. VUS + RNA potential indicates splicing VUSs that have the potential to be reclassified with the addition of evidence from RNA analysis; these are included in the splicing VUSs total.
ESS, essential splice site.
Figure 3. Frequencies of splice variants in the healthy human genome
Splicing variants were identified in gnomAD (v.2.0.2) via the Ensembl Variant Effect Predictor (v.85).
(A) Bar graph indicating the absolute number of variants identified in gnomAD within coding regions and ±8 bp of intronic sequence. Splicing variants include variants at the ESS (at intronic positions ±1–2) and at non-ESS locations (at intronic positions ±3–8 bp and exonic positions ±1–3 bp). Other includes in-frame indels and alterations to stop and start codons.
(B) Allele frequencies for splicing variants as determined by “popmax” and grouped as common (>1%), rare (0.1%–1%), and very rare (<0.1%).
(C) Outlier boxplot showing the distribution of splicing and truncating variants among hereditary disease genes by inheritance patterns.
(D) Outlier boxplot showing the distribution of splicing and truncating variants in exons of all gnomAD genes with high pLI scores (pLI > 0.9) and low pLI scores (pLI ≤ 0.9).
AD, autosomal dominant; AR, autosomal recessive; ESS, essential splice site; gnomAD, Genome Aggregation Database; pLI, probability of loss-of-function intolerant; XL, X-linked.
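The position and frequency groupings described in panels (A) and (B) can be expressed as two small functions. This is a sketch under stated assumptions, not the authors' pipeline: the function names, the `(region, distance)` representation of a variant's location, the `"other"` label, and the treatment of the exact 0.1% and 1% boundaries as "rare" are all illustrative choices not specified in the caption.

```python
def classify_splice_position(region: str, distance: int) -> str:
    """Classify a variant by its location relative to the intron-exon junction.

    region   -- "intronic" or "exonic" (hypothetical encoding)
    distance -- unsigned distance from the junction in bp, starting at 1

    Follows the caption: ESS = intronic positions 1-2; non-ESS = intronic
    positions 3-8 or exonic positions 1-3. Anything else is labeled "other".
    """
    if distance < 1:
        raise ValueError("distance must be a positive number of base pairs")
    if region == "intronic":
        if distance <= 2:
            return "ESS"
        if distance <= 8:
            return "non-ESS"
        return "other"
    if region == "exonic":
        return "non-ESS" if distance <= 3 else "other"
    raise ValueError("region must be 'intronic' or 'exonic'")


def frequency_bin(popmax_af: float) -> str:
    """Group a popmax allele frequency (as a fraction, not a percent).

    common: >1%; rare: 0.1%-1%; very rare: <0.1%.
    Assumption: both boundary values (0.001 and 0.01) fall in "rare".
    """
    if not 0.0 <= popmax_af <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    if popmax_af > 0.01:
        return "common"
    if popmax_af >= 0.001:
        return "rare"
    return "very rare"


if __name__ == "__main__":
    print(classify_splice_position("intronic", 2), frequency_bin(0.0004))
```

Taking an unsigned distance treats donor-side and acceptor-side positions symmetrically, which matches the ± notation used in the caption.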