Malte Spielmann, Martin Kircher.
Abstract
The increase in sequencing capacity, reduction in costs, and nationally and internationally coordinated efforts have led to the widespread introduction of next-generation sequencing (NGS) technologies in patient care. More generally, human genetics and genomic medicine are gaining importance for more and more patients. Some communities are already discussing the prospect of sequencing each individual's genome at the time of birth. Together with digital health records, this is expected to enable individualized treatments and preventive measures, so-called precision medicine. A central step in this process is the identification of disease-causing mutations or variant combinations that make us more susceptible to disease. Although various technological advances have improved the identification of genetic alterations, the interpretation and ranking of the identified variants remain a major challenge. Based on our knowledge of molecular processes or previously identified disease variants, we can identify potentially functional genetic variants and, using different lines of evidence, we are sometimes able to demonstrate their pathogenicity directly. However, the vast majority of variants are classified as variants of uncertain clinical significance (VUSs), with insufficient experimental evidence to determine their pathogenicity. In these cases, computational methods may be used to improve their prioritization, and an increasing toolbox of experimental methods is emerging that can be used to assay the molecular effects of VUSs. Here, we discuss how computational and experimental methods can be used to create catalogs of variant effects for a variety of molecular and cellular phenotypes. We discuss the prospects of integrating large-scale functional data with machine learning and clinical knowledge for the development of accurate pathogenicity predictions for clinical applications.
Year: 2022 PMID: 35483875 PMCID: PMC9059783 DOI: 10.1101/mcs.a006196
Source DB: PubMed Journal: Cold Spring Harb Mol Case Stud ISSN: 2373-2873
Figure 1. Rising numbers of variants of uncertain significance (VUSs) and the functional composition of ClinVar variants. (A) The number of variants with clinical assertions in NCBI ClinVar (Landrum and Kattman 2018) increased considerably in the last decade, but VUSs represent the largest class. As of its January 2022 release, ClinVar reports on more than 1.1 million variants. Shown is the number of GRCh38 single-nucleotide variants (SNVs) by their last date of variant evaluation (used as a proxy for how long the variant has been known, because the database was only established in 2013) and by the assigned clinical significance (ClinSig), from 1990 to 2020 (left, logarithmic scale) and for the last 10 years (right, linear scale). Entries without a date were excluded and only the nine most frequently used ClinSig values were retained. In the remaining 961,829 entries, the nine levels were further simplified to five categories by merging “Pathogenic/Likely pathogenic” (n = 7821) with “Likely pathogenic” (n = 32,421), “Benign/Likely benign” (n = 24,476) with “Likely benign” (n = 258,515), and “Conflicting interpretations of pathogenicity” (n = 51,298) and “not provided” (n = 8721) with “Uncertain significance” (n = 392,706). By 2015, the number of VUSs exceeded the number of reported “Pathogenic” variants. (B) Annotated variant consequences for variants in ClinVar versus potential genomic SNVs highlight clear ascertainment effects. For the SNVs from panel A, we retrieved the variant consequence annotation as reported by the Combined Annotation Dependent Depletion (CADD) v1.6 tool (Rentzsch et al. 2021), and we drew 250,000 potential SNVs from the whole-genome CADD annotation file to represent the genomic background. The top panel shows ClinVar variants by their clinical assertion, highlighting that coding variants are the dominant variant classes and that upstream, downstream, and intergenic variants are generally underrepresented. Across clinical assertions, the representation of functional classes follows the classical observation that nonsense (stop gain) and missense (nonsynonymous amino acid exchange) variants have the most severe effects. The bottom panel shows that, even in recent years, pathogenic variants do not show a substantial increase in the representation of noncoding variants.
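The category simplification described for Figure 1A can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `merge_counts` and the dictionary `CLINSIG_TO_CATEGORY` are hypothetical, and the inclusion of plain “Pathogenic” and “Benign” as the two remaining retained labels is an assumption inferred from the caption, which names only seven of the nine labels. The counts in the usage example are the ones reported in the caption.

```python
from collections import Counter

# Mapping of ClinSig labels to five simplified categories, following the
# merges described in the Figure 1 caption. "Pathogenic" and "Benign" are
# assumed to be the two retained labels the caption does not list explicitly.
CLINSIG_TO_CATEGORY = {
    "Pathogenic": "Pathogenic",
    "Likely pathogenic": "Likely pathogenic",
    "Pathogenic/Likely pathogenic": "Likely pathogenic",
    "Benign": "Benign",
    "Likely benign": "Likely benign",
    "Benign/Likely benign": "Likely benign",
    "Uncertain significance": "Uncertain significance",
    "Conflicting interpretations of pathogenicity": "Uncertain significance",
    "not provided": "Uncertain significance",
}


def merge_counts(label_counts):
    """Collapse per-label counts into the five simplified categories.

    Labels that are not in CLINSIG_TO_CATEGORY are dropped, mirroring the
    caption's restriction to the most frequently used ClinSig values.
    """
    merged = Counter()
    for label, n in label_counts.items():
        category = CLINSIG_TO_CATEGORY.get(label)
        if category is not None:
            merged[category] += n
    return merged


# Per-label counts as reported in the caption (seven of the nine labels).
caption_counts = {
    "Pathogenic/Likely pathogenic": 7821,
    "Likely pathogenic": 32421,
    "Benign/Likely benign": 24476,
    "Likely benign": 258515,
    "Conflicting interpretations of pathogenicity": 51298,
    "not provided": 8721,
    "Uncertain significance": 392706,
}
merged = merge_counts(caption_counts)
```

With the caption's numbers, the merged “Uncertain significance” category is the sum of three source labels, which is why VUSs form the largest class in panel A.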
Figure 2. Multiplex assays of variant effects (MAVEs): Clustered regularly interspaced short palindromic repeat (CRISPR)-based (A) and massively parallel reporter assay (MPRA)-based (B) MAVE strategies share a common framework. First, hundreds or thousands of genetic variants are created (e.g., by synthesis or error-prone polymerase chain reaction) and cloned into a plasmid system. Second, this mutant library is introduced into an in vitro system. Finally, a biological phenotype or function is read out using massively parallel sequencing. (sgRNA) Single-guide RNA, (gRNA) guide RNA, (NGS) next-generation sequencing.