Christine G Preston, Matt W Wright, Rao Madhavrao, Steven M Harrison, Jennifer L Goldstein, Xi Luo, Hannah Wand, Bryan Wulf, Gloria Cheung, Mark E Mandell, Howard Tong, Shaung Cheng, Michael A Iacocca, Arturo Lopez Pineda, Alice B Popejoy, Karen Dalton, Jimmy Zhen, Selina S Dwight, Lawrence Babb, Marina DiStefano, Julianne M O'Daniel, Kristy Lee, Erin R Riggs, Diane B Zastrow, Jessica L Mester, Deborah I Ritter, Ronak Y Patel, Sai Lakshmi Subramanian, Aleksander Milosavljevic, Jonathan S Berg, Heidi L Rehm, Sharon E Plon, J Michael Cherry, Carlos D Bustamante, Helio A Costa.
Abstract
BACKGROUND: Identification of clinically significant genetic alterations involved in human disease has been dramatically accelerated by developments in next-generation sequencing technologies. However, the infrastructure and accessible, comprehensive curation tools necessary for analyzing an individual patient genome and interpreting genetic variants to inform healthcare management have been lacking.
Keywords: Clinical Genome Resource Consortium; Clinical genetics; Precision medicine; Variant curation
Year: 2022 PMID: 35039090 PMCID: PMC8764818 DOI: 10.1186/s13073-021-01004-8
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1 ClinGen FDA-recognized variant curation process and VCI. Overview of the ClinGen variant curation process using the VCI, an FDA-recognized workflow. Biocurators select a variant and evaluate evidence that falls into six categories. VCI viewers may view all evidence available for any variant in the VCI. The VCI supports users in making a final pathogenicity classification in keeping with the ACMG/AMP guidelines. ClinGen expert panels then disseminate their variant classifications through two community resources: the Evidence Repository (ERepo) and ClinVar
VCI displayed information type and source
| Information type | Displayed data | Data source |
|---|---|---|
| Basic Information | • Variant ID • HGVS terms | ClinGen Allele Registry |
| | • ClinVar Variation ID • ClinVar Overall interpretation • ClinVar Submitted interpretations • ClinVar Primary transcript • RefSeq transcripts • dbSNP variant ID • Entrez Gene ID | NCBI E-utilities |
| | • RefSeq transcripts • Ensembl transcripts • Molecular consequences | Ensembl VEP |
| | • Monarch Disease Ontology (Mondo) human disease term(s) | Ontology Lookup Service |
| | • Phenotypic abnormality term(s) | Human Phenotype Ontology (HPO) |
| Population | • Allele frequencies ◦ gnomAD ◦ ExAC ◦ Exome Sequencing Project | MyVariant.info |
| | • Allele frequencies ◦ PAGE Study | GGV Browser |
| | • Allele frequencies ◦ 1000 Genomes | Ensembl VEP |
| Variant Type | ◦ REVEL ◦ SIFT ◦ PolyPhen2 ◦ LRT ◦ MutationTaster ◦ MutationAssessor ◦ FATHMM ◦ PROVEAN ◦ MetaSVM ◦ MetaLR ◦ CADD ◦ FATHMM-MKL ◦ fitCons • Conservation analysis scores ◦ phyloP100way ◦ phyloP30way ◦ phastCons100way ◦ phastCons30way ◦ GERP++ ◦ SiPhy | MyVariant.info |
| Experimental | • Experimental functional data | ClinGen Functional Data Repository (FDRepo) |
| Gene-Centric | • Gene symbol | HGNC |
| | • ExAC constraint scores • UniProt protein ID • GeneCards gene | MyGene.info |
Fig. 2 Core VCI data model used for storing and retrieving data as JSON documents. The VCI uses a data model centered on a classification (dark blue center box), with relationships to other data models (white boxes). Each assertion is uniquely defined by the Variant, Disease, and Mode of Inheritance models and is owned by a user or affiliation. It has two core attributes: (A) a status indicating its state in the current workflow cycle and (B) a selected classification (Fig. 1). It uses different types of evidence (see Evidence Collection in Fig. 1) and applies evidence criteria to arrive at the selected classification. Each evidence type can use articles as supplementary evidence. As an interpretation progresses through review statuses (such as provisional, approved, and published), snapshots of the full data at each review step are created. The relationships between data models are represented here, with 1:1 (solid green lines), 1 to many (N), or many to many (N to N) indicated
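The data model described in the Fig. 2 caption can be illustrated with a minimal Python sketch. This is not the VCI's actual schema: the class names, field names, and identifier values below are hypothetical, chosen only to mirror the caption (an interpretation keyed by variant, disease, and mode of inheritance; a status and a selected classification; evidence with supporting articles; and a snapshot taken at each review step).

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Evidence:
    # All field names here are illustrative assumptions, not the VCI schema.
    category: str                 # e.g. "population" or "experimental"
    criteria_met: List[str]       # ACMG/AMP criterion codes, e.g. ["PM2"]
    articles: List[str] = field(default_factory=list)  # supplementary articles


@dataclass
class Interpretation:
    variant_id: str
    disease_id: str
    mode_of_inheritance: str
    owner: str                    # a user or an affiliation
    status: str = "in progress"   # state in the current workflow cycle
    classification: Optional[str] = None
    evidence: List[Evidence] = field(default_factory=list)
    snapshots: List[dict] = field(default_factory=list)

    def key(self):
        """The triple that uniquely defines the interpretation."""
        return (self.variant_id, self.disease_id, self.mode_of_inheritance)

    def advance(self, new_status: str) -> None:
        """Move to a new review status and snapshot the full data."""
        self.status = new_status
        snapshot = asdict(self)       # deep copy as plain dicts and lists
        snapshot.pop("snapshots")     # avoid nesting snapshots in snapshots
        self.snapshots.append(snapshot)

    def to_json(self) -> str:
        """Serialize as a JSON document, as the caption describes."""
        return json.dumps(asdict(self), indent=2)
```

For example, calling `advance("provisional")` and then `advance("approved")` on an interpretation leaves two snapshots, each preserving the evidence and classification as they stood at that review step.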
VCI software components
| Component | Location | Description |
|---|---|---|
| Front-end | gci-vci-react | Contains all front-end code used for user-interface development |
| Back-end | gci-vci-serverless | All back-end code, including controllers and database access objects |
| Database models | gci-vci-serverless/src/models/ | A set of models used to validate the document data posted to the DynamoDB database |
| API development | Gci-vci-api | Added in August 2021 to provide API support for GCI/VCI data |
| Messaging | gci-vci-kafka-to-lambda, gci-vci-serverless/src/helpers/message_helpers.py | The messaging component used to exchange data with other ClinGen tools |
Fig. 3 VCI platform components overview, including schema and serverless architecture. The platform is web-browser based and uses AWS cloud services. An External Resources Manager retrieves population, predictive, functional, and other variant- and gene-level data from external sources (Table 1) via APIs. In addition, the user can add curated evidence, evaluate it against the ACMG/AMP guidelines, and save the classification for review and approval. The approved classifications are then submitted to the ClinGen Evidence Repository. All data are saved in a database via microservices and can be accessed via queries. Amazon cloud services provide the microservices that store and retrieve data, following a serverless architecture built on Amplify, Cognito, API Gateway, Lambda, DynamoDB, and S3. The user-facing web interface is built with the React JavaScript library
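To make the save path in Fig. 3 concrete, the following is a minimal sketch of a Lambda-style handler that validates a posted JSON document and stores it. It is an assumption-laden illustration, not code from gci-vci-serverless: the required field names, the `pk` key format, and the `make_handler` helper are hypothetical, and an in-memory class stands in for a DynamoDB table so the example runs without AWS access. The stand-in mirrors the `put_item(Item=...)` and `get_item(Key=...)` call shapes of a boto3 table resource.

```python
import json

# Hypothetical required fields; the real validation models may differ.
REQUIRED_FIELDS = ("variant_id", "disease_id", "mode_of_inheritance", "classification")
KEY_FIELDS = ("variant_id", "disease_id", "mode_of_inheritance")


class InMemoryTable:
    """Stand-in for a DynamoDB table so the sketch runs locally."""

    def __init__(self):
        self._items = {}

    def put_item(self, Item):
        self._items[Item["pk"]] = Item

    def get_item(self, Key):
        item = self._items.get(Key["pk"])
        return {"Item": item} if item is not None else {}


def _response(status, payload):
    # Shape of an API Gateway proxy response: status code plus JSON body.
    return {"statusCode": status, "body": json.dumps(payload)}


def make_handler(table):
    """Build a handler bound to a table (real or in-memory)."""

    def handler(event, context=None):
        try:
            doc = json.loads(event.get("body") or "{}")
        except json.JSONDecodeError:
            return _response(400, {"error": "body is not valid JSON"})
        if not isinstance(doc, dict):
            return _response(400, {"error": "body must be a JSON object"})

        # Validate before writing, as the database models are described as doing.
        missing = [name for name in REQUIRED_FIELDS if not doc.get(name)]
        if missing:
            return _response(400, {"error": "missing fields", "fields": missing})

        # Hypothetical key: variant, disease, and mode of inheritance joined.
        doc["pk"] = "#".join(str(doc[name]) for name in KEY_FIELDS)
        table.put_item(Item=doc)
        return _response(201, {"pk": doc["pk"]})

    return handler
```

Passing the table in as an argument keeps the handler testable: the same function could be bound to a real table object in deployment and to `InMemoryTable` in local tests.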
Fig. 4 The basic information view. The VCI has six tab views that collate and display variant information from external and internal sources to biocurators. (A) The top title view is always visible and shows the variant title, links to the variant in external resources, and key curation information for the record. (B) The criteria bar displays the evaluated criteria and the calculated pathogenicity. (C) The basic information tab displays any curations available for the variant in the VCI and ClinVar, along with transcript information from RefSeq and Ensembl
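The calculated pathogenicity in the criteria bar is derived from the evaluated ACMG/AMP criteria. As a rough illustration, the combining rules from the 2015 ACMG/AMP guidelines can be sketched as follows. This is not the VCI's implementation: the function names are hypothetical, and the sketch assumes every criterion is applied at its default strength, so codes with a modified strength are not handled.

```python
from collections import Counter

# Evidence strength prefixes: pathogenic very strong, strong, moderate,
# supporting; benign stand-alone, strong, supporting.
PREFIXES = ("pvs", "ps", "pm", "pp", "ba", "bs", "bp")


def classify(pvs=0, ps=0, pm=0, pp=0, ba=0, bs=0, bp=0):
    """Combine counts of met criteria into one of five classifications."""
    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory pathogenic and benign evidence defaults to uncertain.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"


def classify_codes(codes):
    """Classify from criterion codes such as ["PVS1", "PM2"]."""
    # Strip the trailing number to get the strength prefix (PM2 -> pm).
    counts = Counter(code.rstrip("0123456789").lower() for code in codes)
    return classify(**{prefix: counts.get(prefix, 0) for prefix in PREFIXES})
```

Checking the pathogenic rules before the likely pathogenic ones matters, because any combination that satisfies a pathogenic rule also satisfies at least one likely pathogenic rule.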
Fig. 5 Case and Segregation Evidence Capture. The structured data capture for the case/segregation view in the VCI, including (A) the top title view and (B) the evidence sources, which are captured and structured so that users can quickly see all sources and the summed individual counts from the pooled evidence for specific ACMG/AMP criteria (shown here is PM3)
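The pooling described in the Fig. 5 caption, where individual counts are summed across evidence sources for each criterion, can be sketched in a few lines. The record layout (`criterion`, `source`, `individuals`) and the function name are hypothetical, chosen for illustration only; the VCI's stored structure may differ.

```python
from collections import defaultdict


def pool_counts(records):
    """Sum individual counts per ACMG/AMP criterion across evidence sources.

    Each record is assumed to be a dict with the hypothetical keys
    "criterion", "source", and "individuals".
    """
    totals = defaultdict(lambda: {"sources": [], "individuals": 0})
    for record in records:
        entry = totals[record["criterion"]]
        # List each source once, in the order it was first seen.
        if record["source"] not in entry["sources"]:
            entry["sources"].append(record["source"])
        entry["individuals"] += record["individuals"]
    return dict(totals)
```

Given two PM3 records from different sources with 2 and 3 individuals, the pooled PM3 entry lists both sources and a total of 5 individuals.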
Fig. 6 VCI platform growth over time. (A) Number of curated variant classifications performed in the VCI over time. (B) The numbers of biocurators and biocurator affiliations accumulated over time are noted at the top of each bar