Arpad M Danos, Kilannin Krysiak, Erica K Barnell, Adam C Coffman, Joshua F McMichael, Susanna Kiwala, Nicholas C Spies, Lana M Sheta, Shahil P Pema, Lynzey Kujan, Kaitlin A Clark, Amber Z Wollam, Shruti Rao, Deborah I Ritter, Dmitriy Sonkin, Gordana Raca, Wan-Hsin Lin, Cameron J Grisdale, Raymond H Kim, Alex H Wagner, Subha Madhavan, Malachi Griffith, Obi L Griffith.
Abstract
Manually curated variant knowledgebases and their associated knowledge models are serving an increasingly important role in distributing and interpreting variants in cancer. These knowledgebases vary in their level of public accessibility and in the complexity of the models used to capture clinical knowledge. CIViC (Clinical Interpretation of Variants in Cancer - www.civicdb.org) is a fully open, free-to-use cancer variant interpretation knowledgebase that incorporates highly detailed curation of evidence obtained from peer-reviewed publications and meeting abstracts, and currently holds over 6300 Evidence Items for over 2300 variants derived from over 400 genes. CIViC has seen increased adoption by, and has undertaken collaborations with, a wide range of users and organizations involved in research. To enhance CIViC's clinical value, regular submission to the ClinVar database and pursuit of other regulatory approvals is necessary. For this reason, a formal peer-reviewed curation guideline and a discussion of the underlying principles of curation are needed. We present here the CIViC knowledge model, standard operating procedures (SOP) for variant curation, and detailed examples to support community-driven curation of cancer variants.
Keywords: Cancer; Curation; Knowledgebase; Standard operating procedure; Variant
Year: 2019 PMID: 31779674 PMCID: PMC6883603 DOI: 10.1186/s13073-019-0687-x
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1. Overview of the CIViC knowledge model for the exploration of existing data (i.e., searching and browsing) and content curation. (a) The CIViC knowledge model consists of four interconnected levels that contribute to the content within CIViC: Genes (blue), Variants (orange), Evidence (yellow), and Assertions (green). Each broadly defined CIViC Variant is associated with a single gene but can have many lines of evidence linking it to clinical relevance. (b) CIViC curation typically begins with the submission of an Evidence Item. Creation of an Evidence Item will automatically generate Gene and Variant records in the knowledgebase if they do not already exist. Once submitted, the Evidence Item undergoes evaluation by expert Editors and (if necessary) revision, with ultimate rejection or acceptance. Accepted Evidence Items can be used to build Assertions, which are visualized at the Variant level. Similar cycles of curation and moderation are employed for all curatable entities in CIViC (e.g., Variant Summaries, Coordinates, Assertions).
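The curation cycle in Fig. 1b can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not CIViC's actual implementation or API: the class names, the `Status` values, and the methods `submit`, `review`, and `build_assertion` are all hypothetical, and the revision step is omitted for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Hypothetical moderation states for an Evidence Item."""
    SUBMITTED = "submitted"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class EvidenceItem:
    gene: str
    variant: str
    statement: str
    status: Status = Status.SUBMITTED


@dataclass
class Assertion:
    gene: str
    variant: str
    evidence: list  # the accepted Evidence Items it summarizes


@dataclass
class Knowledgebase:
    # Maps each gene name to the set of variant names recorded for it.
    genes: dict = field(default_factory=dict)
    evidence: list = field(default_factory=list)

    def submit(self, gene, variant, statement):
        # Gene and Variant records are created if they do not already exist.
        self.genes.setdefault(gene, set()).add(variant)
        item = EvidenceItem(gene, variant, statement)
        self.evidence.append(item)
        return item

    def review(self, item, accept):
        # An Editor's decision ends in acceptance or rejection.
        item.status = Status.ACCEPTED if accept else Status.REJECTED

    def build_assertion(self, gene, variant):
        # Only accepted Evidence Items may support an Assertion.
        accepted = [
            e for e in self.evidence
            if e.gene == gene and e.variant == variant
            and e.status is Status.ACCEPTED
        ]
        if not accepted:
            raise ValueError("no accepted Evidence Items for this variant")
        return Assertion(gene, variant, accepted)
```

Submitting two items for the same variant, accepting one and rejecting the other, would leave a single Gene record with one Variant and an Assertion backed only by the accepted item.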
Fig. 2. Overview of the Gene and Variant knowledge models and the structure of Variant Groups. The Gene and Variant knowledge models shown above display their associated features (including the Variant Groups feature of Variants) and their origins. Features that are linked to their notes with dotted lines are automatically generated, whenever possible. (a) Gene data (blue box) consists of curated features (Gene Name, Summary, Sources) and auto-generated links to external entities (MyGene.info and DGIdb). Each Gene can be associated with any number of Variants (dark orange box), and Variants can be grouped (light orange box) based on any unifying feature type (e.g., fusions, activating mutations). (b) Variant Group features are outlined by the light orange box. These features include a Summary with Sources and associated Variants. (c) Variant data (dark orange box) includes the Gene Name, Aliases, HGVS Expressions, Variant Evidence Score, Allele Registry ID, Summary, Sources, Variant Types, ClinVar IDs, MyVariant.info, and Coordinates. Variants can be associated with CIViC Assertions (green) and Evidence Items (yellow).
Fig. 3. Diagram of the Evidence Item knowledge model. Evidence Items provide a summarized statement about a variant’s implication in clinical oncology in the context of structured data. The knowledge model consists of features (yellow box) that are user-generated and human-readable while leveraging outside ontologies and CIViC-defined fields. Features that are linked to their notes with dotted lines are automatically generated, whenever possible. The Variant Type, Direction, and Clinical Significance features allow Curators to develop complex Evidence Items with nuanced meaning while maintaining queryable structure.
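To illustrate what "queryable structure" can mean in practice, the sketch below stores a free-text statement alongside a few structured fields and filters on them. It is a hypothetical illustration only: the field names, the `query` helper, and the example values (including the significance strings and variant names) are assumptions made for this example and are not taken from CIViC's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceItem:
    variant: str
    evidence_type: str          # e.g., "Predictive" or "Prognostic"
    direction: str              # e.g., "Supports" or "Does Not Support"
    clinical_significance: str  # illustrative values only
    statement: str              # human-readable summary


def query(items, **criteria):
    """Return the items whose structured fields match every criterion."""
    return [
        item for item in items
        if all(getattr(item, name) == value for name, value in criteria.items())
    ]


items = [
    EvidenceItem("VARIANT_A", "Predictive", "Supports",
                 "Sensitivity/Response", "placeholder statement 1"),
    EvidenceItem("VARIANT_A", "Predictive", "Does Not Support",
                 "Sensitivity/Response", "placeholder statement 2"),
    EvidenceItem("VARIANT_B", "Prognostic", "Supports",
                 "Poor Outcome", "placeholder statement 3"),
]

supporting_predictive = query(
    items, evidence_type="Predictive", direction="Supports"
)
```

The free-text statement carries the nuance, while the structured fields let a reader retrieve, for instance, every supporting Predictive item without parsing prose.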
Fig. 4. Diagram of knowledge model for CIViC Assertions. Assertions summarize a collection of Evidence Items to make a definitive clinical statement about the Variant in a specific Disease context which incorporates all known data within the knowledgebase. Assertions features (green box) build on the Evidence Item knowledge model to bring together clinical guidelines, public resources, and regulatory approvals relevant to a final variant interpretation. Assertions can be associated with any number of Evidence Items. Like Evidence Items, Assertion Type, Direction, and Clinical Significance can be used to create a specific meaning for the Assertion.
Fig. 5. CIViC Assertion development by Assertion Type. CIViC Assertions summarize a collection of Evidence Items which reflect the state of literature for the given variant and disease. (a) For Assertion Types typically associated with somatic variants (Predictive, Prognostic, or Diagnostic), AMP-ASCO-CAP 2017 guidelines are followed to associate the Assertion with an AMP Tier and Level. This involves consideration of practice guidelines and regulatory approvals associated with specific drugs, and of available clinical evidence in the absence of explicit regulatory or practice guidelines. (b) CIViC Predisposing Assertions utilize ACMG-AMP 2015 guidelines to evaluate the 5-tier classification for a variant in a given disease context, which is supported by a collection of CIViC Evidence Items, along with other data. ACMG evidence codes for an Assertion are supplied by a collection of supporting CIViC Evidence Items (e.g., PP1 from co-segregation data available in a specific publication), and are additionally derived from Variant data (e.g., PM2 from population databases such as gnomAD). ACMG evidence codes are then combined at the Assertion level to generate a disease-specific classification for the Assertion.
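The combination of ACMG evidence codes described in Fig. 5b can be sketched as follows. This is a simplified reading of the ACMG-AMP 2015 combining rules written for illustration, not CIViC's implementation: the function names `strength` and `classify` are hypothetical, evidence strength is inferred from the code prefix alone (so modified-strength codes are not handled), and expert judgment is not modeled.

```python
from collections import Counter

# Code prefixes in matching order; "PVS" must be tested before "PS".
PREFIXES = ("PVS", "PS", "PM", "PP", "BA", "BS", "BP")


def strength(code):
    """Map an ACMG evidence code such as 'PM2' to its prefix."""
    for prefix in PREFIXES:
        if code.startswith(prefix):
            return prefix
    raise ValueError(f"unrecognized ACMG evidence code: {code}")


def classify(codes):
    """Combine evidence codes into one of the five classification tiers."""
    n = Counter(strength(code) for code in codes)
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps == 1 and (pm >= 3 or (pm == 2 and pp >= 2) or (pm == 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps == 1 and 1 <= pm <= 2)
        or (ps == 1 and pp >= 2)
        or pm >= 3
        or (pm == 2 and pp >= 2)
        or (pm == 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Conflicting pathogenic and benign evidence stays uncertain.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain Significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely Pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely Benign"
    return "Uncertain Significance"
```

Under this sketch, the two codes named in the caption (PP1 and PM2) would not on their own move a variant out of Uncertain Significance; adding a strong code such as PS1 would yield Likely Pathogenic.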