Literature DB >> 31618044

Open-Sourced CIViC Annotation Pipeline to Identify and Annotate Clinically Relevant Variants Using Single-Molecule Molecular Inversion Probes.

Erica K Barnell1, Adam Waalkes2, Matt C Mosior1, Kelsi Penewit2, Kelsy C Cotto1, Arpad M Danos1, Lana M Sheta1, Katie M Campbell1,3, Kilannin Krysiak1, Damian Rieke4, Nicholas C Spies1, Zachary L Skidmore1, Colin C Pritchard2, Todd A Fehniger1, Ravindra Uppaluri5, Ramaswamy Govindan1, Malachi Griffith1, Stephen J Salipante2, Obi L Griffith1.   

Abstract

PURPOSE: Clinical targeted sequencing panels are important for identifying actionable variants for patients with cancer; however, existing approaches do not provide transparent and rationally designed clinical panels to accommodate the rapidly growing knowledge within oncology.
MATERIALS AND METHODS: We used the Clinical Interpretations of Variants in Cancer (CIViC) database to develop an Open-Sourced CIViC Annotation Pipeline (OpenCAP). OpenCAP provides methods to identify variants within the CIViC database, build probes for variant capture, use probes on prospective samples, and link somatic variants to CIViC clinical relevance statements. OpenCAP was tested using a single-molecule molecular inversion probe (smMIP) capture design on 27 cancer samples from 5 tumor types. In total, 2,027 smMIPs were designed to target 111 eligible CIViC variants (61.5 kb of genomic space).
RESULTS: When compared with orthogonal sequencing, CIViC smMIP sequencing demonstrated a 95% sensitivity for variant detection (n = 61 of 64 variants). Variant allele frequencies for variants identified on both sequencing platforms were highly concordant (Pearson's r = 0.885; n = 61 variants). Moreover, for individuals with paired tumor and normal samples (n = 12), 182 clinically relevant variants missed by orthogonal sequencing were discovered by CIViC smMIP sequencing.
CONCLUSION: The OpenCAP design paradigm demonstrates the utility of an open-source and open-access database built on attendant community contributions with peer-reviewed interpretations. Use of a public repository for variant identification, probe development, and variant interpretation provides a transparent approach to build dynamic next-generation sequencing-based oncology panels.

Entities:  

Mesh:

Substances:

Year:  2019        PMID: 31618044      PMCID: PMC6873961          DOI: 10.1200/CCI.19.00077

Source DB:  PubMed          Journal:  JCO Clin Cancer Inform        ISSN: 2473-4276


INTRODUCTION

Despite recognition that genomics plays an important role in tumor prognosis, diagnosis, and treatment, scaling genetic analysis for routine analysis of most tumor specimens has been unattainable.[1,2] Barriers preventing widespread incorporation of genomic analysis into treatment protocols include costs associated with genomic sequencing and analysis,[3] computational limitations preventing timely identification of relevant variants,[3] and rapidly evolving knowledge of the clinical actionability of variants.[4] Technologic improvements in sequencing and data analysis continue to reduce these first 2 limitations; however, less progress has been made in integrating dynamic genomic annotation into clinical workflows. More than 22% of oncologists have acknowledged limited confidence in their own understanding of how genomic knowledge applies to patients’ treatment, and 18% reported testing patients’ genetics infrequently.[5] In the face of exponential growth in clinically relevant genomic findings, driven by precision oncology efforts, there will likely be increased inability for physicians to command the most current information, resulting in increasing delay between academic discovery and clinical utility. This information gap has been described as the interpretation bottleneck.[4-6] Alleviating the interpretation bottleneck will require codevelopment of targeted sequencing panels, bioinformatic tools, and variant knowledgebases that effectively elucidate and annotate clinically actionable variants from sequencing data.[7,8] These requirements each raise separate challenges. With regard to targeted panel development, commercial and academic pancancer clinical gene capture panels have now become commonplace, with at least 2 obtaining US Food and Drug Administration approval (FoundationONE CDx[9] [Foundation Medicine, Cambridge, MA] and Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets[10] [Memorial Sloan Kettering Cancer Center, New York, NY]). Even so, few panels indicate how genomic loci are selected for panel inclusion (Data Supplement), and none have proposed a sustainable or scalable mechanism to allow for panel evolution over time in response to knowledge advances in molecular oncology. With regard to bioinformatic tool development, the OncoPaD[11] portal provides one of the only methods to create rational designed panels by linking clinically relevant variants to genomic loci on the basis of a cohort of tumor samples; however, this database is not directly linked to actively updated clinical interpretations with detailed underlying evidence. The final challenge of building knowledgebases for variant interpretation perhaps poses even greater and more persistent challenges. Commercial entities typically rely on the manual curation and organization of research findings into structured databases, which are expensive to create and maintain, forcing companies to limit public access or to charge for use. The resulting lack of transparency creates inefficiencies in the field through unnecessary replication of curation effort and suboptimal communication with clinicians, ultimately hindering development of effective patient treatment plans. Separately, governmental and academic institutions have developed variant interpretation resources, such as the Catalogue of Somatic Mutations in Cancer,[12] ClinVar,[13] and cBioPortal,[14,15] that have drastically improved research efforts and academic discovery; however, these resources do not have well-supported (evidence-based) clinical relevance summaries for cancer variants that can be easily accessed and used by physicians. Several resources provide detailed clinical interpretation of cancer variants (eg, OncoKB,[16] JAX Clinical Knowledgebase,[17] and others), but these databases are either limited by license restrictions or closed curation models. Key Objective Development of clinical genomics pipelines and associated analytical software is needed to meet the growing needs of oncologists for cancer diagnosis and treatment. Knowledge Generated Here we describe methods for using the Clinical Interpretations of Variants in Cancer (CIViC) database to develop the Open-Sourced CIViC Annotation Pipeline (OpenCAP). This resource first describes methods for variant capture and subsequently provides tools for variant annotation. Using OpenCAP, we demonstrated applicability through development of a single-molecule molecular inversion probe capture panel, which was validated against whole-exome sequencing. Relevance Maintenance and continuous improvement of the OpenCAP software will help to serve the needs of researchers and physicians who are using precision oncology to guide treatment of their patients. To address these limitations, we developed a method to identify, capture, and annotate variants using the Clinical Interpretation of Variants in Cancer (CIViC) database.[18] The CIViC database is a freely accessible (public domain content), publicly curated, expert-moderated repository of therapeutic, prognostic, predisposing, and diagnostic information in precision oncology.[19] The database provides a powerful platform for panel development and variant annotation for the following reasons: each variant within CIViC is described by clinical relevance summaries linked to medical literature; the history of curation within CIViC is stored and publicly available to all users; and CIViC has an open-source, open-access applied programming interface (API) for external query. Using the CIViC database and API, we developed the Open-Sourced CIViC Annotation Pipeline (OpenCAP) for creating custom capture panels, executing capture panel sequencing on prospective samples, identifying variants from sequencing data, and annotating variants for clinical relevance.[20] An exemplary clinical capture panel was created using OpenCAP to demonstrate utility. Specifically, variants within the CIViC database were identified based on clinical relevance, and single-molecule molecular inversion probes (smMIPs) were designed to target variants of interest. This panel was used on cancer samples to evaluate design, and identified somatic variants were compared with orthogonal sequencing. Variants identified via smMIP capture were linked back to the CIViC database for clinical annotation (Fig 1). Ultimately, this method could be used to rapidly and efficiently link variants to clinical relevance summaries, enabling the development of custom capture panels for a variety of clinical and research scenarios.
FIG 1.

Methods for Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) development and validation using the Open-Sourced CIViC Annotation Pipeline (OpenCAP). The first series describes CIViC smMIP development. Variants were selected using sequence ontology identification numbers (IDs) and the CIViC Variant Evidence Score. Subsequently, eligible variants were categorized based on length, and smMIP reagents were designed to target regions of interest. The second series describes sample selection and sequencing methods. In total, there were 22 tumor samples derived from 5 tumor subtypes. Of these 27 samples, 15 had tumor and paired normal samples, and 7 were tumor-only samples. The third series shows the analysis used to validate the CIViC smMIP design. Variants were called using the pipeline described in Materials and Methods, and accuracy was attained by comparing variants observed on original sequencing to variants observed using the CIViC smMIP capture panel. Variant allele frequencies (VAFs) across both platforms were also compared. AML, acute myeloid leukemia; CRC, colorectal cancer; HL, Hodgkin lymphoma; HNSCC, head and neck squamous cell carcinoma; SCLC, small-cell lung cancer.

Methods for Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) development and validation using the Open-Sourced CIViC Annotation Pipeline (OpenCAP). The first series describes CIViC smMIP development. Variants were selected using sequence ontology identification numbers (IDs) and the CIViC Variant Evidence Score. Subsequently, eligible variants were categorized based on length, and smMIP reagents were designed to target regions of interest. The second series describes sample selection and sequencing methods. In total, there were 22 tumor samples derived from 5 tumor subtypes. Of these 27 samples, 15 had tumor and paired normal samples, and 7 were tumor-only samples. The third series shows the analysis used to validate the CIViC smMIP design. Variants were called using the pipeline described in Materials and Methods, and accuracy was attained by comparing variants observed on original sequencing to variants observed using the CIViC smMIP capture panel. Variant allele frequencies (VAFs) across both platforms were also compared. AML, acute myeloid leukemia; CRC, colorectal cancer; HL, Hodgkin lymphoma; HNSCC, head and neck squamous cell carcinoma; SCLC, small-cell lung cancer.

MATERIALS AND METHODS

Development of Operating Procedure for OpenCAP

OpenCAP was built to guide users through the development of a custom capture panel linked to CIViC clinical relevance summaries.[20] OpenCAP consists of 5 sections, each with examples and user tutorials. The first section describes CIViC and directs users through the CIViC Web interface. The next section describes methods for building a custom capture panel, which includes identifying pertinent variants within the CIViC database and targeting those variants with probes using curated genomic coordinates. Subsequently, OpenCAP gives a high-level overview of the massively parallel sequencing pipeline, which includes brief summaries for sample procurement, nucleic acid generation, library preparation, and high-throughput sequencing. The final sections describe identifying variants from raw sequencing data and annotating those variants for clinical relevance.

Determining Eligible CIViC Variants for smMIP Capture

Variants in CIViC were filtered using their Variant Evidence Score (required > 20 points) and sequence ontology identification numbers (SOIDs; must be DNA based; Appendix). Variants were also filtered if all evidence supported only germline clinical relevance, evidence was directly conflicting, or a majority of evidence in a container variant (eg, MUTATION) pointed to a hotspot that was already being covered. The remaining variants were eligible for the CIViC smMIP capture panel.

Designing smMIPs for the CIViC Capture Reagents

Variants were further categorized by length. If the variant length was < 250 base pairs, the variant was eligible for hotspot targeting. If the variant was > 250 base pairs, the variant required either sparse or full tiling of the protein coding exons (Appendix). For all variants, smMIPs were designed and synthesized as previously described[23] with the single alteration that the “-double_tile_strands_separately”[24] flag was used with the MIPgen tool to separately capture each strand of DNA surrounding the target.

Rescue and Annotation of Clinically Relevant Variants

Variants called using the CIViC smMIP capture panel were compared with variants called using original sequencing for samples that had matched tumor and normal sequencing. All genomic loci were manually reviewed[23] using both the smMIP aligned Binary Alignment Map (BAM) files and the original aligned BAM files. Variants only identified using smMIP sequencing were grouped into the following 4 categories: germline polymorphism, pipeline artifact (low variant support or poor mapping), variant support on smMIP sequencing but no support on original sequencing, or variant support on both smMIP sequencing and original sequencing. For variants that showed support on smMIP sequencing but no variant support on original sequencing, the binomial probability was used to assess whether ≤ 3 variant-supporting reads would be detected with 95% confidence using the original coverage and the observed smMIP variant allele frequency (VAF). The accession number for the first release of the Database of Genotypes and Phenotypes study was phs001890.v1.p1, and the accession number for first release of the Sequence Read Archive was PRJNA529857.

RESULTS

Identification of Eligible CIViC Variants for smMIP Targeting

At the time of the CIViC smMIP capture panel design, there were 988 variants from 275 genes within the CIViC database with at least 1 evidence item. After filtering based on the Variant Evidence Score and the SOID (Appendix, Data Supplement), smMIPs were designed to cover all eligible CIViC variants. A set of 2,097 probes was developed and tested on control samples. Of these, 70 probes showed poor capture efficiency and were eliminated from the panel. Removal of the underperforming probes affected 32 variants across 16 genes. The final capture reagent targeted 111 CIViC variants spanning approximately 61.5 kb of genomic space (Data Supplement). When compared with other pancancer panels, the CIViC capture panel showed high overlap with previously defined clinical variants. For example, the CIViC smMIP capture panel covered 10 of the 13 well-defined variants on FoundationOne CDx (EGFR: exon 19, L858R, and T790M; BRAF: V600E/K; ERBB2 amplification; KRAS G12/13; BRCA1; and BRCA2).[24] The 3 variants on FoundationOne CDx that were not originally covered by the smMIPs panel (KRAS wild type, NRAS wild type, and ALK rearrangements) have all since attained a Variant Evidence Score that would be sufficient for inclusion in a panel built today. Of the 111 targeted variants, 71 required hotspot targeting, 14 variants required sparse exon tiling, and 26 required full exon tiling. The 111 variants covered by the CIViC smMIP capture panel were based on 1,168 clinically relevant evidence items, whereby 820 evidence items (70%) predicted response to a therapeutic, 232 (20%) detailed prognostic information, 52 (4%) indicated diagnostic information, and 64 (6%) supported predisposition to cancer (Fig 2).
FIG 2.

Regions targeted by the Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probes (smMIPs) are, by design, supported by extensive clinical relevance according to the CIViC database. Variants that were eligible for CIViC smMIP development were divided into various coverage methods based on sequence ontology identification number and length. The bar graph shows the total number of evidence items used for each of the groups parsed by the evidence type.

Regions targeted by the Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probes (smMIPs) are, by design, supported by extensive clinical relevance according to the CIViC database. Variants that were eligible for CIViC smMIP development were divided into various coverage methods based on sequence ontology identification number and length. The bar graph shows the total number of evidence items used for each of the groups parsed by the evidence type.

Tumor Samples Used to Validate CIViC smMIP Design

Samples used to validate the CIViC smMIP capture panel design were derived from 5 different cancer genomic studies (Data Supplement). Tumor and paired normal samples were obtained from 5 individuals with head and neck squamous cell carcinoma (HNSCC), 9 individuals with small-cell lung cancer,[25] and 1 individual with Hodgkin lymphoma (HL). Tumor-only samples were obtained from 1 individual with HL, 1 individual with acute myeloid leukemia,[26] and 5 individuals with colorectal cancer (CRC). In total, 37 samples were evaluated from 22 individuals. Samples from the CRC cohort were formalin-fixed paraffin-embedded, and all other samples were fresh frozen tissue. Each of the 22 individuals had previously undergone whole-exome or whole-genome sequencing, somatic variant calling, and somatic variant refinement via manual review (Appendix). Considering original sequencing, there were 12,602 putative somatic variants called for these 22 samples. The average variant burden was 573 variants per sample, with a range of 2-3,900 variants per sample. Variant coordinates from these samples were compared with the genomic region covered by the CIViC smMIP capture panel to determine potential validating variants. In total, there were 84 variants identified via original sequencing that overlapped with the CIViC smMIP capture panel (Data Supplement).

smMIP Sequencing and Data Analysis

Initial quality check.

The average number of tags captured for all samples was 5.4 million (standard deviation, 3.3 million tags). One HNSCC normal sample failed smMIP capture, 2 HNSCC tumor samples had significantly fewer reads than the rest (ie, > 1 standard deviation), and 1 HL tumor sample had reduced tag complexity relative to the rest (ie, < 600,000 unique captured smMIPs). Sequencing failure for these 4 samples was attributable to poor template quality or quantity and not attributable to the capture reagents. All other samples passed sequencing quality checks. After quality check, 31 samples derived from 19 individuals were eligible for reagent validation. These samples had 65 variants derived from orthogonal sequencing that had overlap with the CIViC smMIP coverage (Fig 3). The average consensus read depth for these 65 variants was 2,942 reads (standard deviation, 4,697 reads).
FIG 3.

Waterfall plot showing extensive overlap between variants observed using original exome or whole-genome sequencing with variants observed using Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) sequencing. Each column represents a sample that had original exome or whole-genome sequencing with subsequent orthogonal validation using the CIViC smMIP sequencing. Rows represent mutated genes across all samples. Numbers within each box represent the variant allele frequency (VAF) observed on original exome or whole-genome sequencing. Red boxes indicate that a variant was observed by CIViC smMIPs and validated with original exome or whole-genome sequencing. Blue boxes indicate that the variant was observed on original exome or whole-genome sequencing but not identified via the CIViC smMIP capture panel. The left panel indicates the number of samples containing a mutation in the indicated gene. AML, acute myeloid leukemia; CRC, colorectal cancer; HL, Hodgkin lymphoma; HNSCC, head and neck squamous cell carcinoma; SCLC, small-cell lung cancer.

Waterfall plot showing extensive overlap between variants observed using original exome or whole-genome sequencing with variants observed using Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) sequencing. Each column represents a sample that had original exome or whole-genome sequencing with subsequent orthogonal validation using the CIViC smMIP sequencing. Rows represent mutated genes across all samples. Numbers within each box represent the variant allele frequency (VAF) observed on original exome or whole-genome sequencing. Red boxes indicate that a variant was observed by CIViC smMIPs and validated with original exome or whole-genome sequencing. Blue boxes indicate that the variant was observed on original exome or whole-genome sequencing but not identified via the CIViC smMIP capture panel. The left panel indicates the number of samples containing a mutation in the indicated gene. AML, acute myeloid leukemia; CRC, colorectal cancer; HL, Hodgkin lymphoma; HNSCC, head and neck squamous cell carcinoma; SCLC, small-cell lung cancer.

Accuracy of CIViC smMIP variant identification compared with exome or genome variant identification.

Of the 65 variants identified on exome sequencing, all but 4 were also identified using CIViC smMIP sequencing (Fig 3). One variant was missed as a result of lack of adequate coverage, 2 variants were missed as a result of low-performing probes, and 1 variant was retrospectively considered ineligible as a result of smMIP design (Appendix). After removing this variant from the list of eligible variants, the CIVIC smMIP capture sequencing attained a 95% sensitivity for variant detection (n = 64 variants).

VAF correlation between CIViC smMIP sequencing and exome or genome sequencing.

VAFs obtained via original sequencing were compared with the VAF obtained using the CIViC smMIPs. To compare VAF quantitation across platforms, the 19 variants obtained from samples that failed the CIViC smMIP sequencing quality check were eliminated (Fig 4A). Subsequently, we eliminated the 4 variants that were not validated using the CIViC smMIP reagents (Fig 4B). When comparing original VAFs to CIViC smMIP VAFs, Pearson correlation for the remaining 61 variants was 0.885. There were several variants whereby the VAF observed by the CIViC smMIP sequencing was lower than that observed by the original sequencing. These outliers were not associated with tumor type, sequencing mass input, average coverage, presence of matched normal, or sample type (Figs 4C to 4F).
FIG 4.

Variant allele frequencies (VAFs) observed using original exome or whole-genome sequencing compared with VAFs observed using Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) sequencing. (A) Correlation of VAF with original sequencing parsed by sequencing status (ie, passed sequencing if total sequencing counts were > 1 standard deviation from the mean and tag complexity was > 600,000 unique captured smMIPs). (B) Correlation of VAF with validation status (ie, true if the variant identified using exome or genome sequencing was identified on CIViC smMIP sequencing). (C) Correlation of VAF parsed by coverage at variant loci. (D) Correlation of VAF parsed by DNA mass input for library construction. (E) Correlation of VAF parsed by presence or absence of matched normal tissue. (F) Correlation of VAF parsed by tumor type. AML, acute myeloid leukemia; CRC, colorectal cancer; HL, Hodgkin lymphoma; OSCC, oral squamous cell carcinoma; SCLC, small-cell lung cancer.

Variant allele frequencies (VAFs) observed using original exome or whole-genome sequencing compared with VAFs observed using Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) sequencing. (A) Correlation of VAF with original sequencing parsed by sequencing status (ie, passed sequencing if total sequencing counts were > 1 standard deviation from the mean and tag complexity was > 600,000 unique captured smMIPs). (B) Correlation of VAF with validation status (ie, true if the variant identified using exome or genome sequencing was identified on CIViC smMIP sequencing). (C) Correlation of VAF parsed by coverage at variant loci. (D) Correlation of VAF parsed by DNA mass input for library construction. (E) Correlation of VAF parsed by presence or absence of matched normal tissue. (F) Correlation of VAF parsed by tumor type. AML, acute myeloid leukemia; CRC, colorectal cancer; HL, Hodgkin lymphoma; OSCC, oral squamous cell carcinoma; SCLC, small-cell lung cancer.

Analysis of Variants Only Identified Using CIViC smMIP Sequencing

Using samples that had sequencing data for both tumor and matched normal (n = 12 samples), we evaluated whether the targeted CIViC smMIP sequencing could identify clinically relevant variants that had not been observed by the original sequencing. There were 273 variants recovered by CIViC smMIP sequencing that were not identified using original sequencing. After manually reviewing these variants within the original exome or genome alignments, 55 variants (20.1%) were identified as germline mutations. smMIP sequencing VAF distribution at 50% and 100% further supported that these variants were germline polymorphisms (Fig 5A). An additional 36 variants (13.2%) were thought to be caused by pipeline artifacts and attributable to assumptions underlying automated callers or alignment problems. The majority of these artifacts were associated with nucleotide repeats in the reference sequence (Fig 5B). There were 171 variants (62.6%) called as somatic using CIViC smMIPs that did not have any variant support on the original sequencing. For these variants, we calculated the binomial probability that ≤ 3 reads would support the variant given the original coverage (number of chances to get a variant supporting read) and the observed smMIP VAF (likelihood that a read would show variant support). If the binomial probability of ≤ 3 variant-supporting reads was > 95%, then it was considered statistically unlikely that a variant would be called using original sequencing data. Using this calculation, 162 variants (94.7%) showed insufficient coverage in the original sequencing for detection (Fig 5C). Finally, 11 variants (4.2%) were not called as somatic on original sequencing but did show some variant support in those original sequencing data. The VAFs observed on original sequencing data were strongly correlated with the VAFs observed using CIViC smMIP sequencing (Pearson’s r = 0.92; Fig 5D). Reviewing manual review files from the original sequencing, we observed that 6 of these variants failed manual review as a result of low VAF, 4 variants had not been called by automated somatic variant callers, and 1 variant failed manual review as a result of a perceived sequencing artifact. In summary, there were 182 potentially clinically relevant somatic variants missed by original sequencing, primarily as a result of insufficient coverage, that contained CIViC variant annotations.
FIG 5.

Analysis of variants rescued by Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) sequencing for samples with both tumor and matched normal. There were 217 variants called as somatic by CIViC smMIP sequencing that were not identified by the original sequencing. All variants were manually reviewed using both CIViC smMIP sequencing data and original sequencing data. (A) During manual review, 55 variants were identified as germline. A histogram shows that the distribution of the smMIP variant allele frequencies (VAFs) for these germline variants was observed at 50% and 100% VAF, indicating heterozygosity and homozygosity, respectively. (B) An additional 36 variants were identified as sequencing artifacts. Most artifacts were either mononucleotide repeats (MN), dinucleotide repeats (DN), or tandem repeats (TR). Other artifacts include multiple mismatches (MM) or multiple variants (MV). (C) During manual review, 162 variants did not show any support in the original sequencing data. Most unsupported variants did not have sufficient coverage to be detected based on a binomial probability of ≤ 3 variant-supporting reads (see Materials and Methods). (D) The remaining 11 variants had variant support in original sequencing but were not called as somatic in final original annotation. The scatter plot shows correlation between original VAF and CIViC smMIP VAF for these variants.

Analysis of variants rescued by Clinical Interpretations of Variants in Cancer (CIViC) single-molecule molecular inversion probe (smMIP) sequencing for samples with both tumor and matched normal. There were 217 variants called as somatic by CIViC smMIP sequencing that were not identified by the original sequencing. All variants were manually reviewed using both CIViC smMIP sequencing data and original sequencing data. (A) During manual review, 55 variants were identified as germline. A histogram shows that the distribution of the smMIP variant allele frequencies (VAFs) for these germline variants was observed at 50% and 100% VAF, indicating heterozygosity and homozygosity, respectively. (B) An additional 36 variants were identified as sequencing artifacts. Most artifacts were either mononucleotide repeats (MN), dinucleotide repeats (DN), or tandem repeats (TR). Other artifacts include multiple mismatches (MM) or multiple variants (MV). (C) During manual review, 162 variants did not show any support in the original sequencing data. Most unsupported variants did not have sufficient coverage to be detected based on a binomial probability of ≤ 3 variant-supporting reads (see Materials and Methods). (D) The remaining 11 variants had variant support in original sequencing but were not called as somatic in final original annotation. The scatter plot shows correlation between original VAF and CIViC smMIP VAF for these variants.

Annotation of CIViC smMIP Capture Panel Somatic Variants Using OpenCAP

Using the OpenCAP annotation software, we developed clinical interpretation reports for all variants observed using the CIViC smMIP capture panel. In total, there were 1,340 variants observed across the 19 samples that passed smMIP sequencing. Of the 1,340 variants observed, 127 had direct matches (chromosome, start, stop, reference, variant) with CIViC annotations (average, 6.7 variants per sample). The OpenCAP output report for variants observed on original sequencing and validated by the CIViC smMIP capture panel for CRC1 is shown in the Data Supplement. For each identified clinical variant, links to external databases, CIViC variant descriptions, associated CIViC assertions, and associated CIViC evidence items are provided. Associated evidence items provide a brief description of the clinical relevance, links to CIViC evidence items, and associated citations. An illustrative output report that displays most OpenCAP features, including CIViC variant descriptions and CIViC assertions, was created using a previously reported patient from the literature[27] (Data Supplement).

DISCUSSION

OpenCAP is a resource for users to develop a custom capture panel that can be easily linked to actively maintained clinical relevance summaries. The methods described by OpenCAP to build a clinical capture panel offer several advantages relative to existing design paradigms. Use of an open-source database provides a systematic mechanism to survey existing literature within precision oncology to identify variants that are relevant for capture. In addition, the public API permits rapid mapping of identified somatic and germline variants to CIViC clinical relevance summaries. Most importantly, the variants covered by CIViC and associated clinical summaries can be updated in real time as knowledge is entered into the database to accommodate new information discovered within the field of precision oncology. The smMIP capture method for sequencing provides inherent error correction capability, scalability to detect ultrasensitive variation, and cost effectiveness within a modular design. Combining the public access CIViC database with an ultrasensitive and versatile capture reagent provides an advantageous and principled method for building precision oncology capture reagents. This approach could enable a standardized framework for detecting and interpreting cancer-relevant genomic variation, lowering barriers to use of genomic analysis in the clinical practice of oncology. For maximal flexibility, OpenCAP describes methods for using both unique molecular identifiers (UMIs) and non–UMI-based probes to capture variants of interest. The CIViC smMIP capture panel used Variant Evidence Scores and SOIDs to identify variants of interest for targeting. However, alternate filtering strategies have been outlined in OpenCAP documents. Regardless of variants targeted for capture, the presented research helped to show that CIViC variants and variant coordinates can be used for accurate capture panel design (95% detection accuracy with Pearson’s r = 0.885 for VAFs). This finding helps to validate that the methods described in OpenCAP can be used to accurately interrogate desired variants of interest. Like all targeted reagents, the preliminary CIViC smMIP design has limitations that can be addressed with future iterations. First, the reagent design is limited by the current knowledge within CIViC. Extensive curation from certain groups (eg, the University Health Network curation of VHL variants) disproportionately increases representation for certain genes, cancers, and variant types. Conversely, lack of curation in certain areas shows a disproportionate decreased representation. To address existing curation disparities, CIVIC has joined the Variant Interpretation for Cancer Consortium (VICC)[28] to integrate multiple variant interpretation knowledgebases into a single meta-knowledgebase. Successful execution of the aims outlined by the VICC would result in harmonization of information from CIViC, the Cancer Genome Interpreter,[29] Clinical Knowledgebase,[30] MolecularMatch, OncoKB,[16] Precision Medicine Knowledgebase,[31] and others. This would allow users to leverage variant interpretations across multiple platforms for building custom capture panels that are linked to clinical relevance summaries. In summary, the methods described here validate that community curated data on clinically relevant cancer variants can provide a systematic and dynamic method for capture reagent design. The curated coordinates in the database accurately map to desired variants, and probes designed using these coordinates show accurate recapitulation of the genomic landscape described by orthogonal sequencing. It is our hope that OpenCAP will provide the research community with a novel method to develop next-generation sequencing–based oncology panels.
  25 in total

1.  A new initiative on precision medicine.

Authors:  Francis S Collins; Harold Varmus
Journal:  N Engl J Med       Date:  2015-01-30       Impact factor: 91.245

2.  MIPgen: optimized modeling and design of molecular inversion probes for targeted resequencing.

Authors:  Evan A Boyle; Brian J O'Roak; Beth K Martin; Akash Kumar; Jay Shendure
Journal:  Bioinformatics       Date:  2014-05-26       Impact factor: 6.937

3.  Actionable, pathogenic incidental findings in 1,000 participants' exomes.

Authors:  Michael O Dorschner; Laura M Amendola; Emily H Turner; Peggy D Robertson; Brian H Shirts; Carlos J Gallego; Robin L Bennett; Kelly L Jones; Mari J Tokita; James T Bennett; Jerry H Kim; Elisabeth A Rosenthal; Daniel S Kim; Holly K Tabor; Michael J Bamshad; Arno G Motulsky; C Ronald Scott; Colin C Pritchard; Tom Walsh; Wylie Burke; Wendy H Raskind; Peter Byers; Fuki M Hisama; Deborah A Nickerson; Gail P Jarvik
Journal:  Am J Hum Genet       Date:  2013-09-19       Impact factor: 11.025

4.  OncoKB: A Precision Oncology Knowledge Base.

Authors:  Debyani Chakravarty; Jianjiong Gao; Sarah M Phillips; Ritika Kundra; Hongxin Zhang; Jiaojiao Wang; Julia E Rudolph; Rona Yaeger; Tara Soumerai; Moriah H Nissan; Matthew T Chang; Sarat Chandarlapaty; Tiffany A Traina; Paul K Paik; Alan L Ho; Feras M Hantash; Andrew Grupe; Shrujal S Baxi; Margaret K Callahan; Alexandra Snyder; Ping Chi; Daniel Danila; Mrinal Gounder; James J Harding; Matthew D Hellmann; Gopa Iyer; Yelena Janjigian; Thomas Kaley; Douglas A Levine; Maeve Lowery; Antonio Omuro; Michael A Postow; Dana Rathkopf; Alexander N Shoushtari; Neerav Shukla; Martin Voss; Ederlinda Paraiso; Ahmet Zehir; Michael F Berger; Barry S Taylor; Leonard B Saltz; Gregory J Riely; Marc Ladanyi; David M Hyman; José Baselga; Paul Sabbatini; David B Solit; Nikolaus Schultz
Journal:  JCO Precis Oncol       Date:  2017-05-16

5.  The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data.

Authors:  Ethan Cerami; Jianjiong Gao; Ugur Dogrusoz; Benjamin E Gross; Selcuk Onur Sumer; Bülent Arman Aksoy; Anders Jacobsen; Caitlin J Byrne; Michael L Heuer; Erik Larsson; Yevgeniy Antipin; Boris Reva; Arthur P Goldberg; Chris Sander; Nikolaus Schultz
Journal:  Cancer Discov       Date:  2012-05       Impact factor: 39.397

6.  The cancer precision medicine knowledge base for structured clinical-grade mutations and interpretations.

Authors:  Linda Huang; Helen Fernandes; Hamid Zia; Peyman Tavassoli; Hanna Rennert; David Pisapia; Marcin Imielinski; Andrea Sboner; Mark A Rubin; Michael Kluk; Olivier Elemento
Journal:  J Am Med Inform Assoc       Date:  2017-05-01       Impact factor: 4.497

7.  CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer.

Authors:  Malachi Griffith; Nicholas C Spies; Kilannin Krysiak; Joshua F McMichael; Adam C Coffman; Arpad M Danos; Benjamin J Ainscough; Cody A Ramirez; Damian T Rieke; Lynzey Kujan; Erica K Barnell; Alex H Wagner; Zachary L Skidmore; Amber Wollam; Connor J Liu; Martin R Jones; Rachel L Bilski; Robert Lesurf; Yan-Yang Feng; Nakul M Shah; Melika Bonakdar; Lee Trani; Matthew Matlock; Avinash Ramu; Katie M Campbell; Gregory C Spies; Aaron P Graubert; Karthik Gangavarapu; James M Eldred; David E Larson; Jason R Walker; Benjamin M Good; Chunlei Wu; Andrew I Su; Rodrigo Dienstmann; Adam A Margolin; David Tamborero; Nuria Lopez-Bigas; Steven J M Jones; Ron Bose; David H Spencer; Lukas D Wartman; Richard K Wilson; Elaine R Mardis; Obi L Griffith
Journal:  Nat Genet       Date:  2017-01-31       Impact factor: 38.330

8.  ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing.

Authors:  Robert C Green; Jonathan S Berg; Wayne W Grody; Sarah S Kalia; Bruce R Korf; Christa L Martin; Amy L McGuire; Robert L Nussbaum; Julianne M O'Daniel; Kelly E Ormond; Heidi L Rehm; Michael S Watson; Marc S Williams; Leslie G Biesecker
Journal:  Genet Med       Date:  2013-06-20       Impact factor: 8.822

9.  Organizing knowledge to enable personalization of medicine in cancer.

Authors:  Benjamin M Good; Benjamin J Ainscough; Josh F McMichael; Andrew I Su; Obi L Griffith
Journal:  Genome Biol       Date:  2014-08-27       Impact factor: 13.583

10.  Recurrent WNT pathway alterations are frequent in relapsed small cell lung cancer.

Authors:  Alex H Wagner; Siddhartha Devarakonda; Zachary L Skidmore; Kilannin Krysiak; Avinash Ramu; Lee Trani; Jason Kunisaki; Ashiq Masood; Saiama N Waqar; Nicholas C Spies; Daniel Morgensztern; Jason Waligorski; Jennifer Ponce; Robert S Fulton; Leonard B Maggi; Jason D Weber; Mark A Watson; Christopher J O'Conor; Jon H Ritter; Rachelle R Olsen; Haixia Cheng; Anandaroop Mukhopadhyay; Ismail Can; Melissa H Cessna; Trudy G Oliver; Elaine R Mardis; Richard K Wilson; Malachi Griffith; Obi L Griffith; Ramaswamy Govindan
Journal:  Nat Commun       Date:  2018-09-17       Impact factor: 14.919

View more
  3 in total

1.  Informatics Tools for Cancer Research and Care: Bridging the Gap Between Innovation and Implementation.

Authors:  Jeremy L Warner; Juli D Klemm
Journal:  JCO Clin Cancer Inform       Date:  2020-09

2.  Real-world progression-free survival (rwPFS) and the impact of PD-L1 and smoking in driver-mutated non-small cell lung cancer (NSCLC) treated with immunotherapy.

Authors:  J Nicholas Bodor; Jessica R Bauman; Elizabeth A Handorf; Eric A Ross; Margie L Clapper; Joseph Treat
Journal:  J Cancer Res Clin Oncol       Date:  2022-06-16       Impact factor: 4.322

3.  Analysis of a Real-World Progression Variable and Related Endpoints for Patients with Five Different Cancer Types.

Authors:  Aracelis Z Torres; Nathan C Nussbaum; Christina M Parrinello; Ariel B Bourla; Bryan E Bowser; Samuel Wagner; David C Tabano; Daniel George; Rebecca A Miksad
Journal:  Adv Ther       Date:  2022-04-17       Impact factor: 4.070

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.