Jim Abraham, Amy B Heimberger, John Marshall, Elisabeth Heath, Joseph Drabick, Anthony Helmstetter, Joanne Xiu, Daniel Magee, Phillip Stafford, Chadi Nabhan, Sourabh Antani, Curtis Johnston, Matthew Oberley, Wolfgang Michael Korn, David Spetzler.
Abstract
Cancer of Unknown Primary (CUP) occurs in 3-5% of patients when standard histological diagnostic tests are unable to determine the origin of metastatic cancer. Typically, CUP is treated empirically and has very poor outcomes, with a median overall survival of less than one year. Gene expression profiling alone has been used to identify the tissue of origin but struggles with the low neoplastic percentage of metastatic sites, which is where identification is often most needed. MI GPSai, a Genomic Prevalence Score, uses DNA sequencing and whole transcriptome data coupled with machine learning to aid in the diagnosis of cancer. The algorithm was trained on genomic data from 34,352 cases and on genomic and transcriptomic data from 23,137 cases, and was validated on 19,555 cases. MI GPSai predicted the tumor type in the labeled data set with an accuracy of over 94% on 93% of cases while deliberating amongst 21 possible categories of cancer. When the second highest prediction is also considered, the accuracy increases to 97%. Additionally, MI GPSai rendered a prediction for 71.7% of CUP cases. Pathologist evaluation of discrepancies between the submitted diagnosis and the MI GPSai prediction resulted in a change of diagnosis 41.3% of the time. MI GPSai provides clinically meaningful information in a large proportion of CUP cases, and its inclusion in clinical routine could improve diagnostic fidelity. Moreover, all genomic markers essential for therapy selection are assessed in this assay, maximizing the clinical utility for patients within a single test.
Year: 2021 PMID: 33465745 PMCID: PMC7815805 DOI: 10.1016/j.tranon.2021.101016
Source DB: PubMed Journal: Transl Oncol ISSN: 1936-5233 Impact factor: 4.243
Landscape of tissue of origin approaches.
| Assay | Cancer Categories | N Independent Test Set | Accuracy (%) | Cases Called (%) |
|---|---|---|---|---|
| Caris MI GPSai (2020) | 21 | 13,661 | 94.7 | 93 |
| PCAWG (2020) | 14 | 1436 | 88 | 100 |
| MSK IMPACT (2019) | 22 | 11,644 | 74.1 | 100 |
| Cancer Genetics Tissue of Origin (2012) | 9 | 27 | 94.1 | 89 |
| Biotheranostics CancerTYPE ID (2011) | 30 | 187 | 83 | 100 |
| Park SY (2007) | 7 | 60 | 75 | 78 |
| Dennis JL (2005) | 7 | 130 | 88 | 100 |
| Brown RW (1997) | 5 | 128 | 66 | 86 |
| Gamble AR (1993) | 14 | 100 | 70 | 100 |
Fig. 1 CONSORT diagram. The DNA and RNA components of MI GPSai were trained using a combined 57,489 patients and then validated on 4,602 non-CUP and 185 CUP patients to determine optimal performance settings. Following this evaluation, MI GPSai rendered a prediction on routinely profiled cases, resulting in the final prospective validation set and CUP cases.
Summary of performance in the independent validation cohort at the selected threshold.
| Category | N | Call Rate (%) | Sensitivity (%) |
|---|---|---|---|
| Global | 4602 | 93.3 | 93.3 |
| Primary Specimen | 2544 | 94 | 94.1 |
| Metastatic Specimen | 1969 | 92.2 | 92.5 |
| Percent Tumor >= 20, <= 50 | 2885 | 92.7 | 93.4 |
| Percent Tumor > 50, <= 80 | 1657 | 94.1 | 93.1 |
| Percent Tumor > 80 | 54 | 100 | 100 |
Summary of algorithm performance in the prospective validation cohort.
| Category | N | Above Threshold | Call Rate (%) | Sensitivity in Top 1 (%) | Sensitivity in Top 2 (%) | Sensitivity in Top 3 (%) | Sensitivity in Top 4 (%) | Sensitivity in Top 5 (%) | Rule Outs / Case | Rule Out Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Global | 13,661 | 12,699 | 93 | 94.7 | 97.2 | 97.9 | 98.1 | 98.2 | 17.6 | 99.9 |
| Primary Specimen | 7521 | 7087 | 94.2 | 96.1 | 98.2 | 98.7 | 98.8 | 98.9 | 17.8 | 100 |
| Metastatic Specimen | 5942 | 5426 | 91.3 | 93 | 96 | 97 | 97.2 | 97.4 | 17.4 | 99.9 |
| Percent Tumor < 20 | 4 | 3 | 75 | 100 | 100 | 100 | 100 | 100 | 18.7 | 100 |
| Percent Tumor >= 20, <= 50 | 8227 | 7636 | 92.8 | 94.5 | 97 | 97.8 | 97.9 | 98 | 17.4 | 99.9 |
| Percent Tumor > 50, <= 80 | 5189 | 4835 | 93.2 | 95 | 97.7 | 98.2 | 98.4 | 98.5 | 17.9 | 100 |
| Percent Tumor > 80 | 241 | 225 | 93.4 | 96 | 96.4 | 96.4 | 96.4 | 96.9 | 18 | 99.9 |
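The call rate in the table above is the share of cases whose prediction clears the reporting threshold, and the top-k sensitivities are computed over those called cases. The following is a minimal Python sketch of both quantities, not the authors' implementation. The function names, the toy scores, and the choice to apply the 0.835 threshold (taken from the figure captions) to the highest-scoring category are assumptions made for illustration.

```python
def call_rate(n_cases, n_above_threshold):
    """Percentage of cases for which a prediction is reported."""
    return 100.0 * n_above_threshold / n_cases


def top_k_sensitivity(cases, k, threshold=0.835):
    """Percentage of called cases whose true label is among the k best scores.

    cases: list of (true_label, {label: score}) pairs (hypothetical layout).
    A case counts as called when its highest score exceeds the threshold
    (assumed rule; the source only states "score > 0.835").
    """
    called = [(truth, scores) for truth, scores in cases
              if max(scores.values()) > threshold]
    if not called:
        return float("nan")
    hits = 0
    for truth, scores in called:
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += truth in ranked
    return 100.0 * hits / len(called)


# (N, above threshold) pairs copied from the table above.
table_counts = {
    "Global": (13661, 12699),
    "Primary Specimen": (7521, 7087),
    "Metastatic Specimen": (5942, 5426),
}

# Invented toy scores, for illustration only.
toy_cases = [
    ("Colon", {"Colon": 0.95, "Gastroesophageal": 0.03, "Pancreas": 0.02}),
    ("Pancreas", {"Colon": 0.02, "Gastroesophageal": 0.86, "Pancreas": 0.12}),
    ("Colon", {"Colon": 0.50, "Gastroesophageal": 0.30, "Pancreas": 0.20}),
    ("Gastroesophageal", {"Colon": 0.04, "Gastroesophageal": 0.90, "Pancreas": 0.06}),
]

if __name__ == "__main__":
    for name, (n, above) in table_counts.items():
        print(name, round(call_rate(n, above), 1))
    for k in (1, 2):
        print("top", k, round(top_k_sensitivity(toy_cases, k), 1))
```

Applied to the count columns, `call_rate` reproduces the reported Call Rate values (for example 12,699 / 13,661 rounds to 93.0%). In the toy data the third case falls below the threshold and is excluded, which is why widening from top 1 to top 2 changes sensitivity but not the call rate.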
Summary of algorithm performance in the prospective validation cohort by cancer category.
| Category | N | Call Rate (%) | Sensitivity (%) | PPV (%) | Rule Out Accuracy (%) |
|---|---|---|---|---|---|
| Breast Adenocarcinoma | 1533 | 98 | 98.4 | 99 | 100 |
| Central Nervous System Cancer | 445 | 99.8 | 99.8 | 100 | 100 |
| Cervical Adenocarcinoma | 60 | 51.7 | 38.7 | 66.7 | 98 |
| Cholangiocarcinoma | 363 | 73.8 | 69.4 | 83 | 99.7 |
| Colon Adenocarcinoma | 2119 | 97 | 98.5 | 98.2 | 100 |
| Gastroesophageal Adenocarcinoma | 613 | 84.5 | 90.9 | 89.5 | 99.9 |
| GIST | 23 | 95.7 | 100 | 95.7 | 100 |
| Hepatocellular Carcinoma | 66 | 84.9 | 92.9 | 96.3 | 99.7 |
| Lung Adenocarcinoma | 2287 | 95 | 96.4 | 93.6 | 100 |
| Melanoma | 373 | 96.5 | 99.7 | 99.7 | 100 |
| Meningioma | 21 | 90.5 | 100 | 95 | 100 |
| Ovarian Granulosa Cell Tumor | 25 | 88 | 95.5 | 95.5 | 100 |
| Ovarian, Fallopian Tube Adenocarcinoma | 1493 | 91.6 | 92.5 | 94.3 | 99.9 |
| Pancreas Adenocarcinoma | 815 | 87.6 | 91.9 | 87.7 | 100 |
| Prostate Adenocarcinoma | 556 | 97.1 | 99.1 | 98.7 | 100 |
| Renal Cell Carcinoma | 176 | 92.6 | 95.7 | 96.9 | 99.8 |
| Squamous Cell Carcinoma | 1193 | 93 | 93.5 | 93.4 | 99.9 |
| Thyroid Cancer | 74 | 85.1 | 85.7 | 91.5 | 99.2 |
| Urothelial Carcinoma | 354 | 90.7 | 85.4 | 96.1 | 99.9 |
| Uterine Endometrial Adenocarcinoma | 989 | 89.4 | 91.4 | 89.7 | 100 |
| Uterine Sarcoma | 83 | 83.1 | 98.6 | 94.4 | 100 |
Fig. 2 Prediction matrix in the prospective validation set. Each row shows the percentage of the actual disease types observed when MI GPSai achieves a score > 0.835. The diagonal represents the PPV for the given disease type. Blank cells have values between 0 and 1.
Fig. 3 Confusion matrix in the prospective validation set. Each column shows the observed predictions for each disease type when MI GPSai achieves a score > 0.835. The diagonal represents the sensitivity for the given disease type. Blank cells have values between 0 and 1.
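The two matrices in Fig. 2 and Fig. 3 can be read as two normalizations of one table of counts: dividing each row (predicted type) by its total puts PPV on the diagonal, and dividing each column (actual type) by its total puts sensitivity on the diagonal. The Python sketch below illustrates this under stated assumptions. The three-category counts are invented, the helper names are hypothetical, and the layout `counts[predicted][actual]` is chosen to match the row and column descriptions in the captions.

```python
def row_percent(counts):
    """Normalize each row to percentages (rows = predicted type)."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([100.0 * v / total if total else 0.0 for v in row])
    return result


def col_percent(counts):
    """Normalize each column to percentages (columns = actual type)."""
    totals = [sum(col) for col in zip(*counts)]
    return [[100.0 * v / t if t else 0.0 for v, t in zip(row, totals)]
            for row in counts]


def diagonal(matrix):
    """Return the diagonal entries of a square matrix."""
    return [matrix[i][i] for i in range(len(matrix))]


# Invented counts for three placeholder categories A, B, C;
# counts[predicted][actual], restricted to cases above the threshold.
counts = [
    [80, 15, 5],   # predicted A
    [10, 30, 0],   # predicted B
    [10, 5, 45],   # predicted C
]

ppv = diagonal(row_percent(counts))          # as in the Fig. 2 diagonal
sensitivity = diagonal(col_percent(counts))  # as in the Fig. 3 diagonal

if __name__ == "__main__":
    print("PPV:", ppv)
    print("Sensitivity:", sensitivity)
```

Category B shows why the two figures differ: 30 of 40 predictions of B are correct (PPV 75%), but only 30 of 50 actual B cases are recovered (sensitivity 60%).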
Fig. 4 A clinical example showing a representative case in which the pathological diagnosis was reassigned based on MI GPSai predictions using Whole Exome and Whole Transcriptome Sequencing (WES, WTS) data. (A) Molecular profiling was performed using WES and WTS data, which were then routed into the MI GPSai pipeline for diagnostic predictions. (B) The whole transcriptome expression data were then used to select lineage-specific gene expression to guide immunohistochemical antibody selection, the current gold standard for lineage assignment. In the example provided, the mean RNA expression of Uroplakin II and GATA3 in the urothelial carcinoma cases in our database is relatively high (box plots). In the specimen under consideration (red line), Uroplakin II and GATA3 RNA expression is high. (C) and (D) Immunohistochemical evaluation of the tumor with clinically validated antibodies against Uroplakin II and GATA3 confirmed lineage-specific protein expression diagnostic of urothelial carcinoma. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)