Zuhal Ozcan, Francis A San Lucas, Justin W Wong, Kyle Chang, Konrad H Stopsack, Jerry Fowler, Yasminka A Jakubek, Paul Scheet.
Abstract
MOTIVATION: RNA sequencing of tumor tissue is typically used only to measure gene expression. Here, we present a statistical approach that leverages existing RNA sequencing (RNA-seq) data to also detect somatic copy number alterations (SCNAs), a pervasive phenomenon in human cancers, without the need to sequence the corresponding DNA.
Year: 2022 PMID: 34999743 PMCID: PMC8896613 DOI: 10.1093/bioinformatics/btab861
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Gene-level performance assessment
| Method | BRCA Sens (%) | BRCA Spec (%) | COAD Sens (%) | COAD Spec (%) | GBM Sens (%) | GBM Spec (%) | LUAD Sens (%) | LUAD Spec (%) | LUSC Sens (%) | LUSC Spec (%) | PAAD Sens (%) | PAAD Spec (%) | PRAD Sens (%) | PRAD Spec (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hapLOHseq (RNA-seq) | 71 | 94 | 47 | 99 | 79 | 93 | 74 | 91 | 73 | 89 | 69 | 97 | 45 | 97 |
| hapLOHseq (RNA-seq + imputed genotypes) | 84 | 92 | 79 | 97 | 89 | 92 | 84 | 90 | 88 | 88 | 80 | 95 | 66 | 94 |
| hapLOHseq (WES) | 93 | 89 | 94 | 92 | 94 | 96 | 89 | 92 | 91 | 90 | 94 | 89 | 76 | 97 |
Note: We evaluated the method at the gene level by comparing SCNA status of genes between the RNA-seq-derived SCNAs and the gold standard (array-based analysis) for seven cohorts in the TCGA. Sens, sensitivity; the proportion of genes covered by an SCNA in the gold standard that were also identified by the listed approach. Spec, specificity; the proportion of genes that are not covered by an SCNA event in the gold standard that were also not inferred to be covered by an SCNA by the listed approach.
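The two definitions in the note can be written out as a short sketch. The function name, the set-based inputs and the toy gene labels are illustrative assumptions, not part of hapLOHseq; only the sensitivity and specificity definitions come from the note.

```python
def gene_level_sens_spec(all_genes, gold_genes, called_genes):
    """Gene-level sensitivity and specificity against a gold standard.

    all_genes:    set of every gene evaluated
    gold_genes:   genes covered by an SCNA in the gold standard
    called_genes: genes inferred to be covered by an SCNA by the method
    (all three inputs are hypothetical data structures for illustration)
    """
    gold = gold_genes & all_genes
    called = called_genes & all_genes
    negatives = all_genes - gold

    true_pos = len(gold & called)        # altered in both
    true_neg = len(negatives - called)   # unaltered in both

    sens = true_pos / len(gold) if gold else float("nan")
    spec = true_neg / len(negatives) if negatives else float("nan")
    return sens, spec


# Toy example with made-up gene labels.
all_genes = {"A", "B", "C", "D", "E"}
gold = {"A", "B", "C"}
called = {"A", "B", "D"}
sens, spec = gene_level_sens_spec(all_genes, gold, called)
print(round(sens, 2), spec)  # 0.67 0.5
```

Here two of the three gold-standard genes are recovered (sensitivity 2/3), and one of the two unaltered genes is correctly left uncalled (specificity 1/2).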
Gene-level performance summaries across 28 cancer sites
| Tumor site (abbreviation) (sample size) | Sensitivity (%) | Specificity (%) |
|---|---|---|
| Adrenocortical carcinoma (ACC) ( | 89 | 93 |
| Bladder urothelial carcinoma (BLCA) ( | 83 | 90 |
| Breast invasive carcinoma (BRCA) ( | 84 | 92 |
| Cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) ( | 88 | 94 |
| Cholangiocarcinoma (CHOL) ( | 85 | 95 |
| Colon adenocarcinoma (COAD) ( | 79 | 97 |
| Esophageal carcinoma (ESCA) ( | 91 | 79 |
| Glioblastoma multiforme (GBM) ( | 89 | 92 |
| Head and neck squamous cell carcinoma (HNSC) ( | 86 | 94 |
| Kidney chromophobe (KICH) ( | 94 | 95 |
| Kidney renal clear cell carcinoma (KIRC) ( | 90 | 94 |
| Kidney renal papillary cell carcinoma (KIRP) ( | 92 | 95 |
| Brain lower grade glioma (LGG) ( | 79 | 96 |
| Liver hepatocellular carcinoma (LIHC) ( | 82 | 95 |
| Lung adenocarcinoma (LUAD) ( | 84 | 90 |
| Lung squamous cell carcinoma (LUSC) ( | 88 | 88 |
| Mesothelioma (MESO) ( | 87 | 95 |
| Ovarian serous cystadenocarcinoma (OV) ( | 87 | 87 |
| Pancreatic adenocarcinoma (PAAD) ( | 80 | 95 |
| Pheochromocytoma and paraganglioma (PCPG) ( | 89 | 96 |
| Prostate adenocarcinoma (PRAD) ( | 66 | 94 |
| Rectum adenocarcinoma (READ) ( | 80 | 96 |
| Skin cutaneous melanoma (SKCM) ( | 86 | 94 |
| Stomach adenocarcinoma (STAD) ( | 86 | 87 |
| Testicular germ cell tumors (TGCT) ( | 88 | 93 |
| Thyroid carcinoma (THCA) ( | 85 | 94 |
| Uterine carcinosarcoma (UCS) ( | 85 | 90 |
| Uveal melanoma (UVM) ( | 87 | 98 |
Note: For each cancer site, the study abbreviation, the number of samples analyzed in the cohort and the median gene-level sensitivity and specificity are shown.
Fig. 1. Chromosome arm-level concordance assessment summaries across 28 cancer sites. We identified chromosome arms that were spanned by SCNAs (50%) and, for each arm, evaluated the concordance between RNA-seq and the gold standard. The distribution of the non-acrocentric autosomal chromosome arms (n = 39) across the cancer sites is shown. For each site, a stacked bar plot of the number of samples with concordance-specific chromosome arm-level SCNAs is shown for all 39 chromosome arms.
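One way to read the arm-level call is sketched below. It assumes that "(50%)" means an arm counts as altered when SCNA segments cover at least half of its length, and that concordance is summarized in four categories; the threshold interpretation, the function names and the category labels are all assumptions made for illustration, not the authors' implementation.

```python
def arm_is_altered(arm_start, arm_end, segments, min_fraction=0.5):
    """Return True if SCNA segments cover at least min_fraction of the arm.

    segments: non-overlapping (start, end) half-open intervals on the
    same chromosome as the arm (hypothetical input format).
    """
    covered = 0
    for start, end in segments:
        overlap = min(end, arm_end) - max(start, arm_start)
        if overlap > 0:
            covered += overlap
    return covered / (arm_end - arm_start) >= min_fraction


def concordance_label(rna_altered, gold_altered):
    """Classify one arm in one sample (labels are illustrative)."""
    if rna_altered and gold_altered:
        return "both"
    if rna_altered:
        return "rna_seq_only"
    if gold_altered:
        return "gold_standard_only"
    return "neither"


# Toy arm spanning positions 0-100.
rna = arm_is_altered(0, 100, [(10, 40), (60, 90)])  # 60% covered
gold = arm_is_altered(0, 100, [(90, 150)])          # 10% covered
print(concordance_label(rna, gold))  # rna_seq_only
```

Counting these labels per arm across the samples of a cohort would give the kind of tallies a stacked bar plot displays.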
Fig. 2. Concordance assessment at the genome level (‘genomic burden’) across 28 cancer sites. Genomic burden is defined as the fraction of the genome that is affected by SCNAs. A scatter plot demonstrating the concordance between RNA-seq- and gold standard-derived genomic burden (median) for each cancer site is shown.
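The genomic burden definition can be illustrated with a minimal sketch. The segment format (chromosome, start, end) with half-open coordinates, the function name and the toy genome length are assumptions for illustration; overlapping segments are merged so that no base is counted twice.

```python
def genomic_burden(segments, genome_length):
    """Fraction of the genome covered by SCNA segments.

    segments: iterable of (chrom, start, end) half-open intervals
    genome_length: total length of the genome under consideration
    (both are hypothetical input formats)
    """
    by_chrom = {}
    for chrom, start, end in segments:
        by_chrom.setdefault(chrom, []).append((start, end))

    covered = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping or adjacent: extend the current run.
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start
    return covered / genome_length


# Toy genome of length 1000 with two overlapping chr1 segments.
segs = [("chr1", 0, 50), ("chr1", 40, 100), ("chr2", 0, 100)]
print(genomic_burden(segs, 1000))  # 0.2
```

A per-site summary like the one plotted could then be obtained by taking the median of this value over the samples of a cohort, for example with `statistics.median`.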
hapLOHseq and CaSpER comparison
| Method | BRCA Sens (%) | BRCA Spec (%) | GBM Sens (%) | GBM Spec (%) |
|---|---|---|---|---|
| hapLOHseq (RNA-seq) | 74 | 94 | 81 | 93 |
| hapLOHseq (RNA-seq + imputed genotypes) | 85 | 91 | 89 | 92 |
| CaSpER | 43 | 82 | 59 | 95 |
Note: hapLOHseq and CaSpER performance evaluation. Rows 1–3 show performance results at the gene level obtained by comparing each method to the gold standard (Sivakumar ).
Fig. 3. Clinical efficacy of hapLOHseq results demonstrated using the TCGA BRCA cohort. (A) Recapitulating the genomic burden distribution across different subtypes. Left: from the supplementary material of the TCGA BRCA paper (Cancer Genome Atlas Network, 2012); right: hapLOHseq results, shown as a histogram of sample genomic burden across the cohort grouped by subtype. (B) Frequency of chromosome arm-level alterations in 1q, 5q and 16q as a fraction of the number of samples across different subtypes. (C) Concordance assessment for the five genes that are frequently affected by SCNA events. Rows represent the genes and columns represent the samples in the cohort.