Douglas Meyer1, Jacob Kames1, Haim Bar2, Anton A Komar3, Aikaterini Alexaki1, Juan Ibla4, Ryan C Hunt1, Luis V Santana-Quintero5, Anton Golikov5, Michael DiCuccio6, Chava Kimchi-Sarfaty7.
Abstract
BACKGROUND: Gene expression is highly variable across tissues of multi-cellular organisms, influencing the codon usage of the tissue-specific transcriptome. Cancer disrupts the gene expression pattern of healthy tissue, resulting in altered codon usage preferences. Codon usage changes in cancer, as they relate to codon demand and tRNA supply, are of growing interest.
Keywords: Cancer transcriptome; CancerCoCoPUTs; Codon pair; Codon usage; Invasive ductal carcinoma; Invasive lobular carcinoma; Relative synonymous codon usage (RSCU); Survival analysis; Synonymous codons; The Cancer Genome Atlas (TCGA)
Year: 2021 PMID: 34321100 PMCID: PMC8317675 DOI: 10.1186/s13073-021-00935-6
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1 Primary tumor and normal tissue-specific codon and codon pair usage. A, B Euclidean distance dendrogram which clusters tissues based on codon usage (A) or codon pair usage (B). Distance between tissues is reflected by the height of the parent node. Codon and codon pair usage values reflect the median tissue values for each primary tumor and normal tissue type
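The distance step behind such a dendrogram can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the tissue names and usage values are hypothetical, and the linkage method used to build the full tree is not stated in the legend, so only the pairwise Euclidean distances and the first (lowest) merge are shown.

```python
from itertools import combinations
from math import dist

# Hypothetical per-thousand usage of three codons in three tissues (not from the paper).
usage = {
    "tissue_a": [10.0, 20.0, 30.0],
    "tissue_b": [10.0, 21.0, 30.0],
    "tissue_c": [14.0, 23.0, 30.0],
}

def pairwise_distances(profiles):
    """Return {(name1, name2): Euclidean distance} for every pair of profiles."""
    return {
        (a, b): dist(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }

distances = pairwise_distances(usage)
# In agglomerative clustering the closest pair is joined first,
# so it sits under the lowest parent node of the dendrogram.
first_merge = min(distances, key=distances.get)
```

Here `tissue_a` and `tissue_b` differ in a single codon and would be joined first; a real analysis would use all 64 codons (or all codon pairs) per tissue.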
Fig. 2 Aggregate normal vs. cancer codon and codon pair usage comparison for select tissues. A, B Scatter plots comparing codon usage between prostate adenocarcinoma and normal prostate tissue (A) and between cholangiocarcinoma and normal bile duct tissue (B). Each red point represents a codon. Codons above the black diagonal line are more frequent in cancer tissue than normal tissue. The mean square error (MSE) value is noted in the top left of the graph. A higher MSE value indicates more difference between codon usage in the primary tumor tissue and codon usage in normal tissue. C, D Principal component analysis for codon (C) and codon pair (D) usage in normal lung tissue, non-small cell lung cancer tissues, and genomics. Genomic codon and codon pair usage values are not transcriptome weighted. E, F Euclidean distance dendrograms based on tissue-specific codon usage (E) or codon pair usage (F)
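As a rough illustration of the MSE reported in panels A and B, the sketch below takes the mean of the squared per-codon differences between a tumor and a normal usage profile. The function name and the three-codon profiles are hypothetical, and the exact units and normalization used in the paper are assumptions here.

```python
def mean_square_error(tumor, normal):
    """Mean of squared per-codon usage differences between two profiles."""
    if tumor.keys() != normal.keys():
        raise ValueError("profiles must cover the same codons")
    return sum((tumor[c] - normal[c]) ** 2 for c in tumor) / len(tumor)

# Hypothetical usage per thousand for three codons (illustrative values only).
normal = {"GGT": 10.0, "CGG": 11.0, "TGT": 10.0}
tumor = {"GGT": 12.0, "CGG": 12.0, "TGT": 9.0}

mse = mean_square_error(tumor, normal)  # (4 + 1 + 1) / 3
```

Because the differences are squared, a few strongly shifted codons dominate the value, which is consistent with reading a higher MSE as a larger overall departure from normal tissue.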
Codon usage differences between each cancer and its respective normal tissue
| Cancer name | Higher in cancer or normal | Codon | % difference |
|---|---|---|---|
| Aggregate transitional cell carcinoma—bladder | Cancer | CGG | 5.42 |
| Aggregate transitional cell carcinoma—bladder | Normal | TGT | 5.23 |
| Transitional cell carcinoma—bladder | Cancer | CGG | 4.76 |
| Transitional cell carcinoma—bladder | Normal | TGT | 5.03 |
| Papillary transitional cell carcinoma—bladder | Cancer | CGG | 8.26 |
| Papillary transitional cell carcinoma—bladder | Normal | CCT | 9.28 |
| Aggregate carcinoma—breast | Cancer | GGT | 15.83 |
| Aggregate carcinoma—breast | Normal | TGT | 7.44 |
| Ductal carcinoma—breast | Cancer | GGT | 15.06 |
| Ductal carcinoma—breast | Normal | TGT | 7.60 |
| Lobular carcinoma—breast | Cancer | GGT | 19.60 |
| Lobular carcinoma—breast | Normal | TTA | 8.93 |
| Duct and lobular carcinoma—breast | Cancer | GGT | 28.12 |
| Duct and lobular carcinoma—breast | Normal | TGT | 5.01 |
| Colorectal adenocarcinoma | Cancer | CGT | 11.28 |
| Colorectal adenocarcinoma | Normal | TGC | 8.23 |
| Left colorectal adenocarcinoma | Cancer | CGT | 11.39 |
| Left colorectal adenocarcinoma | Normal | TGC | 7.55 |
| Right colorectal adenocarcinoma | Cancer | CGT | 12.23 |
| Right colorectal adenocarcinoma | Normal | TGC | 10.40 |
| Adenocarcinoma—endometrium | Cancer | GCG | 6.06 |
| Adenocarcinoma—endometrium | Normal | CAA | 8.52 |
| Endometrioid adenocarcinoma | Cancer | GCG | 6.07 |
| Endometrioid adenocarcinoma | Normal | CAA | 9.01 |
| Serous cystadenocarcinoma—endometrium | Cancer | GCG | 6.86 |
| Serous cystadenocarcinoma—endometrium | Normal | CAA | 7.22 |
| Squamous cell carcinoma—head and neck | Cancer | TCC | 6.20 |
| Squamous cell carcinoma—head and neck | Normal | CAC | 4.78 |
| Squamous cell carcinoma—esophagus | Cancer | CGC | 19.16 |
| Squamous cell carcinoma—esophagus | Normal | TAT | 22.41 |
| Esophageal adenocarcinoma | Cancer | CGG | 17.63 |
| Esophageal adenocarcinoma | Normal | TAT | 18.16 |
| Clear cell renal cell carcinoma | Cancer | CGC | 6.67 |
| Clear cell renal cell carcinoma | Normal | ATA | 6.57 |
| Papillary renal cell carcinoma | Cancer | CGC | 11.36 |
| Papillary renal cell carcinoma | Normal | TTA | 13.01 |
| Chromophobe renal cell carcinoma | Cancer | CGC | 10.11 |
| Chromophobe renal cell carcinoma | Normal | TGC | 10.03 |
| Hepatocellular carcinoma | Cancer | CGG | 24.19 |
| Hepatocellular carcinoma | Normal | TGT | 28.55 |
| Cholangiocarcinoma | Cancer | CGG | 47.48 |
| Cholangiocarcinoma | Normal | TGT | 61.40 |
| Adenocarcinoma—lung | Cancer | CGT | 11.77 |
| Adenocarcinoma—lung | Normal | TGC | 13.07 |
| Squamous cell carcinoma—lung | Cancer | CGT | 16.52 |
| Squamous cell carcinoma—lung | Normal | TGC | 17.95 |
| Adenocarcinoma with mixed subtypes—lung | Cancer | CGT | 11.86 |
| Adenocarcinoma with mixed subtypes—lung | Normal | TGT | 12.38 |
| Bronchioloalveolar carcinoma | Cancer | TTA | 12.15 |
| Bronchioloalveolar carcinoma | Normal | TGC | 10.07 |
| Papillary adenocarcinoma—lung | Cancer | CGT | 10.82 |
| Papillary adenocarcinoma—lung | Normal | TGT | 11.76 |
| Mucinous adenocarcinoma—lung | Cancer | CGG | 8.76 |
| Mucinous adenocarcinoma—lung | Normal | CCT | 8.90 |
| Prostate adenocarcinoma | Cancer | GGT | 3.74 |
| Prostate adenocarcinoma | Normal | ATA | 3.57 |
| Adenocarcinoma—stomach | Cancer | TTA | 12.74 |
| Adenocarcinoma—stomach | Normal | TGC | 5.62 |
| Intestinal type adenocarcinoma—stomach | Cancer | TTA | 12.45 |
| Intestinal type adenocarcinoma—stomach | Normal | TGC | 5.82 |
| Diffuse type carcinoma—stomach | Cancer | TTA | 13.49 |
| Diffuse type carcinoma—stomach | Normal | GTC | 5.68 |
| Tubular adenocarcinoma—stomach | Cancer | TTA | 13.51 |
| Tubular adenocarcinoma—stomach | Normal | TGC | 6.88 |
This table describes the most pronounced codon usage differences for each cancer type based on median transcriptome-weighted codon usage comparison between each cancer type and its respective normal tissue type. For each cancer type, one codon with higher usage in primary tumor tissue and one codon with higher usage in normal tissue are listed. More codon differences can be found in Additional File 4: Table S3
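The selection of one codon per direction can be sketched as below. This is an illustration under stated assumptions: the denominator of the percent difference is taken to be normal-tissue usage, which this excerpt does not confirm, and the function names and usage values are hypothetical.

```python
def percent_difference(cancer, normal):
    """Signed percent difference per codon, relative to normal usage (assumed denominator)."""
    return {c: 100.0 * (cancer[c] - normal[c]) / normal[c] for c in normal}

def most_pronounced(cancer, normal):
    """Return (codon most elevated in cancer, codon most elevated in normal)."""
    diff = percent_difference(cancer, normal)
    return max(diff, key=diff.get), min(diff, key=diff.get)

# Hypothetical median usage per thousand (illustrative values only).
normal = {"GGT": 10.0, "TGT": 10.0, "CAA": 20.0}
cancer = {"GGT": 12.0, "TGT": 9.0, "CAA": 20.0}

higher_in_cancer, higher_in_normal = most_pronounced(cancer, normal)
```

With these made-up values, GGT is 20% higher in cancer and TGT is 10% lower, so they would be the two codons listed for this tissue.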
Fig. 3 Aggregate comparison for select primary tumor types. A, B Principal component analysis for liver and bile duct tissues based on codon usage (A) and codon pair usage (B). C, D Euclidean distance dendrogram for liver and bile duct tissues based on codon usage (C) and codon pair usage (D). E, F Principal component analysis for colorectal tissues based on codon usage (E) and codon pair usage (F). “Right” colon refers to the ascending colon, cecum, and hepatic flexure of the colon. “Left” colon refers to the descending colon, splenic flexure of the colon, sigmoid colon, rectosigmoid junction, and rectum. G, H Euclidean distance dendrograms for colorectal tissues based on codon usage (G) and codon pair usage (H). I, J Principal component analysis for gastric and esophageal tissues based on codon (I) and codon pair usage (J). K, L Euclidean distance dendrograms for gastric and esophageal tissues based on codon (K) and codon pair usage (L)
Fig. 4 Change in codon usage and change in RSCU in breast cancers. A Scatterplot representing codon usage difference between normal breast tissue and aggregate breast cancer based on the median tissue values. Values along the x-axis represent the codon usage per thousand in normal breast tissue. Values along the y-axis represent the percent difference between aggregate breast cancer and normal breast usage. B–D Scatterplots representing the correlation between change in relative synonymous codon usage (RSCU) of GGT and its synonymous codons GGA (B), GGC (C), and GGG (D). Each point represents a change in individual patients (n = 107). p-value text appears green where the null hypothesis may be rejected (see the “Methods” section for the explanation of the Wald test used). E Scatterplot representing the codon usage difference between normal breast tissue and invasive ductal carcinoma (IDC) of the breast based on the median tissue values. Values along the x-axis represent the codon usage per thousand in normal breast tissue. Values along the y-axis represent the percent difference between IDC and normal breast usage. F–H Scatterplots representing the correlation between change in relative synonymous codon usage (RSCU) of GGT and its synonymous codons GGA (F), GGC (G), and GGG (H). Each point represents a change in individual IDC patients (n = 85). p-value text appears green where the null hypothesis may be rejected (see the “Methods” section for the explanation of the Wald test used). I Scatterplot representing the codon usage difference between normal breast tissue and invasive lobular carcinoma (ILC) based on the median tissue values. J–L Scatterplots representing the correlation between change in RSCU of GGT and its synonymous codons GGA (J), GGC (K), and GGG (L). Each point represents a codon change in individual ILC patients (n = 7). p-value text appears green where the null hypothesis may be rejected (see the “Methods” section for the explanation of the Wald test used). M Scatterplot representing the codon usage difference between normal breast tissue and mixed invasive ductal and lobular carcinoma (IDLC) based on the median tissue values. N–P Scatterplots representing the correlation between change in RSCU of GGT and its synonymous codons GGA (N), GGC (O), and GGG (P). Each point represents the change in individual IDLC patients (n = 9). p-value text appears green where the null hypothesis may be rejected (see the “Methods” section for the explanation of the Wald test used)
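For readers unfamiliar with RSCU, the sketch below applies the standard definition (a codon's count divided by the mean count of all codons encoding the same amino acid) to the four glycine codons discussed in this figure. The counts are hypothetical, and whether the paper computes RSCU from transcriptome-weighted counts in exactly this way is an assumption.

```python
GLYCINE = ("GGT", "GGC", "GGA", "GGG")

def rscu(counts, family):
    """RSCU of each codon in one synonymous family: count / mean count of the family."""
    total = sum(counts[c] for c in family)
    if total == 0:
        raise ValueError("no observations for this codon family")
    mean = total / len(family)
    return {c: counts[c] / mean for c in family}

# Hypothetical codon counts for one patient (illustrative values only).
normal_counts = {"GGT": 10, "GGC": 30, "GGA": 20, "GGG": 20}
tumor_counts = {"GGT": 20, "GGC": 25, "GGA": 20, "GGG": 15}

normal_rscu = rscu(normal_counts, GLYCINE)
tumor_rscu = rscu(tumor_counts, GLYCINE)
# Per-patient change in RSCU, the quantity plotted on each axis of the scatterplots.
delta = {c: tumor_rscu[c] - normal_rscu[c] for c in GLYCINE}
```

RSCU values within a family always sum to the number of synonymous codons, so the changes sum to zero: any rise in GGT must be offset by a fall in at least one of GGA, GGC, or GGG.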
Summary of MSE variation among patients with each cancer type
| Cancer type | Mean | Min | 25th percentile | Median | 75th percentile | Max | Range | Number of patients |
|---|---|---|---|---|---|---|---|---|
| Prostate adenocarcinoma | 3.88 | 0.04 | 0.24 | 0.42 | 1.21 | 56.14 | 56.10 | 50 |
| Squamous cell carcinoma—head and neck | 3.24 | 0.26 | 1.25 | 1.84 | 3.46 | 35.24 | 34.97 | 40 |
| Hepatocellular carcinoma | 4.21 | 0.40 | 1.41 | 2.96 | 6.26 | 19.59 | 19.19 | 49 |
| Clear cell renal cell carcinoma | 1.02 | 0.09 | 0.27 | 0.53 | 0.84 | 15.59 | 15.50 | 71 |
| Esophageal adenocarcinoma | 5.35 | 0.17 | 1.50 | 4.67 | 7.05 | 15.51 | 15.33 | 7 |
| Aggregate carcinoma—breast | 1.62 | 0.17 | 0.46 | 0.92 | 1.91 | 12.30 | 12.13 | 107 |
| Ductal carcinoma—breast | 1.60 | 0.17 | 0.46 | 0.86 | 1.89 | 12.30 | 12.13 | 85 |
| Colorectal adenocarcinoma | 1.74 | 0.18 | 0.42 | 0.76 | 1.44 | 11.76 | 11.59 | 46 |
| Right colorectal adenocarcinoma | 2.95 | 0.34 | 0.51 | 1.22 | 3.66 | 11.76 | 11.42 | 15 |
| Duct and lobular carcinoma—breast | 2.07 | 0.37 | 1.01 | 1.51 | 1.99 | 7.63 | 7.26 | 9 |
| Adenocarcinoma—endometrium | 1.07 | 0.18 | 0.40 | 0.55 | 1.20 | 7.43 | 7.25 | 23 |
| Endometrioid adenocarcinoma | 1.20 | 0.18 | 0.40 | 0.59 | 1.28 | 7.43 | 7.25 | 19 |
| Cholangiocarcinoma | 9.42 | 6.62 | 6.95 | 9.95 | 10.64 | 13.58 | 6.96 | 9 |
| Adenocarcinoma—stomach | 1.98 | 0.39 | 0.74 | 1.66 | 2.68 | 7.31 | 6.92 | 27 |
| Papillary renal cell carcinoma | 0.63 | 0.07 | 0.20 | 0.44 | 0.71 | 4.79 | 4.73 | 31 |
| Chromophobe renal cell carcinoma | 1.06 | 0.10 | 0.32 | 0.64 | 1.43 | 4.47 | 4.37 | 23 |
| Aggregate transitional cell carcinoma—bladder | 1.22 | 0.14 | 0.47 | 1.02 | 1.67 | 3.95 | 3.81 | 18 |
| Transitional cell carcinoma—bladder | 1.19 | 0.14 | 0.45 | 0.99 | 1.58 | 3.95 | 3.81 | 17 |
| Adenocarcinoma—lung | 1.22 | 0.18 | 0.60 | 1.01 | 1.77 | 3.54 | 3.36 | 51 |
| Squamous cell carcinoma—lung | 1.34 | 0.03 | 0.64 | 1.14 | 1.93 | 3.30 | 3.27 | 48 |
| Bronchioloalveolar carcinoma | 1.67 | 0.51 | 0.73 | 0.96 | 2.25 | 3.54 | 3.03 | 3 |
| Left colorectal adenocarcinoma | 0.83 | 0.18 | 0.43 | 0.72 | 1.04 | 2.64 | 2.46 | 18 |
| Lobular carcinoma—breast | 1.23 | 0.17 | 0.82 | 1.01 | 1.61 | 2.60 | 2.43 | 7 |
| Intestinal type adenocarcinoma—stomach | 1.13 | 0.39 | 0.46 | 0.69 | 1.31 | 2.81 | 2.41 | 5 |
| Adenocarcinoma with mixed subtypes—lung | 1.22 | 0.39 | 0.46 | 1.22 | 1.98 | 2.07 | 1.68 | 4 |
| Diffuse type carcinoma—stomach | 2.54 | 1.51 | 2.39 | 2.61 | 3.07 | 3.13 | 1.63 | 5 |
| Papillary adenocarcinoma—lung | 1.33 | 0.88 | 0.99 | 1.09 | 1.55 | 2.00 | 1.12 | 3 |
| Mucinous adenocarcinoma—lung | 0.78 | 0.23 | 0.71 | 0.94 | 1.01 | 1.04 | 0.81 | 4 |
| Serous cystadenocarcinoma—endometrium | 0.49 | 0.26 | 0.42 | 0.51 | 0.58 | 0.69 | 0.43 | 4 |
This table summarizes the MSE computed for each patient with each of 29 cancer types based on codon usage. The values presented here describe the spread of MSE values within each cancer type. Columns include “Min,” “Max,” “Range,” “Median,” and “Mean.” The number of patients examined for each cancer type is given in the “Number of patients” column. The 25th and 75th percentiles refer to the first and third quartiles, respectively, and may not be useful for cancer types with low patient numbers
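The columns of this table could be derived from per-patient MSE values roughly as follows. This is a sketch using Python's standard `statistics` module; the input values are hypothetical, and the quartile interpolation method (`"inclusive"` here) is an assumption, since different tools compute quartiles slightly differently.

```python
from statistics import mean, median, quantiles

def summarize(values):
    """Summary statistics matching the table columns for one cancer type."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
        "range": max(values) - min(values),
        "n": len(values),
    }

# Hypothetical per-patient MSE values for one cancer type (illustrative only).
mse_values = [0.5, 1.0, 1.5, 2.0, 5.0]
summary = summarize(mse_values)
```

A single high value pulls the mean above the median, the same pattern seen in the prostate adenocarcinoma row, where the mean (3.88) far exceeds the median (0.42) because of a maximum of 56.14.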
Fig. 5 Codon and codon pair usage differences in prostate adenocarcinoma patients. A–D Representative scatterplots for 4 prostate adenocarcinoma patients. Case numbers are assigned in order of decreasing MSE (case 1 refers to the patient with the highest MSE while case 50 refers to the patient with the lowest MSE). E, G PCA based on codon usage in each patient’s normal prostate tissue (E) or primary tumor tissue (G). Cases with high MSE are colored red, and cases with low MSE are colored blue. F, H PCA based on codon pair usage in each patient’s normal prostate tissue (F) or primary tumor tissue (H). Cases with high MSE (> 16) are colored red, and cases with low MSE (< 3) are colored blue
Fig. 6 Kaplan-Meier analysis of codon, codon pair, and transcriptome changes in 596 patients. A Kaplan-Meier curves for patients with relatively high global codon usage changes (red) and patients with relatively low global codon usage changes (green). The horizontal dashed line represents 50% survival probability and intersects with each curve at their median survival time. Shaded regions represent 95% confidence intervals. B Kaplan-Meier curves for patients with relatively high global codon pair usage changes (red) and patients with relatively low global codon pair usage changes (green). The horizontal dashed line represents 50% survival probability and intersects with each curve at their median survival time. Shaded regions represent 95% confidence intervals. C Kaplan-Meier curves for patients with relatively high transcriptome changes (red) and patients with relatively low transcriptome changes (green). The horizontal dashed line represents 50% survival probability and intersects with each curve at their median survival time. Shaded regions represent 95% confidence intervals
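The curves and median survival times in this figure rest on the Kaplan-Meier product-limit estimator, which can be sketched in plain Python as below. The follow-up data are hypothetical, the function names are illustrative, and the sketch omits the confidence intervals and the rule used to split patients into high-change and low-change groups, neither of which is specified in this legend.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: True if death was observed,
    False if the patient was censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which estimated survival is at or below 50%, else None."""
    return next((t for t, s in curve if s <= 0.5), None)

# Hypothetical follow-up in months for one group; False marks a censored patient.
times = [2, 4, 4, 6, 8, 10]
events = [True, True, False, True, False, True]
curve = kaplan_meier(times, events)
group_median = median_survival(curve)
```

The median survival time corresponds to the point where the horizontal dashed 50% line crosses the step curve; censored patients lower the number at risk without producing a step.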