Nikolay Alabi1, Dropen Sheka1, Ashar Siddiqui2, Edwin Wang2.
Abstract
Contention exists within the field of oncology with regard to gastroesophageal junction (GEJ) tumors, which in the past have been classified as gastric cancer, esophageal cancer, or a combination of both. Misclassification of GEJ tumors ultimately influences treatment options, which may be rendered ineffective if the treatment targets the wrong cancer attributes. It has been suggested that misclassification rates were as high as 45%, which is greater than reported for junctional cancer occurrences. Here, we aimed to use the methylation profiles of GEJ tumors to improve their classification. Four cohorts of DNA methylation profiles, containing ~27,000 (27k) methylation sites per sample, were collected from the Gene Expression Omnibus and The Cancer Genome Atlas. Tumor samples were assigned to discovery (nEC = 185, nGC = 395; EC, esophageal cancer; GC, gastric cancer) and validation (nEC = 179, nGC = 369) sets. The optimized Multi-Survival Screening (MSS) algorithm was used to identify methylation biomarkers capable of distinguishing GEJ tumors. Three methylation signatures were identified. They were associated with the cellular processes of protein binding, gene expression, and cellular component organization, and achieved precision and recall rates of 94.7% and 99.2%, 97.6% and 96.8%, and 96.8% and 97.6%, respectively, in the validation dataset. Interestingly, the methylation sites of the signatures were very close (i.e., 170-270 base pairs) to their downstream transcription start sites (TSSs), suggesting that methylation near TSSs plays a much more important role in tumorigenesis. Here, we present the first set of methylation signatures with a higher predictive power for characterizing gastroesophageal tumors. Thus, they could improve the diagnosis and treatment of gastroesophageal tumors.
Keywords: MSS; Multi-Survival Screening Algorithm; esophageal cancer; gastric cancer; gastroesophageal cancer diagnosis; gastroesophageal junction cancer; methylation array-based profile; methylation signature; predictor; tumor characterization; tumor classification
Year: 2020 PMID: 32403416 PMCID: PMC7281220 DOI: 10.3390/cancers12051208
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. Diagram depicting the workflow used in the methodology. For greater detail on each step and on the datasets, refer to the Methods.
The Cancer Genome Atlas (TCGA) training set clinical characteristics.
| Training Set (N = 628) | | |
|---|---|---|
| Sex | Male | 443 |
| | Female | 185 |
| | Unknown | 0 |
| Age (yrs.) | Range | 27–90 |
| | Mean | 64.7 |
| | Unknown | 5 |
| Stage | I | 54 |
| | II | 136 |
| | III | 286 |
| | IV | 124 |
| | Unknown | 28 |
For additional information pertaining to each study refer to Tables S1 and S2.
Validation set clinical characteristics composed of GSE72872, GSE30601, GSE32925, GSE81334, GSE25869, and GSE31788 datasets.
| Validation Set (N = 548) | | |
|---|---|---|
| Sex | Male | 391 |
| | Female | 139 |
| | Unknown | 18 |
| Age (yrs.) | Range | 23–92 |
| | Mean | 64.5 |
| | Unknown (No. of patients) | 43 |
For additional information pertaining to each study refer to Tables S3–S8.
Results for the differential expression of methylation probes across TCGA gastric cancer and esophageal cancer.
| Differentially Methylated Probes (DMPs) (FC >3, | N = 81814 |
|---|---|
| Differentially Methylated Regions | N = 28054 |
| DMPs overlapping with 27k Array | N = 536 |
| Regulatory Feature Group | Percentage of DMPs |
| Promoter Associated | 81.3% |
| Gene Associated | 0.25% |
| Gene Associated Cell Specific | 0.55% |
| Relation to Island | Percentage of DMPs |
| OpenSea | 18.4% |
| Island | 55.0% |
| N_Shore | 11.5% |
| S_Shore | 9.7% |
FC, Fold Change
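As an illustration of the fold-change criterion in the table above, the following is a minimal sketch of a fold-change filter over per-probe mean methylation values. It is not the authors' pipeline: the function name `fold_change_filter`, the probe IDs, the input values, and the choice to compute the ratio on raw mean values with a small `eps` guard are all assumptions made for the example, and the record does not state how FC was computed.

```python
def fold_change_filter(mean_a, mean_b, threshold=3.0, eps=1e-6):
    """Return probe IDs whose group means differ by more than `threshold`-fold.

    mean_a, mean_b: dicts mapping probe ID -> mean methylation value per group.
    The ratio is taken in whichever direction is larger, so both gains and
    losses of methylation are kept. `eps` avoids division by zero.
    """
    selected = []
    for probe in sorted(mean_a.keys() & mean_b.keys()):
        a = mean_a[probe] + eps
        b = mean_b[probe] + eps
        fc = max(a / b, b / a)
        if fc > threshold:
            selected.append(probe)
    return selected


# Hypothetical values for three made-up probes (not data from the study)
ec_means = {"probe_1": 0.80, "probe_2": 0.30, "probe_3": 0.10}
gc_means = {"probe_1": 0.20, "probe_2": 0.25, "probe_3": 0.45}

dmps = fold_change_filter(ec_means, gc_means)
print(dmps)  # ['probe_1', 'probe_3']
```

Here `probe_1` (4-fold) and `probe_3` (4.5-fold) pass the threshold, while `probe_2` (1.2-fold) does not.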
Gene ontology (GO) analysis of differentially methylated probes to pool together for signature sets.
| GO Term | Fold Enrichment | FDR | GO Accession Number |
|---|---|---|---|
| Single-multicellular organism process | 1.0508656015 | 0.82 × 10−7 | 0044707 |
| Anatomical structure morphogenesis | 1.083136144 | 3.45 × 10−6 | 0009653 |
| Single-organism developmental process | 1.050520832 | 6.97 × 10−6 | 0044767 |
| Anatomical structure development | 1.050309311 | 8.2 × 10−6 | 0048856 |
| Cell fate commitment | 1.281284221 | 2.60 × 10−5 | 0045165 |
| Epithelium development | 1.1318538 | 2.65 × 10−5 | 0060429 |
| Developmental process | 1.047949197 | 1.32 × 10−5 | 0032502 |
| Organ morphogenesis | 1.13549378 | 3.45 × 10−5 | 0009887 |
| Tissue development | 1.097778353 | 5.93 × 10−5 | 0009888 |
| Skeletal system development | 1.187655599 | 1.30 × 10−4 | 0001501 |
| Multicellular organism development | 1.050805467 | 1.45 × 10−4 | 0007275 |
| Tube development | 1.157944133 | 0.001406 | 0035295 |
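The fold enrichment column can be read with the usual over-representation definition in mind: the fraction of genes in the study list annotated with a term, divided by the fraction annotated in the background. The sketch below assumes that definition; the record does not say which GO tool or background set was used, and the function name and counts are hypothetical.

```python
def fold_enrichment(hits_in_list, list_size, hits_in_background, background_size):
    """Fold enrichment = observed fraction in the list / expected fraction in the background."""
    observed = hits_in_list / list_size
    expected = hits_in_background / background_size
    return observed / expected


# Hypothetical counts: 50 of 500 listed genes carry the term,
# versus 1000 of 20000 genes in the background.
fe = fold_enrichment(50, 500, 1000, 20000)
print(f"{fe:.2f}")  # 2.00
```

Under this definition, a value close to 1 (as for most terms in the table) means the term is only slightly more frequent in the list than in the background, even when the FDR is small.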
Methylation signatures’ precision and recall.
| Methylation Signature | Training Set (nEC = 185, nGC = 443) | Validation Set (nEC = 164, nGC = 383) |
|---|---|---|
| Protein Binding | Precision: 99.5% | Precision: 94.7%; Recall: 99.2% |
| Cellular Component Organization | Precision: 98.5% | Precision: 96.8%; Recall: 97.6% |
| Gene Expression | Precision: 98.0% | Precision: 97.6%; Recall: 96.8% |
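For reference, precision and recall follow from the confusion counts of a two-class predictor. The sketch below uses the standard definitions; the counts are hypothetical and not taken from the study, and the record does not state which class (EC or GC) was treated as positive.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for the positive class.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for a metric whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.900, recall=0.750
```

Precision answers "of the samples called positive, how many were correct", while recall answers "of the truly positive samples, how many were found".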
Protein binding, cellular component organization, and gene expression methylation signatures’ probes, associated genes, and gene descriptions.
| Cellular Component Biogenesis | | | Gene Expression | | | Protein Binding | | |
|---|---|---|---|---|---|---|---|---|
| Probe | Gene | Gene Description | Probe | Gene | Gene Description | Probe | Gene | Gene Description |
| cg26117023 | LOXL3 | Lysyl Oxidase Like 3 | cg00901683 | CPSF4 | Cleavage and Polyadenylation Specific Factor 4 | cg08946989 | TBC1D7 | TBC1 Domain Family Member 7 |
| cg04020816 | MAN2A1 | Mannosidase Alpha Class 2A Member 1 | cg01491225 | ZCCHC9 | Zinc Finger CCHC-Type Containing 9 | cg01091448 | AMACR | Alpha-Methylacyl-CoA Racemase |
| cg21475255 | DAG1 | Dystroglycan 1 | cg11225935 | KDM5A | Lysine Demethylase 5A | cg09892390 | ARHGAP21 | Rho GTPase Activating Protein 21 |
| cg23364287 | IP6K2 | Inositol Hexakisphosphate Kinase 2 | cg14576628 | PRMT1 | Protein Arginine Methyltransferase 1 | cg01107741 | CANT1 | Calcium Activated Nucleotidase 1 |
| cg01651593 | CDC20 | Cell Division Cycle 20 | cg00155485 | MED13L | Mediator Complex Subunit 13L | cg03887534 | BCL2L13 | BCL2 Like 13 |
| cg05173789 | RPLP0 | Ribosomal Protein Lateral Stalk Subunit P0 | cg08587820 | BHLHE40 | Basic Helix-Loop-Helix Family Member E40 | cg05368762 | TMBIM6 | Transmembrane BAX Inhibitor Motif Containing 6 |
| cg09288658 | ZAK | Mitogen-Activated Protein Kinase Kinase Kinase 20 | cg12403575 | TRADD | TNFRSF1A Associated Via Death Domain | cg26117023 | LOXL3 | Lysyl Oxidase Like 3 |
| cg06649520 | ARFIP1 | ADP Ribosylation Factor Interacting Protein 1 | cg12179044 | GCN1L1 | GCN1 Activator of EIF2AK4 | cg05761032 | CCPG1 | Cell Cycle Progression 1 |
| cg10384134 | RPS9 | Ribosomal Protein S9 | cg12813922 | RAB3GAP1 | RAB3 GTPase Activating Protein Catalytic Subunit 1 | cg07448856 | ZNF670 | Zinc Finger Protein 670 |
| cg10872447 | GTF2F2 | General Transcription Factor IIF Subunit 2 | cg17982504 | DDX28 | DEAD-Box Helicase 28 | cg07628086 | AP2B1 | Adaptor Related Protein Complex 2 Subunit Beta 1 |
| cg14671453 | STX4 | Syntaxin 4 | cg19846927 | MRPL44 | Mitochondrial Ribosomal Protein L44 | cg09822001 | APOA1BP | NAD(P)HX Epimerase |
| cg17982504 | DDX28 | DEAD-Box Helicase 28 | cg19886179 | PSMD14 | Proteasome 26S Subunit, Non-ATPase 14 | cg10049968 | FAM219A | Family with Sequence Similarity 219 Member A |
| cg21289924 | EIF3A | Eukaryotic Translation Initiation Factor 3 Subunit A | cg02357725 | IMP3 | IMP U3 Small Nucleolar Ribonucleoprotein 3 | cg11356290 | AZI2 | 5-Azacytidine Induced 2 |
| cg01522721 | MIR1181 | MicroRNA 1181 | cg05141870 | MIR423 | MicroRNA 423 | cg12520111 | PPIA | Peptidylprolyl Isomerase A |
| cg03954150 | C18orf55 | Translocase of Inner Mitochondrial Membrane 21 | cg07483064 | ENO1 | Enolase 1 | cg12675800 | TRAPPC6B | Trafficking Protein Particle Complex 6B |
| cg03976567 | AKD1 | Adenylate Kinase 9 | cg09307279 | GLT8D1 | Glycosyltransferase 8 Domain Containing 1 | cg14874121 | HSD17B4 | Hydroxysteroid 17-Beta Dehydrogenase 4 |
| cg05369142 | ALS2CL | ALS2 C-Terminal Like | cg10872447 | GTF2F2 | General Transcription Factor IIF Subunit 2 | cg20218060 | CLK1 | CDC Like Kinase 1 |
| cg06804431 | GNRHR2 | Gonadotropin Releasing Hormone Receptor 2 (Pseudogene) | cg13208492 | TSN | Translin | cg20982583 | POLR2F | RNA Polymerase II Subunit F |
| cg10892866 | PYGO2 | Pygopus Family PHD Finger 2 | cg15305343 | NSUN4 | NOP2/Sun RNA Methyltransferase 4 | cg02226871 | VPS28 | Vacuolar Protein Sorting-Associated Protein 28 Homolog |
| cg12241125 | EIF4H | Eukaryotic Translation Initiation Factor 4H | cg15636365 | PNPLA7 | Patatin Like Phospholipase Domain Containing 7 | cg02792677 | MRPL4 | Mitochondrial Ribosomal Protein L4 |
| cg12674192 | MAK16 | MAK16 Homolog | cg16199381 | TSTD2 | Thiosulfate Sulfurtransferase Like Domain Containing 2 | cg04733989 | NAGA | Alpha-N-Acetylgalactosaminidase |
| cg13057891 | ERCC5 | ERCC Excision Repair 5, Endonuclease | cg16385933 | PDCD4 | Programmed Cell Death 4 | cg05347567 | ZC3H10 | Zinc Finger CCCH-Type Containing 10 |
| cg13908523 | PRKCD | Protein Kinase C Delta | cg18242682 | FOXK2 | Forkhead Box K2 | cg07772309 | NELF | NMDA Receptor Synaptonuclear Signaling and Neuronal Migration Factor |
| cg17165266 | KRT18 | Keratin 18 | cg24342628 | KDM1B | Lysine Demethylase 1B | cg07936037 | SSR1 | Signal Sequence Receptor Subunit 1 |
| cg17872064 | NOP58 | NOP58 Ribonucleoprotein | cg26117023 | LOXL3 | Lysyl Oxidase Like 3 | cg08525481 | OGFR | Opioid Growth Factor Receptor |
| cg22366626 | ZFYVE20 | Rabenosyn, RAB Effector | cg00080012 | EED | Embryonic Ectoderm Development | cg11023442 | PITPNA-AS1 | PITPNA antisense RNA 1 |
| cg23311628 | RAB8B | RAB8B, Member RAS Oncogene Family | cg01522721 | CDC37 | Cell Division Cycle 37 | cg12056618 | KLF13 | Kruppel Like Factor 13 |
| cg23521281 | WDR75 | WD Repeat Domain 75 | cg03196745 | ISCU | Iron-Sulfur Cluster Assembly Enzyme | cg14279899 | IFNGR1 | Interferon Gamma Receptor 1 |
| cg24711626 | KIAA1012 | Trafficking Protein Particle Complex 8 | cg04044561 | POP7 | POP7 Homolog, Ribonuclease P/MRP Subunit | cg14694952 | HTT | Huntingtin |
| cg24949344 | ANO6 | Anoctamin 6 | cg05088512 | DKKL1 | Dickkopf Like Acrosomal Protein 1 | cg15133363 | HILPDA | Hypoxia Inducible Lipid Droplet Associated |
Figure 2. Box plots of methylation values for all methylation probes in the best-performing methylation signatures distinguishing esophageal cancer vs. gastric cancer.