Shiqi Zhang, Tao Zeng, Bin Hu, Yu-Hang Zhang, Kaiyan Feng, Lei Chen, Zhibin Niu, Jianhao Li, Tao Huang, Yu-Dong Cai.
Abstract
DNA methylation is an essential epigenetic modification involved in multiple biological processes. In mammals, DNA methylation acts as an epigenetic mark of transcriptional repression. Aberrant levels of DNA methylation are observed in various types of tumor cells, so DNA methylation has attracted considerable attention from researchers seeking new and feasible tumor therapies. Conventional studies considered single-gene methylation or specific loci as biomarkers for tumorigenesis. However, genome-scale methylation has not been completely investigated. We therefore proposed and compared two novel computational approaches based on multiple machine learning algorithms for the qualitative and quantitative analyses of methylation-associated genes and their dys-methylated patterns. This study contributes to the identification of novel effective genes and the establishment of optimal quantitative rules for aberrant methylation that distinguish tumor cells from different tissues of origin.
Keywords: cell line; classification; dys-methylated pattern; methylation signature; rule
Year: 2020 PMID: 32528944 PMCID: PMC7264161 DOI: 10.3389/fbioe.2020.00507
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Sample sizes of 13 tissues.
| No. | Tissue | Sample size |
| --- | --- | --- |
| 1 | Aerodigestive tract | 80 |
| 2 | Blood | 177 |
| 3 | Bone | 38 |
| 4 | Breast | 52 |
| 5 | Digestive system | 105 |
| 6 | Kidney | 33 |
| 7 | Lung | 198 |
| 8 | Nervous system | 96 |
| 9 | Pancreas | 31 |
| 10 | Skin | 59 |
| 11 | Soft tissue | 21 |
| 12 | Thyroid | 17 |
| 13 | Urogenital system | 115 |
Figure 1. Analysis framework.
Figure 2. IFS curves with SVM, PART, and RIPPER based on mRMR-ranked features. (A) IFS with SVM. When the top 1,910 features are used, SVM gives the best MCC of 0.958. (B) IFS with PART. When the top 910 features are used, PART gives the best MCC of 0.741. (C) IFS with RIPPER. When the top 2,810 features are used, RIPPER gives the best MCC of 0.703.
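The IFS (incremental feature selection) curves described in this caption can be sketched as a loop over nested top-k subsets of a ranked feature list. This is a minimal illustration, not the authors' implementation: the function name, the `evaluate` callback, the toy scoring function, and the probe names are hypothetical, and the step of 10 is an assumption suggested only by the reported feature counts all being multiples of 10.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    step: int = 10,
) -> Tuple[int, float, List[Tuple[int, float]]]:
    """Score nested top-k subsets of a ranked list and return the best k.

    `evaluate` is a caller-supplied function (hypothetical here) that would
    train and cross-validate a classifier on the given features and return
    a score such as MCC.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    curve: List[Tuple[int, float]] = []
    best_k, best_score = 0, float("-inf")
    for k in range(step, len(ranked_features) + 1, step):
        score = evaluate(ranked_features[:k])
        curve.append((k, score))
        # Strict comparison keeps the smallest k when scores tie.
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score, curve


# Toy usage: 50 made-up features and a made-up score that peaks at k = 30.
ranked = [f"probe_{i}" for i in range(1, 51)]


def toy_score(subset: Sequence[str]) -> float:
    return 1.0 - abs(len(subset) - 30) / 50


best_k, best_score, curve = incremental_feature_selection(ranked, toy_score, step=10)
print(best_k, best_score)
```

In a real analysis, `evaluate` would wrap the classifier (SVM, PART, or RIPPER) and its cross-validation, and `curve` holds the points plotted in each panel.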
Performance of IFS with SVM, PART, and RIPPER based on mRMR-ranked features for classifying tumor cells from different tissues.
| Classification algorithm | Number of features | ACC | MCC |
| --- | --- | --- | --- |
| SVM | 1,910 | 0.963 | 0.958 |
| PART | 910 | 0.768 | 0.741 |
| RIPPER | 2,810 | 0.735 | 0.703 |
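The MCC values in this table summarize a 13-class problem in a single number. The source does not give the formula, so the sketch below assumes the standard multiclass generalization of the Matthews correlation coefficient, computed from true and predicted label counts. The function name and the six-sample example are illustrative only and are not data from the study.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def multiclass_mcc(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Multiclass Matthews correlation coefficient from label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                      # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correct predictions
    t_counts = Counter(y_true)                           # true count per class
    p_counts = Counter(y_pred)                           # predicted count per class
    labels = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = math.sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the coefficient is undefined.
    return cov_tp / denom if denom else 0.0


# Illustrative labels (not from the study): one Lung sample is called Blood.
y_true = ["Lung", "Lung", "Blood", "Blood", "Skin", "Skin"]
y_pred = ["Lung", "Blood", "Blood", "Blood", "Skin", "Skin"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mcc = multiclass_mcc(y_true, y_pred)
print(round(accuracy, 3), round(mcc, 3))
```

Because MCC accounts for how predictions are distributed across classes, it is generally lower than plain accuracy when errors occur, which is consistent with the pattern in the table.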
Figure 3. Radar chart showing the performance of the best SVM, PART, and RIPPER classifiers on 13 tissues based on the feature list yielded by mRMR.
Figure 4. IFS curves with SVM, PART, and RIPPER based on MCFS-ranked features. (A) IFS with SVM. When the top 3,600 features are used, SVM gives the best MCC of 0.963. (B) IFS with PART. When the top 1,950 features are used, PART gives the best MCC of 0.770. (C) IFS with RIPPER. When the top 2,580 features are used, RIPPER gives the best MCC of 0.716.
Performance of IFS with SVM, PART, and RIPPER based on MCFS-ranked features for classifying tumor cells from different tissues.
| Classification algorithm | Number of features | ACC | MCC |
| --- | --- | --- | --- |
| SVM | 3,600 | 0.967 | 0.963 |
| PART | 1,950 | 0.795 | 0.770 |
| RIPPER | 2,580 | 0.746 | 0.716 |
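Figure 5. Radar chart showing the performance of the best SVM, PART, and RIPPER classifiers on 13 tissues based on the feature list yielded by MCFS.
Representative rules for classifying tumor cells from different tissues.
| Rule | Feature selection method | Rule learner | Criteria | Tissue | Genes |
| --- | --- | --- | --- | --- | --- |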
| Rule 1 | mRMR | RIPPER | (cg03977657 ≤ 0.099) and (cg01149192 ≤ 0.666) and (cg25593954 ≥ 0.757) | Aerodigestive tract | LAMB3, MGAT1, SPOP |
| Rule 2 | mRMR | RIPPER | (cg00983904 ≥ 0.833) and (cg24393316 ≤ 0.090) and (cg04976330 ≥ 0.777) | Lung | IFFO1, FOXE1, PUM1 |
| Rule 3 | MCFS | RIPPER | (cg22609576 ≥ 0.084) and (cg00879790 ≤ 0.134) | Digestive system | TRIM15, SPG20 |
| Rule 4 | MCFS | RIPPER | (cg20783697 ≤ 0.300) | Blood | BZRAP1 |
| Rule 5 | mRMR | PART | (cg22203219 ≤ 0.460) and (cg16419724 > 0.408) and (cg08454824 > 0.683) and (cg13466284 > 0.577) and (cg16798247 ≤ 0.754) and (cg00989853 > 0.900) | Nervous system | IFFO1, MARVELD2, ERICH1, SFN, ELMO1, IRF6 |
| Rule 6 | mRMR | PART | (cg20783697 ≤ 0.698) and (cg01951274 ≤ 0.130) | Blood | BZRAP1, MIR142 |
| Rule 7 | MCFS | PART | (cg02505827 ≤ 0.184) and (cg19519643 ≤ 0.785) and (cg00112091 > 0.118) and (cg05607401 ≤ 0.864) | Urogenital system | TEAD1, GMFG, MARVELD2 |
| Rule 8 | MCFS | PART | (cg02505827 ≤ 0.150) and (cg23229016 ≤ 0.645) | Skin | MARVELD2, RPS6KA1 |
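Each rule in the table is a conjunction of threshold tests on probe methylation levels. As a rough sketch, three of the rules (Rules 3, 4, and 8) can be encoded as data and checked against a sample. The data layout, the function names, and the sample values are hypothetical. The sketch also assumes that the thresholds apply to methylation levels between 0 and 1 and that a rule cannot fire when one of its probes is missing. RIPPER and PART normally produce ordered rule lists, so testing rules in isolation as done here is a simplification.

```python
import operator
from typing import Dict, List, Tuple

Condition = Tuple[str, str, float]  # (probe ID, comparison, threshold)

OPS = {"<=": operator.le, ">=": operator.ge, ">": operator.gt}

# Thresholds copied from Rules 3, 4, and 8 of the table above.
RULES: Dict[str, Tuple[str, List[Condition]]] = {
    "Rule 3": ("Digestive system",
               [("cg22609576", ">=", 0.084), ("cg00879790", "<=", 0.134)]),
    "Rule 4": ("Blood", [("cg20783697", "<=", 0.300)]),
    "Rule 8": ("Skin",
               [("cg02505827", "<=", 0.150), ("cg23229016", "<=", 0.645)]),
}


def rule_fires(conditions: List[Condition], sample: Dict[str, float]) -> bool:
    """Return True only if every condition holds; a missing probe fails."""
    return all(
        probe in sample and OPS[op](sample[probe], threshold)
        for probe, op, threshold in conditions
    )


def matching_rules(sample: Dict[str, float]) -> List[Tuple[str, str]]:
    """List (rule name, tissue) for every rule satisfied by the sample."""
    return [
        (name, tissue)
        for name, (tissue, conditions) in RULES.items()
        if rule_fires(conditions, sample)
    ]


# Hypothetical methylation levels for one cell line (not data from the study).
sample = {
    "cg20783697": 0.12,
    "cg22609576": 0.05,
    "cg00879790": 0.40,
    "cg02505827": 0.30,
    "cg23229016": 0.50,
}
matches = matching_rules(sample)
print(matches)
```

Keeping the rules as data rather than hard-coded `if` statements makes it easy to add the remaining rules from the table or to report which individual condition failed.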
Figure 6. Venn diagram showing the two marker gene sets yielded by mRMR and MCFS.