Ryoichi Kinoshita, Mitsuo Iwadate, Hideaki Umeyama, Y-h Taguchi.
Abstract
BACKGROUND: Aberrant DNA methylation is often associated with cancers. Thus, screening genes with cancer-associated aberrant DNA methylation is a useful method to identify candidate cancer-causing genes. Aberrant DNA methylation is also genotype dependent. Thus, the selection of genes with genotype-specific aberrant DNA methylation in cancers is potentially important for tailor-made medicine. The selected genes are important candidate drug targets.
Year: 2014 PMID: 24565165 PMCID: PMC4080267 DOI: 10.1186/1752-0509-8-S1-S4
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. PCs for genotypes measured by Nsp microarray. (a) PC1 (81%). (b) PC2 (3%). Black circle, blood; red triangle, normal tissue; green cross, tumor tissue. The horizontal axes indicate the subjects and their samples. The order of the 30 subjects is the same in the 1–30, 31–60, and 61–90 sections; i.e., 1, 31, and 61 are samples from the same patient.
Figure 2. PCs for DNA methylation measured by Nsp microarray. (a) PC1 (80%). (b) PC2 (3%). Other notations are the same as those in Figure 1.
Figure 3. Two-dimensional embedding of SNPs with PC1 and PC2 for the Nsp microarray measurements. (a) Genotype (Figure 1). (b) DNA methylation (Figure 2). The top 300 outliers are shown in red.
SNPs selected for DNA methylation and genotype measured by the Nsp microarray.
| Rank | SNPs | | | |
|---|---|---|---|---|
| 50 | SNP_A-1825620 | SNP_A-2213037 | SNP_A-2145008 | SNP_A-2309865 |
| | SNP_A-4233167 | | | |
| 100 | SNP_A-2172952 | SNP_A-2234716 | SNP_A-1984943 | SNP_A-2121000 |
| | SNP_A-2085071 | | | |
| 150 | SNP_A-2040111 | SNP_A-4195285 | SNP_A-4199352 | SNP_A-2089983 |
| | SNP_A-1944699 | SNP_A-1988914 | SNP_A-1834529 | SNP_A-1950742 |
| | SNP_A-2105346 | SNP_A-4235277 | | |
| 200 | SNP_A-4229534 | SNP_A-4226834 | SNP_A-4196078 | SNP_A-2199615 |
| | SNP_A-1919825 | SNP_A-2276203 | SNP_A-1961374 | |
| 250 | SNP_A-1989613 | SNP_A-1845324 | SNP_A-1880907 | SNP_A-2142865 |
| | SNP_A-2124767 | SNP_A-1810962 | SNP_A-4193660 | SNP_A-1852621 |
| | SNP_A-1961109 | SNP_A-4212314 | SNP_A-2042678 | SNP_A-1886593 |
| | SNP_A-1980533 | SNP_A-2143521 | SNP_A-2088571 | |
| 300 | SNP_A-2043441 | SNP_A-2287632 | SNP_A-2056366 | SNP_A-2185001 |
| | SNP_A-1910539 | SNP_A-4213049 | SNP_A-2007288 | SNP_A-4228665 |
| | SNP_A-4236336 | SNP_A-2063926 | SNP_A-1911642 | SNP_A-1950919 |
| | SNP_A-2053247 | SNP_A-4197286 | SNP_A-4204073 | SNP_A-2221049 |
| | SNP_A-2065785 | | | |
The 59 selected SNPs (probes) that exhibited significant differences between blood, and normal and tumor tissues, within the top N (= 50, 100, 150, 200, 250, and 300) outliers for both DNA methylation and genotype. Top 300 outliers are shown in red in Fig. 3. For detailed annotations, including associated genes, of the selected SNPs, see Additional file 1 (sheet name Nsp).
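The intersection step described in this caption (SNPs that fall within the top N outliers for both genotype and DNA methylation) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "outlier" means the largest Euclidean distance from the origin in the two-dimensional PC embedding, which the record does not state, and the SNP labels and scores are invented toy values. The additional filter on significant differences between blood and tissue samples is not modeled.

```python
import math

def top_outliers(embedding, n):
    """Return the n SNP IDs farthest from the origin of a 2-D PC embedding.

    `embedding` maps SNP ID -> (score on first PC, score on second PC).
    Distance from the origin as the outlier criterion is an assumption.
    """
    ranked = sorted(embedding, key=lambda snp: math.hypot(*embedding[snp]), reverse=True)
    return set(ranked[:n])

def common_outliers(genotype, methylation, n):
    """SNPs that are top-n outliers in both the genotype and methylation embeddings."""
    return top_outliers(genotype, n) & top_outliers(methylation, n)

# Toy data (hypothetical SNP labels and scores, for illustration only).
genotype = {"snp_a": (5.0, 0.1), "snp_b": (0.2, 0.1), "snp_c": (0.1, 4.0), "snp_d": (0.3, 0.2)}
methylation = {"snp_a": (4.5, 0.3), "snp_b": (3.0, 3.0), "snp_c": (0.1, 0.2), "snp_d": (0.2, 0.1)}

print(sorted(common_outliers(genotype, methylation, 2)))  # ['snp_a']
```

Running the same function for N = 50, 100, ..., 300 and recording the smallest N at which each SNP first appears would give a grouping by rank like the one in the table.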
Figure 4. PCs for genotype measured by Sty microarray. (a) PC1 (81%). (b) PC2 (3%). (c) PC3 (2%). (d) PC4 (1%). Other notations are the same as those in Fig. 1.
Figure 5. PCs for DNA methylation measured by Sty microarray. (a) PC1 (83%). (b) PC2 (2%). (c) PC3 (1%). (d) PC4 (1%). Other notations are the same as those in Fig. 1.
Figure 6. Two-dimensional embedding of SNPs for the Sty microarray measurements. For genotype (a) with PC1 (Figure 4a) and PC3 (Figure 4c) and (b) with PC1 (Figure 4a) and PC4 (Figure 4d). For DNA methylation (c) with PC1 (Figure 5a) and PC3 (Figure 5c) and (d) with PC1 (Figure 5a) and PC4 (Figure 5d). Top 300 outliers are shown in red.
SNPs measured by the Sty microarray using PC4 for genotype and PC3 for DNA methylation.
| Rank | SNPs | | | |
|---|---|---|---|---|
| 50 | SNP_A-2176803 | SNP_A-4286712 | SNP_A-4276813 | SNP_A-2134351 |
| 100 | SNP_A-4252327 | SNP_A-4271493 | SNP_A-4261117 | SNP_A-1955805 |
| 150 | SNP_A-2278684 | SNP_A-2159288 | SNP_A-1798268 | |
| 200 | SNP_A-1975466 | | | |
| 250 | SNP_A-2221439 | SNP_A-2175811 | SNP_A-4259136 | SNP_A-2186260 |
| 300 | SNP_A-4247667 | SNP_A-4296608 | SNP_A-4302067 | |
The 22 selected SNPs (probes) that exhibited significant differences between blood, and normal and tumor tissues, within the top N (= 50, 100, 150, 200, 250, and 300) outliers for both DNA methylation and genotype. Top 300 outliers are shown in red in Figures 6b and 6c. For detailed annotations, including associated genes, of the selected SNPs, see Additional file 1 (sheet name Sty1).
SNPs measured by the Sty microarray using PC3 for genotype and PC4 for DNA methylation.
| Rank | SNPs | | | |
|---|---|---|---|---|
| 50 | SNP_A-1879798 | SNP_A-4261939 | SNP_A-4288122 | SNP_A-1906431 |
| 100 | SNP_A-1781703 | SNP_A-4257840 | SNP_A-1990707 | SNP_A-2092003 |
| 150 | SNP_A-4292799 | SNP_A-4285002 | SNP_A-2295075 | SNP_A-2176803 |
| 200 | SNP_A-4275514 | SNP_A-4242077 | SNP_A-2073412 | SNP_A-1793920 |
| 250 | SNP_A-4258351 | SNP_A-2207678 | SNP_A-4277760 | SNP_A-1834280 |
| 300 | SNP_A-4258451 | SNP_A-4302014 | SNP_A-4293935 | |
The 37 selected SNPs (probes) that exhibited significant differences between blood, and normal and tumor tissues, within the top N (= 50, 100, 150, 200, 250, and 300) outliers for both DNA methylation and genotype. Top 300 outliers are shown in red in Figures 6a and 6d. For detailed annotations, including associated genes, of the selected SNPs, see Additional file 1 (sheet name Sty2).
Comparison of our method with other feature selection methods.
| Method | Nsp | Sty |
|---|---|---|
| Present | 68 | 81 (Sty1), 50 (Sty2) |
| Pearson | 49 | 14 |
| Spearman | 39 | 18 |
| PLS | 7 | 13 |
| Stepclass | -- | -- |
| lasso | -- | -- |
The number of SNPs selected in common between the top-ranked 300 SNPs in genotype and DNA methylation using various methods. Present, this study; Pearson, Pearson correlation coefficient-based method; Spearman, Spearman correlation coefficient-based method; PLS, partial least squares-based method; Stepclass, stepclass (R function that executes iterative feature selection)-based method; lasso, Lasso-based method. Nsp and Sty are the microarray data sets used in the study. Sty1 and Sty2 correspond to the PC4 for genotype (Figures 4d and 6b)/PC3 for DNA methylation (Figures 5c and 6c) and the PC3 for genotype (Figures 4c and 6a)/PC4 for DNA methylation (Figures 5d and 6d) combinations of PCs.
Figure 7. Comparison of microarray measurements between genotype and DNA methylation. Comparison of (a) 68 SNPs selected with the Nsp microarray data set and (b) 81 SNPs selected with the Sty microarray data set using PC4 for genotype (Figures 4d and 6b) and PC3 for DNA methylation (Figures 5c and 6c). Comparison of (c) 50 SNPs selected with the Sty microarray using PC3 for genotype (Figures 4c and 6a) and PC4 for DNA methylation (Figures 5d and 6d). Black circle, blood; red triangle, normal tissue; green cross, tumor tissue. Solid lines indicate the boundary where the microarray measurements are equal between genotype and DNA methylation.
t-tests of microarray measurements between genotype and DNA methylation for blood, normal and tumor tissues.
| Data set | Sample | Genotype | DNA methylation | P-value |
|---|---|---|---|---|
| Nsp | blood | 1.32 | 1.19 | 3.1 × 10 |
| Nsp | normal tissue | 2.45 | 1.75 | ∗ |
| Nsp | tumor | 2.84 | 2.23 | ∗ |
| Sty1 | blood | 2.21 | 1.77 | ∗ |
| Sty1 | normal tissue | 2.58 | 2.14 | ∗ |
| Sty1 | tumor | 2.87 | 2.48 | ∗ |
| Sty2 | blood | 1.69 | 1.29 | ∗ |
| Sty2 | normal tissue | 2.42 | 2.02 | ∗ |
| Sty2 | tumor | 2.51 | 2.04 | ∗ |
Nsp and Sty are the microarray data sets used in the study. Sty1 and Sty2 correspond to the PC4 for genotype (Figures 4d and 6b)/PC3 for DNA methylation (Figures 5c and 6c) and PC3 for genotype (Figures 4c and 6a)/PC4 for DNA methylation (Figures 5d and 6d) combinations of PCs. ∗ indicates P-values <2.2 x 10.
t-tests of randomly sampled SNPs between genotype and DNA methylation.
| Data set | Sample | Lower bound | Upper bound | Number of significant |
|---|---|---|---|---|
| Nsp | blood | 6.9 × 10 | 1.00 | 19 |
| Nsp | normal tissue | 0.32 | 1.00 | 0 |
| Nsp | tumor | 0.96 | 1.00 | 0 |
| Sty1 | blood | 0.05 | 1.00 | 5 |
| Sty1 | normal tissue | 0.01 | 1.00 | 4 |
| Sty1 | tumor | 0.04 | 1.00 | 6 |
| Sty2 | blood | 6.12 × 10 | 1.00 | 2 |
| Sty2 | normal tissue | 0.06 | 1.00 | 0 |
| Sty2 | tumor | 9.56 × 10 | 1.00 | 1 |
P-values were computed using t-tests for the microarray measurements between genotype and DNA methylation for 1000 independent sets of randomly sampled SNPs. Each of the sets contained at least as many SNPs as are included in Table 4. Nsp and Sty are the microarray data sets used in the study. Sty1 and Sty2 correspond to the PC4 for genotype (Figures 4d and 6b)/ PC3 for DNA methylation (Figures 5c and 6c) and PC3 for genotype (Figures 4c and 6a)/ PC4 for DNA methylation (Figures 5d and 6d) combinations of PCs. P-values, adjusted by the BH criterion, of < 0.05 were regarded as significant.
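The BH (Benjamini-Hochberg) adjustment mentioned in this caption can be sketched in a few lines. This is a generic implementation of the standard step-up procedure, not the authors' code, and the P-values below are invented toy numbers rather than values from the study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def count_significant(pvalues, alpha=0.05):
    """Number of tests whose BH-adjusted P-value is below alpha."""
    return sum(p < alpha for p in bh_adjust(pvalues))

# Toy P-values (hypothetical), one per randomly sampled SNP set.
raw = [0.01, 0.04, 0.03, 0.005, 0.5]
print([round(p, 3) for p in bh_adjust(raw)])
print(count_significant(raw))
```

Applied to the 1000 P-values obtained for one sample type, `count_significant` would yield a count of the kind reported in the "Number of significant" column.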
Figure 8. Schematic illustration of the gene screening process. The grey rectangle indicates the processes performed in this study. The red (blue) boxes indicate the data processing flow for the genotype (DNA methylation) data. The solid (dotted) lines indicate data processing flow for the Nsp (Sty) measurements. Sty1 and Sty2 indicate the two combinations of PCs that were used; PC4 for genotype (Figures 4d and 6b)/PC3 for DNA methylation (Figures 5c and 6c), and PC3 for genotype (Figures 4c and 6a)/PC4 for DNA methylation (Figures 5d and 6d).
Figure 9. Schematic illustration of the drug discovery process. For the proteins encoded by the selected genes (ALK, EGLN3 and NUAK1), about 1,000 compounds, selected based on the Tanimoto index from DrugBank, were tested by ChooseLD using template protein structures from PDB. The templates are specified by their PDB IDs. The ligands are specified by the PDB ID, ligand name and a sequential number. For example, 3I7C_BK2_n1 indicates ligand BK2 (1-tert-butyl-3-naphthalen-2-yl-1H-pyrazolo[3,4-d]pyrimidin-4-amine) included in PDB entry 3I7C [PDB: 3I7C], and n1 means no. 1. The drug discovery process for EGLN3 was performed twice, with and without Fe as a ligand. When Fe was excluded as a ligand, it was regarded as a mediator. That is, Fe was bound to the protein during the docking simulation but was excluded from the Tanimoto index computation.
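For readers unfamiliar with the Tanimoto index used to preselect compounds, the sketch below shows the usual definition for binary fingerprints: the number of shared features divided by the number of features present in either molecule. The fingerprint type used in the study is not given in this record, so the feature sets and compound labels here are hypothetical placeholders.

```python
def tanimoto(a, b):
    """Tanimoto index of two fingerprints given as collections of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

def most_similar(reference, library, top):
    """Names of the `top` library compounds most similar to the reference ligand."""
    ranked = sorted(library, key=lambda name: tanimoto(reference, library[name]), reverse=True)
    return ranked[:top]

# Hypothetical fingerprints: each set lists the structural features that are present.
reference_ligand = {1, 2, 3, 4}
library = {
    "compound_x": {1, 2, 3, 4},   # identical to the reference, index 1.0
    "compound_y": {3, 4, 5, 6},   # 2 shared of 6 total, index 1/3
    "compound_z": {7, 8},         # nothing shared, index 0.0
}

print(most_similar(reference_ligand, library, top=2))  # ['compound_x', 'compound_y']
```

In a screen like the one in Figure 9, the reference would be a ligand taken from a template structure and the library would be the DrugBank compounds, with `top` set to roughly 1,000.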
The 10 top-ranked compounds as drug targets for ALK, EGLN3, and NUAK1.
| DrugBank ID | Compound name | Representative target cancer genes |
|---|---|---|
| ALK | | |
| DB01933 | 7-Hydroxystaurosporine | PDK1 |
| DB08700 | 3-[(1R)-1-(2,6-dichloro-3-fluorophenyl)ethoxy]-5-(1-piperidin-4-yl-1H-pyrazol-4-yl)pyridin-2-amine | TRKA, TRKB, TIE2, ABL |
| DB04651 | BIOTINOL-5-AMP | -- |
| DB02491 | 4-[4-(1-Amino-1-Methylethyl)Phenyl]-5-Chloro-N-[4-(2-Morpholin-4-Ylethyl)Phenyl]Pyrimidin-2-Amine | FGFR2 |
| DB07006 | 9-HYDROXY-6-(3-HYDROXYPROPYL)-4-(2-METHOXYPHENYL)PYRROLO[3,4-C]CARBAZOLE-1,3(2H,6H)-DIONE | WEE1 |
| DB02010 | Staurosporine | ITK, SYK, MAPKAPK2, GSK3, CSK, CDK, PIK3CG, ZAP-70 |
| DB02654 | 6-Hydroxy-Flavin-Adenine Dinucleotide | -- |
| DB07460 | 2-({5-CHLORO-2-[(2-METHOXY-4-MORPHOLIN-4-YLPHENYL)AMINO]PYRIMIDIN-4-YL}AMINO)-N-METHYLBENZAMIDE | |
| DB07186 | 4-(4-METHYLPIPERAZIN-1-YL)-N-[5-(2-THIENYLACETYL)-1,5-DIHYDROPYRROLO[3,4-C]PYRAZOL-3-YL]BENZAMIDE | AURKA, PLK1 |
| DB03247 | Riboflavin Monophosphate | RPS6KA4, POR(P450), SGK1, NOS1, DPYD, DHODH |
| EGLN3 (with Fe) | | |
| DB03702 | 2-[4-[[(S)-1-[[(S)-2-[[(Rs)-3,3,3-Trifluoro-1-Isopropyl-2-Oxopropyl]Aminocarbonyl]Pyrrolidin-1-Yl-]Carbonyl]-2-Methylpropyl]Aminocarbonyl]Benzoylamino]Acetic Acid | CELA1 |
| DB04761 | PYRIMIDINE-4,6-DICARBOXYLIC ACID BIS-[(PYRIDIN-3-YLMETHYL)-AMIDE] | MMP13 |
| DB08687 | N-[(1-CHLORO-4-HYDROXYISOQUINOLIN-3-YL)CARBONYL]GLYCINE | |
| DB08131 | 2-{4-[2-(2-AMINO-4-OXO-4,7-DIHYDRO-3H-PYRROLO[2,3-D]PYRIMIDIN-5-YL)-ETHYL]-BENZOYLAMINO}-3-METHYL-BUTYRIC ACID | thyA |
| DB02718 | 5-Formyl-6-Hydrofolic Acid | -- |
| DB02015 | Dihydrofolic Acid | -- |
| DB02031 | (6s)-5,6,7,8-Tetrahydrofolate | NOS1, thyA |
| DB04760 | PYRIMIDINE-4,6-DICARBOXYLIC ACID BIS-(4-FLUORO-3-METHYL-BENZYLAMIDE) | MMP13 |
| DB04759 | PYRIMIDINE-4,6-DICARBOXYLIC ACID BIS-(3-METHYL-BENZYLAMIDE) | MMP13 |
| DB07112 | N-[(4-HYDROXY-8-IODOISOQUINOLIN-3-YL)CARBONYL]GLYCINE | |
| EGLN3 (without Fe) | | |
| DB08687 | N-[(1-CHLORO-4-HYDROXYISOQUINOLIN-3-YL)CARBONYL]GLYCINE | |
| DB03702 | already listed in EGLN3 (with Fe) | |
| DB04759 | already listed in EGLN3 (with Fe) | |
| DB03625 | 5,10-Dideazatetrahydrofolic Acid | GARFTase |
| DB04760 | already listed in EGLN3 (with Fe) | |
| DB07112 | already listed in EGLN3 (with Fe) | |
| DB02015 | already listed in EGLN3 (with Fe) | |
| DB03541 | 10-Propargyl-5,8-Dideazafolic Acid | TYMS, DHFR |
| DB00158 | Folic Acid | -- |
| DB04761 | already listed in EGLN3 (with Fe) | |
| NUAK1 | | |
| DB08053 | 1-cyclobutyl-3-(3,4-dimethoxyphenyl)-1H-pyrazolo[3,4-d]pyrimidin-4-amine | CSF1R and others |
| DB08052 | 1-cyclopentyl-3-(1H-pyrrolo[2,3-b]pyridin-5-yl)-1H-pyrazolo[3,4-d]pyrimidin-4-amine | CSF1R and others |
| DB08054 | 1-(1-methylethyl)-3-quinolin-6-yl-1H-pyrazolo[3,4-d]pyrimidin-4-amine | CSF1R and others |
| DB07563 | 1-{7-cyclohexyl-6-[4-(4-methylpiperazin-1-yl)benzyl]-7H-pyrrolo[2,3-d]pyrimidin-2-yl}methanamine | CTSK |
| DB08035 | 1-TERT-BUTYL-3-(2,5-DIMETHYLBENZYL)-1H-PYRAZOLO[3,4-D]PYRIMIDIN-4-AMINE | AR |
| DB04463 | 3-(4-Amino-1-Tert-Butyl-1h-Pyrazolo[3,4-D]Pyrimidin-3-Yl)Phenol | CBR1 |
| DB08300 | 1-methyl-3-naphthalen-2-yl-1H-pyrazolo[3,4-d]pyrimidin-4-amine | CSF1R and others |
| DB01809 | 1-Ter-Butyl-3-P-Tolyl-1h-Pyrazolo[3,4-D]Pyrimidin-4-Ylamine | PKD1 and others |
| DB08461 | 3-[(4-AMINO-1-TERT-BUTYL-1H-PYRAZOLO[3,4-D]PYRIMIDIN-3-YL)METHYL]PHENOL | AR |
| DB08699 | 1-tert-butyl-3-(3-methylbenzyl)-1H-pyrazolo[3,4-d]pyrimidin-4-amine | CAMK2G |
The compounds were ranked by FPAScore averaged over three independent trials and are listed with their representative target cancer genes. For the full lists of ranked compounds and a detailed discussion of the target cancer genes listed here, see Additional files 7 and 8, respectively. ALK and EGLN1, a paralog of EGLN3, are in bold letters. "--" indicates that no known cancer-associated genes are targeted by these compounds.
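The ranking described in this note, averaging a docking score over three independent trials and keeping the top-ranked compounds, can be sketched as below. The scores and compound labels are invented, and the sketch assumes that a higher FPAScore indicates a better candidate, which this record does not state; flip `higher_is_better` if the opposite convention applies.

```python
from statistics import mean

def rank_by_mean_score(trial_scores, top=10, higher_is_better=True):
    """Rank compounds by their score averaged over independent trials.

    `trial_scores` maps a compound ID to its list of per-trial scores.
    Returns (compound ID, mean score) pairs, best first.
    """
    averaged = {cid: mean(scores) for cid, scores in trial_scores.items()}
    ranked = sorted(averaged.items(), key=lambda item: item[1], reverse=higher_is_better)
    return ranked[:top]

# Hypothetical scores from three docking trials per compound.
trial_scores = {
    "cmpd_1": [0.9, 1.1, 1.0],
    "cmpd_2": [1.5, 1.4, 1.6],
    "cmpd_3": [0.2, 0.4, 0.3],
}

for compound, score in rank_by_mean_score(trial_scores, top=2):
    print(compound, round(score, 2))
```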