Hiro Takahashi, Takeshi Nemoto, Teruhiko Yoshida, Hiroyuki Honda, Tadashi Hasegawa.
Abstract
BACKGROUND: Recent advances in genome technologies have provided an excellent opportunity to determine the complete biological characteristics of neoplastic tissues, resulting in improved diagnosis and selection of treatment. To accomplish this objective, it is important to establish a sophisticated algorithm that can deal with large quantities of data such as gene expression profiles obtained by DNA microarray analysis.
Year: 2006 PMID: 16948864 PMCID: PMC1569882 DOI: 10.1186/1471-2105-7-399
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Hierarchical clustering of STS patients by using 12,241 unfiltered probes.
Blind accuracies for the SVM models using different filtering methods
| Filtering method | Number of genes | Accuracy (%) SVM model |
| PART | 1000 | 88.9 |
| NSC | 1000 | 66.7 |
| S2N | 1000 | 77.8 |
| SAM | 1000 | 77.8 |
| Student's t-test | 1000 | 66.7 |
| U-test | 1000 | 66.7 |
| Welch's t-test | 1000 | 66.7 |
| Random selection¹ | 1000 | 55.6 |
| No filtering | 12241 | 55.6 |
¹ The SVM model was constructed using 1000 randomly selected probes. This process was repeated 1000 times, and the average accuracy of the 1000 SVM models was calculated.
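The random-selection baseline described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses only the Python standard library, and because an SVM is not available there, a simple nearest-centroid classifier stands in for the SVM model. The function names `nearest_centroid_predict` and `random_subset_baseline` and the data layout (lists of per-sample feature lists) are assumptions made for this example.

```python
import random


def nearest_centroid_predict(train_x, train_y, sample):
    """Return the label whose class centroid is closest to `sample`.

    Stand-in classifier (assumption): the source uses an SVM here.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        # Column-wise mean of all training rows belonging to this class.
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def random_subset_baseline(train_x, train_y, test_x, test_y,
                           n_features, n_repeats, seed=0):
    """Average blind-test accuracy (%) over models built on random feature subsets."""
    rng = random.Random(seed)
    n_total = len(train_x[0])
    accuracies = []
    for _ in range(n_repeats):
        # Draw a fresh random subset of feature (probe) indices each repeat.
        cols = rng.sample(range(n_total), n_features)
        sub_train = [[row[c] for c in cols] for row in train_x]
        sub_test = [[row[c] for c in cols] for row in test_x]
        hits = sum(
            nearest_centroid_predict(sub_train, train_y, s) == y
            for s, y in zip(sub_test, test_y)
        )
        accuracies.append(100.0 * hits / len(test_y))
    return sum(accuracies) / len(accuracies)
```

With the settings given in the footnote, the call would use `n_features=1000` and `n_repeats=1000` on the 12,241-probe data. Averaging over many random subsets gives a reference level against which the filtering methods in the table can be compared.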
Blind accuracies for various combinations of filtering and modeling methods
| Filtering method | Wrapper method | | | | | |
| | BFCS | FNN | SVM | MRA | kNN | WV |
| PART | 81.1 ± 14.1 (8) | 64.4 ± 15.6 (3) | 73.3 ± 5.4 (2) | 60.0 ± 16.6 (11) | 67.8 ± 16.8 (10) | 56.7 ± 13.6 (14) |
| NSC | 68.9 ± 6.7 (5) | 60.0 ± 12.4 (3) | 62.2 ± 13.3 (3) | 65.6 ± 9.2 (9) | 68.9 ± 13.0 (3) | 66.7 ± 9.9 (21) |
| S2N | 68.9 ± 6.7 (15) | 56.7 ± 16.1 (3) | 61.1 ± 15.9 (3) | 61.1 ± 14.3 (4) | 63.3 ± 17.2 (4) | 58.9 ± 12.2 (18) |
| SAM | 71.1 ± 7.4 (12) | 64.4 ± 12.0 (3) | 67.8 ± 13.6 (3) | 63.3 ± 12.2 (10) | 74.4 ± 8.7 (7) | 63.3 ± 11.2 (9) |
| Student's t-test | 71.1 ± 5.4 (15) | 53.3 ± 12.0 (4) | 60.0 ± 10.2 (13) | 58.9 ± 13.2 (5) | 68.9 ± 8.3 (4) | 60.0 ± 19.4 (26) |
| U-test | 66.7 ± 9.9 (9) | 56.7 ± 16.1 (3) | 64.4 ± 13.0 (7) | 54.4 ± 10.2 (7) | 67.8 ± 11.6 (14) | 62.2 ± 12.4 (1) |
| Welch's t-test | 65.6 ± 10.5 (15) | 55.6 ± 14.9 (3) | 58.9 ± 8.7 (13) | 53.3 ± 12.2 | 67.8 ± 10.5 | 65.6 ± 13.6 (12) |
| No filtering | 68.9 ± 9.7 (10) | 58.9 ± 10.0 (3) | 66.7 ± 15.7 (2) | 61.1 ± 13.4 (3) | 55.6 ± 15.7 (3) | 57.8 ± 17.1 (26) |
Parenthesized values indicate the numbers of probes used in each model.
Figure 2. Hierarchical clustering of STS patients by using 28 genes selected by PART-BFCS.
The genes selected by PART-BFCS and the genes having high correlation with them
| Accession no. | Gene name | Times of selection | Top 10 highly correlated genes | | | | | | | | | |
| | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| NM_002415 | MIF | 9 | NIPSNAP1 NM_003634 (0.74) | DDT NM_001355 (0.73) | PORIMIN BG538627 (0.73) | NDUFA7 NM_005001 (0.71) | SNAP29 NM_004782 (0.71) | TSSC3 AF001294 (0.71) | MMP1 NM_002421 (0.71) | LSM1 NM_014462 (0.71) | DGCR14 L77566 (0.71) | MRPL20 NM_017971 (0.69) |
| AB032261 | SCD | 8 | SCD AA678241 (0.94) | SMAP1 NM_021940 (0.74) | SCD BC005807 (0.69) | NRBF-2 AA883074 (0.67) | INSIG1 NM_005542 (0.65) | FADS1 AL512760 (0.63) | VDAC2 L08666 (0.63) | GLUD1 NM_005271 (0.62) | TFG NM_006070 (0.62) | TFAP2A BF343007 (0.61) |
| NM_016332 | SEPX1 | 7 | SIAH2 U76248 (0.81) | WHSC1 BF111870 (0.73) | KIAA0220 AI424872 (0.73) | BAIAP3 AI799802 (0.71) | TBC1D1 AB029031 (0.70) | FARSLA AD000092 (0.70) | OPRS1 NM_005866 (0.70) | CTBP2 N23018 (0.70) | TCF3 AW062341 (0.69) | KCTD5 NM_018992 (0.68) |
| AL161999 | CYFIP2 | 5 | CRISPLD2 AL136861 (0.91) | NTF3 NM_002527 (0.90) | PRO1331 NM_030778 (0.89) | TNFSF11 AF053712 (0.88) | NTRK3 S76476 (0.87) | SLC24A3 NM_020689 (0.86) | KCNIP1 NM_014592 (0.86) | CP NM_000096 (0.86) | ARTN AF120274 (0.86) | KIAA0523 AB011095 (0.86) |
| AI218219 | HSPCB | 5 | HSPCB AF275719 (0.89) | HSP105B NM_006644 (0.84) | HSP105B BG403660 (0.82) | FOXG1B NM_005249 (0.81) | HSPD1 BE256479 (0.80) | TERA_ NM_021238 (0.80) | DNAJB1 BG537255 (0.79) | HSPE1 NM_002157 (0.79) | FXR1 AI990766 (0.79) | NXT2 AF201942 (0.78) |
| AI811298 | OSR2 | 5 | OAZ AW149417 (0.80) | FXYD1 NM_005031 (0.78) | FBLN2 NM_001998 (0.78) | PMP22 L03203 (0.73) | KIAA0763 AI652645 (0.73) | TEK NM_000459 (0.73) | KIAA0644 NM_014817 (0.72) | GAS7 BE439987 (0.71) | FLJ10159 NM_018013 (0.70) | WNT10B NM_003394 (0.69) |
| U67195 | TIMP3 | 5 | TIMP3 BF347089 (0.91) | TIMP3 AW338933 (0.90) | IL6ST AW242916 (0.88) | IL6ST NM_002184 (0.88) | IL6ST AB015706 (0.82) | HLA-DRB3 AA807056 (0.80) | TIMP3 NM_000362 (0.78) | IL6ST BE856546 (0.76) | HAS1 NM_001523 (0.74) | C6orf133 AB002347 (0.74) |
| NM_020122 | PCMF | 4 | NTPBP AB044661 (0.85) | MGC10882 BC004952 (0.80) | C16orf34 AK023154 (0.77) | FKBP4 NM_002014 (0.77) | PFDN2 NM_012394 (0.76) | FKBP4 AA894574 (0.75) | LDLR NM_000527 (0.75) | STIP1 BE886580 (0.74) | AHSA1 NM_012111 (0.74) | FXR1 NM_005087 (0.73) |
| NM_001998 | FBLN2 | 4 | GAS7 NM_005890 (0.86) | GAS7 BE439987 (0.86) | PMP22 L03203 (0.85) | BMP1 NM_006129 (0.79) | OSR2 AI811298 (0.78) | KIAA0644 NM_014817 (0.77) | FXYD1 NM_005031 (0.77) | WNT10B NM_003394 (0.74) | ZDHHC3 NM_016598 (0.73) | AHNAK BG287862 (0.73) |
| NM_005566 | LDHA | 4 | PLOD2 NM_000935 (0.74) | ALDOA AK026577 (0.72) | ADM NM_001124 (0.70) | PSMA1 NM_002786 (0.69) | ALDOA NM_000034 (0.68) | VDAC1 AL515918 (0.68) | QSCN6 NM_002826 (0.67) | PKM2 NM_002654 (0.67) | PSG3 BC005924 (0.67) | TCP11L1 NM_018393 (0.67) |
| NM_005756 | GPR64 | 3 | ADD3 NM_019903 (0.81) | ADD3 AI818488 (0.79) | SLC4A4 NM_003759 (0.79) | ADD3 AI763123 (0.79) | ADD3 BE545756 (0.78) | CRYAB AF007162 (0.78) | LRRC16 NM_017640 (0.77) | EYA2 U71207 (0.74) | HSPB2 NM_001541 (0.71) | SPRY1 BF508662 (0.71) |
| AL136663 | PLXNA1 | 2 | PCBP2 AW103422 (0.70) | MGC5566 NM_024049 (0.63) | CLIC5 NM_016929 (0.63) | SMAD3 NM_015400 (0.62) | PTPRB NM_002837 (0.62) | SMAD3 BF971416 (0.62) | ICAM2 AA126728 (0.62) | SEMA3G NM_020163 (0.61) | KIAA0417 AB007877 (0.61) | EXT1 NM_000127 (0.60) |
| AL136663 | ABR | 2 | RNMTL1 NM_018146 (0.72) | MFAP4 R72286 (0.62) | ABR AL136663 (0.61) | KIAA1085 AU160676 (0.61) | P2RX4 NM_002560 (0.61) | CYP2E AF182276 (0.60) | LOC51031 AF061730 (0.60) | ZNF212 NM_012256 (0.59) | GSPT2 NM_018094 (0.59) | IDUA NM_000203 (0.59) |
| AL527773 | RARRES2 | 2 | PANX1 NM_015368 (0.91) | CCT8 NM_006585 (0.89) | GART NM_000819 (0.88) | ASMTL Y15521 (0.87) | ASMTL AA669799 (0.87) | ASMTL BC002508 (0.87) | SERPINB7 NM_003784 (0.85) | SERPINB3 AB046400 (0.84) | SERPINB4 U19557 (0.83) | ATP5O NM_001697 (0.83) |
| NM_021106 | RGS3 | 2 | TDO2 NM_005651 (0.84) | MMP13 NM_002427 (0.81) | COL11A1 NM_001854 (0.80) | MMP9 NM_004994 (0.75) | COL11A1 J04177 (0.74) | CLECSF5 NM_013252 (0.72) | HBA2 T50399 (0.72) | SLC19A1 AF004354 (0.72) | MMP11 AI761713 (0.71) | MMP11 NM_005940 (0.71) |
| 13 additional genes |
The left-hand side of the table shows the genes selected by PART-BFCS, and the right-hand side shows the genes most highly correlated with them. Parenthesized values indicate correlation coefficients.
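A ranking like the one in the right-hand columns of the table can be sketched as below. This is an illustrative example only. It assumes a Pearson correlation coefficient and a ranking by signed value, neither of which is stated in this record. The helper names `pearson` and `top_correlated`, and the dictionary mapping probe identifiers to per-sample expression values, are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def top_correlated(expression, target, k=10):
    """Return the k probes most positively correlated with `target`.

    `expression` maps a probe identifier to its expression values
    across the same ordered set of samples.
    """
    reference = expression[target]
    scored = [
        (probe, pearson(reference, values))
        for probe, values in expression.items()
        if probe != target  # skip the selected probe itself
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Calling `top_correlated(expression, "NM_002415", k=10)` on a full expression dictionary would return ten `(probe, coefficient)` pairs in descending order, matching the layout of columns 1 to 10 in the table.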
Figure 3. Hierarchical clustering of STS patients by using 145 probes having high correlation with the 15 probes selected by PART-BFCS.