Anirban P Mitra, Arpit A Almal, Ben George, David W Fry, Peter F Lenehan, Vincenzo Pagliarulo, Richard J Cote, Ram H Datar, William P Worzel.
Abstract
BACKGROUND: Previous studies on bladder cancer have shown nodal involvement to be an independent indicator of prognosis and survival. This study aimed at developing an objective method for detection of nodal metastasis from molecular profiles of primary urothelial carcinoma tissues.
Year: 2006 PMID: 16780590 PMCID: PMC1550424 DOI: 10.1186/1471-2407-6-159
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Figure 1. Marker panel employed for standardized competitive RT-PCR analysis. A total of 70 genes involved in eight broad pathways commonly deregulated in cancer were chosen for this study. The primary effector pathways of tumorigenesis encompass apoptosis, cell cycle, gene regulation, cell growth regulation and anti-oxidation, and comprise 57 genes. There is significant overlap of markers among the first three pathways. The secondary effector pathways include signal transduction, angiogenesis and invasion, and comprise 13 genes. All the listed genes exert stimulatory, inhibitory and/or regulatory effects on their respective pathway(s).
Distribution of the study population on the basis of nodal positivity and tumor stage.
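| Stage | Training set: node positive | Training set: node negative | Training set: total | Validation set: node positive | Validation set: node negative | Validation set: total |
| Normal controls | | 3 | 3 | | 2 | 2 |
| pTa | 0 | 3 | 3 | 0 | 7 | 7 |
| pT1 | 2 | 6 | 8 | 1 | 4 | 5 |
| pT2 | 0 | 4 | 4 | 0 | 4 | 4 |
| pT3 | 7 | 5 | 12 | 7 | 3 | 10 |
| pT4 | 2 | 2 | 4 | 2 | 1 | 3 |
| Total | 11 | 23 | | 10 | 21 | |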
The total cohort of 65 subjects included five normal controls that were classified as node negative. An approximately equal distribution of the subjects was attempted between both sets in all tumor stages and nodal classes to eliminate bias. Tumor and nodal stages were determined according to the American Joint Committee on Cancer recommended TNM system for urinary bladder cancer (2002).
Figure 2. The genetic programming process. This iterative technique was employed on the training set samples to generate classifier rules that were tested on the validation set. Randomly chosen components were initially used to create a population of candidate programs from which a small mating pool of candidate programs was generated. Inputs were passed into these programs and the predicted nodal statuses were evaluated for fitness. The two best performing programs were then mated to produce offspring that replaced the two least fit programs. This process was repeated over many generations to create better programs.
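The loop described in the Figure 2 caption can be illustrated with a toy steady-state genetic programming sketch. This is not the authors' implementation: the operator set, tree representation, pool size, depth cap, and the synthetic two-feature data are all assumptions chosen only to show the cycle of pool selection, fitness evaluation, mating of the two best programs, and replacement of the two least fit.

```python
import operator
import random

# Hypothetical function set; the source does not list the operators used.
OPS = [operator.add, operator.sub, operator.mul]

def random_program(n_inputs, depth, rng):
    """Random expression tree: ("x", i) is a leaf, (op, left, right) a node."""
    if depth == 0 or rng.random() < 0.3:
        return ("x", rng.randrange(n_inputs))
    return (rng.choice(OPS),
            random_program(n_inputs, depth - 1, rng),
            random_program(n_inputs, depth - 1, rng))

def evaluate(program, inputs):
    if program[0] == "x":
        return inputs[program[1]]
    op, left, right = program
    return op(evaluate(left, inputs), evaluate(right, inputs))

def fitness(program, samples):
    """Fraction of samples where (output > 0) matches the boolean label."""
    hits = sum((evaluate(program, x) > 0) == label for x, label in samples)
    return hits / len(samples)

def tree_depth(program):
    if program[0] == "x":
        return 0
    return 1 + max(tree_depth(program[1]), tree_depth(program[2]))

def random_subtree(program, rng):
    while program[0] != "x" and rng.random() < 0.5:
        program = rng.choice(program[1:])
    return program

def crossover(a, b, rng):
    """Copy of a with one randomly chosen subtree replaced by a subtree of b."""
    if a[0] == "x" or rng.random() < 0.3:
        return random_subtree(b, rng)
    op, left, right = a
    if rng.random() < 0.5:
        return (op, crossover(left, b, rng), right)
    return (op, left, crossover(right, b, rng))

def mate(a, b, rng, max_depth=6):
    """Crossover with a depth cap (an added safeguard, not from the source)."""
    child = crossover(a, b, rng)
    return child if tree_depth(child) <= max_depth else a

def evolve(samples, n_inputs, pop_size=50, pool_size=6, generations=200, seed=0):
    rng = random.Random(seed)
    population = [random_program(n_inputs, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Draw a small mating pool and rank its members by fitness.
        pool = rng.sample(range(pop_size), pool_size)
        pool.sort(key=lambda i: fitness(population[i], samples), reverse=True)
        best, second = population[pool[0]], population[pool[1]]
        # Offspring of the two fittest replace the two least fit in the pool.
        population[pool[-1]] = mate(best, second, rng)
        population[pool[-2]] = mate(second, best, rng)
    return max(population, key=lambda p: fitness(p, samples))

# Synthetic stand-in data: the label is True when feature 0 exceeds feature 1.
data_rng = random.Random(1)
points = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(40)]
samples = [(p, p[0] > p[1]) for p in points]
best = evolve(samples, n_inputs=2)
print("training accuracy of best program:", fitness(best, samples))
```

Because only the two weakest members of each pool are overwritten, the best fitness in the population never decreases from one generation to the next in this sketch.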
Final meta-rule for node positive patients generated from the set of 70 genes.
[Table body: rules 1 to 11 of the meta-rule; the rule expressions are not recoverable here.]
Performance of the selected meta-rule generated from the set of 70 genes on the validation set and result metrics.
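| | Actual node positive | Actual node negative |
| Predicted node positive | 6§ | 2 |
| Predicted node negative | 4 | 19† |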
§ True positive subjects.
† True negative subjects.
Accuracy: 81%
Sensitivity: 60%
Specificity: 90%
Positive Predictive Value: 75%
Negative Predictive Value: 83%
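The reported metrics follow directly from the four cell counts. The sketch below recomputes them, assuming the off-diagonal cells are 2 false positives and 4 false negatives, which is the assignment consistent with the reported sensitivity and specificity. The function name `diagnostic_metrics` is illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic metrics as fractions between 0 and 1."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Counts from the validation-set table; the FP/FN placement is an assumption.
metrics = diagnostic_metrics(tp=6, fp=2, fn=4, tn=19)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
# accuracy: 81%
# sensitivity: 60%
# specificity: 90%
# ppv: 75%
# npv: 83%
```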
Figure 3. Histogram of gene usage frequencies. Examination of the gene usage frequencies among the best of 220 rules drawn from 20 runs of 11 folds showed a strong preference for the KDR, MAP2K6 and ICAM1 genes, which were also components of some of the major gene expression motifs. Rules created using only the top three genes showed comparatively better performance, indicating their importance in the genesis of nodal metastasis.
Probability of gene usage from the set of 70 genes due to random chance.
| Frequency in 220 rules | Probability by random chance | P-value |
| 159 | 9.69E-130 | <0.00001 |
| 146 | 1.13E-110 | <0.00001 |
| 121 | 4.10E-78 | <0.00001 |
| 60 | 7.04E-20 | <0.00001 |
| 56 | 3.38E-17 | <0.00001 |
| 55 | 1.49E-16 | <0.00001 |
| 49 | 6.56E-13 | <0.00001 |
| 41 | 1.08E-08 | <0.00001 |
| 24 | 1.11E-02 | 0.01490 |
| 23 | 1.76E-02 | 0.02599 |
| 19 | 6.73E-02 | 0.16020 |
| 18 | 8.23E-02 | 0.22749 |
| 15 | 1.04E-01 | 0.49264 |
| 15 | 1.04E-01 | 0.49264 |
| 12 | 7.05E-02 | 0.20295 |
| 11 | 5.26E-02 | 0.13246 |
Number of genes per rule = 5 (approximately)
Arranged in decreasing order of their frequencies of occurrence in 220 rules, the genes show a general trend towards increasing probability of being selected in a rule by random chance. The postulated frequency of occurrence of each gene if all have equal probabilities of being selected is 15.71. PTGS2 and PDGFB have smaller probabilities than the two genes preceding them because they were actually used less frequently than random chance would suggest.
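The expected frequency of 15.71 is 220 rules times 5 genes per rule divided by 70 genes. As a hedged illustration, the sketch below also models chance usage as a binomial draw over the 220 rules. The binomial model is an assumption, not a method stated in the source, although it appears consistent with the tabulated probabilities (for example, roughly 0.104 at a frequency of 15).

```python
from math import comb

N_RULES = 220        # best rules drawn from 20 runs of 11 folds
N_GENES = 70
GENES_PER_RULE = 5   # approximate, as stated in the table note

# Chance that a given gene appears in any one rule if all genes are equally likely.
p_gene = GENES_PER_RULE / N_GENES
expected = N_RULES * p_gene

def chance_probability(k, n=N_RULES, p=p_gene):
    """Binomial probability that a gene appears in exactly k of n rules (assumed model)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(f"expected frequency: {expected:.2f}")
print(f"P(frequency = 15): {chance_probability(15):.3f}")
```

Under this model the probability peaks near the expected frequency and falls off on both sides, which is why genes used 12 or 11 times have smaller probabilities than those used 15 times.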
Performance of the meta-rule generated using the three most frequently used genes, viz. KDR, MAP2K6 and ICAM1, on the validation set and result metrics.
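| | Actual node positive | Actual node negative |
| Predicted node positive | 7§ | 3 |
| Predicted node negative | 3 | 18† |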
§ True positive subjects
† True negative subjects
Accuracy: 81%
Sensitivity: 70%
Specificity: 86%
Positive Predictive Value: 70%
Negative Predictive Value: 86%
Advantages of genetic programming.
| Statistical Analysis | Yes | Limited | No | Limited |
| Cluster Analysis | Yes | No | No | No |
| Support Vector Machine | No | No | No | Yes |
| Neural Networks | No | No | No | Yes |
| Genetic Programming | Yes | Yes | Yes | Yes |