Chaolin Zhang, Hai-Ri Li, Jian-Bing Fan, Jessica Wang-Rodriguez, Tracy Downs, Xiang-Dong Fu, Michael Q Zhang.
Abstract
BACKGROUND: Prostate cancer is one of the leading causes of cancer illness and death among men in the United States and worldwide. There is an urgent need to discover good biomarkers for early clinical diagnosis and treatment. Previously, we developed an exon-junction microarray-based assay and profiled 1532 mRNA splice isoforms from 364 potential prostate cancer-related genes in 38 prostate tissues. Here, we investigate the advantage of using splice isoforms, which couple transcriptional and splicing regulation, for cancer classification.
Year: 2006 PMID: 16608523 PMCID: PMC1458362 DOI: 10.1186/1471-2105-7-202
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Prostate tumor and normal samples can be separated into distinct groups. (A) A thumbnail overview of the two-way average-linkage hierarchical clustering of 38 arrays (columns) and 1532 isoforms (rows), as described in ref [30]. (B) Zoomed-in view of the array clustering dendrogram. The two array clusters, C1 and C2, are enriched in normal samples and tumor samples, respectively. Cluster C2 consists of two sub-clusters, reflecting differences in tumor percentage and stroma. (C-E) Isoform signatures up- or down-regulated in different array clusters. (F and G) The result of singular value decomposition (SVD). (F) The percentage of variation (y-axis) captured by each principal component (x-axis). (G) The low-dimensional projection of the arrays in the 3D space spanned by the first three principal components. SVD identified the same hierarchical structure as that revealed by hierarchical clustering.
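The average-linkage clustering of arrays mentioned in panel (A) can be illustrated with a minimal pure-Python sketch. The Euclidean distance, the function name `average_linkage`, and the four toy "arrays" are assumptions made for illustration; the caption does not state the distance measure or the software used.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two expression profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are indices into `profiles`.
    """
    clusters = {i: [i] for i in range(len(profiles))}
    merges = []
    next_id = len(profiles)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average of all pairwise distances between the two clusters.
            total = sum(euclidean(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges

# Hypothetical toy data: two "normal-like" and two "tumor-like" arrays.
toy_arrays = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
merges = average_linkage(toy_arrays)
```

In this toy case the last merge joins the two groups `[0, 1]` and `[2, 3]`, the analogue of the two top-level array clusters described in panel (B).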
Figure 2. Profiling splice isoforms provides additional useful information for prostate cancer classification. (A) The validity of estimating the overall mRNA abundance level from the isoform abundance level. The overall mRNA level was estimated by summing the abundances of the individual isoforms of each gene. The estimated mRNA abundances of 107 genes were compared with direct measurements from an independent expression microarray design (described in the main text). Plotted is a scatter plot of the log expression ratios of these genes in two prostate cancer cell lines, LNCaP and PC-3. The two approaches show good agreement (R = 0.80, p = 2.2e-16). (B) 159 of the 364 genes profiled in the DASL assay exhibit differential expression between tumor and normal samples at the overall mRNA level (q-value = 0.05). Most of them (92%) have isoforms with significant differential expression. (C and D) 464 isoforms from 222 genes are reported as differentially expressed between tumors and normal tissues (q-value = 0.05) and may be prostate cancer marker candidates. 32% of these genes (corresponding to 22% of the significant isoforms) do not show differential expression at the overall mRNA level and therefore cannot be detected by conventional microarrays.
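The estimate in panel (A), summing isoform abundances per gene and then comparing log expression ratios between two cell lines, can be sketched as follows. This is only an illustration: the abundance values are invented, the base-2 logarithm is an assumption (the caption says only "log expression ratios"), and deriving the gene symbol from the text before the last hyphen of the isoform ID is an assumption based on IDs such as `AMACR-2094`.

```python
import math
from collections import defaultdict

def gene_totals(isoform_abundance):
    """Sum isoform abundances per gene (gene = ID text before the last hyphen)."""
    totals = defaultdict(float)
    for isoform_id, value in isoform_abundance.items():
        gene = isoform_id.rsplit("-", 1)[0]
        totals[gene] += value
    return dict(totals)

def log_ratios(a, b):
    """log2(a/b) for every gene present in both samples (base 2 is assumed)."""
    return {g: math.log2(a[g] / b[g]) for g in a if g in b}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical isoform abundances in two cell lines (not values from the paper).
lncap = {"AMACR-2094": 2.0, "AMACR-2097": 6.0, "CLU-0197": 4.0}
pc3 = {"AMACR-2094": 1.0, "AMACR-2097": 1.0, "CLU-0197": 8.0}

ratios = log_ratios(gene_totals(lncap), gene_totals(pc3))
```

With real data, `pearson` would be applied to the isoform-derived ratios and the ratios measured directly on the independent array, gene by gene.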
Figure 3. Classification performance is measured by leave-one-out cross-validation. To obtain an unbiased result, variable selection and training are performed on the training arrays only, which are completely independent of the test array. (A) Comparison of the classification performance of SVM-RFE-selected variables using individual isoforms versus overall mRNAs. (B) Comparison of the classification performance of variable subsets selected by SVM-RFE and by t-test, using individual isoforms.
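The key point of this caption, that variable selection must be repeated inside every leave-one-out fold so that the held-out array never influences it, can be sketched in pure Python. To stay dependency-free, this sketch ranks variables by a Welch t-statistic and classifies with a nearest-centroid rule; the nearest-centroid classifier is a stand-in and is not the SVM or SVM-RFE procedure used in the paper. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t-statistic between two groups (0.0 if both have zero variance)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def select_features(X, y, k):
    """Indices of the k variables with the largest |t| in the training data."""
    scores = []
    for j in range(len(X[0])):
        tumor = [row[j] for row, label in zip(X, y) if label == 1]
        normal = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(tumor, normal)), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def nearest_centroid_predict(X, y, features, sample):
    """Stand-in classifier: assign the label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y)):
        rows = [row for row, l in zip(X, y) if l == label]
        centroid = [mean(r[j] for r in rows) for j in features]
        dist = sum((sample[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy with selection redone inside each fold."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, k)  # held-out array unseen
        correct += nearest_centroid_predict(X_train, y_train, features, X[i]) == y[i]
    return correct / len(X)

# Hypothetical toy data: variable 0 is informative, variables 1 and 2 are noise.
X = [[5.0, 1.0, 0.3], [5.2, 0.2, 0.9], [4.9, 0.7, 0.1],
     [1.0, 0.9, 0.4], [1.1, 0.1, 0.8], [0.8, 0.6, 0.2]]
y = [1, 1, 1, 0, 0, 0]
accuracy = loocv_accuracy(X, y, k=1)
```

Selecting variables once on all arrays and then cross-validating only the classifier would leak information from the test array into the selection step, which is the bias the caption warns against.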
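Top prostate cancer marker candidates selected by both t-test and SVM-RFE.
| Isoform¶ | Expression change (tumor vs. normal) | FDR# | SVM-RFE freq.§ | Gene description |
|---|---|---|---|---|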
| ALDH1A2-0004 | -1.21 | 1.3E-04 | 35 | Aldehyde dehydrogenase 1A2 |
| AMACR-2094 | 1.41 | 6.7E-05 | 38 | Alpha-methylacyl-CoA racemase |
| AMACR-2097 | 1.08 | 9.2E-04 | 38 | |
| AMACR-2098 | 0.99 | 1.8E-03 | 17 | |
| ANXA2-0914 | -1.04 | 1.8E-03 | 36 | Annexin A2 |
| APBB3-0185 | 1.01 | 1.5E-03 | 38 | Amyloid beta (A4) precursor protein-binding family B member 3 |
| BC008967-0877 | -1.38 | 7.9E-05 | 26 | |
| C21ORF5-0239 | 1.24 | 6.0E-04 | 35 | Chromosome 21 open reading frame 5 |
| C7ORF24-0062 | 1.30 | 8.4E-05 | 17 | |
| CALCR-1180 | 1.05 | 5.2E-04 | 37 | Calcitonin receptor |
| CCT8-0334 | 1.21 | 1.5E-04 | 32 | Protein with high similarity to C. elegans Y55F3AR.3 |
| CDC42BPA-1048 | -1.19 | 6.0E-04 | 38 | CDC42 binding protein kinase alpha |
| CDK7-0899 | 1.35 | 8.4E-05 | 37 | Cyclin-dependent protein kinase 7 |
| CES1-0937 | -1.34 | 7.9E-05 | 32 | Cat eye syndrome chromosome region candidate 1 |
| CLU-0197 | -1.11 | 1.2E-03 | 38 | Clusterin (apolipoprotein J) |
| EDNRB-1187 | -1.24 | 4.7E-04 | 26 | Endothelin type B receptor |
| FGFR2-0094 | -1.13 | 4.0E-04 | 19 | Fibroblast growth factor receptor 2 |
| FGFR2-0099 | -1.03 | 7.7E-04 | 28 | |
| HEBP2-0472 | 1.08 | 7.8E-04 | 24 | Heme binding protein 2 (placental protein 23) |
| HSPD1-0152 | 1.10 | 1.8E-03 | 37 | Chaperonin 60 |
| HSPD1-0154 | 1.17 | 2.8E-04 | 31 | |
| IGSF4-0722 | 0.72 | 2.1E-03 | 38 | Immunoglobulin superfamily member 4 |
| IMPDH2-0144 | 1.25 | 1.3E-04 | 34 | Inosine monophosphate dehydrogenase type 2 |
| IQGAP2-0234 | 1.17 | 5.6E-04 | 22 | IQ motif containing GTPase activating protein 2 |
| LAMR1-0523 | 1.20 | 1.3E-04 | 38 | Laminin receptor 1 |
| LTBP4-0746 | -1.27 | 1.5E-04 | 33 | Latent transforming growth factor beta binding protein 4 |
| LTBP4-0748 | -1.10 | 1.4E-03 | 38 | |
| LYPLA1-0860 | 1.38 | 7.9E-05 | 35 | Lysophospholipase 1 |
| NELL2-0805 | -1.10 | 1.2E-03 | 24 | Nel-like 2 |
| PGR-1166 | -1.16 | 4.0E-04 | 32 | Progesterone receptor |
| PGR-1555 | 0.85 | 7.5E-04 | 38 | |
| PPIB-0969 | 0.94 | 2.2E-03 | 34 | Cyclophilin B |
| PTS-0059 | -1.07 | 2.2E-03 | 31 | 6-pyruvoyltetrahydropterin synthase |
| PYCR1-0058 | 1.28 | 4.1E-04 | 38 | Pyrroline-5-carboxylate reductase 1 |
| RING1-0217 | -0.93 | 1.7E-03 | 22 | Ring finger protein 1 |
| SFRS10-1126 | 0.95 | 2.0E-03 | 34 | Splicing factor arginine/serine rich 10 |
| SMPDL3B-2030 | 1.09 | 2.2E-04 | 38 | Protein containing a calcineurin-like phosphoesterase domain |
| STAC-1044 | -1.31 | 7.9E-05 | 34 | Src homology three and cysteine rich domain |
| TGFB2-0085 | -1.11 | 6.5E-04 | 38 | Transforming growth factor beta 2 |
| TRIM29-1350 | -1.29 | 1.5E-04 | 35 | Ataxia telangiectasia mutated |
| TRIM29-1353 | -1.20 | 1.7E-04 | 34 | |
| TXNIP-1116 | 1.09 | 1.3E-03 | 38 | Thioredoxin interacting protein |
¶ Detailed information on each isoform, such as the exon junction and probe design, can be accessed at the MAASE database [48].
# FDR is calculated using all 38 samples.
§ SVM-RFE freq.: the number of times an isoform is included in the 38 subsets selected during leave-one-out cross-validation.
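The selection frequency defined in the last footnote is a simple count over the per-fold subsets. A minimal sketch, assuming each fold's selected subset is available as a list of isoform IDs; the three toy folds below are invented and do not reproduce the frequencies in the table.

```python
from collections import Counter

def selection_frequency(selected_subsets):
    """Count in how many cross-validation folds each isoform was selected."""
    counts = Counter()
    for subset in selected_subsets:
        counts.update(set(subset))  # count each isoform at most once per fold
    return counts

# Hypothetical subsets from three folds (the study used 38 folds).
toy_subsets = [
    ["AMACR-2094", "CLU-0197"],
    ["AMACR-2094", "PGR-1166"],
    ["AMACR-2094", "CLU-0197"],
]
freq = selection_frequency(toy_subsets)
```

An isoform whose count equals the number of folds, such as `AMACR-2094` in this toy input, was selected regardless of which array was held out.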
Pathological information of tumor and normal prostate samples
| T5 | 67 | Low | 50 | 0 | 0 | 20 | 0 | 8.48 | 5+4 = 9 | T3bN1Mx |
| T21 | 74 | Low | 60 | 10 | 10 | 20 | 0 | 6.7 | 4+4 = 8 | T2bNxMx |
| N22 | 74 | Low | 0 | 10 | 40 | 50 | 0 | 6.7 | | T2bNxMx |
| N30 | 55 | Int | 0 | 10 | 30 | 68 | 0 | 11.68 | | T2bN1Mx |
| N44 | 61 | Low | 0 | 10 | 2 | 88 | 0 | 5.46 | | T2cNxMx |
| N46 | 74 | High | 0 | 45 | 20 | 35 | 0 | 8.06 | | T2aNxMx |
| N56 | 67 | High | 0 | 5 | 0 | 94 | 0 | 5.7 | | T2aN0Mx |
| T72 | 68 | Int | 70 | 0 | 0 | 30 | 0 | 8.27 | 4+3 = 7 | T3bN1Mx |
| N77 | 66 | Int | 0 | 0 | 10 | 89 | 1 | 3.15 | | T2cNxMx |
| T78 | 66 | Int | 35 | 5 | 5 | 55 | 0 | 3.15 | 3+4 = 7 | T2cNxMx |
| T84 | 60 | High | 70 | 5 | 0 | 25 | 0 | 9.99 | 4+5 = 9 | T3bN0Mx |
| N85 | 66 | Int | 0 | 30 | 0 | 70 | 0 | 4.37 | | T3bN0Mx |
| T86 | 66 | Int | 90 | 5 | 0 | 5 | 0 | 4.37 | 4+4 = 8 | T3bN0Mx |
| T87 | 61 | High | 25 | 45 | 5 | 25 | 0 | 2.23 | 4+3 = 7 | T2bN0Mx |
| N88 | 61 | High | 0 | 10 | 30 | 60 | 0 | 2.23 | | T2bN0Mx |
| T107 | 68 | Int | 60 | 10 | 0 | 30 | 0 | 7.4 | 4+3 = 7 | T2bNxMx |
| N109 | 67 | Low | 0 | 5 | 0 | 90 | 5 | 7 | | T2bNxMx |
| T110 | 67 | Low | 40 | 0 | 0 | 58 | 0 | 7 | 3+4 = 7 | T2bNxMx |
| N113 | 70 | Low | 0 | 10 | 5 | 85 | 0 | 4.78 | | T3aNxMx |
| T114 | 70 | Low | 40 | 0 | 5 | 55 | 0 | 4.78 | 4+4 = 8 | T3aNxMx |
| N121 | 50 | | 0 | 30 | 2 | 68 | 0 | 0.22 | | |
| T122 | 67 | Low | 70 | 0 | 5 | 25 | 0 | 7 | 3+4 = 7 | T2bNxMx |
| T123 | 78 | | 80 | 0 | 0 | 20 | 0 | 17.7 | 5+5 = 10 | NR |
| N133 | | | 0 | 25 | 5 | 75 | 0 | | | |
| T147 | 78 | Int | 70 | 0 | 0 | 30 | 0 | 6.9 | 4+4 = 8 | T2bN0Mx |
| N148 | 67 | Low | 0 | 35 | 10 | 55 | 0 | 4.68 | | T2aNxMx |
| N155 | 70 | Int | 0 | 40 | 10 | 48 | 2 | 8.4 | | T2cNxMx |
| T167 | 72 | Int | 80 | 0 | 10 | 10 | 0 | 18 | 4+4 = 8 | T2bN0Mx |
| T174 | 83 | High | 70 | 5 | 0 | 25 | 0 | 15 | 5+4 = 9 | T4 |
| T177 | 67 | Int | 40 | 0 | 30 | 30 | 0 | 10.87 | 4+4 = 8 | T2cN0Mx |
| T189 | 77 | N/A | 70 | 0 | 0 | 0 | 30 | 2.51 | 5+5 = 10 | T2bN2Mx |
| T192 | 61 | Int | 50 | 5 | 10 | 35 | 0 | 5.7 | 4+4 = 8 | T3aNxMx |
| N196 | 73 | Low | 0 | 40 | 5 | 55 | 0 | 4.59 | | T2bNxMx |
| T197 | 67 | High | 95 | 0 | 0 | 5 | 0 | 21.82 | 4+4 = 8 | T3aN1Mx |
| T198 | 60 | | 60 | 0 | 10 | 25 | 0 | 4.06 | 4+4 = 8 | T3bNxMx |
| N201 | 64 | | 0 | 20 | 5 | 45 | 0 | UNK | | T2bNxMx |
| T202 | 67 | Int | 90 | 0 | 5 | 5 | 0 | 12.34 | 4+4 = 8 | T3bNxMx |
| T204 | 54 | Low | 80 | 0 | 5 | 15 | 0 | 3.91 | 4+5 = 9 | T3cNxMx |