Hideaki Mizuno, Kunio Kitada, Kenta Nakai, Akinori Sarai.
Abstract
BACKGROUND: In cancer research, an association between a gene and clinical outcome suggests the underlying etiology of the disease and can consequently motivate further studies. The recent availability of published cancer microarray datasets with clinical annotation provides the opportunity to link gene expression to prognosis. However, the data are not easy to access and analyze without an effective analysis platform.
DESCRIPTION: To take full advantage of public resources, a database named "PrognoScan" has been developed. It is 1) a large collection of publicly available cancer microarray datasets with clinical annotation, and 2) a tool for assessing the biological relationship between gene expression and prognosis. To group patients for survival analysis, PrognoScan employs the minimum P-value approach, which finds the optimal cutpoint in continuous gene expression measurements without prior biological knowledge or assumptions and, as a result, enables systematic meta-analysis of multiple datasets.
Year: 2009 PMID: 19393097 PMCID: PMC2689870 DOI: 10.1186/1755-8794-2-18
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Dataset content from PrognoScan
| Dataset | Cancer type | Subtype | Cohort | Contributor | Array type | Number of patients | Data source |
|---|---|---|---|---|---|---|---|
| GSE13507 | Bladder cancer | Transitional cell carcinoma | Cheongju | Kim | Human-6 v2 | 165 | GEO |
| GSE5287 | Bladder cancer | | Aarhus (1995–2004) | Als | HG-U133A | 30 | GEO |
| GSE12417-GPL570 | Blood cancer | AML | AMLCG (2004) | Metzeler | HG-U133_Plus_2 | 79 | GEO |
| GSE12417-GPL96 | Blood cancer | AML | AMLCG (1999–2003) | Metzeler | HG-U133A | 163 | GEO |
| GSE12417-GPL97 | Blood cancer | AML | AMLCG (1999–2003) | Metzeler | HG-U133B | 163 | GEO |
| GSE8970 | Blood cancer | AML | San Diego | Raponi | HG-U133A | 34 | GEO |
| GSE4475 | Blood cancer | B-cell lymphoma | Berlin (2003–2005) | Hummel | HG-U133A | 158 | GEO |
| E-TABM-346 | Blood cancer | DLBCL | GELA (1998–2000) | Jais | HG-U133A | 53 | ArrayExpress |
| GSE2658 | Blood cancer | Multiple myeloma | Arkansas | Zhan | HG-U133_Plus_2 | 559 | GEO |
| E-TABM-158 | Breast cancer | | UCSF, CPMC (1989–1997) | Chin | HG-U133A | 129 | ArrayExpress |
| GSE11121 | Breast cancer | | Mainz (1988–1998) | Schmidt | HG-U133A | 200 | GEO |
| GSE1378 | Breast cancer | | MGH (1987–2000) | Ma | Arcturus 22 k | 60 | GEO |
| GSE1379 | Breast cancer | | MGH (1987–2000) | Ma | Arcturus 22 k | 60 | GEO |
| GSE1456-GPL96 | Breast cancer | | Stockholm (1994–1996) | Pawitan | HG-U133A | 159 | GEO |
| GSE1456-GPL97 | Breast cancer | | Stockholm (1994–1996) | Pawitan | HG-U133B | 159 | GEO |
| GSE2034 | Breast cancer | | Rotterdam (1980–1995) | Wang | HG-U133A | 286 | GEO |
| GSE2990 | Breast cancer | | Uppsala, Oxford | Sotiriou | HG-U133A | 187 | GEO |
| GSE3143 | Breast cancer | | Duke | Bild | HG-U95A | 158 | GEO |
| GSE3494-GPL96 | Breast cancer | | Uppsala (1987–1989) | Miller | HG-U133A | 236 | GEO |
| GSE3494-GPL97 | Breast cancer | | Uppsala (1987–1989) | Miller | HG-U133B | 236 | GEO |
| GSE4922-GPL96 | Breast cancer | | Uppsala (1987–1989) | Ivshina | HG-U133A | 249 | GEO |
| GSE4922-GPL97 | Breast cancer | | Uppsala (1987–1989) | Ivshina | HG-U133B | 249 | GEO |
| GSE6532-GPL570 | Breast cancer | | GUYT | Loi | HG-U133_Plus_2 | 87 | GEO |
| GSE7378 | Breast cancer | | UCSF | Zhou | U133AAofAv2 | 54 | GEO |
| GSE7390 | Breast cancer | | Uppsala, Oxford, Stockholm, IGR, GUYT, CRH (1980–1998) | Desmedt | HG-U133A | 198 | GEO |
| GSE7849 | Breast cancer | | Duke (1990–2001) | Anders | HG-U95A | 76 | GEO |
| GSE9195 | Breast cancer | | GUYT2 | Loi | HG-U133_Plus_2 | 77 | GEO |
| GSE9893 | Breast cancer | | Montpellier, Bordeaux, Turin (1989–2001) | Chanrion | MLRG Human 21 K V12.0 | 155 | GEO |
| GSE11595 | Esophagus cancer | Adenocarcinoma | Sutton | Giddings | CRUKDMF_22 K_v1.0.0 | 34 | GEO |
| GSE7696 | Glioma | Glioblastoma | Lausanne | Murat | HG-U133_Plus_2 | 70 | GEO |
| GSE4271-GPL96 | Glioma | | MDA | Phillips | HG-U133A | 77 | GEO |
| GSE4271-GPL97 | Glioma | | MDA | Phillips | HG-U133B | 77 | GEO |
| GSE2837 | Head and neck cancer | Squamous cell carcinoma | VUMC, VAMC, UTMDACC (1992–2005) | Chung | U133_X3P | 28 | GEO |
| HARVARD-LC | Lung cancer | Adenocarcinoma | Harvard | Beer | HG-U95A | 84 | Author's web site |
| MICHIGAN-LC | Lung cancer | Adenocarcinoma | Michigan (1994–2000) | Beer | HuGeneFL | 86 | Author's web site |
| GSE11117 | Lung cancer | NSCLC | Basel | Baty | Novachip human 34.5 k | 41 | GEO |
| GSE3141 | Lung cancer | NSCLC | Duke | Bild | HG-U133_Plus_2 | 111 | GEO |
| GSE4716-GPL3694 | Lung cancer | NSCLC | Nagoya (1995–1996) | Tomida | GF200 | 50 | GEO |
| GSE4716-GPL3696 | Lung cancer | NSCLC | Nagoya (1995–1996) | Tomida | GF201 | 50 | GEO |
| GSE8894 | Lung cancer | NSCLC | Seoul | Son | HG-U133_Plus_2 | 138 | GEO |
| GSE4573 | Lung cancer | Squamous cell carcinoma | Michigan (1991–2002) | Raponi | HG-U133A | 129 | GEO |
| DUKE-OC | Ovarian cancer | | Duke | Bild | HG-U133A | 134 | Author's web site |
| GSE8841 | Ovarian cancer | | Milan | Mariani | G4100A | 83 | GEO |
| E-DKFZ-1 | Renal cell carcinoma | | RZPD | Sueltmann | A-RZPD-20 | 74 | ArrayExpress |
Abbreviations: AML, Acute myelocytic leukemia; DLBCL, Diffuse large B-cell lymphoma; NSCLC, Non-small cell lung cancer
Figure 1. PrognoScan screenshot and sample search results (part 1). (A) The top page is simple and requires only the gene identifier(s). (B) Summary table for MKI67, shown here in part (see Additional file 1 for the full table). Column headings include dataset, cancer type, subtype, endpoint, cohort, contributor, array type, probe ID, number of patients, optimal cutpoint, Pmin and Pcor. Statistically significant Pcor values are shown in red. Each dataset links to the public repository where the raw data are archived. Clicking a probe ID in the summary table displays a detailed report for that test. The table can be downloaded as a tab-delimited file using the button at the bottom.
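As an illustration of working with the downloaded summary table, the sketch below reads tab-delimited text and keeps the rows whose corrected P-value falls below a threshold. The column names (`DATASET`, `PROBE_ID`, `P_COR`), the function name and the sample rows are hypothetical placeholders, not taken from an actual PrognoScan export; adjust them to match the header of the real file.

```python
import csv
import io

# Placeholder text standing in for a downloaded file. The header names and
# the values are invented for illustration only.
SAMPLE_TSV = (
    "DATASET\tPROBE_ID\tP_COR\n"
    "EXAMPLE-1\tprobe_a\t0.01\n"
    "EXAMPLE-2\tprobe_b\t0.20\n"
    "EXAMPLE-3\tprobe_c\tNA\n"
)

def significant_rows(tsv_text, p_column="P_COR", alpha=0.05):
    """Return the rows whose value in p_column is numeric and below alpha."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            p_value = float(row[p_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric P-value
        if p_value < alpha:
            kept.append(row)
    return kept

hits = significant_rows(SAMPLE_TSV)
```

For a file on disk, the same function can be applied to the text returned by reading the file in one call.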
Figure 2. PrognoScan screenshot and sample search results (part 2). (A) Annotation table. Row headings are color-coded: for example, headings of details such as therapy history, sample type and pathological parameters are highlighted in yellow, and basic attributes in blue. (B) Expression plot. Patients are ordered by the expression value of the given gene. The X-axis represents the cumulative number of patients and the Y-axis represents the expression value. Straight lines (cyan) show the optimal cutpoint that dichotomizes patients into high (red) and low (blue) expression groups. (C) Expression histogram. The distribution of the expression value is presented, where the X-axis represents the number of patients and the Y-axis represents the expression value on the same scale as the expression plot. The optimal cutpoint is also shown (cyan line). (D) P-value plot. For each potential cutpoint of the expression measurement, patients are dichotomized and the survival difference between the high and low expression groups is calculated by the log-rank test. The X-axis represents the cumulative number of patients on the same scale as the expression plot and the Y-axis represents raw P-values on a log scale. The cutpoint that minimizes the P-value is indicated by the cyan line. The gray line indicates the 5% significance level. (E) Kaplan-Meier plot. Survival curves for the high (red) and low (blue) expression groups dichotomized at the optimal cutpoint are plotted. The X-axis represents time and the Y-axis represents survival rate. The 95% confidence intervals for each group are indicated by dotted lines.
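The scan described in panel (D) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PrognoScan's implementation: the function names, the `min_group` parameter and the toy data are hypothetical, the log-rank statistic is the standard two-group form with a chi-square (1 degree of freedom) P-value, and only the raw minimum P-value is returned (no multiple-testing correction is applied).

```python
import math

def logrank_p(times, events, groups):
    """Two-group log-rank test; returns the chi-square (1 df) P-value.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    groups : 1 for the high-expression group, 0 for the low group
    """
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({tt for tt, e in zip(times, events) if e}):
        at_risk = [(tt, e, g) for tt, e, g in zip(times, events, groups) if tt >= t]
        n = len(at_risk)
        n1 = sum(g for _, _, g in at_risk)
        d = sum(1 for tt, e, _ in at_risk if tt == t and e)
        d1 = sum(1 for tt, e, g in at_risk if tt == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0  # no information to separate the groups
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df

def min_p_cutpoint(expression, times, events, min_group=1):
    """Try every split of the expression-ordered patients.

    Returns (number of patients in the low group, raw minimum P-value),
    or None if no valid split exists.
    """
    n = len(expression)
    order = sorted(range(n), key=lambda i: expression[i])
    best = None
    for k in range(min_group, n - min_group + 1):
        if expression[order[k - 1]] == expression[order[k]]:
            continue  # tied values cannot be separated by a cutpoint
        groups = [0] * n
        for i in order[k:]:
            groups[i] = 1  # patients above the cutpoint form the high group
        p = logrank_p(times, events, groups)
        if best is None or p < best[1]:
            best = (k, p)
    return best

# Toy data (invented): higher expression goes with shorter survival.
expression = [1, 2, 3, 4, 5, 6]
times = [10, 9, 8, 3, 2, 1]
events = [1, 1, 1, 1, 1, 1]
result = min_p_cutpoint(expression, times, events)
```

Because the minimum is taken over many candidate splits, the raw P-value is optimistic, which is why the summary table reports a corrected value (Pcor) alongside Pmin.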
Figure 3. Kaplan-Meier plots for high and low SIX1-expressing groups in breast cancers.
Figure 4. Kaplan-Meier plots for high and low MCTS1-expressing groups in breast, lung, blood and brain cancers.