Zhifu Sun, Ping Yang, Marie-Christine Aubry, Farhad Kosari, Chiaki Endo, Julian Molina, George Vasmatzis.
Abstract
BACKGROUND: Lung cancer remains the leading cause of cancer death worldwide. Patients with similar lung cancers may experience quite different clinical outcomes, and reliable molecular prognostic markers are needed to characterize this disparity. To identify the genes responsible for the aggressiveness of squamous cell carcinoma of the lung, we applied DNA microarray technology to a case-control study. Fifteen patients with surgically treated stage I squamous cell lung cancer were selected. Ten were one-to-one matched on tumour size and grade, age, gender, and smoking status; five died of lung cancer recurrence within 24 months (high-aggressive group), and five survived more than 54 months after surgery (low-aggressive group). Five additional tissues were included as test samples. Unsupervised and supervised approaches were used to explore the relationships among samples and to identify differentially expressed genes. We also evaluated the accuracy of the gene markers in assigning samples to their respective groups. Functional gene networks for the significant genes were retrieved, and their association with survival was tested.
Year: 2004 PMID: 15579197 PMCID: PMC544571 DOI: 10.1186/1476-4598-3-35
Source DB: PubMed Journal: Mol Cancer ISSN: 1476-4598 Impact factor: 27.401
Figure 1. Hierarchical clustering for 15 samples. The 2810 probe sets shown passed two filters: standard deviation/mean across all samples > 0.06, and expression level on the log2 scale ≥ 4.00 in ≥ 60% of the samples. H indicates high-aggressive tumors.
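The two filters in the Figure 1 caption can be sketched for a single probe set as follows. This is a minimal illustration, not the authors' code: the function name `passes_filter`, the use of the sample standard deviation, and the assumption that the ratio is computed on log2 values are all choices made here, and the expression values are toy numbers.

```python
from statistics import mean, stdev


def passes_filter(log2_values, cv_cutoff=0.06, level_cutoff=4.0, min_fraction=0.60):
    """Apply both Figure 1 criteria to one probe set (hypothetical helper).

    Assumes values are already on the log2 scale and uses the sample
    standard deviation; the caption does not state either detail.
    """
    m = mean(log2_values)
    if m == 0:
        return False  # ratio undefined, treat as not passing
    variation = stdev(log2_values) / m
    # Fraction of samples whose expression reaches the level cutoff
    expressed = sum(v >= level_cutoff for v in log2_values) / len(log2_values)
    return variation > cv_cutoff and expressed >= min_fraction


# Toy probe sets (made-up values, not data from the study)
variable_probe = [6.0, 7.5, 5.0, 8.0, 6.5]   # varies and is well expressed
flat_probe = [6.0, 6.1, 6.0, 5.9, 6.0]       # well expressed but nearly constant
low_probe = [2.0, 3.0, 5.0, 1.5, 2.5]        # varies but is mostly below 4.00

kept = [passes_filter(p) for p in (variable_probe, flat_probe, low_probe)]
```

Only the first toy probe set passes; the second fails the variation filter and the third fails the expression-level filter.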
Figure 2. Distribution of significant genes from matched analysis. The 294 significant genes (p < 0.05) selected by matched analysis are plotted by fold difference (x-axis) against the t-test p value (y-axis). A y-axis value greater than 1.3 corresponds to a p value below 0.05, and a value greater than 2 corresponds to a p value below 0.01. A positive or negative x-axis value indicates that a gene is up- or down-regulated, respectively, in the high-aggressive group compared with the low-aggressive group.
Figure 3. Distribution of significant genes from unmatched analysis. The 246 significant genes (p < 0.05) selected by unmatched analysis are plotted by fold difference (x-axis) against the t-test p value (y-axis). A y-axis value greater than 1.3 corresponds to a p value below 0.05, and a value greater than 2 corresponds to a p value below 0.01. A positive or negative x-axis value indicates that a gene is up- or down-regulated, respectively, in the high-aggressive group compared with the low-aggressive group.
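The y-axis thresholds quoted for Figures 2 and 3 (1.3 for p = 0.05 and 2 for p = 0.01) are what a negative base-10 logarithm of the p value would give. The captions do not name the transform, so the sketch below treats that as an assumption; `neg_log10` is a hypothetical helper name.

```python
import math


def neg_log10(p_value):
    """Map a p value to the assumed y-axis scale, -log10(p)."""
    return -math.log10(p_value)


# The two cutoffs mentioned in the captions
y_for_005 = neg_log10(0.05)   # about 1.30
y_for_001 = neg_log10(0.01)   # about 2.00

# Smaller p values land higher on the axis
assert neg_log10(0.001) > y_for_001 > y_for_005
```

Under this reading, points higher on the plot are more significant, and the 1.3 in the captions is a rounded value of the 0.05 cutoff.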
Figure 4. Leave-one-out prediction on training samples. The x-axis shows the number of significant genes from matched analysis that were used to predict the group membership of a sample by the leave-one-out algorithm. The y-axis shows the correct prediction rate for the 10 training samples.
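The leave-one-out procedure behind Figure 4 can be illustrated with a small sketch. The caption does not say which classifier was used at this step, so a nearest-centroid rule is swapped in here purely for illustration; the function names and the toy expression values are assumptions, and the gene set is held fixed rather than re-selected in each round.

```python
from statistics import mean


def centroid(rows):
    """Per-gene mean across a list of samples."""
    return [mean(column) for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_out_accuracy(samples, labels):
    """Hold out each sample once, classify it from the rest, return the hit rate."""
    correct = 0
    for i, (held_out, truth) in enumerate(zip(samples, labels)):
        # Training set is every sample except the held-out one
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        centroids = {
            label: centroid([s for s, l in train if l == label])
            for label in {l for _, l in train}
        }
        predicted = min(centroids, key=lambda lab: squared_distance(held_out, centroids[lab]))
        correct += predicted == truth
    return correct / len(samples)


# Toy data: 10 samples x 2 genes, well separated (not values from the study)
toy_samples = [
    [1.0, 1.2], [0.8, 1.1], [1.3, 0.9], [1.1, 1.0], [0.9, 1.4],  # group 1
    [5.0, 4.8], [5.2, 5.1], [4.7, 5.3], [5.1, 4.9], [4.9, 5.0],  # group 2
]
toy_labels = [1] * 5 + [2] * 5
rate = leave_one_out_accuracy(toy_samples, toy_labels)
```

Repeating this calculation for gene lists of different lengths would give one point per list, which is the kind of curve the caption describes.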
The top 27 unique genes with highest signal-to-noise ratios
| Gene symbol | Gene description |
|---|---|
| | ATPase, Na+/K+ transporting, beta 1 polypeptide |
| | tumor protein p53 (Li-Fraumeni syndrome) |
| | cytochrome P450, family 26, subfamily A, polypeptide 1 |
| | synaptonemal complex protein 2 |
| | insulin-like growth factor binding protein 3 |
| | coproporphyrinogen oxidase (coproporphyria, harderoporphyria) |
| | melanoma antigen, family A, 1 (directs expression of antigen MZ2-E) |
| | H1 histone family, member 0 |
| | melanoma antigen, family A, 12 |
| * | Homo sapiens clone 23705 mRNA sequence |
| * | Homo sapiens cDNA: FLJ21672 fis, clone COL09025. |
| | Homo sapiens cDNA FLJ39734 fis, clone SMINT2016146. |
| | purinergic receptor P2Y, G-protein coupled, 5 |
| | hypothetical protein DKFZp586G0123 |
| | erythrocyte membrane protein band 4.1-like 3 |
| | DKFZP586A0522 protein |
| | cysteine sulfinic acid decarboxylase |
| | BRCA1-interacting protein 1 |
| | v-myc myelocytomatosis viral oncogene homolog (avian) |
| | programmed cell death 4 (neoplastic transformation inhibitor) |
| | TAF6-like RNA polymerase II, p300/CBP-associated factor (PCAF)-associated factor, 65 kDa |
| | ATP-binding cassette, sub-family A (ABC1), member 12 |
| | zinc finger protein 198 |
| | Notch homolog 2 (Drosophila) |
| | Human clone 137308 mRNA, partial cds. |
| | CDC-like kinase 1 |
| | pogo transposable element with ZNF domain |
Significant in that particular analysis (matched or unmatched)
* No gene symbol for these genes
LDA classification using 27 top genes
| Sample | LD score | Class | Prediction | Probability | Correct? |
|---|---|---|---|---|---|
| 48521 | -0.52 | 1 | 1 | 0.75 | Yes |
| 48536 | -0.94 | 1 | 1 | 0.87 | Yes |
| 41923 | -0.26 | 1 | 1 | 0.63 | Yes |
| 48549 | -0.52 | 1 | 1 | 0.74 | Yes |
| 44680 | -2.9 | 1 | 1 | 1 | Yes |
| 42613 | 1.0 | 2 | 2 | 0.89 | Yes |
| 76981 | 0.52 | 2 | 2 | 0.75 | Yes |
| 44661 | 2.08 | 2 | 2 | 0.99 | Yes |
| 86043 | -0.19 | 2 | 1 | 0.59 | No |
| 86011 | 1.71 | 2 | 2 | 0.97 | Yes |
| 42616 | 0.12 | ? | 2 | 0.56 | No |
| 48556 | 0.05 | ? | 2 | 0.52 | No |
| 41932 | -0.88 | ? | 1 | 0.86 | Yes |
| 42081 | -0.52 | ? | 1 | 0.75 | Yes |
| 44656 | -0.08 | ? | 1 | 0.54 | Yes |
LD score: value of the linear discriminant function for a given sample. Class: a sample's membership in the low-aggressive group (1) or the high-aggressive group (2); "?" marks a test sample whose membership is hidden from the procedure and has to be predicted. Prediction: predicted sample membership. Probability: probability that a sample belongs to the predicted class, based on the classifier. "Correct?": whether the prediction matches the true class of the sample.
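The prediction rates implied by the LDA table can be tallied directly from its rows. The sketch below transcribes the table and counts correct calls for the ten training samples (known class) and the five test samples ("?"). It also checks an observation about this table only: every negative LD score is paired with a class 1 prediction and every positive score with class 2. The tuple layout and names are choices made for this illustration.

```python
# (sample, LD score, known class or None for "?", prediction, "Correct?" column)
LDA_TABLE = [
    ("48521", -0.52, 1, 1, True),
    ("48536", -0.94, 1, 1, True),
    ("41923", -0.26, 1, 1, True),
    ("48549", -0.52, 1, 1, True),
    ("44680", -2.90, 1, 1, True),
    ("42613", 1.00, 2, 2, True),
    ("76981", 0.52, 2, 2, True),
    ("44661", 2.08, 2, 2, True),
    ("86043", -0.19, 2, 1, False),
    ("86011", 1.71, 2, 2, True),
    ("42616", 0.12, None, 2, False),
    ("48556", 0.05, None, 2, False),
    ("41932", -0.88, None, 1, True),
    ("42081", -0.52, None, 1, True),
    ("44656", -0.08, None, 1, True),
]


def correct_count(rows):
    """Number of rows marked correct in the table."""
    return sum(1 for row in rows if row[4])


training_rows = [row for row in LDA_TABLE if row[2] is not None]
test_rows = [row for row in LDA_TABLE if row[2] is None]

training_correct = correct_count(training_rows)
test_correct = correct_count(test_rows)

# In this table, the sign of the LD score matches the predicted class
sign_matches_prediction = all(
    (score < 0) == (predicted == 1) for _, score, _, predicted, _ in LDA_TABLE
)
```

The counts come out to 9 of 10 training samples and 3 of 5 test samples, matching the "Correct?" column.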
Gene networks associated with survival
| 17 | 0.01* | 7/15 | |
| 9 | 0.02* | 3/15 | |
| 2 | 0.01* | 4/15 | |
| 2 | 0.03* | 4/15 | |
| 2 | 0.01* | 3/15 | |
| 2 | 0.01* | 0/15 | |
| 1 | 0.004* | 3/15 |
† Genes in bold face are focus genes (among 126 genes submitted to the Ingenuity knowledge base). ‡ Score indicates the probability that a collection of focus genes could be found in a given network by chance. It is the negative logarithm of that probability; a score of 2 indicates a chance of only 1%. * Indicates a strong association between the expression of genes in a network and survival.
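The relationship between a network score and the chance probability described in the legend can be written out explicitly. The legend's own example (a score of 2 corresponding to 1%) implies a base-10 logarithm, which the sketch below assumes; `chance_probability` is a hypothetical helper name, and the scores passed to it are illustrative inputs.

```python
def chance_probability(score):
    """Invert score = -log10(p), assuming base 10 as the legend's example implies."""
    return 10.0 ** (-score)


# The legend's example: a score of 2 is a 1% chance
p_for_score_2 = chance_probability(2)

# Higher scores mean the network is less likely to arise by chance
probabilities = {score: chance_probability(score) for score in (1, 2, 17)}
```

Each additional point of score divides the chance probability by ten.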