Chengjie Xiong, Yan Yan, Feng Gao.
Abstract
Two crucial problems arise from a microarray experiment in which the primary objective is to locate differentially expressed genes for the diagnosis of diseases such as cancer and Alzheimer's. The first problem is the detection of a subset of genes that provides optimum discriminatory power between diseased and normal subjects; the second is the statistical estimation of the discriminatory power that this optimum subset of genes achieves between the two groups of subjects. We develop a new method to select an optimum subset of discriminatory genes by searching over possible linear combinations of gene expression profiles and locating the one that provides the maximum discriminatory power between two sources of RNA, as measured by the area under the receiver operating characteristic (ROC) curve. We further provide an estimate of the optimum discriminatory power between the diseased and the healthy subjects over the selected subsets of genes. The proposed stepwise approach takes into account the gene-to-gene correlations in the estimation of discriminating power, as well as the associated variability, and allows the number of genes to be selected based on the increment in discriminating power. Finally, the proposed methodology is applied to a benchmark microarray experiment and compared with results obtained through existing approaches in the literature.
Keywords: Area under curve; Confidence interval estimate; Eigenvalue; Eigenvector; Fisher’s z-transformation; Maximum likelihood estimate; Receiver Operating Characteristic (ROC) curve
Year: 2013; PMID: 24851193; PMCID: PMC4026209
Source DB: PubMed; Journal: J Biom Biostat
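The stepwise search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binormal model in which the largest AUC over all linear combinations of the chosen genes is Φ(√(δᵀ(Σ₁+Σ₂)⁻¹δ)), where δ is the difference in group means and Σ₁, Σ₂ are the group covariance matrices. That formula, the function names `binormal_max_auc` and `forward_select`, and the simulated data are all assumptions introduced for illustration.

```python
import math
import numpy as np


def binormal_max_auc(x_dis, x_norm):
    """Largest AUC over linear combinations, ASSUMING a binormal model.

    Computes Phi(sqrt(d' (S1 + S2)^-1 d)); this closed form is an
    assumption of the sketch, not a formula quoted from the abstract.
    """
    delta = x_dis.mean(axis=0) - x_norm.mean(axis=0)
    # Summing the two covariance matrices lets gene-to-gene correlation
    # enter the criterion.
    pooled = (np.atleast_2d(np.cov(x_dis, rowvar=False))
              + np.atleast_2d(np.cov(x_norm, rowvar=False)))
    d2 = max(float(delta @ np.linalg.solve(pooled, delta)), 0.0)
    return 0.5 * (1.0 + math.erf(math.sqrt(d2 / 2.0)))  # Phi(sqrt(d2))


def forward_select(x_dis, x_norm, n_genes):
    """Greedy forward selection: add the gene that most increases the AUC."""
    selected, auc_path = [], []
    remaining = list(range(x_dis.shape[1]))
    for _ in range(n_genes):
        best = max(
            remaining,
            key=lambda j: binormal_max_auc(x_dis[:, selected + [j]],
                                           x_norm[:, selected + [j]]),
        )
        selected.append(best)
        remaining.remove(best)
        auc_path.append(binormal_max_auc(x_dis[:, selected],
                                         x_norm[:, selected]))
    return selected, auc_path


# Simulated (hypothetical) data: 50 subjects per group, 6 genes,
# with genes 3 and 0 shifted in the diseased group.
rng = np.random.default_rng(0)
normal = rng.normal(size=(50, 6))
diseased = rng.normal(size=(50, 6))
diseased[:, 3] += 3.0
diseased[:, 0] += 1.5

genes, auc_path = forward_select(diseased, normal, n_genes=3)
print(genes, [round(a, 4) for a in auc_path])
```

Inspecting the successive differences in `auc_path` gives one simple way to decide when an additional gene no longer adds enough discriminating power to justify including it.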
Optimum 15 genes for colon tumor/normal class discrimination
| Gene ID | zAUC to column 1 | Gene ID | zAUC to column 3 |
|---|---|---|---|
| R87126 | 2.76 (0.40) | R87126 | 2.76 (0.40) |
| X55715 | 4.09 (0.83) | M63391 | 2.82 (0.56) |
| T86444 | 4.61 (0.94) | M26383 | 3.12 (0.64) |
| T87527 | 5.24 (1.07) | H08393 | 4.41 (0.97) |
| R62549 | 5.85 (1.21) | X12671 | 4.75 (1.03) |
| L37792 | 6.46 (1.36) | R36977 | 4.77 (1.03) |
| H08393 | 7.90 (1.80) | J02854 | 4.78 (1.04) |
| H16096 | 9.63 (2.21) | J05032 | 4.79 (1.04) |
| X52151 | 10.85 (2.59) | Z50753 | 5.06 (1.14) |
| R88740 | 12.19 (3.04) | M76378 | 5.08 (1.11) |
| D14812 | 15.07 (3.83) | M22382 | 5.23 (1.16) |
| X93510 | 17.48 (4.52) | X63629 | 5.42 (1.24) |
| J04794 | 21.22 (5.41) | M76378 | 5.57 (1.29) |
| U09564 | 25.62 (6.04) | H43887 | 5.60 (1.30) |
| R80966 | 30.04 (7.30) | M36634 | 5.61 (1.29) |
Comparison of top-ranked 25 pairs of genes for colon tumor/normal class discrimination (zAUC = z-transformation of the maximum area under the ROC curve)
| Pair | Gene IDs | zAUC | Gene IDs | zAUC |
|---|---|---|---|---|
| 1 | J05032, U19969 | 4.27 (0.79) | Z50753, H09719 | 4.43 (0.97) |
| 2 | X86693, D14812 | 4.20 (0.78) | X12671, X12369 | 4.33 (0.89) |
| 3 | M63391, H08393 | 4.19 (0.90) | J05032, U19969 | 4.27 (0.79) |
| 4 | R87126, X63629 | 4.05 (0.84) | H08393, M63391 | 4.19 (0.90) |
| 5 | M36634, H11084 | 4.05 (0.81) | J02854, T57882 | 4.16 (0.91) |
| 6 | X12671, Z50753 | 4.04 (0.90) | R87126, X55715 | 4.09 (0.83) |
| 7 | H06524, U22055 | 3.87 (0.73) | R36977, H20709 | 3.90 (0.70) |
| 8 | M76378, T62947 | 3.86 (0.70) | M22382, R55310 | 3.83 (0.80) |
| 9 | J02854, R54097 | 3.85 (0.77) | M76378, T62947 | 3.77 (0.72) |
| 10 | Z48541, D25217 | 3.67 (0.62) | X14958, L06895 | 3.61 (0.65) |
| 11 | D21261, H20709 | 3.57 (0.82) | U30825, T62878 | 3.52 (0.61) |
| 12 | T90280, T51534 | 3.48 (0.62) | X63629, M36634 | 3.50 (0.64) |
| 13 | T92451, U09587 | 3.46 (0.66) | T71025, L11706 | 3.45 (0.68) |
| 14 | H09719, L07648 | 3.45 (0.73) | U09564, T64467 | 3.43 (0.66) |
| 15 | T51023, D31716 | 3.45 (0.61) | R84411, M92287 | 3.37 (0.63) |
| 16 | T71025, L11706 | 3.45 (0.67) | M76378, D00860 | 3.36 (0.63) |
| 17 | X12369, R98842 | 3.44 (0.70) | M26697, T47424 | 3.34 (0.60) |
| 18 | X14958, X87159 | 3.43 (0.66) | T86749, M74491 | 3.33 (0.71) |
| 19 | J04102, U14631 | 3.34 (0.70) | M26383, T47377 | 3.30 (0.59) |
| 20 | M76378, D00860 | 3.30 (0.65) | X54942, R44301 | 3.28 (0.70) |
| 21 | M26383, T47377 | 3.29 (0.59) | H43887, U26312 | 3.28 (0.55) |
| 22 | X54942, R44301 | 3.28 (0.69) | M76378, T56604 | 3.27 (0.58) |
| 23 | M76378, T56604 | 3.26 (0.58) | T86473, D25217 | 3.20 (0.53) |
| 24 | R96357, R46753 | 3.26 (0.66) | H40095, H06524 | 3.03 (0.52) |
| 25 | T63133, T61661 | 2.66 (0.45) | T95018, Z49269 | 2.85 (0.53) |
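To read zAUC values on the familiar AUC scale, the transformation has to be inverted. The sketch below assumes the standard Fisher z-transformation, z = ½ ln((1 + a)/(1 − a)), applied to an AUC value a; the record names the transformation in its keywords but does not give the formula, so this form is an assumption. The helper names are hypothetical, and the meaning of the parenthesized values in the tables is not stated in this record, so the `half_width` argument is a generic placeholder rather than an interpretation of them.

```python
import math


def fisher_z(auc):
    """Fisher z-transformation (assumed form); equivalent to math.atanh(auc)."""
    return 0.5 * math.log((1.0 + auc) / (1.0 - auc))


def inverse_fisher_z(z):
    """Map a z-scale value back to the original scale."""
    return math.tanh(z)


def interval_on_auc_scale(z, half_width):
    """Back-transform a symmetric z-scale interval z +/- half_width.

    half_width is a hypothetical input (for example a critical value
    times a standard error); it is not taken from the tables.
    """
    return inverse_fisher_z(z - half_width), inverse_fisher_z(z + half_width)


# A zAUC of 2.76 corresponds to an AUC of roughly 0.992 under this assumption.
print(round(inverse_fisher_z(2.76), 3))
```

Because tanh flattens quickly, large zAUC values all map to AUC values extremely close to 1, which is one reason to compare gene subsets on the z scale rather than the AUC scale.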
Annotations of genes appearing in Table 1 and Table 2
| Gene ID | Annotations |
|---|---|
| H20709 | MYOSIN LIGHT CHAIN ALKALI, SMOOTH-MUSCLE ISOFORM (HU- |
| T95018 | 40S RIBOSOMAL PROTEIN S18 (Homo sapiens). |
| T61661 | PROFILIN I (HUMAN). |
| T71025 | Human (HUMAN). |
| T51534 | CYSTATIN C PRECURSOR (HUMAN). |
| X55715 | Human Hums3 mRNA for 40S ribosomal protein s3. |
| T62878 | CYTOCHROME C OXIDASE POLYPEPTIDE IV PRECURSOR |
| D25217 | Human mRNA (KIAA0027) for ORF, partial cds. |
| M26697 | Human nucleolar protein (B23) mRNA, complete cds. |
| D21261 | SM22-ALPHA HOMOLOG (HUMAN);. |
| T51023 | HEAT SHOCK PROTEIN HSP 90-BETA (HUMAN). |
| T63133 | THYMOSIN BETA-10 (HUMAN);. |
| T64467 | P33477 ANNEXIN XI ;. |
| T47424 | INSULIN RECEPTOR SUBSTRATE-1 (Homo sapiens) |
| M76378 | Human cysteine-rich protein (CRP) gene, exons 5 and 6. |
| M63391 | Human desmin gene, complete cds. |
| M76378 | Human cysteine-rich protein (CRP) gene, exons 5 and 6. |
| T87527 | HEAT SHOCK PROTEIN HSP 84 (Mus musculus) |
| T57882 | MYOSIN HEAVY CHAIN, NONMUSCLE TYPE A (Homo sapiens) |
| X14958 | Human hmgI mRNA for high mobility group protein Y. |
| Z50753 | H.sapiens mRNA for GCAP-II/uroguanylin precursor. |
| R80966 | CLATHRIN LIGHT CHAIN B (HUMAN). |
| U30825 | Human splicing factor SRp30c mRNA, complete cds. |
| X87159 | H.sapiens mRNA for beta subunit of epithelial amiloride-sensitive sodium |
| M74491 | Human ADP-ribosylation factor 3 mRNA, complete cds. |
| R87126 | MYOSIN HEAVY CHAIN, NONMUSCLE (Gallus gallus) |
| M22382 | MITOCHONDRIAL MATRIX PROTEIN P1 PRECURSOR (HUMAN). |
| T56604 | TUBULIN BETA CHAIN (Haliotis discus) |
| R46753 | CYCLIN-DEPENDENT KINASE INHIBITOR 1 (Homo sapiens) |
| D14812 | Human mRNA for ORF, complete cds. |
| J04794 | Human aldehyde reductase mRNA, complete cds. |
| L06895 | Homo sapiens antagonizer of myc transcriptional activity (Mad) mRNA, |
| X12671 | Human gene for heterogeneous nuclear ribonucleoprotein (hnRNP) |
| Z48541 | H.sapiens mRNA for protein tyrosine phosphatase. |
| X12369 | TROPOMYOSIN ALPHA CHAIN, SMOOTH MUSCLE (HUMAN). |
| M76378 | Human cysteine-rich protein (CRP) gene, exons 5 and 6. |
| H40095 | MACROPHAGE MIGRATION INHIBITORY FACTOR (HUMAN). |
| R88740 | ATP SYNTHASE COUPLING FACTOR 6, MITOCHONDRIAL |
| Z49269 | H.sapiens gene for chemokine HCC-1. |
| T92451 | TROPOMYOSIN, FIBROBLAST AND EPITHELIAL MUSCLE-TYPE |
| H43887 | COMPLEMENT FACTOR D PRECURSOR (Homo sapiens) |
| T86473 | NUCLEOSIDE DIPHOSPHATE KINASE A (HUMAN) |
| T90280 | RIBOPHORIN II PRECURSOR (HUMAN). |
| R36977 | P03001 TRANSCRIPTION FACTOR IIIA. |
| U09587 | Human glycyl-tRNA synthetase mRNA, complete cds. |
| U09564 | Human serine kinase mRNA, complete cds. |
| R98842 | PROTHYMOSIN ALPHA (Homo sapiens) |
| D31716 | Human mRNA for GC box binding protein, complete cds. |
| R84411 | SMALL NUCLEAR RIBONUCLEOPROTEIN ASSOCIATED |
| R96357 | POLYADENYLATE-BINDING PROTEIN (Xenopus laevis) |
| R55310 | S36390 MITOCHONDRIAL PROCESSING PEPTIDASE . |
| R62549 | PUTATIVE SERINE/THREONINE-PROTEIN KINASE B0464.5 IN |
| X52151 | Homo sapiens arylsulphatase A mRNA, complete cds. |
| U22055 | Human 100 kDa coactivator mRNA, complete cds. |
| T47377 | S-100P PROTEIN (HUMAN). |
| T62947 | 60S RIBOSOMAL PROTEIN L24 (Arabidopsis thaliana) |
| H09719 | TUBULIN ALPHA-6 CHAIN (Mus musculus) |
| M92287 | Homo sapiens cyclin D3 (CCND3) mRNA, complete cds. |
| U26312 | Human heterochromatin protein HP1Hs-gamma mRNA, partial cds. |
| J02854 | MYOSIN REGULATORY LIGHT CHAIN 2, SMOOTH MUSCLE |
| D00860 | RIBOSE-PHOSPHATE PYROPHOSPHOKINASE I (HUMAN);. |
| R54097 | TRANSLATIONAL INITIATION FACTOR 2 BETA SUBUNIT (HUMAN). |
| X86693 | H.sapiens mRNA for hevin like protein. |
| H11084 | VASCULAR ENDOTHELIAL GROWTH FACTOR (Cavia porcellus) |
| X63629 | H.sapiens mRNA for p cadherin. |
| L37792 | Human syntaxin 1A mRNA, complete cds. |
| T86444 | PROBABLE NUCLEAR ANTIGEN (Pseudorabies virus) |
| M36634 | Human vasoactive intestinal peptide (VIP) mRNA, complete cds. |
| T86749 | Human (clone PSK-J3) cyclin-dependent protein kinase mRNA, complete |
| M26383 | Human monocyte-derived neutrophil-activating protein (MONAP) |
| X54942 | H.sapiens ckshs2 mRNA for Cks1 protein homologue. |
| H16096 | MITOCHONDRIAL PROCESSING PROTEASE BETA SUBUNIT PRECURSOR |
| J05032 | Human aspartyl-tRNA synthetase alpha-2 subunit mRNA, complete |
| H08393 | COLLAGEN ALPHA 2(XI) CHAIN (Homo sapiens) |
| X93510 | H.sapiens mRNA for 37 kDa LIM domain protein. |
| U14631 | Human 11 beta-hydroxysteroid dehydrogenase type II mRNA, |
| H06524 | GELSOLIN PRECURSOR, PLASMA (HUMAN);. |
| L07648 | Human MXI1 mRNA, complete cds. |
| R44301 | MINERALOCORTICOID RECEPTOR (Homo sapiens) |
| U19969 | Human two-handed zinc finger protein ZEB mRNA, partial cds. |
| J04102 | Human erythroblastosis virus oncogene homolog 2 (ets-2) mRNA, |
| L11706 | Human hormone-sensitive lipase (LIPE) gene, complete cds. |