Krzysztof Fujarewicz, Michal Jarzab, Markus Eszlinger, Knut Krohn, Ralf Paschke, Małgorzata Oczko-Wojciechowska, Małgorzata Wiench, Aleksandra Kukulska, Barbara Jarzab, Andrzej Swierniak.
Abstract
Selection of novel molecular markers is an important goal of cancer genomics studies. The aim of our analysis was to apply multivariate bioinformatics tools to rank genes that are potential markers of papillary thyroid cancer (PTC) according to their diagnostic usefulness. We also assessed the accuracy of benign/malignant classification, based on gene expression profiling, for PTC. We analyzed a 180-array dataset (90 HG-U95A and 90 HG-U133A oligonucleotide arrays), which included a collection of 57 PTCs, 61 benign thyroid tumors, and 62 apparently normal tissues. Gene selection was carried out by the support vector machines method with bootstrapping, which allowed us to 1) rank the genes that were most important for classification quality and appeared most frequently in the classifiers (bootstrap-based feature ranking, BBFR) and 2) rank the samples, and thus detect the cases that were most difficult to classify (bootstrap-based outlier detection, BBOD). The accuracy of PTC diagnosis was 98.5% for a 20-gene classifier; its 95% confidence interval (CI) was 95.9-100%, with the lower limit of the CI exceeding 95% already for five genes. Only 5 of 180 samples (2.8%) were misclassified in more than 10% of bootstrap iterations. We specified 43 genes that are most suitable as molecular markers of PTC, among them some well-known PTC markers (MET, fibronectin 1, dipeptidylpeptidase 4, or adenosine A1 receptor) and potential new ones (UDP-galactose-4-epimerase, cadherin 16, gap junction protein 3, sushi, nidogen, and EGF-like domains 1, inhibitor of DNA binding 3, RUNX1, leiomodin 1, F-box protein 9, and tripartite motif-containing 58). The highest ranking gene, metallophosphoesterase domain-containing protein 2, achieved 96.7% of the maximum BBFR score.
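The idea of ranking genes by how often they are selected across bootstrap resamples can be illustrated with a minimal sketch. It is not the authors' procedure: a simple signal-to-noise filter stands in for the support vector machine selection, the raw selection count stands in for the BBFR score (whose exact definition is not given in this record), and the function names and toy data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """Absolute class-mean difference divided by the summed class standard deviations."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    spread = (stdev(a) if len(a) > 1 else 0.0) + (stdev(b) if len(b) > 1 else 0.0)
    return abs(mean(a) - mean(b)) / (spread + 1e-9)


def bootstrap_selection_counts(X, y, k, n_iter=200, seed=0):
    """Count how often each gene index lands in the top-k list across bootstrap resamples."""
    rng = random.Random(seed)
    n_samples, n_genes = len(X), len(X[0])
    counts = Counter()
    for _ in range(n_iter):
        # Draw samples with replacement (one bootstrap resample).
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample contained a single class; skip it
        ranked = sorted(
            range(n_genes),
            key=lambda g: signal_to_noise([X[i][g] for i in idx], yb),
            reverse=True,
        )
        counts.update(ranked[:k])
    return counts


# Toy data (assumption): 20 samples, 6 genes, only gene index 2 differs between classes.
toy_rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[toy_rng.gauss(0, 1) for _ in range(6)] for _ in range(20)]
for row, label in zip(X, y):
    row[2] += 5 * label

counts = bootstrap_selection_counts(X, y, k=1)
print(counts.most_common(3))
```

Genes that are selected in nearly every resample are stable markers; genes selected only occasionally depend on which samples happened to be drawn.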
Year: 2007 PMID: 17914110 PMCID: PMC2216417 DOI: 10.1677/ERC-06-0048
Source DB: PubMed Journal: Endocr Relat Cancer ISSN: 1351-0088 Impact factor: 5.678
Figure 1. Accuracy of bootstrapping-estimated benign–malignant classification for different gene set sizes. The 95% confidence interval is marked by dashed lines.
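One common way to obtain such an interval from bootstrap results is the percentile method, sketched below. The record does not state how the authors computed their interval, so the method, the function name, and the nearest-rank rounding are assumptions made for illustration.

```python
import math


def percentile_interval(values, level=0.95):
    """Nearest-rank percentile interval over a list of bootstrap estimates."""
    xs = sorted(values)
    n = len(xs)
    alpha = (1.0 - level) / 2.0
    lower = xs[int(alpha * (n - 1))]                 # round the lower rank down
    upper = xs[math.ceil((1.0 - alpha) * (n - 1))]   # round the upper rank up
    return lower, upper


# Hypothetical input: 101 placeholder estimates 0, 1, ..., 100.
print(percentile_interval(list(range(101))))
```

In practice `values` would hold one accuracy estimate per bootstrap iteration for a fixed gene set size.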
Figure 2. Accuracy of classification obtained by successive gene set reduction. The accuracy of the best 500 genes was evaluated in one iteration using the bootstrap technique, then the selected 500-gene set was removed from the whole dataset, and the next 500 genes were selected in the following iteration. This procedure was repeated seven times, thus 3500 genes were excluded (line no. 8). To speed up the procedure, only neighbourhood analysis (NA) was used for gene selection.
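The select-then-remove loop described in this caption can be sketched as follows. The `select` callback is a hypothetical stand-in for the neighbourhood-analysis selection step, the accuracy evaluation of each block is omitted, and the gene names and scores are made up.

```python
def successive_reduction(genes, select, block_size, n_rounds):
    """Repeatedly select the best block of genes and remove it before the next round."""
    remaining = set(genes)
    blocks = []
    for _ in range(n_rounds):
        if len(remaining) < block_size:
            break
        block = select(remaining, block_size)  # best genes among those still available
        blocks.append(block)
        remaining -= set(block)                # exclude them from later rounds
    return blocks, remaining


# Toy example (assumption): ten genes with fixed, made-up scores.
scores = {f"g{i}": 10 - i for i in range(10)}


def pick_top(candidates, k):
    return sorted(candidates, key=scores.get, reverse=True)[:k]


blocks, remaining = successive_reduction(scores, pick_top, block_size=3, n_rounds=3)
print(blocks, remaining)
```

With a block size of 500 and seven removal rounds, the same loop excludes 7 × 500 = 3500 genes, matching the count in the caption.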
Figure 3. Result of bootstrap-based feature ranking (BBFR). Each dot represents one gene; dashed lines define the subset of 43 genes with a BBFR score larger than half of the maximum one (black dots).
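The half-maximum cutoff can be checked with a short sketch. It assumes that "maximum" means the highest observed score; the two real scores are taken from the ranking table in this record (ranks 1 and 43), and the third entry is a made-up value added only to show a gene falling below the cutoff.

```python
def above_half_max(scores):
    """Keep genes whose score exceeds half of the highest observed score."""
    cutoff = max(scores.values()) / 2
    return {gene: s for gene, s in scores.items() if s > cutoff}


scores = {
    "MPPED2": 48449,             # rank 1 in the table
    "ICAM1": 24534,              # rank 43 in the table
    "hypothetical_gene": 24000,  # made-up value, not from the paper
}
selected = above_half_max(scores)
print(sorted(selected))
```

Under this reading the cutoff is 48 449 / 2 = 24 224.5, which the rank-43 score of 24 534 just exceeds.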
Ranking of papillary thyroid cancer (PTC) genes as assessed by the bootstrap-based feature ranking (BBFR) approach. For each transcript selected, the rank and score obtained by the BBFR method are given, together with basic univariate statistics (log2 mean and log2 ratio).
| MPPED2 | Metallophosphoesterase domain-containing protein 2 | 205413_at | 1 | 48 449 | 4.66 | 7.82 | −3.16 | −3.46 | −3.26 | Fetal brain protein of unknown function | |||
| HBA1/HBA2 | Hemoglobin, α-1/hemoglobin, α-2 | 209458_x_at | 2 | 45 521 | 9.79 | 12.04 | −2.25 | −2.28 | −1.87 | Oxygen transport | |||
| MET | Met proto-oncogene (hepatocyte growth factor receptor) | 213807_x_at | 3 | 45 363 | 8.15 | 5.22 | 2.93 | 1.73 | 2.60 | Membrane tyrosine kinase receptor enhances cell motility, invasiveness, and chemokine production | |||
| FN1 | Fibronectin 1 | 210495_x_at | 4 | 44 017 | 12.24 | 8.72 | 3.52 | 2.75 | 3.91 | Extracellular matrix glycoprotein participates in cell adhesion, regulates proliferation and survival of thyroid cells via integrin receptors | |||
| GALE | UDP-galactose-4-epimerase | 202528_at | 5 | 43 974 | 7.12 | 3.70 | 3.42 | 2.41 | 3.50 | Converts glucose to galactose and | |||
| QPCT | Glutaminyl-peptide cyclotransferase (glutaminyl cyclase) | 205174_s_at | 6 | 43 317 | 7.60 | 4.99 | 2.61 | 3.14 | 2.56 | Converts glutaminyl peptides to cyclic pyroglutamyl ones | |||
| NELL2 | NEL-like 2 (chicken) | 203413_at | 7 | 42 953 | 9.68 | 7.56 | 2.12 | 2.02 | 2.53 | Brain protein with six EGF-like repeats | |||
| PGCP | Plasma glutamate carboxypeptidase | 203501_at | 8 | 42 153 | 7.70 | 9.23 | −1.53 | −1.05 | −1.33 | Breakdown of secreted peptides, homologous to prostate membrane-specific antigen | |||
| DPP4 | Dipeptidylpeptidase 4 (CD26, adenosine deaminase complexing protein 2) | 203717_at | 9 | 42 115 | 7.87 | 3.77 | 4.11 | 3.21 | 3.81 | Membrane enzyme, participates in breakdown of secreted peptides | |||
| ADORA1 | Adenosine A1 receptor | 205481_at | 10 | 41 699 | 7.16 | 4.85 | 2.30 | 2.00 | 2.81 | Membrane receptor, stimulates motility and modulates proliferation | |||
| HMGA2 | High-mobility group AT-hook 2 | 208025_s_at | 11 | 40 713 | 7.90 | 4.66 | 3.24 | 3.58 | 2.62 | Architectural transcription factor | |||
| RYR1 | Ryanodine receptor 1 (skeletal) | 205485_at | 12 | 40 473 | 6.96 | 4.70 | 2.27 | 2.53 | 1.92 | Present mainly in excitable cells | Calcium release channel of the sarcoplasmic reticulum | ||
| CDH16 | Cadherin 16, KSP-cadherin | 206517_at | 13 | 39 770 | 3.47 | 8.07 | −4.60 | −4.68 | −1.43 | Thought to be kidney specific | Calcium-dependent, membrane-associated glycoprotein, participates in cell adhesion | ||
| GJB3 | Gap junction protein β-3, 31 kDa (connexin 31) | 205490_x_at | 14 | 39 526 | 6.49 | 4.04 | 2.44 | 2.71 | 0.62 | Does not normally appear in thyroid, in adult mouse becomes restricted to epidermis, testis and placenta | Forms incompatible hemichannels with thyroidal connexin 43 | ||
| EMID1 | EMI domain containing 1 | 213779_at | 15 | 39 505 | 6.44 | 8.12 | −1.68 | −1.09 | −0.76 | Extracellular matrix protein, able to promote cell movements | |||
| NRIP1 | Nuclear receptor-interacting protein 1 | 202599_s_at | 16 | 39 358 | 8.31 | 6.33 | 1.98 | 1.36 | 2.06 | Interacts with nuclear receptors | |||
| MET | Met proto-oncogene (hepatocyte growth factor receptor) | 211599_x_at | 17 | 39 348 | 8.44 | 5.68 | 2.76 | 1.53 | 2.54 | See the information given above for another probeset of the same gene | |||
| DTX4 | Deltex 4 homolog (Drosophila) | 212611_at | 18 | 39 298 | 10.24 | 8.24 | 2.00 | 2.07 | 1.45 | Participates in protein ubiquitination | |||
| RAB27A | RAB27A, member RAS oncogene family | 210951_x_at | 19 | 38 913 | 8.62 | 5.62 | 3.00 | 1.60 | 1.24 | Prenylated membrane-bound protein with GTPase function | |||
| – | CDNA clone IMAGE:4152983 | 214803_at | 20 | 37 397 | 7.30 | 5.39 | 1.90 | 1.94 | 1.23 | Not identified | |||
| BCL2 | B-cell CLL/lymphoma 2 | 203684_s_at | 21 | 36 483 | 2.92 | 5.88 | −2.95 | −2.74 | −1.39 | Anti-apoptotic protein | |||
| TACSTD2 | Tumor-associated calcium signal transducer 2 | 202286_s_at | 22 | 36 170 | 10.42 | 6.29 | 4.13 | 4.02 | 4.02 | May serve as cell surface receptor | |||
| DIO1 | Deiodinase, iodothyronine, type I | 206457_s_at | 23 | 35 971 | 6.19 | 9.94 | −3.75 | −3.79 | −4.33 | 5′ Deiodination of thyroxine | |||
| ITPR1 | Inositol 1,4,5-triphosphate receptor, type 1 | 203710_at | 24 | 34 804 | 6.75 | 8.84 | −2.09 | −2.06 | −1.83 | Signal transducer coupled with calcium channels, participates in apoptosis | |||
| HBB | Hemoglobin β | 209116_x_at | 25 | 34 591 | 9.62 | 12.13 | −2.51 | −2.48 | −1.24 | See above the | |||
| SNED1 | Sushi, nidogen, and EGF-like domains 1 | 213493_at | 26 | 33 625 | 2.87 | 5.88 | −3.01 | −2.14 | −2.06 | Participates in cell–matrix adhesion, contains sushi, nidogen-and calcium-binding domains | |||
| AHR | Aryl hydrocarbon receptor | 202820_at | 27 | 33 003 | 7.52 | 6.02 | 1.50 | 1.20 | 1.59 | A ligand-activated transcription factor able to form complexes with other nuclear receptors | |||
| HGD | Homogentisate 1,2-dioxygenase (homogentisate oxidase) | 205221_at | 28 | 32 816 | 4.57 | 7.83 | −3.26 | −3.17 | −3.92 | Fe(II)-dependent enzyme responsible for aromatic ring cleavage | |||
| RXRG | Retinoid X receptor, γ | 205954_at | 29 | 32 444 | 7.35 | 4.69 | 2.66 | 2.80 | 2.62 | Heterodimer partner of several nuclear receptors | |||
| CA4 | Carbonic anhydrase IV | 206209_s_at | 30 | 31 332 | 6.33 | 8.51 | −2.18 | −2.62 | −1.41 | An ancient isozyme | |||
| SDC4 | Syndecan 4 (amphiglycan, ryudocan) | 202071_at | 31 | 28 036 | 10.76 | 8.31 | 2.45 | 1.86 | 2.41 | Transmembrane heparan sulfate proteoglycan involved in the organization of the actin cytoskeleton and in cell–matrix interactions, binds fibronectin, behaves as CXCL12 receptor | |||
| ENTPD1 | Ectonucleoside triphosphate diphosphohydrolase 1 | 209473_at | 32 | 27 859 | 8.71 | 6.75 | 1.97 | 1.49 | 1.48 | Membrane bound enzyme converts adenine nucleotides to adenosine, interacts with caveolin 1 and 2 | |||
| TPO | Thyroid peroxidase | 210342_s_at | 33 | 27 658 | 7.29 | 12.24 | −4.95 | −4.93 | −3.75 | Thyroid-specific enzyme crucial for organification of iodine and synthesis of thyroid hormones | |||
| KRT19 | Keratin 19 | 201650_at | 34 | 27 398 | 8.92 | 5.71 | 3.22 | 3.55 | 3.07 | The smallest known keratin expressed in some types of cancer | |||
| ID3 | Inhibitor of DNA binding 3, dominant negative helix-loop-helix protein | 207826_s_at | 35 | 26 271 | 9.17 | 11.25 | −2.08 | −1.26 | −1.29 | Downstream target of pituitary tumor transforming gene | |||
| RUNX1 | Runt-related transcription factor 1 (acute myeloid leukemia 1; aml1 oncogene) | 209360_s_at | 36 | 26 202 | 7.37 | 4.80 | 2.58 | 3.50 | 2.01 | Transcription factor may promote E-cadherin expression | |||
| LMOD1 | Leiomodin 1 (smooth muscle) | 203766_s_at | 37 | 26 044 | 5.60 | 7.80 | −2.20 | −2.77 | −0.95 | Present both in thyroid cells and eye muscle | 64 kDa antigen, considered for its role in thyroid autoimmunity | ||
| RAB27A | RAB27A, member RAS oncogene family | 209514_s_at | 38 | 25 684 | 8.57 | 6.29 | 2.28 | 1.43 | 1.53 | See above information on the alternative probeset identifying the same gene | |||
| FBXO9 | F-box protein 9 | 212987_at | 39 | 25 331 | 8.47 | 9.29 | −0.83 | −0.50 | −0.57 | Members of this gene family in complexes may act as protein–ubiquitin ligases | |||
| TRIM58 | Tripartite motif-containing 58 | 215047_at | 40 | 25 304 | 3.91 | 6.99 | −3.08 | −2.27 | −1.74 | Not identified | |||
| – | – | 210524_x_at | 41 | 25 302 | 9.73 | 12.70 | −2.97 | −2.95 | −2.12 | Not identified | |||
| MT1G | Metallothionein 1G | 204745_x_at | 42 | 24 688 | 9.94 | 12.39 | −2.45 | −1.97 | −4.00 | Low molecular weight, cysteine-rich, zinc-donating protein. Associated with protection against DNA damage, stress, and apoptosis | |||
| ICAM1 | Intercellular adhesion molecule 1 (CD54), human rhinovirus receptor | 202638_s_at | 43 | 24 534 | 8.18 | 5.61 | 2.57 | 1.70 | 2.40 | Epithelial adhesion molecule plays a key role in lymphocyte infiltration into the thyroid |
The original papers (Eszlinger et al. 2004, Huang et al., Jarzab et al.) containing datasets included in the present study were not cited here. RXRG was listed in our previous microarray-based analysis (Jarzab et al.), together with FN1, MET, KRT19, DPP4, HBB, QPCT, GJB3, and DTX4, also occurring in this table.
OMIM-based information if not otherwise specified.
Denotes immunohistochemistry studies.
Ranking of thyroid samples by tumor–normal misclassification frequency, assessed by the bootstrap-based outlier detection (BBOD) approach. The BBOD rank and score Q, as defined in Material and methods, are given.
| 154 | PTC | U133 | B | 1 | 0.04 |
| 97 | Benign | U133 | A | 2 | 7.23 |
| 148 | PTC | U133 | B | 3 | 65.34 |
| 95 | Benign | U133 | A | 4 | 68.25 |
| 88 | PTC | U95v1 | B | 5 | 88.28 |
| 166 | PTC | U133 | B | 6 | 90.02 |
| 84 | PTC | U95v1 | B | 7 | 93.11 |
| 161 | PTC | U133 | A | 8 | 95.96 |
| 94 | Benign | U133 | B | 9 | 97.26 |
| 116 | Normal | U133 | B | 10 | 97.30 |
| 120 | PTC | U133 | A | 11 | 97.98 |
| 77 | Normal | U95v1 | B | 12 | 98.30 |
| 100 | Benign | U133 | B | 13 | 98.70 |
| 139 | PTC | U133 | A | 14 | 98.91 |
| 90 | PTC | U95v1 | B | 15 | 99.09 |
| 42 | CTN | U95v2 | B | 16 | 99.22 |
| 3 | AFTN | U95v2 | A | 17 | 99.28 |
| 37 | CTN | U95v2 | A | 18 | 99.36 |
| 147 | PTC | U133 | A | 19 | 99.38 |
| 40 | CTN | U95v2 | B | 20 | 99.41 |
| 64 samples (28 PTCs, 36 benign/normal) | 21–84 | 99.46–99.98 | |||
| 96 samples (19 PTCs, 77 benign/normal) | 85–180 | 100 |
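The table can be cross-checked against the abstract's statement that 5 of 180 samples were misclassified in more than 10% of bootstrap iterations. The sketch below assumes that Q is the percentage of iterations in which a sample was classified correctly; the record does not define Q, so this reading is an assumption. The Q values are those of ranks 1 to 20 above; all lower-ranked samples have Q of at least 99.46.

```python
# Q scores for BBOD ranks 1-20, copied from the table above.
q_top20 = [
    0.04, 7.23, 65.34, 68.25, 88.28, 90.02, 93.11, 95.96, 97.26, 97.30,
    97.98, 98.30, 98.70, 98.91, 99.09, 99.22, 99.28, 99.36, 99.38, 99.41,
]
n_samples = 180

# Assumption: Q < 90 means the sample was misclassified in more than 10% of iterations.
hard_cases = sum(q < 90 for q in q_top20)
share = round(100 * hard_cases / n_samples, 1)
print(hard_cases, share)
```

Under this assumption the count and percentage agree with the figures quoted in the abstract.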
Comparison of results obtained by different class prediction methods
| Compound covariate predictor | 89 | 85 | 88 | 77 | 93 |
| Nearest centroid | 90 | 86 | 89 | 79 | 93 |
| Linear diagonal discriminant analysis | 92 | 87 | 92 | 83 | 94 |
| One-nearest neighbor | 98 | 94 | 99 | 98 | 97 |
| Three-nearest neighbors | 98 | 93 | 100 | 99 | 97 |
| Support vector machines | 99 | 95 | 99 | 98 | 98 |
PPV, positive predictive value; NPV, negative predictive value.
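For reference, PPV and NPV follow from a confusion matrix as sketched below, alongside accuracy, sensitivity, and specificity. The counts in the example are hypothetical and are not taken from the study; the function name is likewise illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts (malignant = positive)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of true cancers that were detected
        "specificity": tn / (tn + fp),  # share of non-cancers correctly cleared
        "ppv": tp / (tp + fp),          # probability that a positive call is correct
        "npv": tn / (tn + fn),          # probability that a negative call is correct
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=8, fp=2, tn=18, fn=2)
print(metrics)
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of malignant samples in the dataset, so they do not transfer directly to populations with a different prevalence.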