
A multi-gene approach to differentiate papillary thyroid carcinoma from benign lesions: gene selection using support vector machines with bootstrapping.

Krzysztof Fujarewicz, Michal Jarzab, Markus Eszlinger, Knut Krohn, Ralf Paschke, Małgorzata Oczko-Wojciechowska, Małgorzata Wiench, Aleksandra Kukulska, Barbara Jarzab, Andrzej Swierniak.

Abstract

Selection of novel molecular markers is an important goal of cancer genomics studies. The aim of our analysis was to apply multivariate bioinformatic tools to rank genes that are potential markers of papillary thyroid cancer (PTC) according to their diagnostic usefulness. We also assessed the accuracy of benign/malignant classification, based on gene expression profiling, for PTC. We analyzed a 180-array dataset (90 HG-U95A and 90 HG-U133A oligonucleotide arrays), which included a collection of 57 PTCs, 61 benign thyroid tumors, and 62 apparently normal tissues. Gene selection was carried out by the support vector machines method with bootstrapping, which allowed us to (1) rank the genes that were most important for classification quality and appeared most frequently in the classifiers (bootstrap-based feature ranking, BBFR) and (2) rank the samples, and thus detect the cases that were most difficult to classify (bootstrap-based outlier detection). The accuracy of PTC diagnosis was 98.5% for a 20-gene classifier, with a 95% confidence interval (CI) of 95.9-100%; the lower limit of the CI exceeded 95% already for five genes. Only 5 of 180 samples (2.8%) were misclassified in more than 10% of bootstrap iterations. We specified 43 genes which are most suitable as molecular markers of PTC, among them some well-known PTC markers (MET, fibronectin 1, dipeptidylpeptidase 4, or adenosine A1 receptor) and potential new ones (UDP-galactose-4-epimerase, cadherin 16, gap junction protein 3, sushi, nidogen, and EGF-like domains 1, inhibitor of DNA binding 3, RUNX1, leiomodin 1, F-box protein 9, and tripartite motif-containing 58). The highest ranking gene, metallophosphoesterase domain-containing protein 2, achieved 96.7% of the maximum BBFR score.

Year:  2007        PMID: 17914110      PMCID: PMC2216417          DOI: 10.1677/ERC-06-0048

Source DB:  PubMed          Journal:  Endocr Relat Cancer        ISSN: 1351-0088            Impact factor:   5.678


Introduction

Discrimination between benign thyroid nodules and cancer is an important aspect of determining the optimal extent of thyroid surgery. Currently, this is achieved by routine morphologic assessment of cytopathology samples. However, this method does not allow proper classification of all thyroid tumors (Baloch & Livolsi 2002, Franc ). At several institutions, genomic studies have been undertaken which, besides focusing on basic biological issues (Huang , Giordano ), also explore potential diagnostic applications (Aldred , Chevillard , Finley ). Our recent microarray-based analysis yielded a 20-gene classifier that differentiates between papillary thyroid cancer (PTC) and normal thyroid tissue (Jarzab ), further verified using three independent datasets (Eszlinger ). Very large and easily distinguishable differences between the molecular profiles of PTC and normal thyroid have clearly demonstrated the applicability of gene expression findings to diagnostic purposes. However, even more desirable for the clinician would be a genomic profiling-based capability to discriminate between malignant tumors and various benign lesions. Therefore, we decided to use a balanced mixture of samples from malignant and benign tumors and normal thyroid tissue to mimic the clinical situation, where material from any of these may be obtained and must be properly classified. This large 180-array dataset is derived from de novo studies (n=40), our own previously published microarray data (n=124; Eszlinger , 2004, Jarzab ), and accessible datasets published by other authors (n=16; Huang ). We set the following goals for the study: (1) to assess the accuracy of benign/malignant classification of thyroid specimens in relation to gene set size, in the context of PTC, and (2) to optimize the list of diagnostically relevant genes in PTC. To answer both questions, we used the support vector machines (SVMs) method with bootstrapping.
This approach relies on iterative construction of SVM classifiers based on randomly selected sets of specimens (bootstrap samples) and testing the classifiers on the remaining samples. We applied the bootstrap to obtain both gene (feature) ranking and outlier detection. The ranking of the genes that are most important for classification quality was based on the frequency of their occurrence in classifiers of different size (bootstrap-based feature ranking, BBFR). The ranking of the misclassified samples allowed us to detect outliers (bootstrap-based outlier detection, BBOD) and to obtain a reliable estimate of classification accuracy with appropriate confidence intervals (CI) for gene sets of different size.

Material and methods

Microarray data used in the study

Microarray datasets from three sources were included in the analysis:
(1) Dataset obtained in Gliwice, Poland; in total, 90 specimens analyzed with GeneChip HG-U133A microarrays. The specimens were collected from 71 patients: 49 with PTC (9 males and 40 females; mean age 36 years, range 6–71 years) and 22 with other thyroid diseases, namely 6 with follicular adenoma, 13 with nodular or colloid goiter, and 3 with chronic thyroiditis (9 males and 13 females; mean age 45 years, range 11–71 years). The thyroid tissue specimens included 49 PTC tumors and 41 normal/benign thyroid tissue samples. The latter samples were from patients with PTC (n=17) or other benign thyroid lesions (n=24), among them six follicular adenomas, four nodular goiters, nine colloid goiters, and five cases of thyroiditis, two of them taken from the contralateral lobe from patients with PTC. Fifty microarrays were included in our previously published study and are publicly available at www.genomika.pl/thyroidcancer (Jarzab ); 40 microarrays were from de novo studies. All new samples were processed according to the description given in Jarzab et al. (2005).
(2) Dataset obtained in Leipzig, Germany; 74 specimens analyzed with GeneChip HG-U95Av2 microarrays. The specimens included 15 autonomously functioning thyroid nodules, 22 cold thyroid nodules, and 37 samples of their respective surrounding thyroid tissues. The analysis of these datasets was published previously (Eszlinger , 2004) and the datasets are available at http://www.uni-leipzig.de/innere/_forschung/schwerpunkte/etiology.html.
(3) Dataset obtained in Columbus, OH, USA; 16 specimens analyzed with GeneChip HG-U95A microarrays. The specimens were derived from eight patients and included both PTC tumors and their surrounding thyroid tissues. The dataset (Huang ) is publicly available at http://thinker.med.ohio-state.edu.
In total, the three analyzed datasets comprised 57 PTCs, 61 benign thyroid lesions, and 62 apparently normal thyroid tissues analyzed on 180 GeneChips of two different generations. Half of them were U133A and the rest U95A platforms.

Data pre-processing and generation of datasets

Each dataset was pre-processed by the MAS5 algorithm. To compare the expression data generated using the U95A GeneChips (12 625 probe sets) with those from the U133A GeneChips (22 283 probe sets), we used the ‘Human Genome U95 to Human Genome U133 Best Match Comparison Spreadsheet’ (www.affymetrix.com/support/technical/comparison_spreadsheets.affx) which yielded an intersection of 9530 probe sets. The obtained data were log2 transformed.

Neighborhood analysis and recursive elimination in gene selection

For selection of gene sets with diagnostic potential, we applied the recursive feature elimination (RFE) algorithm (Guyon ), which is computationally less demanding than the recursive feature replacement used in our previous studies (Jarzab , Eszlinger ). An initial preselection of 200 genes was performed using neighborhood analysis (Golub , Slonim ); the 100 best genes were then selected from this set by RFE.
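The preselection step can be illustrated with a minimal sketch of the signal-to-noise statistic commonly used in neighborhood analysis. It assumes the per-gene score (mean of class 1 minus mean of class 0) divided by the sum of the two class standard deviations, and it assumes that genes are ranked by the absolute value of that score; the function names and the toy expression values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def signal_to_noise(values, labels):
    """Score one gene: (mean_1 - mean_0) / (sd_1 + sd_0), using sample SDs."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return (mean(pos) - mean(neg)) / (stdev(pos) + stdev(neg))

def preselect(expr, labels, n_keep):
    """expr maps a gene id to its log2 values (one per sample).
    Returns the n_keep gene ids with the largest absolute score (an assumption:
    up- and down-regulated genes are treated alike)."""
    scores = {g: signal_to_noise(v, labels) for g, v in expr.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)[:n_keep]

# Hypothetical toy data: 3 malignant (label 1) and 3 benign (label 0) samples.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "gene_up":   [8.0, 8.5, 9.0, 5.0, 5.5, 6.0],
    "gene_down": [3.0, 3.5, 4.0, 7.0, 7.5, 8.0],
    "gene_flat": [6.0, 7.0, 5.0, 6.0, 7.0, 5.0],
}
top2 = preselect(expr, labels, 2)
print(top2)
```

In a full pipeline the preselected genes would then be passed to RFE with a linear SVM, which is not shown here.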

SVMs and classification

The linear SVM (Boser , Vapnik 1995) was used for developing the classification rule. As mentioned earlier, the classifier was independently trained for different numbers of selected genes (from 1 to 100).

Bootstrap for estimation of classifier accuracy and its CI

To determine the accuracy of the developed classifier, we performed a classical bootstrap procedure with 500 resampling iterations (sampling with replacement, each sample drawn with equal probability; Efron 1979). All stages of classifier construction (i.e. gene preselection, gene selection, and classifier learning) were repeated in each bootstrap iteration, as suggested previously (Simon ). The accuracy of the classifier was calculated using the 0.632 bootstrap estimator (Efron 1983). The distribution of the misclassification rate obtained during all bootstrap runs was used to estimate the 95% CI. The accuracy of the classifier and the CI were calculated for different numbers of selected genes (up to 100).
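The resampling scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 0.632 estimator is written in its usual form (0.368 times the resubstitution error plus 0.632 times the out-of-bag error), and the percentile interval is only one plausible way to turn the per-run error distribution into a 95% CI, which is an assumption. All function names are hypothetical.

```python
import random

def bootstrap_split(m, rng):
    """Draw m indices with replacement; samples never drawn form the test set."""
    train = [rng.randrange(m) for _ in range(m)]
    chosen = set(train)
    test = [i for i in range(m) if i not in chosen]
    return train, test

def err_632(resub_error, oob_error):
    """0.632 bootstrap estimate of the misclassification rate."""
    return 0.368 * resub_error + 0.632 * oob_error

def percentile_interval(values, level=0.95):
    """Percentile interval over per-run values (assumed CI construction)."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi

rng = random.Random(0)          # fixed seed so the sketch is reproducible
train, test = bootstrap_split(180, rng)
print(len(set(train)), len(test))
```

In each run, gene preselection, gene selection, and classifier training would all be carried out on `train` only, and the classifier would be evaluated on `test`.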

Bootstrap based feature ranking (BBFR) and outlier detection (BBOD)

The primary purpose of the bootstrap used in this study was to estimate the accuracy of the molecular classifier for different sizes of gene subsets with appropriate CIs. However, the computational effort for the bootstrap technique may also be exploited to derive some additional information. We apply two methods that use the information collected during bootstrapping: BBFR and BBOD. They are similar to the methods of statistical learning based on resampling, such as bagging and boosting. In both techniques, an ensemble of many base classifiers is created. Each base classifier is trained on different bootstrap subsamples. The final decision is based on decisions of all base classifiers. The simplest approach is bagging (bootstrap aggregating) originally proposed by Breiman (1996). In bagging, the subsamples are randomly drawn as in classical bootstrapping where each observation is picked with the same probability 1/m, where m is the number of all observations. The final decision is the decision of most base classifiers. In boosting, different observations may be picked with different probability and the final decision is weighted sum of decisions of base classifiers. The well-known boosting algorithm is AdaBoost (Freund & Schapire 1996). In our approach, we do not create an ensemble (committee) of many base classifiers but we use the information collected during bootstrap-based validation step of the SVM classifier. Let the data contain m instances (observations). One instance is a vector of Nmax features (gene expression values) with a corresponding class label specified by an expert. Let L be the number of bootstrap iterations. In each run, we select (with equal probability and return of samples) m instances from the dataset (bootstrap sample). Then, the bootstrap sample is used for feature selection and classifier learning. Finally, the classifier is tested on the test set containing all instances not belonging to the bootstrap sample. 
To find the optimal size for the feature set, we select N feature sets $\Omega_1, \Omega_2, \dots, \Omega_N$ of sizes 1, 2, …, N respectively. In general, selected sets may not overlap, but in most commonly used feature selection methods, based on feature ranking or backward/forward searching, feature subsets satisfy the relation

$$\Omega_1 \subset \Omega_2 \subset \dots \subset \Omega_N \qquad (1)$$

BBFR

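Let $r_j^{(i)}$ be the number of subsets $\Omega_k$, $k=1,2,\dots,N$, to which gene $j$ belongs in the $i$-th bootstrap run. For gene selection methods satisfying equation (1), a gene that first appears in the subset of size $k$ belongs to all larger subsets as well, so we have

$$r_j^{(i)} = N - k + 1$$

(and $r_j^{(i)} = 0$ if the gene is not selected at all). The BBFR score $R_j$ of the feature $j$ is defined as the sum of $r_j^{(i)}$ over all bootstrap runs:

$$R_j = \sum_{i=1}^{L} r_j^{(i)}$$

The maximum possible value of the BBFR score is $L \cdot N$.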
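As an illustration of the BBFR score, the sketch below assumes that each bootstrap run yields a gene ranking (best first) whose prefixes are the nested feature sets of sizes 1 to N; under that assumption a gene at 0-based position p belongs to N − p of the sets. The function name and the toy rankings are hypothetical.

```python
from collections import Counter

def bbfr_scores(rankings, n_max):
    """rankings: one gene ranking (best first) per bootstrap run.
    Adds, for every run, the number of nested subsets containing each gene."""
    scores = Counter()
    for ranking in rankings:
        for pos, gene in enumerate(ranking[:n_max]):
            scores[gene] += n_max - pos
    return scores

# Hypothetical toy example: L = 3 bootstrap runs, N = 3 nested subsets.
runs = [["g1", "g2", "g3"], ["g1", "g3", "g2"], ["g2", "g1", "g4"]]
scores = bbfr_scores(runs, n_max=3)
max_score = len(runs) * 3       # L * N, the score of a gene ranked first in every run
print(scores.most_common(), max_score)
```

A gene that is ranked first in every run reaches the maximum score L·N; with L = 500 runs and N = 100 subsets this maximum is 5×10^4.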

BBOD

Let $q_k$ be the number of bootstrap iterations in which the observation $k$ is chosen as a test instance (i.e. is not a member of the bootstrap sample). Let $q_k^{\mathrm{true}}$ be the number of those iterations in which the observation $k$ is correctly classified at the test stage. The BBOD score for the $k$-th observation is

$$Q_k = \frac{q_k^{\mathrm{true}}}{q_k}$$

The value of $Q_k$ belongs to the interval [0, 1], and a low value indicates an outlier.
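The BBOD score can be computed directly from the per-run test results. The sketch below assumes that each bootstrap run is recorded as a mapping from out-of-bag sample index to a flag saying whether the sample was classified correctly; this data layout, the function name, and the toy runs are hypothetical.

```python
def bbod_scores(test_results, m):
    """test_results: one dict per bootstrap run, mapping each out-of-bag
    sample index to True (correctly classified) or False.
    Returns Q_k = q_true_k / q_k per sample, or None if never tested."""
    q = [0] * m
    q_true = [0] * m
    for run in test_results:
        for k, correct in run.items():
            q[k] += 1
            q_true[k] += int(correct)
    return [q_true[k] / q[k] if q[k] else None for k in range(m)]

# Hypothetical toy example with m = 4 samples and 4 bootstrap runs.
runs = [{0: True, 2: False}, {0: True, 1: True}, {2: False, 1: True}, {2: True}]
scores = bbod_scores(runs, m=4)
print(scores)
```

Samples with a low score are candidates for outliers; sorting the samples by this score gives the kind of ranking reported in Table 2.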

Comparison of different class prediction methods

We used BRB ArrayTools (developed by Dr Richard Simon and Amy Peng Lam) to compare different class prediction algorithms (Compound Covariate Predictor, Linear Diagonal Discriminant Analysis, Nearest Centroid, 1-Nearest Neighbor, 3-Nearest Neighbors and SVMs). To compute misclassification rate, 0.632 bootstrap cross-validation method was used. All genes with univariate misclassification rate below 0.2 were used for this analysis.

Results

Accuracy of malignant/benign classification and redundancy of PTC gene classifiers

The huge difference in gene expression between PTC and benign/normal thyroid tissues implies that many multi-gene classifiers with similar classification ability may be created. For preliminary assessment of the accuracy of differentiation between PTC and benign lesions or normal thyroid, we randomly divided the 180-array dataset into two subgroups, according to sample number: A (odd numbers) and B (even numbers). Each subgroup contained data from a similar number of benign and malignant tumor specimens analyzed with U133A or U95A GeneChips. We used set A to obtain a 20-gene classifier; this classifier was tested on set B and the procedure was repeated, using set B as a training set and testing the classifier on set A. Using the classifier obtained from set A, we were able to correctly predict 86 out of 90 samples (95.6%) within set B, while using the classifier obtained from set B, we accurately diagnosed 88 out of 90 samples in set A (97.8%). Both classifiers differed partly from our previous 20-gene classifier (37) obtained on a smaller dataset. To avoid a bias in gene selection and accuracy estimation related to the arbitrary selection of the training set, we carried out the procedure of accuracy estimation by bootstrapping, i.e. randomly selecting large numbers of slightly different training sets and validating them on the remaining samples. This procedure allows using sufficiently large training sets while simultaneously obtaining a reliable estimation of classification accuracy. By applying this method, we estimated the accuracy of discrimination between benign and malignant samples to be 98.6%, with a rather narrow CI (see Fig. 1). For small gene sets, the accuracy was slightly lower (93.7% for a one-gene set, 96.9% for a two-gene set, 97.9% for a three-gene set, and 98.3–98.6% for larger sets, up to n=100). For the 20-gene classifier, the accuracy was 98.5% with an estimated 95% CI of 95.9–100%; the lower limit of the CI exceeded 95% for classifiers built from five or more genes.
Figure 1

Accuracy of bootstrapping-estimated benign–malignant classification for different gene set sizes. The 95% confidence interval is marked by dashed lines.

We compared the results of classification by the best 500 genes (Fig. 2) with the classification by consecutive 500-gene sets (i.e. first 500, 500–1000, 1000–1500, etc). We noted that only the first 500 genes allow accurately classifying samples by single genes or small gene sets. Genes ranked 500–1000 achieved 90% accuracy only for classifiers larger than 50 genes, while genes beyond the first 1000 hardly achieve this limit of accuracy. When we excluded all genes analyzed in Fig. 2 (8×500=4000), the accuracy obtained for small sets was only ∼60%, close to random. However, the accuracy rose with gene set size, and for classifier sets larger than 700 genes it achieved 90% (data not shown). These results support the conclusion that the PTC transcriptome differs from the normal one in thousands of genes; they also provide evidence that optimizing a diagnostic gene set is a necessary step of analysis in order to make this set useful for molecular PTC classification.
Figure 2

Accuracy of classification obtained by successive gene set reduction. The accuracy of the best 500 genes was evaluated in one iteration using the bootstrap technique, then the selected 500-gene set was removed from the whole dataset, and the next 500 genes were selected in the following iteration. This procedure was repeated seven times, thus 3500 genes were excluded (line no. 8). To speed up the procedure, only neighbourhood analysis (NA) was used for gene selection.

Ranking of PTC genes for their classification ability

To obtain the ranking of genes based on their usefulness in the diagnostic context, we repeated the gene selection process on bootstrap samples of the whole dataset. We ranked all genes according to the frequency of their appearance within the selected gene sets (BBFR). Genes important for the majority of diagnostic datasets were highly ranked, while less importance was given to complementing transcripts, which exhibited higher variability (Fig. 3). During the selection process, 365 transcripts occurred at least once within the obtained classifiers and some of them were present in nearly all classifiers. The maximum theoretical score to be obtained by a gene was 5×10^4 and the gene with the best rank, encoding metallophosphoesterase domain-containing protein 2 (MPPED2), had a score of 4.84×10^4, i.e. 96.7% of the maximum one. The first 20 genes were given scores >3.74×10^4 (>77% of the score of the top gene), only slightly lower than the top gene, and the first 100 transcripts were characterized by scores >0.64×10^4, which is >13.2% of the maximum score obtained. In total, 43 transcripts representing 41 genes scored higher than half of the value for the top gene (>2.42×10^4, Fig. 3). Among them, there were genes known for their changed expression in PTC or described in previous microarray studies, some already used as single markers, as well as new genes not considered previously for their diagnostic potential (Table 1).
Figure 3

Result of bootstrap-based feature ranking (BBFR). Each dot represents one gene, dashed lines define the subset of 43 genes with BBFR score larger than half of the maximum one (black dots).

Table 1

Ranking of papillary thyroid cancer (PTC) genes as assessed by bootstrap-based feature ranking (BBFR) approach. For each transcript selected, rank and score obtained by the BBFR method are given, together with basic univariate statistics (log2 mean and log2 ratio)

Gene symbol | Gene name | Affy_ID (U133) | Rank | Score | PTC mean log2 | Benign mean log2 | Log ratio | Log ratio U133 | Log ratio U95 | References of microarray or other high throughput studies [a]; referred to in single studies of thyroid cancer; other data relevant for functional role in thyroid cancer; gene function [b]
MPPED2 | Metallophosphoesterase domain-containing protein 2 | 205413_at | 1 | 48 449 | 4.66 | 7.82 | −3.16 | −3.46 | −3.26 | Aldred et al. (2003, 2004), Mazzanti et al. (2004) and Griffith et al. (2006); Fetal brain protein of unknown function
HBA1/HBA2 | Hemoglobin, α-1/hemoglobin, α-2 | 209458_x_at | 2 | 45 521 | 9.79 | 12.04 | −2.25 | −2.28 | −1.87 | Griffith et al. (2006); Onda et al. (2005); Oxygen transport
MET | Met proto-oncogene (hepatocyte growth factor receptor) | 213807_x_at | 3 | 45 363 | 8.15 | 5.22 | 2.93 | 1.73 | 2.60 | Barden et al. (2003), Wasenius et al. (2003), Finley et al. (2004a,b), Prasad et al. (2004), Zou et al. (2004) and Giordano et al. (2005); [c] Belfiore et al. (1997) and [c] Ippolito et al. (2001); Ramirez et al. (2000), Ruco et al. (2001) and Scarpino et al. (2004); Membrane tyrosine kinase receptor enhances cell motility, invasiveness, and chemokine production (Ruco et al. (2001))
FN1 | Fibronectin 1 | 210495_x_at | 4 | 44 017 | 12.24 | 8.72 | 3.52 | 2.75 | 3.91 | Chen et al. (2001), Barden et al. (2003), Wasenius et al. (2003), Finley et al. (2004a,b), Prasad et al. (2004), Giordano et al. (2005), Hamada et al. (2005) and Griffith et al. (2006); Takano et al. (1998, 1999) and [c] Prasad et al. (2005); Ghinea et al. (2002) and Liu et al. (2005); Extracellular matrix glycoprotein participates in cell adhesion, regulates proliferation and survival of thyroid cells via integrin receptors (Illario et al. (2003))
GALE | UDP-galactose-4-epimerase | 202528_at | 5 | 43 974 | 7.12 | 3.70 | 3.42 | 2.41 | 3.50 | Converts glucose to galactose and N-acetylglucosamine to its UDP-derivatives
QPCT | Glutaminyl-peptide cyclotransferase (glutaminyl cyclase) | 205174_s_at | 6 | 43 317 | 7.60 | 4.99 | 2.61 | 3.14 | 2.56 | Barden et al. (2003), Chevillard et al. (2004), Finley et al. (2004a,b) and Griffith et al. (2006); Converts glutaminyl peptides to cyclic pyroglutamyl ones
NELL2 | NEL-like 2 (chicken) | 203413_at | 7 | 42 953 | 9.68 | 7.56 | 2.12 | 2.02 | 2.53 | Barden et al. (2003) and Finley et al. (2004a,b); Brain protein with six EGF-like repeats
PGCP | Plasma glutamate carboxypeptidase | 203501_at | 8 | 42 153 | 7.70 | 9.23 | −1.53 | −1.05 | −1.33 | Aldred et al. (2003, 2004), Barden et al. (2003), Finley et al. (2004a,b), Weber et al. (2005) and Sarquis et al. (2006); Breakdown of secreted peptides, homologous to prostate membrane-specific antigen (Gingras et al. (1999))
DPP4 | Dipeptidylpeptidase 4 (CD26, adenosine deaminase complexing protein 2) | 203717_at | 9 | 42 115 | 7.87 | 3.77 | 4.11 | 3.21 | 3.81 | Huang et al. (2001), Takano et al. (2002, 2004), Prasad et al. (2004) and Griffith et al. (2006); Kehlen et al. (2003), [c] Kholova et al. (2003a,b) and Ozog et al. (2006); Aratake et al. (2006) and Schagdarsurengin et al. (2006); Membrane enzyme, participates in breakdown of secreted peptides
ADORA1 | Adenosine A1 receptor | 205481_at | 10 | 41 699 | 7.16 | 4.85 | 2.30 | 2.00 | 2.81 | Aldred et al. (2003, 2004) and Prasad et al. (2004); Lelievre et al. (1998), Woodhouse et al. (1998) and Schnurr et al. (2004); Membrane receptor, stimulates motility and modulates proliferation
HMGA2 | High-mobility group AT-hook 2 | 208025_s_at | 11 | 40 713 | 7.90 | 4.66 | 3.24 | 3.58 | 2.62 | Baris et al. (2004, 2005) and Jacques et al. (2005); Fedele et al. (2001), Berlingieri et al. (2002) and Musholt et al. (2006); Architectural transcription factor (Noro et al. (2003))
RYR1 | Ryanodine receptor 1 (skeletal) | 205485_at | 12 | 40 473 | 6.96 | 4.70 | 2.27 | 2.53 | 1.92 | Barden et al. (2003) and Finley et al. (2004a,b); Present mainly in excitable cells; Calcium release channel of the sarcoplasmic reticulum
CDH16 | Cadherin 16, KSP-cadherin | 206517_at | 13 | 39 770 | 3.47 | 8.07 | −4.60 | −4.68 | −1.43 | Thought to be kidney specific (Thomson et al. (1995)); Calcium-dependent, membrane-associated glycoprotein, participates in cell adhesion
GJB3 | Gap junction protein β-3, 31 kDa (connexin 31) | 205490_x_at | 14 | 39 526 | 6.49 | 4.04 | 2.44 | 2.71 | 0.62 | Does not normally appear in thyroid, in adult mouse becomes restricted to epidermis, testis and placenta (Tonoli et al. (2000), Plum et al. (2002) and Green et al. (2005)); Forms incompatible hemichannels with thyroidal connexin 43 (Dahl et al. (1996))
EMID1 | EMI domain containing 1 | 213779_at | 15 | 39 505 | 6.44 | 8.12 | −1.68 | −1.09 | −0.76 | Barden et al. (2003), Cerutti et al. (2004) and Finley et al. (2004a,b); Extracellular matrix protein, able to promote cell movements (Spessotto et al. (2003))
NRIP1 | Nuclear receptor-interacting protein 1 | 202599_s_at | 16 | 39 358 | 8.31 | 6.33 | 1.98 | 1.36 | 2.06 | Barden et al. (2003) and Finley et al. (2004a,b); Interacts with nuclear receptors
MET | Met proto-oncogene (hepatocyte growth factor receptor) | 211599_x_at | 17 | 39 348 | 8.44 | 5.68 | 2.76 | 1.53 | 2.54 | Barden et al. (2003), Wasenius et al. (2003), Finley et al. (2004a,b), Prasad et al. (2004), Zou et al. (2004) and Giordano et al. (2005); See the information given above for another probeset of the same gene
DTX4 | Deltex 4 homolog (Drosophila) | 212611_at | 18 | 39 298 | 10.24 | 8.24 | 2.00 | 2.07 | 1.45 | Prasad et al. (2004); Participates in protein ubiquitination
RAB27A | RAB27A, member RAS oncogene family | 210951_x_at | 19 | 38 913 | 8.62 | 5.62 | 3.00 | 1.60 | 1.24 | Barden et al. (2003), Finley et al. (2004a,b), Weber et al. (2005), Musholt et al. (2006) and Sarquis et al. (2006); Prenylated membrane bound protein with GTP-ase function
 | CDNA clone IMAGE:4152983 | 214803_at | 20 | 37 397 | 7.30 | 5.39 | 1.90 | 1.94 | 1.23 | Not identified
BCL2 | B-cell CLL/lymphoma 2 | 203684_s_at | 21 | 36 483 | 2.92 | 5.88 | −2.95 | −2.74 | −1.39 | Hoos et al. (2002), Baris et al. (2004, 2005), Prasad et al. (2004), Wreesmann et al. (2004), Giordano et al. (2005) and Jacques et al. (2005); Mitselou et al. (2004), [c] Aksoy et al. (2005) and [c] Letsas et al. (2005); Stassi et al. (2003) and Basolo et al. (1999); Anti-apoptotic protein
TACSTD2 | Tumor-associated calcium signal transducer 2 | 202286_s_at | 22 | 36 170 | 10.42 | 6.29 | 4.13 | 4.02 | 4.02 | Giordano et al. (2005); May serve as cell surface receptor
DIO1 | Deiodinase, iodothyronine, type I | 206457_s_at | 23 | 35 971 | 6.19 | 9.94 | −3.75 | −3.79 | −4.33 | Eszlinger et al. (2001, 2004), Huang et al. (2001), Barden et al. (2003), Finley et al. (2004a,b), Prasad et al. (2004), Wreesmann et al. (2004), Giordano et al. (2005) and Griffith et al. (2006); [c] De Micco et al. (1999), [c] Czarnocka et al. (2001), [c] Le Fourn et al. (2004), [c] Ambroziak et al. (2005) and [c] Arnaldi et al. (2005); Kohrle (1999); 5′ Deiodination of thyroxine
ITPR1 | Inositol 1,4,5-triphosphate receptor, type 1 | 203710_at | 24 | 34 804 | 6.75 | 8.84 | −2.09 | −2.06 | −1.83 | Barden et al. (2003), Finley et al. (2004a,b), Prasad et al. (2004), Wreesmann et al. (2004) and Hamada et al. (2005); Signal transducer coupled with calcium channels, participates in apoptosis (Sedlak & Snyder (2006))
HBB | Hemoglobin β | 209116_x_at | 25 | 34 591 | 9.62 | 12.13 | −2.51 | −2.48 | −1.24 | Aldred et al. (2003, 2004) and Onda et al. (2005); See above the HBA gene
SNED1 | Sushi, nidogen, and EGF-like domains 1 | 213493_at | 26 | 33 625 | 2.87 | 5.88 | −3.01 | −2.14 | −2.06 | Participates in cell–matrix adhesion, contains sushi, nidogen- and calcium-binding domains
AHR | Aryl hydrocarbon receptor | 202820_at | 27 | 33 003 | 7.52 | 6.02 | 1.50 | 1.20 | 1.59 | Barden et al. (2003), Wasenius et al. (2003) and Finley et al. (2004a,b); A ligand-activated transcription factor able to form complexes with other nuclear receptors (Widerak et al. (2005))
HGD | Homogentisate 1,2-dioxygenase (homogentisate oxidase) | 205221_at | 28 | 32 816 | 4.57 | 7.83 | −3.26 | −3.17 | −3.92 | Huang et al. (2001), Aldred et al. (2003), Barden et al. (2003), Aldred et al. (2004), Finley et al. (2004a,b), Prasad et al. (2004) and Giordano et al. (2005); Fe(II)-dependent enzyme responsible for aromatic ring cleavage
RXRG | Retinoid X receptor, γ | 205954_at | 29 | 32 444 | 7.35 | 4.69 | 2.66 | 2.80 | 2.62 | Haugen et al. (2004); Klopper et al. (2004), Schmutzler et al. (2004) and Frohlich et al. (2005); Heterodimer partner of several nuclear receptors
CA4 | Carbonic anhydrase IV | 206209_s_at | 30 | 31 332 | 6.33 | 8.51 | −2.18 | −2.62 | −1.41 | Barden et al. (2003), Finley et al. (2004a,b), Weber et al. (2005) and Sarquis et al. (2006); An ancient isozyme
SDC4 | Syndecan 4 (amphiglycan, ryudocan) | 202071_at | 31 | 28 036 | 10.76 | 8.31 | 2.45 | 1.86 | 2.41 | Barden et al. (2003), Chevillard et al. (2004), Finley et al. (2004a,b), Prasad et al. (2004) and Griffith et al. (2006); Transmembrane heparan sulfate proteoglycan involved in the organization of the actin cytoskeleton and in cell–matrix interactions, binds fibronectin, behaves as CXCL12 receptor (Lin et al. (2005))
ENTPD1 | Ectonucleoside triphosphate diphosphohydrolase 1 | 209473_at | 32 | 27 859 | 8.71 | 6.75 | 1.97 | 1.49 | 1.48 | Weber et al. (2005) and Sarquis et al. (2006); Membrane bound enzyme converts adenine nucleotides to adenosine, interacts with caveolin 1 and 2 (Kittel et al. (2004))
TPO | Thyroid peroxidase | 210342_s_at | 33 | 27 658 | 7.29 | 12.24 | −4.95 | −4.93 | −3.75 | Barden et al. (2003), Cerutti et al. (2004), Finley et al. (2004a,b) and Griffith et al. (2006); Arturi et al. (1997), Lazar et al. (1999) and Furuya et al. (2004); Thyroid-specific enzyme crucial for organification of iodine and synthesis of thyroid hormones
KRT19 | Keratin 19 | 201650_at | 34 | 27 398 | 8.92 | 5.71 | 3.22 | 3.55 | 3.07 | Barden et al. (2003), Chevillard et al. (2004), Finley et al. (2004a,b), Prasad et al. (2004) and Griffith et al. (2006); Schelfhout et al. (1989); The smallest known keratin expressed in some types of cancer
ID3 | Inhibitor of DNA binding 3, dominant negative helix-loop-helix protein | 207826_s_at | 35 | 26 271 | 9.17 | 11.25 | −2.08 | −1.26 | −1.29 | Downstream target of pituitary tumor transforming gene (PTTG)
RUNX1 | Runt-related transcription factor 1 (acute myeloid leukemia 1; aml1 oncogene) | 209360_s_at | 36 | 26 202 | 7.37 | 4.80 | 2.58 | 3.50 | 2.01 | Kim et al. (2007); Transcription factor may promote E-cadherin expression (Liu et al. (2005))
LMOD1 | Leiomodin 1 (smooth muscle) | 203766_s_at | 37 | 26 044 | 5.60 | 7.80 | −2.20 | −2.77 | −0.95 | Present both in thyroid cells and eye muscle (Kromminga et al. (1998)); 64 kDa antigen, considered for its role in thyroid autoimmunity
RAB27A | RAB27A, member RAS oncogene family | 209514_s_at | 38 | 25 684 | 8.57 | 6.29 | 2.28 | 1.43 | 1.53 | Barden et al. (2003), Finley et al. (2004a,b), Weber et al. (2005) and Sarquis et al. (2006); See above information on the alternative probeset identifying the same gene
FBXO9 | F-box protein 9 | 212987_at | 39 | 25 331 | 8.47 | 9.29 | −0.83 | −0.50 | −0.57 | Members of this gene family in complexes may act as protein–ubiquitin ligases
TRIM58 | Tripartite motif-containing 58 | 215047_at | 40 | 25 304 | 3.91 | 6.99 | −3.08 | −2.27 | −1.74 | Not identified
 |  | 210524_x_at | 41 | 25 302 | 9.73 | 12.70 | −2.97 | −2.95 | −2.12 | Not identified
MT1G | Metallothionein 1G | 204745_x_at | 42 | 24 688 | 9.94 | 12.39 | −2.45 | −1.97 | −4.00 | Baris et al. (2004, 2005), Prasad et al. (2004), Jacques et al. (2005) and Griffith et al. (2006); Cherian et al. (2003); Low molecular weight, cysteine-rich, zinc-donating protein. Associated with protection against DNA damage, stress, and apoptosis (Theocharis et al. (2004))
ICAM1 | Intercellular adhesion molecule 1 (CD54), human rhinovirus receptor | 202638_s_at | 43 | 24 534 | 8.18 | 5.61 | 2.57 | 1.70 | 2.40 | Kawai et al. (1998); Epithelial adhesion molecule plays a key role in lymphocyte infiltration into the thyroid

[a] The original papers (Eszlinger , 2004, Huang , Jarzab ) containing datasets included in the present study were not cited here. RXRG was listed in our previous microarray-based analysis (Jarzab ), together with FN1, MET, KRT19, DPP4, HBB, QPCT, GJB3, and DTX4, also occurring in this table.

[b] OMIM-based information if not otherwise specified.

[c] Denotes immunohistochemistry studies.

We analyzed fold-change differences between PTC and benign thyroid samples for the 43 selected transcripts to evaluate the potential influence of inter-platform differences on the obtained gene selection. Twenty of them showed more than fourfold increase (log ratio >2) and four transcripts were increased more than twice, whereas the remaining 19 transcripts were decreased. Generally, the consistency between fold-changes observed in subsets from U95 and U133 arrays was good, although for some genes (e.g. the well-known thyroid cancer markers fibronectin 1 (FN1) and MET or novel genes cadherin 16 (CDH16) or gap junction protein β-3 (GJB3)) there were inter-platform differences between the log ratios. However, 40 out of 43 selected genes exhibited more than twofold change in both the U133 and the U95 subsets. For all 43 genes, the PTC–benign difference was larger than the difference between fold-changes obtained with different GeneChip generation subsets. This confirms that the selection performed was robust to inter-array differences.
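The inter-platform check described above amounts to comparing log2 ratios per platform. The sketch below is one possible reading of that check, assuming that "more than twofold change in both subsets" means an absolute log2 ratio above 1 with the same sign on both platforms; the function names are hypothetical, and the four value pairs are the U133 and U95 log ratios listed in Table 1.

```python
def fold_change(log2_ratio):
    """Convert a log2 ratio to a linear fold change (2 -> fourfold)."""
    return 2 ** log2_ratio

def consistent_twofold(lr_u133, lr_u95, min_abs_log2=1.0):
    """True when both platforms agree in direction and exceed a twofold change."""
    same_sign = (lr_u133 > 0) == (lr_u95 > 0)
    return same_sign and abs(lr_u133) > min_abs_log2 and abs(lr_u95) > min_abs_log2

# (log ratio U133, log ratio U95) for four transcripts from Table 1.
genes = {
    "MET":   (1.73, 2.60),
    "CDH16": (-4.68, -1.43),
    "GJB3":  (2.71, 0.62),
    "FBXO9": (-0.50, -0.57),
}
flags = {g: consistent_twofold(a, b) for g, (a, b) in genes.items()}
print(flags)
```

MET and CDH16 differ noticeably between platforms but still pass the twofold threshold on both, whereas GJB3 passes it only on U133.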

Misclassified thyroid samples

The algorithm with bootstrapping allows ranking the samples according to the frequency of their misclassification (Table 2). BBOD showed very frequent misclassifications for two samples. One of them was not properly classified by any gene set selected, and this was sample no. 154 from the U133 dataset no. 1, a small (10 mm in diameter) familial PTC found within a larger follicular adenoma. It was observed in an 18-year-old woman. A year later her mother, 43 years old, was diagnosed with 0.7 cm PTC (follicular variant). The other one, properly classified only in 8% of runs, was a benign follicular adenoma (diagnosed as atypical) from the same dataset (sample no. 97) which was derived from a 15-year-old boy of another family with familial PTC. In this family, there were two PTC cases (mother of the patient, diagnosed with pT2BNxM0 PTC and her aunt who died of a dissemination of PTC) and one follicular thyroid cancer case (pT2bNxM0, 11 years old, sister of the patient). These were the only two cases with a positive family history of thyroid cancer among 49 Polish patients included in the study. Two further samples were properly classified in 65–68% of runs (one from dataset no. 1 and one from dataset no. 3), again one benign adenoma and one PTC, respectively. For the fifth sample, the accuracy was much higher and it was properly classified in 88% of the runs. Thus, only 5 out of 180 samples (2.8%) were misclassified in more than 10% of the runs, while a total of 14 samples (7.8%) were misclassified in more than 1% of the runs. Seventy samples were classified with an excellent accuracy between 99 and 100%, and for a further 96 cases no misclassification occurred during the bootstrapping process.
Table 2

Ranking of thyroid samples by tumor–normal misclassification frequency, assessed by the bootstrap-based outlier detection (BBOD) approach. The BBOD rank and score Q, as defined in Material and methods, are given.

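| Sample number | Status | Array | Set | Rank | Score (%) |
|---|---|---|---|---|---|
| 154 | PTC | U133 | B | 1 | 0.04 |
| 97 | Benign | U133 | A | 2 | 7.23 |
| 148 | PTC | U133 | B | 3 | 65.34 |
| 95 | Benign | U133 | A | 4 | 68.25 |
| 88 | PTC | U95v1 | B | 5 | 88.28 |
| 166 | PTC | U133 | B | 6 | 90.02 |
| 84 | PTC | U95v1 | B | 7 | 93.11 |
| 161 | PTC | U133 | A | 8 | 95.96 |
| 94 | Benign | U133 | B | 9 | 97.26 |
| 116 | Normal | U133 | B | 10 | 97.30 |
| 120 | PTC | U133 | A | 11 | 97.98 |
| 77 | Normal | U95v1 | B | 12 | 98.30 |
| 100 | Benign | U133 | B | 13 | 98.70 |
| 139 | PTC | U133 | A | 14 | 98.91 |
| 90 | PTC | U95v1 | B | 15 | 99.09 |
| 42 | CTN | U95v2 | B | 16 | 99.22 |
| 3 | AFTN | U95v2 | A | 17 | 99.28 |
| 37 | CTN | U95v2 | A | 18 | 99.36 |
| 147 | PTC | U133 | A | 19 | 99.38 |
| 40 | CTN | U95v2 | B | 20 | 99.41 |
| 64 samples (28 PTCs, 36 benign/normal) | | | | 21–84 | 99.46–99.98 |
| 96 samples (19 PTCs, 77 benign/normal) | | | | 85–180 | 100 |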
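
A minimal sketch of how a per-sample score of this kind could be tabulated is given below. It assumes that the score is simply the percentage of bootstrap iterations in which a left-out sample was classified correctly; the exact definition of Q is given in Material and methods and may differ. The sample identifiers and outcomes are hypothetical.

```python
def bbod_scores(outcomes):
    """outcomes maps a sample id to a list of booleans, one per bootstrap
    iteration in which the sample was left out (True = classified correctly).
    Returns the percentage of correct classifications per sample."""
    scores = {}
    for sample_id, results in outcomes.items():
        if results:  # skip samples that were never left out
            scores[sample_id] = 100.0 * sum(results) / len(results)
    return scores

def rank_outliers(scores):
    """Sample ids ordered from most to least frequently misclassified."""
    return sorted(scores, key=lambda sample_id: scores[sample_id])

def count_below(scores, cutoff):
    """Number of samples whose score falls below the cutoff (in percent)."""
    return sum(1 for q in scores.values() if q < cutoff)

# Hypothetical outcomes for three samples over ten iterations each.
toy = {
    "s1": [False] * 10,
    "s2": [True] * 9 + [False],
    "s3": [True] * 10,
}
toy_scores = bbod_scores(toy)
print(rank_outliers(toy_scores), count_below(toy_scores, 90.0))
```

With such a table, statements like "misclassified in more than 10% of the runs" correspond to counting samples with a score below 90%.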

Comparison of classification accuracy by different class prediction methods

To evaluate our method, we compared the accuracy of prediction by different class prediction methods implemented in BRB-Array software. We based the class prediction on all genes that showed the univariate misclassification rate lower than 20%. We found out that the classification accuracy ranged from 89% (compound covariate predictor method) to 99% (SVM), and confirmed the best performance of SVM-based methods to analyze these data (Table 3).
Table 3

Comparison of results obtained by different class prediction methods

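| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|
| Compound covariate predictor | 89 | 85 | 88 | 77 | 93 |
| Nearest centroid | 90 | 86 | 89 | 79 | 93 |
| Linear diagonal discriminant analysis | 92 | 87 | 92 | 83 | 94 |
| One-nearest neighbor | 98 | 94 | 99 | 98 | 97 |
| Three-nearest neighbors | 98 | 93 | 100 | 99 | 97 |
| Support vector machines | 99 | 95 | 99 | 98 | 98 |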

PPV, positive predictive value; NPV, negative predictive value.
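
For readers less familiar with the measures in Table 3, the sketch below shows how they follow from a confusion matrix. The counts are hypothetical; only the class sizes (57 PTCs and 123 benign or normal samples) are taken from the study, and the example does not attempt to reproduce any row of the table.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures, in percent, from confusion-matrix counts.
    tp/fn refer to malignant samples, tn/fp to benign or normal samples."""
    total = tp + fp + tn + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }

# Hypothetical outcome: 54 of 57 PTCs and 122 of 123 non-PTC samples correct.
for name, value in diagnostic_metrics(tp=54, fp=1, tn=122, fn=3).items():
    print(f"{name}: {value:.1f}")
```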

Discussion

Transcripts important for discriminating PTC from benign and normal thyroid samples

In this study, we performed an advanced optimization of putative PTC markers using a large group of benign thyroid lesions and normal thyroid tissues and proposed a list of 43 transcripts, selected by their most frequent appearance in the classifiers. Additional proof of their efficacy was obtained by hierarchical clustering (all samples clustered correctly; data are shown in the web appendix to this article, www.genomika.pl/thyroidcancer). Forty-one of them (95.3%) could be attributed to 39 known genes: 32 well-defined ones and 7 of unknown or not well-defined function. There were 12 genes which had never before been related to the thyroid gland nor mentioned in genomic studies of thyroid cancer, while 29 genes (74%) were identified in previous thyroid microarray studies. However, only ten of them were discussed in the original papers for their putative role in thyroid carcinoma. Among the well-known genes which received high BBFR scores, one should mention the genes encoding FN1 and met proto-oncogene (MET; both scored 4.4×10^4), dipeptidylpeptidase 4 (DPP4), adenosine A1 receptor (ADORA1), keratin 19, and B-cell CLL/lymphoma 2 (BCL2) (Huang, Wasenius, Baris, Chevillard, Finley, Wreesmann, Giordano), all up-regulated with the exception of BCL2. Their inclusion in our classifier positively validates the applied criteria. All these genes except ADORA1 were previously found by single gene studies (see Table 1) and later confirmed by microarray approaches. Moreover, in the recent meta-analysis of the thyroid cancer gene expression profile, MET and FN1 were included among the top 12 candidates for consistent gene expression markers (Griffith).
Similarly, the thyroid-specific (down-regulated) genes deiodinase, iodothyronine, type I and thyroid peroxidase were widely recognized previously for their diagnostic significance both in microarray-based (Eszlinger, Huang, Baris, Cerutti, Finley, Wreesmann) and single gene studies (Arturi, Lazar, De Micco, Czarnocka, Le Fourn, Ambroziak, Arnaldi). Nevertheless, neither our approach nor the meta-analysis mentioned earlier indicated other thyroid-specific genes, confirming the lesser diagnostic potency of sodium iodide symporter, thyroglobulin, thyrotrophin receptor, or thyroid-specific transcription factors, shown to be down-regulated in previous single gene studies (Arturi, Lazar, Shimura, Scouten, Ambroziak, Wagner). The top gene identified by our effort, MPPED2, which is lost in PTC, was not previously considered for its role in PTC, although it was listed by Aldred et al. (2004, in the context of FTC) and by Mazzanti et al. (2004). It is an ancient gene, highly conserved from Caenorhabditis elegans to mammals and expressed in fetal brain. Its function is unknown. Already the first microarray-based analysis of a PTC gene expression profile (Huang) indicated the dominant position of genes controlling cell–matrix adhesion and cell–cell communication. Besides FN1, mentioned earlier, and intercellular adhesion molecule 1 (ICAM-1; Kawai), it seems important to mention syndecan 4 (SDC4), a transmembrane heparan sulfate proteoglycan known to bind FN1 and functioning also as a CXCL12 receptor in signal transduction (Huang, Chevillard, Finley). Loss of CDH16 (kidney-specific cadherin; Thomson) was indicated for the first time in our study. This gene is closely related to cadherin E (CDH1), which is well known to be lost in a subgroup of PTCs with negative prognostic significance (Rocha), while cadherin P (CDH3) is up-regulated in PTC (Jarzab).
Other genes involved in cell adhesion and present in our list comprise ectonucleoside triphosphate diphosphohydrolase 1 (ENTPD1; up-regulated) and less known genes such as NEL-like 2 (up-regulated) and sushi, nidogen, and EGF-like domains 1 (down-regulated), both exhibiting EGF-like repeats (Watanabe). The GJB3 gene (connexin 31) encodes the protein subunit of gap junctions, essential for cell–cell communication. DPP4 (CD26), ICAM1, and ENTPD1 (CD39) may be considered immune-related genes, although their expression is not confined to immune or endothelial cells. ICAM1 was shown to be present in thyroid cancer cells (Kawai). ENTPD1 (ecto-ATPase), in turn, has not been described before for the thyroid gland; its expression was shown in some other organs such as salivary glands or exocrine pancreas (Kittel). It converts adenine nucleotides to adenosine, thus participating in the control of signal transduction. DPP4, another membrane-bound enzyme, which hydrolyzes peptides engaged in paracrine and autocrine regulation, is up-regulated in PTCs at both the RNA and protein level (Huang, Kholova). The contribution of various enzymes to our list is striking: others, not described previously in the context of the thyroid gland, comprise UDP-galactose epimerase (GALE) and glutaminyl-peptide cyclotransferase (QPCT), both with virtually unknown expression patterns. The latter was also indicated by the meta-analysis of Griffith et al. Among the genes encoding enzymes lost in PTC are plasma glutamate carboxypeptidase (Gingras), not mentioned in any thyroid-related study before; carbonic anhydrase 4 (CA4); and even the well-known HGD (encoding homogentisate oxidase), not previously related to the thyroid in any context, although listed in many microarray-based reports (Table 1).
Underexpression of hemoglobin transcripts (HBA1/A2 and HBB, scored at positions 2 and 25 respectively) was already discussed in our papers as a very characteristic feature of the PTC gene expression profile (Jarzab). We believe that the down-regulation of hemoglobin genes could be associated with tumor hypoxia; HBA has also been considered a tumor suppressor, since transduction of this gene into an anaplastic thyroid cancer cell line induces an anti-proliferative effect (Onda). Many of the genes listed in Table 1 participate in signal transduction; among them are MET, ADORA1, and RAB27A, as well as tumor-associated calcium signal transducer 2, inositol 1,4,5-triphosphate receptor, type 1 (ITPR1), and ryanodine receptor 1, all up-regulated in PTC except for ITPR1. Some enzymes mentioned above (DPP4, ENTPD1, and QPCT) contribute to the synthesis or breakdown of signaling molecules. On the other hand, the list also includes many genes participating in transcription regulation, among them high-mobility group AT-hook 2, aryl hydrocarbon receptor, retinoid X receptor, γ, ID3, nuclear receptor-interacting protein 1, and RUNX1. Both of these functional classes are typical for cancer genes. We noted only one gene clearly related to apoptosis (and lost in PTC), the well-known BCL2. Interestingly, some immunohistochemical studies report its up-regulation in PTC (Aksoy). Although the selected genes were obtained by analysis of PTC, many of them may also be found in other types of thyroid tumors (M Oczko-Wojciechowska, J Starzyński, M Jarząb, Z Wygoda, A Czarniecka, G Gala, M Kalemba, E Gubala & B Jarząb, unpublished data). This is convincingly illustrated by the overlap between the results of our analysis and one of the studies which dealt with follicular thyroid tumors only (Barden).

Accuracy of discriminating PTC from benign/normal thyroid tissue

Our study is the first to define the classification accuracy for thyroid cancer by 95% CIs and one of the few dealing with the problem of diagnostic accuracy of microarray-derived classifiers (Kerr & Churchill 2001). Although the estimation of CIs by Monte Carlo analysis has not yet gained general acceptance, it is necessary to stress the very good accuracy of PTC diagnosis in our study, with the lower limit of the CI at 95%, obtained using a sufficiently large study group mimicking the real clinical setting. From a clinical point of view, an even higher accuracy is required for a PTC classifier, as the risk of diagnosing PTC in a thyroid nodule is only about 5% (Hegedus 2004). Our results stress the importance of multi-gene approaches for the molecular diagnosis of cancer. We observed that the lower limits of the accuracy CIs decreased for classification by gene sets with fewer than ten genes. The initial conclusion from these data is that any combination of more than five to ten genes increases the reliability of distinguishing between malignant and benign tissue samples. This result is similar to that obtained by Hua, who demonstrated on simulated and real breast cancer data that, for different classifiers, sets with fewer than five features were usually much less effective than larger classifiers. A recent paper reports a six-gene molecular classifier, efficient for molecular diagnosis of thyroid cancer (Kebebew).
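
The remark on the 5% prior risk can be made concrete with Bayes' rule. The sketch below is an illustration only: it plugs the sensitivity and specificity of the SVM row of Table 3 (95% and 99%) into the standard formula for the positive predictive value and compares the proportion of PTCs in the study group (57 of 180) with an assumed prevalence of 5% among thyroid nodules.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule; all arguments are fractions."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)

study_ppv = ppv_at_prevalence(0.95, 0.99, 57 / 180)   # prevalence in the dataset
clinic_ppv = ppv_at_prevalence(0.95, 0.99, 0.05)      # assumed clinical prevalence
print(f"PPV in study group: {study_ppv:.3f}")
print(f"PPV at 5% prevalence: {clinic_ppv:.3f}")
```

Under these assumptions the predictive value of a positive result drops from about 0.98 to about 0.83, which is why a classifier intended for unselected nodules needs a higher specificity than one evaluated on a balanced collection.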

Bootstrap-based multi-gene classification of PTC microarray data

Selection of genes is an important goal of microarray studies, contributing to a broader understanding of the cancer transcriptome as well as yielding novel molecular cancer markers. Such studies have been successfully performed in PTC, and large numbers of discriminating, physiologically relevant genes were proposed (Huang, Wasenius, Aldred, Chevillard, Finley, Wreesmann, Baris, Detours, Giordano). However, in the majority of these studies, the selection of important genes was based on either fold-change or significance criteria obtained using classical statistical tests. These approaches favor either genes with large amplitudes, sometimes coming from a minor proportion of samples, or genes with low within-group variance, thus rather stably expressed in all analyzed tumor samples. Bearing in mind the complexity of molecular changes in tumors, the widespread skepticism about a single ‘cancer marker’, and possible differences in histological subtypes or other features of PTC, we decided to use SVM, a routine machine-learning approach, to construct classifiers based on multiple features of the analyzed objects. This method allows integrating the information carried by many genes in the gene sets. Thus, effective molecular multi-gene classifiers may be built that rely on inter-gene interactions rather than on combining single ‘best markers’. SVMs have been confirmed as an effective method of multi-gene set selection, and this is supported by our comparison with other class prediction methods. Our procedure helps us to optimize the list of markers to be implemented in real-time quantitative PCR-supported fine needle biopsy (Lubitz & Fahey 2006). From the diagnostic point of view, the major drawback of SVM-based methods is the fluctuation of gene content between classifiers of different size or based on slightly different training sets.
To overcome this problem, we extended the original algorithm with bootstrap iterations, as recommended (Braga-Neto & Dougherty 2004). A bootstrap iteration consists of creating a temporary learning set (bootstrap sample) by sampling from the original set with replacement. Then, the classification rule is derived from the bootstrap sample and applied to the rest of the original set. Multiple selections of slightly different training sets represent the variability which may be observed between different thyroid cancer collections, laboratories, etc. Indeed, our current data generated using the bootstrap technique show much better agreement with the results of other thyroid cancer studies (Oczko-Wojciechowska et al. submitted) than data created by leave-one-out cross-validation of the whole dataset (Jarzab). Originally, in a bootstrap iteration one counts only the number of misclassifications. Since in all bootstrap iterations every step of data processing (gene selection and classifier training) has to be repeated (Simon), some additional knowledge can be gained. The procedure we used enables ranking of the genes which are most often present in the classifiers obtained from the different subsets of the training set (BBFR). Furthermore, it estimates the accuracy with appropriate CIs. Moreover, it allows ranking the samples according to the frequency of misclassifications (BBOD). The use of BBFR resulted in the delineation of genes which were either novel or not recognized before for their contribution to the PTC gene expression profile, even if they were included in the large gene lists given in previous genomic studies. BBOD allowed us to reveal ‘difficult’ samples in the analyzed group. The two thyroid samples with the poorest accuracy of diagnosis were derived from patients with familial thyroid tumors, which suggests that their gene expression profiles may differ from sporadic ones.
Overall, in 175 out of 180 cases (>97%) the percentage of correct diagnoses was >90%. Recently, Zhang et al. have published an SVM-based recursive method of gene selection. This method, called R-SVM, differs from the standard RFE algorithm used here in the modified criteria applied in the elimination steps. Moreover, the final gene subset is created on the basis of any resampling method used at the validation stage, which is similar to the approach presented here. Nevertheless, our bootstrap-based method allows detecting outlier samples and provides an estimate of the CIs for the classification accuracy, which is much more informative than the accuracy estimator alone.
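
The bookkeeping of the bootstrap procedure described in this section can be illustrated with a small, dependency-free sketch. Several simplifications are assumptions made for the example and not the method of the study: SVM with RFE is replaced by a stand-in gene ranking (absolute difference of class means) and a nearest-centroid classifier, the gene score is a plain count of appearances rather than the BBFR score, the CI is a simple percentile interval, and the data are synthetic. What the sketch does preserve is the key point that gene selection and classifier training are repeated inside every iteration and evaluated on the left-out samples.

```python
import random
from collections import Counter

def top_k_genes(X, y, k):
    """Stand-in for SVM-RFE: rank genes by |difference of class means|."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    def gap(g):
        return abs(sum(r[g] for r in pos) / len(pos) - sum(r[g] for r in neg) / len(neg))
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]

def predict(X_train, y_train, genes, x):
    """Stand-in classifier: nearest class centroid on the selected genes."""
    dist = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
        dist[label] = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
    return min(dist, key=dist.get)

def bootstrap_rank(X, y, k=2, n_iter=200, seed=0):
    """Returns (genes ordered by selection count, 95% percentile CI of accuracy)."""
    rng = random.Random(seed)
    n = len(X)
    gene_counts, accuracies = Counter(), []
    for _ in range(n_iter):
        drawn = [rng.randrange(n) for _ in range(n)]        # sampling with replacement
        in_bag = set(drawn)
        left_out = [i for i in range(n) if i not in in_bag]
        X_boot, y_boot = [X[i] for i in drawn], [y[i] for i in drawn]
        if not left_out or len(set(y_boot)) < 2:
            continue                                         # unusable resample
        genes = top_k_genes(X_boot, y_boot, k)               # selection repeated each time
        gene_counts.update(genes)
        hits = sum(predict(X_boot, y_boot, genes, X[i]) == y[i] for i in left_out)
        accuracies.append(hits / len(left_out))
    accuracies.sort()
    lower = accuracies[int(0.025 * (len(accuracies) - 1))]
    upper = accuracies[int(0.975 * (len(accuracies) - 1))]
    return gene_counts.most_common(), (lower, upper)

def make_toy_data(n_per_class=20, n_genes=6, seed=1):
    """Synthetic data: genes 0 and 1 separate the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_genes)]
            if label == 1:
                row[0] += 6.0
                row[1] -= 6.0
            X.append(row)
            y.append(label)
    return X, y

X, y = make_toy_data()
ranking, ci = bootstrap_rank(X, y)
print(ranking, ci)
```

Collecting the per-sample outcomes on the left-out set in the same loop would give the sample ranking as well; it is omitted here to keep the example short.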

PTC and normal/benign difference versus inter-platform difference

To ensure a sufficient number of tissue samples, it was necessary to combine data obtained using different generations of GeneChips, which cannot be compared by a direct approach (Eszlinger). The use of multi-gene classifiers, however, allows overcoming this difficulty. We showed earlier that the classifier selected using the U133 platform (Jarzab) performs well on U95-obtained data and has high classification accuracy (Eszlinger). In the present paper, we demonstrate that it is possible, after correctly matching genes from two different generations of microarrays, to derive an efficient multi-gene classifier. When we included both benign and malignant samples from both platforms, the vast majority of these samples were properly classified. Using Affymetrix GeneChips, Barden and Finley had previously reported 20 of the 43 genes now confirmed by us as diagnostically relevant for PTC. This is a level of agreement rarely noted for inter-group comparisons of microarray results. Our analysis was performed on microarray data pre-processed by the standard MAS5 algorithm. Although many authors demonstrate the superiority of other pre-processing methods (e.g. RMA or GC-RMA; Irizarry), for inter-platform comparisons the MAS5 method still seems to be a reasonable approach. In the MAS5 algorithm, each array is processed independently, so the bootstrap procedure does not have to involve this step. Use of RMA pre-processing, which has to operate on the whole dataset, would pose the question of whether this step should also be bootstrapped. Presently, this is not feasible due to the huge computational demand of pre-processing for large sample sets.
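
The difference between per-array and whole-dataset pre-processing can be shown on a toy example. The two functions below are simplified stand-ins, not the actual algorithms: per-array scaling to a target mean represents the independent treatment of each array in MAS5 (which scales a trimmed mean), and quantile normalization represents the step of RMA that depends on all arrays in the set. The intensities are hypothetical.

```python
def scale_array(values, target=500.0):
    """Per-array scaling: the result depends only on this one array."""
    mean = sum(values) / len(values)
    return [v * target / mean for v in values]

def quantile_normalize(arrays):
    """Whole-dataset step: each value is replaced by the mean, across all
    arrays, of the values holding the same rank."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        normalized = [0.0] * n
        for rank, i in enumerate(order):
            normalized[i] = rank_means[rank]
        result.append(normalized)
    return result

a = [100.0, 200.0, 300.0]
b = [10.0, 20.0, 60.0]
c = [1000.0, 3000.0, 2000.0]

print(scale_array(a))                  # same whatever else is in the dataset
print(quantile_normalize([a, b])[0])   # array a normalized together with b
print(quantile_normalize([a, c])[0])   # array a normalized together with c
```

Because the normalized values of one array change with the composition of the set, a resampling scheme that is meant to cover every data-processing step would have to repeat such a step for each bootstrap sample, whereas per-array processing can be done once beforehand.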

Redundancy of multi-gene cancer classifiers

The redundancy of multi-gene classifiers is inherently linked to the huge differences in gene expression profiles between tumors originating from the same tissue. This was indicated for the first time by Ein-Dor et al. in breast cancer. These authors re-analyzed the data of van't Veer and showed that multiple similar classifiers may be obtained; they have a classification potency comparable to van't Veer's original 70-gene classifier but a different gene content. Ein-Dor et al. also stressed that even slight differences in the training set composition influenced the selected genes. Our analysis demonstrates that similar redundancy is present in PTC. This fact is frequently overlooked by authors interpreting the results of gene expression profile studies that involved only a few genes or were obtained in small groups of patients. In this paper, we propose a method of ranking genes according to their importance in multi-gene classifiers, with appropriate CIs indicating the robustness of the result. To conclude, the primary goal of this study was to validate a novel SVM-based approach to the differentiation of PTC from benign thyroid lesions. This goal was achieved with a very satisfactory degree of accuracy, over 95%. Simultaneously, we were able to rank the genes most essential for the molecular diagnosis of PTC. Although the presented list of genes can be enlarged, we believe the first 40 genes are especially suitable for further prospective studies in fine needle biopsy material and may serve to construct multi-gene classifiers with potential application in the clinical setting. The comparison with other published microarray studies yields sufficient validation for the vast majority of them.
  111 in total

1.  Bootstrapping cluster analysis: assessing the reliability of conclusions from microarray experiments.

Authors:  M K Kerr; G A Churchill
Journal:  Proc Natl Acad Sci U S A       Date:  2001-07-24       Impact factor: 11.205

2.  Complementary DNA expression array analysis suggests a lower expression of signal transduction proteins and receptors in cold and hot thyroid nodules.

Authors:  M Eszlinger; K Krohn; R Paschke
Journal:  J Clin Endocrinol Metab       Date:  2001-10       Impact factor: 5.958

3.  Overexpression of proteins HMGA1 induces cell cycle deregulation and apoptosis in normal rat thyroid cells.

Authors:  M Fedele; G M Pierantoni; M T Berlingieri; S Battista; G Baldassarre; N Munshi; M Dentice; D Thanos; M Santoro; G Viglietto; A Fusco
Journal:  Cancer Res       Date:  2001-06-01       Impact factor: 12.701

4.  Identifying differentially expressed genes associated with metastasis of follicular thyroid cancer by cDNA expression array.

Authors:  K T Chen; J D Lin; T C Chao; C Hsueh; C A Chang; H F Weng; E C Chan
Journal:  Thyroid       Date:  2001-01       Impact factor: 6.568

5.  Transcriptional activation of the thyroglobulin promoter directing suicide gene expression by thyroid transcription factor-1 in thyroid cancer cells.

Authors:  H Shimura; H Suzuki; A Miyazaki; F Furuya; K Ohta; K Haraguchi; T Endo; T Onaya
Journal:  Cancer Res       Date:  2001-05-01       Impact factor: 12.701

Review 6.  Met protein and hepatocyte growth factor (HGF) in papillary carcinoma of the thyroid: evidence for a pathogenetic role in tumourigenesis.

Authors:  L P Ruco; A Stoppacciaro; F Ballarini; M Prat; S Scarpino
Journal:  J Pathol       Date:  2001-05       Impact factor: 7.996

7.  Gene expression in papillary thyroid carcinoma reveals highly consistent profiles.

Authors:  Y Huang; M Prasad; W J Lemon; H Hampel; F A Wright; K Kornacker; V LiVolsi; W Frankel; R T Kloos; C Eng; N S Pellegata; A de la Chapelle
Journal:  Proc Natl Acad Sci U S A       Date:  2001-12-18       Impact factor: 11.205

8.  Over-expression of hepatocyte growth factor/scatter factor (HGF/SF) and the HGF/SF receptor (cMET) are associated with a high risk of metastasis and recurrence for children and young adults with papillary thyroid carcinoma.

Authors:  R Ramirez; D Hsu; A Patel; C Fenton; C Dinauer; R M Tuttle; G L Francis
Journal:  Clin Endocrinol (Oxf)       Date:  2000-11       Impact factor: 3.478

9.  Immunostaining for Met/HGF receptor may be useful to identify malignancies in thyroid lesions classified suspicious at fine-needle aspiration biopsy.

Authors:  A Ippolito; V Vella; G L La Rosa; G Pellegriti; R Vigneri; A Belfiore
Journal:  Thyroid       Date:  2001-08       Impact factor: 6.568

10.  Is there loss or qualitative changes in the expression of thyroid peroxidase protein in thyroid epithelial cancer?

Authors:  B Czarnocka; D Pastuszko; M Janota-Bzowski; A P Weetman; P F Watson; E H Kemp; R S McIntosh; M S Asghar; B Jarzab; E Gubala; J Wloch; D Lange
Journal:  Br J Cancer       Date:  2001-09-14       Impact factor: 7.640

