Analysis of pathway activity in primary tumors and NCI60 cell lines using gene expression profiling data.

Xing-Dong Feng, Shu-Guang Huang, Jian-Yong Shou, Bi-Rong Liao, Jonathan M Yingling, Xiang Ye, Xi Lin, Lawrence M Gelbert, Eric W Su, Jude E Onyia, Shu-Yu Li.

Abstract

To determine cancer pathway activities in nine types of primary tumors and NCI60 cell lines, we applied an in silico approach, using gene expression data to examine gene signatures that reflect pathway activation. Supervised learning approaches predicted that the Ras pathway is active in approximately 70% of lung adenocarcinomas but inactive in most squamous cell carcinomas, pulmonary carcinoids, and small cell lung carcinomas. In contrast, the TGF-beta, TNF-alpha, Src, Myc, E2F3, and beta-catenin pathways are inactive in lung adenocarcinomas. We predicted an active Ras, Myc, Src, and/or E2F3 pathway in significant percentages of breast cancers, colorectal carcinomas, and gliomas. Our results also suggest that Ras may be the most prevalent oncogenic pathway. Additionally, many NCI60 cell lines exhibited a gene signature indicative of an active Ras, Myc, and/or Src, but not E2F3, beta-catenin, TNF-alpha, or TGF-beta pathway. To our knowledge, this is the first comprehensive survey of cancer pathway activities in nine major tumor types and the most widely used NCI60 cell lines. The "gene expression pathway signatures" we have defined could facilitate the understanding of molecular mechanisms in cancer development and guide the selection of appropriate cell lines for cancer research and pharmaceutical compound screening.

Year:  2007        PMID: 17572360      PMCID: PMC5054081          DOI: 10.1016/S1672-0229(07)60010-2

Source DB:  PubMed          Journal:  Genomics Proteomics Bioinformatics        ISSN: 1672-0229            Impact factor:   7.691


Introduction

Cancer is a genetic disease driven by mutations in three types of genes: oncogenes, tumor suppressors, and genome stability genes involved in DNA repair and mitotic processes. It has been estimated that three to seven mutations are required for the development of cancers. At the molecular level, these mutations drive the neoplastic process through deregulation of cellular pathways and biological processes that control cell fate, growth, differentiation, and survival. Mutations of oncogenes and tumor suppressors increase tumor cell number by stimulating cell proliferation and inhibiting differentiation and apoptosis pathways. For example, the activation of Ras proteins by mutations of the ras oncogene recruits the Raf kinase, which subsequently activates the transcription factors Fos and Jun through MAP kinase signaling pathways. Fos and Jun in turn form AP1 and up-regulate growth-promoting genes. Mutations of different oncogenes or tumor suppressors have been associated with different cancer types, suggesting that specific pathways may be responsible for the development of specific cancers. Therefore, determining pathway activities in cancers is critical not only for understanding molecular mechanisms in tumor progression but also for designing targeted therapeutic strategies.
Cell lines derived from primary tumor tissues have provided a valuable tool for understanding cancer biology at the molecular level. Much of what we know today about fundamental processes in cancer cells has depended on the use of cell lines. In addition, since cancer cell lines provide an unlimited source of malignant cells, they are widely used in screening for anti-cancer drugs. However, because cells cultured in vitro lack the overall tissue architecture and relevant microenvironment, and cells continuously maintained in culture may lose the attributes of the tumors from which they are derived, the value of cancer cell lines is limited by the extent to which they represent the origin and activities of the primary tumors. Several approaches have been utilized to characterize cancer cell lines. The ability of cell lines to form tumors when transplanted subcutaneously into nude mice allows a direct comparison of histopathology between the tumors formed in nude mice and the human tumors of origin. Efforts have been made to delineate morphological features of cell lines in comparison with the archival tumor tissues from which the cell lines were derived (6, 7). At the molecular level, expression of key proteins such as HER2/neu and p53 in breast and non-small cell lung cancer cell lines, as well as in their corresponding tumors, has been assessed using immunohistochemistry (6, 7). Previously, we carried out a direct comparison between NCI60 cell lines and nine primary tumor types using gene expression profiling data generated from more than 500 primary tumor samples. Our computational analysis suggested that 51 of the 59 NCI60 cell lines represent their presumed tumors of origin. These cell lines were also classified into tumor subtypes or different stages in cancer development. However, it remains unclear which pathways are activated in each of these cell lines. Therefore, further analysis of pathway activation status in cancer cell lines could guide the selection of cell lines as appropriate models for studying cancer pathways and for target-based drug screening.
DNA microarray technology has created a new paradigm for understanding cancer biology by simultaneously measuring the expression of tens of thousands of genes in malignant or normal cells. Gene expression profiles have been utilized to identify gene signatures that are associated with tumor progression and alterations in cancer pathways. Recently, gene expression signatures have been identified that reflect the activities of five oncogenic pathways, namely Ras, Myc, Src, E2F3, and β-catenin. These signatures, derived from primary cell cultures, have been validated in transgenic animal models and are correlated with sensitivity to therapeutic agents targeting specific pathways. Here we exploited the gene expression signatures for these five oncogenic pathways and for two receptor-mediated signaling pathways, namely transforming growth factor (TGF)-β and tumor necrosis factor (TNF)-α, to predict pathway activity in nine major types of primary cancers and in NCI60 cell lines. Supervised learning-based prediction suggested that different pathways are involved in the development of different tumor types. Moreover, our assessment of pathway activation status in NCI60 cells highlights the value of specific cell lines in studying these pathways and their roles in oncogenesis.

Results

Developing gene signatures representing active pathways and building supervised models for classification

We used gene expression profiles generated in primary mammary epithelial cell cultures to derive signatures for the activated Ras, Myc, Src, E2F3, or β-catenin pathways. The training dataset includes two groups: cells transfected with adenovirus expressing green fluorescent protein (GFP) and cells transfected with adenovirus expressing one of the oncogenes. Gene signatures for the activated TGF-β or TNF-α pathways were identified using gene expression profiles of the non-small cell lung cancer cell line Calu6 treated with TGF-β or TNF-α, or with the vehicle control (Yingling and Ye, unpublished results). Two criteria were considered in our selection of signature gene sets for the pathways. First, several candidate signatures were determined that would give rise to a minimal cross validation error rate. Second, from the multiple signature gene sets that satisfied a threshold of cross validation error rates, we selected the one with the smallest number of genes. As a result, there is limited overlap between the gene signatures for different pathways. Unlike the previous study on the five oncogenic pathways, in which the authors built gene classifiers that overlap between different pathways, we believe our approach has generated signature gene sets that are more specific for each pathway and may provide more accurate predictions. Lists of genes selected for subsequent principal component analysis (PCA) and classification are provided in Supporting Online Material (Table S1). Many of these genes are known downstream targets of the respective pathways.
To predict which pathways are active in each of the primary tumor samples and NCI60 cell lines, we used supervised learning approaches (Figure 1). After gene features were selected from the training dataset, supervised predictors were built using a support vector machine (SVM) algorithm. Parameters were adjusted in model building to ensure minimal leave-one-out cross validation (LOOCV) error rates. Table S2 illustrates an example of this process for the Ras pathway. Analysis of variance was carried out to identify genes differentially expressed between the two groups in the training dataset, that is, cells transfected with adenovirus expressing GFP or the activated H-Ras. Figure 2A clearly depicts a completely opposite expression pattern of these genes in the control group and in the group with a constitutively active Ras pathway. Data reduction using PCA and the subsequent building of classification models were then carried out. Multiple models were evaluated using different numbers of principal components, different SVM kernel functions, and different cost parameters. Based on the criteria described in Materials and Methods, we chose three principal components as the discriminants, the sigmoid kernel function, and a cost parameter of 8, which gave rise to the optimal error rate in LOOCV. Supervised models for other pathways were also built and tested using the same approach (data not shown).
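The rule for choosing the number of principal components (stated in Materials and Methods) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' SAS/R code: the function name `choose_num_components` and the layout of `results` are hypothetical, the toy numbers are made up, and applying the error threshold to the model with p components only is one possible reading of the criteria.

```python
def choose_num_components(results, max_error=0.05):
    """Return the smallest p meeting both selection criteria, or None.

    results: dict mapping p (number of principal components) to a tuple
             (loocv_error_rate, predictions), where predictions is a tuple
             of class labels for the test samples (hypothetical layout).
    """
    for p in sorted(results):
        # Criterion 2 needs models built with p, p+1 and p+2 components.
        if p + 1 not in results or p + 2 not in results:
            continue
        error, predictions = results[p]
        # Criterion 1: LOOCV error rate below the threshold.
        if error >= max_error:
            continue
        # Criterion 2: the three consecutive models agree on every sample.
        if predictions == results[p + 1][1] == results[p + 2][1]:
            return p
    return None


# Made-up values for illustration; they are not taken from the paper.
toy_results = {
    1: (0.10, ("active", "inactive", "inactive")),
    2: (0.00, ("active", "inactive", "active")),
    3: (0.00, ("active", "active", "inactive")),
    4: (0.00, ("active", "active", "inactive")),
    5: (0.00, ("active", "active", "inactive")),
}
print(choose_num_components(toy_results))
```

With these toy values, p = 1 fails the error threshold and p = 2 fails the consistency check, so the function returns 3.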
Fig. 1

Feature classification using supervised learning methods. PCA: principal component analysis; LOOCV: leave-one-out cross validation.

Fig. 2

Classification of primary lung cancers and NCI60 cell lines with respect to active vs. inactive Ras pathways. A. A 30-gene signature developed from the training dataset for the Ras pathway. Red and blue represent high and low levels of expression respectively. The y-axis represents the 30 genes and the x-axis represents two groups in the training dataset, that is, cells transfected with adenovirus expressing the activated H-ras or GFP as a control. B. Gene expression patterns of the signature genes in 186 lung cancers and 59 NCI60 cell lines with an activated or inactive Ras pathway.

Classification of primary cancers

We first attempted to classify lung cancers into an active vs. inactive status for each pathway. The testing gene expression profiling data were previously published and comprise 186 primary lung cancer samples, including 139 adenocarcinomas, 21 squamous cell lung carcinomas, 20 pulmonary carcinoids, and 6 small cell lung cancers (Table 1). Our prediction results (Table 2) suggest that the Ras pathway is activated in almost 70% of lung adenocarcinoma patients, but is inactive in most squamous cell carcinomas, pulmonary carcinoids, and small cell lung carcinomas. In contrast, the Src, Myc, E2F3, β-catenin, TGF-β, and TNF-α pathways are inactive in almost all of the lung adenocarcinomas. Figure 2B is a graphic illustration of gene expression patterns in lung cancers with an active or inactive Ras pathway. It is noteworthy that the differential expression of these signature genes in active vs. inactive primary tumors (Figure 2B) is smaller in magnitude than that observed in the primary cell cultures (Figure 2A), raising the possibility that subtle changes in the pathways may be sufficient to trigger tumorigenesis. An alternative explanation is that tumor biopsy samples often contain a mixture of tumor cells and non-tumor cell types, so that gene expression patterns in tumors are mixed with noise from non-tumor cells. Significant numbers of pulmonary carcinoid and small cell lung cancer samples exhibited a gene signature representing an active E2F3 pathway (Table 2). Our prediction of the activity status of the Ras pathway in lung adenocarcinomas and squamous cell lung carcinomas using the dataset from Bhattacharjee et al. is consistent with the results reported by Bild and colleagues based on a different cohort of patients.
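The dilution argument can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the measured expression of a signature gene is a linear mixture of its level in tumor cells and in non-tumor cells, weighted by the tumor cell fraction. The linear model, the function name `observed_expression`, and the numbers are assumptions and do not come from the study.

```python
def observed_expression(tumor_level, normal_level, tumor_fraction):
    """Expression measured in a biopsy under a simple linear mixing assumption."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must lie between 0 and 1")
    return tumor_fraction * tumor_level + (1.0 - tumor_fraction) * normal_level


# A standardized signal of 2.0 in pure tumor cells vs. 0.0 in non-tumor cells
# (made-up values) shrinks as the tumor cell fraction drops.
for fraction in (1.0, 0.75, 0.5):
    print(fraction, observed_expression(2.0, 0.0, fraction))
```

Under this assumption, a biopsy with half tumor cells shows only half of the difference seen in a pure culture, which is one way the attenuated pattern in Figure 2B could arise.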
Table 1

Gene expression profiling datasets on NCI60 cell lines and primary tumors analyzed in this study

Cancer type | Sample size | Data format | URL for data downloading | Ref.
NCI60 cell lines |  | MAS5 | http://dtp.nci.nih.gov/mtargets/madownload.html |
Lung | 186 | MAS5 | http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi | 10
Prostate | 52 | MAS5 | http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi | 25
Leukemia | 72 | MAS5 | http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi | 26
CNS | 50 | MAS5 | http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi | 27
Melanoma | 29 | MAS5 | http://www.mskcc.org/genomic/ccsmsp/ | 28
Breast | 171 | MAS5 | http://data.cgt.duke.edu/oncogene.php | 9
Ovary | 146 | MAS5 | http://data.cgt.duke.edu/oncogene.php | 9
Colon | 23 | MAS4 | http://www.gnf.org/cancer/epican/ | 29
Kidney | 11 | MAS4 | http://www.gnf.org/cancer/epican/ | 29
Table 2

Pathway activity in lung cancers*

Pathway | Adenocarcinoma | Squamous cell carcinoma | Pulmonary carcinoid | Small cell lung cancer
Ras | 0.683 (95/139) | 0.14 (3/21) | 0 (0/20) | 0.17 (1/6)
Myc | 0.029 (4/139) | 0 (0/21) | 0 (0/20) | 0.17 (1/6)
Src | 0.029 (4/139) | 0 (0/21) | 0 (0/20) | 0 (0/6)
E2F3 | 0.022 (3/139) | 0 (0/21) | 0.40 (8/20) | 0.50 (3/6)
β-catenin | 0 (0/139) | 0 (0/21) | 0 (0/20) | 0 (0/6)
TGF-β | 0 (0/139) | 0 (0/21) | 0 (0/20) | 0 (0/6)
TNF-α | 0.065 (9/139) | 0 (0/21) | 0 (0/20) | 0 (0/6)

The percentages of patients with predicted active pathways are shown. The numbers in parentheses are the numbers of patients with active pathways vs. the total numbers of patient samples in each subtype of lung cancers. Bolded numbers indicate a significant percentage (> 20% for sample size ≥ 20) of samples exhibiting a gene signature of active pathways.
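As a small worked check of the footnote's rule, the following Python sketch recomputes the fractions from the counts in Table 2 and lists the cells that exceed 20% with a sample size of at least 20. The counts are copied from the table; the variable and function names (`TABLE2_COUNTS`, `significant_cells`) are illustrative choices, not part of the original analysis.

```python
SUBTYPES = ["Adenocarcinoma", "Squamous cell carcinoma",
            "Pulmonary carcinoid", "Small cell lung cancer"]

# (active, total) counts per subtype, copied from Table 2.
TABLE2_COUNTS = {
    "Ras":       [(95, 139), (3, 21), (0, 20), (1, 6)],
    "Myc":       [(4, 139), (0, 21), (0, 20), (1, 6)],
    "Src":       [(4, 139), (0, 21), (0, 20), (0, 6)],
    "E2F3":      [(3, 139), (0, 21), (8, 20), (3, 6)],
    "β-catenin": [(0, 139), (0, 21), (0, 20), (0, 6)],
    "TGF-β":     [(0, 139), (0, 21), (0, 20), (0, 6)],
    "TNF-α":     [(9, 139), (0, 21), (0, 20), (0, 6)],
}


def significant_cells(counts, min_fraction=0.20, min_samples=20):
    """List (pathway, subtype, fraction) cells passing the footnote's rule."""
    flagged = []
    for pathway, row in counts.items():
        for subtype, (active, total) in zip(SUBTYPES, row):
            fraction = active / total
            if total >= min_samples and fraction > min_fraction:
                flagged.append((pathway, subtype, round(fraction, 3)))
    return flagged


for cell in significant_cells(TABLE2_COUNTS):
    print(cell)
```

Only Ras in adenocarcinomas and E2F3 in pulmonary carcinoids pass both conditions; the E2F3 fraction in small cell lung cancers is 0.50 but rests on only 6 samples.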

We next carried out classification of other tumor types with publicly available oligonucleotide microarray data (Table 1). The results (Table 3) suggest that Ras may be the most prevalent oncogenic pathway, since gene expression in a significant fraction of tumor samples in each cancer type is indicative of an active Ras pathway according to our computational prediction. Different cancer types, however, behave differently with respect to the activities of other pathways. For example, while Ras is the only active pathway in lung adenocarcinomas, we predicted an active Ras, Myc, Src, and E2F3 pathway in 73%, 70%, 21%, and 30% of breast cancer patients, respectively. Upon further investigation, 74%, 67%, and 69% of the Myc, Src, and E2F3 active samples, respectively, also have an active Ras pathway, suggesting that multiple oncogenic pathways may coordinately promote breast cancer progression in these patients. The observation of multiple and overlapping activated pathways in breast tumors reflects the heterogeneous nature of cancer. An active status in multiple pathways has also been predicted in brain, colon, kidney, and ovarian cancers. In contrast, Ras is the only active pathway in leukemia, melanoma, and prostate cancers, which is similar to what was observed in lung adenocarcinomas. Except for ovarian cancers, the β-catenin, TGF-β, and TNF-α pathways are inactive in almost all of the primary tumors. Collectively, these results substantiate the notion that different pathways may play critical roles in the development of different cancer types.
Table 3

Pathway activity in other primary cancers*

Pathway | Breast cancer | CNS cancer | Colon cancer | Kidney cancer | Leukemia | Melanoma | Ovarian cancer | Prostate cancer
Ras | 0.73 (125/171) | 0.44 (22/50) | 0.43 (10/23) | 0.72 (8/11) | 0.36 (26/72) | 0.31 (9/29) | 0.48 (70/146) | 0.50 (26/52)
Myc | 0.70 (120/171) | 0.42 (21/50) | 0.35 (8/23) | 0.091 (1/11) | 0.055 (4/72) | 0.14 (4/29) | 0.22 (32/146) | 0.19 (10/52)
Src | 0.21 (36/171) | 0.56 (28/50) | 0.91 (21/23) | 0.91 (10/11) | 0.014 (1/72) | 0.069 (2/29) | 0 (0/146) | 0.019 (1/52)
E2F3 | 0.30 (51/171) | 0.12 (6/50) | 0.043 (1/23) | 0 (0/11) | 0.17 (12/72) | 0.17 (5/29) | 0.28 (41/146) | 0.17 (9/52)
β-catenin | 0 (0/171) | 0 (0/50) | 0 (0/23) | 0 (0/11) | 0 (0/72) | 0 (0/29) | 0 (0/146) | 0 (0/52)
TGF-β | 0 (0/171) | 0 (0/50) | 0 (0/23) | 0 (0/11) | 0 (0/72) | 0 (0/29) | 0.82 (120/146) | 0 (0/52)
TNF-α | 0.058 (10/171) | 0.04 (2/50) | 0.087 (2/23) | 0 (0/11) | 0.055 (4/72) | 0.10 (3/29) | 0 (0/146) | 0.038 (2/52)

The percentages of patients with predicted active pathways are shown. The numbers in parentheses are the numbers of patients with active pathways vs. the total numbers of patient samples in each cancer type. Bolded numbers indicate a significant percentage (> 20%) of samples exhibiting a gene signature of active pathways.
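The co-activation figures quoted in the text for breast cancer (the share of Myc-, Src-, or E2F3-active samples that are also Ras-active) follow from per-sample pathway calls, which Table 3 alone does not provide. A minimal Python sketch of that calculation is given below; the sample identifiers and calls are made-up toy data, and `coactivation_fraction` is a hypothetical helper name.

```python
def coactivation_fraction(calls, pathway, partner="Ras"):
    """Fraction of samples active for `pathway` that are also active for `partner`.

    calls: dict mapping a sample id to the set of pathways predicted active.
    Returns None when no sample is active for `pathway`.
    """
    active = [pathways for pathways in calls.values() if pathway in pathways]
    if not active:
        return None
    return sum(partner in pathways for pathways in active) / len(active)


# Toy per-sample calls for illustration only; not data from the study.
toy_calls = {
    "s1": {"Ras", "Myc"},
    "s2": {"Myc"},
    "s3": {"Ras", "Myc", "Src"},
    "s4": {"Ras"},
    "s5": set(),
}
print(coactivation_fraction(toy_calls, "Myc"))
print(coactivation_fraction(toy_calls, "Src"))
```

In the toy data, two of the three Myc-active samples are also Ras-active, and the single Src-active sample is Ras-active as well.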

Classification of NCI60 cell lines

The NCI60 panel comprises the most commonly used cancer cell lines in cancer research and drug screening. In order to evaluate them as models for primary tumors, we estimated pathway activities in NCI60 cell lines using the supervised learning-based classification. Listed in Table 4 are the cell lines with predicted active pathways. Although these results await further experimental validation, they could guide the selection of specific cell lines to study specific pathways in cancer cells. Even though most of the NCI60 cell lines were suggested to represent their corresponding tumor origin, our in silico analysis suggests that distinct pathways are active in each of these cell lines. For example, except for NCI/ADR-RES, all of the breast cell lines in the NCI60 panel have global gene expression profiles more similar to those of primary breast cancers than to those of other tumor types, but their expression patterns for pathway-specific gene signatures are different. BT-549, MDA-MB-231, and HS578T exhibited an active expression signature for the Ras pathway, and MCF7 is the only breast line that we predicted to have an active Src pathway (Table 4). Interestingly, many cell lines are active in only one or two pathways. This is not unexpected given the homogeneity of the cultured cells due to clonal selection.
Table 4

NCI60 cell lines with predicted active pathways*

Tumor type | Ras | Myc | Src | E2F3 | TNF-α
Breast | BT-549, MDA-MB-231, HS578T | MDA-MB-435, BT-549, NCI/ADR-RES | MCF7 | - | MDA-MB-231, HS578T
CNS | - | SF-268 | - | - | SF-268
Colon | HT-29, COLO205, HCT-15, KM12, HCT-116 | COLO205, KM12, HCT-116, SW-620 | KM12, HCC-2998 | HCT-15 | -
Kidney | 786-0 | - | RXF-393, 786-0 | - | -
Leukemia | - | RPMI-8226, CCRF-CEM, K-562, MOLT-4, HL-60 | RPMI-8226, SR, K-562, HL-60 | CCRF-CEM, MOLT-4 | -
Lung | NCI-H460, NCI-H23, NCI-H522, HOP-92 | NCI-H460, EKVX, NCI-H522 | NCI-H23 | EKVX, NCI-H522 | -
Melanoma | LOX IMVI, UACC-257, SK-MEL-28 | LOX IMVI, UACC-62, SK-MEL-2, SK-MEL-5 | LOX IMVI, UACC-62, UACC-257, SK-MEL-5 | - | -
Ovary | OVCAR-5 | IGROV1, OVCAR-4, OVCAR-8 | IGROV1, OVCAR-8 | - | -
Prostate | PC3, DU-145 | PC3 | PC3 | - | -

The β-catenin and TGF-β pathways were predicted to be inactive in all of the cell lines and thus are omitted in the table.

Discussion

Although genome-wide expression profiling has become a mainstay in cancer research, it remains a challenge to extract biological insight from gene expression data. In a typical experiment, individual genes are identified according to their differential expression between the control group and the experimental group, followed by mapping of these genes to biological pathways. However, it has been demonstrated that a biological pathway can play a significant role in physiological processes even when each gene in the pathway exhibits only subtle expression changes in response to external perturbations, because collectively these changes exert a significant impact on the cells. Several algorithms have been proposed to analyze expression data focusing on pathways rather than on individual genes (12, 13). However, before we consider applying these methods to cancer microarray data, two issues need to be addressed. First, it has been a common practice to measure pathway activity by analyzing the expression of genes involved in signal transduction. We believe this approach is problematic in studying signaling pathways in cancers. Activation of those pathways often involves post-translational modification of the signaling proteins and does not depend on increased expression of the genes encoding those proteins. A more sensitive and robust approach would be to interrogate downstream genes, that is, gene expression changes that reflect pathway activation. Second, the frequently used computational methods for pathway analysis compare gene expression patterns between a control group (such as normal tissues) and an experimental group (such as cancerous cells). Given the variability between individuals and the limited sample sizes typical of human studies, it can be difficult to distinguish true differences from noise. In this study, we developed a strategy to overcome these shortcomings in the current methodology. Gene signatures for the seven pathways were developed from experimental data. Alterations in signature gene expression are associated with pathway activation and can be used as a direct "readout" of it. Furthermore, we applied supervised learning methods to predict pathway status in individual samples, which should provide more accurate and sensible results.
Computational analysis requires laboratory experimentation to validate the results. Some of our predictions are already supported by experimental data reported in the literature. Our analysis shows that 68% of lung adenocarcinomas exhibited a gene expression signature of an active Ras pathway (Table 2). This is consistent with the finding that PCR-based methods have detected ras mutations in non-small cell lung cancers at frequencies that may exceed 50%. We predicted an active Src pathway in the majority (21 of 23, 91%) of colorectal carcinoma samples (Table 3), which is supported by studies that described overexpression of c-src and deregulation of the Src pathway in more than 70% of human colon cancers. Previously, gene amplification was examined in glioblastomas using array-based comparative genomic hybridization, and Myc amplification was detected in 42% of the samples. This again is consistent with our in silico pathway analysis, which indicates that 42% of the gliomas have an active Myc pathway. Gene expression patterns in 70% of breast cancer samples represent an active Myc pathway (Table 3). This is not surprising, since immunohistochemistry has detected overexpression of c-Myc proteins in 45% of 440 primary breast carcinomas. Even though some oncogenes are not mutated or amplified in certain cancer types, it is still possible that the oncogenic pathways are active in these cancers through other mechanisms.
For example, we report here that half of the prostate cancers may have an activated Ras pathway, yet it has been well documented in the literature that ras mutations are rare in prostate cancers (18, 19, 20). However, in a very recent study, Ras downstream MAP kinase activity in prostate cancers was investigated using immunohistochemistry for p44/ERK1 and p42/ERK2, and active MAPK signaling was detected in 51% of the analyzed tumors, which is strikingly similar to our predictions. The high and low frequencies of an activated Ras pathway that we report for lung adenocarcinomas and squamous cell lung carcinomas, respectively, are in agreement with recent results that were also based on computational prediction but used gene expression data generated from a different cohort of patients. Taken together, this evidence strongly supports our approach of examining pathway activity using gene expression profiling data.
We also recognize the limitations of our study. First, gene signatures were developed from an in vitro system in which the pathways were experimentally activated. The differential expression of the signature genes is therefore artificially augmented. While the control group and the experimental group in the training dataset can be clearly separated into two classes, there is significantly greater variability of pathway activity in primary tumors. Therefore, our assignment of a pathway in cancers to either the inactive or the active status is rather arbitrary. Second, gene signatures were derived from data on primary mammary epithelial cells or on the non-small cell lung cancer cell line Calu6. However, downstream genes regulated by these pathways could be cell type specific. As a result, using a gene signature identified in one cell type to predict pathway activity in other cell types may cause a high rate of false negatives. Third, although mutations occur primarily in tumor cells, some pathways play a pivotal role in non-tumor cells by providing a microenvironment that promotes tumor progression and angiogenesis. For example, an activated TGF-β pathway creates a favorable microenvironment for tumor growth and invasion. The effects of TGF-β pathway activation are exerted mainly in the tumor microenvironment rather than in tumor cells. Consequently, the importance of the TGF-β pathway in cancer development should not be underestimated even though it is in an inactive state in primary tumor cells. Fourth, one of our main goals is to predict pathway activation status in NCI60 cell lines. An inactive pathway in a cell line, however, only indicates a low baseline activity and does not necessarily exclude the cell line as an ideal model for studying the pathway. In fact, TGF-β target genes in Calu6 cells are expressed at minimal levels but are robustly up-regulated in response to the TGF-β ligand. Accordingly, if a cell line has intact signaling components of a pathway and responds to ligand stimulation, it should still be considered a good model system even if the basal pathway activity is minimal. Finally, gene expression profiles of cells cultured in vitro may not reflect gene expression when the cells are grown in vivo, as evidenced by a recent study showing that although two glioblastoma cell lines (U251 and U87) have disparate gene expression profiles when grown in monolayer cell cultures, they share similar gene expression patterns when grown as intracerebral xenografts in nude mice. Therefore, the next step in evaluating cell lines would be to use gene expression profiles of cell lines grown in xenograft models when such data become available. Nevertheless, we believe that as more gene expression profiling studies are carried out, gene signatures for more pathways can be developed in multiple cell types. Our computational approach to predicting pathway activities provides a valuable tool that can be generally applied to studying biological pathways under normal and pathological conditions.

Materials and Methods

Data source

The gene expression profiling data on NCI60 cell lines provided by NCI's DTP program (http://dtp.nci.nih.gov/mtargets/madownload.html) are based on the Affymetrix U95Av2 oligonucleotide array platform. While oligonucleotide arrays measure the amount of mRNA in a single sample, gene expression data generated using cDNA array platforms are ratios of expression values in experimental samples over those in a reference sample. This fundamental difference between the two array platforms poses a technical barrier to integrative analysis of gene expression data across them. Therefore, we chose only Affymetrix oligonucleotide array-based data from publicly available gene expression profiling databases on primary tumors (Table 1). Gene expression data on NCI60 cell lines and primary tumor samples were downloaded from the URLs shown in Table 1. Gene expression data of lung, prostate, central nervous system (CNS) cancers, and leukemia were originally generated with Affymetrix MAS4 software. The breast and ovarian cancer datasets were in gcRMA format. We downloaded the .cel files and analyzed them using the Affymetrix MAS5 algorithm with trimmed mean values normalized to 500. A trimmed mean is the average value after removing the lowest 2% and the highest 2% of all expression values. The downloaded array data for NCI60 cell lines and melanomas were in MAS5 format, and we re-normalized the data by setting the trimmed means to 500. Data were only available for 59 of the NCI60 cell lines. For colon and kidney cancers, we were only able to obtain MAS4 gene expression data; these data were similarly normalized to a trimmed mean of 500. We compiled the gene expression data for a total of 799 samples after averaging the expression values over the technical replicates in the lung dataset and in NCI60 cell lines. Expression of each probe set was standardized to a mean of 0 and a standard deviation of 1. The standardization procedure was performed for the training dataset and the testing dataset separately.
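The two normalization steps described above (scaling each array to a trimmed mean of 500, then standardizing each probe set) can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' SAS/R code: the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def trimmed_mean(values, trim=0.02):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of values dropped at each end
    return mean(ordered[k:len(ordered) - k])


def scale_to_trimmed_mean(values, target=500.0, trim=0.02):
    """Rescale one array so that its trimmed mean equals `target`."""
    factor = target / trimmed_mean(values, trim)
    return [v * factor for v in values]


def standardize(values):
    """Center one probe set to mean 0 and scale it to standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)  # population SD: an assumption, not stated in the text
    return [(v - mu) / sd for v in values]


# Toy array of 100 expression values (made up for illustration).
array_values = [float(v) for v in range(1, 101)]
scaled = scale_to_trimmed_mean(array_values)
print(trimmed_mean(scaled))
```

In keeping with the text, `standardize` would be applied to the training and testing datasets separately, one probe set at a time across the samples of each dataset.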

Feature selection and classification

All statistical analyses were implemented using the SAS and R statistical languages. Pair-wise t-tests were used to identify genes differentially expressed between the control group and the active pathway group in the training dataset. Probe sets were ranked by p-values. Multiple p-values were tested as thresholds to select gene features for subsequent PCA and classification. For each pathway, we chose a cutoff p-value that gave rise to minimal cross validation error rates in classification (see below). SVM was used as the classification method. We performed PCA after gene features were selected. The number of principal components p and the cost parameter, which corresponds to the constant of the regularization term in the Lagrange formulation, were determined based on the LOOCV error rate. LOOCV is a procedure in which a classifier is trained on the training dataset after one object is removed and is then tested on the removed object. The procedure is repeated for each object in the dataset, and the proportion of errors counted throughout the process is called the LOOCV error rate. The value p and the cost parameter were chosen to be the smallest ones that satisfy two criteria: (1) the LOOCV error rate of the classifier is smaller than 0.05; and (2) the three consecutive classifiers built on the features with p, p+1, and p+2 principal components give consistent predictions. The four most commonly used SVM kernel functions were tested: linear, polynomial with degree 3, radial basis, and neural network. Similarly, we chose the kernel function that minimizes the LOOCV error rate for the analysis of each pathway.
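The ranking and LOOCV steps described above can be sketched in plain Python. This is a hedged illustration, not the authors' pipeline: genes are ranked by the absolute pooled-variance t statistic (which orders genes the same way as two-sided p-values when the group sizes are fixed), a nearest-centroid classifier stands in for the PCA plus SVM model, and all function names and toy values are hypothetical.

```python
from statistics import mean, variance


def pooled_t(control, active):
    """Two-sample t statistic with pooled variance for one gene."""
    n1, n2 = len(control), len(active)
    pooled = ((n1 - 1) * variance(control) + (n2 - 1) * variance(active)) / (n1 + n2 - 2)
    return (mean(active) - mean(control)) / (pooled * (1 / n1 + 1 / n2)) ** 0.5


def rank_genes(control_rows, active_rows):
    """Gene indices ordered from most to least differentially expressed."""
    scores = []
    for gene, (c, a) in enumerate(zip(zip(*control_rows), zip(*active_rows))):
        scores.append((abs(pooled_t(c, a)), gene))
    return [gene for _, gene in sorted(scores, reverse=True)]


def fit_centroids(xs, ys):
    """Stand-in classifier (not an SVM): one mean vector per class."""
    centroids = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Assign x to the class with the closest centroid."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def loocv_error_rate(xs, ys, fit, predict):
    """Train on all samples but one, test on the held-out one, for each sample."""
    errors = 0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors += predict(model, xs[i]) != ys[i]
    return errors / len(xs)


# Toy training data (rows are samples, columns are genes); values are made up.
control = [[1.0, 5.0, 2.0], [1.2, 5.5, 2.5], [0.8, 4.5, 1.5]]
active = [[3.0, 5.2, 2.6], [3.2, 4.8, 3.4], [2.8, 5.0, 2.8]]
top = rank_genes(control, active)[:2]
xs = [[row[g] for g in top] for row in control + active]
ys = ["control"] * len(control) + ["active"] * len(active)
print(top, loocv_error_rate(xs, ys, fit_centroids, predict_centroid))
```

For simplicity, the sketch ranks genes once on the full toy training set before running LOOCV; in a real analysis, the stand-in classifier would be replaced by the PCA and SVM steps the text describes.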

Authors’ contributions

SGH, JYS, BRL, and SYL designed the study. XDF and SYL carried out data analysis. JMY, XY, XL, and LMG generated microarray data for the TGF-β and TNF-α pathways. XDF, SGH, JYS, JEO, and SYL interpreted the results. XDF and SYL drafted the manuscript. SGH, JYS, BRL, XY, LMG, EWS, and JEO revised the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors have declared that no competing interests exist.
  27 in total

Review 1.  HeLa cells 50 years on: the good, the bad and the ugly.

Authors:  John R Masters
Journal:  Nat Rev Cancer       Date:  2002-04       Impact factor: 60.716

2.  Influence of in vivo growth on human glioma cell line gene expression: convergent profiles under orthotopic conditions.

Authors:  Kevin Camphausen; Benjamin Purow; Mary Sproull; Tamalee Scott; Tomoko Ozawa; Dennis F Deen; Philip J Tofilon
Journal:  Proc Natl Acad Sci U S A       Date:  2005-05-31       Impact factor: 11.205

3.  Protein expression and molecular analysis of c-myc gene in primary breast carcinomas using immunohistochemistry and differential polymerase chain reaction.

Authors:  Rakesh Naidu; Norhanom Abdul Wahab; Manmohan Yadav; Methil Kannan Kutty
Journal:  Int J Mol Med       Date:  2002-02       Impact factor: 4.101

4.  ras gene mutations in human prostate cancer.

Authors:  B S Carter; J I Epstein; W B Isaacs
Journal:  Cancer Res       Date:  1990-11-01       Impact factor: 12.701

5.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

6.  Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses.

Authors:  A Bhattacharjee; W G Richards; J Staunton; C Li; S Monti; P Vasa; C Ladd; J Beheshti; R Bueno; M Gillette; M Loda; G Weber; E J Mark; E S Lander; W Wong; B E Johnson; T R Golub; D J Sugarbaker; M Meyerson
Journal:  Proc Natl Acad Sci U S A       Date:  2001-11-13       Impact factor: 11.205

7.  Molecular classification of human carcinomas by use of gene expression signatures.

Authors:  A I Su; J B Welsh; L M Sapinoso; S G Kern; P Dimitrov; H Lapp; P G Schultz; S M Powell; C A Moskaluk; H F Frierson; G M Hampton
Journal:  Cancer Res       Date:  2001-10-15       Impact factor: 12.701

8.  Analysis of K-ras gene mutations in malignant and nonmalignant endobronchial tissue obtained by fiberoptic bronchoscopy.

Authors:  N C Clements; M A Nelson; J A Wymer; C Savage; M Aguirre; H Garewal
Journal:  Am J Respir Crit Care Med       Date:  1995-10       Impact factor: 21.405

9.  Gene expression correlates of clinical prostate cancer behavior.

Authors:  Dinesh Singh; Phillip G Febbo; Kenneth Ross; Donald G Jackson; Judith Manola; Christine Ladd; Pablo Tamayo; Andrew A Renshaw; Anthony V D'Amico; Jerome P Richie; Eric S Lander; Massimo Loda; Philip W Kantoff; Todd R Golub; William R Sellers
Journal:  Cancer Cell       Date:  2002-03       Impact factor: 31.743

10.  Comparative analysis and integrative classification of NCI60 cell lines and primary tumors using gene expression profiling data.

Authors:  Huixia Wang; Shuguang Huang; Jianyong Shou; Eric W Su; Jude E Onyia; Birong Liao; Shuyu Li
Journal:  BMC Genomics       Date:  2006-07-03       Impact factor: 3.969
