Literature DB >> 17498291

Genome-wide identification of novel expression signatures reveal distinct patterns and prevalence of binding motifs for p53, nuclear factor-kappaB and other signal transcription factors in head and neck squamous cell carcinoma.

Bin Yan1, Xinping Yang, Tin-Lap Lee, Jay Friedman, Jun Tang, Carter Van Waes, Zhong Chen.   

Abstract

BACKGROUND: Differentially expressed gene profiles have previously been observed among pathologically defined cancers by microarray technologies, including head and neck squamous cell carcinomas (HNSCCs). However, the molecular expression signatures and transcriptional regulatory controls that underlie the heterogeneity in HNSCCs are not well defined.
RESULTS: Genome-wide cDNA microarray profiling of ten HNSCC cell lines revealed novel gene expression signatures that distinguished cancer cell subsets associated with p53 status. Three major clusters of over-expressed genes (A to C) were defined through hierarchical clustering, Gene Ontology, and statistical modeling. The promoters of genes in these clusters exhibited different patterns and prevalence of transcription factor binding sites for p53, nuclear factor-kappaB (NF-kappaB), activator protein (AP)-1, signal transducer and activator of transcription (STAT)3 and early growth response (EGR)1, as compared with the frequency in vertebrate promoters. Cluster A genes involved in chromatin structure and function exhibited enrichment for p53 and decreased AP-1 binding sites, whereas clusters B and C, containing cytokine and antiapoptotic genes, exhibited a significant increase in prevalence of NF-kappaB binding sites. An increase in STAT3 and EGR1 binding sites was distributed among the over-expressed clusters. Novel regulatory modules containing p53 or NF-kappaB concomitant with other transcription factor binding motifs were identified, and experimental data supported the predicted transcriptional regulation and binding activity.
CONCLUSION: The transcription factors p53, NF-kappaB, and AP-1 may be important determinants of the heterogeneous pattern of gene expression, whereas STAT3 and EGR1 may broadly enhance gene expression in HNSCCs. Defining these novel gene signatures and regulatory mechanisms will be important for establishing new molecular classifications and subtyping, which in turn will promote development of targeted therapeutics for HNSCC.

Entities:  

Mesh:

Substances:

Year:  2007        PMID: 17498291      PMCID: PMC1929156          DOI: 10.1186/gb-2007-8-5-r78

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


Background

Numerous basic and clinical studies suggest that development and malignant progression of cancer is rarely due to a defect in a single gene or pathway. Multiple genetic alterations accumulate during carcinogenesis, potentially leading to aberrant activation or suppression of multiple pathways and downstream genes that have important functions in determining the malignant phenotypes of cancer. Microarray technology has enabled us to study global gene expression profiles of cancers and identify gene programs or 'signatures' that are critical to the heterogeneous characteristics and malignant phenotypes of cancers, even of the same pathologic type [1-3]. In head and neck squamous cell carcinomas (HNSCCs), gene expression profiling has been used in attempts to identify biomarkers for diagnosis [4], differential sensitivity to chemotherapy [5], risk for recurrence [6], survival [7], malignant phenotype [8], and metastasis [9]. Although considerable variability in the composition of gene signatures was observed in these studies, they provided evidence for subsets within HNSCCs, which are possibly due to differences in molecular pathogenesis that affect malignant potential. However, the transcriptional regulatory mechanisms that control the heterogeneous and shared patterns of gene expression profiles observed, and their relationship to malignant phenotypes, are not well defined. The transcriptional regulation of gene expression is mainly dependent on the composition of transcription factor binding site (TFBSs), and complex interactions among transcription factors and regulatory proteins that bind to gene promoters [10]. In murine and human squamous cell carcinoma (SCC), we and others have identified transcription factors that are inactivated or mutated (for instance, the tumor suppressor p53), or are constitutively activated (such as nuclear factor-κB [NF-κB], activator protein [AP]-1, signal transducer and activator of transcription [STAT]-3, and early growth response [EGR]1). These transcription factors have been independently implicated as tumor suppressor or oncogenic transcription factors that regulate the expression of individual genes related to phenotypic characteristics that are important in cancer development. Among these transcription factors, p53 has been implicated as a master regulator of genomic stability, cell cycle, apoptosis, and DNA repair [11,12]. Mutation or silencing of the p53 gene is an important molecular event in tumorigenesis, which has been associated with nearly 50% incidence among all cancers [13-15], including HNSCC [16-20]. NF-κB is a nuclear transcription factor that is activated in HNSCCs and other cancers. We and others have shown that constitutive activation of NF-κB1/RelA is among the important factors that control expression of genes that regulate cellular proliferation, apoptosis, angiogenesis, immune and proinflammatory responses, and therapeutic resistance in HNSCCs [21-26] and other cancers [27-29]. AP-1, STAT3, and EGR1 are considered important transcription factors that are involved in regulating gene expression in human cancers, including HNSCCs. Constitutive activation of AP-1 and STAT3 appear to be important factors for tumor cell proliferation, survival, and angiogenesis in vitro or in vivo [21,24,29-34]. EGR1 is a zinc-finger transcription factor that is rapidly and transiently induced in response to a number of stimuli, including growth factors, cytokines, and mechanical stresses [35,36]. The study of regulatory controls involving multiple transcription factors for clustered gene expression obtained from microarray data meets with many experimental challenges. In a previous study of step-wise progression of murine SCCs, we combined gene expression profiling data with a bioinformatic analysis of promoter TFBSs and ontology limited to NF-κB regulated genes, and provided evidence that this transcription factor is one of the critical regulatory determinants of expression of multiple genes and malignant phenotype [37,38]. However, this approach involving analysis of a single pathway and this TFBS appears far from providing a complete explanation for the heterogeneity and multiplicity of genes expressed in clusters in HNSCCs and other cancers, or associated differences in phenotypic and biologic behavior observed. Identification of common TFBSs in gene clusters through in silico analysis can provide a framework for further elucidating the network and complex interactions of regulatory mechanisms that are involved in gene expression in cancer [39,40]. In the present study, microarray combined with computational prediction was utilized to define gene expression patterns and putative TFBSs for genes that are differentially expressed among ten HNSCC cell lines and nonmalignant keratinocytes. The differentially expressed microarray profiles classified subsets of HNSCC cells related to differences in p53 genotype, protein expression, and unique gene signatures. The potential relationship of novel gene expression signatures and prevalence of TFBSs for p53, NF-κB, AP-1, STAT3, and EGR1 were identified, and novel transcription regulatory modules for specific gene clusters were predicted. The predicted results were then validated by real-time reverse transcription (RT)-polymerase chain reaction (PCR) and chromatin immunoprecipitation (ChIP) assay. Our study suggests that integration of genome-wide microarray profiling and computational analyses is a powerful way to identify gene signatures as determinants for cancer heterogenicity and malignant phenotypes, and their underlying regulatory control mechanisms.

Results

Identification of novel gene clusters in University of Michigan SCC cells with different p53 status by cDNA microarray expression profiling

cDNA microarray analysis was performed using a panel of ten HNSCC cell line series from the University of Michigan (UM-SCC), derived from eight patients with aggressive HNSCC (survival <2 years), and representing a distribution of different anatomic sites (Table 1). Many of the molecular alterations and biologic characteristics of these UM-SCC cell lines have been confirmed to reflect those identified in HNSCC tumors from patients in laboratory and clinical studies. These include the roles of activation of epidermal growth factor receptor, IL-1, and IL-6 signal transduction pathways; altered activation of transcription factors p53, NF-κB, AP-1, and STAT3; expression of cytokines and other genes; and variation in radiation and chemosensitivity [5,21-26,30-33,41-43]. The p53 mutation and expression status of UM-SCC cells lines were evaluated using bidirectional genomic sequencing of exons 4 to 9 (Figure 1a), and confirmed with immunocytochemistry using monoclonal antibody to p53 (DO-1 clone) (Figure 1b). No mutation was detected in those exons in four cell lines, namely UM-SCC 1, 6, 9, and 11A. Mutation of p53 was detected in five cell lines, namely UM-SCC 5, 22A, 22B, 38, and 46 (Figure 1a). A mutation was also detected in UM-SCC 11B cells, but immunocytochemistry of p53 protein suggested there might be a mixed population of UM-SCC 11B cells with a heterogeneous expression pattern for nuclear p53 protein (Figure 1b). The findings regarding p53 mutation in UM-SCC 1, 5, 6, 11B, and 46 cells are consistent with a previous report by Bradford and coworkers [41].
Table 1

Tumor, treatment, and outcome characteristics of patients providing human SCC cell lines

Cell lineAge (years at diagnosis)SexStageTNMPrimary siteSpecimen sitePrior therapyStatusSurvival (months)
UM-SCC 172MIT1N0M0FOMLocal recurRDWOD15
UM-SCC 559MIIIT2N1M0Supraglottic larynxPri bxSDOD8
UM-SCC 637MIIT2N0M0TonguePri bxNLTF
UM-SCC 972FIIT2N0M0Tonsil/BOTLocal recurRDOD15
UM-SCC 11A65MVT2N2aM0HypopharynxPri bxNDOD14
UM-SCC 11BPri resectC
UM-SCC 22A59FIIIT2N1M0HypopharynxPri bxNDOD10
UM-SCC 22BLN metN
UM-SCC 3860MIVT2N2aM0Tonsil/BOTPriNDOD11
UM-SCC 4657FIIINo TMN GivenSuprglottic larynxLocal recurR, SDOD6

The clinical information was kindly provided by Drs Thomas E Carey and Carol R Bradford, and some information was previously presented in the literature. 'Primary sites' refers to the origin of the primary tumor. 'Specimen site' refers to origin of tissue used to establish cultures. 'Prior therapy' refers to therapy given before the specimen used for culture was obtained. 'Survival' represents time in months from diagnosis to last follow up. BOT, base of tongue; bx, biopsy; C, chemotherapy; DOD, died with disease; DWOD, died without disease; F, female; FOM, floor of mouth; LN, lymph nodes; LTF, lost to follow-up; M, male; met, metastasis; N, none; NED, no evidence of disease; Pri, primary tumor site; R, radiation; recur, recurrence; resect, surgical resection specimen; SCC, squamous cell carcinoma; S, surgery; TNM, tumor-node-metastasis (staging system); UM-SCC, University of Michigan series head and neck squamous cell carcinoma.

Figure 1

p53 genotype and protein expression in UM-SCC cell lines. (a) The p53 genotype of ten University of Michigan series head and neck squamous cell carcinoma (UM-SCC) cell lines was analyzed by two-directional sequencing of four to nine exons. (b) Immunohistochemistry for p53 was performed on the UM-SCC cell lines using anti-p53 monoclonal antibody (DO-1, clone), and the panels were segregated according to minimal or weaker staining pattern typical for wild-type p53 (upper panels, except UM-SCC 11B) and strong nuclear staining typical for mutant p53 status of cells (lower panels). The cells stained with the isotype control primary antibody as negative control are presented in the small pictures located at the lower right corner of each image. The pictures were taken at a magnification of 100×.

Tumor, treatment, and outcome characteristics of patients providing human SCC cell lines The clinical information was kindly provided by Drs Thomas E Carey and Carol R Bradford, and some information was previously presented in the literature. 'Primary sites' refers to the origin of the primary tumor. 'Specimen site' refers to origin of tissue used to establish cultures. 'Prior therapy' refers to therapy given before the specimen used for culture was obtained. 'Survival' represents time in months from diagnosis to last follow up. BOT, base of tongue; bx, biopsy; C, chemotherapy; DOD, died with disease; DWOD, died without disease; F, female; FOM, floor of mouth; LN, lymph nodes; LTF, lost to follow-up; M, male; met, metastasis; N, none; NED, no evidence of disease; Pri, primary tumor site; R, radiation; recur, recurrence; resect, surgical resection specimen; SCC, squamous cell carcinoma; S, surgery; TNM, tumor-node-metastasis (staging system); UM-SCC, University of Michigan series head and neck squamous cell carcinoma. p53 genotype and protein expression in UM-SCC cell lines. (a) The p53 genotype of ten University of Michigan series head and neck squamous cell carcinoma (UM-SCC) cell lines was analyzed by two-directional sequencing of four to nine exons. (b) Immunohistochemistry for p53 was performed on the UM-SCC cell lines using anti-p53 monoclonal antibody (DO-1, clone), and the panels were segregated according to minimal or weaker staining pattern typical for wild-type p53 (upper panels, except UM-SCC 11B) and strong nuclear staining typical for mutant p53 status of cells (lower panels). The cells stained with the isotype control primary antibody as negative control are presented in the small pictures located at the lower right corner of each image. The pictures were taken at a magnification of 100×. Gene expression profiles were determined using a 24,000 element cDNA microarray by comparing 10 UM-SCC cell lines with four cultured primary human keratinocyte (HKC) lines as normal controls. The expression of 9,273 of 12,270 evaluable known genes was submitted for principal components analysis and hierarchical clustering [44]. Both methods grouped UM-SCC 11B together with its parental cell line UM-SCC 11A, as well as the other UM-SCC cells with wild-type p53, and these findings were statistically significant (P < 0.001, class prediction analysis; BRB-Array Tools [45]). Based on mixed p53 protein staining and wild-type p53 associated gene expression pattern, we classified UM-SCC 11B cells with the wild-type p53 group as having a 'wild-type p53-like' expression pattern. Next, we studied a total of 1,011 genes that exhibited twofold or greater differences in gene expression when comparing HKCs with all UM-SCC, or comparing UM-SCC cells with either wild-type p53-like (UM-SCC cell lines 1, 6, 9, 11A, and 11B) or mutant p53 expression patterns (UM-SCC cell lines 5, 22A, 22B, 38, and 46; Figure 2a). The 1,011 genes, including 371 over-expressed and 640 under-expressed genes, were subjected to hierarchical clustering, as shown in Figure 2a. The expression profile of 1,011 genes clustered all samples into three groups, namely HKCs, UM-SCC wild-type p53-like, and UM-SCC with mutant p53 (Figure 2a). Six major clusters (A to F) of differentially expressed genes were identified, with most over-expressed genes included in two distinct clusters, A and B, and three subclusters within cluster C (subclusters C1, C2, and C3) on the top portion of the expression tree (Figure 2a).
Figure 2

Hierarchical clustering analysis of differentially expressed genes in UM-SCC cells. A total of 1,011 differentially expressed genes was extracted from 24,000 cDNA microarray database, based on twofold and greater difference among human normal kerintinocytes (HKCs), UM-SCC cells with wild-type p53-like expression pattern, mutant p53 or wild-type + mutant p53 status (t-test score at P < 0.05, two-tailed). The hierarchical clustering tree was generated using Java Treeview [107]. Four HKCs were grouped on the left, and five UM-SCC cell lines with wild-type p53-like expression pattern were grouped together in the middle, and five UM-SCC cell lines with mutant p53 were grouped to the right, respectively. Over-expressed genes are indicated by red and under-expressed genes by green; and the expression level is proportional to the brightness of the color (see color bar). (a) Entire hierarchical clustering tree included three upregulated clusters (A, B and C [including subclusters C1 to C3]) and three downregulated clusters (D, E and F). (b) Cluster A consisted of 34 genes. (c) Cluster B consisted of 37 genes. mt, mutant; wt, wild-type.

Hierarchical clustering analysis of differentially expressed genes in UM-SCC cells. A total of 1,011 differentially expressed genes was extracted from 24,000 cDNA microarray database, based on twofold and greater difference among human normal kerintinocytes (HKCs), UM-SCC cells with wild-type p53-like expression pattern, mutant p53 or wild-type + mutant p53 status (t-test score at P < 0.05, two-tailed). The hierarchical clustering tree was generated using Java Treeview [107]. Four HKCs were grouped on the left, and five UM-SCC cell lines with wild-type p53-like expression pattern were grouped together in the middle, and five UM-SCC cell lines with mutant p53 were grouped to the right, respectively. Over-expressed genes are indicated by red and under-expressed genes by green; and the expression level is proportional to the brightness of the color (see color bar). (a) Entire hierarchical clustering tree included three upregulated clusters (A, B and C [including subclusters C1 to C3]) and three downregulated clusters (D, E and F). (b) Cluster A consisted of 34 genes. (c) Cluster B consisted of 37 genes. mt, mutant; wt, wild-type. The unique gene signatures of clusters A and B consisted of 34 and 37 genes (Figure 2b,c and Table 2), respectively. We used the mixed model based F-test to examine the statistical difference of gene expression within clusters among HKCs and UM-SCC cell lines with different expression patterns. Within both cluster A and cluster B genes, a significant difference in gene expression (probability, Pr [F] < 0.001) was observed when comparing the two groups of UM-SCC cells. A significant difference was also observed when comparing HKCs with UM-SCC cells with mutant p53 in cluster A, and comparing UM-SCC with the wild-type p53-like expression pattern in cluster B. Thus, cluster A genes were over-expressed in UM-SCC cells with mutant p53 (Figure 2b), whereas cluster B genes were over-expressed in UM-SCC cells with wild-type p53-like expression pattern (Figure 2c).
Table 2

Putative transcription factor binding sites of clusters A and B over-expressed in HNSCC

Gene nameGene descriptionRefSeqOrthlogaNumber of TFBSs predictedbFunctional annotationc

p53NF-κBAP-1STAT3EGR1
Cluster A
ABCC5ATP-binding cassette, subfamily C (CFTR/MRP), member 5NM_005688hmra1 (hmr)Transport
ARID1AAT rich interactive domain 1a (SWI-like)NM_006015hmr1 (hmr)5 (hmr)Regulation of metabolism
ARTS-1Type 1 TNF receptor shedding aminopeptidase regulatorNM_016442hmr1111Catabolism
ATP1B3ATPase, Na+/K+ transporting, beta 3 polypeptideNM_001679hmr37Transport
BLKB lymphoid tyrosine kinaseNM_001715hmr1Signal transduction
CDC42EP4cdc42 effector protein 4; binder of Rho GTPases 4NM_012121h122Regulation of cell shape
CDH18Cadherin 18, type 2NM_004934hCell adhesion
CDKN2CCyclin-dependent kinase inhibitor 2C (p18)NM_078626hmr31Cell proliferation; cell cycle
CKBCreatine kinase, brainNM_001823hmr38Creatine kinase activity
CPS1Carbamoyl-phosphate synthetase 1, mitochondrialNM_001875hmr2 (hmr)11 (hmr)Amino acid metabolism
FZD1Frizzled homolog 1 (Drosophila)NM_003505h14Signal transduction
HARSLHistidyl-tRNA synthetase-likeNM_012208hm211Amino acid metabolism
HBE1Hemoglobin, epsilon 1NM_005330hr11Transport
HIST1H2ACH2A histone family, member LNM_003512h1Chromosome organization and biogenesis
HIST1H2AMH2A histone family, member NNM_003514h3Chromosome organization and biogenesis
HIST1H2BCH2B histone family, member LNM_003526h1Chromosome organization and biogenesis
HIST1H2BDH2B histone family, member BNM_138720hr11Chromosome organization and biogenesis
HIST1H2BJH2B histone family, member RNM_021058h11Chromosome organization and biogenesis
HIST1H2BLH2B histone family, member CNM_003519h1Chromosome organization and biogenesis
HIST1H2BNH2B histone family, member DNM_003520h22Chromosome organization and biogenesis
HIST2H2BEH2B histone family, member QNM_003528h3111Chromosome organization and biogenesis
IGFBP2Insulin-like growth factor binding protein 2 (36 kDa)NM_000597hmr1 (hr)4 (hmr)Regulation of cell growth
LGALS3BPLectin, galactoside-binding, soluble, 3 binding proteinNM_005567hmr12 (hmr)Cell adhesion
MATN2Matrilin 2NM_002380h1Extracellular matrix assembly
MYST3MYST histone acetyltransferase (monocytic leukemia) 3NM_006766hmr5 (hmr)6 (hmr)DNA packaging
OLFM1Olfactomedin 1NM_014279hmr1124 (hmr)Morphogenesis
PRODHProline oxidase homologNM_016335hAmino acid metabolism
SLC9A3R1Solute carrier family 9, isoform 3 regulatory factor 1NM_004252hmr13 (hr)Signal transduction
TDRD7Tudor domain containing 7NM_014290hmr11 (hmr)5 (hmr)Protein amino-terminus binding
TGM1Transglutaminase 1NM_000359hmr211Morphogenesis; cell proliferation
THAP11THAP domain containing 11NM_020457h3DNA binding, ion binding
UBADC1Ubiquitin associated domain containing 1NM_016172hmr2 (hmr)7 (hmr)Protein ubiquitination
XCL1Chemokine (C motif) ligand 2NM_002995h2Signal transduction
XPAXeroderma pigmentosum, complementation group ANM_000380hmr212 (hmr)DNA repair
Cluster B
ABCG2ATP-binding cassette, subfamily G (WHITE), member 2NM_004827h312Transport
ACSL5Fatty-acid-coenzyme a ligase, long-chain 5NM_016234hmr1 (hmr)1Fatty acid metabolism
AIF1Allograft inflammatory factor 1NM_001623hmrInflammatory response; cell cycle
AKAP12A kinase (PRKA) anchor protein (gravin) 12NM_005100hmr122Signal transduction
ARID3AAT rich interactive domain 3A (BRIGHT-like)NM_005224hmr2 (hmr)13Regulation of transcription
BCAT1Branched chain aminotransferase 1, cytosolicNM_005504h112Cell cycle; amino acid metabolism
BIRC2Baculoviral IAP repeat-containing 2NM_001166h2212Antiapoptosis; signal transduction
CA9Carbonic anhydrase IXNM_001216hmr1 (hmr)1 (hmr)One-carbon compound metabolism
CAP2Adenylyl cyclase-associated protein 2NM_006366hm1Signal transduction
CROC4Transcriptional activator of the c-fos promoterNM_006365h1Cell proliferation
DMAP1DNA methyltransferase 1-associated protein 1NM_019100hmr3 (hr)1Regulation of transcription
DNAH11Dynein, axonemal, heavy polypeptide 11NM_003777h4Transport
FADS3Fatty acid desaturase 3NM_021727hmr35 (hr)Fatty acid metabolism
ICAM1Intercellular adhesion molecule 1 (CD54)NM_000201hmr2 (hmr)11 (hmr)3Cell adhesion
IL6Interleukin 6 (interferon, beta 2)NM_000600hmr1 (hm)1 (hmr)1 (hmr)1Signal transduction; inflammatory response
IL8Interleukin 8NM_000584h111Signal transduction; inflammatory response
KCNN4Intermediate conductance Ca-activated K channel protein 1NM_002250hmr13 (hmr)1 (hmr)Transport
KRT18Keratin 18NM_199187h11Structural molecule activity
KRT8Keratin 8NM_002273hr221 (hr)2Structural molecule activity
MLPHMelanophilinNM_024101h12Transport
Pfs2DNA replication complex GINS protein PSF2NM_016095h15DNA metabolism
PLK1Polo-like kinase (Drosophila)NM_005030hmr111 (hm)Metabolism; cell proliferation
PORIMINPro-oncosis receptor inducing membrane injury geneNM_052932h15Oncosis-like cell death
PPP1R12AProtein phosphatase 1, regulatory (inhibitor) subunit 12ANM_002480hmr15 (hmr)Regulation of organismal physiological process
PTPRJProtein tyrosine phosphatase, receptor type, JNM_002843h18Signal transduction
RAB17RAB17, member RaS oncogene familyNM_022449h12Signal transduction
RAD54LRAD54-like (S. cerevisiae)NM_003579hmr17DNA repair; cell cycle
RPN2Ribophorin IINM_002951hmr1111Protein metabolism
SHANK2Cortactin binding protein 1NM_012309hmr224Signal transduction
SNCGSynuclein, gamma (breast cancer-specific protein 1)NM_003087hmr1Pathogenesis
SRPX2Sushi-repeat proteinNM_014467hmr2 (hmr)Electron transport
STC1Stanniocalcin 1NM_003155hmr13 (hmr)Signal transduction
STK6Serine/threonine kinase 15NM_198433hmr1Cell cycle
TOMM34Translocase of outer mitochondrial membrane 34NM_006809hmr3 (hmr)2 (hmr)Protein metabolism
TXNRD1Thioredoxin reductase 1NM_003330hmr121 (hmr)Signal transduction
YAP1Yes-associated protein 1, 65 kDNM_006106hmr111(hmr)Signal transduction
ZNF239Zinc finger protein 239NM_005674h1Regulation of transcription

Shown are numbers of transcription factor binding sites (TFBSs) from p53, nuclear factor-κB (NF-κB), activator protein (AP)-1, signal transducer and activator protein (STAT)3, and early growth response (EGR)1 in clusters A and B over-expressed in head and neck squamous cell carcinoma (HNSCC). TFBSs were predicted using Genomatix Suite 3.4.1 [108]. aOrthologous promoter sets are indicated by single-letter abbreviations (h, human; m, mouse; r, rat). bValues are presented as number of TFBSs in proximal region of promoters. The average length of these promoters was adjusted to approximately 600 base pairs (bp): about 500 bp upstream and about 100 bp downstream. Letters in the parentheses refer to conserved TFBSs identified among human, mouse, or rat using multiple sequence alignment of DiAlign TF of Genomatix Suite 3.4.1. cFrom Gene Ontology Annotation using Onto-Express [46], AmiGo [106], and National Center for Biotechnology Information [107]. TNF, tumor necrosis factor.

Putative transcription factor binding sites of clusters A and B over-expressed in HNSCC Shown are numbers of transcription factor binding sites (TFBSs) from p53, nuclear factor-κB (NF-κB), activator protein (AP)-1, signal transducer and activator protein (STAT)3, and early growth response (EGR)1 in clusters A and B over-expressed in head and neck squamous cell carcinoma (HNSCC). TFBSs were predicted using Genomatix Suite 3.4.1 [108]. aOrthologous promoter sets are indicated by single-letter abbreviations (h, human; m, mouse; r, rat). bValues are presented as number of TFBSs in proximal region of promoters. The average length of these promoters was adjusted to approximately 600 base pairs (bp): about 500 bp upstream and about 100 bp downstream. Letters in the parentheses refer to conserved TFBSs identified among human, mouse, or rat using multiple sequence alignment of DiAlign TF of Genomatix Suite 3.4.1. cFrom Gene Ontology Annotation using Onto-Express [46], AmiGo [106], and National Center for Biotechnology Information [107]. TNF, tumor necrosis factor. In addition to genes of clusters A and B, we defined another group of over-expressed genes, namely cluster C, including three subclusters C1, C2, and C3 (Figure 2a and Additional data file 1). Overall, cluster C contained 240 genes that were over-expressed by 10 cancer cell lines when compared with HKCs (Pr [F] < 0.001). However, two of the subclusters (C1 and C2) identified exhibited a degree of differential expression in UM-SCC cells similar to the various p53-associated expression patterns (Additional data file 1).

Gene Ontology annotation revealed the unique nature of clustered genes

To determine the functional classification of the various gene clusters, we conducted Gene Ontology (GO) annotation using Onto-Express, which constructs statistically significant functional profiles from a set of genes [46]. Additional data file 2 shows functional categories that are significantly enriched in the six clusters. The top categories of GO biologic processes in cluster A were nucleosome assembly, chromosome organization, and biogenesis. These included genes involved in regulation of chromosome structure or function (such as H2B histone family B, C, D, R, L and Q, and H2A histone family L and N), transport (such as MYST3, ABCC5, ATP1B3, and HBE1), and DNA repair (such as XPA). The main GO molecular function was DNA binding, including eight genes in histones H2A and H2B, and MYST3, THAP11 and ARID1A (Table 2 and Additional data file 2). In contrast to cluster A, the top ranked GO biologic processes in cluster B belonged to signal transduction (such as cell-cell signaling, cell surface receptor linked signal transduction), including AKAP12, CAP2, IL6, IL8, RAB17, SHANK2, STC1, PTPRJ, TXNRD1, and YAP1 (Table 2 and Additional data file 2). Other enriched functional categories included cell cycle (AIF1, BCAT1, RAD54L, and STK6), regulation of transcription (ARID3A, DMAP1, and ZNF239), cell proliferation and apoptosis (CROC4, BIRC2, PLK1, and PORMIN), adhesion (ICAM1), and structural proteins related to tumor progression (KRT8 and KRT18; Table 2 and Additional data file 2). Interestingly, several genes in this cluster or their homologs involved in angiogenesis and inhibition of apoptosis have previously been associated with metastatic tumor progression in murine SCC or human HNSCC (IL6, IL8, YAP1, and BIRC2) [9,22-24,30,33,37,38,47-49], and shown to be regulated by NF-κB [22-25,37,49]. Genes in cluster C exhibited annotations for DNA replication, ubiquitin cycle, cell division, and oxidoreductase and catalytic activities. The gene list and ontology of the subclusters in C are presented in the Additional data files 1 and 2, respectively. Several genes in subclusters C1 and C2 exhibited weaker clustering but similar functions as those in cluster A or B. In subcluster C1, in which over-expressed genes were mainly found in UM-SCC cells with mutant p53 as in cluster A, there are two additional genes identified that encode proteins involved in chromosome structure and functions (HIST1H2AL and HDAC5). Other genes previously associated with cancer included a member of epidermal growth factor receptor family (ERBB3); a target gene of p53/p63 (IGFBP3); and a gene whose product is involved in calcium storage and signaling (CALR). In subcluster C2, in which over-expressed genes were found in UM-SCC cells with wild-type p53-like expression pattern as in cluster B, another apoptosis related gene (BAG2), genes encoding signal-related molecules (MYBL2 and UBE2C) and a cell cycle related molecule (CCNB2) were identified (Additional data file 1). The rest of the cluster C genes were over-expressed by more than half of ten UM-SCC cells when compared with HKCs. Several genes encoding protein products that are important in cancer and have functions related to cell cycle, growth, DNA replication and protein translation (such as CCND1, TOP2A, TOPBP1, TFRC; three members of H4 histone family [HIST1H4C, HIST1H4B, and HIST1H4E]; and EIF4G1). Some genes encode proteins with functions related to signal transduction (PIK3R3, MAPK8IP1, and GATA2), and one gene encodes a protein that regulates tumor invasion and metastasis (TIMP2). Genes downregulated in UM-SCC cells were included in cluster D, which represents functional categories that are involved in epidermis development, cell adhesion, and cell-cell signaling (Additional data file 2). Downregulated cluster E genes included those encoding molecules with functions in other signal pathways, cell cycle, calcium ion regulation, and actin binding activities (Additional data file 2). The categories over-represented among the downregulated genes in cluster F included cell adhesion, differentiation, and morphogenesis (Additional data file 2). A more detailed analysis of the downregulated genes will be presented elsewhere (Yan, unpublished data).

Over-representation of binding sites of five transcription factors associated with the unique gene clusters in UM-SCC cells

Based on the gene expression profiling data, we hypothesized that transcriptional regulation by multiple transcription factors may be key elements that contribute to the expression of unique gene clusters. To test this hypothesis, in silico computational analyses were performed to determine whether dominant cis-regulatory elements are present in the proximal promoter region of over-expressed genes. We evaluated five transcription factors that were previously found to be altered and functionally important in HNSCCs and other cancers, including p53, NF-κB, AP-1, STAT3, and EGR1. We compared the frequencies of their binding sites with those from vertebrate promoters from the Genomatix promoter database (GPD), which consists of information from human, mouse, and rat. The five transcription factors examined have been shown by our laboratory and others to contribute to regulation of individual gene expression with functional importance in cancer, such as cell proliferation, cell cycle, apoptosis, DNA repair, and angiogenesis [21,22,24,36-38,50-53]. Table 2 shows a list of genes included in clusters A and B and corresponding binding sites for the five transcription factors in proximal promoter regions that are predicted with high probability. The detailed location and sequences of the putative TFBSs are shown in Additional data file 3. Significant differences in the prevalence of predicted TFBSs were observed for genes from different clusters when compared with vertebrate promoters (Figure 3). In cluster A, putative p53 binding sites were detected in 50% of the 34 gene promoters, which is significantly higher than observed in vertebrate promoters (P < 0.05; Figure 3). Conversely, predicted NF-κB binding sites were observed in about 66% to 70% of the promoters in clusters B and C, which was significantly more than in vertebrate promoters. There was no significant difference in the prevalence of NF-κB binding sites between the promoters of cluster A and vertebrates. There were also differences in the prevalence of TFBSs predicted between different clusters. For example, the p53 binding motif was significantly greater in cluster A than cluster B (χ2 analysis; P < 0.05), and the greatest frequency of NF-κB binding sites was observed in cluster B (26/37 [70%]; Figure 3). There were significantly fewer genes with AP-1 binding sites in cluster A and subcluster C1 (12% and 13%, respectively) compared with vertebrate promoters (Figure 3). A relatively higher frequency of AP-1 binding sites was observed in cluster B genes when compared with frequencies in cluster A and subcluster C1 genes (χ2 analysis; P < 0.01). In contrast, a relative increase in prevalence of STAT3 and EGR1 binding sites was observed and distributed among all of the upregulated clusters relative to vertebrate promoters (Figure 3), with increasingly higher frequencies of EGR1 motifs detected in clusters B and C (60% to 76%; using Genomatix matrix EGR1.02).
Figure 3

Frequency of putative TFBSs in proximal regions of promoters. The promoter sequences were extracted from the over-expressed genes in clusters A and B, and subclusters C1 to C3 in UM-SCC cells using Genomatix Suite 3.4.1. The average length of these promoters was adjusted to approximately 600, including about 500 base pairs upstream and about 100 base pairs downstream from the transcription start site. The promoter sequences from vertebrates represented 159,505 promoters, including 55,207 from human, 69,108 from mouse, and 35,190 from rat in Genomatix promoter database. The P value of transcription factor binding site (TFBS) frequency in a given cluster was calculated by MatInspector of Genomatix Suite 3.4.1. *Significantly increased frequencies of putative binding motifs on promoter regions of clustered genes when compared with the vertebrate promoters with a randomly drawn sample of the same size (P < 0.05). †Significantly lower frequency of the activator protein (AP)-1 binding motif when compared with the vertebrate promoters. EGR, early growth response; NF-κB, nuclear factor-κB; STAT, signal transducer and activator of transcription.

Frequency of putative TFBSs in proximal regions of promoters. The promoter sequences were extracted from the over-expressed genes in clusters A and B, and subclusters C1 to C3 in UM-SCC cells using Genomatix Suite 3.4.1. The average length of these promoters was adjusted to approximately 600, including about 500 base pairs upstream and about 100 base pairs downstream from the transcription start site. The promoter sequences from vertebrates represented 159,505 promoters, including 55,207 from human, 69,108 from mouse, and 35,190 from rat in Genomatix promoter database. The P value of transcription factor binding site (TFBS) frequency in a given cluster was calculated by MatInspector of Genomatix Suite 3.4.1. *Significantly increased frequencies of putative binding motifs on promoter regions of clustered genes when compared with the vertebrate promoters with a randomly drawn sample of the same size (P < 0.05). †Significantly lower frequency of the activator protein (AP)-1 binding motif when compared with the vertebrate promoters. EGR, early growth response; NF-κB, nuclear factor-κB; STAT, signal transducer and activator of transcription.

The orthologous promoters and conserved transcription factor binding sites predict increased likelihood of functional co-regulation of clustered genes

The likelihood of functionality of a predicted TFBS can be examined by determining its conservation at the sequence level. To determine the potential conservation of the predicted TFBSs, the orthologous promoter regions of genes in clusters A and B were examined by searching their conservation at the sequence level among vertebrates (human, mouse, and rat) using the comparative genomics analysis feature of Genomatix Suite 3.4.1. Orthologous promoter sets were found in 19 and 24 genes of clusters A and B, respectively (Table 2; Ortholog). Among 17 genes containing predicted p53 binding sites in cluster A, 10 out of 17 (59%) were identified in the orthologous promoter regions. In this cluster, the predicted prevalence of binding sites falling in the orthologous promoter regions were 63% for NF-κB, 60% for AP-1, 100% for STAT3, and 76% for EGR1. Similarly, in cluster B, the prevalence of binding sites falling in orthologous promoters were 65% for NF-κB, 67% for p53 and AP-1, 73% for STAT3, and 64% for EGR1. These levels of conservation indicated that the majority of predicted TFBSs falling in the orthologous promoter regions were likely selected favorable for growth or survival during evolution. Interestingly, although expression of histone H2A and H2B gene members were predominant in cluster A, only a rat orthologous promoter was found in HIST1H2BD among the eight histone genes (Table 2). The conserved TFBSs among the orthologous promoter sets were further investigated by multiple sequence alignment using DiAlignTF [54]. Conserved p53 binding sites were found in three genes of cluster A (ARID1A, CPS1, and UBADC1) and two genes of cluster B (IL6 and ARID3A; Table 2). The conservation of NF-κB binding sites was observed in more genes, including LGALS3BP, MYST3 and TDRD7 in cluster A, and ACSL5, CA9, DMAP1, ICAM1, IL6, KCNN4, and TOMM34 in cluster B. Additionally, the binding sites of AP-1, STAT3, and EGR1 were conserved in 6, 4, and 14 gene promoters, respectively (Table 2). Next, we identified five representative gene promoters from either cluster A or B genes, which contained conserved p53 or NF-κB binding motifs among human, chimpanzee, mouse, and rat (Figure 4). The core sequence (underlined) of a transcription factor matrix represents the most highly conserved and consecutive positions of this matrix. In promoters of both CPS1 and ARID1A from cluster A genes, the predicted p53 binding sites were similar to Genomatix and TRANSFAC p53 matrix consensus sequence GGACATGCCGGGCATGTCY (Figure 4a). The p53 binding site of ARID1A promoter was located 55 to 74 base pairs (bp) downstream from the transcriptional start site and overlapped with a EGR1 binding site. Known NF-κB sites in IL6 and ICAM1 promoters are conserved with about 90% matrix similarity to the five matrices for the NF-κB family, including p65 and cRel (Figure 4b and Additional data file 3). Another conserved NF-κB site in the promoter of gene CA9 exhibited 85% to about 90% similarity to two NF-κB matrices of the family including p50, indicating that these sites are more likely to be functional in a biologic context.
Figure 4

Predicted conserved p53 and NF-κB binding sites in proximal promoter regions of five representative genes from clusters A and B. The search for conserved TFBS was carried out by multiple sequence alignment of each promoter set using DiAlignTF of Genomatix Suite 3.4.1. The promoter region included about 500 base pairs upstream and about 100 base pairs downstream from the transcription start site (TSS) among human, chimpanzee, mouse, and rat. (a) The conserved p53 binding motifs were present in two gene promoters from cluster A (CPS1 and ARID1A), and (b) conserved nuclear factor-κB (NF-κB) binding motifs were present in three gene promoters from cluster B (ICAM1, IL6, and CA9). Letters in bold are the predicted binding sites of p53 or NF-κB, letters in italic are early growth response (EGR)1 binding sites, and letters underlined denote the core conserved sequence. The numbers showed predicted transcription factor binding site (TFBS) position from the TSS of human sequences, where negative positions were upstream of the TSS and positive ones were downstream from the TSS.

Predicted conserved p53 and NF-κB binding sites in proximal promoter regions of five representative genes from clusters A and B. The search for conserved TFBS was carried out by multiple sequence alignment of each promoter set using DiAlignTF of Genomatix Suite 3.4.1. The promoter region included about 500 base pairs upstream and about 100 base pairs downstream from the transcription start site (TSS) among human, chimpanzee, mouse, and rat. (a) The conserved p53 binding motifs were present in two gene promoters from cluster A (CPS1 and ARID1A), and (b) conserved nuclear factor-κB (NF-κB) binding motifs were present in three gene promoters from cluster B (ICAM1, IL6, and CA9). Letters in bold are the predicted binding sites of p53 or NF-κB, letters in italic are early growth response (EGR)1 binding sites, and letters underlined denote the core conserved sequence. The numbers showed predicted transcription factor binding site (TFBS) position from the TSS of human sequences, where negative positions were upstream of the TSS and positive ones were downstream from the TSS.

Novel transcription factor regulatory modules associated with p53 or nuclear factor-κB in promoters of clustered genes

Because we observed that several transcription factors are often co-activated in HNSCCs, we hypothesized that the clustered gene expression could be co-regulated by multiple transcription factors [55]. These transcription factors are expected to be structured and coordinated tightly together, form a functional unit or so-called transcription factors module, and play roles in regulating gene expression. To obtain evidence for this hypothesis, we used FrameWorker of Genomatix Suite 3.4.1 to define promoter models. Based on the promoter modeling, we identified the putative regulatory modules of TFBSs in the clustered genes that were over-expressed by UM-SCC cells. Two co-regulated gene groups were selected for this analysis, which included 17 genes with p53 binding sites in cluster A and 26 genes with NF-κB binding sites in cluster B. In cluster A genes with p53 binding sites, putative models containing three and four transcription factors were present with scores of high selectivity (Table 3), indicating that such models are enriched to a greater extent in cluster A genes than in genes randomly selected from the whole human genome. All eight transcription factor models contained p53-TBPF (TATA-binding protein factors) associated with either CREB (cAMP-responsive element binding proteins; 4/8) or PCAT (promoter of CCAAT-binding factors; 2/8), suggesting the possible functional relationships or co-regulatory mechanisms mediated by these transcription factors. These transcription factor modules were over-represented in the proximal promoter regions of several genes, including CPS1 in all eight models, and HIST1H2AM, HIST1H2BE, and HIST1H2BL in six to seven models. In addition to genes with p53 binding motifs, we also identified a putative module of TBPF-ECAT (enhancer of CCAAT binding factors)-PCAT that was present on 100% promoter regions within eight histone H2A or H2B genes (Figure 5), which is in contrast to the low frequency (0.47%) observed in the entire human promoter database. The putative p53 binding sites found in the promoters of four histone genes, namely HIST1H2AM, HIST1H2BE, HIST1H2BL, and HIST1H2BN, were located within 100 bp of the TBPF-ECAT-PCAT module (Figure 5), which is consistent with a greater likelihood of regulatory interactions.
Table 3

Putative transcription factor models in clusters A and B over-expressed in HNSCC

ModelModel matches in the cluster% of matches in the clustera% of hits in GPDbSelectivityc
Selected genes from cluster Ad
p53-TBPFHIST1H2AM, HIST2H2BE, HIST1H2BL, HIST1H2BN, CPS1, XPA, XCL1414.399.4
p53-TBPF-CREBHIST1H2AM, HIST2H2BE, HIST1H2BL, CPS1, XCL1290.6843.3
p53-TBPF-PCATHIST1H2AM, HIST2H2BE, HIST1H2BL, HIST1H2BN, CPS1290.25119.4
p53-TBPF-CREB-TBPFHIST1H2AM, HIST2H2BE, HIST1H2BL, CPS1, XCL1290.19151.8
p53-TBPF-PCAT-SORYHIST2H2BE, HIST1H2BL, HIST1H2BN, CPS1240.021180.9
p53-TBPF-CREB-ECATHIST1H2AM, HIST2H2BE, HIST1H2BL, CPS1240.05433.0
p53-TBPF-VBPF-TBPFHIST2H2BE, HIST1H2BL, CPS1, XCL1240.06371.1
p53-TBPF-CREB-CDXFHIST1H2AM, HIST2H2BE, HIST1H2BL, CPS1240.04541.2
Selected genes from cluster Be
 ETSF-NFκBTOMM34, DMAP1, ACSL5, YAP1, AKAP12, PLK1, TXNRD1, IL8, ABCG2, BIRC2, RAB174217.212.5
NFκB-ETSFICAM1, DMAP1, YAP1, RPN2, AKAP12, SHANK2, IL8, ABCG2, BIRC2, RAB17,3813.453.1
 ZBPF-NFκBIL6, ICAM1, TOMM34, KCNN4, CA9, FADS3, RPN2, SHANK2, ABCG2, RAB173812.763.0
 EGRF-NFκBfICAM1, TOMM34, CA9, DMAP1, FADS3, KRT8, ABCG2, MLPH,3115.712.0
NFκB-EGRF(+)fICAM1, KCNN4, YAP1, FADS3, RPN2, AKAP12, TXNRD1, ABCG23116.641.8
NFκB-EGRF(-)fICAM1, TOMM34, FADS3, TXNRD1, ABCG2, MLPH, PORIMIN, PTPRJ3116.461.9
NFκB-STATfICAM1, TOMM34, DMAP1, FADS3, KRT8, ABCG2, MLPH279.572.8
 SP1-ZBPF-NFκBIL6, ICAM1, KCNN4, CA9, FADS3, KRT8234.025.7
NFκB-ZBPF-EGRFYAP1, TXND1, AKAP12, FADS3, PPP1R12A, RPN2236.753.4
 ZBPF-NFκB-MAZFKCNN4, DMAP1, YAP1, FADS3, TXNRD1, ABCG2232.459.4
NFκB-PAX5-ZBPF-ZBPFgDMAP1, TXNRD1, FADS3, ABCG2, BCAT1191.4613.2
 CREB-ZBPF-NFκB-ETSFgIL6, TOMM34, KCNN4, ABCG2, PTPRJ190.3554.7
 NKXH-HOXF-CREB-NFκB-ETSFgIL6, TOMM34, DMAP1, IL8150.03530.8
 HNF1-HOXF-CREB-NFκB-ETSFgIL6, TOMM34, DMAP1, IL8150.02943.7
 EVI1-LHXF-HNF1-NFκB-ETSFgICAM1, TOMM34, DMAP1, IL8150.011213.3
NFκB-EGRF-ETSF-SP1F-ZBPFgICAM1, TXNRD1, YAP1, ABCG2150.2270.8
 EVI1-HNF1-HOXF-NFκB-ETSFgICAM1, TOMM34, DMAP1, IL8150.02943.7
 EBOX-ZBPF-NFκB-MAZF-PAX5gKCNN4, YAP1, FADS3, ABCG2150.13116.3

Shown are the selected models and their matches in the clusters A and B measured using FrameWorker, and hits in Genomatix promoter database (GPD) measured using ModelInspector of Genomatix Suite 3.4.1 [108]. In general, the distance between two transcription factor binding elements was limited to 5 to 150 base pairs. Genes in bold are orthologs; those in italic and bold contain conserved nuclear factor-κB (NF-κB) or p53 binding sites in orthologous promoter sets; underlined genes contain known NF-κB binding site. aPercentage of matches in p53 or NF-κB group in the cluster for that model. bPercentage of hits in all human promoters (55,207) in GPD for that model. cThe ratio between percentage of matches in the cluster and percentage of hits in the entire human promoters of GPD for that model [116]. dIncluding 17 input genes with predicted p53 binding sites in cluster A. p53 is shown in bold. eIncluding 26 input genes with predicted NF-κB binding sites in cluster B. NF-κB is shown in bold. fModel searching only covered five transcription factor families: NF-κB, p53, signal transducer and activator of transcription (STAT), activator protein (AP)-1, and early growth response family (EGRF). (-) and (+) indicate strand direction of EGRF binding sites. gThis distance was set to 5 to 200 base pairs.

Figure 5

Transcription regulatory module containing multiple transcription factors in eight histone gene promoters from cluster A. Using FrameWorker of Genomatix Suite 3.4.1, eight promoter regions of histone genes (two H2A and six H2B) from cluster A were used to predict regulatory modules including TBPF (TATA-binding protein factors), ECAT (enhancer of CCAAT binding factors), or PCAT (promoter of CCAAT binding factors). p53 binding motifs were also displayed. '(+)' and '(-)' refer to strand direction of transcription factor binding motifs.

Putative transcription factor models in clusters A and B over-expressed in HNSCC Shown are the selected models and their matches in the clusters A and B measured using FrameWorker, and hits in Genomatix promoter database (GPD) measured using ModelInspector of Genomatix Suite 3.4.1 [108]. In general, the distance between two transcription factor binding elements was limited to 5 to 150 base pairs. Genes in bold are orthologs; those in italic and bold contain conserved nuclear factor-κB (NF-κB) or p53 binding sites in orthologous promoter sets; underlined genes contain known NF-κB binding site. aPercentage of matches in p53 or NF-κB group in the cluster for that model. bPercentage of hits in all human promoters (55,207) in GPD for that model. cThe ratio between percentage of matches in the cluster and percentage of hits in the entire human promoters of GPD for that model [116]. dIncluding 17 input genes with predicted p53 binding sites in cluster A. p53 is shown in bold. eIncluding 26 input genes with predicted NF-κB binding sites in cluster B. NF-κB is shown in bold. fModel searching only covered five transcription factor families: NF-κB, p53, signal transducer and activator of transcription (STAT), activator protein (AP)-1, and early growth response family (EGRF). (-) and (+) indicate strand direction of EGRF binding sites. gThis distance was set to 5 to 200 base pairs. Transcription regulatory module containing multiple transcription factors in eight histone gene promoters from cluster A. Using FrameWorker of Genomatix Suite 3.4.1, eight promoter regions of histone genes (two H2A and six H2B) from cluster A were used to predict regulatory modules including TBPF (TATA-binding protein factors), ECAT (enhancer of CCAAT binding factors), or PCAT (promoter of CCAAT binding factors). p53 binding motifs were also displayed. '(+)' and '(-)' refer to strand direction of transcription factor binding motifs. By contrast, the predicted transcription factor models exhibited greater diversity when connecting NF-κB with other transcription factors. The major transcription factors associated with NF-κB were ETSF (human and murine ETS1 factors; 8/14) and ZBPF (zinc binding protein factors; 8/14). In most cases the locations of NF-κB binding sites were near to either ETSF or ZBPF, except in two cases, where NF-κB sites were separated from ETSF or ZBPF by PAX5 or EGRF (early growth response family). We noticed that the selectivity of these models containing five TFBSs was much greater than that of other ones. It is therefore possible that cooperation of ETSF-NF-κB or ZBPF-NF-κB with other transcription factors is part of NF-κB transcriptional regulatory mechanisms. These transcription factor modules are over-represented in the promoter regions of genes with conserved and confirmed NF-κB binding sites, such as IL8 (6/14), ICAM1 (6/14), and IL6 (5/14); genes with conserved NF-κB binding sites, such as DMAP1 (8/14), TOMM34 (7/14), and KCNN4 (5/14); genes containing an orthologous promoter, such as YAP1 (6/14); and ABCG2 (8/14), which did not belong to any of these categories (Table 3). Furthermore, we identified three NF-κB-EGRF and one NF-κB-STAT models if only five transcription factor families (namely NF-κB, p53, STAT, AP-1, and EGRF) were included in the analysis (Table 3). Both models containing NF-κB with either EGRF or STAT were observed in seven common genes: ICAM1, TOMM34, DAMP1, FADS3, KRT8, ABCG2, and MLPH.

Gene expression and promoter binding activity modulated by doxorubicin or tumor necrosis factor-α

To obtain experimental evidence on the predicted role of p53 and NF-κB in regulating gene expression of the clusters, we compared expression levels of selected genes at baseline and following treatment with classical inducers for p53 (doxorubicin) or NF-κB activation (tumor necrosis factor [TNF]-α), using real time RT-PCR. For study of cluster A genes, HKCs were used and treated with doxorubicin to ensure functional p53 status, because a variety of deficiencies in p53 expression and function were observed in UMSCC cell lines with different p53 status (Friedman, unpublished data). As shown in the left panels of Figure 6a, doxorubicin treatment either induced or suppressed expression levels of genes from cluster A in a time-dependent manner. For genes from clusters B and C, in which NF-κB promoter binding motifs were prevalent, TNF-α treatment significantly modulated the expression of multiple genes in UM-SCC 6 cells (Figure 6a; middle and right panels). To confirm whether predicted NF-κB binding sites in promoters from cluster B genes are bound by NF-κB components, ChIP binding assay was performed using anti-NF-κB antibodies (p65 and cRel). The results showed the promoter binding activity of ICAM1, IL8, IL6, and YAP1 genes from cluster B (Figure 6b). A strong basal and TNF-α induced p65 binding activity in IL8 promoter, and a weaker basal and significant TNF-α induced p65 binding activity were observed in IL6 and ICAM1 promoters, respectively (Figure 6b). Binding of NF-κB family member cRel to the predicted cRel motif on the YAP1 promoter was detected and inhibited by TNF-α (Figure 6b). Minimal nonspecific binding activities were observed in the negative controls using isotype IgGs. In addition, we also observed constitutive p53 binding activity to the promoters of HIST1H2BD and HIST1H2BN from the cluster A gene list (Yang and coworkers, unpublished data).
Figure 6

Basal and inducible gene expression, and promoter binding activity were modulated by doxorubicin or TNF-α. (a) Human keratinocyte (HKC) cells were treated with doxorubicin (Dox; 0.5 μg/ml, left panels), and the UM-SCC 6 cell line was treated with tumor necrosis factor (TNF)-α (2000 U/ml, center and right panels) for different periods, as indicated. Total RNA was harvested by Trizol and genes selected from clusters A to C were analyzed by real-time reverse transcription polymerase chain reaction (RT-PCR). The data are presented as the mean plus standard deviation from triplicates with normalization by 18S ribosome RNA. '(C1)', '(C2)', and '(C3)' refer to the three subclusters of cluster C. '(C)' refers to genes in cluster C outside subclusters C1 to C3. (b) Chromatin immunoprecipitation assays were performed in UM-SCC 11A cells using rabbit polyclonal anti-p65 or cRel antibodies with IgG isotype control.

Basal and inducible gene expression, and promoter binding activity were modulated by doxorubicin or TNF-α. (a) Human keratinocyte (HKC) cells were treated with doxorubicin (Dox; 0.5 μg/ml, left panels), and the UM-SCC 6 cell line was treated with tumor necrosis factor (TNF)-α (2000 U/ml, center and right panels) for different periods, as indicated. Total RNA was harvested by Trizol and genes selected from clusters A to C were analyzed by real-time reverse transcription polymerase chain reaction (RT-PCR). The data are presented as the mean plus standard deviation from triplicates with normalization by 18S ribosome RNA. '(C1)', '(C2)', and '(C3)' refer to the three subclusters of cluster C. '(C)' refers to genes in cluster C outside subclusters C1 to C3. (b) Chromatin immunoprecipitation assays were performed in UM-SCC 11A cells using rabbit polyclonal anti-p65 or cRel antibodies with IgG isotype control.

Discussion

Gene expression profiles have been intensively studied to identify critical gene expression signatures related to heterogeneity in HNSCC phenotypes [1-8,56]. However, the underlying transcriptional regulatory mechanisms have not been well defined. Until recently, genome-wide analysis of transcriptional regulation involving multiple signal pathways and transcription factors have mostly been conducted in the prokaryotic and lower eukaryotic organisms, such as Escherichia coli and yeast. Little information regarding transcriptional control of global gene expression has been generated from large-scale analysis of gene expression profiles in cancer related investigations. Utilizing array technology together with up-to-date bioinformatics analyses and biologic information, we found evidence for increased prevalence and differential distribution of TFBSs for five transcription factors in association with differentially expressed gene signatures in subsets of HNSCC cell lines. The five transcription factors p53, NF-κB, AP-1, STAT3, and EGR-1 have previously been implicated by our laboratory and others [21,22,24,25,30-33,36,41,50-52] as independent factors that contribute to malignant progression of HNSCCs. However, the overall biologic significance and potential scope of the contribution of these transcription factors in regulating global and heterogeneous gene expression have not previously been defined. Our results suggest that over-expressed gene clusters in human HNSCC subgroups identified by global expression profile most likely involve the regulation of transcription factors p53, NF-κB, and AP-1. In addition, the broad repertoire of genes over-expressed by most HNSCCs may be co-regulated by other transcription factors such as STAT3 and EGR1. Enrichment for binding sites of NF-κB, AP-1, STAT3, or EGR1 was found in the promoters of genes in cluster B that are involved in cell survival, inflammation, and angiogenesis (Figure 2 and Table 2). Some of the genes (IL6, IL8, YAP1, and BIRC2) were associated with SCC metastatic progression and aggressive phenotypes in previous studies [9,38], suggesting that these transcription factors may cooperate to activate gene signatures that are important in pathogenesis and increase the malignant potential of HNSCCs with a wild-type p53-like expression pattern. A remarkable observation from this study is the apparent inverse relationship between expression of cluster A and B genes and their association with dominant prevalence of binding sites of p53 in cluster A genes, and NF-κB, AP-1, and other transcription factors in cluster B genes. Cluster A genes were expressed at higher levels in most cells with mutant p53 and exhibited a higher frequency of p53 and lower frequency of AP-1 binding sites than those present in promoters of genes in other clusters or vertebrate promoters (Figure 3). The segregation of HNSCC cells into subsets with these gene expression clusters revealed a relationship with p53 mutation status in most of the cells, which supports the importance of predicted p53 binding sites (Figure 3). In cluster A genes the increased prevalence of predicted p53 binding motifs (Figure 3 and Table 2) were consistent with experimental data indicating that selected genes from the list could be modulated by doxorubicin treatment in HKCs that have normal p53 status (Figure 6a). Surprisingly, however, cluster A genes were over-expressed by UM-SCC cells with mutant p53 genotype, and not by cells with wild-type p53-like expression pattern (Figure 3). A detailed mechanistic study of the role played by mutant p53 in cluster A gene expression is underway. Conversely, cluster B genes with promoters containing known or putative motifs for NF-κB were highly represented in HNSCCs with a wild-type p53-like expression pattern. This suggests that p53 status may also affect expression of NF-κB regulated genes. One possible explanation could be related to a mechanism proposed by Perkins and colleagues [57-59], who previously showed that lack of functional p53 or p14ARF can permit activation of NF-κB, whereas p14ARF or p53 activation can result in a repression of NF-κB regulated genes through competition for co-factor CBP/p300 or phosphorylation of p65 thr 505. Studies of the mechanism(s) underlying this apparent inverse relationship between p53 and NF-κB regulated genes in HNSCC cells are in progress (Friedman, unpublished data). In this study, we observed segregation of nine out of 10 of the UM-SCC lines by p53 genotype, with the exception of UM-SCC 11B cells (Figures 1 and 2). The UM-SCC 11B cell line, which exhibited a p53 mutation by sequence analysis and heterogeneous p53 protein expression in different subpopulations by immunohistochemistry (Figure 1b), clustered with its parental UM-SCC 11A and other UM-SCC cell lines that exhibit low expression of wild-type p53 (Figures 1 and 2). The low mRNA and protein expression of wild-type p53 with deficient p53 function were previously reported in breast cancer cell lines and tissue specimens [60,61]. At this stage, we have not yet determined whether the p53 related subgrouping of UM-SCC cells by gene profiling could result from loss of wild-type p53 function in the UM-SCC cells with wild-type p53-like expression pattern, differences in function of the various p53 mutants, or activation or interactions with other p53 family members or co-factors. Preliminary data revealed the presence of low transducing activity by reporter gene assay in UM-SCC 11B and other UM-SCC cells with a wild-type p53-like expression pattern, which is in contrast to the greater activity observed in some UM-SCC cells with gain-of-function mutant p53 (Friedman, unpublished data). In addition, multiple p53, p63, and p73 family members can compete for the same binding motifs and differentially regulate various p53 family gene programs, such as those recently implicated in survival and adhesion of HNSCCs [62]. Exceptions observed in our study may be useful in dissecting whether certain p53 mutations or alternative family members affect overall gene expression profiles. Determination of the regulation of these genes will require further study of the potentially complex contribution of expression, phosphorylation, and co-factor interaction of wild-type and p53 mutants, and possibly other p53 family members such as p63 and p73. In cluster A, about 25% of genes are involved in chromosome structure and functions by GO annotation (Additional data file 2), including eight genes in histone H2A and H2B families. However, the biologic significance of heterogeneous expression of the H2 histone gene family in cancer biology and the relationship to p53 regulatory mechanisms are not well understood. It is known that the core structure of the nucleosome is dependent upon both histone-histone and histone-DNA interactions, and alterations of chromatin structures and modifications of histone by acetylation and phosphorylation are basic processes for activation or repression of gene expression [63]. For example, in H2AX-deficient mice, embryonic stem cells exhibited impaired recruitment of specific DNA repair complexes to ionizing radiation-induced nuclear foci [64]. Histone H2B has been suggested to play a specific role in UV-induced DNA repair processes in yeast [65], whereas histone H2B-elicited ubiquitylation may play a role in gene activation and affect gene silencing [66]. Mdm2 is a negative regulator of p53 that can interact directly with histones and induce monoubiquitylation of histones H2A and H2B [67,68]. Yu and coworkers [68] observed that rapid p53-mediated inhibition of cell cycle progression was induced by DNA-damaging agents in differentiated neuroblastoma cells, resulting in induction of cdk2-cyclin E expression followed by phosphorylation of histone H2B and cell death. In addition, p53 is also able to recruit p300 to the p21 promoter that leads to targeted acetylation of chromatin-assembled core histones H2A, H2B, H3, and H4 [69]. The possible effects of differential expression of the histone genes in transcription or DNA integrity in HNSCCs over-expressing mutant p53 warrant further investigation. Cluster B genes were over-expressed by UM-SCC cells with wild-type p53-like expression pattern (Figure 2). Functional categories of about 40% genes in cluster B with dominant NF-κB binding sites belong to cell proliferation and signal transduction (Table 2). TNF-α induced rapidly and significantly increased expression of several known NF-κB targeted genes, including ICAM1, IL6, IL8 and TNFAIP2, but the mechanisms are less well understood for the rest of the genes presented (Figure 6a). We previously showed that, in HNSCCs, NF-κB elicited over-expression of cytokine and growth factor genes that promote inflammation and angiogenesis, such as IL1, IL6, IL8, GRO-1, GM-CSF, and VEGF [21,22,24,25,37,49,70]. Molecular profiling of transformed and metastatic murine SCCs showed that Gro1 (IL8 homolog), antiapoptotic gene cIAP-1 (cIap-1/Birc2), and Yap65 (YAP 1) were upregulated and clustered together in association with activation of NF-κB and tumor growth, metastatic progression, and angiogenesis [37,38]. Chung and coworkers [71] recently identified human homologs of a set of 99 genes from our murine tumor model. These genes are modulated by NF-κB and highly represented in a gene cluster and subset of patients with HNSCC who have poor prognosis [71]. In addition, some cluster B genes, such as BIRC2 (cIAP-1), YAP1, and KRT18 were also identified by another laboratory using 5 UM-SCC cell lines resistant to cisplatin [5]. Jeon and coworkers [1] studied 25 UM-SCC lines, including 11A, 11B, 22A and 22B; three of the genes in their upregulated gene cluster, namely IL8, KRT8, and TXNRD1, overlapped with our cluster B gene list. Recently, Roepman and coworkers [9] reported that cluster B genes IL6, IL8, YAP1, and BIRC2 are upregulated in metastatic HNSCC tumor specimens. ICAM1 is another NF-κB regulated gene in the cluster B list and is implicated in adhesion in a wide range of inflammatory and immune responses, and carcinogenesis [72-75]. Together, these data indicate that heterogeneity in expression of genes in clusters B and C are important in the malignant phenotype and therapeutic resistance, and that NF-κB and other transcription factors contribute to their differential expression in HNSCCs. Increased prevalence of STAT3 and EGR1 binding motifs were predicted in the promoters and distributed throughout the gene clusters over-expressed by UM-SCC cells, when compared with those in vertebrate promoters, indicating that these transcription factors could play a broader or co-regulatory role with other transcription factors in gene expression in HNSCC tumorigenesis [31-33,36]. In addition, the occurrence of three EGRF-NF-κB and one STAT-NF-κB putative modules were identified, which is consistent with previous examples indicating that NF-κB can actively interact with STAT3, EGR1, and other transcription factors to regulate cytokine and growth factor expression, and receptor kinase activation [76-79]. Our previous experimental data are also consistent with the hypothesis that NF-κB may interact with EGR1, activated by hepatocyte growth factor (HGF) and the tyrosine kinase receptor c-MET, to co-regulate gene expression in HNSCC [36,48]. EGR1 can be activated through HGF/c-Met tyrosine kinase receptor and protein kinase C dependent mechanisms and induce PDGF and VEGF [48]. HGF promoted expression of angiogenesis factors in tumor cells through both mitogen-activated protein kinase kinase and phosphatidylinositol 3-kinase dependent pathways [48]. Over-expression of c-Met expression enhanced the expression of angiogenesis factors IL8, VEGF, and PDGF in response to HGF in vitro [48], and increased tumorigenesis and metastasis in the tumor microenvironment [80]. Identification of TFBSs by sequence similarity only indicates the potential for physical binding of transcription factors to their corresponding regulatory regions. These binding motifs do not all necessarily play functional roles in the biologic context. Determination of conservation of cis-regulatory elements across species, or so-called phylogenetic footprinting [81,82], is useful in predicting a functional role of cis-regulatory motifs in transcriptional regulation [10,82]. We analyzed conserved TFBSs by two steps at the sequence level, including annotation of orthologous genes and study of the similarity of TFBSs in aligned regulatory regions. We found a majority of the putative TFBSs (more than about 60%) appeared in orthologous promoter sets of cluster A or cluster B (Table 2), and conserved binding sites for p53 and NF-κB were observed in five and ten genes of the two clusters (Figure 4). These data suggest that these predicted TFBSs are likely to be functional in evolution, and the activities of some predicted TFBSs have been tested experimentally. As shown in Figure 6a, among the genes we selected for experiment, ten out of 12 gene promoters fell within orthologous regions (except HIST1H2BN and IL8; Table 2). Their levels of gene expression were regulated by doxorubicin or TNF-α (Figure 6a). It is widely believed that transcriptional regulation of gene expression is most often accomplished by functional cooperation of multiple transcription factors rather than a single factor. In the promoters of cluster A genes, p53 binding motifs were predicted to co-localize with TBPF motifs for TATA-binding proteins (Table 3). As a regulatory module, p53-TBPF interacts with other transcription factors that could contribute significantly to the controlling mechanism of cluster A gene expression regulating chromosome structure, function, and stability (Table 3). In addition, a putative common regulatory module for all eight histone genes in cluster A was identified as TBPF-ECAT-PCAT with high selectivity (Figure 5). It is interesting that the predicted p53 binding sites in four histone genes are close to this module. Using BiblioSphere PathwayEdition of Genomatix, the functional co-citations of p53-NFYC (nuclear factor YC) and p53-YB-1 (Y-box-binding protein 1) were observed, and NFYC and YB-1 belong to ECAT and PCAT families. It has been shown that YB-1 (also known as DNA binding protein B) accumulates in the nucleus under genotoxic stress only when cells retain wild-type p53, whereupon YB-1 inhibits p53 activity [83]. The NFY family, including A, B and C subunits, are histone-like CCAAT-binding trimers and are specifically required for consensus CCAAT-box binding in enhancer regions [84-86]. The core regions of NFYC and NFYB proteins exhibited high sequence similarity with histones H2A and H2B, respectively [84,87,88], and NFYC has been shown to be an important target of p53 regulation in wild-type p53 mediated gene repression [89,90]. In addition to the dominant NF-κB binding motifs on the promoter regions of cluster B, we also captured 14 candidate NF-κB related regulatory models linked with orthologous promoter sets, or the conserved or experimentally defined NF-κB binding sites (Table 3). Six genes, namely IL6, IL8, ICAM1, KCNN4, TOMM34, and DMAP1, which contain phylogenetic conserved or known NF-κB binding sites were highly enriched in these selected models. The majority of the regulatory modules contain ETSF-NF-κB, ZBPF-NF-κB and EGRF-NF-κB with other TFBSs. The ETS family is a large set of transcription factors that is characterized by an evolutionally conserved ETS domain; they are related to cancer cell growth, signal transduction, angiogenesis, cell proliferation, and apoptosis [91-94]. ZBP-89 (also called ZNF-148), a main ZBPF member, is a Krüppel-type zinc-finger protein and is involved in transcriptional regulation of a variety of genes [95], cell growth arrest [96,97], and apoptosis and cell death [96]. Borghaei and coworkers [98] found that both ZBP-89 and NF-κB binding to SIRE (stromelysin IL-1 responsive element) sites of the matrix metalloproteinase-3 promoter in response to inflammatory cytokines [98]. Coordinated binding of ZBP-89, SP1, and NF-κB p65/p50 in the ENA-78 (epithelial neutrophil-activating peptide-78) promoter play a major role in the regulation of ENA-78 expression in Caco-2 human colonic epithelial cells [99]. Thus, the regulation of genes in cluster B by NF-κB is most likely linked to ETS members or ZBPF. We are currently conducting studies to confirm this co-operation of transcription factors and their binding to the promoter sequence experimentally. In our ten UM-SCC cell lines, unique and novel gene expression signatures were identified by cDNA microarray analysis, and the newly discovered regulatory mechanisms that control such gene expression signatures (predominantly involving transcription factors) were revealed. Important genes within the expression profiles, and the related transcriptional regulatory mechanisms identified, are consistent with observations from clinical studies in more complex tissue specimens in independent studies at our and other institutions, supporting the biologic and clinical relevance of genes within clusters identified in our study in an experimental cell system [5-8,13,16-20,31,32,41,70,71].

Conclusion

A distinct molecular gene classification of HNSCC cells was obtained with differentially expressed gene clusters associated with different p53 genotype and protein expression status. The prevalences of p53, NF-κB, and AP-1 binding sites were observed in the promoter regions of different gene clusters, suggesting their importance for the subgrouping of HNSCCs. Increases in STAT3 and EGR1 binding sites were observed in all over-expressed gene clusters compared with vertebrate promoters, indicating their broader role in co-regulating the expression of genes in HNSCCs. The importance of these transcriptional regulatory elements in clustered gene expression by UM-SCC cells was supported by the prediction of several regulatory modules in different gene clusters, experimentally modulated gene expression under doxorubicin and TNF-α, and NF-κB binding activity measured by ChIP assays. Therefore, our finding of the predominant role of p53, NF-κB, and other transcription factors provides a useful clue for elucidating the mechanisms that control differential expression of gene clusters and related molecular classifications in HNSCCs, and to develop targeted therapeutics in the future.

Materials and methods

Cell lines

Ten established HNSCC cell lines, namely UM-SCC 1, 5, 6, 9, 11A, 11B, 22A, 22B, 38, and 46, were obtained from the University of Michigan series of HNSCC cell lines (Ann Arbor, MI, USA), as described previously and in Table 1[26,48,70]. HKCs were obtained from four individuals (Cascade Biologics Inc., Portland, OR, USA), and cultured following the manufacturer's suggestions.

DNA sequencing

Genomic DNA was isolated using Trizol method, in accordance with the manufacturer's suggestions (Invitrogen, Carlsbad, CA, USA). DNA sequencing was conducted using PCR with reported primers for p53 exons 4 to 9, and samples were sequenced bidirectionally using Applied Biosystems DNA sequencers (Applied Biosystems, Foster City, CA, USA), in accordance with the manufacturer's protocol, at the NIDCD (National Institute on Deafness and Other Communication Disorders) Core Sequencing Facility. All p53 mutations were confirmed in independent PCR reactions to ensure that the mutation was not an artifact of PCR.

Immunocytochemistry

Immunostaining of p53 protein was performed in cultured UM-SCC cells. Briefly, UM-SCC cells (1 to 2 × 104) were plated on eight-well chamber slides (Lab-Tek, Naperville, IL, USA) for 2 to 3 days, fixed, and permeabilized in freshly made cold methanol:acetone (1:1). Then, 1% hydrogen peroxide (Fisher Scientific, Fair Lawn, NJ, USA) was used for blockade of endogenous peroxidase. Ten per cent blocking serum (Vector Labs, Inc., Burlingame, CA, USA) was used to block the nonspecific binding sites for 20 min and removed without washing. The samples were incubated with the primary antibodies, namely 2 μg/ml mouse anti-human p53 (DO-1, IgG2a; Calbiochem, EMD Biosciences, San Diego, CA, USA) or isotype control mouse IgG (Cat# I2000, Vector Labs, Inc.), which were diluted in phosphate-buffered saline with 10% blocking serum overnight at 4°C. The samples were blocked with 5% serum for 20 min and incubated with the secondary biotinylated antibody for 30 min, followed by 30 min incubation with biotin/avidin horseradish peroxidase conjugates (Vectastain Elite ABC kit; Vector Labs, Inc.) and chromogen DAB (ciaminobenzidine tetrahydro-chloride; Vector Labs, Inc.), in accordanc with the manufacturer's specifications.

Microarray experiments and data collection

The experimental methods used for microarray were recently described [44,100]. Briefly, the cDNA microarray chips of 24,000 human elements were developed and printed by National Human Genome Research Institute (Bethesda, MD, USA). Total RNA was isolated from UM-SCC cell lines or HKCs, reverse transcribed, labeled with Cy5, and combined with the Cy3 labeled cDNA reverse transcribed from human universal reference RNA (Stratagene, La Jolla, CA, USA). The labeled targets were hybridized on to the cDNA array chip at 65°C overnight. Fluorescence intensity was obtained using a GenePix 4000 microarray scanner with GenePix Pro software (Axon Instruments, Union City, CA, USA).

Microarray data analysis according to the p53 related expression pattern of UM-SCC cell lines

The 24,000 cDNA microarray chips contained a total of 23,220 spots, which includes known genes, expressed sequence tags, hypothetical proteins, and control spots. There were 12,270 known gene sports on the arrays [44], where a list of 9,273 known genes with good quality were generated and subjected to principal components analysis and genome-wide gene expression hierarchical clustering. The results segregated cells into three distinct groups, HKCs, UM-SCC cells with wild-type p53-like expression pattern (including UM-SCC 11B), and most cells with mutant p53 [44]. We have tested the subgrouping of UM-SCC cells using four strategies: principle components analysis of 9,273 known genes; hierarchical clustering analysis of 9,273 known genes; hierarchical clustering analysis of genes that exhibited at least twofold changes in gene expression among keratinocytes, five UM-SCC lines with mutant p53, and five UM-SCC lines with wild-type p53-like status (including UM-SCC 11B cells); and hierarchical clustering analysis of the genes with at least twofold changes in gene expression among the groups when UM-SCC 11B was included in the mutant p53 group. Under all four circumstances, UM-SCC 11B line is always grouped with UM-SCC cells with wild-type p53-like group, but not with the group of cells with mutant p53. The significance of this grouping of cells has been tested by multiple statistical methods in the 'class prediction analysis' provided in BRB-Array Tools developed by the US National Institutes of Health, in which UM-SCC 11B was predicted to belong to the group designated wild-type p53-like (P < 0.001) [45]. Based on these observations, we defined the subset of UM-SCC cells with wild-type p53 genotype plus UM-SCC 11B cells as exhibiting a wild-type p53-like expression pattern. In order to analyze further the differential gene expression between HKCs and UM-SCC cells with different p53 status, we selected genes that satisfy at least one of the following criteria (two-tailed t-test, P < 0.05): twofold or greater change in average gene expression between UM-SCC cells in either group when compared with the average gene expression by HKCs; and twofold or greater change in average gene expression between ten tumor cell lines and HKCs. Hierarchical clustering was performed using Stanford University software Cluster 2.11 [101]. Missing data from microarray were estimated using K imputer of SAM 2.11 [102]. Normalized two-based log ratios for 1,011 genes were median-centered within each gene and array for all of the cluster analyses. The clustering results were obtained based on average linkage with one minus Pearson correlation distance metrics. The expression of clustered genes was visualized using Java Treeview [103]. A mixed model based F test was used to analyze differential gene expression among three groups: HKCs, UM-SCC cell lines with wild-type p53-like expression patterns, and UM-SCC cell lines with mutant p53 expression patterns. Such a model was used to test the fixed effects of random factors [104,105]. Each gene expression level was measured independently for each subject with homogeneous variance. The normalized model for this study is as follows: Y Where Yijg denotes response of gene g for subject j in group i (c, w, m); μ is a fixed effect that represents an overall mean value; Sj is the random effect of each subject; Gg is the random effect of each gene; εijg is the random error; Ex(c,w,m) is the fixed effect for different gene expression type; c is HKCs; w is wild-type p53-like; and m is mutant p53. Restricted maximum likelihood was used to estimate the covariance parameters in the model. The F test was performed in mixed model using SAS 9.1 program (SAS Institute Inc. Cary, NC, USA). P < 0.05 between two groups was considered statistically significant.

Gene Ontology annotation

GO annotation was performed using Onto-Express [46], AmiGo [106], and Entrez Gene database of National Center for Biotechnology Information (National Institutes of Health, Bethesda, MD, USA) [107]. Onto-Express dynamically calculates P values for the GO term based on the abstraction level chosen by the user. A hypergeometric distribution was used to calculate the significance values in the probability model [46]. Corrected P < 0.05 was used as cutoff value in this study.

Prediction of transcription factor binding sites on promoters of clustered genes

Promoter sequences of selected genes were extracted using ElDorado task of commercial Genomatix software suite 3.4.1 [108]. The promoter regions were defined as 'optimized' regions by Genomatix Suite 3.4.1, in which the average 'optimized' length of 159,207 vertebrate promoters is 632 bp, usually including about 500 bp upstream of the most 5' mapped transcription start site and about 100 bp downstream from the most 3' mapped transcription start site. The vertebrate promoters in GPD consist of 55,207 from human, 69,108 from mouse, and 35,190 from rat. GEMS Launcher of Genomatix is an integrated software package and contains multiple tasks of sequence analysis and functional genomics, including MatInspector. MatInspector is used to search transcription factor matrix matches based on position weight matrices, which provide powerful estimates for searching for TFBSs and have been used successfully to detect functional transcription factor elements [109-111]. To reduce false-positive findings, MatInspector also used optimized transcription factor matrix thresholds and transcription factor family concept, and incorporated data from independent publications [112,113]. In this study, the Gene2Promoter task in Genomatix was used to analyze a set of genes (cluster) when searching for TFBSs through MatInspector. All matrices of p53, NF-κB, and AP-1 families were used. The matrices for p53 family included the full length, 5' half site, or 3' half site of its binding motifs. NF-κB family included five subunits: RelA (p65), RelB, cRel, NF-κB1 (p50), and NF-κB2. AP-1 family consisted of c-Jun, JunB, JunD, c-Fos, FosB, Fra-1, Fra-2, BATF, JDP2, and SNFT. We chose matrix STAT3.01 for the STAT3 search, and matrices EGR1.01 and EGR1.02 for the EGR1 search. Default matrix indices (core similarity: 0.75; matrix similarity: optimized) were set during TFBS searching. The P value of TFBS frequency in a given cluster was calculated by MatInspector. It refers to the probability of obtaining an equal or greater number of sequences with a match in a randomly drawn sample of the same size as input sequence set [108]. P calculation was based on 'positional bias' [114] and the precalculated number of promoter matches in GPD. The lower this probability, the greater is the importance of the observed transcription factor. The χ2 test for TFBSs distributed in different clusters was also conducted.

Prediction of orthologous promoter regions and regulatory modules of multiple transcription factor binding sites

To find conserved TFBSs, we extracted orthologous promoter sets in proximal promoter region with average 'optimized' length among human, mouse, and rat for every gene of interest using 'Comparative genomics' in ElDorado. Multiple sequence alignment of each same promoter set was carried out using the DiAlignTF task of GEMS Launcher. DiAlignTF is able to reduce false-positive findings and display the most likely functional TFBSs within orthologous promoters [112]. Another task, FrameWorker, offers a technology with which to define a common framework of TFBSs from a set of promoter sequences, which was used to construct promoter models consisting of at least two TFBSs for each clustered gene. Such models represent the smallest functional transcriptional unit and putative regulatory module [115]. In this study, we used two sets of genes with either p53 or NF-κB binding sites. In general, the distance of two TFBSs was limited to 5 to 150 bp. For the group with NF-κB binding sites, the distance was set to 5 to 200 bp in order to search models containing four or five TFBSs. Default values were used for other parameters. Such frameworks were refined further with comparative genomics. However, in the p53 group, all genes in the putative models contain at least one p53 binding site. In each model, there was at least one orthologous promoter contains conserved p53 binding sites. The minimal number of the selected genes was seven for two TFBSs, five for three TFBSs, and four for four TFBSs. In the group containing NF-κB binding sites, only models satisfying the following conditions were selected: they contained at least one NF-κB binding site; at least 60% matched gene promoters contained orthologous regions among human, mouse, and rat. The minimal numbers of selected genes were ten for two TFBSs, six for three TFBSs, five for four TFBSs, and four for five TFBSs. Considering the large distance used in defining models containing four to five TFBSs, we set an additional condition; specifically, the model had to contain at least a gene with conserved or known NF-κB binding sites. The predicted models were then used to scan the entire human database of GPD, containing 55,207 promoters, using ModelInspector of GEMS Launcher. The number of hits and percentage of hits in GPD were traced. The percentage was also used to calculate selectivity in order to evaluate the putative models. 'Selectivity' was calculated using the following formula: percentage of matches in the cluster (the third column in the Table 3) divided by percentage of hits in GPD (the fourth column in the Table 3) [116]. The former refers to proportion of matches in certain gene clusters.

Basal and inducible gene expression in HKCs and UM-SCC cells

Gene expression profiles generated from microarray were confirmed by real-time quantitative RT-PCR using the Assays-on-Demand™ Gene Expression Assay (Applied Biosystems, Foster City, CA, USA). Briefly, cultured UM-SCC cells at 70% to 80% confluence were treated with or without TNF (2,000 units/ml, Knoll Pharmaceutical Company, Whippany, NJ, USA) or doxorubicin (0.5 μg/ml, Sigma, St. Louis, MO, USA) for different time points, and total RNA was isolated using Trizol (Invitrogen, Carlsbad, CA, USA). cDNA synthesis was performed by using High-Capacity cDNA Archive Kit (Applied Biosystems) to synthesize single-stranded cDNA from total RNA samples, in accordance with the manufacturer's protocol. The PCR cycling was carried out by adding 30 ng cDNA in 30 μl of PCR reaction mix (1.5 μl of 20× assay mix and 15 μl of 2× TaqMan Universal Master Mix; Applied Biosystems). Amplification conditions were as follows: activation of enzymes for 2 min at 50°C and 10 min at 95°C, followed by 40 cycles at 15 s at 95°C and 1 min at 60°C. Thermal cycling and fluorescence detection was done using an ABI Prism 7700 Sequence Detection System (Applied Biosystems). Relative quantitation of the expression was calculated by normalizing the target gene signals with the 18S endogenous control. An arbitrary unit was calculated after setting CT equal to 40 as undetectable expression level.

Chromatin immunoprecipitation assay

ChIP assays were performed using the EZ ChIP assay kit (Upstate Biotechnology, Waltham, MA, USA), following the manufacturer's directions. Briefly, cultured UM-SCC cells at 70% to 80% confluence were treated with or without TNF-α (2,000 unit/ml) for 1 hour. DNA and proteins were cross-linked by 1% formaldehyde, whole cell lysates then were harvested, and sonicated using SONICATOR XL2020 (Misonix Inc, Farmingdale, NY, USA). Pre-cleared chromatin was immunoprecipitated with a rabbit polyclonal antibody against p65 (Upstate Biotechnology), or c-Rel subunits (Santa Cruz, Biotechnology, Inc, Santa Cruz, CA, USA) of human NF-κB or normal rabbit IgG (Upstate Biotechnology), followed by incubation with protein A-agarose saturated salmon sperm DNA (Upstate Biotechnology). Precipitated DNA was analyzed by PCR (35 cycles) with Platinum Taq DNA Polymerase (Invitrogen, Carlsbad, CA). The PCR products were analyzed on a 2% agarose E-Gel (Invitrogen). Primers for the IL8 promoter (-121 to +61 bp) were 5'-GGGCCATCAGTTGCAAATC-3' (forward) and 5'-TTCCTTCCGGTGGTTTCTTC-3' (reverse). For the IL6 promoter (-203 to -60 bp) they were 5'-TGCACTTTTCCCCCTAGTTG-3' (forward) and 5'-TCATGGGAAAATCCCACATT-3' (reverse). For the ICAM1 promoter (-385 to -157 bp) they were 5'-GCAGCCTGGAGTCTCAGTTT-3' (forward) and 5'-TCCGGAATTTCCAAGCTAAA-3' (reverse). Finally, for the YAP1 (Yes-associated protein 1) promoter (-309 to -164) they were 5'-TAGCAACTTGCAGCGAAAAG-3' (forward) and 5'-GCCTCAAACGCCAAAACTAA-3' (reverse). All of the above PCR amplified regions contained at least a putative NF-κB binding site, as predicted by Genomatix Suite 3.4.1.

Additional data files

The following additional data are available with the online version of this paper. Additional data file 1 shows list of genes in cluster C (over-expressed in UM-SCC cells). Additional data file 2 includes GO annotations of genes in clusters A to C (over-expressed in UM-SCC cells) and of genes in clusters D to F (under-expressed in UM-SCC cells). Additional data 3 provides sequences, location, and matrix similarity of putative TFBSs in genes in clusters A and B (over-expressed in UM-SCC cells).

Additional data file 1

Shown is a list of genes in cluster C (over-expressed in UM-SCC cells). Click here for file

Additional data file 2

Included are GO annotations of genes in clusters A to C (over-expressed in UM-SCC cells) and of genes in clusters D to F (under-expressed in UM-SCC cells). Click here for file

Additional data file 3

Provided are sequences, location, and matrix similarity of putative TFBSs in genes in clusters A and B (over-expressed in UM-SCC cells). Click here for file
  112 in total

1.  Assessing gene significance from cDNA microarray expression data via mixed models.

Authors:  R D Wolfinger; G Gibson; E D Wolfinger; L Bennett; H Hamadeh; P Bushel; C Afshari; R S Paules
Journal:  J Comput Biol       Date:  2001       Impact factor: 1.479

2.  DNA damage induces cdk2 protein levels and histone H2B phosphorylation in SH-SY5Y neuroblastoma cells.

Authors:  Xiaoyi Yu; John Caltagarone; Mark A Smith; Robert Bowser
Journal:  J Alzheimers Dis       Date:  2005-09       Impact factor: 4.472

3.  Epigenetic modification of SOCS-1 differentially regulates STAT3 activation in response to interleukin-6 receptor and epidermal growth factor receptor signaling through JAK and/or MEK in head and neck squamous cell carcinomas.

Authors:  Tin Lap Lee; Jason Yeh; Carter Van Waes; Zhong Chen
Journal:  Mol Cancer Ther       Date:  2006-01       Impact factor: 6.261

4.  Gene expression profiles identify epithelial-to-mesenchymal transition and activation of nuclear factor-kappaB signaling as characteristics of a high-risk head and neck squamous cell carcinoma.

Authors:  Christine H Chung; Joel S Parker; Kim Ely; Jesse Carter; Yajun Yi; Barbara A Murphy; K Kian Ang; Adel K El-Naggar; Adam M Zanation; Anthony J Cmelak; Shawn Levy; Robbert J Slebos; Wendell G Yarbrough
Journal:  Cancer Res       Date:  2006-08-15       Impact factor: 12.701

Review 5.  Regulation of p53 downstream genes.

Authors:  W S el-Deiry
Journal:  Semin Cancer Biol       Date:  1998       Impact factor: 15.707

6.  Determination of functional domains in the C subunit of the CCAAT-binding factor (CBF) necessary for formation of a CBF-DNA complex: CBF-B interacts simultaneously with both the CBF-A and CBF-C subunits to form a heterotrimeric CBF molecule.

Authors:  I S Kim; S Sinha; B de Crombrugghe; S N Maity
Journal:  Mol Cell Biol       Date:  1996-08       Impact factor: 4.272

7.  Nuclear factor-kappaB p65 small interfering RNA or proteasome inhibitor bortezomib sensitizes head and neck squamous cell carcinomas to classic histone deacetylase inhibitors and novel histone deacetylase inhibitor PXD101.

Authors:  Jianming Duan; Jay Friedman; Liesl Nottingham; Zhong Chen; Gulshan Ara; Carter Van Waes
Journal:  Mol Cancer Ther       Date:  2007-01       Impact factor: 6.261

8.  Phylogenetic footprinting reveals a nuclear protein which binds to silencer sequences in the human gamma and epsilon globin genes.

Authors:  D L Gumucio; H Heilstedt-Williamson; T A Gray; S A Tarlé; D A Shelton; D A Tagle; J L Slightom; M Goodman; F S Collins
Journal:  Mol Cell Biol       Date:  1992-11       Impact factor: 4.272

9.  Tumor necrosis factor up-regulates intercellular adhesion molecule 1, which is important in the neutrophil-dependent lung and liver injury associated with hepatic ischemia and reperfusion in the rat.

Authors:  L M Colletti; A Cortis; N Lukacs; S L Kunkel; M Green; R M Strieter
Journal:  Shock       Date:  1998-09       Impact factor: 3.454

10.  Linking disease-associated genes to regulatory networks via promoter organization.

Authors:  S Döhr; A Klingenhoff; H Maier; M Hrabé de Angelis; T Werner; R Schneider
Journal:  Nucleic Acids Res       Date:  2005-02-08       Impact factor: 16.971

View more
  30 in total

1.  TNF-α promotes c-REL/ΔNp63α interaction and TAp73 dissociation from key genes that mediate growth arrest and apoptosis in head and neck cancer.

Authors:  Hai Lu; Xinping Yang; Praveen Duggal; Clint T Allen; Bin Yan; Jonah Cohen; Liesl Nottingham; Rose-Anne Romano; Satrajit Sinha; Kathryn E King; Wendy C Weinberg; Zhong Chen; Carter Van Waes
Journal:  Cancer Res       Date:  2011-09-20       Impact factor: 12.701

Review 2.  NF-kappaB in carcinoma therapy and prevention.

Authors:  Matthew Brown; Jonah Cohen; Pattatheyil Arun; Zhong Chen; Carter Van Waes
Journal:  Expert Opin Ther Targets       Date:  2008-09       Impact factor: 6.902

Review 3.  Dysregulated molecular networks in head and neck carcinogenesis.

Authors:  Alfredo A Molinolo; Panomwat Amornphimoltham; Cristiane H Squarize; Rogerio M Castilho; Vyomesh Patel; J Silvio Gutkind
Journal:  Oral Oncol       Date:  2008-09-19       Impact factor: 5.337

4.  ΔNp63 versatilely regulates a Broad NF-κB gene program and promotes squamous epithelial proliferation, migration, and inflammation.

Authors:  Xinping Yang; Hai Lu; Bin Yan; Rose-Anne Romano; Yansong Bian; Jay Friedman; Praveen Duggal; Clint Allen; Ryan Chuang; Reza Ehsanian; Han Si; Satrajit Sinha; Carter Van Waes; Zhong Chen
Journal:  Cancer Res       Date:  2011-05-15       Impact factor: 12.701

5.  Metal-proteinase ADAM12, kinesin 14 and checkpoint suppressor 1 as new molecular markers of laryngeal carcinoma.

Authors:  Jarosław Markowski; Tomasz Tyszkiewicz; Michał Jarzab; Małgorzata Oczko-Wojciechowska; Tatiana Gierek; Małgorzata Witkowska; Jarosław Paluch; Małgorzata Kowalska; Zbigniew Wygoda; Dariusz Lange; Barbara Jarzab
Journal:  Eur Arch Otorhinolaryngol       Date:  2009-07-16       Impact factor: 2.503

6.  The Role of the NF-kappaB Transcriptome and Proteome as Biomarkers in Human Head and Neck Squamous Cell Carcinomas.

Authors:  Zhong Chen; Bin Yan; Carter Van Waes
Journal:  Biomark Med       Date:  2008       Impact factor: 2.851

7.  YAP dysregulation by phosphorylation or ΔNp63-mediated gene repression promotes proliferation, survival and migration in head and neck cancer subsets.

Authors:  R Ehsanian; M Brown; H Lu; X P Yang; A Pattatheyil; B Yan; P Duggal; R Chuang; J Doondeea; S Feller; M Sudol; Z Chen; C Van Waes
Journal:  Oncogene       Date:  2010-08-23       Impact factor: 9.867

8.  Attenuated transforming growth factor beta signaling promotes nuclear factor-kappaB activation in head and neck cancer.

Authors:  Jonah Cohen; Zhong Chen; Shi-Long Lu; Xin Ping Yang; Pattatheyil Arun; Reza Ehsanian; Matthew S Brown; Hai Lu; Bin Yan; Oumou Diallo; Xiao-Jing Wang; Carter Van Waes
Journal:  Cancer Res       Date:  2009-04-07       Impact factor: 12.701

9.  Ronin is essential for embryogenesis and the pluripotency of mouse embryonic stem cells.

Authors:  Marion Dejosez; Joshua S Krumenacker; Laura Jo Zitur; Marco Passeri; Li-Fang Chu; Zhou Songyang; James A Thomson; Thomas P Zwaka
Journal:  Cell       Date:  2008-06-27       Impact factor: 41.582

10.  A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes.

Authors:  Angela K Dean; Stephen E Harris; Ivo Kalajzic; Jianhua Ruan
Journal:  BMC Bioinformatics       Date:  2009-09-17       Impact factor: 3.169

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.