John C Stansfield, Matthew Rusay, Roger Shan, Conor Kelton, Daria A Gaykalova, Elana J Fertig, Joseph A Califano, Michael F Ochs.
Abstract
The goal of this study was to discover a minimally invasive pathway-specific biomarker that is immune to normal cell mRNA contamination for diagnosing head and neck squamous cell carcinoma (HNSCC). Using Elsevier's MedScan natural language processing component of the Pathway Studio software and the TRANSFAC database, we produced a curated set of genes regulated by the signaling networks driving the development of HNSCC. The network and its gene targets provided prior probabilities for gene expression, which guided our CoGAPS matrix factorization algorithm to isolate patterns related to HNSCC signaling activity from a microarray-based study. Using patterns that distinguished normal from tumor samples, we identified a reduced set of genes to analyze with Top Scoring Pair in order to produce a potential biomarker for HNSCC. Our proposed biomarker comprises targets of the transcription factor (TF) HIF1A and the FOXO family of TFs coupled with genes that show remarkable stability across all normal tissues. Based on validation with novel data from The Cancer Genome Atlas (TCGA), measured by RNAseq, and bootstrap sampling, the biomarker for normal vs. tumor has an accuracy of 0.77, a Matthews correlation coefficient of 0.54, and an area under the curve (AUC) of 0.82.
Keywords: biomarkers; biostatistics; cancer; gene expression profiling
Year: 2016 PMID: 26884679 PMCID: PMC4750896 DOI: 10.4137/CIN.S32468
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
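The accuracy and Matthews correlation coefficient (MCC) reported in the abstract are both functions of a two-class confusion matrix. The following is a minimal sketch using the standard definitions; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not the study's data).
example_acc = accuracy(tp=40, tn=40, fp=10, fn=10)
example_mcc = matthews_corrcoef(tp=40, tn=40, fp=10, fn=10)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the tumor and normal classes are of unequal size.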
Figure 1. Diagram of the signaling network involved in HNSCC. The root nodes (octagons) of this diagram represent the receptors that are activated and then drive the rest of the network. The leaf nodes (circles) represent the TFs that activate a large number of genes involved in HNSCC. A pointed arrow represents activation of the target and a T represents repression of the target. Rounded rectangles represent signaling proteins.
Figure 2. The overall analysis path for the creation of robust biomarkers. The diagram shows the plan from initial data gathering to biomarker identification and is described in detail in the text.
Figure 3. Heat map showing hierarchical clustering of subject types across the patterns. The values in the heat map provide the level of association of a sample with a pattern. Class labels are presented in the top bar: HPV+ tumors (red), HPV− tumors (yellow), or normal tissue (green).
Figure 4. Boxplots of the strength of each sample in the patterns related to disease status produced by running CoGAPS with the HNSCC network prior.
Table 1. The target genes of HIF1A and FOXO and the reference gene list from which the biomarker of Table 2 was developed.
| HIF1A AND FOXO TARGETS | REFERENCE GENE LIST |
|---|---|
| ANGPTL2 IGFBP1 FBXO32 | TOP3A ACTR8 PTCD1 ZFYVE27 |
Table 2. TSPs produced from the analysis of the targets of HIF1A and FOXO to find a biomarker for differentiating HNSCC from normal tissue. Column 1 is the gene from the reference gene list, column 2 is the target of the TF identified by CoGAPS, and column 3 is the score of the TSP.
| GENE 1 (REFERENCE GENES) | GENE 2 (TF TARGETS) | TSP SCORE |
|---|---|---|
| MYBBP1A | HMOX1 | 0.470 |
| ZNF74 | TF | 0.448 |
| UBOX5 | HIF3A | 0.225 |
| COPS7B | BLNK | 0.806 |
| RHBDD1 | SELL | 0.669 |
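The TSP score in the third column can be illustrated with the standard Top Scoring Pair definition: the absolute difference, between the two classes, in the fraction of samples where gene 1 is expressed below gene 2. The sketch below assumes that definition; the function name `tsp_score`, the 0/1 label coding, and the toy expression values are illustrative and do not come from the study.

```python
from typing import Sequence


def tsp_score(
    gene1: Sequence[float],
    gene2: Sequence[float],
    labels: Sequence[int],
) -> float:
    """Return |P(gene1 < gene2 | normal) - P(gene1 < gene2 | tumor)|.

    labels uses an assumed coding: 0 = normal, 1 = tumor.
    """

    def frac_less(cls: int) -> float:
        # Indices of the samples belonging to the requested class.
        idx = [k for k, y in enumerate(labels) if y == cls]
        if not idx:
            raise ValueError(f"no samples with label {cls}")
        return sum(gene1[k] < gene2[k] for k in idx) / len(idx)

    return abs(frac_less(0) - frac_less(1))


# Toy data: a stable reference gene and a TF target that rises in tumors.
reference = [5.0, 5.0, 5.0, 5.0]
target = [1.0, 2.0, 9.0, 9.0]
classes = [0, 0, 1, 1]
score = tsp_score(reference, target, classes)
```

Because the score depends only on the within-sample ordering of the two genes, it is unaffected by any monotone per-sample normalization, which is one reason a stable reference gene makes a convenient partner for a TF target.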
Figure 5. ROC curves for the results of the TSPs as predictors for cancer in the original data set (A) and in the TCGA data generated by bootstrapping (B). Six thresholds (0–5) for the number of votes required to determine the case vs. control were used for producing these plots.
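The vote thresholds in the Figure 5 caption can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes that each of the five pairs votes "tumor" when the reference gene is expressed below the TF target, and that a sample is called a case when its vote count is at least the threshold. The names `count_votes` and `roc_points` are hypothetical.

```python
from typing import List, Sequence, Tuple


def count_votes(ref_expr: Sequence[float], target_expr: Sequence[float]) -> int:
    """Count pairs voting 'tumor' for one sample.

    Assumption: a pair votes 'tumor' when reference < target.
    """
    return sum(r < t for r, t in zip(ref_expr, target_expr))


def roc_points(
    votes: Sequence[int],
    labels: Sequence[int],
    n_pairs: int = 5,
) -> List[Tuple[int, float, float]]:
    """Return (threshold, FPR, TPR) for thresholds 0..n_pairs.

    labels uses an assumed coding: 0 = control, 1 = case.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")

    points = []
    for threshold in range(n_pairs + 1):
        # Assumed decision rule: call a case when votes >= threshold.
        calls = [v >= threshold for v in votes]
        tp = sum(1 for c, y in zip(calls, labels) if c and y == 1)
        fp = sum(1 for c, y in zip(calls, labels) if c and y == 0)
        points.append((threshold, fp / n_neg, tp / n_pos))
    return points


# Toy vote counts for four samples (two cases, two controls).
toy_points = roc_points(votes=[5, 4, 1, 0], labels=[1, 1, 0, 0])
```

Under this rule a threshold of 0 calls every sample a case, so the first point always sits at the upper-right corner of the ROC plot, and raising the threshold trades sensitivity for specificity.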