
Integrated multidimensional analysis is required for accurate prognostic biomarkers in colorectal cancer.

Marisa Mariani1, Shiquan He1, Mark McHugh1, Mirko Andreoli1, Deep Pandya1, Steven Sieber1, Zheyang Wu2, Paul Fiedler1, Shohreh Shahabi1, Cristiano Ferlini1.   

Abstract

Colorectal cancer (CRC) is one of the deadliest diseases in Western countries. To develop prognostic biomarkers of CRC aggressiveness, we retrospectively analyzed 267 CRC patients using a novel, multidimensional biomarker platform. Using nanofluidic technology for qPCR analysis and quantitative fluorescent immunohistochemistry for protein analysis, we assessed 33 microRNAs, 124 mRNAs and 9 protein antigens. Analysis was conducted in each single dimension (microRNA, gene or protein) using both the multivariate Cox model and the Kaplan-Meier method. Thereafter, we simplified the censored survival data into binary response data (aggressive vs. non-aggressive cancer). Subsequently, we integrated the data into a diagnostic score using sliced inverse regression for sufficient dimension reduction. Accuracy was assessed using the area under the receiver operating characteristic curve (AUC). Single dimension analysis led to the discovery of individual factors that were significant predictors of outcome. These included seven specific microRNAs, four genes, and one protein. When these factors were quantified individually as predictors of aggressive disease, the highest AUC was 0.68. By contrast, when all results from single dimensions were combined into integrated biomarkers, AUCs increased dramatically, with values approaching and even exceeding 0.9. Single dimension analysis generates statistically significant predictors, but their predictive strengths are suboptimal for clinical utility. A novel, multidimensional integrated approach overcomes these deficiencies. Newly derived integrated biomarkers have the potential to meaningfully guide the selection of therapeutic strategies for individual patients while elucidating molecular mechanisms driving disease progression.

Year: 2014; PMID: 24988459; PMCID: PMC4079703; DOI: 10.1371/journal.pone.0101065

Source DB: PubMed; Journal: PLoS One; ISSN: 1932-6203; Impact factor: 3.240


Introduction

CRC is one of the deadliest diseases worldwide. Caucasian patients with local, regional, or metastatic disease exhibit 5-year survival rates of 66%, 44%, and 4%, respectively [1]. Disease stage at the time of surgery is well established as the most important prognostic factor in CRC. In the last two decades, median overall survival has increased significantly with the introduction of new cytotoxic agents and biologic therapies. The response to such treatments depends on molecular determinants whose elucidation has been the focus of intense and productive research efforts. We now know, for example, that cancers harboring activating KRAS mutations do not respond to anti-EGFR therapy [2]. However, the goal of optimizing treatment protocols based on the unique molecular characteristics of an individual's tumor remains elusive. Development of novel biomarkers that can reliably identify patients at high risk for disease progression and death would be especially useful in determining the clinical circumstances where adjuvant chemotherapy is warranted. Whereas the antimetabolite 5-Fluorouracil (5FU) is standard therapy for patients with stage III CRC, the balance of its potential benefits and risks in stage II CRC patients is a matter of controversy and debate [3]. In the absence of a robust clinical predictor of disease outcome, the decision to treat or not to treat stage II patients with 5FU cannot rest on objective and firm criteria. Previously identified predictive biomarkers that had shown great promise in this arena, including telomerase, transforming growth factors (TGFα and TGFβ), epidermal growth factor receptors (erbB2 and erbB3) and mucins (MUC1 and MUC2), have disappointed in studies of clinical utility [4]. The traditional approach to biomarker development relies on single dimension (microRNA, gene or protein) analysis in an attempt to link a single molecular entity to tumor behavior.
This method seems to have reached a zenith that is suboptimal for clinical decision-making. Previous multidimensional approaches have demonstrated that through the combination of biomarkers coming from different dimensions a better knowledge of the biology of CRC can be achieved [5], [6], [7]. In an attempt to provide more personalized options, we developed a novel method that further advances the integration and incorporates multiple molecular entities from all three molecular dimensions (microRNA, genes and protein) simultaneously to generate accurate predictors of outcome in patients with CRC. Our results clearly demonstrate the superiority of this novel, multidimensional approach as compared with the traditional tools of single dimension analysis. We are hopeful that newly discovered multidimensional biomarkers will provide a basis for successful triage and stratification of patients in prospective clinical trials while simultaneously revealing molecular agents and pathways playing prominent sinister roles in CRC disease progression.

Materials and Methods

Gene and micro-RNA expression assessed with nanofluidic technology

A clinical cohort of 267 colon cancer patients was analyzed in this retrospective study. After approval by the Danbury Hospital Internal Review Board (DHIRB) and collection of the relevant clinical information, formalin-fixed paraffin-embedded (FFPE) samples were obtained from colon cancer cases that had been preserved between 2000 and 2008. In accordance with the study protocol (DH-17/12), which included full de-identification of patient information, DHIRB waived the need for informed consent. FFPE samples were cut to 10 µm thickness and two tissue slices were placed in a 1.5 ml tube. One milliliter of xylene was added to each tube for deparaffinization, followed by mixing twice with a high-speed vortex for 3 min at room temperature. Total RNA was then automatically extracted with the QIAcube using the miRNeasy FFPE kit (Qiagen, Valencia, CA) following the manufacturer's protocol. The RNA from SW837 cells was automatically extracted with the QIAcube using the miRNeasy kit (Qiagen, Valencia, CA) following the manufacturer's protocol. RNA quantity and quality were assessed with the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). Analysis was carried out using the 48.48 dynamic array (Fluidigm Corporation, CA, USA) and a Biomark platform following the manufacturer's protocol as previously described [8], [9].

Quantitative fluorescent immunohistochemistry

Quantitative fluorescent immunohistochemistry was performed for protein analysis. Tissue specimens were prepared in a Tissue Micro Array (TMA) format: representative tumor areas were obtained from Formalin Fixed Paraffin Embedded (FFPE) specimens of the primary tumor, and up to three representative replicate 3-mm cores from multiple tumor blocks were taken after review and marking of the hematoxylin and eosin stained slides by board-certified pathologists (SS and PF). In total, 630 cores were taken and distributed over 16 slides from 267 patients. FFPE tissues used as controls of the reaction included normal colon, kidney, liver, brain, breast, lymph nodes, thyroid, skin, tonsil, skeletal muscle and bladder along with breast cancer and non-small cell lung cancer. TMA slides were deparaffinized in xylene and then rehydrated in sequentially diluted ethanol solutions. Antigen retrieval was conducted by heating the slides in a steamer for 30 minutes in a solution of Tris-EDTA pH 8.0. Endogenous peroxidase activity was blocked by treating the slides in Peroxidazed reagent (Biocare Medical, Concord, CA) for 5 minutes. Non-specific binding was reduced by incubation with Background Sniper (Biocare Medical, Concord, CA) for 10 minutes. Slides were incubated with the primary target antibodies and epithelial and stromal cell mask antibodies diluted in Da Vinci Green antibody diluent (Biocare Medical, Concord, CA) for 1 hour at room temperature. Details of all antibodies used are in Table S1. Cyanine 5 (Cy5) directly conjugated to tyramide (Perkin-Elmer, Boston, MA) at 1∶50 dilution was used as the fluorescent detection for all the target antigens.

Statistical Analysis

For single dimension analysis, overall survival was calculated from the date of diagnosis to the date of death or the date last seen. Medians and life tables were computed using the Kaplan-Meier product-limit estimate, and the log-rank test was used to assess statistical significance. Multivariate analysis assessed the clinical role of each factor together with other clinical variables (age, stage, grade, tumor type, and gender) using the Cox proportional hazards model. To simplify the issue of censoring, we removed the patients who were censored within 3 years and transformed the survival data into a binary response, either aggressive or non-aggressive. For each factor that proved significant in multivariate analysis (p-value <0.05), the area under the receiver operating characteristic (ROC) curve (AUC) was used to assess discriminatory power. For multidimensional analysis, the dataset was randomly divided into training and testing subsets, with 125 cases in each subset. Multiple biomarkers were combined to yield a diagnostic score that was used as a predictor of outcome. To generate the score, we first used sliced inverse regression [10], [11] for sufficient dimension reduction, so that no information about the conditional distribution of the outcome was lost during the reduction. Next, a scalar diagnostic score was computed from the lower-dimensional data generated in the first step using the likelihood ratio statistic, which has been proven to be optimal among all possible functions of multiple markers for binary disease outcomes [12]. This approach enabled the use of information from multiple markers simultaneously without the need to make assumptions concerning the distributions of the markers. Cox and Kaplan-Meier analyses were employed to evaluate the statistical significance of the multidimensional biomarkers in multivariate analysis as described above.
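The binarization, dimension-reduction and AUC steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "aggressive" means death within 36 months, and it uses the fact that, with only two outcome slices, the leading sliced inverse regression direction is proportional to the inverse covariance matrix applied to the difference of the class means. The projection itself is used as the score, which is a simplification of the likelihood ratio step. All function names are hypothetical.

```python
from typing import List, Optional, Sequence


def binarize(months: float, died: bool, horizon: float = 36.0) -> Optional[int]:
    """Return 1 (aggressive), 0 (non-aggressive) or None (patient dropped).

    Assumption: "aggressive" means death within the horizon (36 months).
    """
    if died and months <= horizon:
        return 1
    if months > horizon:
        return 0  # followed beyond the horizon, so not an early death
    return None  # censored before the horizon: outcome unknown


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def column_means(rows: Sequence[Sequence[float]]) -> List[float]:
    """Mean of each marker (column) over the given patients (rows)."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sir_direction(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Leading SIR direction for a two-slice (binary) response."""
    n, p = len(X), len(X[0])
    mu = column_means(X)
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in X) / (n - 1)
            for j in range(p)] for i in range(p)]
    m1 = column_means([r for r, lab in zip(X, y) if lab == 1])
    m0 = column_means([r for r, lab in zip(X, y) if lab == 0])
    return solve(cov, [a - b for a, b in zip(m1, m0)])


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (aggressive, non-aggressive) pairs ranked correctly."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))
```

In a setting like the one described, the direction would be estimated on the training subset only and the AUC computed on scores of the testing subset.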

Results

Expression Analysis of microRNA

The main clinical parameters of the 267 CRC patients enrolled in this retrospective analysis are reported in Table 1. All specimens were collected at the first surgery, before any treatment. As anticipated, the most important clinical factor for predicting outcome was the stage of the disease. For patients at stage IV, progression was fast, with a median survival of 11 months, while for patients at earlier stages the outcome was better (Fig. 1). All patients were then treated with the best available care; this study therefore focuses on purely prognostic predictors and not on predictors of response to specific treatments. As a first step, we screened a series of 33 microRNAs to identify potential predictors of outcome in multivariate analysis, including age and stage in the Cox model. MicroRNAs were chosen according to the number of citations in PubMed using the keywords “colorectal cancer” and “microRNA”. Ten microRNAs (MiR-532-3p, Mir-200a, Mir-17, Mir-106a, MiR-193a-5p, MiR-145, MiR-375, Mir-29a, MiR-18a and Mir-200b) were statistically significant, each with a range risk ratio (RR) less than 1, meaning that high expression was related to a good outcome (Table 2). To further support the results of the Cox analysis, data were also assessed using the Kaplan-Meier method. Five percentile cutoffs (25th, 33rd, 50th, 67th and 75th) were used to stratify patients into high and low expression of each microRNA, and the log-rank test was used to determine whether differences in outcome were significant. The cutoff providing the lowest log-rank p-value was used as the discriminator (Table 2). Seven microRNAs (Mir-200a, Mir-17, Mir-106a, MiR-375, Mir-29a, MiR-18a and Mir-200b) were confirmed to be significant using the Kaplan-Meier method, and the corresponding plots are shown in Fig. 2.
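The cutoff search described above (split patients at each candidate percentile of expression, run a log-rank test, keep the cutoff with the lowest p-value) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the software used in the study. The function names, the interpolation rule for percentiles and the `x > threshold` definition of "high" expression are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def logrank_p(times: Sequence[float], events: Sequence[int],
              high: Sequence[bool]) -> float:
    """Two-group log-rank p-value (chi-square with 1 degree of freedom)."""
    diff = 0.0  # observed minus expected deaths in the "high" group
    var = 0.0   # hypergeometric variance summed over event times
    for t in sorted({x for x, e in zip(times, events) if e}):
        at_risk = [h for x, h in zip(times, high) if x >= t]
        deaths = [h for x, e, h in zip(times, events, high) if x == t and e]
        n, n1 = len(at_risk), sum(at_risk)
        d, d1 = len(deaths), sum(deaths)
        diff += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0.0:
        return 1.0
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(diff * diff / var / 2.0))


def best_cutoff(expression: Sequence[float], times: Sequence[float],
                events: Sequence[int],
                cutoffs: Sequence[int] = (25, 33, 50, 67, 75)) -> Tuple[float, int]:
    """Return (lowest log-rank p-value, percentile cutoff that produced it)."""
    results: List[Tuple[float, int]] = []
    for q in cutoffs:
        threshold = percentile(expression, q)
        high = [x > threshold for x in expression]
        if any(high) and not all(high):  # both groups must be non-empty
            results.append((logrank_p(times, events, high), q))
    if not results:
        raise ValueError("no cutoff splits the patients into two groups")
    return min(results)
```

Here `events` holds 1 for an observed death and 0 for a censored patient, and `times` holds the follow-up in months.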
Table 1

Distribution of clinico-pathological characteristics of patients.

| Characteristics | All Cases (%) | Training Set (%) | Testing Set (%) | p-value* |
|---|---|---|---|---|
| Number | 267 | 125 | 125 | 0.9469 |
| Age, yrs, median (range) | 70 (22–96) | 72 (22–96) | 70 (25–95) | |
| AJCC Stage | | | | |
| I–II | 176 (65.9) | 82 (65.6) | 80 (64.0) | 0.8947 |
| III–IV | 91 (34.1) | 43 (34.4) | 45 (36.0) | |
| Gender | | | | |
| Male | 122 (45.6) | 55 (44.0) | 59 (47.2) | 0.7033 |
| Female | 145 (54.3) | 70 (56.0) | 66 (52.8) | |
| Histotype | | | | |
| Adenocarcinoma | 243 (91.0) | 114 (91.2) | 115 (92.0) | 0.9114 |
| Mucinous | 24 (9.0) | 11 (8.8) | 10 (8.0) | |
| Grade | | | | |
| G1-2 | 212 (78.2) | 99 (79.2) | 101 (80.8) | 0.6091 |
| G3 | 31 (20.9) | 18 (14.4) | 13 (10.4) | |
| N/A | 24 (3.8) | 8 (6.4) | 11 (8.8) | |

*Testing the difference between the training and testing sets.

Figure 1

Kaplan-Meier analysis according to stage of the clinical group of 267 patients enrolled in this retrospective analysis.

In red stage I–II patients (n = 176), in green stage III patients (n = 82) and in blue stage IV (n = 8).

Table 2

Results of microRNA expression analysis in the clinical setting of 267 patients with multivariate Cox and Kaplan-Meier method.

| microRNA | P-value from KM | P-value from Cox | Range risk ratio | Range risk ratio lower limit | Range risk ratio upper limit |
|---|---|---|---|---|---|
| MiR-532-3p | 0.064771 | 0.008339 | 0.094842 | 0.01648 | 0.545822 |
| Mir-200a | 0.0029 | 0.010557 | 0.077561 | 0.010928 | 0.550463 |
| Mir-17 | 0.019042 | 0.011139 | 0.079489 | 0.011251 | 0.561619 |
| Mir-106a | 0.020048 | 0.015244 | 0.099816 | 0.015518 | 0.642047 |
| MiR-193a-5p | 0.071498 | 0.016423 | 0.108225 | 0.0176 | 0.665506 |
| MiR-145 | 0.107096 | 0.019985 | 0.193244 | 0.048386 | 0.771785 |
| MiR-375 | 0.000434 | 0.020366 | 0.21165 | 0.056987 | 0.786067 |
| Mir-29a | 0.001772 | 0.024357 | 0.135351 | 0.023734 | 0.771891 |
| MiR-18a | 0.020165 | 0.027317 | 0.224383 | 0.059514 | 0.845988 |
| Mir-200b | 0.02183 | 0.047943 | 0.144104 | 0.021133 | 0.98261 |
| Mir-200c | 0.015671 | 0.053802 | 0.193341 | 0.03639 | 1.027235 |
| Mir-429 | 0.009123 | 0.057033 | 0.180053 | 0.030799 | 1.052607 |
| Let-7c | 0.122339 | 0.064338 | 0.218802 | 0.043733 | 1.094688 |
| Mir-126 | 0.013457 | 0.07168 | 0.198886 | 0.034305 | 1.15305 |
| Mir-92a | 0.015478 | 0.088624 | 0.311977 | 0.081623 | 1.192433 |
| MiR-128 | 0.074133 | 0.091335 | 0.322118 | 0.086476 | 1.199867 |
| MiR-18b | 0.057048 | 0.093705 | 0.350199 | 0.102678 | 1.194406 |
| Mir-141 | 0.030167 | 0.100176 | 0.252985 | 0.049144 | 1.302326 |
| Mir-27a | 0.03627 | 0.103385 | 0.259099 | 0.051005 | 1.316174 |
| Let-7g | 0.11726 | 0.104305 | 0.295949 | 0.068104 | 1.286059 |
| Mir-221 | 0.063595 | 0.119559 | 0.322513 | 0.077581 | 1.340726 |
| Mir-21 | 0.103007 | 0.13283 | 0.341332 | 0.084031 | 1.386486 |
| MiR-320 | 0.005064 | 0.155964 | 0.299032 | 0.056422 | 1.584841 |
| MiR-642 | 0.039954 | 0.185248 | 0.280021 | 0.042591 | 1.841022 |
| Mir-34a | 0.042544 | 0.237098 | 0.395142 | 0.084772 | 1.84185 |
| Let-7e | 0.03418 | 0.328038 | 0.514886 | 0.136147 | 1.947215 |
| Mir-20a | 0.321134 | 0.331775 | 0.509387 | 0.130451 | 1.989068 |
| MiR-328 | 0.02598 | 0.383902 | 0.551185 | 0.144202 | 2.106798 |
| Mir-31 | 0.343731 | 0.507135 | 0.570242 | 0.108455 | 2.998244 |
| Mir-183 | 0.31913 | 0.572282 | 0.5731 | 0.083001 | 3.95708 |
| MiR-30c | 0.291601 | 0.882707 | 0.893081 | 0.198839 | 4.011246 |
| MiR-125b | 0.12562 | 0.941408 | 0.947918 | 0.227684 | 3.946476 |
| Mir-203 | 0.033555 | 0.954696 | 0.959546 | 0.230859 | 3.988278 |

Figure 2

Kaplan-Meier analysis of 267 patients according to the expression of Mir-200a, Mir-17, Mir-106a, MiR-375, Mir-29a, MiR-18a and Mir-200b.

Kaplan-Meier analysis was performed by dividing the patients into high (green) and low (red) expression groups. Survival time scale is in months. All the differences were significant and p-values are reported in Table 2 (Log-rank test).

Expression Analysis of genes

Nanofluidic technology offers the advantage of allowing analysis of microRNAs and their target genes (targetome) in the same RNA sample, owing to the low volume of each individual qPCR reaction. To perform this analysis, we employed multiple software applications (www.miRbase.org) [13] to prepare a list of genes that might be targeted by the 33 microRNAs investigated in this study. The list was prioritized according to a functional network obtained with the DAVID software (http://david.abcc.ncifcrf.gov) [14] in order to enrich the pool with actionable targets and master regulators of gene expression and apoptosis. After an initial analysis of 180 candidates, we focused on 79 genes whose expression was detectable in a large number of CRC patients. Six genes (MID1, INHBA, OSBPL3, BGN, DICER1 and FAP) were predictors in univariate analysis (data not shown), but only MID1 remained significant after multivariate correction and Kaplan-Meier analysis (Table 3 and Fig. 3). As a second measure to possibly increase the number of candidate genes, we analyzed the public dataset GSE14333, which reports transcriptome analysis of 290 CRC patients [15]. For each individual gene, data were analyzed in a multivariate Cox model as described above (Table 4). Predictive capability was confirmed by Kaplan-Meier analysis using the multiple-cutoff selection procedure described above. The 45 genes with the lowest p-values in multivariate analysis were assessed in our platform of nanofluidic gene expression. Only 3 out of 45 (7%) genes were confirmed as predictors of outcome in both GSE14333 and our clinical setting in multivariate Cox regression and Kaplan-Meier analysis (ANO1, KANK4 and IGFBP3, Table 4 and Fig. 3).
Table 3

Results of gene expression analysis in the clinical setting of 267 patients with multivariate Cox and Kaplan-Meier method.

| Gene | P-value from KM | P-value from Cox | Range risk ratio | Range risk ratio lower limit | Range risk ratio upper limit |
|---|---|---|---|---|---|
| ANO1 | 0.000455 | 0.001999 | 251.7049 | 7.554459 | 8386.487 |
| MID1 | 0.022369 | 0.00219 | 0.061488 | 0.010323 | 0.366259 |
| KANK4 | 0.00051 | 0.014574 | 29.03691 | 1.946079 | 433.2518 |
| IGFBP3 | 0.000968 | 0.021595 | 14.17555 | 1.476178 | 136.1259 |
| KLF6 | 0.239788 | 0.02687 | 0.130799 | 0.021595 | 0.792224 |
| GLI3 | 0.114457 | 0.038389 | 0.115389 | 0.014944 | 0.890962 |
| SMCR7L | 0.037868 | 0.04224 | 9.413194 | 1.081742 | 81.91252 |
| NEK6 | 0.066407 | 0.050178 | 133.963 | 0.996207 | 18014.43 |
| NAT2 | 0.356728 | 0.065965 | 7.387218 | 0.876405 | 62.26683 |
| PBX3 | 0.140048 | 0.072855 | 0.076392 | 0.004598 | 1.26913 |
| ANGPT2 | 0.000343 | 0.073683 | 21.83904 | 0.744166 | 640.9105 |
| AIMP2 | 0.431665 | 0.07928 | 16.0743 | 0.722844 | 357.4535 |
| OSBPL3 | 8.41E-05 | 0.08083 | 4.974909 | 0.821414 | 30.13061 |
| ATXN1 | 0.020848 | 0.102364 | 8.834232 | 0.646996 | 120.6247 |
| TUBB3 | 0.019182 | 0.106898 | 3.208144 | 0.777738 | 13.2335 |
| TPBG | 0.009543 | 0.109652 | 3.902565 | 0.735954 | 20.69425 |
| IGF1R | 0.134023 | 0.110324 | 9.739804 | 0.595836 | 159.2113 |
| EZR | 0.016156 | 0.117108 | 17.77504 | 0.485969 | 650.1487 |
| SAV1 | 0.011384 | 0.118547 | 14.03151 | 0.509015 | 386.7929 |
| AR | 0.348534 | 0.121365 | 0.161848 | 0.01616 | 1.621013 |
| RAI14 | 0.00433 | 0.131537 | 3.122865 | 0.710897 | 13.71828 |
| CXCR4 | 0.05891 | 0.140109 | 0.269459 | 0.047199 | 1.53833 |
| UST | 0.013135 | 0.140332 | 7.706586 | 0.510569 | 116.324 |
| INHBA | 0.002297 | 0.145151 | 3.199909 | 0.669214 | 15.30066 |
| CCL5 | 0.07661 | 0.146705 | 0.218204 | 0.027924 | 1.705063 |
| STC1 | 0.047976 | 0.154726 | 11.71039 | 0.395198 | 346.9982 |
| CALU | 0.540614 | 0.173061 | 0.215761 | 0.02376 | 1.959315 |
| CD109 | 0.159253 | 0.178425 | 5.255192 | 0.46883 | 58.90633 |
| DICER1 | 0.166057 | 0.19096 | 0.255369 | 0.033012 | 1.975454 |
| HIF1A | 0.779411 | 0.208162 | 0.362001 | 0.074392 | 1.76154 |
| MKI67 | 0.007381 | 0.224242 | 3.396451 | 0.472755 | 24.40137 |
| VAV3 | 0.024025 | 0.224528 | 0.47273 | 0.141103 | 1.583763 |
| TFAM | 0.482992 | 0.240308 | 10.34319 | 0.209424 | 510.8378 |
| BGN | 0.013917 | 0.24456 | 2.629649 | 0.516015 | 13.40088 |
| UGCG | 0.133881 | 0.245772 | 18.61169 | 0.133532 | 2594.106 |
| PLK2 | 0.121792 | 0.271772 | 7.080799 | 0.215788 | 232.3469 |
| CAV2 | 0.095172 | 0.276778 | 2.660236 | 0.456206 | 15.51242 |
| BBX | 0.374652 | 0.279638 | 12.47811 | 0.128517 | 1211.536 |
| SNAI1 | 0.183131 | 0.282896 | 0.363617 | 0.057377 | 2.304384 |
| MCL1 | 0.231297 | 0.283847 | 0.295541 | 0.031804 | 2.746347 |
| COX7A2L | 0.122784 | 0.285949 | 0.277405 | 0.026316 | 2.924187 |
| CEP170 | 0.018982 | 0.299886 | 4.728032 | 0.250678 | 89.17529 |
| CCND1 | 0.171718 | 0.300918 | 3.251198 | 0.348254 | 30.35228 |
| COMMD2 | 0.077355 | 0.303629 | 5.395563 | 0.217408 | 133.9053 |
| ADAMTS5 | 0.123317 | 0.305862 | 2.314392 | 0.464361 | 11.535 |
| ESR2 | 0.153752 | 0.311968 | 2.66113 | 0.399116 | 17.74326 |
| KLF12 | 0.465026 | 0.319069 | 2.64606 | 0.390215 | 17.943 |
| CCL2 | 0.441674 | 0.319525 | 0.438755 | 0.086651 | 2.22162 |
| EPAS1 | 0.490625 | 0.331663 | 0.418505 | 0.072098 | 2.429288 |
| RTN2 | 0.249843 | 0.334748 | 2.735875 | 0.353963 | 21.14633 |
| CHST13 | 0.039054 | 0.336699 | 2.876949 | 0.33316 | 24.84341 |
| PIM1 | 0.105052 | 0.367117 | 3.552454 | 0.226011 | 55.83778 |
| PPP2CA | 0.19979 | 0.373204 | 0.317213 | 0.025341 | 3.970806 |
| CLOCK | 0.209762 | 0.374067 | 0.492709 | 0.103457 | 2.346506 |
| FAM84A | 0.107645 | 0.37997 | 0.496851 | 0.104249 | 2.367992 |
| TM4SF1 | 0.158023 | 0.392541 | 3.496384 | 0.198371 | 61.62552 |
| CAMK2D | 0.703007 | 0.396634 | 1.968245 | 0.411242 | 9.420213 |
| HIPK1 | 0.368779 | 0.397418 | 0.322743 | 0.023514 | 4.429794 |
| KLF5 | 0.089271 | 0.408022 | 0.523601 | 0.113069 | 2.42469 |
| PTEN | 0.554417 | 0.413923 | 0.412612 | 0.049346 | 3.450134 |
| MYOF | 0.275264 | 0.417654 | 5.561907 | 0.087722 | 352.6461 |
| PNRC1 | 0.318134 | 0.420618 | 3.398165 | 0.173127 | 66.69977 |
| IGF2R | 0.096886 | 0.429115 | 2.60505 | 0.242724 | 27.95882 |
| HGF | 0.003672 | 0.429973 | 2.099431 | 0.332818 | 13.24333 |
| CDKN1A | 0.059386 | 0.430643 | 0.388964 | 0.037156 | 4.071878 |
| TGFB1 | 0.006336 | 0.446801 | 2.421279 | 0.248095 | 23.63038 |
| KDR | 0.18116 | 0.449974 | 1.72842 | 0.417918 | 7.14838 |
| EPHA3 | 0.567195 | 0.472304 | 0.55043 | 0.108045 | 2.804128 |
| CFTR | 0.384598 | 0.47535 | 2.360475 | 0.223261 | 24.95658 |
| ARNT2 | 0.06533 | 0.481856 | 0.413958 | 0.035442 | 4.834986 |
| MITF | 0.110527 | 0.487799 | 1.829958 | 0.331946 | 10.08821 |
| AHNAK2 | 0.188641 | 0.495026 | 1.774388 | 0.341724 | 9.213451 |
| KLF7 | 0.049811 | 0.506272 | 1.612536 | 0.39409 | 6.598172 |
| JAK2 | 0.394445 | 0.516395 | 0.591344 | 0.120979 | 2.890488 |
| SOX2 | 0.150192 | 0.541916 | 0.530877 | 0.069385 | 4.061828 |
| MAPRE1 | 0.357987 | 0.543075 | 0.557944 | 0.085098 | 3.658149 |
| CDH1 | 0.03582 | 0.552333 | 1.615003 | 0.332349 | 7.847868 |
| IGSF5 | 0.019559 | 0.555833 | 1.679289 | 0.29926 | 9.423273 |
| ANTXR2 | 0.546413 | 0.565429 | 3.671851 | 0.043523 | 309.7803 |
| PLK1 | 0.416995 | 0.572943 | 0.671878 | 0.168582 | 2.677758 |
| MECP2 | 0.462499 | 0.581138 | 0.626129 | 0.118665 | 3.303723 |
| VAMP2 | 0.161211 | 0.58226 | 0.628316 | 0.119973 | 3.290569 |
| COL1A1 | 0.02623 | 0.584364 | 0.529685 | 0.054348 | 5.162451 |
| FES | 0.382467 | 0.593556 | 1.591399 | 0.288902 | 8.766134 |
| PTK2 | 0.060876 | 0.601408 | 1.649351 | 0.25232 | 10.78138 |
| BCL2 | 0.338186 | 0.610589 | 0.626785 | 0.103814 | 3.784251 |
| HOXB5 | 0.27284 | 0.62191 | 1.486459 | 0.307571 | 7.183917 |
| CYP39A1 | 0.486591 | 0.642143 | 1.671365 | 0.191529 | 14.58506 |
| CDX2 | 0.002486 | 0.683233 | 0.651889 | 0.083486 | 5.090169 |
| PDGFRB | 0.042645 | 0.69718 | 1.42049 | 0.242459 | 8.322194 |
| CD59 | 0.005198 | 0.700335 | 1.635662 | 0.133489 | 20.04205 |
| EPHB2 | 0.042245 | 0.744099 | 0.655446 | 0.051878 | 8.281188 |
| FAM173B | 0.112702 | 0.744395 | 1.849962 | 0.045828 | 74.67826 |
| SHMT2 | 0.040828 | 0.745961 | 0.73874 | 0.118273 | 4.614223 |
| SLC12A2 | 0.434188 | 0.771848 | 0.776719 | 0.140768 | 4.285722 |
| HOXB7 | 0.195627 | 0.788445 | 0.798834 | 0.154871 | 4.120429 |
| PBK | 0.445747 | 0.790426 | 0.821722 | 0.193118 | 3.496454 |
| DUSP10 | 0.13842 | 0.794739 | 1.669082 | 0.03519 | 79.16626 |
| CISH | 0.090655 | 0.810002 | 0.754679 | 0.076079 | 7.486208 |
| FAP | 0.030751 | 0.811833 | 1.191001 | 0.282441 | 5.022236 |
| MAFF | 0.255153 | 0.814144 | 0.748927 | 0.067236 | 8.342135 |
| MET | 0.13437 | 0.823224 | 1.374103 | 0.084554 | 22.33087 |
| CHEK1 | 0.163926 | 0.825053 | 0.822722 | 0.145827 | 4.641614 |
| ESR1 | 0.056936 | 0.846116 | 1.3123 | 0.084339 | 20.41915 |
| CDX1 | 0.010579 | 0.848595 | 1.30181 | 0.086808 | 19.52242 |
| ADM | 0.262582 | 0.853341 | 1.156516 | 0.247503 | 5.40409 |
| HECTD2 | 0.353439 | 0.855316 | 0.860183 | 0.170415 | 4.341829 |
| PPARGC1B | 0.37968 | 0.855611 | 1.222648 | 0.14027 | 10.65707 |
| LDLR | 0.355447 | 0.858784 | 1.222084 | 0.134155 | 11.13256 |
| HIC2 | 0.176202 | 0.862587 | 0.815525 | 0.081015 | 8.209401 |
| CHEK2 | 0.155676 | 0.866904 | 1.172525 | 0.182286 | 7.542064 |
| AXL | 0.039057 | 0.882521 | 1.167514 | 0.149675 | 9.107027 |
| SRC | 0.503974 | 0.883268 | 0.895666 | 0.205758 | 3.898842 |
| FGFR1 | 0.416422 | 0.892855 | 0.896111 | 0.181618 | 4.421444 |
| CXCL10 | 0.090745 | 0.898925 | 0.868597 | 0.098801 | 7.636207 |
| ETS2 | 0.135175 | 0.904279 | 0.80061 | 0.021349 | 30.02398 |
| KIF11 | 0.243986 | 0.91388 | 0.918496 | 0.196749 | 4.287873 |
| ROCK1 | 0.302625 | 0.915522 | 1.123317 | 0.131032 | 9.630014 |
| DROSHA | 0.25995 | 0.924716 | 1.086434 | 0.194645 | 6.064063 |
| JAK1 | 0.113129 | 0.930365 | 0.905077 | 0.096647 | 8.47581 |
| ERBB2 | 0.302875 | 0.962123 | 1.048409 | 0.14901 | 7.376446 |
| CTNNB1 | 0.099735 | 0.963275 | 1.03343 | 0.254908 | 4.18966 |
| GBP1 | 0.040014 | 0.969138 | 1.043144 | 0.122758 | 8.864152 |
| SLITRK4 | 0.238082 | 0.973046 | 0.978316 | 0.274291 | 3.489367 |

Figure 3

Kaplan-Meier analysis of 267 patients according to the expression of MID1, ANO1, KANK4 and IGFBP3.

Kaplan-Meier analysis was performed by dividing the patients into high (green) and low (red) expression groups. Survival time scale is in months. All the differences were significant and p-values are reported in Table 3 (Log-rank test).

Table 4

Results of gene expression analysis in the public dataset GSE14333 with multivariate Cox and Kaplan-Meier method.

| Probe | Gene | P-value from KM | P-value from Cox | Range risk ratio | Range risk ratio lower limit | Range risk ratio upper limit |
|---|---|---|---|---|---|---|
| 229357_at | ADAMTS5 | 6.04E-06 | 0.000201 | 34.0864 | 5.305339 | 219.0025 |
| 235368_at | ADAMTS5 | 1.32E-06 | 8.75E-05 | 42.64678 | 6.539883 | 278.101 |
| 219935_at | ADAMTS5 | 8.92E-05 | 0.000265 | 22.25267 | 4.200208 | 117.8945 |
| 1558636_s_at | ADAMTS5 | 0.001966 | 0.005793 | 100.589 | 3.802689 | 2660.788 |
| 220726_at | AHNAK2 | 0.26889 | 0.690171 | 0.697388 | 0.118539 | 4.102851 |
| 1558378_a_at | AHNAK2 | 0.000362 | 0.00308 | 27.08616 | 3.047499 | 240.7416 |
| 212992_at | AHNAK2 | 1.55E-06 | 3.46E-05 | 483.5156 | 25.93489 | 9014.392 |
| 209971_x_at | AIMP2 | 0.000178 | 0.000173 | 0.056117 | 0.01248 | 0.252328 |
| 202138_x_at | AIMP2 | 0.000483 | 5.44E-05 | 0.0415 | 0.00885 | 0.194598 |
| 209972_s_at | AIMP2 | 0.012259 | 0.457615 | 0.415432 | 0.040911 | 4.218525 |
| 205572_at | ANGPT2 | 0.007138 | 0.027838 | 11.73135 | 1.307621 | 105.248 |
| 236034_at | ANGPT2 | 0.003749 | 0.000137 | 102.5228 | 9.496737 | 1106.794 |
| 211148_s_at | ANGPT2 | 0.000103 | 0.066738 | 26.07117 | 0.798486 | 851.2436 |
| 237261_at | ANGPT2 | 0.032343 | 0.153008 | 12.34905 | 0.392984 | 388.0543 |
| 1555269_a_at | ANO1 | 0.005692 | 0.02366 | 23.37811 | 1.52436 | 358.5346 |
| 218804_at | ANO1 | 0.001514 | 0.000249 | 88.27612 | 8.031295 | 970.2886 |
| 1555536_at | ANTXR2 | 0.165872 | 0.448339 | 1.992017 | 0.33543 | 11.82999 |
| 225524_at | ANTXR2 | 0.002183 | 0.00047 | 44.01017 | 5.278853 | 366.916 |
| 228573_at | ANTXR2 | 0.005047 | 0.000316 | 19.32602 | 3.857263 | 96.82903 |
| 213015_at | BBX | 0.003728 | 0.006287 | 20.99437 | 2.364569 | 186.4033 |
| 1557239_at | BBX | 0.291266 | 0.980533 | 1.036403 | 0.058639 | 18.31775 |
| 223134_at | BBX | 6.95E-05 | 0.00274 | 30.21056 | 3.248415 | 280.9611 |
| 226331_at | BBX | 2.81E-05 | 8.81E-07 | 359.268 | 34.41053 | 3750.987 |
| 213016_at | BBX | 0.004471 | 0.000586 | 102.7534 | 7.326352 | 1441.135 |
| 223135_s_at | BBX | 6.36E-08 | 1.22E-05 | 45.64845 | 8.239986 | 252.8865 |
| 232008_s_at | BBX | 0.000531 | 9.24E-05 | 102.4825 | 10.06176 | 1043.82 |
| 1557240_a_at | BBX | 0.169963 | 0.299311 | 2.43368 | 0.453789 | 13.05188 |
| 213426_s_at | CAV2 | 0.188766 | 0.031867 | 7.132542 | 1.18575 | 42.90378 |
| 203324_s_at | CAV2 | 3.97E-07 | 4.18E-05 | 41.99252 | 7.026959 | 250.9438 |
| 203323_at | CAV2 | 1.60E-09 | 2.04E-06 | 77.00721 | 12.82191 | 462.4983 |
| 229900_at | CD109 | 0.041603 | 0.004916 | 26.16536 | 2.689981 | 254.5096 |
| 226545_at | CD109 | 0.000325 | 0.00023 | 20.46414 | 4.105505 | 102.0048 |
| 200984_s_at | CD59 | 0.000362 | 0.000149 | 55.54448 | 6.96658 | 442.8556 |
| 200983_x_at | CD59 | 1.14E-06 | 3.41E-05 | 70.30158 | 9.407096 | 525.3813 |
| 200985_s_at | CD59 | 9.51E-06 | 3.67E-05 | 52.14801 | 7.975754 | 340.9602 |
| 212463_at | CD59 | 6.84E-07 | 1.22E-06 | 175.8647 | 21.78629 | 1419.626 |
| 228748_at | CD59 | 0.004633 | 0.041237 | 7.88942 | 1.085611 | 57.33449 |
| 206430_at | CDX1 | 6.36E-07 | 4.45E-05 | 0.030374 | 0.005675 | 0.162561 |
| 206387_at | CDX2 | 0.0014 | 3.67E-05 | 0.073955 | 0.021472 | 0.254721 |
| 231606_at | CDX2 | 0.000568 | 0.095068 | 0.303118 | 0.074635 | 1.231068 |
| 212746_s_at | CEP170 | 0.000299 | 0.009884 | 23.82942 | 2.142353 | 265.055 |
| 207719_x_at | CEP170 | 0.000131 | 6.49E-05 | 31.19191 | 5.766271 | 168.7287 |
| 234702_x_at | CFTR | 0.01694 | 0.27293 | 0.461545 | 0.115856 | 1.838699 |
| 215703_at | CFTR | 0.037325 | 1.37E-05 | 0.053111 | 0.014147 | 0.199386 |
| 217026_at | CFTR | 0.385981 | 0.252799 | 0.168605 | 0.007976 | 3.563921 |
| 215702_s_at | CFTR | 0.012329 | 0.000782 | 0.02279 | 0.002509 | 0.206981 |
| 234706_x_at | CFTR | 0.371555 | 0.578887 | 0.511618 | 0.047986 | 5.454732 |
| 205043_at | CFTR | 0.021692 | 3.17E-05 | 0.065349 | 0.018078 | 0.236224 |
| 239647_at | CHST13 | 0.000319 | 3.61E-05 | 0.07416 | 0.021584 | 0.254811 |
| 242503_at | CHST13 | 0.246465 | 0.263741 | 0.463753 | 0.120515 | 1.784564 |
| 223377_x_at | CISH | 0.01124 | 0.030594 | 0.153875 | 0.02821 | 0.839335 |
| 223961_s_at | CISH | 4.08E-05 | 5.85E-05 | 0.033215 | 0.006312 | 0.174775 |
| 221223_x_at | CISH | 0.002772 | 0.001846 | 0.036573 | 0.004558 | 0.293441 |
| 226910_at | COMMD2 | 0.000633 | 0.000126 | 37.40157 | 5.872115 | 238.2237 |
| 223491_at | COMMD2 | 3.60E-05 | 0.019365 | 10.47272 | 1.462487 | 74.99413 |
| 221563_at | DUSP10 | 5.91E-05 | 0.001099 | 37.05756 | 4.233756 | 324.3604 |
| 215501_s_at | DUSP10 | 1.84E-05 | 0.000388 | 26.5064 | 4.335402 | 162.0586 |
| 209588_at | EPHB2 | 0.00033 | 9.89E-05 | 0.05185 | 0.011687 | 0.230028 |
| 209589_s_at | EPHB2 | 0.010517 | 0.00193 | 0.082811 | 0.017148 | 0.399908 |
| 211165_x_at | EPHB2 | 2.20E-05 | 0.000149 | 0.046433 | 0.0095 | 0.226942 |
| 210651_s_at | EPHB2 | 0.027564 | 0.000968 | 0.060346 | 0.011386 | 0.319845 |
| 234158_at | EPHB2 | 0.031839 | 0.169435 | 0.412742 | 0.116806 | 1.45845 |
| 233699_at | EPHB2 | 0.231256 | 0.453665 | 1.688996 | 0.428775 | 6.653159 |
| 222303_at | ETS2 | 0.006806 | 0.005887 | 0.114387 | 0.024448 | 0.535198 |
| 201328_at | ETS2 | 0.000411 | 8.08E-05 | 0.081369 | 0.023375 | 0.283247 |
| 201329_s_at | ETS2 | 0.000328 | 8.38E-05 | 0.065016 | 0.016656 | 0.253792 |
| 208621_s_at | EZR | 0.034737 | 0.097102 | 7.073765 | 0.7013 | 71.35055 |
| 217234_s_at | EZR | 0.044295 | 0.041498 | 9.910504 | 1.092409 | 89.90964 |
| 208622_s_at | EZR | 0.000622 | 0.001303 | 99.00849 | 6.013656 | 1630.07 |
| 208623_s_at | EZR | 4.02E-07 | 2.11E-05 | 1207.412 | 45.89005 | 31768.2 |
| 217230_at | EZR | 0.174403 | 0.140414 | 3.361179 | 0.670718 | 16.84393 |
| 238645_at | EZR | 0.021088 | 0.042168 | 11.38081 | 1.089858 | 118.8439 |
| 215200_x_at | EZR | 0.286994 | 0.318044 | 2.823012 | 0.368122 | 21.64879 |
| 225670_at | FAM173B | 2.61E-05 | 0.000117 | 0.033282 | 0.005894 | 0.187931 |
| 225668_at | FAM173B | 0.063779 | 0.343216 | 0.331004 | 0.033645 | 3.256445 |
| 234335_s_at | FAM84A | 1.84E-05 | 2.34E-05 | 0.021597 | 0.003653 | 0.127692 |
| 229546_at | FAM84A | 0.000894 | 2.59E-05 | 0.034042 | 0.007048 | 0.164431 |
| 225667_s_at | FAM84A | 0.000701 | 1.19E-05 | 0.013224 | 0.001908 | 0.091637 |
| 234331_s_at | FAM84A | 0.000489 | 3.23E-07 | 0.010521 | 0.001834 | 0.06036 |
| 231439_at | FAM84A | 0.002881 | 0.00016 | 0.043208 | 0.008456 | 0.220772 |
| 228459_at | FAM84A | 0.000124 | 0.00285 | 0.104682 | 0.023768 | 0.461058 |
| 228319_at | FAM84A | 0.088904 | 0.688981 | 0.609438 | 0.053918 | 6.888528 |
| 210095_s_at | IGFBP3 | 0.00038 | 0.000122 | 34.81351 | 5.693815 | 212.8592 |
| 212143_s_at | IGFBP3 | 3.87E-06 | 6.58E-05 | 40.75663 | 6.598764 | 251.7294 |
| 243027_at | IGSF5 | 0.000113 | 0.001432 | 55.30184 | 4.691637 | 651.8607 |
| 229125_at | KANK4 | 8.48E-06 | 0.000276 | 12.86927 | 3.247498 | 50.99869 |
| 217173_s_at | LDLR | 0.011169 | 0.16538 | 4.576761 | 0.533656 | 39.25138 |
| 202067_s_at | LDLR | 0.002542 | 0.120351 | 14.71872 | 0.494593 | 438.0182 |
| 202068_s_at | LDLR | 0.002236 | 0.000255 | 51.66079 | 6.236509 | 427.9376 |
| 217103_at | LDLR | 0.340416 | 0.568331 | 0.54143 | 0.065787 | 4.456002 |
| 217183_at | LDLR | 0.129907 | 0.480234 | 1.708543 | 0.386148 | 7.559592 |
| 217005_at | LDLR | 0.074924 | 0.228696 | 2.675823 | 0.538828 | 13.28814 |
| 205193_at | MAFF | 0.000158 | 0.000102 | 44.54421 | 6.562634 | 302.3461 |
| 211864_s_at | MYOF | 1.04E-05 | 0.000138 | 62.98781 | 7.483162 | 530.1855 |
| 201798_s_at | MYOF | 0.000208 | 2.47E-05 | 188.5189 | 16.51764 | 2151.602 |
| 217518_at | MYOF | 0.020004 | 0.166863 | 4.972052 | 0.511631 | 48.31859 |
| 206797_at | NAT2 | 0.000112 | 2.13E-05 | 0.062607 | 0.017448 | 0.224656 |
| 201939_at | PLK2 | 0.003791 | 7.69E-05 | 14.06108 | 3.792438 | 52.13371 |
| 209034_at | PNRC1 | 1.45E-05 | 4.87E-05 | 31.43427 | 5.954339 | 165.9484 |
| 1555282_a_at | PPARGC1B | 0.060267 | 0.352001 | 0.505394 | 0.120092 | 2.126897 |
| 1553639_a_at | PPARGC1B | 3.01E-06 | 2.18E-05 | 0.03799 | 0.008394 | 0.171939 |
| 1563943_at | PPARGC1B | 0.458245 | 0.985653 | 1.014717 | 0.206439 | 4.98768 |
| 202052_s_at | RAI14 | 5.41E-07 | 0.000287 | 2081.501 | 33.49958 | 129334.4 |
| 204217_s_at | RTN2 | 0.000905 | 0.007263 | 165.478 | 3.970497 | 6896.612 |
| 222573_s_at | SAV1 | 0.002144 | 0.027685 | 13.52864 | 1.331125 | 137.4957 |
| 218276_s_at | SAV1 | 1.68E-06 | 4.00E-05 | 39.62951 | 6.846883 | 229.3742 |
| 234491_s_at | SAV1 | 0.023935 | 0.012655 | 15.70887 | 1.802495 | 136.9039 |
| 236606_at | SAV1 | 0.028297 | 0.161562 | 149.8623 | 0.134718 | 166709.5 |
| 204404_at | SLC12A2 | 3.73E-06 | 0.000236 | 0.05294 | 0.011053 | 0.253562 |
| 225835_at | SLC12A2 | 0.001286 | 0.007522 | 0.111752 | 0.022405 | 0.557401 |
| 232636_at | SLITRK4 | 8.73E-06 | 0.000342 | 1081.219 | 23.6349 | 49462.24 |
| 204596_s_at | STC1 | 0.001429 | 0.002766 | 128.3408 | 5.339502 | 3084.814 |
| 204595_s_at | STC1 | 9.53E-06 | 0.000234 | 26.79181 | 4.6472 | 154.4589 |
| 230746_s_at | STC1 | 0.000866 | 0.000953 | 17.63863 | 3.214301 | 96.79288 |
| 204597_x_at | STC1 | 2.33E-06 | 0.002335 | 76.65634 | 4.689406 | 1253.079 |
| 238443_at | TFAM | 0.059819 | 0.225228 | 0.371229 | 0.074838 | 1.841456 |
| 203177_x_at | TFAM | 1.71E-05 | 0.000113 | 0.022829 | 0.00335 | 0.155556 |
| 208541_x_at | TFAM | 0.002744 | 0.054875 | 0.216047 | 0.045206 | 1.032524 |
| 203176_s_at | TFAM | 0.000342 | 0.003043 | 0.115476 | 0.027697 | 0.481457 |
| 238168_at | TM4SF1 | 0.030768 | 0.014898 | 614.9803 | 3.498935 | 108090.2 |
| 209387_s_at | TM4SF1 | 2.56E-08 | 2.02E-06 | 800.6651 | 50.79976 | 12619.44 |
| 215034_s_at | TM4SF1 | 3.95E-07 | 1.61E-07 | 1067.398 | 78.60081 | 14495.26 |
| 209386_at | TM4SF1 | 5.87E-10 | 8.06E-08 | 1440.194 | 101.0956 | 20516.8 |
| 215033_at | TM4SF1 | 0.000849 | 0.024728 | 76.44376 | 1.736114 | 3365.936 |
| 203476_at | TPBG | 6.74E-05 | 0.000102 | 53.32757 | 7.179862 | 396.0842 |
| 224967_at | UGCG | 0.000167 | 0.000189 | 25.07686 | 4.619337 | 136.1341 |
| 204881_s_at | UGCG | 0.030888 | 0.25574 | 2.664385 | 0.491631 | 14.43957 |
| 221765_at | UGCG | 6.94E-05 | 0.01459 | 18.46403 | 1.778815 | 191.6559 |
| 205138_s_at | UST | 0.039899 | 0.095085 | 3.788817 | 0.792891 | 18.10481 |
| 205139_s_at | UST | 0.000152 | 8.32E-05 | 39.8298 | 6.355425 | 249.6155 |
| 214792_x_at | VAMP2 | 0.00019 | 0.000615 | 45.85393 | 5.13567 | 409.4078 |
| 201556_s_at | VAMP2 | 0.00036 | 0.000231 | 51.11359 | 6.295788 | 414.9757 |
| 201557_at | VAMP2 | 0.001365 | 0.001694 | 1058.693 | 13.68399 | 81908.26 |

Expression Analysis of proteins

To conduct analysis at the protein level, we chose 9 factors (TUBB3, ELAVL1, OSBPL3, IGFBP3, ANO1, HGF, GLI3, PPP2CA and ARNT2). TMAs were prepared from the same paraffin blocks used for gene and microRNA analysis. Triplicate cores of each case were included in the TMAs to capture clonal heterogeneity, and each TMA was analyzed in triplicate by multiplexed, quantitative fluorescent immunohistochemistry. Nuclei were stained with DAPI (blue channel), and stromal and epithelial cells were stained with anti-vimentin (green channel) and anti-cytokeratin (yellow channel), respectively. Antigens of interest were acquired in the red channel, and a representative image of the analysis for IGFBP3 and ANO1 is shown in Fig. 4A. For each protein, expression was quantified with AQUA software which utilizes an unsupervised algorithm to quantify expression in defined subcellular compartments or “masks”. In our study, we selected four masks: tumor (cytokeratin+), stromal (vimentin+), tumor nuclei (DAPI+/cytokeratin+) and tumor cytoplasm (DAPI-/cytokeratin+). For each 3 mm core, at least three electronic subsegments (histospots) were analyzed. Because of replicate analysis, we collected up to 18 AQUA scores for each patient which were then averaged. GLI3, ARNT2 and HGF showed predominantly nuclear staining in some cancer cells, while in others, the staining was predominantly cytoplasmic. To exploit this phenomenon, an index was created by dividing the nuclear over the cytoplasmic expression. A value >1 was typical of a strong nuclear staining, while a value <1 indicated a predominantly cytoplasmic pattern of expression. Expression of all proteins and the index were analyzed with multivariate Cox regression analysis. Only expression of ANO1 (in cancer cells and in the nuclei of cancer cells) was significant in multivariate analysis (Table 5 and Fig. 4B).
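The per-patient averaging of AQUA scores and the nuclear-to-cytoplasmic index described above amount to simple arithmetic, sketched below in Python. The function names are hypothetical, and the sketch assumes that the index is computed from the averaged nuclear and cytoplasmic scores; the text does not say whether the ratio was taken before or after averaging.

```python
from statistics import mean
from typing import Iterable, Optional


def average_aqua(scores: Iterable[Optional[float]]) -> float:
    """Average the available AQUA scores for one patient (up to 18 replicates).

    Missing histospots are passed as None and ignored.
    """
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no valid AQUA score for this patient")
    return mean(valid)


def localization_index(nuclear: float, cytoplasmic: float) -> float:
    """Nuclear expression divided by cytoplasmic expression."""
    if cytoplasmic <= 0:
        raise ValueError("cytoplasmic score must be positive")
    return nuclear / cytoplasmic


def staining_pattern(index: float) -> str:
    """Interpret the index: above 1 is nuclear, below 1 is cytoplasmic."""
    if index > 1:
        return "nuclear"
    if index < 1:
        return "cytoplasmic"
    return "balanced"
```

For example, averaged nuclear and cytoplasmic scores of 300 and 150 give an index of 2.0, which falls in the predominantly nuclear pattern.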
Figure 4

Representative quantitative immunofluorescence for IGFBP3 (left column) and ANO1 (right column).

A: From top to bottom the following signals are represented: antigen of interest (red channel), cell nuclei (DAPI), tumor cells (cytokeratin), stromal cells (vimentin) and merged image. B: Kaplan-Meier analysis of 267 patients according to the AQUA scores of ANO1 inside the tumor mask (ANO1_AQUA) and in the nucleus of cancer cells (ANO1_Nuclear_AQUA). Kaplan-Meier analysis was performed by dividing the patients into high (green) and low (red) groups. All differences were significant, and p-values are reported in Table 5 (Log-rank test).

Table 5

Results of quantitative fluorescent immunohistochemistry quantified with AQUA in the clinical setting of 267 patients with multivariate Cox and Kaplan-Meier method.

| Variable | P-value from KM | P-value from Cox | Range risk ratio | Range risk ratio lower limit | Range risk ratio upper limit |
|---|---|---|---|---|---|
| ANO1_Nuclear_AQUA | 0.019325 | 0.0163 | 5.334706 | 1.360938 | 20.91138 |
| ANO1_AQUA | 0.000898 | 0.047176 | 3.933242 | 1.017225 | 15.20842 |
| ANO1_Cyto_AQUA | 0.000466 | 0.064055 | 3.455312 | 0.930134 | 12.83598 |
| OSBPL3_Cyto.AQUA | 0.123325 | 0.07897 | 0.188465 | 0.029282 | 1.213009 |
| OSBPL3_Nuclear_AQUA | 0.109894 | 0.080147 | 0.19641 | 0.031728 | 1.21585 |
| ARNT2_Cyto_AQUA | 0.167487 | 0.140698 | 3.728865 | 0.647353 | 21.47889 |
| ANO1_Stromal_AQUA | 0.196194 | 0.168063 | 2.759558 | 0.651671 | 11.68558 |
| ARNT2_Nuclear_AQUA | 0.216922 | 0.190952 | 3.466272 | 0.537979 | 22.33368 |
| HGF_Index | 0.072421 | 0.256612 | 0.421516 | 0.094756 | 1.875095 |
| IGFBP3_Nuclear_AQUA | 0.103525 | 0.280396 | 2.469812 | 0.478288 | 12.75378 |
| IGFBP3_AQUA | 0.03215 | 0.334276 | 2.071344 | 0.472373 | 9.082782 |
| IGFBP3_Cyto_AQUA | 0.022362 | 0.355887 | 1.91389 | 0.482418 | 7.592945 |
| OSBPL3_AQUA | 0.253341 | 0.414725 | 3.737845 | 0.157242 | 88.85318 |
| IGFBP3_Stroma_AQUA | 0.256147 | 0.43221 | 1.819391 | 0.408595 | 8.101375 |
| Gli3_Cytoplasm_AQUA | 0.630287 | 0.440849 | 1.603306 | 0.482704 | 5.3254 |
| HGF_Cyto_AQUA | 0.460081 | 0.504868 | 0.586884 | 0.122545 | 2.810666 |
| Gli3_Index | 0.282345 | 0.566901 | 0.656363 | 0.155334 | 2.773461 |
| ARNT2_Index | 0.419295 | 0.615751 | 0.562225 | 0.059327 | 5.328067 |
| Gli3_Nuclear_AQUA | 0.472219 | 0.621925 | 1.339116 | 0.419535 | 4.274326 |
| ELAVL1_AQUA | 0.185824 | 0.642386 | 1.911476 | 0.124104 | 29.44091 |
| PPP2CA_AQUA | 0.431568 | 0.706374 | 0.667826 | 0.081745 | 5.455859 |
| TUBB3_Aqua | 0.142229 | 0.752119 | 1.238951 | 0.327804 | 4.682667 |
| HGF_AQUA | 0.244703 | 0.75816 | 0.76795 | 0.143024 | 4.12343 |
| HGF_Nuclear_AQUA | 0.403127 | 0.790899 | 0.785888 | 0.13239 | 4.665154 |
| ARNT2_AQUA | 0.137421 | 0.793627 | 1.212292 | 0.286559 | 5.128612 |
| Gli3_AQUA | 0.405624 | 0.813404 | 0.833907 | 0.184549 | 3.768124 |
| TUBB3_Cyto_AQUA | 0.253091 | 0.857207 | 0.884385 | 0.23196 | 3.371855 |
| ELAVL1_Cyto_AQUA | 0.375018 | 0.900491 | 1.157521 | 0.116881 | 11.4634 |
| ELAVL1_Index | 0.296773 | 0.915789 | 0.90653 | 0.147036 | 5.58907 |
| TUBB3_Nuclear_AQUA | 0.443265 | 0.919185 | 0.947465 | 0.334061 | 2.68721 |
| ELAVL1_Nuclear_AQUA | 0.037163 | 0.958367 | 1.07847 | 0.063246 | 18.39011 |

Nuclear and Cyto indicate expression of the antigen in the nucleus and cytoplasm, respectively. Where no compartment is specified, expression was assessed inside the cancer cells. Stromal expression refers to vimentin-positive cells. The index was created by dividing nuclear by cytoplasmic expression.

Calculation of Predictive Accuracy

We divided the patients into two clinical groups of interest to allow simplification of censored data into a binary response. Those surviving less than three years from diagnosis were labeled as having aggressive disease, while those surviving for greater than three years were considered to have more indolent, non-aggressive disease. Each of the individual factors from the three dimensions above (microRNA, gene or protein) was tested as a predictor of disease aggressiveness using ROC curves with AUC calculation. Although some factors were statistically significant in multivariate analysis, the maximum AUC obtained from any single biomarker in a single dimension was only 0.68 (ADAMTS5). Utilizing such a weak predictor for patient care would be unacceptable as it is inaccurate (either falsely positive or falsely negative) in approximately one third of the cases.
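The reduction of censored survival data to a binary response, followed by an AUC calculation for a single marker, can be sketched as below. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the toy values are invented, and the treatment of patients censored before three years (left unlabeled here) is an assumption, since the text does not describe how such cases were handled.

```python
def label_aggressive(survival_months, died, cutoff_months=36.0):
    """Binary label: 1 = aggressive, 0 = non-aggressive, None = undetermined."""
    if survival_months >= cutoff_months:
        return 0      # still alive at the three-year mark
    if died:
        return 1      # death within three years of diagnosis
    return None       # censored before three years (assumption: left unlabeled)


def auc(scores, labels):
    """AUC as the probability that an aggressive case outscores a
    non-aggressive case; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy values (not patient data): (months of follow-up, died, marker level)
patients = [(12, True, 0.9), (60, False, 0.8), (30, True, 0.4),
            (48, False, 0.3), (72, False, 0.2), (20, False, 0.7)]
labeled = [(label_aggressive(m, d), x) for m, d, x in patients]
labeled = [(y, x) for y, x in labeled if y is not None]
labels = [y for y, _ in labeled]
scores = [x for _, x in labeled]
marker_auc = auc(scores, labels)  # 5 of the 6 aggressive/non-aggressive pairs are ranked correctly
```

In this toy case the marker ranks five of six case pairs correctly, giving an AUC of about 0.83; an AUC of 0.5 would correspond to a marker with no discriminating ability.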

Generation of multidimensional biomarkers

We speculated that, by combining the information from different dimensions (microRNA, gene and protein), we could substantially increase predictive accuracy. However, multidimensionality engenders significant computational complexities and challenges. Whereas in single dimension analysis the number of considered variables in our case is relatively limited at 188, multidimensional analysis of two and three variables yields 17,578 and 1,089,836 combinations, respectively. Controlling for type I errors using cross-validation becomes critically important as the number of variables rises. For this reason, after excluding 17 patients due to incomplete data, we randomly assigned the remaining 250 patients to either the training or the testing set (Table 1). As a first step, we randomly chose either two or three variables from all the microRNAs, genes and proteins that we considered. After sufficient dimension reduction, variables were combined into a new diagnostic score, which included all the information of the parental factors. Computation clearly demonstrated that by increasing the amount of data from different dimensions, the calculated AUCs increased in the training set (Fig. 5A). After computation of all the 1,089,836 multidimensional predictors, we selected the combinations with the highest-ranking AUCs. We then added one additional biomarker at a time (a microRNA, gene or protein) to the existing combinations while considering all possible combinations, and calculated the AUCs again in the training set (Fig. 5B). This process was repeated until AUCs reached a maximum and failed to increase significantly when additional predictors were added to the existing combinations. These maximums were reached when the number of variables inside each combination reached 10 in the training set (Fig. 5B). 
Thereafter, we analyzed the top combinations in the testing set and found 15 multidimensional biomarkers (MB) which showed AUC values >0.83 in the training and testing sets (composition is reported in Tables 6, 7 and 8), supporting the notion that multidimensional biomarkers are more accurate than any individual single dimension predictor. From this list, we selected the 4 most accurate multidimensional biomarkers (MB1 to MB4), each with AUCs of approximately 0.9 in both training and test sets (Fig. 6A). Their composition is graphically depicted in Fig. 6B. These biomarkers were also outstanding predictors of outcome in Kaplan-Meier analysis (Fig. 6C).
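As a rough illustration of the steps above, the sketch below counts the two- and three-variable combinations among 188 variables and combines two variables into a single score. It is not the authors' implementation, which the text does not detail. It relies on the fact that, for a binary response, sliced inverse regression has only two slices and yields one direction proportional to the inverse covariance matrix times the difference of the class means. All function names are hypothetical, and the four data rows are invented for illustration.

```python
from math import comb

# Candidate combinations among the 188 variables reported in the text.
N_VARIABLES = 188
n_pairs = comb(N_VARIABLES, 2)     # 17,578
n_triples = comb(N_VARIABLES, 3)   # 1,089,836


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows):
    """Sample covariance matrix of the variables (columns) in rows."""
    mu, n = column_means(rows), len(rows)
    k = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def sir_direction(rows, labels):
    """Two-slice SIR direction: covariance^-1 (mean of class 1 - mean of class 0)."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    delta = [p - q for p, q in zip(column_means(pos), column_means(neg))]
    return solve(covariance(rows), delta)


def diagnostic_score(rows, beta):
    """Project each patient's variables onto the direction beta."""
    return [sum(b * v for b, v in zip(beta, r)) for r in rows]


def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented two-variable toy data (not patient data); 1 = aggressive.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
labels = [0, 1, 0, 1]
beta = sir_direction(rows, labels)
single_auc = auc([r[0] for r in rows], labels)             # first variable alone
combined_auc = auc(diagnostic_score(rows, beta), labels)   # combined score
```

In this toy example the first variable alone gives an AUC of 0.75, while the combined score separates the two classes completely. In practice the direction would be fitted on the training set only and the resulting score then evaluated on the testing set, in keeping with the training/testing split described above.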
Figure 5

Box-whisker plot representing the values of AUC in the training set.

In the boxplot, the marks from bottom to top are Q1-1.5*IQR, Q1, median, Q3, and Q3+1.5*IQR, where Q1 is the first quartile (25th percentile), Q3 is the third quartile (75th percentile), and IQR is the interquartile range (namely, Q3-Q1). In A, the analysis is made with a single variable and with all possible combinations of two (n = 17,578) and three variables (n = 1,089,836). In B, the analysis is performed by adding one new variable (gene, microRNA or protein) to the previous top combinations.

Table 6

AUC analysis for the top 15 multidimensional biomarkers in the training and testing set.

| Combination | MB | AUC (training) | AUC (test) |
|---|---|---|---|
| DICER1+IGFBP3+Mir-29a+HGF_Index+ADAMTS5+OSBPL3_Cyto_AQUA+ETS2+EPHA3+SHMT2+ESR1 | 1 | 0.91059 | 0.89881 |
| ADAMTS5+OSBPL3_Cyto_AQUA+HGF_Index+ESR2+ANGPT2+HGF_Nuclear_AQUA+ARNT2+IGFBP3+Mir-29a+HGF_Cyto_AQUA | 2 | 0.921007 | 0.871599 |
| CD109+IGFBP3+Let-7c+CISH+KLF7+CDX1+MITF+ADAMTS5+ANO1+ANTXR2 | 3 | 0.914251 | 0.861305 |
| HOXB7+IGFBP3+Mir-141+HGF_Index+ESR1+ANO1+PLK2+OSBPL3_AQUA+COX7A2L+IGFBP3_Stroma_AQUA | 4 | 0.865103 | 0.899916 |
| HOXB7+IGFBP3+MiR-320+HGF_Index+DROSHA+KLF6+Mir-200a+FGFR1+TM4SF1+Mir-200b | 5 | 0.916084 | 0.847826 |
| IGFBP3+Mir-29a+PNRC1+HGF_Index+CFTR+SRC+KLF5+KIF11+ANO1+CAV2 | 6 | 0.903652 | 0.855797 |
| EZR+Mir-17+HGF_Index+ANO1+IGFBP3_AQUA+OSBPL3_AQUA+Mir-200b+PIM1+ANO1_AQUA+PPP2CA | 7 | 0.827957 | 0.930193 |
| EZR+Mir-200a+HGF_Index+IGFBP3+Mir-17+ESR1+Mir-92a+ERBB2+DUSP10+PLK2 | 8 | 0.868687 | 0.881884 |
| DICER1+IGFBP3+MiR-193a-5p+CDX1+DROSHA+ANTXR2+HGF_Index+KIF11+OSBPL3_Cyto_AQUA+PLK2 | 9 | 0.875868 | 0.872449 |
| RAI14+HGF_Index+STC1+MiR-18a+COX7A2L+SOX2+SMCR7L+NAT2+MCL1+ANTXR2 | 10 | 0.845377 | 0.88913 |
| HOXB7+IGFBP3+Mir-141+HGF_Index+ESR1+ANO1+PLK2+OSBPL3_AQUA+COX7A2L | 11 | 0.844618 | 0.885606 |
| IGFBP3+Mir-29a+PNRC1+HGF_Index+CFTR+SRC+KLF5+KIF11+ANO1 | 12 | 0.885781 | 0.843478 |
| HOXB7+IGFBP3+MiR-320+HGF_Index+DROSHA+KLF6+Mir-200a+FGFR1 | 13 | 0.862471 | 0.845652 |
| DICER1+IGFBP3+Let-7g+CDX1+ANTXR2+ADM+DROSHA+KLF6+TUBB3_Cyto_AQUA+HGF_Cyto_AQUA | 14 | 0.86748 | 0.837662 |
| EZR+Mir17+HGF_Index+ANO1+IGFBP3_AQUA+OSBPL3_AQUA+Mir-200b+PIM1+ANO1_AQUA | 15 | 0.839687 | 0.849453 |

Table 7

KM and Cox information in training set for all the top multidimensional biomarkers.

| MB | P-value from KM | P-value from Cox | Range risk ratio | Range risk ratio lower limit | Range risk ratio upper limit |
|---|---|---|---|---|---|
| 1 | 2.00E-07 | 1.51E-06 | 653.0181 | 46.55881 | 9159.011 |
| 2 | 1.28E-07 | 3.80E-06 | 258.9312 | 24.54175 | 2731.889 |
| 3 | 1.48E-09 | 7.80E-07 | 327.9436 | 32.9403 | 3264.907 |
| 4 | 6.92E-08 | 1.15E-05 | 331.7012 | 24.79851 | 4436.787 |
| 5 | 1.07E-09 | 1.24E-05 | 40807.52 | 349.2252 | 4768423 |
| 6 | 9.46E-08 | 2.72E-06 | 112.5715 | 15.63995 | 810.2548 |
| 7 | 1.09E-06 | 0.000413 | 195.6531 | 10.46789 | 3656.91 |
| 8 | 1.75E-05 | 2.97E-05 | 266.2441 | 19.35674 | 3662.079 |
| 9 | 9.17E-06 | 0.000101 | 192.0735 | 13.57356 | 2717.948 |
| 10 | 1.67E-06 | 6.71E-05 | 12784.75 | 122.3606 | 1335805 |
| 11 | 1.81E-05 | 5.65E-06 | 102.7509 | 13.90232 | 759.4232 |
| 12 | 7.40E-08 | 3.94E-06 | 560.1007 | 38.10062 | 8233.798 |
| 13 | 3.96E-08 | 6.38E-05 | 32337.23 | 199.1264 | 5251420 |
| 14 | 2.69E-05 | 0.000165 | 402.8112 | 17.78357 | 9123.973 |
| 15 | 0.000179 | 0.000116 | 89.50607 | 9.105603 | 879.8249 |

Table 8

KM and Cox information in testing set for all the top multidimensional biomarkers.

| MB | P-value from KM | P-value from Cox | Range risk ratio | Range risk ratio lower limit | Range risk ratio upper limit |
|---|---|---|---|---|---|
| 1 | 1.08E-07 | 0.000175 | 149.4211 | 10.9347 | 2041.818 |
| 2 | 8.10E-07 | 0.000655 | 135.4344 | 8.046889 | 2279.45 |
| 3 | 1.00E-05 | 2.96E-07 | 134.664 | 20.66007 | 877.7509 |
| 4 | 1.02E-06 | 0.000173 | 174.0681 | 11.78516 | 2571.004 |
| 5 | 0.000107 | 0.000499 | 64.97979 | 6.196342 | 681.43 |
| 6 | 1.51E-06 | 0.000702 | 75.67874 | 6.197425 | 924.1375 |
| 7 | 1.21E-09 | 6.37E-07 | 301.1185 | 31.85225 | 2846.654 |
| 8 | 2.45E-08 | 7.39E-06 | 1150.195 | 52.76299 | 25073.42 |
| 9 | 1.17E-06 | 2.19E-05 | 165.1841 | 15.62799 | 1745.956 |
| 10 | 0.003986 | 0.008584 | 33.05786 | 2.43376 | 449.0262 |
| 11 | 0.000976 | 0.000481 | 171.8073 | 9.553675 | 3089.673 |
| 12 | 0.000404 | 0.003664 | 16.68279 | 2.499333 | 111.3559 |
| 13 | 0.00047 | 0.002143 | 657810 | 126.8559 | 3.41E+09 |
| 14 | 2.47E-05 | 2.90E-05 | 143.1154 | 13.97101 | 1466.037 |
| 15 | 0.000149 | 4.68E-06 | 647.3492 | 40.53591 | 10338.02 |

Figure 6

Bar Chart reporting AUC of the top MB obtained in the training and testing set (A).

Graphical chart of the composition of MB1-4 (B). Proteins, genes and microRNAs are shown in blue, green and red, respectively. Kaplan-Meier analysis of the training and testing sets according to the expression of the top 4 MB (C). All differences were highly significant (Log-rank test) and are reported in Tables 7 and 8 for the training and testing sets, respectively.

Discussion

CRC remains among the deadliest malignancies. For clinical management, particularly in stage II and III disease, multiple therapeutic options are now available. As also observed in our clinical study, outcome is driven mostly by clinical stage at diagnosis, with stage IV patients having a poor prognosis. However, even in patients with earlier-stage disease the outcome is not always favorable, and the relapse rate is significant. Effective biomarkers that can guide therapeutic decisions are actively sought in the hope of achieving the best possible outcomes while minimizing unnecessary and toxic procedures. A host of studies have been conducted toward this end [2], [16], [17]. The ideal biomarker to drive clinical treatment should be significant in multivariate analysis while having robust predictive accuracy, with few false positive and false negative results. Some limited successes have been obtained with regard to the selection of specific therapeutic regimens according to toxicity and efficacy [2]. However, most of these promising individual biomarkers have fallen short in clinical trials [18]. More complex biomarkers have been created, albeit in a single dimension. One 12-gene panel was effective in predicting risk of recurrence and response to treatment in a large clinical study of 1436 patients with stage II and III CRC [19]. However, later validation studies did not reproduce the same results [20]; the Achilles' heel of this technology remains the lack of accuracy in independent validation studies. In our study, we define a dimension as the nature of the variable: a microRNA, a gene or a protein. We believe that this lack of accuracy depends, at least in part, on the fact that the 12-gene signature was obtained only in the gene dimension, thus overlooking the contribution that other factors such as microRNAs and proteins may make to the predictive capability of genes. 
This past experience prompted us to revisit the way predictive biomarkers are built. In most cases, cancer aggressiveness is a complex trait, comparable to a multifactorial equation. To make the pattern more complex, these factors come from different dimensions such as genes, proteins, DNA sequences and different subsets of cells (cancer and stromal). Our idea was to center the prediction on an integrated method of analysis that includes more dimensions and more factors at the same time. We believe that only an integrated approach can get closer to the solution of a multifactorial equation. The results we present in this study support our hypothesis. In our cohort of CRC patients, we first analyzed a large panel of possible individual predictors coming from each of three single dimensions (microRNA, gene or protein). We were indeed able to identify statistically significant predictors of outcome as determined by multivariate Cox analysis and the Kaplan-Meier method. Some of these predictors have not been extensively investigated in CRC to date. As an example, expression of ANO1 (anoctamin 1) was found to be statistically significant at the gene and protein level, reinforcing recent data coming from analysis of the dataset GSE14333 [15]. However, the AUC of ANO1 in our analysis was not greater than 0.65, meaning that as a driver of clinical decisions, ANO1 would misclassify a considerable number of patients. Thus, statistical significance does not necessarily translate into clinical utility. Failure to recognize this fact can account for much of the disappointment with individual biomarkers derived from a single dimension [4], [20]. Not satisfied with AUCs below 0.7 and in the hope of developing more robust predictors, we sought to combine our data in novel ways. 
In this manuscript, we provide details of a multidimensional platform which combines nanofluidic technology with quantitative fluorescent immunohistochemistry to create biomarkers with AUCs approaching and even exceeding 0.9. While the number of variables that need to be analyzed is immense, this potent toolset can collect multidimensional data at a reasonable reagent cost for FFPE samples ($0.20 for gene/microRNA analysis, $0.85 per protein). Beyond predicting clinical outcome, our assay can highlight molecular drivers of aggressiveness. For example, IGFBP3 appears in all four of the top multidimensional biomarkers. This antigen is well known to researchers in CRC, although conflicting data are present in the literature regarding its effects [21]. At the gene expression level in both GSE14333 and our data set, high expression of IGFBP3 was related to poor outcome. This is in keeping with other previous studies [21], [22], [23]. The weight of evidence surely implicates this gene as a prominent driver of CRC aggressiveness, despite its being at odds with older studies connecting IGFBP3 expression to an anti-proliferative effect on the growth of colon cancer cells (reviewed in [24]). Only two variables were present in 3 out of the 4 top multidimensional biomarkers: ADAMTS5 and HGF index. ADAMTS5 is a member of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) protein family. The enzyme encoded by this gene contains two C-terminal TS motifs and functions as an aggrecanase to cleave aggrecan, a major proteoglycan of cartilage [25]. As a single factor, in the dataset GSE14333, high expression was associated with poor outcome in multiple probes. However, in our analysis, this factor did not show a significant trend in multivariate analysis as a single element. 
Literature on ADAMTS5 in CRC is extremely limited, with only one study reporting this gene as one of the most hypermethylated in tumor as compared with the surrounding normal colonic mucosa [26]. The other variable, HGF (hepatocyte growth factor) index, represents a pathway that is known to be activated in aggressive CRC. HGF has been extensively investigated as a potential new target (reviewed in [27]). Although HGF expression in immunoperoxidase staining appears with a clear cytoplasmic pattern in CRC cells [28], our immunofluorescence assay demonstrated a nuclear pattern that was of clinical significance. A similar nuclear localization of the HGF receptor c-Met has been reported in breast cancer cells, where such overexpression was related to increased metastatic potential and aggressive disease [29]. In summary, CRC aggressiveness is a complex trait that cannot be predicted with suitable accuracy by the use of an individual, single dimensional factor (microRNA, gene or protein). In contrast, a multidimensional integrated approach which utilizes data from microRNA, gene and protein analysis can generate accurate predictors of biological behavior, foster better clinical management of CRC, and shine a spotlight on molecules and molecular pathways which are associated with and potentially the cause of poor outcome.

List of antibodies, suppliers and final concentration used. (DOCX)
  29 in total

1.  Validation study of a quantitative multigene reverse transcriptase-polymerase chain reaction assay for assessment of recurrence risk in patients with stage II colon cancer.

Authors:  Richard G Gray; Philip Quirke; Kelly Handley; Margarita Lopatin; Laura Magill; Frederick L Baehner; Claire Beaumont; Kim M Clark-Langone; Carl N Yoshizawa; Mark Lee; Drew Watson; Steven Shak; David J Kerr
Journal:  J Clin Oncol       Date:  2011-11-07       Impact factor: 44.544

2.  A model free approach to combining biomarkers.

Authors:  Ruth M Pfeiffer; Efstathia Bur
Journal:  Biom J       Date:  2008-08       Impact factor: 2.207

Review 3.  MicroRNA expression in colorectal cancer.

Authors:  Niamh M Hogan; Myles R Joyce; Michael J Kerin
Journal:  Cancer Biomark       Date:  2012       Impact factor: 4.388

4.  Matrix-degrading proteases ADAMTS4 and ADAMTS5 (disintegrins and metalloproteinases with thrombospondin motifs 4 and 5) are expressed in human glioblastomas.

Authors:  Janka Held-Feindt; Elke Bernedo Paredes; Ulrike Blömer; Constanze Seidenbecher; Andreas M Stark; H Maximilian Mehdorn; Rolf Mentlein
Journal:  Int J Cancer       Date:  2006-01-01       Impact factor: 7.396

5.  Gender influences the class III and V β-tubulin ability to predict poor outcome in colorectal cancer.

Authors:  Marisa Mariani; Gian Franco Zannoni; Stefano Sioletic; Steven Sieber; Candice Martino; Enrica Martinelli; Claudio Coco; Giovanni Scambia; Shohreh Shahabi; Cristiano Ferlini
Journal:  Clin Cancer Res       Date:  2012-03-21       Impact factor: 12.531

6.  IGFBP-3 mediates TGF beta 1 proliferative response in colon cancer cells.

Authors:  S Kansra; D Z Ewton; J Wang; E Friedman
Journal:  Int J Cancer       Date:  2000-08-01       Impact factor: 7.396

Review 7.  Colorectal cancer.

Authors:  David Cunningham; Wendy Atkin; Heinz-Josef Lenz; Henry T Lynch; Bruce Minsky; Bernard Nordlinger; Naureen Starling
Journal:  Lancet       Date:  2010-03-20       Impact factor: 79.321

Review 8.  Biomarkers in cancer screening: a public health perspective.

Authors:  Sudhir Srivastava; Rashmi Gopal-Srivastava
Journal:  J Nutr       Date:  2002-08       Impact factor: 4.798

9.  Dimension reduction methods for microarrays with application to censored survival data.

Authors:  Lexin Li; Hongzhe Li
Journal:  Bioinformatics       Date:  2004-07-15       Impact factor: 6.937

10.  Insulin-like growth factor binding protein-3 (IGFBP-3): Novel ligands mediate unexpected functions.

Authors:  Robert C Baxter
Journal:  J Cell Commun Signal       Date:  2013-08       Impact factor: 5.782
