Antonio Irigoyen, Cristina Jimenez-Luna, Manuel Benavides, Octavio Caba, Javier Gallego, Francisco Manuel Ortuño, Carmen Guillen-Ponce, Ignacio Rojas, Enrique Aranda, Carolina Torres, Jose Prados.
Abstract
Applying differentially expressed genes (DEGs) to identify feasible disease biomarkers can be difficult when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need for models that can compare their results, especially since different technologies can produce signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by combining gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), for which new biomarkers are urgently needed because of its late diagnosis, is an ideal candidate for testing this approach. Expression data from two datasets, Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, were used in this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was proposed to integrate these datasets. DEGs were then identified from the integrated data using the statistical programming language R. In our integrative meta-analysis, 5 genes were identified in common with the individual analyses of the independent datasets. In addition, 28 novel genes that were not reported by the individual analyses ('gained' genes) were discovered. Several of these gained genes have already been linked to other gastroenterological tumors. The proposed integrative meta-analysis revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.
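The core idea behind cross-platform integration, removing per-batch location and scale differences for each gene, can be illustrated with a minimal sketch. This is not ComBat itself: ComBat additionally shrinks the per-batch parameters across genes with an empirical Bayes step and can model biological covariates so that they are preserved. The function name, batch labels and toy values below are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean


def location_scale_adjust(values, batches):
    """Align one gene's values across batches (simplified, no empirical Bayes).

    values  -- expression values for a single gene, one per sample
    batches -- batch label for each sample (e.g. platform name)
    Each batch is centred on the overall mean and rescaled to the
    pooled within-batch standard deviation.
    """
    groups = {}
    for i, b in enumerate(batches):
        groups.setdefault(b, []).append(i)

    grand_mean = mean(values)
    batch_mean = {b: mean(values[i] for i in idx) for b, idx in groups.items()}
    # Within-batch sums of squared deviations, used for both SD estimates.
    ss = {
        b: sum((values[i] - batch_mean[b]) ** 2 for i in idx)
        for b, idx in groups.items()
    }
    pooled_sd = math.sqrt(sum(ss.values()) / len(values))

    adjusted = [0.0] * len(values)
    for b, idx in groups.items():
        batch_sd = math.sqrt(ss[b] / len(idx))
        scale = pooled_sd / batch_sd if batch_sd > 0 else 1.0
        for i in idx:
            adjusted[i] = grand_mean + (values[i] - batch_mean[b]) * scale
    return adjusted


# Hypothetical toy data: the same gene measured on two platforms
# whose raw signal ranges differ greatly.
toy_values = [2.0, 4.0, 6.0, 102.0, 104.0, 106.0]
toy_batches = ["platform_A"] * 3 + ["platform_B"] * 3
print(location_scale_adjust(toy_values, toy_batches))
```

Note that this naive version would also remove a real biological difference if disease status were unevenly distributed across platforms, which is one reason covariate-aware methods are preferred in practice.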
Year: 2018 PMID: 29617451 PMCID: PMC5884535 DOI: 10.1371/journal.pone.0194844
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Workflow of the integrative meta-analysis combining Affymetrix and Illumina expression data from PDAC datasets.
Characteristics of both Cohort 1 and Cohort 2 groups of PDAC patients.
| Characteristic | Cohort 1 (n = 18), No. of cases (%) | Cohort 2 (n = 36), No. of cases (%) |
|---|---|---|
| Male | 9 (50%) | 24 (67%) |
| Female | 9 (50%) | 12 (33%) |
| | 61.4±10.7 | 60.0±7.7 |
| Maximum | 76 | 73 |
| Minimum | 37 | 42 |
| Yes | 0 (0%) | 2 (5.6%) |
| No | 18 (100%) | 34 (94.4%) |
| Yes | 7 (38.9%) | 14 (38.9%) |
| No | 11 (61.1%) | 22 (61.1%) |
| I | 0 (0%) | 0 (0%) |
| II | 0 (0%) | 0 (0%) |
| III | 6 (33.3%) | 0 (0%) |
| IV | 12 (66.7%) | 36 (100%) |
Coincident genes in the three analyses: Affymetrix, Illumina and integrated meta-analysis.
| Gene | Gene description | ENTREZ | FCᵇ | adj.P.Val |
|---|---|---|---|---|
| FAIM3 | Fas apoptotic inhibitory molecule 3 | 9214 | -2.17 | 4.59E-11 |
| IRAK3 | interleukin-1 receptor-associated kinase 3 | 11213 | 1.84 | 4.59E-11 |
| DENND2D | DENN/MADD domain containing 2D | 79961 | -1.67 | 1.08E-09 |
| PLBD1 | phospholipase B domain containing 1 | 79887 | 1.67 | 1.50E-09 |
| AGPAT9 | 1-acylglycerol-3-phosphate O-acyltransferase 9 | 84803 | 1.58 | 1.47E-08 |
ᵃ Entrez Gene Name.
ᵇ Fold change.
Fig 2. Comparison of the individual analyses by technology with the integrated analysis.
(a) Coincident genes in the three analyses: Affymetrix, Illumina and integrated meta-analysis (Table 2). (b) Remaining differentially expressed genes shared by the individual Illumina analysis and the integrative meta-analysis (S1 Table). (c) Remaining differentially expressed genes shared by the individual Affymetrix analysis and the integrative meta-analysis (S2 Table). (d) Differentially expressed genes found in the integrative meta-analysis but not in the individual analyses (gained genes) (S3 Table).
Fig 3. ROC curves for the 5 genes found to be differentially expressed in all three analyses: FAIM3, IRAK3, DENND2D, PLBD1 and AGPAT9.
Curves are provided for both Illumina and Affymetrix individual analyses as well as our integrative meta-analysis. The Area Under the Curve (AUC) metrics are also provided for each curve.
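As an illustration of what the AUC measures, the following sketch computes it directly as the probability that a randomly chosen patient sample scores higher than a randomly chosen control sample, with ties counted as one half. The function name and the toy scores are hypothetical; the source does not state how its ROC curves were computed.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores -- one numeric score per sample (e.g. expression of one gene)
    labels -- 1 for cases, 0 for controls
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above control
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores: three cases followed by three controls.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(toy_scores, toy_labels))
```

For a gene that is down-regulated in patients (negative fold change), raw expression gives an AUC below 0.5 under this convention; negating the score before the call puts it on the same scale as the up-regulated genes.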
Sensitivity and specificity values for the selected genes after a leave-one-out cross-validation (LOOCV) process.
| Gene | Sensitivity | Specificity |
|---|---|---|
| | 0.889 | 0.75 |
| | 0.87 | 0.969 |
| | 0.944 | 0.75 |
| | 0.852 | 0.813 |
| | 0.889 | 0.813 |
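The LOOCV procedure behind such figures can be sketched as follows: each sample is held out once, a classifier is fitted on the remaining samples, and the held-out prediction is tallied into a confusion matrix. The source does not specify the classifier, so this sketch assumes a simple nearest-class-mean rule on a single gene; the function name and toy data are hypothetical.

```python
def loocv_sensitivity_specificity(values, labels):
    """Leave-one-out cross-validation with a nearest-class-mean rule.

    values -- expression of one gene, one value per sample
    labels -- 1 for patients, 0 for controls
    Returns (sensitivity, specificity).
    """
    if labels.count(1) < 2 or labels.count(0) < 2:
        raise ValueError("need at least two samples per class")

    tp = fn = tn = fp = 0
    for i, (held_value, held_label) in enumerate(zip(values, labels)):
        # Fit on every sample except the held-out one.
        pos = [v for j, (v, y) in enumerate(zip(values, labels)) if j != i and y == 1]
        neg = [v for j, (v, y) in enumerate(zip(values, labels)) if j != i and y == 0]
        pos_mean = sum(pos) / len(pos)
        neg_mean = sum(neg) / len(neg)

        # Predict the class whose training mean is closer.
        predicted = 1 if abs(held_value - pos_mean) < abs(held_value - neg_mean) else 0

        if held_label == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 0:
                tn += 1
            else:
                fp += 1

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four patients followed by four controls.
toy_values = [5.0, 5.5, 6.0, 3.1, 2.0, 2.5, 3.0, 5.2]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(loocv_sensitivity_specificity(toy_values, toy_labels))
```

Sensitivity is the fraction of patients correctly classified and specificity the fraction of controls correctly classified, so each value in the table corresponds to one such pair of ratios per gene.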
Shared Gene Ontology (GO) terms from the gene enrichment analysis applied to the 28 gained genes.
The Kolmogorov-Smirnov statistical test was performed to determine their significance (p-value < 0.05).
| GO ID | GO Term | Ontology | # Genes | p-value | Genes |
|---|---|---|---|---|---|
| GO:0044237 | cellular metabolic process | BP | 12 | 0.011 | |
| GO:0044763 | single-organism cellular process | BP | 23 | 0.015 | |
| GO:0050776 | regulation of immune response | BP | 5 | 0.023 | |
| GO:0044710 | single-organism metabolic process | BP | 6 | 0.037 | |
| GO:0006139 | nucleobase-containing compound metabolic process | BP | 7 | 0.043 | |
| GO:0006725 | cellular aromatic compound metabolic process | BP | 7 | 0.043 | |
| GO:0006807 | nitrogen compound metabolic process | BP | 7 | 0.043 | |
| GO:0034641 | cellular nitrogen compound metabolic process | BP | 7 | 0.043 | |
| GO:0034645 | cellular macromolecule biosynthetic process | BP | 7 | 0.043 | |
| GO:0044249 | cellular biosynthetic process | BP | 7 | 0.043 | |
| GO:0046483 | heterocycle metabolic process | BP | 7 | 0.043 | |
| GO:1901360 | organic cyclic compound metabolic process | BP | 7 | 0.043 |
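The statistic underlying a Kolmogorov-Smirnov test can be sketched as the largest gap between two empirical cumulative distribution functions. One common way to apply it in GO enrichment is to compare gene-level scores of the genes annotated to a term against those of all other genes; the source does not describe its exact setup, so that framing, the function name and the toy values are assumptions. Only the statistic is computed here, not the p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")

    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    ia = ib = 0
    # Evaluate both empirical CDFs at every observed value.
    for x in sorted(set(a) | set(b)):
        while ia < len(a) and a[ia] <= x:
            ia += 1
        while ib < len(b) and b[ib] <= x:
            ib += 1
        d = max(d, abs(ia / len(a) - ib / len(b)))
    return d


# Hypothetical toy scores: genes annotated to a term vs. all other genes.
in_term = [1.0, 2.0, 3.0, 4.0]
not_in_term = [3.0, 4.0, 5.0, 6.0]
print(ks_statistic(in_term, not_in_term))
```

A larger D means the scores of the annotated genes are distributed more differently from the background, which is what the test's p-value then quantifies.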