Yan Shi, Sijin Yang, Man Luo, Wei-Dong Zhang, Zun-Ping Ke.
Abstract
Coronary artery disease causes about 1 in every 7 deaths in the United States, and early prevention has the potential to reduce its incidence and mortality. We aimed to identify the genes involved in coronary artery disease using meta-analysis. Five coronary heart disease datasets were retrieved from GEO series, and data preprocessing and quality control were carried out. A moderated t-test was used to identify the differentially expressed genes in each dataset, and p-values were combined across datasets with systematic-analysis methods using MetaDE. Pathway enrichment was carried out using the Reactome database. Protein-protein interactions of the identified differentially expressed genes were analyzed using the STRING v10.0 online tool. After removing unidentified or intermediate samples, a total of 238 cases and 189 matched or partially matched controls from five microarray datasets were retrieved from GEO. Six quality control measures were calculated, and PCA biplots were plotted to visualize these quantitative measures. The first two PCs captured 91% of the variance, and we decided to include all of the datasets in the systematic analysis. With an FDR cut-off of 0.1, nine genes (LFNG, ID3, PLA2G7, FOLR3, PADI4, ARG1, IL1R2, NFIL3 and MGAM) were differentially expressed according to maxP. Their protein-protein interactions showed that they were closely connected, and 24 Reactome pathways were related to coronary artery disease. We concluded that pathways related to immune responses, especially neutrophil degranulation, were associated with coronary heart disease.
Keywords: coronary artery disease; gene expression omnibus (GEO); microarray; neutrophil degranulation; systematic analysis
Year: 2017 PMID: 28903366 PMCID: PMC5589605 DOI: 10.18632/oncotarget.17426
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
The GEO datasets used in this study
| GEO Accession | Platform | Source DOI | control | case |
|---|---|---|---|---|
| gse20681 | GPL4133 | 10.1186/1755-8794-5-58 | 99 | 99 |
| gse20680 | GPL4133 | 10.1186/1755-8794-4-26 | 52 | 87 |
| gse29532 | GPL5175 | 10.1016/j.cca.2013.03.011 | 6 | 8 |
| gse42148 | GPL13607 | NA | 11 | 13 |
| gse48060 | GPL570 | 10.1016/j.yjmcc.2014.04.017 | 21 | 31 |
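As a quick arithmetic check, the per-dataset counts in this table can be summed and compared with the totals quoted in the abstract (238 cases, 189 controls). The snippet below is only an illustrative tally; the variable names are assumptions, and the numbers are copied from the table rows.

```python
# (accession, controls, cases) copied from the table of GEO datasets
datasets = [
    ("gse20681", 99, 99),
    ("gse20680", 52, 87),
    ("gse29532", 6, 8),
    ("gse42148", 11, 13),
    ("gse48060", 21, 31),
]

total_controls = sum(controls for _, controls, _ in datasets)
total_cases = sum(cases for _, _, cases in datasets)

print(f"controls: {total_controls}, cases: {total_cases}")
# prints "controls: 189, cases: 238"
```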
Figure 1. Overview of the systematic analysis and datasets of the coronary heart disease
(A) The workflow of this study. (B) The number of cases (red) and controls (blue) in the five coronary heart disease datasets.
Quantitative quality control measures of coronary heart disease studies
| Dataset | Study | IQC | EQC | CQCg | CQCp | AQCg | AQCp | Rank |
|---|---|---|---|---|---|---|---|---|
| 1 | gse20680 | 6.25 | 0.9 | 1.84 | 8.58 | 1.7 | 9.27 | 2.5 |
| 2 | gse20681 | 4.5 | 1.1 | 10.91 | 27.02 | 4.29 | 10.78 | 1.5 |
| 3 | gse29532 | 2.85 | 0.53 | 0.24 | 2.42 | 0.08 | 0.01 | 4.5 |
| 4 | gse42148 | 0.61 | 1.1 | 2.39 | 9.34 | 1.6 | 2.73 | 2.83 |
| 5 | gse48060 | 4.91 | 0.91 | 0.24 | 0.84 | 0.8 | 3.73 | 3.67 |
Figure 2. The systematic analysis of differentially expressed genes between patients with coronary heart disease and controls by combining p-values
(A) PCA biplot of six quality control measures in five datasets. (B) The number of differentially expressed genes plotted as FDR in the analysis of five different datasets. (C) The heatmap of the genes differentially expressed between cases and controls in the maxP systematic analysis at an FDR below 0.1.
The number of differentially expressed genes in the five coronary heart disease datasets (moderated t-test) and in the meta-analysis of combined p-values (roP, maxP)
| Cutoff | gse20681 | gse20680 | gse29532 | gse42148 | gse48060 | roP | maxP |
|---|---|---|---|---|---|---|---|
| p <= 0.01 | 70 | 89 | 17 | 506 | 633 | 163 | 136 |
| p <= 0.05 | 415 | 418 | 227 | 1445 | 1697 | 500 | 423 |
| FDR <= 0.01 | 0 | 0 | 0 | 1 | 3 | 2 | 2 |
| FDR <= 0.05 | 0 | 2 | 0 | 9 | 109 | 7 | 7 |
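The roP and maxP columns come from combining per-dataset p-values, and the FDR rows from multiple-testing adjustment. The following is a minimal sketch of those ideas, not the MetaDE implementation: it assumes the studies are independent, uses the parametric null distributions (the maximum of K uniform p-values, and the r-th order statistic, which follows a Beta(r, K-r+1) distribution), and applies the Benjamini-Hochberg procedure for FDR. MetaDE's own computation may differ (for example, by using permutation). The function names and the example p-values are hypothetical and are not taken from the study.

```python
from math import comb


def max_p(pvalues):
    """maxP: under the null, P(max of K uniform p-values <= x) = x**K."""
    k = len(pvalues)
    return max(pvalues) ** k


def rth_ordered_p(pvalues, r):
    """roP: combined p-value from the r-th smallest p-value.

    Under the null the r-th order statistic is Beta(r, K-r+1), so
    P(p_(r) <= x) = sum_{j=r..K} C(K, j) * x**j * (1-x)**(K-j).
    """
    k = len(pvalues)
    x = sorted(pvalues)[r - 1]
    return sum(comb(k, j) * x**j * (1 - x) ** (k - j) for j in range(r, k + 1))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for three made-up genes across five studies.
per_gene = {
    "geneA": [0.001, 0.004, 0.020, 0.003, 0.010],
    "geneB": [0.300, 0.002, 0.600, 0.050, 0.900],
    "geneC": [0.040, 0.030, 0.050, 0.020, 0.045],
}
combined = {gene: max_p(ps) for gene, ps in per_gene.items()}
fdr = dict(zip(combined, benjamini_hochberg(list(combined.values()))))
```

Because maxP is driven by the largest p-value, a gene scores well only when it is significant in every study, whereas roP with r < K tolerates a few non-significant studies. This is one reason the two columns can report different gene counts.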
Figure 3. The heatmap of enriched pathways from Reactome
Only pathways that were significantly enriched (FDR lower than 0.05) in at least two datasets are shown in the plot.
Figure 4. The functional analysis of the differentially expressed genes
(A) Illustration of the neutrophil degranulation pathway, which was the top enriched pathway. (B) The protein-protein interactions between the nine differentially expressed genes and their interactors. The important hubs (LFNG, ARG1, and PLA2G7) are marked with red arrows.