Minwei Hu1, Ling Zou1, Jiong Lu1, Zeyu Yang1, Yinan Chen1, Yaozeng Xu2, Changhui Sun1.
Abstract
Osteoporosis is a progressive bone disease of the elderly that lacks an effective method for classifying patients. This study constructed a gene signature for accurate prediction and classification of osteoporosis patients. Three gene expression datasets of osteoporosis samples were acquired from the Gene Expression Omnibus (GEO) database according to pre-set criteria. Differentially expressed genes (DEGs) between normal and osteoporosis samples were screened using the limma package in R. A protein-protein interaction (PPI) network was established from interaction data for the DEGs in the Human Protein Reference Database. The classification accuracy of the classifier was assessed by sensitivity, specificity, and area under the curve (AUC) using the pROC package in R. Pathway enrichment analysis was performed on the feature genes with clusterProfiler. A total of 310 DEGs between the two sample groups were associated with positive regulation of protein secretion and cytokine secretion, neutrophil-mediated immunity, and neutrophil activation. The PPI network of DEGs consisted of 12 genes. A support vector machine (SVM) classifier based on five feature genes was developed to classify osteoporosis samples and showed high prediction accuracy and AUC on the GSE35959, GSE62402, GSE13850, GSE56814, GSE56815 and GSE7429 datasets. In summary, an SVM classifier with high accuracy was developed for predicting osteoporosis, and the genes it includes may be potential feature genes in osteoporosis development.
Abbreviations: DEGs: differentially expressed genes; PPI: protein-protein interaction; WHO: World Health Organization; SVM: support vector machine; GEO: Gene Expression Omnibus; KEGG: Kyoto Encyclopedia of Genes and Genomes; GO: Gene Ontology; BP: Biological Process; CC: Cellular Component; MF: Molecular Function.
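The abstract reports classifier performance as sensitivity, specificity, and AUC, computed by the authors with pROC in R. As a rough illustration of what those three quantities measure, the following Python sketch computes them from scratch. The labels, scores, and the 0.5 decision threshold are made-up toy values, not data from the study, and the function names are hypothetical.

```python
# Toy illustration only: labels and scores below are invented, not study data.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = osteoporosis."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(y_true, y_pred):
    """Fraction of true positives that the classifier detects."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

def specificity(y_true, y_pred):
    """Fraction of true negatives that the classifier correctly rejects."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier output for five samples.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens = sensitivity(y_true, y_pred)
spec = specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is presumably why the study reports all three.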
Keywords: Osteoporosis; bioinformatics; differentially expressed genes; gene signature; protein–protein interaction; support vector machine
Year: 2021 PMID: 34622712 PMCID: PMC8806423 DOI: 10.1080/21655979.2021.1971026
Source DB: PubMed Journal: Bioengineered ISSN: 2165-5979 Impact factor: 3.269
Sample information of datasets
| Group | Samples (n) | Platform |
|---|---|---|
| Normal | 3 | GPL4133 |
| Osteoporosis | 10 | |
| High BMD | 5 | GPL5175 |
| Low BMD | 5 | |
| Normal | 9 | GPL570 |
| Osteoporosis | 5 | |
| High BMD | 20 | GPL96 |
| Low BMD | 20 | |
| High BMD | 42 | GPL5175 |
| Low BMD | 31 | |
| High BMD | 40 | GPL96 |
| Low BMD | 40 | |
| High BMD | 10 | GPL96 |
| Low BMD | 10 | |
Figure 1. Workflow chart
Figure 2. Screening of differentially expressed genes. (a) Volcano plot of differentially expressed genes in dataset GSE56116; (b) Heat map of differentially expressed genes
Figure 3. Functional enrichment of differentially expressed genes. (a) BP annotation of differentially expressed genes; (b) CC annotation of differentially expressed genes; (c) MF annotation of differentially expressed genes; (d) KEGG annotation of differentially expressed genes
Figure 4. PPI analysis of the functional module genes
Figure 5. Functional enrichment of functional module genes. (a) BP annotation of functional module genes; (b) CC annotation of functional module genes; (c) MF annotation of functional module genes; (d) KEGG annotation of functional module genes
Figure 6. Identification of hub genes. (a) PPI network of hub genes obtained by the Closeness algorithm; (b) PPI network of hub genes obtained by the MCC algorithm; (c) PPI network of hub genes obtained by the MNC algorithm; (d) PPI network of hub genes obtained by the Degree algorithm
Figure 7. Venn diagram of hub gene identification
Figure 8. Development and verification of the model. (a) Classification result and ROC curve for the GSE35959 dataset samples by the diagnostic model; (b) Classification result and ROC curve for the GSE62402 dataset samples by the diagnostic model; (c) Classification result and ROC curve for the GSE13850 dataset samples by the diagnostic model; (d) Classification result and ROC curve for the GSE56814 dataset samples by the diagnostic model; (e) Classification result and ROC curve for the GSE56815 dataset samples by the diagnostic model; (f) Classification result and ROC curve for the GSE7429 dataset samples by the diagnostic model
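Figures 6 and 7 indicate that hub genes were ranked by four network algorithms (Closeness, MCC, MNC, Degree) and then compared in a Venn diagram. A minimal Python sketch of that idea follows, assuming the final hub genes are those shared by all four rankings. The gene names (G1 to G5), the edge list, and the three pre-made sets are hypothetical placeholders, and only the Degree ranking is actually computed here.

```python
from collections import Counter

def top_by_degree(edges, k):
    """Return the k genes with the most PPI partners (Degree ranking).
    Ties are broken alphabetically so the result is deterministic."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return set(ranked[:k])

def consensus_hubs(*gene_sets):
    """Genes present in every ranking, i.e. the centre of the Venn diagram."""
    return set.intersection(*gene_sets) if gene_sets else set()

# Hypothetical undirected PPI edges between placeholder genes.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]

degree_hubs = top_by_degree(edges, 3)

# Placeholder outputs standing in for the other three algorithms.
closeness_hubs = {"G1", "G2", "G4"}
mcc_hubs = {"G1", "G2", "G3"}
mnc_hubs = {"G1", "G2", "G5"}

hubs = consensus_hubs(degree_hubs, closeness_hubs, mcc_hubs, mnc_hubs)
```

Requiring agreement across all rankings is a conservative choice: a gene that scores highly under only one centrality measure is excluded.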