Wen-Hua Yuan, Qi-Qi Xie, Ke-Ping Wang, Wei Shen, Xiao-Fei Feng, Zheng Liu, Jin-Tao Shi, Xiao-Bo Zhang, Kai Zhang, Ya-Jun Deng, Hai-Yu Zhou.
Abstract
Osteoarthritis (OA) is a chronic degenerative disease of the bone and joints. Immune-related genes and immune cell infiltration are important in OA development. We analyzed immune-related genes and immune infiltrates to identify diagnostic markers of OA. The datasets GSE51588, GSE55235, GSE55457, GSE82107, and GSE114007 were downloaded from the Gene Expression Omnibus database. First, R software was used to identify differentially expressed genes (DEGs) and differentially expressed immune-related genes (DEIRGs), and functional correlation analysis was conducted. Second, CIBERSORT was used to evaluate the infiltration of immune cells in OA tissue. Finally, the least absolute shrinkage and selection operator (LASSO) logistic regression algorithm and the support vector machine-recursive feature elimination (SVM-RFE) algorithm were used to screen and verify diagnostic markers of OA. A total of 711 DEGs and 270 DEIRGs were identified. Functional enrichment analysis showed that the DEGs and DEIRGs are closely related to cellular calcium ion homeostasis, ion channel complexes, chemokine signaling pathways, and JAK-STAT signaling pathways. Differential analysis of immune cell infiltration showed that M1 macrophage infiltration was increased, whereas mast cell and neutrophil infiltration were decreased, in OA samples. The two machine learning algorithms cross-identified 15 biomarkers (BTC, PSMD8, TLR3, IL7, APOD, CIITA, IFIH1, CDC42, FGF9, TNFAIP3, CX3CR1, ERAP2, SEMA3D, MPO, and plasma cells). On validation, all 15 biomarkers had high diagnostic efficacy (AUC > 0.7), and the diagnostic efficiency was higher when the 15 biomarkers were fitted into one variable (AUC = 0.758). We identified 15 biomarkers for OA diagnosis. The findings provide a new understanding of the molecular mechanism of OA from the perspective of immunology.
Year: 2021 PMID: 33782454 PMCID: PMC8007625 DOI: 10.1038/s41598-021-86319-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Q-Q plot of the data sets after eliminating inter-batch differences.
Figure 2. Two-dimensional PCA cluster plots before and after sample correction, and volcano plot of DEGs. (A,B) Two-dimensional PCA cluster plots of GSE51588, GSE55235, GSE55457, and GSE114007 before and after removal of inter-batch differences, respectively. Blue represents the osteoarthritis group and red represents the normal control group. (C) Volcano plot of DEGs; red represents up-regulated differentially expressed genes and green represents down-regulated differentially expressed genes.
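The red/green coloring in a volcano plot such as panel C comes from thresholding each gene on its fold change and its adjusted p-value. The sketch below illustrates that rule only. The cutoffs (`|log2FC| >= 1`, adjusted p < 0.05), the `GeneStat` type, and the gene names and values are assumptions for illustration; the record above does not state the thresholds used in the study.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fc: float  # log2 fold change, OA versus control
    adj_p: float    # multiple-testing-adjusted p-value


def classify(stat, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene 'up', 'down', or 'ns' (not significant).

    The default cutoffs are assumed, commonly used values,
    not the ones reported for this study.
    """
    if stat.adj_p < p_cutoff and stat.log2_fc >= fc_cutoff:
        return "up"
    if stat.adj_p < p_cutoff and stat.log2_fc <= -fc_cutoff:
        return "down"
    return "ns"


# Made-up values purely for illustration; not data from the study.
example = [
    GeneStat("GENE_A", 2.3, 0.001),   # large increase, significant
    GeneStat("GENE_B", -1.7, 0.010),  # large decrease, significant
    GeneStat("GENE_C", 0.4, 0.0005),  # significant but small change
    GeneStat("GENE_D", 3.0, 0.20),    # large change, not significant
]
labels = {s.gene: classify(s) for s in example}
```

Genes labeled `up` and `down` would correspond to the red and green points, and `ns` genes to the uncolored background.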
Figure 3. GO, KEGG, and PPI network analyses of DEGs. (A) GO biological function enrichment analysis. (B) KEGG pathway enrichment analysis. (C) PPI network analysis graph. The node size indicates the clustering coefficient; a larger node indicates a larger clustering coefficient, with a greater proportion of genes in the network. The node color indicates the degree; a higher degree indicates greater connectedness of the node. Blue represents a larger degree, yellow represents a medium degree, and orange represents a minimum degree. The thickness of the line represents the overall score. A higher score results in a thicker line, indicating a stronger interaction between the two proteins. (D) Schematic representation of hub genes.
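The two node properties encoded in the PPI graph, degree and clustering coefficient, can be computed directly from an edge list. The following is a minimal sketch using the standard definition of the local clustering coefficient for an undirected graph; the protein names `P1` to `P4` and the edges are hypothetical, and the function names are illustrative rather than taken from the tool the authors used.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    """Number of distinct interaction partners of a node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; 0 by convention
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical interactions, for illustration only.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
adj = build_graph(edges)
```

Here `P1` has the highest degree (three partners), while `P2` has the highest clustering coefficient because both of its partners also interact with each other.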
Figure 4. Volcano plot of DEIRGs and GO and KEGG enrichment analyses. (A) Volcano plot of DEIRGs; red represents up-regulated genes and green represents down-regulated genes. (B) GO biological function enrichment analysis. (C) KEGG pathway enrichment analysis.
Figure 5. PPI network analysis and functional similarity analysis of DEIRGs. (A) PPI network analysis graph. The node size indicates the clustering coefficient; a larger node indicates a larger clustering coefficient, and thus a greater proportion of genes in the network. The node color indicates the degree; a higher degree indicates greater connectedness of the node. Blue represents a higher degree, yellow represents a medium degree, and orange represents a minimum degree. The line thickness represents the overall score. A higher score results in a thicker line, indicating a stronger interaction between the two proteins. (B) Schematic representation of hub immune-related genes. (C) Hub immune-related gene similarity analysis plot, with the abscissa as the similarity score.
Figure 6. Correlation plots of immune cell infiltration analysis. (A) Correlation heat map of 22 types of immune cells. Blue represents positive correlation and red represents negative correlation; the darker the color, the stronger the correlation. (B) Network diagram of interactions among 22 types of immune cells. The size of the circle represents the interaction strength between infiltrating immune cells. (C) Violin plot of the proportion of infiltration by 22 types of immune cells in normal control samples versus osteoarthritis samples. Red markers represent differences in infiltration between the two groups of samples.
Figure 7. Biomarker selection using two algorithms. (A) Biomarker screening with the LASSO logistic regression algorithm; (B) biomarker screening with the SVM-RFE algorithm; (C) Venn diagram showing the intersection of biomarkers obtained by the two algorithms.
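The Venn diagram in panel C amounts to a set intersection: only features selected by both algorithms are kept as candidate biomarkers. A minimal sketch of that step follows. The feature names are placeholders, since the record above reports only the final intersection and not the individual output of each algorithm.

```python
def shared_features(lasso_selected, svm_rfe_selected):
    """Return the features chosen by both selection methods, sorted by name."""
    return sorted(set(lasso_selected) & set(svm_rfe_selected))


# Placeholder feature names, for illustration only.
lasso_selected = ["FEATURE_A", "FEATURE_B", "FEATURE_C", "FEATURE_D"]
svm_rfe_selected = ["FEATURE_B", "FEATURE_D", "FEATURE_E"]

shared = shared_features(lasso_selected, svm_rfe_selected)
```

Requiring agreement between two different selection methods is a conservative choice: a feature favored by only one method is dropped.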
Figure 8. Verification of biomarkers. (A–C) Validation of the diagnostic efficacy of 15 biomarkers in the validation set; (D) validation of diagnostic efficacy after fitting 15 biomarkers into one variable.
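Diagnostic efficacy is reported above as the area under the ROC curve (AUC). For readers unfamiliar with the measure, the AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes it that way in plain Python. The scores and labels are invented for illustration, and this is not the authors' validation code.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive (1) and negative (0) samples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Invented marker scores: 1 = OA sample, 0 = normal control.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported threshold of 0.7 should be read.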