Literature DB >> 32872047

Identification of the key gene and pathways associated with osteoarthritis via single-cell RNA sequencing on synovial fibroblasts.

Zhen Wu1, Lu Shou2, Jian Wang1, Xinwei Xu1.   

Abstract

Osteoarthritis (OA) is a chronic degenerative joint disease with its onset closely related to the growth of synovial fibroblasts (SFs), yet the genes involved in are few reported. In our study, we aimed to identify the OA-associated key gene and pathways via the single-cell RNA sequencing (scRNA-seq) analysis on SFs.scRNA-seq data of SFs from OA sufferers were accessed from GEO database, then the genes involved in were subjected to principal component analysis (PCA) and T-Stochastic Neighbor Embedding (TSNE) Analysis. GO and KEGG enrichment analyses were performed to find the most enriched functions and pathways associated with marker genes and a PPI network was constructed to identify the key gene associated with OA occurrence.Findings revealed that marker genes in three cell types identified by TSNE were mainly activated in pathways firmly related to fibroblasts growth, such as extracellular matrix, immune and cell adhesion molecule binding-associated functions and pathways. Moreover, fibronectin1 (FN1) was validated as the key gene that was tightly related to the growth of SFs, as well as had the potential to play a key role in OA occurrence.Our study explored the key gene and pathways associated with OA occurrence, which were of great value in further investigation of OA diagnosis as well as pathogenesis.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 32872047      PMCID: PMC7437759          DOI: 10.1097/MD.0000000000021707

Source DB:  PubMed          Journal:  Medicine (Baltimore)        ISSN: 0025-7974            Impact factor:   1.817


Introduction

Osteoarthritis (OA) is a common chronic degenerative joint disease[ especially prevalent in middle-aged population, with around 100 million sufferers[ at present and approximately 40% over 70 years old.[ It is worth noting that the morbidity of OA increases with age,[ and females are more likely to be succumbed to such disease relative to males.[ In clinic, the therapeutic approaches against OA are predominantly pain control, malformation correction, and joint function improvement or recovery, but the overall efficacy remains poor.[ Therefore, more attention has been paid on the investigation of OA pathogenesis and further exploration in control measures, as well, mining OA-related genes has in turn become a focus in OA studies. Synovial fibroblasts (SFs) which play an important role in OA occurrence are direct effectors responding to tissue damage and matrix remodeling in synovitis, and synovial cell activation is triggered by the synovial cell-leucocyte interaction as well as cell-cell contact via the cytokine network.[ Besides, SFs can facilitate OA development via releasing inflammatory factors. Thus, to extend our knowledge of OA occurrence and the key genes involved in, it is of great guiding significance to clarify the mechanism underlying SFs occurrence and development as well as identify the associated genes. To date, several important regulatory genes have been identified during the research on SFs. Day-Williams et al found that MCF2L could function on the apoptosis of SFs through interacting with NRAGE, NRIF, NADE and other genes, thereby affecting OA progression.[ IL-13 has been identified as an important player in SFs via reducing the expression levels of cytokines and metalloproteinase, as reported by Jovanovic et al[ The discovery of these genes lays foundation for the further research on the mechanism of SFs occurrence, yet few studies have been made on the key genes that are associated with SFs and OA. With the development of RNA sequencing (RNA-seq) and single-cell sequencing techniques, most key genes associated with the occurrence of diseases and physiological processes can be identified. For example, Hu et al applied single-cell RNA sequencing (scRNA-seq) technique and revealed that some autophagy-related genes served as important players during the formation of mouse embryonic hematopoietic stem cells.[ Sun et al discovered some extracellular matrix (ECM)-associated genes which were crucial for breast cancer endothelial cells, and some of them had the potential acting as biomarkers in various cancers.[ Collectively, scRNA-seq analysis for identification of key genes associated with disease occurrence can be used as a significant approach for the investigation of the mechanism underlying cancer occurrence. In the present study, scRNA-seq data of SFs in OA sufferers were accessed from GEO database, and then genes involved in were processed for Principal Component Analysis (PCA) and T-Stochastic Neighbor Embedding (TSNE) analysis. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) network were eventually employed to identify the key genes and pathways that were firmly associated with OA occurrence.

Materials and Methods

Data processing

In total, 192 scRNA-seq files of SFs from 2 OA patients were accessed from microarray GSE109449 included in Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). Then statistical analysis was conducted with respect to gene number and expression levels.

Principal component analysis (PCA) and T-stochastic neighbor embedding (TSNE) analysis

Data fitting and R package were applied to perform PCA and TSNE analyses, as previously reported by Stuart et al[ and Butler[ et al.

Construction of protein-protein interaction (PPI) network

PPI network was constructed using STRING database (https://string-db.org/), and nodes associated to each gene were counted.

Results

Analysis on scRNA-seq of SFs

In all, scRNA-seq data of 192 SFs from 2 OA patients (OA4 and OA5) were obtained from microarray GSE109449 recorded in GEO database. Three aspects were selected for data analysis, including gene number in each cell (nFeature), total count number of all genes in each cell (nCount) and percentage of mitochondrial gene number in total gene number in each cell (percent.mt). As shown in Figure 1A–C nFeature was over 50 in every cell, whereas nCount of most cells was smaller than 5,000,000, and percent.mt was predominantly lower than 10%. As the expression of mitochondrial gene is generally low, cells with nFeature > 50 and percent.mt<5% were selected for subsequent analysis. Correlation analysis was performed on nCount and percent.mt or nFeature, finding a negative correlation between nCount and percent.mt, but a positive correlation between nCount and nFeature (Fig. 1D). This result further elucidated the low expression of genes in mitochondria. Thereafter, scatter diagram was plotted to find genes with larger expression alteration in different cells (Fig. 1E), and the top 1000 were selected for follow-up analysis.
Figure 1

Analysis on scRNA-seq data of SFs. scRNA-seq data of 192 SFs from 2 OA sufferers (OA4 and OA5) were accessed from GSE109449 and then analyzed in (A) nFeature, (B) nCount and (C) percent.mt three aspects, as shown in violin plots. Then, correlation analysis was performed on (D, left) nCount and percent.mt as well as on (D, right) nCount and nFeature, and (E) scatter diagram was plotted to find genes with larger expression alteration in different cells, with the abscissa referring to average expression, the vertical axis referring to standardized variance, black pots meaning the genes ranking after 1000 while red pots meaning the genes ranking top 1000.

Analysis on scRNA-seq data of SFs. scRNA-seq data of 192 SFs from 2 OA sufferers (OA4 and OA5) were accessed from GSE109449 and then analyzed in (A) nFeature, (B) nCount and (C) percent.mt three aspects, as shown in violin plots. Then, correlation analysis was performed on (D, left) nCount and percent.mt as well as on (D, right) nCount and nFeature, and (E) scatter diagram was plotted to find genes with larger expression alteration in different cells, with the abscissa referring to average expression, the vertical axis referring to standardized variance, black pots meaning the genes ranking after 1000 while red pots meaning the genes ranking top 1000.

PCA and TSNE analysis

In order to identify the marker genes from the top 1000 genes screened before, PCA was conducted and 20 principal components (PCs) were obtained. As revealed in the cell distribution in the top 2 PCs (PC1 and PC2) in Figure 2A, discreteness was obviously observed, indicating that PCA analysis could achieve the purpose of dimensionality reduction and cell classification. Then, cluster analysis was performed on the cells in these 20 PCs, and eventually three PCs with the most significant difference were identified, including PC1, PC2, and PC3 (Fig. 2B). Subsequently, cells in these 3 PCs were classified into 4 types (type 0, 1, 2, and 3) via TSNE analysis (Fig. 2C), genes among which were then differentially analyzed for marker genes identification (log|FC| > 0.5, P value < .05). Due to the distant relationship among cell type 1, 2 and 3, marker genes that were significantly different in these three types were chosen for follow-up enrichment analysis.
Figure 2

PCA and TSNE analyses. (A) The 1000 variant genes screened before were analyzed by PCA, and eventually 20 PCs were obtained. Then (B) the 20 PCs were clustered, ultimately 3 PCs with most significant difference was identified. The abscissa represents empirical value and the vertical axis represents theoretical value. In turn, (C) cells in these 3 PCs were sequentially clustered into four types via TSNE analysis.

PCA and TSNE analyses. (A) The 1000 variant genes screened before were analyzed by PCA, and eventually 20 PCs were obtained. Then (B) the 20 PCs were clustered, ultimately 3 PCs with most significant difference was identified. The abscissa represents empirical value and the vertical axis represents theoretical value. In turn, (C) cells in these 3 PCs were sequentially clustered into four types via TSNE analysis.

GO and KEGG enrichment analyses

GO and KEGG enrichment analyses were carried out on the marker genes identified from cell type 1, 2, and 3. The results demonstrated that genes in type 1 were mainly activated in functions related to extracellular structure organization, ECM organization and cell adhesion molecule binding (Fig. 3A), as well as enriched in ECM-receptor interaction pathway. Other than the functions and the pathway that type 1 genes activated in, genes in type 2 were also enriched in some immune-related functions like neutrophil degranulation and neutrophil mediated immunity (Fig. 3B), as well as some signaling pathways like PI3K-AKt. While for genes in type 3, the most enriched functions were similar to those of type 1 genes (Fig. 3C), yet pathways genes activated in were predominantly PI3K-AKt and ECM-receptor interaction pathways. Taken together, despite the most consistency in enriched functions and pathways of genes in these three types, a certain difference was still present among diverse cell types.
Figure 3

GO and KEGG enrichment analyses. GO (left) and KEGG (right) were conducted to analyze the most enriched functions and pathways of genes in cell type (A) 1, (B) 2 and (C) 3.

GO and KEGG enrichment analyses. GO (left) and KEGG (right) were conducted to analyze the most enriched functions and pathways of genes in cell type (A) 1, (B) 2 and (C) 3.

Selection of feature genes in TSNE cell types

To better identify the feature genes with remarkable expression features in cells, the top 10 marker genes in each cell type were selected according to the log|FC| value (Supplementary Table 1). Seurat 3.0 (https://satijalab.org/seurat/) was used to normalize and visualize the expression of the total 40 marker genes, and finally 20 marker genes with significantly differential expression were screened (Fig. 4A). Statistically, 10 marker genes in cell type 3 were obviously increased relative to those in the other 3 types cells, which demonstrated the more remarkable expression heterogeneity of the marker genes in type 3. Thereafter, the expression levels of the 10 marker genes in four types were extracted for visualization analysis, showing that the expression levels of the 10 genes in type 3 cells were greatly higher than those in the other 3 types cells (Fig. 4B and C). Taken together, all findings elucidated that marker genes in TSNE type 3 had much more significant expression difference relative to other types genes, as well, feature genes associated with SFs growth were more likely to appear among these genes.
Figure 4

Selection of significantly variant marker genes in TSNE cell types. The expression levels of variant genes in four cell types were plotted in a (A) heatmap, with purple as low expression genes and yellow as high expression genes. The abscissa represents the cells in type 0, 1, 2, and 3, while genes on the vertical axis represent the marker genes of four cell types that were normalized using Seurat 3.0. 10 marker genes of type 3 were analyzed in (B) expression in type 3 cells, with red meaning low expression and green meaning high expression. Then (C) these genes were analyzed in all 4 cell types, with abscissa referring to cell types and vertical axis referring to expression levels.

Selection of significantly variant marker genes in TSNE cell types. The expression levels of variant genes in four cell types were plotted in a (A) heatmap, with purple as low expression genes and yellow as high expression genes. The abscissa represents the cells in type 0, 1, 2, and 3, while genes on the vertical axis represent the marker genes of four cell types that were normalized using Seurat 3.0. 10 marker genes of type 3 were analyzed in (B) expression in type 3 cells, with red meaning low expression and green meaning high expression. Then (C) these genes were analyzed in all 4 cell types, with abscissa referring to cell types and vertical axis referring to expression levels.

Construction of PPI network and identification of key genes

Marker genes in cell type 3 were projected onto a PPI network using STRING website. As plotted in Figure 5A, all these marker genes showed a certain correlation with some genes in the network, which revealed that these marker genes might affect SFs via mutual adjustment. In addition, FN1 was found to possess the maximum node numbers (Fig. 5B), thus, it had the potential to function on the growth of SFs, consequently making an impact on OA occurrence.
Figure 5

Construction of PPI network and identification of the key gene. Marker genes in TSNE cell type 3 were projected onto (A) a PPI network and analyzed in (B) node number. The abscissa presented as node number.

Construction of PPI network and identification of the key gene. Marker genes in TSNE cell type 3 were projected onto (A) a PPI network and analyzed in (B) node number. The abscissa presented as node number.

Discussion

scRNA-seq is a technique combined with single-cell separation and RNA-seq methods, with the latest progress as detailed quantitative analysis on studied transcripts for the expression detection of all genes in thousands of separated cells at the single-cell level.[ In the past several years, single-cell cDNA libraries have been constructed, where scholars have the chance to obtain scRNA-seq data of the sequenced samples from relevant databases, in turn exploring the targeted key genes and pathways via analysis on related genes from the multi-dimension and multi-feature aspects.[ This study, for the first time, focused on the SFs of OA for gene analysis. scRNA-seq data of totally 192 SFs from 2 OA patients were accessed from GEO database, and the cells involved in were analyzed in nFeature, nCount and percent.mt three aspects. Consequently, 1000 genes with larger expression variance in different cells and had a high expression except in mitochondria were identified. PCA, a conditional screening method for various genes with differential expression, has been often used for dimensionality reduction of genes in the scRNA-seq research.[ In the present study, PCA was performed on the genes associated with SFs for the first time, and 3 PCs with significant difference were identified, cells among which were sequentially classified into 4 types via TSNE clustering analysis. GO and KEGG analyses were conducted on the marker genes from the 3 cell types, finding that functions and pathways genes activated in were predominantly ECM, immune as well as cell adhesion molecule binding-related functions and pathways, which are firmly associated with the occurrence of fibroblasts.[ Collectively, the results indicated the successful identification of genes and corresponding enrichment pathways closely related to SFs. After enrichment analysis, marker genes of cell type 3 (with great difference in expression with other cell types) were projected onto a PPI network, and FN1 was seen to be the key gene. FN1 (fibronectin1) is a vital component of ECM with a diversity of functions, such as accelerating blood clotting, possessing chemotaxis in neutrophil and monocyte, potentiating the movement of fibroblasts and chondrocytes to the injury, and participating in a series of physiological and pathological processes including transduction as well as activation of cell signals, cell growth and differentiation, and wound healing.[ Many studies have reported that the alteration of FN1 mRNA is tightly correlated with the OA occurrence.[ In the present study, FN1 was eventually determined as the key gene associated with OA occurrence via gene analysis in SFs of OA sufferers, supporting its important role in OA occurrence. In conclusion, a series of analyses were performed on the scRNA-seq data of SFs in OA sufferers, and identified that OA occurrence-related genes were mainly enriched in ECM, immune and cell adhesion molecule binding-related functions and pathways. In addition, FN1 was considered as the key gene functioning in OA occurrence. Meanwhile, some limitations were appeared in our study. For instance, the scRNA-seq data were only treated with one method of dimensionality reduction for gene screening, which needs further improvement. In a word, our study provides a novel research object for OA pathogenesis, and helps to explore a potential therapeutic target for OA.

Strengths and limitations of this study

The key pathways associated with osteoarthritis were firstly identified through single-cell RNA sequencing; The protein-protein interaction (PPI) network of osteoarthritis-associated genes was constructed for the first time. The key gene and pathways associated with osteoarthritis occurrence were identified and were of great value in further investigation of osteoarthritis diagnosis as well as pathogenesis.

Author contributions

Zhen Wu, Lu Shou, Jian Wang and Xinwei Xu contributed to the study design. Zhen Wu, Lu Shou and Xinwei Xu conducted the literature search. Zhen Wu, Lu Shou, Jian Wang and Xinwei Xu acquired the data. Zhen Wu, Lu Shou, Jian Wang and Xinwei Xu wrote the article and performed data analysis. All authors gave the final approval of the version to be submitted. All authors read and approved the final manuscript.
  18 in total

Review 1.  Stochastic modelling for quantitative description of heterogeneous biological systems.

Authors:  Darren J Wilkinson
Journal:  Nat Rev Genet       Date:  2009-02       Impact factor: 53.242

2.  Fibroblast growth factor receptors display both common and distinct signaling pathways.

Authors:  E Shaoul; R Reich-Slotky; B Berman; D Ron
Journal:  Oncogene       Date:  1995-04-20       Impact factor: 9.867

Review 3.  Pathogenesis and management of pain in osteoarthritis.

Authors:  Paul A Dieppe; L Stefan Lohmander
Journal:  Lancet       Date:  2005 Mar 12-18       Impact factor: 79.321

Review 4.  [Osteoarthritis of the shoulder: pathogenesis, diagnostics and conservative treatment options].

Authors:  J Mehl; A B Imhoff; K Beitzel
Journal:  Orthopade       Date:  2018-05       Impact factor: 1.087

Review 5.  Functional effects of susceptibility genes in osteoarthritis.

Authors:  Frederique M F Cornelis; Frank P Luyten; Rik J Lories
Journal:  Discov Med       Date:  2011-08       Impact factor: 2.970

6.  Single-cell RNA sequencing highlights transcription activity of autophagy-related genes during hematopoietic stem cell formation in mouse embryos.

Authors:  Yongfei Hu; Yan Huang; Ying Yi; Hongwei Wang; Bing Liu; Jia Yu; Dong Wang
Journal:  Autophagy       Date:  2017-01-27       Impact factor: 16.016

7.  Metabolomics Signature for Non-Responders to Total Joint Replacement Surgery in Primary Osteoarthritis Patients: The Newfoundland Osteoarthritis Study.

Authors:  Christie A Costello; Ting Hu; Ming Liu; Weidong Zhang; Andrew Furey; Zhaozhi Fan; Proton Rahman; Edward W Randell; Guangju Zhai
Journal:  J Orthop Res       Date:  2019-11-25       Impact factor: 3.494

8.  Effect of IL-13 on cytokines, cytokine receptors and inhibitors on human osteoarthritis synovium and synovial fibroblasts.

Authors:  D Jovanovic; J P Pelletier; N Alaaeddine; F Mineau; C Geng; P Ranger; J Martel-Pelletier
Journal:  Osteoarthritis Cartilage       Date:  1998-01       Impact factor: 6.576

9.  Single-cell RNA sequencing reveals gene expression signatures of breast cancer-associated endothelial cells.

Authors:  Zhengda Sun; Chih-Yang Wang; Devon A Lawson; Serena Kwek; Hugo Gonzalez Velozo; Mark Owyong; Ming-Derg Lai; Lawrence Fong; Mark Wilson; Hua Su; Zena Werb; Daniel L Cooke
Journal:  Oncotarget       Date:  2017-12-29

10.  Identification of Biomarkers Associated With Pathological Stage and Prognosis of Clear Cell Renal Cell Carcinoma by Co-expression Network Analysis.

Authors:  Liang Chen; Lushun Yuan; Kaiyu Qian; Guofeng Qian; Yuan Zhu; Chin-Lee Wu; Han C Dan; Yu Xiao; Xinghuan Wang
Journal:  Front Physiol       Date:  2018-04-18       Impact factor: 4.566

View more
  6 in total

1.  The Crosstalk Between Immune Infiltration, Circulating Tumor Cells, and Metastasis in Pancreatic Cancer: Identification of HMGB3 From a Multiple Omics Analysis.

Authors:  Hao-Dong Tang; Yang Wang; Peng Xie; Si-Yuan Tan; Hai-Feng Li; Hao Shen; Zheng Zhang; Zheng-Qing Lei; Jia-Hua Zhou
Journal:  Front Genet       Date:  2022-06-08       Impact factor: 4.772

2.  Consensus clustering of single-cell RNA-seq data by enhancing network affinity.

Authors:  Yaxuan Cui; Shaoqiang Zhang; Ying Liang; Xiangyun Wang; Thomas N Ferraro; Yong Chen
Journal:  Brief Bioinform       Date:  2021-11-05       Impact factor: 11.622

Review 3.  Single Cell Omics for Musculoskeletal Research.

Authors:  Muhammad Farooq Rai; Chia-Lung Wu; Terence D Capellini; Farshid Guilak; Amanda R Dicks; Pushpanathan Muthuirulan; Fiorella Grandi; Nidhi Bhutani; Jennifer J Westendorf
Journal:  Curr Osteoporos Rep       Date:  2021-02-09       Impact factor: 5.096

4.  Protein N-glycosylation aberrations and glycoproteomic network alterations in osteoarthritis and osteoarthritis with type 2 diabetes.

Authors:  Yi Luo; Ziguang Wu; Song Chen; Huanhuan Luo; Xiaoying Mo; Yao Wang; Jianbang Tang
Journal:  Sci Rep       Date:  2022-04-28       Impact factor: 4.996

5.  Multi-omics molecular biomarkers and database of osteoarthritis.

Authors:  Jianhua Li; Xiaotian Yang; Qinjie Chu; Lingjuan Xie; Yuwen Ding; Xiaoxu Xu; Michael P Timko; Longjiang Fan
Journal:  Database (Oxford)       Date:  2022-07-05       Impact factor: 4.462

Review 6.  Current understanding of osteoarthritis pathogenesis and relevant new approaches.

Authors:  Liping Tong; Huan Yu; Xingyun Huang; Jie Shen; Guozhi Xiao; Lin Chen; Huaiyu Wang; Lianping Xing; Di Chen
Journal:  Bone Res       Date:  2022-09-20       Impact factor: 13.362

  6 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.