Literature DB >> 34880861

CD247, a Potential T Cell-Derived Disease Severity and Prognostic Biomarker in Patients With Idiopathic Pulmonary Fibrosis.

Yupeng Li1, Shibin Chen2, Xincheng Li1, Xue Wang1, Huiwen Li1, Shangwei Ning3, Hong Chen1.   

Abstract

Background: Idiopathic pulmonary fibrosis (IPF) has high mortality worldwide. The CD247 molecule (CD247, as known as T-cell surface glycoprotein CD3 zeta chain) has been reported as a susceptibility locus in systemic sclerosis, but its correlation with IPF remains unclear.
Methods: Datasets were acquired by researching the Gene Expression Omnibus (GEO). CD247 was identified as the hub gene associated with percent predicted diffusion capacity of the lung for carbon monoxide (Dlco% predicted) and prognosis according to Pearson correlation, logistic regression, and survival analysis.
Results: CD247 is significantly downregulated in patients with IPF compared with controls in both blood and lung tissue samples. Moreover, CD247 is significantly positively associated with Dlco% predicted in blood and lung tissue samples. Patients with low-expression CD247 had shorter transplant-free survival (TFS) time and more composite end-point events (CEP, death, or decline in FVC >10% over a 6-month period) compared with patients with high-expression CD247 (blood). Moreover, in the follow-up 1st, 3rd, 6th, and 12th months, low expression of CD247 was still the risk factor of CEP in the GSE93606 dataset (blood). Thirteen genes were found to interact with CD247 according to the protein-protein interaction network, and the 14 genes including CD247 were associated with the functions of T cells and natural killer (NK) cells such as PD-L1 expression and PD-1 checkpoint pathway and NK cell-mediated cytotoxicity. Furthermore, we also found that a low expression of CD247 might be associated with a lower activity of TIL (tumor-infiltrating lymphocytes), checkpoint, and cytolytic activity and a higher activity of macrophages and neutrophils.
Conclusion: These results imply that CD247 may be a potential T cell-derived disease severity and prognostic biomarker for IPF.
Copyright © 2021 Li, Chen, Li, Wang, Li, Ning and Chen.

Entities:  

Keywords:  CD247; biomarker; idiopathic pulmonary fibrosis; immune response; inflammation response; prognosis

Mesh:

Substances:

Year:  2021        PMID: 34880861      PMCID: PMC8645971          DOI: 10.3389/fimmu.2021.762594

Source DB:  PubMed          Journal:  Front Immunol        ISSN: 1664-3224            Impact factor:   7.561


Introduction

Idiopathic pulmonary fibrosis (IPF) causes worsening dyspnea and deteriorating lung function, which results in poor prognosis (1). Actually, patients with IPF often die within 2–3 years after diagnosis (2, 3), and the 5-year survival rate is less than 40% (4). Therefore, it is important to identify effective biomarkers for disease severity and prognosis in patients with IPF, which might identify patients with a worse predicted prognosis early and might then benefit from more aggressive interventions or earlier referral for transplantation. Increasing studies have shown that innate and adaptive immune processes can coordinate existing fibrotic responses and are associated with prognosis in patients with IPF (5, 6). CD247 (also referred to as T-cell surface glycoprotein CD3 zeta chain) is part of the T-cell antigen receptor (TCR) complex, playing an important role in receptor expression and signaling (7, 8). Studies have suggested that a low expression of CD247 caused in the setting of chronic inflammation was associated with decreased T cell activity (9–11). Interestingly, the caused immunosuppression is only associated with downregulation of CD247 while the remaining TCR subunits are unaffected, which implies that the CD247 downregulation occurs at chronic inflammation not at acute inflammatory response (9, 10). Furthermore, downregulation of CD247 had been reported in chronic inflammatory diseases such as celiac disease (12), chronic obstructive pulmonary disease (11), systemic lupus erythematosus (13), and systemic sclerosis (14). As far as we know, the association between CD247 and IPF has not been reported. In this study, according to publicly available databases, we presented evidence of such an association between CD247 and immune microenvironment phenotype and evaluated the role of CD247 expression in patients with IPF.

Materials and Methods

Acquisition of Datasets

On the GEO database (http://www.ncbi.nlm.nih.gov/geo/), the datasets meeting the following criteria were included: (1) datasets with data of lung function and (2) datasets with prognostic data (also including the data of decline of lung function). In addition, in order to compare the CD247 expression between IPF and control and clarify the change process of Cd247 expression of T cells in the process of fibrosis, the GSE110147 (15), GSE33566 (16), and GSE141259 (17) datasets (bleomycin-induced mouse model of pulmonary fibrosis) were chosen. In the GSE141259 dataset, single-cell RNA-sequencing (scRNA-seq) data of 0, 3, 7, 10, 14, 21, and 28 days were provided after bleomycin was used. Finally, we selected 11 datasets: four came from lung tissue samples (GSE32537 (18), GSE47460 (19), GSE110147 (15), GSE141259 (17)), six came from blood samples (GSE93606 (20), GSE27957 (21), GSE28042 (21), GSE38958 (22), GSE132607 (23), GSE33566 (16)), and one came from bronchoalveolar lavage fluid (BALF, GSE70866 (24)). Approval of the Ethics Committee was not necessary for the datasets from the GEO database.

Dataset Preprocessing

The gene expression matrix of GSE32537 (18), GSE47460 (19), GSE110147 (15), GSE93606 (20), GSE28042 (21), GSE38958 (22), GSE132607 (23), GSE33566 (16), and GSE70866 (24) and the raw data of GSE27957 (21) were downloaded from GEO. “Affy” package (25) (v.1.68.0) was used to normalize the array data according to the robust multi-array average (RMA) method (GSE27957). In addition, lung function [percent predicted forced vital capacity (FVC% predicted) and percent predicted diffusion capacity of the lung for carbon monoxide (Dlco% predicted)] was extracted from the GSE93606, GSE38958, GSE132607, GSE32537, and GSE47460 datasets ( , ). The data of lung function were not complete in the GSE33566 dataset. Therefore, the data of lung function were not used in this study. Transplant-free survival (TFS) was extracted from the GSE27957 and GSE28042 datasets ( ). The composite end point (CEP, death, or decline in FVC >10% over a 6-month period) and survival were extracted from the GSE93606 and GSE70866 datasets, respectively ( , and ). Furthermore, follow-up data were also extracted from the GSE132607 (transcriptome data and lung function data) and GSE93606 datasets (transcriptome data), respectively.
Table 1

The clinical features of patients with IPF (blood).

Clinical featuresGSE93606GSE27957GSE28042GSE38958GSE132607GSE33566
No. of patients574575607493
No. of controls2001945030
Age (years)67 ± 866.9 ± 8.168.9 ± 8.168.2 ± 7.266.6 ± 7.667.2 ± 11.4
Gender
 Male38 (66.7)40 (88.9)52 (69.3)49 (81.7)52 (70.3)61 (65.6)
 Female19 (33.3)5 (11.1)23 (30.7)11 (18.3)22 (29.7)32 (34.4)
FVC% predicted72.22 ± 20.2662 ± 1465 ± 1662.42 ± 14.9769.66 ± 18.42
Dlco% predicted39.17 ± 14.0744 ± 1749 ± 1843.25 ± 18.7345.61 ± 15.45
Rate of lung transplants2 (4.4)15 (20)
Immunosuppressive therapy02 (4.4)11 (14.7)0
Status
 Non-TFS15 (33.3)43 (57.3)
 TFS30 (66.7)32 (42.7)
 CEP a 34 (59.6)
 Non-CEP23 (40.4)
 FVC decline ≥10% b 16 (21.6)
 Dlco decline ≥15% b 26 (35.1)

Death or decline in FVC >10% over the 6-month period.

Patients experiencing ≥10% relative reduction in FVC% predicted or ≥15% relative reduction in Dlco% predicted over 12 months.

Values are presented as n (%) or mean ± standard deviation (SD).

FVC% predicted, percent predicted forced vital capacity; Dlco% predicted, percent predicted diffusion capacity of the lung for carbon monoxide; TFS, transplant-free survival; CEP, composite end point.

Table 2

The clinical features of patients with IPF (lung tissue).

Clinical featuresGSE32537GSE47460GSE110147
No. of patients11912222
No. of controls509111
Age (years)62.6 ± 8.764.5 ± 8.462 ± 6
Gender
 Male77 (64.7)81 (66.4)17 (77.3)
 Female42 (35.3)41 (33.6)5 (22.7)
FVC% predicted61.25 ± 17.0264.29 ± 16.4257 ± 19
Dlco% predicted45.13 ± 20.3049.51 ± 18.7337 ± 10
Immunosuppressive therapy15 (68.2)

Values are presented as mean ± SD or n (%).

FVC% predicted, percent predicted forced vital capacity; Dlco% predicted, percent predicted diffusion capacity of the lung for carbon monoxide.

The clinical features of patients with IPF (blood). Death or decline in FVC >10% over the 6-month period. Patients experiencing ≥10% relative reduction in FVC% predicted or ≥15% relative reduction in Dlco% predicted over 12 months. Values are presented as n (%) or mean ± standard deviation (SD). FVC% predicted, percent predicted forced vital capacity; Dlco% predicted, percent predicted diffusion capacity of the lung for carbon monoxide; TFS, transplant-free survival; CEP, composite end point. The clinical features of patients with IPF (lung tissue). Values are presented as mean ± SD or n (%). FVC% predicted, percent predicted forced vital capacity; Dlco% predicted, percent predicted diffusion capacity of the lung for carbon monoxide. Pearson correlation coefficients were used to determine the association between CD247 and other genes or lung function according to R package “stats” (v.4.0.5) among these datasets. R packages “gplots” (v.3.1.1), “pheatmap” (v.1.0.12), and “RColorBrewer (v.1.1-2)” were used to construct the heatmap. R package “forestplot” (v.1.10.1) was used to construct the forest plot (https://CRAN.R-project.org/package=forestplot).

Analysis of scRNA-seq Data

The computational analysis of the GSE141259 dataset was performed using R package “Seurat” (4.0.3) (26). Quality control had been finished by the authors of this GSE141259 dataset; therefore, 29,297 cells were analyzed. The Seurat SCTransform () function was used to normalize the scRNA-seq data. Principal component analysis (PCA) was calculated using the Seurat RunPCA () function. UMAP embedding and Louvain clusters were calculated using the first 50 principal components with the Seurat RunUMAP () and FindClusters () functions, respectively. Resolution was set as 0.9. The Seurat FindAllMarkers () function was used to find markers of 31 clusters, and cell types were identified based on markers of each cluster according to the CellMarker (27) and PanglaoDB databases (28). Expression and distribution of Cd247 were visualized according to Seurat DotPlot () and FeaturePlot () functions. Dynamic Cd247 expression was visualized by the web (https://theislab.github.io/LungInjuryRegeneration/).

The Identification of CD247-Related Partners

We used STRING (http://www.string.embl.de/, version: 11.0 b) (29) to analyze proteins that purportedly interact with CD247 [“medium confidence (0.400)”, meaning of network edges (“evidence”), max number of interactors to show (“no more than 50 interactors” in 1st shell), and active interaction sources (“experiments”)]. The MEM database (30) (https://biit.cs.ut.ee/mem/index.cgi) was used to verify the correlation between CD247 and genes obtained from STRING.

Functional Analysis

Differentially expressed genes (DEGs) were defined as expression levels of genes that were significantly diverse in IPF patients with low-expression CD247 compared with those with high-expression CD247 (|log Fold Change|>0.5 and false discovery rates (FDR) < 0.05). “Limma” package (v.3.46.0) (31) was used for the analysis of DEGs. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were analyzed and visualized according to R package “clusterProfiler” (32). P values were adjusted with the Benjamini and Hochberg (BH) correction method. The single-sample gene set enrichment analysis (ssGSEA) score of 19 immune cells and the activity of 15 immune-related pathways (33) were calculated by the “GSVA” R package (v.1.38.2) (34). CIBERSORT is a large-scale analysis tool of RNA mixtures for cellular biomarkers and therapeutic targets according to the gene expression feature sets of 22 immune cell subtypes (http://cibersort.stanford.edu/) (35). Subsequently, the 20 immune cell subtypes were classified into four types: lymphocytes (B cells naive, B cells memory, plasma cells, T cells CD8, T cells CD4 naive, T cells CD4 memory resting, T cells CD4 memory activated, T cells follicular helper, T cells regulatory, T cells gamma delta, NK cells resting, NK cells activated), macrophages (monocytes, macrophages M0, macrophages M1, macrophages M2), dendritic cells (dendritic cells resting, dendritic cells activated), and mast cells (mast cells resting and mast cells activated).

Statistical Analysis

SPSS Statistics 23 (IBM SPSS) and R software (Version 4.0.3) were used for statistical analysis. Categorical variables were described as number (%) and were compared by the chi-square test or the Fisher exact test. Continuous variables were compared by independent group t tests. The diagnostic accuracy of CD247 for IPF or Dlco% predicted decline ≥15% over 12 months (Dlco15) were estimated by receiver operating characteristic (ROC) analysis. Youden index was used to calculate the optimal cutoff values of CD247 for Dlco15. Logistic regression analysis was used to estimate the odds ratio (OR) for the statistically significant correlation between gene expression and Dlco15. Kaplan–Meier analysis was used to compare the TFS or CEP or survival between different groups. R package “survminer” (https://CRAN.R-project.org/package=survminer, v.0.4.8) was used to estimate the optimal cutoff expression value of CD247 for the survival analysis. The hazard ratio (HR) for the statistically significant correlation between CD247 expression and TFS or CEP was estimated by univariate Cox regression. The R package “survivalROC” (https://CRAN.R-project.org/package=survivalROC, v.1.0.3) was used to construct the time-dependent ROC curve to evaluate the predictive value of CD247. Some statistical analyses were visualized by GraphPad Prism 9. The bilateral test was used.

Results

Identification of the Hub Gene: CD247

The association between CD247 expression and lung function was respectively determined in the GSE38958, GSE132607, and GSE93606 datasets (blood samples). The significant correlations (p < 0.05) between genes and lung function are shown in . After intersecting the genes with significant correlations in the three blood datasets, 33 genes associated with Dlco% predicted and 17 genes associated with FVC% predicted were selected for further analysis ( , and ). Furthermore, of the 33 genes, 14 genes with consistent positive correlations (ATP10A, CD244, CD247, DDX19A, DIS3L, OSBPL3, RFX5, SH3YL1, TDRKH, TMTC4, TTC39B, TXK, UBE3C, and UTP15) and 11 genes with consistent negative correlations (BNIPL, C9orf131, DUSP13, GJA5, GUCY2D, MYL4, RHAG, SLC28A1, SLC6A7, TNFRSF19, and TULP2) were selected as the candidate hub genes ( , pink rectangles). Of the 17 genes, one gene with consistent positive correlations (PMPCB) and five genes with consistent negative correlations (ALLC, ANLN, DRD3, GLT8D2, and NECAB1) were also selected as the candidate hub genes ( , pink rectangles).
Figure 1

Identification of candidate hub genes. (A) The intersection of genes associated with Dlco% predicted in the GSE38958, GSE132607, and GSE93606 datasets. (B) Heatmap plot of the 33 genes associated with Dlco% predicted in the three datasets. (C) The intersection of genes associated with FVC% predicted in the three datasets. (D) Heatmap plot of the 17 genes associated with FVC% predicted in the three datasets. The numbers in the heatmap represent the correlation coefficients, red represents the positive correlation, green represents the negative correlation, the darker shade of red or green represents the higher correlation level, and the pink rectangles showed the genes were selected as the candidate hub genes.

Identification of candidate hub genes. (A) The intersection of genes associated with Dlco% predicted in the GSE38958, GSE132607, and GSE93606 datasets. (B) Heatmap plot of the 33 genes associated with Dlco% predicted in the three datasets. (C) The intersection of genes associated with FVC% predicted in the three datasets. (D) Heatmap plot of the 17 genes associated with FVC% predicted in the three datasets. The numbers in the heatmap represent the correlation coefficients, red represents the positive correlation, green represents the negative correlation, the darker shade of red or green represents the higher correlation level, and the pink rectangles showed the genes were selected as the candidate hub genes. The follow-up data (4, 8, 12 months) were extracted from the GSE132607 dataset (blood). In the fourth month, only MYL4 was significantly negatively associated with Dlco% predicted ( ). In the eighth month, six genes (CD247, DIS3L, OSBPL3, TDRKH, TTC39B, and UTP15) were significantly positively associated with Dlco% predicted, and three genes (GUCY2D, MYL4, and RHAG) were significantly negatively associated with Dlco% predicted ( ). In the 12th month, five genes (CD244, CD247, DDX19A, RFX5, and TXK) were significantly positively associated with Dlco% predicted, and MYL4 was significantly negatively associated with Dlco% predicted ( ). However, no genes were significantly associated with FVC% predicted in the follow-up data. Therefore, CD247 and MYL4 with a greatly consistent significant correlation were chosen for further analysis.
Table 3

The correlations between gene expression and Dlco% predicted or FVC% predicted after visiting at 4, 8, and 12 months in the GSE132607 dataset.

GenesBaseline4 months8 months12 months
rp valuerp valuerp valuerp value
Dlco% predicted
ATP10A0.2580.0260.0840.4840.0200.8770.0110.934
BNIPL-0.2390.0410.0060.960-0.1700.175-0.2250.083
C9orf131-0.2570.027-0.1400.242-0.2100.0940.1350.305
CD2440.2870.0130.1140.3420.1910.1270.2620.043
CD247 0.259 0.026 0.195 0.100 0.261 0.036 0.385 0.002
DDX19A0.2570.0270.0990.4060.2000.1110.2710.036
DIS3L0.3040.0080.1900.1090.3240.0080.2000.126
DUSP13-0.2410.0390.0180.879-0.0790.529-0.1960.134
GJA5-0.3080.008-0.1020.396-0.1960.118-0.0180.890
GUCY2D-0.2610.025-0.0190.871-0.3240.008-0.0620.636
MYL4 -0.254 0.029 -0.289 0.014 -0.403 0.001 -0.433 0.001
OSBPL30.2500.0310.0660.5810.2620.0350.1420.278
RFX50.2330.0460.1850.1210.2230.0740.2720.035
RHAG-0.2500.032-0.2210.062-0.2470.047-0.2330.073
SH3YL10.3150.0060.1110.3520.2270.0690.1990.128
SLC28A1-0.2300.0490.0690.565-0.0140.913-0.0890.500
SLC6A7-0.2920.012-0.0790.508-0.1020.419-0.0540.684
TDRKH0.2650.0220.1480.2150.2800.0240.1180.370
TMTC40.2450.0360.0290.8070.0040.975-0.0580.662
TNFRSF19-0.2430.037-0.0550.6440.0340.789-0.0130.919
TTC39B0.2720.0190.1750.1410.2670.0320.2090.109
TULP2-0.2680.021-0.0070.955-0.1240.324-0.0820.533
TXK0.2630.0240.1120.3500.1710.1740.2740.034
UBE3C0.2600.025-0.1420.2350.2200.078-0.0930.478
UTP150.3370.0030.0870.4680.2530.0420.0180.890
FVC% predicted
ALLC-0.2450.0360.0510.6700.0310.8060.0630.630
ANLN-0.2460.0360.1090.3590.0250.838-0.2130.103
DRD3-0.2870.0140.0050.9690.0550.6590.0110.933
GLT8D2-0.2940.0120.0770.516-0.0790.527-0.1540.240
NECAB1-0.2370.0440.1460.2190.0110.928-0.0960.466
PMPCB0.2630.0250.0100.932-0.1020.412-0.1230.348

Bold values represented that the genes were selected for further study.

The correlations between gene expression and Dlco% predicted or FVC% predicted after visiting at 4, 8, and 12 months in the GSE132607 dataset. Bold values represented that the genes were selected for further study. Dlco% predicted decline ≥15% over 12 months (Dlco15) is a useful index to evaluate whether patients with IPF were progressive. Patients with Dlco15 were more likely to have the lower CD247 expression and higher MYL4 expression compared with those without Dlco15 in the GSE132607 dataset ( and ). In addition, Dlco% predicted in patients with low-expression CD247 classified by the median value of CD247 at baseline was significantly lower than that in patients with high-expression CD247 after visiting at 4, 8, and 12 months in the GSE132607 dataset ( and ). Likewise, Dlco% predicted in patients with high-expression MYL4 classified by the optimal cutoff value (4.37) of MYL4 at baseline was significantly lower than that in patients with low-expression MYL4 after visiting at 4, 8, and 12 months ( and ). Subsequently, two lung tissue datasets (GSE32537 and GSE47460) were used. As shown in , CD247 expression was significantly positively associated with Dlco% predicted in both blood and lung tissue. However, MYL4 was significantly negatively associated with Dlco% predicted in blood samples, not in lung tissue samples ( ). Furthermore, CD247 was not significantly correlated with GAP (gender, age, and physiological index) in the GSE70866 dataset (BALF sample, data not shown).
Figure 2

The expression of CD247 and MYL4 and their value for Dlco% predicted after visiting at 0, 4, 8, and 12 months in the GSE132607 dataset. (A) The change of mean CD247 expression. (B) The comparison of mean Dlco% predicted between high-expression and low-expression CD247. (C) The change of mean MYL4 expression. (D) The comparison of mean Dlco% predicted between high-expression and low-expression MYL4. *p < 0.05.

Table 4

Baseline characteristics of the patients among different groups in the GSE132607 dataset.

CharacteristicsCD247MYL4
High expressionLow expressionp value High expressionLow expressionp value
Age (years)66.86 ± 7.3266.40 ± 8.040.79666.52 ± 7.8666.93 ± 7.200.839
Gender
 Female8 (21.6)14 (37.8)0.12715 (27.8)7 (35.0)0.546
 Male29 (78.4)23 (62.2)39 (72.2)13 (65.0)
Smoking
 No12 (32.4)13 (35.1)1.00015 (27.8)10 (50.0)0.073
 Yes25 (67.6)24 (64.9)39 (72.2)10 (50.0)
Baseline
 FVC% predicted72.14 ± 20.6467.25 ± 15.90.25967.89 ± 17.874.69 ± 19.70.168
 Dlco% predicted49.78 ± 15.7241.44 ± 14.20.01942.76 ± 14.653.28 ± 15.50.008
4 months
 FVC% predicted71.62 ± 20.3365.72 ± 15.90.17166.04 ± 18.475.50 ± 16.80.049
 Dlco% predicted48.78 ± 15.1439.23 ± 13.10.00639.97 ± 13.254.51 ± 14.1< 0.001
8 months
 FVC% predicted72.31 ± 20.3565.35 ± 15.30.11865.82 ± 18.276.07 ± 16.80.034
 Dlco% predicted49.51 ± 13.5839.16 ± 15.40.00640.73 ± 14.653.21 ± 13.60.002
12 months
 FVC% predicted68.63 ± 21.5564.36 ± 14.50.37564.94 ± 18.070.49 ± 18.80.301
 Dlco% predicted46.02 ± 15.3237.33 ± 12.50.01938.65 ± 12.949.46 ± 16.10.009
Figure 3

The correlation between CD247 expression and lung function. Blood: GSE38958 dataset (A), GSE93606 dataset (B), visiting at 0, 4, 8, and 12 months in the GSE132607 dataset (C–F). Lung tissue: GSE47460 dataset (G, H) and GSE32537 dataset (I, J).

The expression of CD247 and MYL4 and their value for Dlco% predicted after visiting at 0, 4, 8, and 12 months in the GSE132607 dataset. (A) The change of mean CD247 expression. (B) The comparison of mean Dlco% predicted between high-expression and low-expression CD247. (C) The change of mean MYL4 expression. (D) The comparison of mean Dlco% predicted between high-expression and low-expression MYL4. *p < 0.05. Baseline characteristics of the patients among different groups in the GSE132607 dataset. The correlation between CD247 expression and lung function. Blood: GSE38958 dataset (A), GSE93606 dataset (B), visiting at 0, 4, 8, and 12 months in the GSE132607 dataset (C–F). Lung tissue: GSE47460 dataset (G, H) and GSE32537 dataset (I, J). The expression of CD247 in patients with IPF was lower than that in controls in the GSE93606 (blood), GSE33566 (blood), GSE47460 (lung tissue), and GSE110147 (lung tissue) datasets, whereas the expression of CD247 did not show a difference in the GSE38958 (blood) and GSE32537 (lung tissue) datasets ( ). The expression of CD247 in patients with IPF (n = 75) was lower than that in controls (n = 19) in the GSE28042 dataset (13.24 vs. 13.56, p = 0.0043). Immunosuppressive therapy was not used before the blood samples were collected in the GSE93606 and GSE33566 datasets, implying that the expression of CD247 in the blood samples was not affected by the immunosuppressive therapy ( ). The diagnostic values of CD247 for IPF were variable in different datasets ( ). Besides, the expression of MYL4 did not show the significant difference in these datasets (data not shown). Therefore, CD247 was considered as the key gene.
Figure 4

The expression of CD247 in both blood and lung tissue. Blood: GSE38958 dataset (A), GSE93606 dataset (B), GSE33566 dataset (C). Lung tissue: GSE47460 dataset (D), GSE32537 dataset (E), GSE110147 dataset (F). ns, not significant.

The expression of CD247 in both blood and lung tissue. Blood: GSE38958 dataset (A), GSE93606 dataset (B), GSE33566 dataset (C). Lung tissue: GSE47460 dataset (D), GSE32537 dataset (E), GSE110147 dataset (F). ns, not significant.

The Prognostic Value of CD247 in the Blood Samples

According to the ROC curve analysis and R package “survminer”, the optimal cutoff value of CD247 was chosen for the logistic regression analysis in the GSE132607 (blood) dataset and prognosis-related analysis in the GSE93606 (blood), GSE27957 (blood), and GSE28042 (blood) datasets, respectively. According to logistic regression analysis, a low expression of CD247 at visiting at 0, 8, and 12 months was the risk factor of Dlco15 in the GSE132607 dataset, whereas the low expression of CD247 was not the risk factor of Dlco15 in the fourth month ( ). According to Cox regression analysis and Kaplan–Meier analysis, low-expression CD247 at visiting at 0, 1, 3, 6, and 12 months was the risk factor of CEP in the GSE93606 dataset and significantly associated with shorter TFS time in the GSE27957 and GSE28042 datasets ( , ). Furthermore, the ROC curve showed that the areas under the curve (AUC) were 0.736 at 1 year and 0.741 at 2 years for CEP in the GSE93606 dataset ( ). Moreover, AUCs for non-TFS were respectively 0.889, 0.787, and 0.702 at 1, 2, and 3 years in the GSE27957 dataset, whereas the AUC was relatively low in the GSE28042 dataset ( ). In the GSE70866 dataset (BALF samples), CD247 was not significantly associated with mortality (p > 0.05, ).
Figure 5

Results of the univariate logistic regression regarding Dlco15 in the GSE132607 dataset (A) and the univariate Cox regression regarding CEP in the GSE93606 dataset and non-TFS in the GSE27957 and GSE28042 datasets (B). Dlco15, Dlco% predicted decline ≥15% over 12 months; TFS, non-transplant-free survival; CEP, composite end point (death or decline in FVC >10% over six months period).

Results of the univariate logistic regression regarding Dlco15 in the GSE132607 dataset (A) and the univariate Cox regression regarding CEP in the GSE93606 dataset and non-TFS in the GSE27957 and GSE28042 datasets (B). Dlco15, Dlco% predicted decline ≥15% over 12 months; TFS, non-transplant-free survival; CEP, composite end point (death or decline in FVC >10% over six months period). The protein–protein interaction (PPI) network was constructed based on CD247 according to the STRING database (average node degree: 3.71, PPI enrichment p-value: <1.62e-3; ) (29). According to the PPI network, 13 protein-coding genes (CD3E, SHC2, DOCK2, JAK3, PTPN6, LCK, FYN, SHC1, CSK, PTPN3, ZAP70, SHC4, and SHC3) were found to interact with CD247. The MEM database (30) gathers several hundreds of publicly available gene expression data sets from the ArrayExpress database, which is a useful tool for the co-expression analysis across hundreds of datasets. Results from the MEM database (30) revealed a significant co-expression relationship between CD247 and the 9 of 13 interacted genes (CD3E, ZAP70, LCK, FYN, JAK3, DOCK2, PTPN6, CSK, and SHC1; ). In addition, in patients with IPF, results from blood, lung tissue, and BALF samples revealed a significant co-expression association between CD247 and the 6 of 13 interacted genes (CD3E, ZAP70, LCK, FYN, JAK3, and PTPN3; ). GO and KEGG analysis implied that CD247 and its 13 interacted genes were enriched in the immune response especially the T cell-related biological processes (BP), cellular component (CC), and pathways ( ). Furthermore, CD247, ZAP70, LCK, FYN, PTPN6, SHC1, SHC4, SHC2, and SHC3 were associated with NK cell-mediated cytotoxicity ( ). Interestingly, five genes (CD247, CD3E, ZAP70, LCK, PTPN6) were associated with the hsa05235 pathway (PD-L1 expression and PD-1 checkpoint pathway in cancer, ).
Figure 6

CD247-related gene enrichment analysis. (A) The available experimentally determined CD247-binding proteins using the STRING tool. (B) According to the MEM database, the co-expression correlations between CD247 and 13 selected genes were analyzed. (C) The corresponding Pearson coefficients of the 13 genes correlated with CD247 were visualized by the heatmap in the blood, lung tissue, and BALF samples. The darker shade of red or blue represents the higher correlation level. (D) The top 10 significant terms in the biological processes (BP), cellular component (CC), and molecular function (MF) analysis for the 14 genes. (E) The top 30 significant terms in the KEGG pathways analysis for the 14 genes.

CD247-related gene enrichment analysis. (A) The available experimentally determined CD247-binding proteins using the STRING tool. (B) According to the MEM database, the co-expression correlations between CD247 and 13 selected genes were analyzed. (C) The corresponding Pearson coefficients of the 13 genes correlated with CD247 were visualized by the heatmap in the blood, lung tissue, and BALF samples. The darker shade of red or blue represents the higher correlation level. (D) The top 10 significant terms in the biological processes (BP), cellular component (CC), and molecular function (MF) analysis for the 14 genes. (E) The top 30 significant terms in the KEGG pathways analysis for the 14 genes. In order to reveal the biological significance correlated with CD247, the DEGs between the patients with high-expression and low-expression CD247 based on the median value were used to conduct the GO enrichment and KEGG pathway analysis in the GSE38958, GSE132607, and GSE93606 datasets (blood samples). In the three datasets, DEGs were mainly enriched in inflammation- and immune-related response and pathways such as neutrophil activation involved in immune response, T cell activation, T cell differentiation, leukocyte chemotaxis, IL-17 signaling pathway, T cell receptor signaling pathway, Th17 cell differentiation, PD-L1 expression, and PD-1 checkpoint pathway in cancer and so on ( ). Furthermore, related functions or pathways with ssGSEA showed that patients with low-expression CD247 were more likely to have lower immune activity [lower T cell general, Th1 cells, TIL (tumor-infiltrating lymphocytes), checkpoint, cytolytic activity, T cell co-inhibition, and T cell exhaustion scores)] and higher degree of inflammation response [higher DCs (dendritic cell), M2 macrophages, and neutrophils] compared with patients with high-expression CD247 in the three datasets ( ), which was consistent with the results of CIBERSORT analysis ( ). In addition, low-expression CD247 was significantly associated with low lymphocyte score ( ).
Figure 7

Significant GO terms and KEGG pathway analysis for the DEGs between patients with low-expression CD247 and patients with high-expression CD247. The most significant GO enrichment and KEGG pathways in the GSE38958 dataset (A, B), GSE132607 dataset (C, D), and GSE93606 datasets (E, F) are displayed.

Figure 8

Comparison of the ssGSEA scores between patients with low-expression CD247 and patients with high-expression CD247 in the GSE38958 dataset (A, B), GSE132607 dataset (C, D), and GSE93606 dataset (E, F). The scores of 19 immune cells (A, C, E) and 15 immune-related functions (B, D, F) are displayed in violin plots. DC, dendritic cell; TIL, tumor-infiltrating lymphocytes; CCR, cytokine–cytokine receptor. P values were shown as: ns, not significant; *p < 0.05; **p < 0.01; ***p < 0.001.

Significant GO terms and KEGG pathway analysis for the DEGs between patients with low-expression CD247 and patients with high-expression CD247. The most significant GO enrichment and KEGG pathways in the GSE38958 dataset (A, B), GSE132607 dataset (C, D), and GSE93606 datasets (E, F) are displayed. Comparison of the ssGSEA scores between patients with low-expression CD247 and patients with high-expression CD247 in the GSE38958 dataset (A, B), GSE132607 dataset (C, D), and GSE93606 dataset (E, F). The scores of 19 immune cells (A, C, E) and 15 immune-related functions (B, D, F) are displayed in violin plots. DC, dendritic cell; TIL, tumor-infiltrating lymphocytes; CCR, cytokine–cytokine receptor. P values were shown as: ns, not significant; *p < 0.05; **p < 0.01; ***p < 0.001.

The scRNA-seq Data Analysis of CD247

According to the LungMAP database (36), CD247 is mainly expressed by the T cells and NK cells in the human lung based on scRNA-seq data (37) ( ). In the mouse lung, Cd247 is also mainly expressed by the T cells and NK cells, and the expression of Cd247 of T cells is increased in the acute inflammatory stage, then decreased in the fibrotic stage after bleomycin injection ( ).

Discussion

IPF is a serious lung disease with high mortality. In this study, we observed a significantly downregulated CD247 expression in patients with IPF compared with controls, and CD247 was significantly positively associated with Dlco% predicted in both blood and lung samples. The significantly positive association was still consistent after following up 8 and 12 months in the GSE132607 dataset (blood). Besides, low-expression CD247 was also a risk factor of Dlco15 in the GSE132607 dataset (blood) and is significantly associated with higher CEP in the GSE93606 datasets (blood). After following up at 1, 3, 6, and 12 months, low-expression CD247 was still the risk factor of CEP in the GSE93606 dataset (blood). In addition, low expression of CD247 was significantly associated with shorter TFS time in the GSE27957 and GSE28042 datasets (blood). The cause of low AUC in the GSE28042 dataset might be the high rate of lung transplantation. Therefore, CD247 might be a potential biomarker for disease severity and prognosis in patients with IPF. In addition, MYL4 was significantly negatively associated with Dlco% predicted in blood samples whereas it was associated neither with Dlco% predicted in the lung tissue samples nor with prognosis in the blood samples. Thereby, MYL4 might be an effective biomarker of disease severity for IPF in blood samples. However, the roles of the two genes need further study for verification. CD247 as part of the TCR complex plays an important role in receptor expression and signaling (7, 8) and is associated with chronic inflammation (9). According to the LungMAP database, CD247 is mainly expressed by the T cells and NK cells in the lung based on scRNA-seq data, implying that CD247 could be an important regulator of immune responses in the lung. Actually, inflammation caused by some pathogens such as viruses and bacteria are proposed to play a role in the development of IPF. In the bleomycin-induced pulmonary fibrosis model, cytomegalovirus is considered to accelerate existing fibrosis according to enhancing TGFB1 activation and the expression of both phospho-SMAD2 and Vimentin (38). Besides, in BALF and lung tissue of patients with IPF, Epstein–Barr virus (EBV, a member of the Herpes family) is enriched compared with healthy controls (39, 40). Furthermore, GERD (gastroesophageal reflux disease) is common in patients with IPF; thereby, the ongoing micro aspiration could lead to repeated inoculation with oral and gastric organisms (41). In this study, results from blood, lung tissue, and BALF samples revealed significant co-expression associations between CD247 and the six interacted genes (CD3E, ZAP70, LCK, FYN, JAK3, and PTPN3). When the expression of CD247 was downregulated, the expressions of CD3E, ZAP70, LCK, and FYN were also downregulated. JAK3 was negatively associated with CD247 in the blood samples but was positively associated with CD247 in the lung tissue and BALF samples. These genes are key signaling molecules not only in the selection and maturation of developing T-cells but also in the activation of T cells. These results reveal that downregulation of CD247 caused by inflammation might suppress the immune activity by regulating the expression of these genes, which needs further study for verification. In addition, according to the GO and KEGG analyses of DEGs between the patients with low-expression and high-expression CD247, these DEGs were mainly enriched in inflammation- and immune-related response and pathways. Besides, patients with low-expression CD247 were more likely to have lower activity of T cells in general, Th1 cells, NK cells, and TIL (tumor-infiltrating lymphocytes) and higher activity of dendritic cells (DCs), M2 macrophages, and neutrophils compared with patients with high-expression CD247. Furthermore, the response process analysis showed that patients with low-expression CD247 were more likely to have a lower score of checkpoint, cytolytic activity, and T cell activation and higher score of inflammation compared with patients with high-expression CD247. These results were consistent with the following studies. The incidence of cancer in IPF patients is higher compared with matched controls, especially for lung cancer (42). Interestingly, five genes (CD247, CD3E, ZAP70, LCK, PTPN6) and DEGs between the patients with high-expression and low-expression CD247 were associated with the hsa05235 pathway (PD-L1 expression and PD-1 checkpoint pathway in cancer), and patients with low-expression CD247 had a lower TIL ssGSEA score, which may explain the high incidence of cancer. Besides, the downregulation of T cell regulatory genes associated with the immune checkpoint CTLA-4 was significantly associated with reduced event-free survival in the PBMCs (peripheral blood mononuclear cell) of patients with IPF (21). Interestingly, inflammatory interstitial lung diseases were caused by checkpoint inhibitors used as cancer immunotherapy (43). We speculated that the low immune checkpoint expression may be associated with the development and progression of IPF, which needs further study for verification. Immune processes can coordinate existing fibrotic responses and are associated with prognosis in patients with IPF (5, 6). Th1 cells and their secretory products such as IL-12 (a potent inducer of IFN-γ) are considered as being anti-fibrotic (44, 45). Macrophages play an important role in the pathogenesis of IPF according to the regulation of both injury and repair of lung (46, 47). A relative excess of M1/M2 macrophages leads to epithelial cell death as well as aberrant and dysregulated repair responses, which could cause progression or acute exacerbation of IPF (48, 49). Neutrophils are associated with production of cytokines and chemokines, tissue injury, regulation of ECM (extracellular matrix) turnover, and generation of NETs (neutrophil extracellular traps), which result in fibroblast activation and ECM accumulation in IPF (6). Taken together, our results suggest that chronic inflammation could participate in the development and progression of IPF according to the downregulated expression of CD247. There are several limitations in this study. First, the study was conducted based on the retrospective data from GEO, and the number of samples in each dataset was relatively small. Second, we have only considered a single variable in the logistic regression and Cox regression. Many prominent prognostic clinical parameters such as lung function, treatment measures, and underlying diseases were not reported in most datasets that we used; thereby, the prognostic value of CD247 was limited. Third, the treatment of patients with IPF was unknown in some datasets, which may affect both disease and data analysis. Fourth, results regarding Dlco15 and CEP were based on a single dataset, which limited the generalizability of these findings. Finally, larger-sample prospective studies are needed to estimate the clinical relevance of CD247.

Conclusion

These results suggest that CD247 could reflect well the immune activity in both lung and blood and may be a potential biomarker to predict the lung function and prognosis of patients with IPF. Besides, MYL4 may be a potential biomarker for Dlco% predicted in the blood samples. However, the results need further study for verification.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: GSE32537 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE32537), GSE47460 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE47460), GSE110147 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE110147), GSE141259 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE141259), GSE93606 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE93606, GSE27957 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE27957), GSE28042 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28042), GSE38958 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38958), GSE132607 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132607), GSE33566 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE33566), GSE70866 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70866). The annotated gene set file used in ssGSEA were available in .

Ethics Statement

Approval of the Ethics Committee was not necessary for the datasets from GEO database.

Author Contributions

HL, XL, and XW performed the data analysis. YL and SC performed the data collection, prepared the first manuscript draft, validated the data collection, refined the research idea, performed the data analysis, and edited the manuscripts. HC and SN designed and developed the research idea, refined the research idea, validated the data collection, and edited the manuscripts. HC and SN are the guarantors of the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
  49 in total

1.  Sustained exposure to bacterial antigen induces interferon-gamma-dependent T cell receptor zeta down-regulation and impaired T cell function.

Authors:  Noemí Bronstein-Sitton; Leonor Cohen-Daniel; Ilan Vaknin; Analía V Ezernitchi; Benny Leshem; Amal Halabi; Yael Houri-Hadad; Eugenia Greenbaum; Zichria Zakay-Rones; Lior Shapira; Michal Baniyash
Journal:  Nat Immunol       Date:  2003-09-21       Impact factor: 25.606

Review 2.  Roles of T lymphocytes in pulmonary fibrosis.

Authors:  Irina G Luzina; Nevins W Todd; Aldo T Iacono; Sergei P Atamas
Journal:  J Leukoc Biol       Date:  2007-10-25       Impact factor: 4.962

Review 3.  Idiopathic pulmonary fibrosis.

Authors:  Luca Richeldi; Harold R Collard; Mark G Jones
Journal:  Lancet       Date:  2017-03-30       Impact factor: 79.321

4.  Molecular and genetic properties of tumors associated with local immune cytolytic activity.

Authors:  Michael S Rooney; Sachet A Shukla; Catherine J Wu; Gad Getz; Nir Hacohen
Journal:  Cell       Date:  2015-01-15       Impact factor: 41.582

5.  PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data.

Authors:  Oscar Franzén; Li-Ming Gan; Johan L M Björkegren
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

6.  BAL Cell Gene Expression Is Indicative of Outcome and Airway Basal Cell Involvement in Idiopathic Pulmonary Fibrosis.

Authors:  Antje Prasse; Harald Binder; Jonas C Schupp; Gian Kayser; Elena Bargagli; Benedikt Jaeger; Moritz Hess; Susanne Rittinghausen; Louis Vuga; Heather Lynn; Shelia Violette; Birgit Jung; Karsten Quast; Bart Vanaudenaerde; Yan Xu; Jens M Hohlfeld; Norbert Krug; Jose D Herazo-Maya; Paola Rottoli; Wim A Wuyts; Naftali Kaminski
Journal:  Am J Respir Crit Care Med       Date:  2019-03-01       Impact factor: 21.405

7.  Host-Microbial Interactions in Idiopathic Pulmonary Fibrosis.

Authors:  Philip L Molyneaux; Saffron A G Willis-Owen; Michael J Cox; Phillip James; Steven Cowman; Michael Loebinger; Andrew Blanchard; Lindsay M Edwards; Carmel Stock; Cécile Daccord; Elisabetta A Renzoni; Athol U Wells; Miriam F Moffatt; William O C Cookson; Toby M Maher
Journal:  Am J Respir Crit Care Med       Date:  2017-06-15       Impact factor: 21.405

8.  Macrophage activation in acute exacerbation of idiopathic pulmonary fibrosis.

Authors:  Jonas Christian Schupp; Harald Binder; Benedikt Jäger; Giuseppe Cillis; Gernot Zissel; Joachim Müller-Quernheim; Antje Prasse
Journal:  PLoS One       Date:  2015-01-15       Impact factor: 3.240

9.  Comprehensive gene expression profiling identifies distinct and overlapping transcriptional profiles in non-specific interstitial pneumonia and idiopathic pulmonary fibrosis.

Authors:  Matthew J Cecchini; Karishma Hosein; Christopher J Howlett; Mariamma Joseph; Marco Mura
Journal:  Respir Res       Date:  2018-08-15

10.  Single-cell multiomic profiling of human lungs reveals cell-type-specific and age-dynamic control of SARS-CoV2 host genes.

Authors:  Allen Wang; Joshua Chiou; Olivier B Poirion; Justin Buchanan; Michael J Valdez; Jamie M Verheyden; Xiaomeng Hou; Parul Kudtarkar; Sharvari Narendra; Jacklyn M Newsome; Minzhe Guo; Dina A Faddah; Kai Zhang; Randee E Young; Justinn Barr; Eniko Sajti; Ravi Misra; Heidie Huyck; Lisa Rogers; Cory Poole; Jeffery A Whitsett; Gloria Pryhuber; Yan Xu; Kyle J Gaulton; Sebastian Preissl; Xin Sun
Journal:  Elife       Date:  2020-11-09       Impact factor: 8.140

View more
  4 in total

1.  Identification of Immune-Related Hub Genes in Thymoma: Defects in CD247 and Characteristics of Paraneoplastic Syndrome.

Authors:  Lin-Fang Deng
Journal:  Front Genet       Date:  2022-06-14       Impact factor: 4.772

2.  An 8-ferroptosis-related genes signature from Bronchoalveolar Lavage Fluid for prognosis in patients with idiopathic pulmonary fibrosis.

Authors:  Yaowu He; Yu Shang; Yupeng Li; Menghan Wang; Dongping Yu; Yi Yang; Shangwei Ning; Hong Chen
Journal:  BMC Pulm Med       Date:  2022-01-05       Impact factor: 3.317

3.  S100A12 as Biomarker of Disease Severity and Prognosis in Patients With Idiopathic Pulmonary Fibrosis.

Authors:  Yupeng Li; Yaowu He; Shibin Chen; Qi Wang; Yi Yang; Danting Shen; Jing Ma; Zhe Wen; Shangwei Ning; Hong Chen
Journal:  Front Immunol       Date:  2022-02-04       Impact factor: 7.561

4.  Gene expression profiling reveals candidate biomarkers and probable molecular mechanisms in chronic stress.

Authors:  Bohan Zhang; Weijie Zhong; Biao Yang; Yi Li; Shuxian Duan; Junlong Huang; Yanfei Mao
Journal:  Bioengineered       Date:  2022-03       Impact factor: 3.269

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.