Hehe Wang1, Junge Zhang2. 1. Department of Otolaryngology, Head and Neck Surgery, Ningbo First Hospital, Ningbo, Zhejiang, People's Republic of China. 2. Department of Anesthesiology, Ningbo First Hospital, Ningbo, Zhejiang, People's Republic of China.
Abstract
Purpose: Although considerable progress has been made in basic and clinical research on nasopharyngeal carcinoma (NPC), the biomarkers of NPC progression have not been fully studied and described. This study was designed to identify potential novel biomarkers for NPC using integrated analyses and to explore immune cell infiltration in this pathological process. Methods: Five data sets were downloaded from the Gene Expression Omnibus (GEO) database and analysed to identify differentially expressed genes (DEGs), followed by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. Four algorithms were adopted to screen for novel and key biomarkers for NPC: the random forest (RF) machine learning algorithm, least absolute shrinkage and selection operator (LASSO) logistic regression, support vector machine-recursive feature elimination (SVM-RFE), and weighted gene co-expression network analysis (WGCNA). Lastly, CIBERSORT was used to assess the infiltration of immune cells in NPC, and the correlation between diagnostic markers and infiltrating immune cells was analyzed. Results: We identified 46 DEGs, and the enrichment analysis showed that the DEGs and several signaling pathways might be closely associated with the occurrence and progression of NPC. DTL was recognized as an NPC-related biomarker. DTL, also known as retinoic acid-regulated nuclear matrix-associated protein (RAMP) or DNA replication factor 2 (CDT2), is reported to be correlated with cell proliferation, cell cycle arrest and cell invasion in hepatocellular carcinoma, breast cancer and gastric cancer. Immune infiltration analysis demonstrated that macrophages M0, macrophages M1 and T cells CD4 memory activated were linked to the pathogenesis of NPC. Conclusion: In summary, we adopted a comprehensive strategy to screen DTL as a biomarker related to NPC and to explore the critical role of immune cell infiltration in NPC.
Nasopharyngeal carcinoma (NPC) is a highly invasive and metastatic head and neck tumor originating from nasopharyngeal epithelial tissue. Although it originates from similar cell or tissue lineages, NPC differs significantly from other epithelial head and neck tumors: it is characterized by early cervical lymph node metastasis and invasion of the skull base, shows marked ethnic and geographic specificity, and has the highest incidence of distant metastasis among head and neck tumors.1–3 Unfortunately, early-stage cancers can be asymptomatic, so biomarkers such as circulating cell-free Epstein–Barr virus (EBV) DNA are used to detect NPC in populations at risk for the disease.4 Subjects with elevated plasma biomarkers are assessed by nasopharyngeal endoscopic examination. Those with an abnormality suspicious for NPC undergo endoscopic-guided biopsy for histological confirmation, whereas those without a suspicious abnormality are considered to have had a false-positive blood test. However, small tumors hidden in the pharyngeal recess, in the adenoid or beneath the mucosa can be missed on endoscopic examination, and the number of such tumors in populations screened for NPC is unknown.5–8 Some studies have found that neoplastic spindle cells have features of epithelial-mesenchymal transition (EMT) and cancer stem cells (CSCs), and that they should be considered the more aggressive subtype in NPC and predictors of tumor cell dissemination and metastasis.9,10 Although considerable progress has been achieved in basic and clinical research on NPC, the biomarkers of NPC progression have not been fully studied and described. 
Thus, further investigation is beneficial, especially the identification of potential biomarkers to improve survival in patients whose NPC is at an early stage.
With the development of sequencing and microarray technologies, the expression levels of thousands of genes in the human genome can easily be screened simultaneously.11 Comprehensive analysis of multiple datasets makes it possible to properly identify and assess the pathways and genes that mediate the biological processes associated with NPC. Machine learning (ML) is a rapidly advancing field of artificial intelligence (AI) that enables computers to learn from data to identify patterns and make predictions without explicit programming.12 ML is not a single algorithm but a variety of approaches that have to be adapted to the problem and data set at hand. ML methods are typically classified as supervised learning, unsupervised learning, and reinforcement learning. The input can be text, images, or any digitally stored data.13 AI/ML techniques have been applied to various fields of biomedicine, including novel target identification, understanding of target-disease associations, drug candidate selection, protein structure prediction, molecular compound design and optimization, understanding of disease mechanisms, development of new prognostic and predictive biomarkers, analysis of biometric data from wearable devices, imaging, precision medicine, and more recently clinical trial design, conduct, and analysis.14,15 To this end, we used microarray gene expression data sets to assess the differentially expressed genes (DEGs) between NPC and normal nasopharyngeal tissue, and then used ML algorithms to screen the DEGs for biomarkers for the early identification of NPC.
Materials and Methods
Data Collection and Data Processing
Data sets in our study were all obtained from the Gene Expression Omnibus (GEO) public database. Five gene expression profiling chip data sets were selected: GSE12452, GSE13597, GSE61218, GSE64634 and GSE53819 (Table 1).16–22 Each data set includes NPC tissues and normal nasopharyngeal tissues. GSE12452, GSE13597, GSE61218 and GSE64634 were used as the training data sets, and GSE53819 was used as the validation data set. The need for further ethics approval was waived by the Ningbo First Hospital Ethics Committee.
Table 1
Characteristics of mRNA Expression Profiles of Nasopharyngeal Carcinoma (NPC)
| GEO Series | Expression Type | Platform | Sample Number (Normal) | Sample Number (Tumor) | Reference |
|------------|-----------------|----------|------------------------|-----------------------|-----------|
| GSE12452 | mRNA | GPL570 | 10 | 31 | Dodd et al; Sengupta et al; Hsu et al16–18 |
| GSE13597 | mRNA | GPL96 | 3 | 25 | Bose et al19 |
| GSE61218 | mRNA | GPL19061 | 6 | 10 | Fan et al20 |
| GSE64634 | mRNA | GPL570 | 4 | 12 | Bo et al21 |
| GSE53819 | mRNA | GPL6480 | 18 | 18 | Bao et al22 |
Screening of Differentially Expressed Genes (DEGs)
For the microarray data sets (GSE12452, GSE13597, GSE61218 and GSE64634), background correction and normalization were performed by applying the ComBat algorithm. The limma package23 in R was applied to standardize the expression matrix and screen for differentially expressed genes (DEGs), and a volcano plot and heatmap were then drawn to present the differential expression of the DEGs. DEGs with an adjusted p < 0.05 and |log2FC| ≥2 were considered statistically significant.
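The thresholding step described above can be sketched in a few lines. This is only an illustration, not the authors' limma pipeline: it assumes the p-values are adjusted with the Benjamini-Hochberg procedure (the text does not name the adjustment method), and the gene names and statistics are hypothetical placeholders.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fc: float   # log2 fold change, tumor vs normal
    p_value: float   # raw (unadjusted) p-value


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(stats, fc_cutoff=2.0, alpha=0.05):
    """Split genes passing adjusted p < alpha and |log2FC| >= fc_cutoff."""
    adjusted = benjamini_hochberg([s.p_value for s in stats])
    up, down = [], []
    for stat, q in zip(stats, adjusted):
        if q < alpha and abs(stat.log2_fc) >= fc_cutoff:
            (up if stat.log2_fc > 0 else down).append(stat.gene)
    return up, down


# Hypothetical toy input, not data from the study.
toy = [
    GeneStat("GENE_A", 2.5, 0.001),
    GeneStat("GENE_B", -3.0, 0.002),
    GeneStat("GENE_C", 2.1, 0.5),
    GeneStat("GENE_D", 0.5, 0.0001),
]
up_genes, down_genes = select_degs(toy)
print("up:", up_genes, "down:", down_genes)
```

Both cut-offs are plain function arguments, so the effect of a stricter or looser fold-change threshold on the gene list is easy to check.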
Functional Enrichment Analysis
The GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analyses of the DEGs were implemented with the clusterProfiler package in R.24 Gene set enrichment analysis (GSEA) was performed on the gene expression matrix through the “clusterProfiler” package, with “c2.cp.kegg.v7.4.symbols.gmt” selected as the gene set for the enrichment analysis.25 Enrichment results with a p-value <0.05 and false discovery rate (FDR) <0.05 were considered statistically significant.
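Over-representation analyses of the GO/KEGG kind are commonly based on a one-sided hypergeometric test. The sketch below shows that calculation under this assumption; the function name and the counts in the example are illustrative and are not taken from the study.

```python
from math import comb


def enrichment_p_value(universe_size, term_size, deg_count, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe_size: number of background genes
    term_size:     background genes annotated to the term or pathway
    deg_count:     number of DEGs drawn from the background
    overlap:       DEGs that are annotated to the term
    """
    upper = min(term_size, deg_count)
    total = comb(universe_size, deg_count)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 46 DEGs, 5 of them in a 90-gene pathway,
# against a background of 20000 genes.
print(enrichment_p_value(20000, 90, 46, 5))
```

In practice the resulting p-values would still need multiple-testing correction across all tested terms before applying the FDR threshold mentioned above.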
Screening Characteristic Related Biomarkers via the Comprehensive Strategy
Four algorithms were adopted to screen for novel and key biomarkers for NPC: the random forest (RF) machine learning algorithm,26 least absolute shrinkage and selection operator (LASSO) logistic regression,27 support vector machine-recursive feature elimination28 (SVM-RFE), and weighted gene co-expression network analysis (WGCNA). WGCNA is a systems biology method used to describe the patterns of gene association among different samples. It can be used to identify gene sets with highly synergistic variation, and to identify candidate biomarkers or therapeutic targets based on the coherence of gene sets and the correlation between gene sets and phenotypes.29 RF is a machine learning algorithm based on decision-tree theory that is widely used in medicine for solving classification problems. RF builds an ensemble of many independent, randomized trees to reduce overfitting and sensitivity to the configuration of the training data. Its predictive performance is similar to that of the best supervised learning algorithms, and it efficiently estimates the test error without incurring the cost of repeated model training associated with cross-validation. RF is also flexible and highly accurate. SVM-RFE is a machine learning algorithm based on the support vector machine that finds the best variables by recursively deleting feature vectors generated by the SVM. An SVM model was established with the e1071 package to further identify the diagnostic value of these biomarkers in NPC.30 Receiver operating characteristic (ROC) curves were established to evaluate the diagnostic significance of NPC-related biomarkers using the pROC package in R, and the area under the ROC curve (AUC) indicated the magnitude of diagnostic efficiency.31 P<0.05 was considered to indicate a statistically significant difference. The input of the ML models was the expression of the differential genes in all samples. 
The x variable was set to the expression level of the differential genes, and the y variable to the type of the sample. RF, LASSO and SVM were chosen as ML methods. Validation was performed by cross-validation. ML model parameters were set as follows: randomForest (ntree=500); LASSO cvfit=cv.glmnet (family=“binomial”, alpha=1, type.measure=“deviance”, nfolds=10); SVM=rfe (functions=caretFuncs, method=“cv”, methods=“svmRadial”). Characteristic genes with the minimum cross-validation error were used as the output.
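To make the cross-validation step concrete, the sketch below shows how samples might be split into 10 folds. It is an illustration rather than what cv.glmnet or caret do internally: the stratification by class and the fixed seed are assumptions added here, and the class counts (23 normal, 78 tumor) are simply the training-set totals from Table 1.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled samples of this class round-robin over the folds.
        for j, idx in enumerate(indices):
            folds[(offset + j) % n_folds].append(idx)
        offset += len(indices)
    return folds


def train_test_splits(labels, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) once per fold."""
    folds = stratified_folds(labels, n_folds, seed)
    for k, test_idx in enumerate(folds):
        train_idx = [i for f, fold in enumerate(folds) if f != k for i in fold]
        yield sorted(train_idx), sorted(test_idx)


# Training-set class sizes summed from Table 1 (four training data sets).
labels = ["normal"] * 23 + ["tumor"] * 78
for train_idx, test_idx in train_test_splits(labels):
    n_normal = sum(labels[i] == "normal" for i in test_idx)
    print(len(train_idx), len(test_idx), n_normal)
```

With only 23 normal samples, each test fold holds two or three of them, which is one reason the cross-validated error of any single model should be read with some caution.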
Validation of the Diagnosis-Related Gene Signature
GSE53819 was used as the validation data set. To determine whether the candidate genes have diagnostic value in patients with NPC, we measured the differential expression of the candidate genes, together with their ROC curves and AUC values, in the validation set.
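For a single marker, the AUC can be read as the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample. The sketch below computes it that way. It is a minimal stand-in for the pROC calculation, with a hypothetical function name and made-up expression values.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5.

    labels: 1 for tumor (positive), 0 for normal (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up expression values for illustration only.
expression = [0.1, 0.4, 0.35, 0.8]
is_tumor = [0, 0, 1, 1]
print(roc_auc(expression, is_tumor))
```

The pairwise form is quadratic in the number of samples, which is fine for data sets of this size; a rank-based formula gives the same value more efficiently.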
Evaluation and Correlation Analysis of Infiltrating Immune Cells
The CIBERSORT algorithm was used to analyze the normalized gene expression data obtained previously, and the proportions of 22 kinds of immune cells were determined.32 A correlation heatmap was produced to detect the associations of each of the immune cells with the others in NPC samples via the “corrplot” package.33 The “ggstatsplot” package was used to perform the Spearman correlation analysis on diagnostic markers and infiltrating immune cells, and the “ggplot2” package was used to visualize the results.
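The Spearman correlation used here is the Pearson correlation of the ranks of the two variables. A minimal sketch follows, with ties given their average rank; the function names and the toy values are illustrative and are not CIBERSORT output.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values: marker expression vs. an immune cell fraction per sample.
marker_expression = [5.1, 6.3, 7.0, 8.2, 9.4]
cell_fraction = [0.02, 0.05, 0.04, 0.09, 0.12]
print(spearman(marker_expression, cell_fraction))
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone transformations such as taking logarithms of the expression values.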
Results
Although previous studies have reported biomarkers associated with NPC, the relationship between the immune infiltration characteristics and these biomarkers of NPC remains unclear. In this study, we performed a comprehensive analysis with ML algorithms, including RF, LASSO, SVM-RFE and WGCNA, to screen potential biomarkers associated with NPC. Using the CIBERSORT algorithm, we found differences in the immune infiltration of 22 immune cell subpopulations between cancer and normal tissue. Ultimately, DTL was screened as a candidate NPC-related biomarker, and its immune infiltration characteristics were analyzed.
Screening of DEGs in Different Datasets
The DEGs of the integrated data set (GSE12452, GSE13597, GSE61218 and GSE64634) were identified with the limma package. According to the criteria (adjusted p-value < 0.05 and |log2FC| > 2), a total of 46 DEGs were identified in the integrated data set, including 11 up-regulated and 35 down-regulated genes. The DEG data were processed with the “pheatmap” and “ggrepel” packages in R to draw a heatmap and volcano plot of the significantly changed genes (Figure 1A and B).
Figure 1
DEGs in the integrated dataset of NPC. (A) The volcano plot of DEGs; the red and green dots represent up-regulated and down-regulated genes, respectively. (B) The heatmap of DEGs.
Figure 2
Functional enrichment analysis of DEGs. (A) Results of GO functional enrichment analysis of the DEGs, including BP, MF and CC. (B) KEGG enrichment analysis revealed signaling pathways highly associated with NPC. (C) The top five signaling pathways in normal nasopharyngeal tissue based on GSEA are shown. (D) GSEA showed the top five signaling pathways most related to NPC.
Functional Enrichment Analyses of DEGs
GO enrichment analysis shows the top five GO terms in each category. Biological process (BP) enrichment showed that the common DEGs were enriched in neutrophil degranulation, neutrophil activation involved in immune response, neutrophil mediated immunity, antimicrobial humoral response, and neutrophil activation. The cellular component (CC) terms were mainly secretory granule lumen, cytoplasmic vesicle lumen, vesicle lumen, specific granule lumen and microvillus membrane. GO molecular function (MF) analysis showed that the up-regulated DEGs were remarkably enriched in glycosaminoglycan binding, chemokine activity, serine-type endopeptidase activity, chemokine receptor binding and heparin binding (Figure 2A). KEGG pathway analysis revealed that the DEGs were mainly enriched in the IL-17 signaling pathway, viral protein interaction with cytokine and cytokine receptor, ovarian steroidogenesis, arachidonic acid metabolism and the TNF signaling pathway, which were highly related to NPC pathology (Figure 2B). The GSEA results showed that B cell receptor signaling pathway, metabolism of xenobiotics by cytochrome P450, retinol metabolism, tyrosine metabolism and drug metabolism cytochrome P450 were highly active in normal nasopharyngeal tissue, while cell cycle, DNA replication, small cell lung cancer, ECM receptor interaction and P53 signaling pathway were highly active in NPC tissue (Figure 2C and D).
Screening Characteristic-Related Biomarkers via the Comprehensive Strategy
We used the LASSO logistic regression algorithm to identify 7 genes from the DEGs as biomarkers for NPC (Figure 3A). Six genes were recognized as vital biomarkers by the RF algorithm (Figure 3B and C). Six genes were detected from the DEGs as diagnostic markers using the SVM-RFE algorithm (Figure 3D). To identify sets of genes that are highly correlated in their expression, we performed hierarchical clustering on batch-controlled, rlog-transformed expression data using WGCNA. The soft threshold power 5 was chosen to define the adjacency matrix based on the criterion of approximately scale-free topology. Then, we set MEDissThres to 0.25 to merge similar modules, and a total of 8 modules were identified. The hub genes in the brown and turquoise modules were highly expressed in tumor samples (Figure 4A–C). Finally, DTL, which was significantly associated with NPC, was obtained from the overlap of the four algorithms (Figure 5A and B).
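The final overlap step amounts to intersecting the candidate lists produced by the four methods. In the sketch below only DTL comes from the text; the other gene names are hypothetical placeholders, since the individual candidate lists are not given here, and the WGCNA hub-gene list is truncated for illustration.

```python
def consensus_markers(candidates_by_method):
    """Genes selected by every method (the centre of the Venn diagram)."""
    gene_sets = [set(genes) for genes in candidates_by_method.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


# Placeholder lists: sizes for LASSO (7), RF (6) and SVM-RFE (6) follow the
# text; all names except DTL are invented.
candidates = {
    "LASSO": ["DTL", "GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"],
    "RF": ["DTL", "GENE_A", "GENE_G", "GENE_H", "GENE_I", "GENE_J"],
    "SVM-RFE": ["DTL", "GENE_B", "GENE_G", "GENE_K", "GENE_L", "GENE_M"],
    "WGCNA": ["DTL", "GENE_C", "GENE_H", "GENE_K", "GENE_N"],
}
print(consensus_markers(candidates))
```

Requiring agreement across all four methods is a strict criterion: it favours specificity of the final marker set over sensitivity, which is consistent with a single gene surviving the overlap.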
Figure 3
Screening characteristic-related biomarkers via the comprehensive strategy. (A) The LASSO logistic regression algorithm was performed to retain the most predictive features. (B) Screening biomarkers based on the random forest (RF) machine learning algorithm. (C) Results of screening biomarkers based on RF. (D) Results of screening biomarkers based on the SVM-RFE algorithm.
Figure 4
(A) The cluster dendrogram of genes in independent data sets. Branches of the cluster dendrogram of the most connected genes gave rise to eight gene co-expression modules. (B) Relationships of consensus modules with samples. Each color represents a specific module, containing a cluster of highly correlated genes. (C) Soft-threshold power determination for WGCNA by analysis of the scale-free fit index and mean connectivity for various soft-threshold powers.
Figure 5
(A) The Venn diagram shows the intersection of diagnostic markers obtained by the four algorithms. (B) ROC curves of DTL in the training dataset.
To further verify the potential of DTL as a diagnostic marker of NPC, we conducted ROC analysis of this gene in the expression data set GSE53819 and drew the ROC curve (AUC>0.900, P<0.01) (Figure 6A and B).
Figure 6
Validation of the diagnosis-related gene signature. (A) The expression of DTL in GSE53819. (B) ROC curves of DTL in GSE53819.
Figure 7
Immune cell infiltration analysis. (A) Pattern of infiltration of 22 kinds of immune cells in normal and tumor groups. (B) The violin plot showed the difference in 22 infiltrating immune cells between NPC and normal nasopharyngeal tissue. (C) The correlation heatmap was drawn to display the correlations of 22 types of infiltrated immune cells. The size of the colored square represents correlation intensity; red represents positive correlation, and blue represents negative correlation.
Analysis of Infiltrating Immune Cells
The infiltration abundance matrix of 22 kinds of immune cells in the integrated data sets was calculated using the CIBERSORT algorithm (Figure 7A). The violin plot showed that the infiltration of macrophages M0, macrophages M1 and T cells CD4 memory activated was higher in NPC tissue, while that of B cells naive, B cells memory and T cells CD4 memory resting was lower (Figure 7B). The correlation heatmap of the 22 types of immune cells revealed that monocytes and eosinophils had a significant positive correlation. B cells naive were positively correlated with T cells follicular helper, and NK cells activated were positively correlated with monocytes. In contrast, mast cells resting were negatively associated with mast cells activated, and macrophages M1 were negatively correlated with B cells memory (Figure 7C).
Correlation Analysis Between Related Biomarkers and Infiltrating Immune Cells
Correlation analysis showed that DTL was positively correlated with macrophages M1 (r = 0.461, p < 0.01), neutrophils (r = 0.289, p < 0.01) and T cells CD4 memory activated (r = 0.402, p < 0.01). DTL was negatively correlated with B cells memory (r = −0.606, p < 0.01) and T cells CD4 memory resting (r = −0.367, p < 0.01) (Figure 8).
Figure 8
Correlation between DTL and infiltrating immune cells. The lower the p-value, the greener the color; the higher the p-value, the more yellow the color.
Discussion
Early diagnosis of NPC is very difficult in some patients, and current studies offer very few candidate biomarkers for NPC. Therefore, further study of biomarkers for the diagnosis of NPC is important. In this study, we identified DTL as a candidate NPC-related biomarker based on ML methods, and identified immune cells that are differentially distributed between NPC tissue and normal nasopharyngeal tissue. Furthermore, we explored the correlations between DTL and immune cells.
We identified 46 significant DEGs using the limma package, including 11 up-regulated genes and 35 down-regulated genes. GO analysis showed that the DEGs were mainly concentrated in antimicrobial humoral response, neutrophil degranulation, neutrophil activation involved in immune response, neutrophil-mediated immunity, and neutrophil activation. The KEGG analysis showed that the IL-17 signaling pathway was highly related to NPC pathology. The interleukin-17 (IL-17) family is a subset of cytokines consisting of IL-17A-F that play crucial roles in autoimmune disease and tumor progression. IL-17A has been demonstrated to be upregulated in a wide variety of biologically distinct cancers, including kidney cancer, gastric cancer, breast cancer, cervical cancer and lung cancer.34–36 IL-17A has been reported to control various processes involved in the malignant transformation of cells, such as cell proliferation, one of the major causes of mortality in cancer.37,38 IL-17A stimulation increased the proliferation of human NPC cells in vitro.39 Besides, the remaining terms among the top five KEGG terms, namely viral protein interaction with cytokine and cytokine receptor, ovarian steroidogenesis, arachidonic acid metabolism and the TNF signaling pathway, were also related to NPC pathology. 
The GSEA enrichment results showed that cell cycle, DNA replication, ECM receptor interaction and P53 signaling pathway were highly active in NPC tissue, and the hyperactivity of these pathways may be associated with the development and progression of NPC.
WGCNA is a prevalent systems biology tool used to construct gene co-expression networks, which can be used to detect disease-associated gene clusters and identify therapeutic targets. To improve the usability of NPC-related biomarkers for pre-screening purposes, several different approaches were used, including RF, LASSO logistic regression and SVM-RFE. We performed exploratory LASSO logistic regression, which performs automatic variable selection and penalizes regression coefficients to reduce overfitting. RF can deal with classification problems involving unbalanced, multiclass, and small-sample data. Variable selection was performed by means of support vector machine recursive feature elimination (SVM-RFE) for non-linear kernels. To develop biomarkers associated with the diagnosis of NPC, we took the intersection of the four algorithms.40 Finally, DTL was selected as a biomarker to identify NPC.
DTL, also known as retinoic acid-regulated nuclear matrix-associated protein (RAMP) or DNA replication factor 2 (CDT2), is reported to be correlated with cell proliferation, cell cycle arrest and cell invasion in hepatocellular carcinoma, breast cancer and gastric cancer.41 DTL is a substrate receptor for the CRL4 ubiquitin ligase, serving as a key regulator of the cell cycle and genomic stability. Along with the substrate receptor DTL, the CRL4 ubiquitin ligase promotes the ubiquitin-dependent degradation of several proteins essential for cell cycle progression as well as for DNA replication and repair.42 The expression level of DTL was found to be elevated in human malignancies including breast cancer and ovarian cancer. 
Besides, its potential as a prognostic biomarker in gastric cancer and Ewing sarcoma has been reported. Furthermore, data from TCGA revealed that patients with melanoma with higher DTL expression exhibit shorter disease-free survival (DFS) and overall survival (OS).43–46 Previous studies have shown that cancer cells might become addicted to DTL. This phenomenon has been termed “non-oncogene addiction” in reference to the increased dependence of cancer cells on the normal cellular functions of certain genes, which themselves are not classical oncogenes. Research has demonstrated that DTL depletion can induce apoptosis in different cancer cell lines without affecting non-cancer cell lines. Consequently, the “non-oncogene addiction” feature makes DTL signalling a potential therapeutic target.47–49
To quantify the relative proportions of infiltrating immune cells from the gene expression profiles in NPC, a bioinformatics algorithm called CIBERSORT was used to calculate immune cell infiltration. CIBERSORT has been increasingly used to estimate the infiltration of immune cells due to its favourable performance.50,51 We used CIBERSORT to evaluate immune infiltration in NPC, to explore the role of immune cell infiltration in NPC, and to analyze the correlation between the related biomarker and infiltrating immune cells. We discovered that the expression of DTL was positively correlated with the levels of macrophages M1, neutrophils and T cells CD4 memory activated in the NPC group, while it was negatively correlated with B cells memory and T cells CD4 memory resting. In addition, we found higher immune infiltration levels of macrophages M0, macrophages M1 and T cells CD4 memory activated in the NPC group. 
Although studies have shown that changes in the immune microenvironment are closely related to the occurrence and development of NPC, the specific mechanism remains unclear.52,53 A 4-mRNA signature (U2AF1L5, TMEM265, GLB1L and MLF1), immune subtypes and constitutive activation of the NF-κB inflammatory pathways have been considered as possible mechanisms.54–56 Although more research is needed, based on the results of this study we speculate that changes in the immune microenvironment caused by overexpression of DTL might be one of the mechanisms of NPC. A limitation of this study is that the conclusion has not been verified by immunohistochemistry. In future work, we will carefully design experiments and collect nasopharyngeal cancer samples for immunohistochemistry to verify the conclusion of this study.
Conclusions
In summary, we found that DTL is a biomarker associated with NPC. Macrophages M0, macrophages M1 and T cells CD4 memory activated are related to the occurrence of NPC. Further research on biomarkers of NPC will help us to understand the internal mechanisms of the occurrence and development of NPC and to diagnose NPC early, so that more NPC patients can obtain a better prognosis.