Lei Chen, Zhandong Li, Tao Zeng, Yu-Hang Zhang, KaiYan Feng, Tao Huang, Yu-Dong Cai.
Abstract
COVID-19, a severe respiratory disease caused by the novel coronavirus SARS-CoV-2, has spread worldwide. Patients infected with SARS-CoV-2 may show no symptoms, as in presymptomatic and asymptomatic cases. Both groups can still transmit the virus to susceptible people, which makes COVID-19 difficult to control. The two major challenges for COVID-19 diagnosis at present are as follows: (1) patients may share symptoms with other respiratory infections, and (2) patients may have no symptoms but still spread the virus. Therefore, new biomarkers at different omics levels are required for large-scale screening and diagnosis of COVID-19. Although initial analyses identified a group of candidate gene biomarkers for COVID-19, previous work could not identify biomarkers suitable for clinical use, which requires a diagnosis that is specific to COVID-19 relative to multiple other infectious diseases. As an extension of the previous study, optimized machine learning models were applied in the present study to identify specific qualitative host biomarkers associated with COVID-19 infection on the basis of a publicly released transcriptomic dataset, which included healthy controls and patients with bacterial infection, influenza, COVID-19, and other kinds of coronavirus. This dataset was first analysed by the Boruta and max-relevance and min-redundancy (mRMR) feature selection methods in sequence, resulting in a feature list. This list was fed into the incremental feature selection (IFS) method, which incorporates one of several classification algorithms, to extract essential biomarkers and build efficient classifiers and classification rules. The capacity of these findings to distinguish COVID-19 from other similar respiratory infectious diseases at the transcriptomic level was also validated, which may improve the efficacy and accuracy of COVID-19 diagnosis.
Year: 2021 PMID: 34307679 PMCID: PMC8272456 DOI: 10.1155/2021/9939134
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Overall procedure for investigating the blood expression profiles of acute respiratory infection samples. The profiles are retrieved from the Gene Expression Omnibus. They are first analysed by the Boruta and mRMR methods, resulting in a feature list. This list is fed into the incremental feature selection method to extract essential biomarker genes/transcripts, build efficient classifiers, and construct classification rules.
Figure 2. IFS curves for different classifiers across different numbers of gene features. The SVM provides the highest MCC of 0.917 when the top 168 features are adopted.
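The incremental feature selection (IFS) step behind these curves can be illustrated with a minimal sketch. It is not the authors' implementation: the study reports SVM, random forest, and decision tree classifiers scored by MCC, whereas this example assumes a simple nearest-centroid classifier scored by leave-one-out accuracy as a stand-in, and the function names and toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Sequence[float]


def nearest_centroid_loo_accuracy(X: List[Sample], y: List[str], cols: List[int]) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in scorer)."""
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        sums: Dict[str, List[float]] = {}
        counts: Dict[str, int] = {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(cols))
            for k, c in enumerate(cols):
                acc[k] += row[c]
            counts[label] = counts.get(label, 0) + 1

        # Predict the class whose centroid is closest (squared Euclidean distance).
        def dist(label: str) -> float:
            return sum(
                (X[i][c] - sums[label][k] / counts[label]) ** 2
                for k, c in enumerate(cols)
            )

        pred = min(sorted(sums), key=dist)
        correct += pred == y[i]
    return correct / len(X)


def incremental_feature_selection(
    X: List[Sample],
    y: List[str],
    ranked: List[int],
    score=nearest_centroid_loo_accuracy,
    step: int = 1,
) -> Tuple[int, float, List[Tuple[int, float]]]:
    """Score the top-k ranked features for k = step, 2*step, ... and keep the best k."""
    curve = [(k, score(X, y, ranked[:k])) for k in range(step, len(ranked) + 1, step)]
    # Highest score wins; ties go to the smaller feature set.
    best_k, best_score = max(curve, key=lambda t: (t[1], -t[0]))
    return best_k, best_score, curve


# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0], [1.0, 2.0], [1.1, 8.0], [0.9, 4.0]]
y = ["A", "A", "A", "B", "B", "B"]
ranked = [0, 1]  # feature indices, best first, as a ranking method would supply

best_k, best_score, curve = incremental_feature_selection(X, y, ranked)
```

Plotting `curve` (feature count against score) gives the kind of IFS curve shown in Figure 2, and `best_k` plays the role of the optimum feature count reported for each classifier.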
Performance of the optimum classifiers based on different classification algorithms.
| Classification algorithm | Number of features | Overall accuracy | MCC |
|---|---|---|---|
| Decision tree | 511 | 0.867 | 0.818 |
| [missing] | 183 | 0.882 | 0.845 |
| Support vector machine | 168 | 0.938 | 0.917 |
| Random forest | 565 | 0.923 | 0.896 |
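The two metrics in this table can be computed from true and predicted labels as sketched below. The record does not state which MCC formulation was used for the five-category problem; this sketch assumes the standard multiclass generalization, which reduces to the usual binary MCC when there are two classes. The function names are illustrative.

```python
from collections import Counter
from math import sqrt


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (assumed formulation).

    MCC = (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s = number of samples, c = number of correct predictions,
    t_k = true count of class k, p_k = predicted count of class k.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_k = Counter(y_true)
    p_k = Counter(y_pred)
    labels = set(t_k) | set(p_k)
    cov_tp = c * s - sum(p_k[k] * t_k[k] for k in labels)
    cov_tt = s * s - sum(t_k[k] ** 2 for k in labels)
    cov_pp = s * s - sum(p_k[k] ** 2 for k in labels)
    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0 when one side has only a single class.
    return cov_tp / denom if denom else 0.0
```

Unlike overall accuracy, MCC accounts for the class sizes in both the true and predicted labels, so a classifier that favors the largest category scores lower on MCC than on accuracy.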
Figure 3. Performance of the optimum classifiers with four different classification algorithms on five categories.
Figure 4. Pie chart showing the distribution of 21 classification rules across five categories.
Gene Ontology and KEGG pathway enrichment results.
| Index | Term | Category |
|---|---|---|
| 1 | SRP-dependent cotranslational protein targeting to membrane | GOTERM_BP_DIRECT |
| 2 | Viral transcription | GOTERM_BP_DIRECT |
| 3 | Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay | GOTERM_BP_DIRECT |
| 4 | Translational initiation | GOTERM_BP_DIRECT |
| 5 | Ribosome | KEGG_PATHWAY |
| 6 | Ribosome | GOTERM_CC_DIRECT |
| 7 | Translation | GOTERM_BP_DIRECT |
| 8 | Structural constituent of ribosome | GOTERM_MF_DIRECT |
| 9 | rRNA processing | GOTERM_BP_DIRECT |
| 10 | Cytosolic large ribosomal subunit | GOTERM_CC_DIRECT |
| 11 | Cytosolic small ribosomal subunit | GOTERM_CC_DIRECT |
| 12 | Poly(A) RNA binding | GOTERM_MF_DIRECT |
| 13 | Focal adhesion | GOTERM_CC_DIRECT |
| 14 | Membrane | GOTERM_CC_DIRECT |
| 15 | RNA binding | GOTERM_MF_DIRECT |
| 16 | Small ribosomal subunit | GOTERM_CC_DIRECT |
| 17 | Cytosol | GOTERM_CC_DIRECT |
| 18 | Intracellular ribonucleoprotein complex | GOTERM_CC_DIRECT |
| 19 | Extracellular exosome | GOTERM_CC_DIRECT |
| 20 | Extracellular matrix | GOTERM_CC_DIRECT |
| 21 | Ribosomal large subunit assembly | GOTERM_BP_DIRECT |
| 22 | Nucleolus | GOTERM_CC_DIRECT |
| 23 | Cytoplasmic translation | GOTERM_BP_DIRECT |