Amin Khodaei, Parvaneh Shams, Hadi Sharifi, Behzad Mozaffari-Tazehkand.
Abstract
Coronavirus disease has been one of the major challenges facing humankind over the past two years. From the first days of the epidemic, one difficulty has been that its clinical symptoms resemble those of other viral infections such as colds and influenza. As a result, diagnosing the disease and choosing coping and treatment approaches have also been difficult. In this study, an attempt was made to investigate the origin of the disease and the genetic structure of the virus that causes it. For this purpose, signal processing and linear predictive coding approaches, which are widely used in data compression, were applied. A pattern recognition model was presented for detecting and separating covid samples from influenza virus case studies. This model, based on a support vector machine classifier, was tested successfully on several datasets collected from different countries. It achieved more than 98% accuracy on all datasets. In addition to its good accuracy, the proposed model can be a step forward in quantifying and digitizing medical big data.
Keywords: Corona; DNA Sequence; Linear predictive coding; Machine learning; Signal processing; Support vector machine
Year: 2022 PMID: 36168586 PMCID: PMC9500098 DOI: 10.1016/j.bspc.2022.104192
Source DB: PubMed Journal: Biomed Signal Process Control ISSN: 1746-8094 Impact factor: 5.076
Fig. 1 Flowchart of the proposed approach
Fig. 2 Sliding window technique applied to the signal vectors corresponding to nucleotide sequences
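The sliding window procedure named in Fig. 2 can be illustrated with a minimal sketch. The numeric mapping `NUC_TO_NUM`, the helper names, and the `step` parameter are assumptions made for illustration; the record does not state which nucleotide encoding or window overlap the authors used.

```python
# Hypothetical integer encoding; the paper's actual mapping is not given here.
NUC_TO_NUM = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(sequence):
    """Map a nucleotide string to a numeric signal, skipping unknown symbols."""
    return [NUC_TO_NUM[base] for base in sequence.upper() if base in NUC_TO_NUM]


def sliding_windows(signal, window, step=1):
    """Return every full-length window of `signal`, advancing by `step` samples."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [signal[start:start + window]
            for start in range(0, len(signal) - window + 1, step)]


signal = encode("ACGTAC")
windows = sliding_windows(signal, window=4, step=1)
```

A signal shorter than `window` yields an empty list, so callers may want to check for that case before extracting features.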
The common kernel functions of the SVM learning model
| Kernel | Function |
|---|---|
| Linear | |
| Polynomial | |
| Sigmoid | |
| RBF | |
| MLP | |
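The kernel formulas did not survive in this record. As a reminder of what such kernels compute, the sketch below implements common textbook forms of four of them. The parameter names (`gamma`, `coef0`, `degree`) and default values are assumptions, not the paper's settings. The MLP kernel is often written with the same tanh form as the sigmoid kernel; how the paper distinguishes the two is not recoverable here, so no separate MLP function is given.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)
```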
Fig. 3 Feature extraction steps for a genomic sequence
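A rough sketch of LPC-based feature extraction over sliding windows follows. It uses the autocorrelation method with the Levinson-Durbin recursion, which is one standard way to fit LPC coefficients; the record does not say which estimator the authors used. Pooling the coefficients of all windows and summarizing them by their maximum and standard deviation is also an assumption, suggested only by the feature names in the later figure captions. All function names are hypothetical.

```python
import statistics


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion.

    Returns (a, err) with a[0] == 1, so the one-step prediction is
    x_hat[n] = -sum(a[j] * x[n - j] for j in 1..order).
    """
    if order >= len(x):
        raise ValueError("order must be smaller than the signal length")
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # all-zero or perfectly predicted window
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_window_features(signal, order, window, step=1):
    """Assumed summary: max and population std of LPC coefficients over all windows."""
    coeffs = []
    for start in range(0, len(signal) - window + 1, step):
        a, _ = lpc(signal[start:start + window], order)
        coeffs.extend(a[1:])
    if not coeffs:
        raise ValueError("signal is shorter than one window")
    return max(coeffs), statistics.pstdev(coeffs)
```

For a geometric signal such as `[0.5 ** n for n in range(50)]`, an order-1 fit recovers a coefficient close to -0.5, meaning each sample is predicted as half of the previous one.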
Comparison of the results obtained with different parameter values
| No. | Order | Window | Accuracy |
|---|---|---|---|
| 1 | 40 | 60 | 0.951 |
| 2 | 40 | 90 | 0.952 |
| 3 | 40 | 150 | 0.943 |
| 4 | 20 | 40 | 0.961 |
| 5 | 20 | 90 | 0.964 |
| 6 | 20 | 150 | |
| 7 | 20 | 200 | 0.961 |
| 8 | 10 | 40 | 0.943 |
| 9 | 10 | 90 | 0.950 |
| 10 | 10 | 150 | 0.961 |
| 11 | 10 | 200 | 0.961 |
Fig. 4 Comparison of the 10-fold accuracy of different machine learning models
Fig. 5 Comparison of the maximum vs standard deviation values on the China dataset
Fig. 6 Results of the two features on some covid samples: (a) maximum feature, (b) standard deviation feature
Fig. 7 Results of the two features on some non-covid samples: (a) maximum feature, (b) standard deviation feature
Fig. 8 Comparison of the maximum vs standard deviation values on the United Kingdom dataset
Fig. 9 Maximum vs standard deviation feature space on several country datasets: (a) USA, (b) Australia, (c) France, (d) India
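The 10-fold evaluation named in the caption above can be sketched as follows. To keep the example dependency-free, the classifier is a simple nearest-centroid stand-in, not the SVM used in the paper; any model can be plugged in through the `fit` and `predict` callables. All names and the toy data are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    if k < 2 or k > n:
        raise ValueError("need 2 <= k <= n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(X, y, fit, predict, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


def fit_centroid(X, y):
    """Stand-in classifier (not an SVM): store the mean vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict_centroid(model, x):
    """Assign x to the class with the nearest centroid (squared Euclidean distance)."""
    return min(model, key=lambda label: sum((a - b) ** 2
                                            for a, b in zip(model[label], x)))


# Toy two-feature data standing in for (maximum, standard deviation) pairs.
X = [[i * 0.01, 0.0] for i in range(20)] + [[10 + i * 0.01, 5.0] for i in range(20)]
y = [0] * 20 + [1] * 20
scores = cross_val_accuracy(X, y, fit_centroid, predict_centroid, k=10)
mean_accuracy = sum(scores) / len(scores)
```

Reporting both the per-fold scores and their mean makes it easier to compare models on stability as well as on average accuracy.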
Comparative analysis of the proposed method and previously published methods
| Study | Dataset | Samples | Method | Accuracy |
|---|---|---|---|---|
| | | | CpG island feature selection + KNN classifier | 98% |
| | | | Combination of DFT, DCT, and Moment Invariants techniques + KNN classifier | 100% |
| | COVID-19 and three types of Influenza viruses | 594 | | 99% |
| | | | Pseudo-convolutional method + Random Forest and MLP classifier | 99% |
| | | | CNN Deep learning | 98% |
| Proposed method | | 107000 | Sliding window technique on LPC model + SVM classifier | 99% |