Laura Verde, Giuseppe De Pietro, Giovanna Sannino.
Abstract
Healthcare sensors are a valid, non-invasive instrument for capturing and analysing physiological data. Thanks to increasingly advanced devices, several vital signals, such as voice signals, can be acquired anytime and anywhere with the least possible discomfort to the patient. Integrating sensors with artificial intelligence techniques contributes to faster and easier solutions for early diagnosis, personalized treatment, remote patient monitoring and better decision making, all vital tasks in a critical situation such as the COVID-19 pandemic. This paper studies whether the early, non-invasive detection of COVID-19 can be supported by analysing voice signals with the main machine learning algorithms. If demonstrated, this detection capacity could be embedded in a powerful mobile screening application. The study uses the Coswara dataset. The aim is not only to evaluate which machine learning technique best distinguishes a healthy voice from a pathological one, but also to identify which vowel sound is most seriously affected by COVID-19 and is therefore most reliable for detecting the pathology. The results show that Random Forest classifies healthy and pathological voices most accurately. Moreover, evaluating the vowel /e/ detects the effects of COVID-19 on voice quality with better accuracy than the other vowels. © King Fahd University of Petroleum & Minerals 2021.
Keywords: COVID-19 detection; Healthcare sensors; Machine learning techniques; Voice analysis; Vowel sounds
Year: 2021 PMID: 34642613 PMCID: PMC8500467 DOI: 10.1007/s13369-021-06041-4
Source DB: PubMed Journal: Arab J Sci Eng ISSN: 2191-4281 Impact factor: 2.807
Details about the subjects involved in this study. For the age, we report the mean and standard deviation (SD)
| Category | Gender | No | Age (mean ± SD) |
|---|---|---|---|
| Healthy | Female | 21 | 29.6 ± 10.1 |
| Healthy | Male | 62 | 36.4 ± 13.1 |
| Covid-positive | Female | 25 | 32.08 ± 11.4 |
| Covid-positive | Male | 58 | 31.2 ± 11.5 |
| Total | Female | 46 | 30.9 ± 10.7 |
| Total | Male | 120 | 33.9 ± 12.6 |
Bold italics indicate the total obtained for each category (healthy, covid-positive and all subjects involved in this study)
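The group sizes can be tallied directly from the per-gender rows of the table above. The snippet below is a minimal sketch of that arithmetic; the variable names are illustrative and the counts are copied from the table.

```python
from collections import defaultdict

# (category, gender, number of subjects), copied from the subjects table.
subjects = [
    ("Healthy", "Female", 21),
    ("Healthy", "Male", 62),
    ("Covid-positive", "Female", 25),
    ("Covid-positive", "Male", 58),
]

by_category = defaultdict(int)
by_gender = defaultdict(int)
for category, gender, count in subjects:
    by_category[category] += count
    by_gender[gender] += count

total = sum(by_category.values())
print(dict(by_category), dict(by_gender), total)
```

The tally gives 83 healthy and 83 Covid-positive subjects (166 in all), and the per-gender sums agree with the Total rows (46 female, 120 male).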
Fig. 1 An overview of the Support Vector Machine algorithm
Fig. 2 An overview of the Adaboost algorithm
Fig. 3 An overview of the Random Forest algorithm
Results achieved on the training set for the vowels /a/, /e/ and /o/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 96.97 | 53.03 | 75.00 | 67.37 | 79.50 | 0.799 |
| Bayes Net | 93.94 | 78.79 | 86.36 | 81.58 | 87.32 | 0.953 |
| SVM | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| SGD | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| Ibk | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| LWL | 68.18 | 83.33 | 75.76 | 80.36 | 73.77 | 0.941 |
| Adaboost | 92.42 | 89.39 | 90.91 | 89.71 | 91.04 | 0.966 |
| Bagging | 98.48 | 92.42 | 95.45 | 92.86 | 95.59 | 0.988 |
| OneR | 71.21 | 86.36 | 78.79 | 83.93 | 77.05 | 0.788 |
| Decision Table | 86.36 | 78.79 | 82.58 | 80.28 | 83.21 | 0.847 |
| J48 | 98.48 | 100.00 | 99.24 | 100.00 | 99.24 | 1.000 |
| Random Forest | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
Results achieved on the testing set for the vowels /a/, /e/ and /o/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 100.00 | 35.29 | 67.65 | 60.71 | 75.56 | 0.706 |
| Bayes Net | 88.24 | 29.41 | 58.82 | 55.56 | 68.18 | 0.671 |
| SVM | 94.12 | 52.94 | 73.53 | 66.67 | 78.05 | 0.735 |
| SGD | 94.12 | 47.06 | 70.59 | 64.00 | 76.19 | 0.706 |
| Ibk | 88.24 | 47.06 | 67.65 | 62.50 | 73.17 | 0.676 |
| LWL | 58.82 | 52.94 | 55.88 | 55.56 | 57.14 | 0.647 |
| Adaboost | 70.59 | 76.47 | 73.53 | 75.00 | 72.73 | 0.785 |
| Bagging | 76.47 | 64.71 | 70.59 | 68.42 | 72.22 | 0.747 |
| OneR | 29.41 | 76.47 | 52.94 | 55.56 | 38.46 | 0.529 |
| Decision Table | 64.71 | 47.06 | 55.88 | 55.00 | 59.46 | 0.554 |
| J48 | 64.71 | 52.94 | 58.82 | 57.89 | 61.11 | 0.588 |
| Random Forest | 94.12 | 70.59 | 82.35 | 76.19 | 84.21 | 0.901 |
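Setting the training and testing tables for the combined vowels side by side shows how much each classifier's accuracy drops on unseen data. The sketch below copies the accuracy columns from those two tables; the dictionary names and the `generalization_gap` helper are illustrative and not part of the study.

```python
# Accuracy (%) copied from the training and testing tables for /a/, /e/ and /o/.
train_acc = {
    "Naive Bayes": 75.00, "Bayes Net": 86.36, "SVM": 100.00, "SGD": 100.00,
    "Ibk": 100.00, "LWL": 75.76, "Adaboost": 90.91, "Bagging": 95.45,
    "OneR": 78.79, "Decision Table": 82.58, "J48": 99.24, "Random Forest": 100.00,
}
test_acc = {
    "Naive Bayes": 67.65, "Bayes Net": 58.82, "SVM": 73.53, "SGD": 70.59,
    "Ibk": 67.65, "LWL": 55.88, "Adaboost": 73.53, "Bagging": 70.59,
    "OneR": 52.94, "Decision Table": 55.88, "J48": 58.82, "Random Forest": 82.35,
}


def generalization_gap(train, test):
    """Training accuracy minus testing accuracy, in percentage points."""
    return {name: round(train[name] - test[name], 2) for name in train}


gaps = generalization_gap(train_acc, test_acc)
best_on_test = max(test_acc, key=test_acc.get)
largest_drop = max(gaps, key=gaps.get)

# List classifiers from the smallest to the largest drop.
for name in sorted(gaps, key=gaps.get):
    print(f"{name:15s} test={test_acc[name]:6.2f}  gap={gaps[name]:6.2f}")
```

On these figures Random Forest keeps the highest testing accuracy (82.35%) with a gap of 17.65 points, while J48 shows the largest drop (40.42 points).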
Results achieved on the testing set for the vowel /e/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 70.59 | 70.59 | 70.59 | 70.59 | 70.59 | 0.754 |
| Bayes Net | 52.94 | 82.35 | 67.65 | 75.00 | 62.07 | 0.749 |
| SVM | 84.62 | 71.43 | 76.47 | 64.71 | 73.33 | 0.765 |
| SGD | 64.71 | 76.47 | 70.59 | 73.33 | 68.75 | 0.706 |
| Ibk | 58.82 | 52.94 | 55.88 | 55.56 | 57.14 | 0.559 |
| LWL | 35.29 | 100.00 | 67.65 | 100.00 | 52.17 | 0.709 |
| Adaboost | 64.71 | 76.47 | 70.59 | 73.33 | 68.75 | 0.763 |
| Bagging | 82.35 | 70.59 | 76.47 | 73.68 | 77.78 | 0.820 |
| OneR | 76.47 | 29.41 | 52.94 | 52.00 | 61.90 | 0.529 |
| Decision Table | 58.82 | 64.71 | 61.76 | 62.50 | 60.61 | 0.606 |
| J48 | 58.82 | 70.59 | 64.71 | 66.67 | 62.50 | 0.652 |
| Random Forest | 76.47 | 94.12 | 85.29 | 92.86 | 83.87 | 0.867 |
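The columns of these tables follow from the standard confusion-matrix definitions. As a worked example, the sketch below reproduces the Random Forest row for the vowel /e/. It assumes a testing set of 17 Covid-positive and 17 healthy voices, with Covid-positive as the positive class; these counts are not stated here and are inferred from the reported percentages, which are multiples of 1/17. The function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics, returned as percentages."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    values = {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }
    return {name: round(100 * value, 2) for name, value in values.items()}


# Assumed counts (inferred, not reported): 13 of 17 positives and
# 16 of 17 negatives classified correctly.
metrics = binary_metrics(tp=13, fn=4, tn=16, fp=1)
print(metrics)
```

Under these assumed counts the function returns 76.47, 94.12, 85.29, 92.86 and 83.87, matching the Random Forest row above.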
Results achieved on the testing set for the vowel /a/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 82.35 | 23.53 | 52.94 | 51.85 | 63.64 | 0.713 |
| Bayes Net | 47.06 | 70.59 | 58.82 | 61.54 | 53.33 | 0.616 |
| SVM | 52.94 | 82.35 | 67.65 | 75.00 | 62.07 | 0.676 |
| SGD | 47.06 | 94.12 | 70.59 | 88.89 | 61.54 | 0.706 |
| Ibk | 52.94 | 70.59 | 61.76 | 64.29 | 58.06 | 0.618 |
| LWL | 47.06 | 58.82 | 52.94 | 53.33 | 50.00 | 0.626 |
| Adaboost | 41.18 | 70.59 | 55.88 | 58.33 | 48.28 | 0.683 |
| Bagging | 70.59 | 70.59 | 70.59 | 70.59 | 70.59 | 0.744 |
| OneR | 47.06 | 58.82 | 52.94 | 53.33 | 50.00 | 0.529 |
| Decision Table | 52.94 | 58.82 | 55.88 | 56.25 | 54.55 | 0.578 |
| J48 | 58.82 | 76.47 | 67.65 | 71.43 | 64.52 | 0.713 |
| Random Forest | 47.06 | 76.47 | 61.76 | 66.67 | 55.17 | 0.739 |
Results achieved on the testing set for the vowel /o/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 94.12 | 11.76 | 52.94 | 51.61 | 66.67 | 0.585 |
| Bayes Net | 64.71 | 35.29 | 50.00 | 50.00 | 56.41 | 0.533 |
| SVM | 70.59 | 58.82 | 64.71 | 63.16 | 66.67 | 0.647 |
| SGD | 70.59 | 35.29 | 52.94 | 52.17 | 60.00 | 0.529 |
| Ibk | 64.71 | 70.59 | 67.65 | 68.75 | 66.67 | 0.676 |
| LWL | 76.47 | 58.82 | 67.65 | 65.00 | 70.27 | 0.775 |
| Adaboost | 64.71 | 41.18 | 52.94 | 52.38 | 57.89 | 0.576 |
| Bagging | 64.71 | 35.29 | 50.00 | 50.00 | 56.41 | 0.474 |
| OneR | 47.06 | 35.29 | 41.18 | 42.11 | 44.44 | 0.412 |
| Decision Table | 76.47 | 41.18 | 58.82 | 56.52 | 65.00 | 0.592 |
| J48 | 58.82 | 41.18 | 50.00 | 50.00 | 54.05 | 0.517 |
| Random Forest | 70.59 | 52.94 | 61.76 | 60.00 | 64.86 | 0.666 |
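The three per-vowel testing tables can be compared by taking the best testing accuracy reached for each vowel. The sketch below copies the accuracy columns from those tables; the `best_per_vowel` helper and variable names are illustrative.

```python
ALGORITHMS = [
    "Naive Bayes", "Bayes Net", "SVM", "SGD", "Ibk", "LWL",
    "Adaboost", "Bagging", "OneR", "Decision Table", "J48", "Random Forest",
]

# Testing-set accuracy (%) per vowel, in the row order of the tables.
test_accuracy = {
    "/e/": [70.59, 67.65, 76.47, 70.59, 55.88, 67.65,
            70.59, 76.47, 52.94, 61.76, 64.71, 85.29],
    "/a/": [52.94, 58.82, 67.65, 70.59, 61.76, 52.94,
            55.88, 70.59, 52.94, 55.88, 67.65, 61.76],
    "/o/": [52.94, 50.00, 64.71, 52.94, 67.65, 67.65,
            52.94, 50.00, 41.18, 58.82, 50.00, 61.76],
}


def best_per_vowel(table):
    """Return {vowel: (best accuracy, algorithms that reach it)}."""
    result = {}
    for vowel, accuracies in table.items():
        top = max(accuracies)
        winners = [a for a, acc in zip(ALGORITHMS, accuracies) if acc == top]
        result[vowel] = (top, winners)
    return result


best = best_per_vowel(test_accuracy)
top_vowel = max(best, key=lambda vowel: best[vowel][0])
for vowel, (accuracy, winners) in best.items():
    print(vowel, accuracy, winners)
```

The best testing accuracy is 85.29% for /e/ (Random Forest), against 70.59% for /a/ (SGD and Bagging) and 67.65% for /o/ (Ibk and LWL), in line with the abstract's statement that /e/ is the most reliable vowel.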
Results achieved on the training set for the vowel /a/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 92.42 | 28.79 | 60.61 | 56.48 | 70.11 | 0.801 |
| Bayes Net | 78.79 | 68.18 | 73.48 | 71.23 | 74.82 | 0.775 |
| SVM | 95.45 | 93.94 | 94.70 | 94.03 | 94.74 | 0.947 |
| SGD | 95.45 | 100.00 | 97.73 | 100.00 | 97.67 | 0.977 |
| Ibk | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| LWL | 80.30 | 65.15 | 72.73 | 69.74 | 74.65 | 0.908 |
| Adaboost | 78.79 | 93.94 | 86.36 | 92.86 | 85.25 | 0.952 |
| Bagging | 89.39 | 87.88 | 88.64 | 88.06 | 88.72 | 0.958 |
| OneR | 75.00 | 86.36 | 81.15 | 82.35 | 78.50 | 0.750 |
| Decision Table | 75.76 | 71.21 | 73.48 | 72.46 | 74.07 | 0.753 |
| J48 | 98.48 | 96.97 | 97.73 | 97.01 | 97.74 | 0.993 |
| Random Forest | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
Results achieved on the training set for the vowel /e/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 66.67 | 84.85 | 75.76 | 81.48 | 73.33 | 0.861 |
| Bayes Net | 60.61 | 91.30 | 76.30 | 86.96 | 71.43 | 0.880 |
| SVM | 96.97 | 95.45 | 96.21 | 95.52 | 96.24 | 0.960 |
| SGD | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| Ibk | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| LWL | 78.79 | 98.48 | 88.64 | 98.11 | 87.39 | 0.941 |
| Adaboost | 80.30 | 96.97 | 88.64 | 96.36 | 87.60 | 0.965 |
| Bagging | 89.39 | 87.88 | 88.64 | 88.06 | 88.72 | 0.961 |
| OneR | 86.36 | 71.21 | 78.79 | 75.00 | 80.28 | 0.788 |
| Decision Table | 89.16 | 80.30 | 86.64 | 91.93 | 90.52 | 0.800 |
| J48 | 96.97 | 98.48 | 97.73 | 98.46 | 97.71 | 0.991 |
| Random Forest | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
Results achieved on the training set for the vowel /o/
| Algorithm | Sensitivity (%) | Specificity (%) | Accuracy (%) | Precision (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| Naive Bayes | 96.97 | 28.79 | 62.88 | 57.66 | 72.32 | 0.647 |
| Bayes Net | 95.45 | 72.73 | 84.09 | 77.78 | 85.71 | 0.900 |
| SVM | 89.39 | 86.36 | 87.88 | 86.76 | 88.06 | 0.879 |
| SGD | 96.97 | 95.45 | 96.21 | 95.52 | 96.24 | 0.962 |
| Ibk | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| LWL | 83.33 | 69.70 | 76.52 | 73.33 | 78.01 | 0.919 |
| Adaboost | 89.39 | 80.30 | 84.85 | 81.94 | 85.51 | 0.923 |
| Bagging | 93.94 | 91.18 | 92.54 | 91.18 | 92.54 | 0.993 |
| OneR | 83.33 | 72.73 | 78.03 | 75.34 | 79.14 | 0.780 |
| Decision Table | 95.45 | 54.55 | 75.00 | 67.74 | 79.25 | 0.750 |
| J48 | 100.00 | 98.48 | 99.24 | 98.51 | 99.25 | 1.000 |
| Random Forest | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |