Literature DB >> 31737550

An Ensemble Method for Diagnosis of Parkinson's Disease Based on Voice Measurements.

Razieh Sheibani1, Elham Nikookar1, Seyed Enayatollah Alavi1.   

Abstract

BACKGROUND: Parkinson's disease (PD) is the second most common neurodegenerative disorder, after Alzheimer's disease. Unfortunately, there is no specific test, such as electroencephalography or a blood test, for diagnosing the disease. According to previous studies, about 90% of people with PD have some form of voice abnormality. Therefore, voice measurements can be used to detect the disease.
METHODS: This study presents an ensemble-based method for distinguishing patients from healthy samples by class-label prediction based on voice frequency characteristics. It comprises three stages: data preprocessing, internal classification, and ultimate classification. The outputs of the internal classifiers, together with the primary feature vectors of the samples, form the inputs of the ultimate classifier.
RESULTS: The proposed method achieved 90.6% accuracy, 95.8% sensitivity, and 75% specificity, which is competitive with other relevant studies.
CONCLUSION: The current experimental outcomes provide a comparative analysis of various machine learning classifiers and confirm that using ensemble-based methods improves medical diagnostic tasks.
© 2019 Journal of Medical Signals & Sensors.


Keywords:  Classification; ensemble learning; medical diagnostics; Parkinson's disease; voice measurements

Year:  2019        PMID: 31737550      PMCID: PMC6839436          DOI: 10.4103/jmss.JMSS_57_18

Source DB:  PubMed          Journal:  J Med Signals Sens        ISSN: 2228-7477


Introduction

Parkinson's disease (PD) is the second most common neurodegenerative disease after Alzheimer's disease. It was first reported by James Parkinson, a British physician, in 1817, who called it the shaking palsy; the condition is known today as PD.[1] According to European Parkinson's Disease Association statistics, about 6.3 million people around the world live with this disease.[2] PD is a chronic and progressive disease in which the brain's dopamine-secreting cells are destroyed. In the absence of dopamine, body movements become irregular; nearly 70%–80% of dopamine-secreting cells are already destroyed by the time the primary Parkinson symptoms appear. Tremor of the hands and feet at rest is one of the most common symptoms of the disease. Other symptoms include muscle stiffness, voice changes, slowness of movement, and depression.[3] The average age at which symptoms appear is about 60, although about one in twenty patients shows initial symptoms before the age of 50. Unfortunately, elderly people may attribute these symptoms to age-related changes. It should be noted that PD is currently incurable, but the available drugs significantly reduce the symptoms, mainly at the beginning of the disease.[4]

Based on previous studies, about 90% of people with PD have some form of voice abnormality. Therefore, voice measurements can be used to diagnose and track the progression of PD symptoms.[5] To elicit these symptoms, various verbal experiments have been designed. These experiments fall into two main groups: running speech experiments and sustained phonation experiments. In the first group, patients are asked to say an expression that includes a sample of linguistic units; in the second, they are asked to produce a single sustained vowel.[6,7]

In many studies, these verbal experiments are used as an effective element for distinguishing patients from healthy samples. Sakar et al.[8] used a dataset of 40 samples including 20 patients.
Each participant was asked to say a set of 26 distinct items, including sustained vowels, numbers, words, and short sentences. The authors applied the summarized leave-one-out technique to each person's voice samples and achieved an accuracy of 77.5%. Zhang et al.[9] applied a combination of multi-edit nearest-neighbor and an ensemble learning algorithm to the same dataset and reported an improvement in classification accuracy of 29.44%. Tsanas et al.[10] used a dataset of 42 participants comprising about 6000 voice samples. By applying linear and nonlinear regression analyses, they estimated the Unified Parkinson's Disease Rating Scale (UPDRS), which indicates the presence and severity of symptoms; their predictions differed from the clinicians' estimates by about 7.5 UPDRS points. Gil and Johnson[11] used a dataset of 31 samples including 23 patients. They applied artificial neural networks (ANNs) and support vector machines (SVMs) and obtained an accuracy of 90%. Other studies on this dataset are as follows. Ene[12] applied three types of probabilistic neural networks, based on incremental search, Monte Carlo search, and hybrid search (HS); the study reported that the techniques do not differ significantly, but the HS technique performs best, with an accuracy of 81%. Ullah Khan,[13] after preprocessing the data by cleaning it and recovering missing values, applied three methods, k-nearest neighbor (k-NN), AdaBoost (AB), and random forest (RF), to classify the data; k-NN achieved the best accuracy, 90.2%. Khemphila and Boonjing[14] applied an ANN to classify patients and healthy samples, obtaining about 83.3% accuracy. Ozcift and Gulten[15] proposed classifier ensemble construction with a rotation forest approach, achieving a classification accuracy of 87.13%. The present study uses the aforementioned dataset of Gil and Johnson, which is a collection of multiple voice recordings.

Methods

This study presents an ensemble-based method for distinguishing patients from healthy samples by class-label prediction based on voice frequency characteristics. It comprises three stages: data preprocessing, internal classification, and ultimate classification. In the first stage, to improve processing time, the dataset is separated into six subsets according to the recorded voice types, so that each dataset sample has only one record in each subset. In the next stage, several prediction models are generated by applying a set of different internal classifiers; the output of each prediction model is calculated and used as input to the next stage. Finally, the ultimate classifiers determine the sample's final class label. A schematic illustration of the proposed method is depicted in Figure 1.
Figure 1

Schematic illustration of the proposed method. MLP: Multilayer perceptron, DT: Decision tree, NB: Naive Bayes
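The two-stage design described above, in which the internal classifiers' predictions are appended to the original feature vector before the ultimate classifier decides, corresponds to stacked generalization with feature passthrough. A minimal sketch follows; scikit-learn and the synthetic data are assumptions for illustration, not the authors' WEKA setup:

```python
# Sketch of the proposed two-stage ensemble: internal classifiers' outputs
# are concatenated with the primary features and fed to an ultimate
# classifier. scikit-learn and synthetic data are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=195, n_features=22, random_state=0)

internal = [                                   # stage 2: internal classifiers
    ("knn", KNeighborsClassifier(n_neighbors=1)),
    ("svm", SVC()),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("nb", GaussianNB()),
]
# passthrough=True appends the primary feature vector to the internal
# classifiers' outputs, mirroring the ultimate-classifier inputs above.
ensemble = StackingClassifier(
    estimators=internal,
    final_estimator=RandomForestClassifier(random_state=0),  # stage 3
    passthrough=True,
)
scores = cross_val_score(ensemble, X, y, cv=10)  # 10-fold cross-validation
print(round(scores.mean(), 3))
```

`StackingClassifier` internally uses cross-validated predictions of the base estimators as meta-features, which avoids leaking training labels into the ultimate classifier.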

Because performance evaluation is a vital task in all automatic diagnostic systems, three evaluation metrics, accuracy, sensitivity, and specificity, have been used to indicate the efficiency of the different classifiers. Accuracy is a general metric for measuring the efficiency of a method. Sensitivity and specificity measure how reliably a method identifies patients and healthy controls, respectively, and are more common in the medical diagnostic domain. The complete description of these metrics is shown in Table 1.
Table 1

Description of performance evaluation metrics

Metric        Formula                          Description
Sensitivity   TP / (TP + FN)                   TP: cases with PD label correctly classified as PD; FN: cases with PD label classified as healthy controls
Accuracy      (TP + TN) / (TP + TN + FP + FN)  Proportion of all cases classified correctly
Specificity   TN / (TN + FP)                   TN: cases with healthy label correctly classified as healthy controls; FP: cases with healthy label classified as PD

PD – Parkinson’s disease; TP – True positive; TN – True negative; FP – False positive; FN – False negative

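The metrics in Table 1 can be computed directly from confusion-matrix counts. In the short illustration below, the counts are hypothetical, chosen only because they happen to reproduce the headline figures reported later (90.6% accuracy, 95.8% sensitivity, 75% specificity over 32 test samples):

```python
# Accuracy, sensitivity, and specificity from confusion-matrix counts.
# The counts are hypothetical, picked only to illustrate the formulas.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):          # true-positive rate: TP / (TP + FN)
    return tp / (tp + fn)

def specificity(tn, fp):          # true-negative rate: TN / (TN + FP)
    return tn / (tn + fp)

tp, tn, fp, fn = 23, 6, 2, 1      # hypothetical counts, 32 samples total
print(round(accuracy(tp, tn, fp, fn), 3),   # -> 0.906
      round(sensitivity(tp, fn), 3),        # -> 0.958
      round(specificity(tn, fp), 3))        # -> 0.75
```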

Results

The data for this study, developed by the University of Oxford in cooperation with the National Center for Voice and Speech, Denver, Colorado, comprise biomedical voice measurements of 31 people, 23 of whom have PD. The duration since diagnosis ranged from 0 to 28 years, and the ages ranged from 46 to 85 years, with an average of 65.8 years. On average, about six voice recordings were made per person.[16] For the analysis of people with PD, 23 voice frequency characteristics were measured; these are described in Table 2.
Table 2

Description of voice frequency characteristics

Name                                                                                  Description
MDVP:Fo (Hz)                                                                          Average vocal fundamental frequency
MDVP:Fhi (Hz)                                                                         Maximum vocal fundamental frequency
MDVP:Flo (Hz)                                                                         Minimum vocal fundamental frequency
MDVP:Jitter (%), MDVP:Jitter (Abs), MDVP:RAP, MDVP:PPQ, Jitter:DDP                    Measures of variation in fundamental frequency
MDVP:Shimmer, MDVP:Shimmer (dB), Shimmer:APQ3, Shimmer:APQ5, MDVP:APQ, Shimmer:DDA    Measures of variation in amplitude
NHR, HNR                                                                              Ratio of noise to tonal components
RPDE, D2                                                                              Nonlinear dynamical complexity measures
DFA                                                                                   Signal fractal scaling exponent
Spread1, Spread2, PPE                                                                 Nonlinear measures of fundamental frequency variation
Status                                                                                Health status
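The characteristics in Table 2 follow the layout of the publicly available Oxford voice dataset: an identifier column, one column per acoustic measure, and a Status label. A minimal loading sketch (the two-row inline CSV and its values are made-up placeholders for illustration):

```python
# Sketch of parsing a voice-features table like the one described above:
# an id column, acoustic measures, and a "status" label (1 = PD, 0 = healthy).
# The inline CSV is a made-up placeholder, not the real dataset.
import csv, io

sample = io.StringIO(
    "name,MDVP:Fo(Hz),MDVP:Fhi(Hz),status\n"
    "subj1_rec1,119.992,157.302,1\n"
    "subj2_rec1,197.076,206.896,0\n"
)
rows = list(csv.DictReader(sample))
X = [[float(r["MDVP:Fo(Hz)"]), float(r["MDVP:Fhi(Hz)"])] for r in rows]
y = [int(r["status"]) for r in rows]
print(X, y)
```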
To apply the proposed method, the WEKA software (developed by the University of Waikato, Hamilton, New Zealand) was used. WEKA includes a collection of data mining and machine learning algorithms. After initial preprocessing of the data and creation of the six subsets, several well-known classification algorithms were applied: k-NN (with k values of 1, 3, and 5), SVM, DT, and NB. For better prediction, all of these classifiers were evaluated with 10-fold cross-validation. It is worth mentioning that the ABM1 method was used for resampling because of the small number of data samples. The accuracy, sensitivity, and specificity obtained by applying each internal classifier are summarized in Tables 3-5, respectively.
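An analogous evaluation of the internal classifiers can be sketched outside WEKA; the snippet below uses scikit-learn and synthetic data as stand-ins (assumptions, not the study's six real subsets), but mirrors the classifier set and 10-fold cross-validation:

```python
# 10-fold cross-validation of the internal classifiers, analogous to the
# WEKA experiments. scikit-learn and synthetic data are assumptions; the
# real study used six subsets of the Oxford voice dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=195, n_features=22, random_state=0)

classifiers = {
    "k-NN (k=1)": KNeighborsClassifier(n_neighbors=1),
    "k-NN (k=3)": KNeighborsClassifier(n_neighbors=3),
    "k-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
}
results = {name: cross_val_score(clf, X, y, cv=10).mean()
           for name, clf in classifiers.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```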
Table 3

Accuracy measures of applying internal classifiers

Subset  k-NN (k=1)  k-NN (k=3)  k-NN (k=5)  SVM   DT    NB
D1      90.6        62.5        81.2        81.2  87.5  87.5
D2      90.6        78.1        78.1        84.3  87.5  81.2
D3      87.5        84.3        81.2        90.6  93.7  81.2
D4      93.7        75.0        81.2        84.3  90.6  81.2
D5      78.1        75.0        78.1        84.3  93.7  78.1
D6      87.5        75.0        78.1        75.0  78.1  81.2

SVM – Support vector machine; DT – Decision tree; NB – Naive Bayes; k-NN – k-nearest neighboring

Table 5

Specificity measures of applying internal classifiers

Subset  k-NN (k=1)  k-NN (k=3)  k-NN (k=5)  SVM   DT    NB
D1      87.5        50.0        62.5        75.0  87.5  75.0
D2      75.0        75.0        50.0        75.0  75.0  75.0
D3      75.0        62.5        50.0        75.0  75.0  87.5
D4      87.5        62.5        62.5        75.0  87.5  98.0
D5      75.0        62.5        50.0        75.0  75.0  87.5
D6      75.0        75.0        62.5        50.0  87.5  87.5

SVM – Support vector machine; DT – Decision tree; NB – Naive Bayes; k-NN – k-nearest neighboring

According to the results, the k-NN algorithm (k=1) performed better than the others, with about 90% accuracy. In the next stage, MLP, AB, RF, and voting were applied as the ultimate classifiers, again using 10-fold cross-validation. The main motive for this design is to reduce the error rate: in ensemble learning methods, the probability of mislabeling a new sample is much lower than when predicting with only one model. The accuracy, sensitivity, and specificity obtained by applying each ultimate classifier are summarized in Tables 6-8, respectively.
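The claim that an ensemble mislabels less often than a single model can be made concrete with a standard argument: if base classifiers err independently (an idealizing assumption) with probability p < 0.5, a majority vote errs only when more than half of them err, which the binomial tail shows is much rarer:

```python
# Why ensembles reduce error: with n independent voters each erring with
# probability p < 0.5, the majority is wrong only when more than half err.
from math import comb

def majority_error(n, p):
    """P(majority of n voters is wrong), assuming independent errors."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Five 80%-accurate voters: the ensemble errs far less than 20% of the time.
print(round(majority_error(5, 0.2), 4))  # -> 0.0579
```

Real classifiers' errors are correlated, so the reduction in practice is smaller than this idealized bound, but the direction of the effect is the same.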
Table 6

Accuracy measures of applying ultimate classifiers

Ultimate classifier  k-NN (k=1)  k-NN (k=3)  k-NN (k=5)  SVM   DT    NB
MLP                  78.1        78.1        71.8        90.6  84.3  90.6
AB                   78.1        78.1        65.6        87.5  84.3  90.6
Voting               75.0        75.0        75.0        75.0  75.0  75.0
RF                   87.5        81.2        71.8        84.3  87.5  87.5

MLP – Multilayer perceptron; AB – AdaBoost; RF – Random forest; SVM – Support vector machine; DT – Decision tree; NB – Naive Bayes; k-NN – k-nearest neighboring

Table 8

Specificity measures of applying ultimate classifiers

Ultimate classifier  k-NN (k=1)  k-NN (k=3)  k-NN (k=5)  SVM   DT    NB
MLP                  75.0        75.0        37.5        75.0  75.0  75.0
AB                   75.0        62.5        37.5        75.0  75.0  75.0
Voting               60.0        60.0        60.0        60.0  60.0  60.0
RF                   75.0        75.0        37.5        75.0  75.0  75.0

MLP – Multilayer perceptron; AB – AdaBoost; RF – Random forest; SVM – Support vector machine; DT – Decision tree; NB – Naive Bayes; k-NN – k-nearest neighboring

In addition, a comparison of the results of the ultimate classification stage is depicted in Charts 1-3.
Chart 1

Comparison between accuracy measures of ultimate classification stage

Chart 2

Comparison between sensitivity measures of ultimate classification stage

Chart 3

Comparison between specificity measures of ultimate classification stage

In accordance with the results, both the MLP (with SVM and NB kernels) and AB (with NB kernel) ultimate classifiers performed best, obtaining 90.6% classification accuracy, 95.8% sensitivity, and 75.0% specificity. Although the proposed method achieved acceptable results through its ensemble-based approach, another important task is to compare the current work with previous studies. For this purpose, several previous studies on similar data are summarized in Table 9. In terms of classification accuracy, the evaluation metric most commonly used in related studies, the ensemble-based approach achieved the best performance.
Table 9

Comparison between the proposed method and other works

Author                      Method                                                 Accuracy (%)
Gil and Johnson[11]         ANN and SVM                                            90
Ene[12]                     IS, MCS and HS                                         81
Ullah Khan[13]              k-NN, AB and RF                                        90.2
Khemphila and Boonjing[14]  ANN                                                    83.3
Ozcift and Gulten[15]       Classifier ensemble construction with rotation forest  87.13
Proposed method             Ensemble based                                         90.6

ANN – Artificial neural network; SVM – Support vector machine; IS – Incremental search; MCS – Monte Carlo search; HS – Hybrid search; AB – AdaBoost; RF – Random forest; k-NN – k-nearest neighboring


Discussion

As mentioned previously, there is no specific test, such as electroencephalography or a blood test, for diagnosing PD, and timely diagnosis of the disease is vital for clinical reasons. In such cases, developing a prediction method by means of machine learning techniques is helpful. To identify PD patients, a combination of various machine learning classifiers has been investigated. The proposed method takes advantage of aggregating internal classifiers by passing their outputs to the next classification stage. First, different models are created on the training data by applying the internal classifiers. The outputs of the internal classifiers, together with the primary feature vectors of the samples, form the inputs of the ultimate classifiers. The final classifiers then determine the sample's class label, which results in a lower error rate. In accordance with the results, both the MLP and AB classifiers performed best, obtaining 90.6% classification accuracy, 95.8% sensitivity, and 75% specificity, which is competitive with other relevant studies.

Conclusion

Given the difficulty of diagnosing PD and its prevalence, a simple and inexpensive solution for its correct and timely diagnosis is very important. An effective way to diagnose PD is to compare people's voice frequency characteristics under controlled conditions. This study presents an ensemble method for distinguishing patients from healthy samples by class-label prediction based on these voice frequency characteristics. It evaluates the idea of merging the primary feature vectors of the samples with the predicted class labels. The current experimental results confirm that the proposed method improves machine learning outcomes for this medical diagnostic task.

Financial support and sponsorship

None.

Conflicts of interest

There are no conflicts of interest.

BIOGRAPHIES

Razieh Sheibani received her B.Sc. in computer engineering from Shahid Chamran University of Ahvaz, Iran, in 2018. Her research interests include data mining, medical image processing, and computational intelligence. Email: razieh.sh.agi@gmail.com

Elham Nikookar received her B.Sc. degree in computer engineering from Shahid Chamran University, Ahvaz, Iran, and her M.Sc. degree in computer engineering from the University of Tehran, Tehran, Iran, in 2012. She is currently an instructor and a faculty member in the Computer Engineering Department of Shahid Chamran University, Ahvaz, Iran. Email: e.nikookar@scu.ac.ir

Seyed Enayatollah Alavi received his B.Sc. degree in computer engineering from Isfahan University, Isfahan, Iran, in 1991, his M.Sc. degree in computer engineering from Shiraz University, Shiraz, Iran, in 1996, and his Ph.D. degree in computer engineering from Belarusian State University, Minsk, Belarus, in 2011. He is currently an assistant professor in the Computer Engineering Department of Shahid Chamran University, Ahvaz, Iran. Email: se.alavi@scu.ac.ir
Table 4

Sensitivity measures of applying internal classifiers

Subset  k-NN (k=1)  k-NN (k=3)  k-NN (k=5)  SVM   DT    NB
D1      91.6        66.6        87.5        83.3  87.5  91.6
D2      95.8        79.1        87.5        87.5  91.6  83.3
D3      91.6        91.6        91.6        95.8  98.0  79.1
D4      95.8        79.1        87.5        87.5  91.6  75.0
D5      79.1        79.1        87.5        87.5  98.0  75.0
D6      91.6        75.0        83.3        83.3  75.0  79.1

SVM – Support vector machine; DT – Decision tree; NB – Naive Bayes; k-NN – k-nearest neighboring

Table 7

Sensitivity measures of applying ultimate classifiers

Ultimate classifier  k-NN (k=1)  k-NN (k=3)  k-NN (k=5)  SVM   DT    NB
MLP                  79.1        79.1        83.3        95.8  87.5  95.8
AB                   79.1        83.3        75.0        91.6  87.5  95.8
Voting               95.0        95.0        95.0        95.0  95.0  95.0
RF                   91.6        83.3        83.3        83.6  91.6  91.6

MLP – Multilayer perceptron; AB – AdaBoost; RF – Random forest; SVM – Support vector machine; DT – Decision tree; NB – Naive Bayes; k-NN – k-nearest neighboring

References (9 in total)

1.  A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating new assessment techniques. Guideline elaborated by the Committee on Phoniatrics of the European Laryngological Society (ELS).

Authors:  P H Dejonckere; P Bradley; P Clemente; G Cornut; L Crevier-Buchman; G Friedrich; P Van De Heyning; M Remacle; V Woisard
Journal:  Eur Arch Otorhinolaryngol       Date:  2001-02       Impact factor: 2.503

2. (Review) Advances in the treatment of Parkinson's disease.

Authors:  Neha Singh; Viness Pillay; Yahya E Choonara
Journal:  Prog Neurobiol       Date:  2007-01-25       Impact factor: 11.685

3.  Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms.

Authors:  Akin Ozcift; Arif Gulten
Journal:  Comput Methods Programs Biomed       Date:  2011-04-30       Impact factor: 5.428

4.  Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings.

Authors:  Betul Erdogdu Sakar; M Erdem Isenkul; C Okan Sakar; Ahmet Sertbas; Fikret Gurgen; Sakir Delil; Hulya Apaydin; Olcay Kursun
Journal:  IEEE J Biomed Health Inform       Date:  2013-07       Impact factor: 5.772

5.  Parkinson's Disease and Its Management: Part 1: Disease Entity, Risk Factors, Pathophysiology, Clinical Presentation, and Diagnosis.

Authors:  George DeMaagd; Ashok Philip
Journal:  P T       Date:  2015-08

6. (Review) Hip fractures and Parkinson's disease: A case series.

Authors:  Ross Coomber; Zeiad Alshameeri; Antonio Francesco Masia; Federico Mela; Martyn J Parker
Journal:  Injury       Date:  2017-08-25       Impact factor: 2.586

7.  Drug therapy in patients with Parkinson's disease.

Authors:  Thomas Müller
Journal:  Transl Neurodegener       Date:  2012-05-24       Impact factor: 8.014

8.  Classification of Parkinson's disease utilizing multi-edit nearest-neighbor and ensemble learning algorithms with speech samples.

Authors:  He-Hua Zhang; Liuyang Yang; Yuchuan Liu; Pin Wang; Jun Yin; Yongming Li; Mingguo Qiu; Xueru Zhu; Fang Yan
Journal:  Biomed Eng Online       Date:  2016-11-16       Impact factor: 2.819

9.  Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection.

Authors:  Max A Little; Patrick E McSharry; Stephen J Roberts; Declan A E Costello; Irene M Moroz
Journal:  Biomed Eng Online       Date:  2007-06-26       Impact factor: 2.819

