Wei Fu1, Linxin Xu1, Qiwen Yu1, Jiajia Fang2, Guohua Zhao2, Yi Li1, Chenying Pan1, Hao Dong3, Di Wang3, Haiyan Ren4, Yi Guo4, Qingjun Liu1, Jun Liu1, Xing Chen1. 1. Department of Biomedical Engineering, Key Laboratory of Biomedical Engineering of Ministry of Education of China, Zhejiang University, Hangzhou, Zhejiang 310027, China. 2. Department of Neurology, the Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu City, Zhejiang Province 322000, P. R. China. 3. Research Center for Intelligent Sensing, Zhejiang Lab, Hangzhou 311100, China. 4. Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China.
Abstract
Background: Currently, Parkinson's disease (PD) diagnosis is mainly based on medical history and physical examination, and there is no objective and consistent basis. By the time of diagnosis, the disease has often progressed to the middle and late stages. Pilot studies have shown that the skin sebum of PD patients has a unique smell. This raises the possibility of a noninvasive diagnosis of PD using an odor profile. Methods: Fast gas chromatography (GC) combined with a surface acoustic wave sensor with embedded machine learning (ML) algorithms was proposed to establish an artificial intelligent olfactory (AIO) system for the diagnosis of Parkinson's through smell. Sebum samples of 43 PD patients and 44 healthy controls (HCs) from the Fourth Affiliated Hospital of Zhejiang University School of Medicine, China, were analyzed by the AIO system. Univariate and multivariate methods were used to identify the significant volatile organic compound (VOC) features in the chromatograms. ML algorithms, including support vector machine, random forest (RF), k nearest neighbor (KNN), AdaBoost (AB), and Naive Bayes (NB), were used to distinguish PD patients from HCs based on the VOC peaks in the chromatograms of sebum samples. Results: VOC peaks with average retention times of 5.7, 6.0, and 10.6 s, corresponding to octanal, hexyl acetate, and perillic aldehyde, respectively, were significantly different between PD and HC. The accuracy of the classification based on the significant features was 70.8%. Based on the odor profile, the highest accuracy and F1 of the five models during model development were 0.855 and 0.846, respectively, both from AB. The highest specificity and sensitivity of the five classifiers in the evaluation set were 91.6% from NB and 91.7% from RF and KNN, respectively. Conclusions: The proposed AIO system can be used to diagnose PD through the odor profile of sebum. Using the AIO system is helpful for the screening and diagnosis of PD and is conducive to further tracking and frequent monitoring of the PD treatment process.
Parkinson’s disease (PD) is the second most common neurodegenerative
disorder of the central nervous system in the world.[1] It has a long disease course and requires lifelong
treatment, which greatly disrupts patients’
work and life. People with PD often suffer from motor and nonmotor
symptoms that vary from patient to patient. Motor symptoms include
bradykinesia, tremor, rigidity, and postural instability, and the
nonmotor symptoms comprise depression, memory loss, anosmia, constipation,
and urinary frequency.[1−3] The reported prevalence is 65.6–12,500/100,000, 537–614/100,000,
and 51.3–176.9/100,000 in Europe,[4] North America,[5] and Asia,[6] respectively, and the Global Burden of Disease
(GBD) study predicts that it will double within the next 30 years as the population ages.[7] According to GBD data,
PD caused 211,296 deaths and 3.2 million disability-adjusted
life-years among 6 million patients in 2016.[8] There is no cure for PD, but early diagnosis and
early medical, psychological, and social interventions can
significantly improve health-related quality of life, relieve
symptoms, and prolong survival. Therefore,
the standardized diagnosis of PD is very important.[9−11]

Currently,
the diagnosis of PD is mainly based on clinical manifestations
supplemented by test rating scales, including Hoehn–Yahr stage
(H–Y),[12,13] unified PD rating scale (UPDRS),[14,15] nonmotor symptom scale,[16,17] PD-cognitive rating
scale,[18,19] and so on. Objective
and consistent diagnostic criteria have long been lacking. Dopamine transporter
single-photon emission computed tomography is currently a high-value
imaging-assisted diagnostic method,[20,21] but it
is expensive and unsuitable for routine use. Moreover, body fluid
(blood and cerebrospinal fluid) biomarkers for the diagnosis of PD
are still in the research and verification stage.[22] Pilot studies have shown that the volatile organic compounds
(VOCs) in the sebum of PD patients smell different from those of healthy people,
which provides a new approach to the diagnosis of PD.

Recently, Nazik described that the increased sebum secretion in
PD patients was associated with the increase in the production of
yeast and enzymes in the body,[23] and the
secretion of hormones may lead to seborrheic dermatitis (SD).[24] SD is considered to be one of the premotor symptoms
of PD and has an auxiliary value for diagnosing PD.[25] Trivedi used gas chromatography–mass spectrometry
(GC–MS) to prove that there were different VOCs in the sebum
of PD and healthy people.[26] These VOCs
such as perillic aldehyde and eicosane may change with the increase
of sebum secretion, and the interaction between sebum and the yeast
of the microbiome can give human skin an odor.[26−29] In addition, Tsuda used GC–MS
to analyze the smell of sebum and found that the types and concentrations
of VOCs such as dodecane, acetone, and ethyl acetate released from the
sebum of PD patients were related to UPDRS part 3,[30] which represents the severity of motor symptoms in PD. These studies
indicated that sebum gas could be used for the detection of PD.

At present, the detection and analysis of VOCs in human sebum in
scientific research mainly rely on chromatography. One of the most common
approaches combines GC with a general detection technology
or with gas sensors and electronic noses.[31−34] GC–MS is one of the most
common methods, but its bulky size, long analysis time, and high
cost make it poorly suited to clinical use. Fast GC systems, which
have been used to detect VOC markers for many diseases, are
small, easy to use, portable, and inexpensive.[35] These characteristics make it possible to perform
point-of-care testing of the sebum odor of PD patients.

In this study, GC combined with a surface acoustic wave (SAW) sensor
with embedded machine learning (ML) algorithms was proposed to establish
an artificial intelligent olfactory (AIO) system for the diagnosis
of Parkinson’s through smell. The system is fast, small,
easy to operate, portable, and low in cost.
Experimental data from 31 PD patients and 32 healthy controls (HCs)
were obtained by the AIO system. Univariate and multivariate analysis
of biomarker features was performed, and ML strategies, including
support vector machine (SVM), random forest (RF), k nearest neighbor (KNN), AdaBoost (AB), and Naive Bayes (NB), were
used to construct diagnostic biomarker-based models and odor profile-based
models. Data from 12 PD and 12 HC were used to evaluate the clinical
usage of the models. The results showed that the AIO system could
diagnose PD through the smell of sebum, indicating the potential usage
of the AIO system in clinical practice.
Results
Clinical Characteristics
The VOC
data of 31 cases of PD and 32 cases of HC were used to build the Parkinson’s
odor diagnosis model. The data of 12 cases of PD and 12 cases of HC
were used to evaluate the models. The characteristics of patients
who participated in the experiment are shown in Table 1.
Table 1
Characteristics of Participants in the Experiments

| characteristics | development cohort: PD (n = 31) | development cohort: HC (n = 32) | development cohort: P-value | validation cohort: PD (n = 12) | validation cohort: HC (n = 12) | validation cohort: P-value |
|---|---|---|---|---|---|---|
| gender (n, %) | | | | | | |
| male | 18 (58.0%) | 15 (46.9%) | 0.184 | 6 (50.0%) | 5 (41.7%) | 0.698 |
| female | 13 (41.9%) | 17 (53.1%) | | 6 (50.0%) | 7 (58.3%) | |
| age (mean ± SD) | 64.74 ± 11.31 | 64.81 ± 8.89 | 0.453 | 64.00 ± 6.86 | 57.97 ± 16.52 | 0.251 |
| BMI | 23.28 ± 2.26 | 22.06 ± 1.93 | 0.032 | 23.82 ± 2.79 | 23.72 ± 2.40 | 0.310 |
| abuse alcohol (n) | 2 | 0 | | 0 | 0 | |
| disease process (avg) | 5.23 | 0 | | 5.25 | 0 | |
| malignant tumor (n) | 0 | 0 | | 0 | 0 | |
Calibration of AIO
Calibration of AIO by Using the Mixed Solution
Figure 1 shows the
chromatographic frequency response curve of a mixed solution of nonane,
decane, undecane, dodecane, tridecane, tetradecane, and pentadecane.
The mixed solution was used to prepare standard calibration gases
(the details of preparation are given in Materials and Methods). The AIO can detect the VOCs in the
gas phase. In Figure 1, the horizontal axis is the retention time, which identifies
the different substances, and the vertical axis is the SAW frequency
response, which reflects the mass of each substance.
Figure 1
Chromatographic
frequency response curve of a mixed solution in
the AIO system: peak 1 is the internal standard at a concentration
of 1.12 mM; peaks 2, 3, 4, 5, 6, 7, and 8 are nonane, decane, undecane,
dodecane, tridecane, tetradecane, and pentadecane, respectively, and
all VOCs were diluted to parts per million of the original concentration.
Calibration of AIO by Using the Selected Biomarker Reagent
Figure 2 shows spectra for different concentrations (0.025,
0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 25, and 50 mM) of gas-phase octanal
(Figure 2A), hexyl acetate (Figure 2B),
perillic aldehyde (Figure 2C), and dodecane (Figure 2D) generated by VOC solvents. The clinical range of
concentrations was from 0.25 to 2.5 mM.[26,27] The results
showed good reproducibility of the AIO system in the detection of
these four reagents. Figure 2A shows a linear detection range of the system (R² = 0.9876, P < 0.0001) from 0.025
to 50 mM, covering the reported octanal concentrations in human sebum. The corresponding
linear fits were R² = 0.9656 (P < 0.0001) for hexyl acetate (Figure 2B),
R² = 0.9099 (P < 0.0001) for perillic aldehyde (Figure 2C), and
R² = 0.9919 (P < 0.0001) for dodecane (Figure 2D). The calibration of the AIO
system indicated that the system had good sensitivity and reproducibility
for the quantitative detection of these reagent concentrations.
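The linear calibration described above can be illustrated with an ordinary least-squares fit. The following is a minimal sketch and not the authors' analysis code: the concentration and response values are invented for illustration, and the function names `fit_line` and `concentration_from_response` are hypothetical.

```python
# Hypothetical calibration points: concentration (mM) vs. SAW peak response (kHz).
# These numbers are invented for illustration and are not the paper's data.
concentrations_mM = [0.25, 0.5, 1.0, 1.5, 2.0, 2.5]
responses_kHz = [2.6, 5.1, 9.8, 15.2, 19.9, 25.3]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def concentration_from_response(response, slope, intercept):
    """Invert the calibration line to estimate concentration from a response."""
    return (response - intercept) / slope


slope, intercept, r2 = fit_line(concentrations_mM, responses_kHz)
print(f"slope={slope:.3f} kHz/mM, intercept={intercept:.3f} kHz, R^2={r2:.4f}")
print(f"10 kHz -> {concentration_from_response(10.0, slope, intercept):.2f} mM")
```

Inverting the fitted line in this way is how a measured peak response would be mapped back to an approximate concentration, provided the response lies inside the calibrated range.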
Figure 2
Calibrations
of AIO by different reagent concentrations of the
gas phase (0.025, 0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 25, and 50 mM): (A)
responses of AIO to the concentrations of nine different samples of
octanal and the detection spectra of different octanal concentrations
in the gas phase; (B) responses of AIO to the concentrations of nine
different samples of hexyl acetate and the detection spectra of different
hexyl acetate concentrations in the gas phase; and (C) responses of
AIO to the concentrations of nine different samples of perillic aldehyde
and the detection spectra of different perillic aldehyde concentrations
in the gas phase. (D) Responses of AIO to the concentrations of nine
different samples of dodecane and the detection spectra of different
dodecane concentrations in the gas phase. The linear correlation with
red dashed lines represents the fitting with 95% confidence interval.
Selection of Significant Features Which Classify PD
The data detected by the AIO system were preprocessed
(adaptive filtering for denoising, baseline correction,
drift compensation, and normalization). We then performed
statistical tests on the four extracted features to detect
differences between the PD and HC groups. To evaluate the performance
of these biomarkers, we applied nonparametric tests to the data of the discovery cohort.
Octanal and hexyl acetate
differed significantly in the Kolmogorov–Smirnov Z test (p < 0.05). Perillic aldehyde
was lower in PD samples in the box plots and differed significantly
in the Mann–Whitney U test (p < 0.05). However, dodecane was not significantly different in
either test (p > 0.05). The results of the nonparametric
tests are shown in Table S2. The area under
the curve (AUC) and box plots of the three markers, octanal, hexyl
acetate, and perillic aldehyde, are shown in Figure 3.
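The Mann–Whitney U statistic and the single-feature AUC used above are closely related: the AUC equals U divided by the product of the two group sizes. The sketch below illustrates that relationship under stated assumptions: the peak intensities are invented, the function names are hypothetical, and no p-value is computed (a statistics package such as SciPy, with `scipy.stats.mannwhitneyu` and `scipy.stats.ks_2samp`, would normally supply it).

```python
def rank_with_ties(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U for group_a: number of (a, b) pairs with a > b; ties count 0.5."""
    ranks = rank_with_ties(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


def auc_from_u(positive, negative):
    """AUC = U / (n_pos * n_neg): chance that a positive outranks a negative."""
    return mann_whitney_u(positive, negative) / (len(positive) * len(negative))


# Invented normalized peak intensities for one VOC feature (not the paper's data).
pd_peak = [0.82, 0.91, 0.77, 1.05, 0.95]
hc_peak = [0.70, 0.88, 0.65, 0.74, 0.79]

print("U =", mann_whitney_u(pd_peak, hc_peak))
print("AUC =", auc_from_u(pd_peak, hc_peak))
```

For a marker that is lower in PD, as reported for perillic aldehyde, the same calculation applies with the groups swapped, or equivalently with 1 − AUC.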
Figure 3
ROC curves and box plots of three biomarker
features for the discovery
cohort. The ROC curve comprehensively considers the characteristics
of sensitivity and specificity. Box plots show a comparison of means
of log scaled peak intensities of these analytes, where black dots
are outliers. In the box plots, the green on the left represents HC,
and the orange on the right represents PD patients.
Model Based on Significant Features Which Classify PD
Table 2 shows the classification performance of the model built on the three marker features
of the development cohort (31 PD and 32 HC).
We used 80% of the data for the training set and 20% for the
validation set. The accuracy of the model was 82.6%. The F1 of 0.771
also suggested that the model was reasonably robust. Then, we used a blind
experiment to diagnose 12 PD and 12 HC, evaluated the results with clinical
indicators, and found that the accuracy was 70.8%. The sensitivity
was 91.7%, which means that the missed-diagnosis (false-negative) rate was less than
10%. However, the specificity of the model was only 50%. As shown
in Figure 4, the AUCs (0.754 for the development cohort and 0.646 for the validation cohort) indicated
that the model established by the AIO system could discriminate PD from HC.
Table 2
Model Evaluation Parameters and Clinical Application Evaluation Results

| cohort | accuracy | recall (sensitivity) | precision | F1 | specificity |
|---|---|---|---|---|---|
| development cohort | 0.826 | 0.982 | 0.836 | 0.771 | |
| validation cohort | 0.708 | 0.917 | | | 0.500 |
Figure 4
ROC curve of
the development cohort and validation cohort based
on significant features: X-axis: false positive rate, Y-axis: true positive rate, purple line: development cohort,
and pink line: validation cohort. The AUCs were 0.754 and 0.646, respectively.
Five Different Classifiers for the Classification of PD Patients by the Odor Profile
Table 3 shows the classification performance of the models.
We used 80% of the data for the training dataset and 20% for the validation dataset.
The odor profile diagnosis model established by
AB obtained the highest accuracy, 0.855. Except for the
model established by KNN, the accuracies of all classifiers
exceeded 0.800. The best F1 was 0.846, also from AB. The RF
model had the highest recall, 0.981, and the SVM model
had the highest precision, 0.882.
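The 80%/20% split and one of the five classifier families (KNN) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' pipeline: the two features per sample are randomly generated stand-ins for normalized peak intensities, the split is stratified by class, and k = 3 is an arbitrary choice; none of these details is given in the text.

```python
import math
import random
from collections import Counter


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, val_idx), splitting each class with the same fraction."""
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(members)
        cut = round(train_fraction * len(members))
        train_idx.extend(members[:cut])
        val_idx.extend(members[cut:])
    return train_idx, val_idx


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    neighbours = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in neighbours)
    return votes.most_common(1)[0][0]


# Synthetic development cohort with the same group sizes as the study.
rng = random.Random(1)
labels = ["PD"] * 31 + ["HC"] * 32
features = [
    [
        rng.gauss(1.0 if y == "PD" else 0.6, 0.2),  # made-up peak 1 intensity
        rng.gauss(0.4 if y == "PD" else 0.8, 0.2),  # made-up peak 2 intensity
    ]
    for y in labels
]

train_idx, val_idx = stratified_split(labels)
train_x = [features[i] for i in train_idx]
train_y = [labels[i] for i in train_idx]
correct = sum(
    knn_predict(train_x, train_y, features[i]) == labels[i] for i in val_idx
)
print(f"validation accuracy: {correct}/{len(val_idx)}")
```

With 63 samples, a single 20% hold-out contains only about a dozen cases, so each misclassification shifts the accuracy by several percentage points.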
Table 3
Model Evaluation Parameters

| model | accuracy | recall | precision | F1 |
|---|---|---|---|---|
| SVM | 0.812 | 0.865 | 0.882 | 0.813 |
| RF | 0.841 | 0.981 | 0.836 | 0.818 |
| KNN | 0.754 | 0.980 | 0.754 | 0.689 |
| AB | 0.855 | 0.960 | 0.857 | 0.846 |
| NB | 0.841 | 0.924 | 0.875 | 0.835 |
The relationship between specificity and sensitivity was obtained
by plotting the ROC curve. As shown in Figure 5A, the area under the ROC curve indicated
that the model established by the AIO system through SVM had good
accuracy.
Figure 5
ROC curve analysis to evaluate the performance of different classifier
construction models. Each colored line represents the ROC curve of
the Parkinson’s odor diagnosis model constructed by different
classifiers: (A) development cohort: the ROC curve of the model. The
AUCs of five classifiers (SVM, RF, KNN, AB, and NB) are 0.808, 0.868,
0.800, 0.929, and 0.914, respectively and (B) validation cohort: the
ROC curve of the medical diagnostic tests. The AUCs of five classifiers
are 0.681, 0.819, 0.729, 0.826, and 0.698, respectively.
Medical Diagnostic Tests for the Established Model
Considering that the odor diagnosis model of PD had
a better classification effect, the clinical application of the model
was evaluated. To keep the test dataset uncontaminated, the
researchers were blinded in this trial, and all participants’
information was held by a third party, the Fourth Affiliated
Hospital of Zhejiang University School of Medicine. After the analysis
and model diagnosis experiment, the doctor sent the participants’
information to the experimenter for statistical analysis to evaluate
the clinical application value of the model. The results of the evaluation
of the clinical application for the five models are shown in Table 4. The ROC curve of
the results is shown in Figure 5B.
Table 4
Results of the Evaluation of Clinical Application for Five Different Classifiers

| model | sensitivity (%) | specificity (%) | accuracy (%) | +PV (%) | −PV (%) |
|---|---|---|---|---|---|
| SVM | 66.7 | 66.7 | 66.7 | 66.7 | 66.7 |
| RF | 91.7 | 66.7 | 79.2 | 73.3 | 88.9 |
| KNN | 91.7 | 16.7 | 54.2 | 52.4 | 66.7 |
| AB | 83.3 | 66.7 | 75.0 | 71.4 | 80.0 |
| NB | 66.7 | 91.6 | 62.5 | 80.0 | 57.9 |
Table 4 shows the
results of the evaluation of the clinical application. We blindly
diagnosed 12 PD and 12 HC. The odor diagnosis model constructed by
RF obtained the highest accuracy, 0.792. The
ensemble-learning classifiers, RF and AB, had higher
accuracy than the others. The highest sensitivity was 0.917, from RF and
KNN. NB achieved the highest specificity, 0.916. The highest +PV,
which doctors care about in clinical situations, was 0.800 from NB,
and the highest −PV was 0.889 from RF. That is, positive predictions
were most often correct for NB, and negative predictions were most often
correct for RF. However, SVM did not classify well. The ROC curve
in Figure 5B also showed
that the ensemble classifiers performed better, in line with
their results in the clinical evaluation.
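The five clinical indicators in Table 4 all follow from the four cells of a confusion matrix. As a worked illustration, the sketch below recomputes the RF row from counts that are inferred here, not reported in the paper: with 12 PD and 12 HC, the RF percentages are consistent with 11 true positives, 1 false negative, 8 true negatives, and 4 false positives. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy, +PV and -PV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
        "+PV": 100 * tp / (tp + fp),
        "-PV": 100 * tn / (tn + fn),
    }


# Counts inferred from the RF row of Table 4 (12 PD, 12 HC). The paper reports
# only the percentages, so these counts are an assumption.
rf = diagnostic_metrics(tp=11, fn=1, tn=8, fp=4)
for name, value in rf.items():
    print(f"{name}: {value:.1f}%")
```

The same function applied to 10 true positives, 2 false negatives, 8 true negatives, and 4 false positives gives the percentages listed for AB.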
Discussion
This research proposed fast GC combined
with a SAW sensor with
embedded ML algorithms to build an AIO system to diagnose PD through
smell. ML was used to classify the sebaceous skin gas of PD patients
based on the peaks in the chromatogram. Three
biomarkers (octanal, hexyl acetate, and perillic aldehyde) differed significantly between
PD patients and the control group. Using the three VOC biomarkers
and the odor profile collected by the AIO system, the accuracies of
classification between PD and HC were 70.8 and 79.2%, respectively.

The three significant biomarkers might reflect different metabolic
pathways in PD patients. Hexyl acetate was usually found in many fruits
and alcoholic beverages.[36,37] Perillic aldehyde was
used in perfumes, cosmetics, and food.[26] The concentrations of hexyl acetate in HC were lower than those
of PD, but the concentrations of perillic aldehyde in HC were higher
than those in PD. It could be speculated that PD had a special metabolic
ability for these two lipid hydrophobic metabolites (hexyl acetate
and perillic aldehyde). SD is a typical nonmotor symptom of PD and
develops through increased sebum excretion and proliferation
of Malassezia yeasts.[38] Increased sebum excretion in PD patients would increase
yeast and the production of enzymes, leading to sebum inflammation.[23] Both neural and epidermal tissues originate
from the ectoderm. Gioti described that the growth of Malassezia yeasts requires specific exogenous lipids,
which might be related to the increase of lipophilic molecules.[39] Both hexyl acetate and perillic aldehyde are
lipophilic and insoluble in water, so they might be
required for the growth of Malassezia, which might be related to PD. Hirayama pointed out that PD could
cause abnormal sweat glands, leading to excessive sweating and night
sweats.[40] Octanal was a common marker in
human skin sweat.[41,42] Moreover, Agapiou showed that
octanal might be related to oxidative stress.[43] Besides, Puspita demonstrated that oxidative stress was one of the
pathogeneses of PD.[44] Therefore, oxidative
stress might cause an increase of the octanal content on the skin
surface of PD patients. All potential explanations for changes in
odor in PD indicated that the changes in skin physiology and skin
metabolomics were highly specific to PD. This made them useful as
biomarkers for identifying patients with PD. Besides, the clinical
concentration range was expected to be 0.25–2.5 mM according
to the experimental results from Trivedi’s studies. Our results
were consistent with their reports, since the response of the
AIO ranged from 1 to 25 kHz in most of the chromatograms,
which corresponds to 0.25–2.5 mM of the various markers according to the
calibration curves.

The classification model based on the sebum odor
profile performed well. The classification accuracies for PD and HC
were 70.8% using the significant biomarker
features and 79.2% using the odor profile. Across the five odor profile models,
the highest sensitivity was 91.7%, the highest specificity was 91.6%, and
the highest AUC was 0.826. It could be inferred that some peaks that were
not individually significant on the chromatogram were still included in the
algorithm when classifying based on the odor profile. Because the odor profile
covers many VOC peaks, these insignificant
peaks also contributed to the classification, improving accuracy,
sensitivity, and specificity.

The fast, easy-to-use, and portable AIO system can balance sensitivity,
specificity, linearity, dynamic range, detection limit, detection
efficiency, and discrimination ability. Compared with the GC–MS,
liquid chromatography–MS, or paper spray ionization coupled
with ion mobility-MS, the AIO system greatly improves the detection
speed and reduces the detection cost.[26,27,30,45,46] Compared with traditional clinical PD diagnosis methods, the AIO
system is a fast and noninvasive method. It can also be widely used
in hospitals, clinics, and homes as a PD patient screening or family
self-health check method. PD is a chronic neurodegenerative disease
caused by the loss of dopaminergic neurons in the substantia nigra.
Therefore, when PD patients develop motor symptoms, they have already
lost dopaminergic neurons.[47] Because neurodegeneration
may progress quickly, it is essential to identify PD before
widespread neuron loss occurs, and the AIO provides a possible solution.

However, there were several limitations to the present study. First,
AIO used fast GC to separate mixed VOCs. The GC method has a limitation.
According to the principle of GC separation, each peak represented
a pure chemical compound that had a unique retention time to be recognized.
However, in some circumstances, two or more compounds had quite close
retention times, which would cause overlaps of the peaks in the fast
GC separation. We used these retention times to identify the biomarkers
(Table S2). However, it is hard to avoid
the situation when the interfering compounds’ retention time
was the same as that of the biomarkers. Second, the diagnostic accuracy
based on the classification largely depended on the size of the training
set and the representativeness of the sample population. In our study
population, the PD and HC groups were deliberately kept
roughly balanced, which may have favored
high classification accuracy. This controlled
balance does not represent the prevalence
of PD in real clinical settings, which limits the
utility of the model. Third, the metabolic mechanism of hexyl acetate
and perillic aldehyde in the organism was not clear. Moreover, the
difference in dodecane, which was reported to be related to
UPDRS part 3, was not observed in this study, suggesting the existence of
interfering factors such as race. Nonetheless, this study provided
some ideas worthy of further study. If more research supported our
hypothesis, additional biomarkers such as hexyl acetate, perillic
aldehyde, octanal, and dodecane could be incorporated into the risk
score model to identify individuals at a high risk of PD. From a future
perspective, when the high sensitivity and specificity of the AIO
system is reproduced within more extensive studies, the AIO system
might assist clinicians in monitoring the extent of PD in patients
who have PD or who may be at a high risk of PD.
Conclusions
In conclusion, we proposed a fast GC system combined with the SAW
sensor with embedded ML algorithms to establish an AIO system for
the diagnosis of PD through smell. This method presents a new possibility
for the early diagnosis of PD. Compared with olfactory testing, sleep
testing, and other solutions, the combination of the AIO system and
ML may produce a new method of gaseous-assisted diagnosis of PD with
an improved detection speed and a reduced detection cost. Moreover,
the AIO system is a fast, easy-to-use, and noninvasive method that
can be widely used in hospitals, clinics, and homes to screen for and diagnose
PD and to monitor its treatment.
Materials and Methods
Artificial Intelligence Olfactory System
Design of the Fast GC System
The AIO system is a combination of GC and a SAW sensor. The system comprises
three modules: a gas injection and preconcentration module, a chromatographic
separation module (GC), and a sensor detection module (SAW). As shown
in Figure 6A, an adsorbent
tube filled with 10 mg of Tenax TA (60/80 mesh, purchased from Analytical
Columns, Croydon, England), a six-way valve, and a vacuum pump were
used to preconcentrate the sample gases for injection. A 1 m long
DB-1 capillary column (cut from an Agilent column of 10 m × 0.1
mm × 0.33 mm) with a direct resistive heating component (manufactured
by Ningbo Oulaike Metal Capillary Technology Co., Ltd., Ningbo, China)
was used to rapidly separate the compounds in the sample gases. A 36° Y–X cut quartz substrate double-ended
resonant Rayleigh wave gas sensor with a center frequency of 500 MHz
(manufactured by Hua Ying Electronics, Wukang, China) was used to
detect the separated compounds with a sensitivity of −69,766
Hz/ng to mass deposition. The AIO system connects to the
host computer via Bluetooth.
Figure 6
(A) System design of the AIO system; solid red
line: sampling mode
system operating gas path; dotted blue line analyzing mode system
operating gas path. (B) Process of clinical experiments: (1) basic
information about the participants was recorded; (2) the gauze was
placed on the participants’ back to extract the skin sebum
VOCs, and then, the sample gauze was placed in a glass bottle with
an inert brown background gas; (3) the bottles were transported in
ice packs; (4) the samples were taken back to the laboratory and placed
in the refrigerator; and (5) analytical experiments were carried out
on the collected samples by the AIO system to obtain odor profiles
(created with BioRender.com).
The AIO system has two working
states controlled by a six-way valve:
the sampling mode and the analyzing mode. The six-way valve
is the key component of the AIO system. Under normal conditions,
the six-way valve was in the sampling mode, as indicated by the solid
red line in Figure 6A. Driven by the vacuum pump, the gas sample flowed from
the gas inlet to the adsorbent tube, where it was adsorbed.
After adsorption, the six-way valve was switched to
the analyzing mode, as indicated by the dotted blue line in Figure 6A. By instantaneously
heating the adsorbent tube, the analyte was desorbed
and flowed into the DB-1 capillary column for subsequent separation.
The separated substances were then detected by mass loading on
the SAW sensor.
Design of the Surface Acoustic Wave Sensor
A 36° Y–X cut quartz
substrate double-ended resonant Rayleigh wave gas sensor with a center
frequency of 500 MHz was used to detect the mass loading of the compounds
separated by GC. The specific parameters of the SAW sensor are shown
in Table 5. According
to the mass deposition effect formula of the SAW device, the mass
sensitivity of the sensor was calculated as −69,766
Hz/ng. The SAW sensor vibration spectrum measured by the spectrum
analyzer is shown in Figure S2, and an
image of the SAW sensor is shown in Figure S1.
Table 5
Parameters of the SAW Sensor
index | parameter
substrate materials | 36° Y–X quartz
electrode material | aluminum
electrode thickness | 200 nm
central frequency | 500 MHz
input/output transducers | 50.5 pairs
reflectors | 350 in each side
transducer aperture | 800 μm
input/output transducer cycle (λ) | 6.3 μm
reflector cycle | λ
reflector and transducer spacing | λ
input/output transducer spacing | 1.25λ
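As a rough numerical check on Table 5 and the stated sensitivity, the sketch below applies two textbook relations: the wave velocity as the product of center frequency and wavelength, and the deposited mass as the frequency shift divided by the mass sensitivity. It assumes that the transducer cycle λ equals the acoustic wavelength at the center frequency and that the sensor response stays linear in mass. The function names and the 1 kHz example shift are illustrative and are not taken from the AIO system.

```python
# Values quoted in the text and in Table 5.
CENTER_FREQUENCY_HZ = 500e6             # central frequency, 500 MHz
WAVELENGTH_M = 6.3e-6                   # transducer cycle (lambda), 6.3 um
MASS_SENSITIVITY_HZ_PER_NG = -69_766.0  # negative: added mass lowers the frequency


def rayleigh_velocity_m_per_s(frequency_hz=CENTER_FREQUENCY_HZ,
                              wavelength_m=WAVELENGTH_M):
    """Phase velocity implied by v = f * lambda."""
    return frequency_hz * wavelength_m


def deposited_mass_ng(frequency_shift_hz,
                      sensitivity_hz_per_ng=MASS_SENSITIVITY_HZ_PER_NG):
    """Mass loading implied by a frequency shift, assuming a linear response."""
    return frequency_shift_hz / sensitivity_hz_per_ng


if __name__ == "__main__":
    print(f"implied wave velocity: {rayleigh_velocity_m_per_s():.0f} m/s")
    # Hypothetical 1 kHz drop in resonance frequency (not a measured value).
    print(f"mass for a -1000 Hz shift: {deposited_mass_ng(-1000.0) * 1000:.1f} pg")
```

With the tabulated values, the implied velocity is about 3150 m/s, and a 1 kHz frequency drop would correspond to roughly 14 pg of deposited mass under the linearity assumption.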
Operation Method of the AIO System
The initial temperatures of the injection port,
six-way valve, capillary
column, and sensor were set to 80, 130, 30, and 35
°C, respectively. The flow rate of the helium carrier gas was fixed at 1 mL/min.
After preheating, the gas path was set to the sampling mode. The sampling
time was 20 s, and the analysis time was 30 s. During sampling, about 20 mL of
sample gas was adsorbed into the Tenax TA adsorption tube. For gas
analysis, the Tenax TA adsorption tube was heated. Next, the valve
was used to change the gas path to the analyzing mode. The
temperature of the capillary column was maintained at 30 °C for
1 s and then increased to 120 °C at a rate of 6 °C/s. Finally,
the SAW sensor was heated to 105 °C for cleaning. All the procedures,
including sampling, separation, detection, and cooling, took 90 s
in one analysis cycle. The ambient temperature in the laboratory
was controlled at about 22 °C.
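The timing implied by these settings can be worked out with a short sketch. It assumes a linear ramp, a hold at 120 °C once the ramp ends, and sampling and analysis windows that do not overlap; none of these is stated explicitly in the text. The function and constant names are hypothetical.

```python
# Values taken from the operating description above.
SAMPLING_TIME_S = 20.0
ANALYSIS_TIME_S = 30.0
CYCLE_TIME_S = 90.0
SAMPLE_VOLUME_ML = 20.0


def column_temperature_c(t_s, start_c=30.0, hold_s=1.0,
                         rate_c_per_s=6.0, final_c=120.0):
    """Column temperature t_s seconds into the analysis window.

    Assumes a linear ramp after the initial hold and a hold at final_c
    afterwards (the hold at the end is an assumption).
    """
    if t_s <= hold_s:
        return start_c
    return min(final_c, start_c + rate_c_per_s * (t_s - hold_s))


def time_to_final_temperature_s(start_c=30.0, hold_s=1.0,
                                rate_c_per_s=6.0, final_c=120.0):
    """Seconds from the start of analysis until the column reaches final_c."""
    return hold_s + (final_c - start_c) / rate_c_per_s


# Average flow through the adsorption tube during sampling.
mean_sampling_flow_ml_per_min = SAMPLE_VOLUME_ML / SAMPLING_TIME_S * 60.0

# Time left in one cycle for the remaining steps such as cleaning and cooling,
# assuming sampling and analysis run back to back.
remaining_time_s = CYCLE_TIME_S - SAMPLING_TIME_S - ANALYSIS_TIME_S

if __name__ == "__main__":
    print(time_to_final_temperature_s())   # ramp finishes inside the 30 s window
    print(mean_sampling_flow_ml_per_min)
    print(remaining_time_s)
```

Under these assumptions the column reaches 120 °C after 16 s, the average sampling flow is 60 mL/min, and 40 s of each 90 s cycle remain for the other steps.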
Calibrating AIO by Standard Samples
Reagents and Materials
Octanal (C8H16O, 98.0%) was purchased from Tokyo Chemical
Industry. Perillic aldehyde (C10H14O, 95.0%)
was purchased from Shanghai Yuanye Biological Technology Co., Ltd.
Hexyl acetate (C8H16O2, 98.0%), nonane (C9H20, 98.0%), decane (C10H22, 99.0%), undecane (C11H24, 99.5%), dodecane
(C12H26, 98.0%), and tetradecane (C14H30, 98.0%) were obtained from Aladdin Industrial Corporation. Tridecane (C13H28, 98.0%) and pentadecane
(C15H32, 98.0%) were purchased from Macklin.
All solutions were stored at 4 °C and protected from light.
Preparation of Standard Samples
The standard samples served two purposes. The first
was to test the AIO system’s ability to separate
mixed substances. Standard solutions of nonane, decane, undecane,
dodecane, tridecane, tetradecane, and pentadecane were combined into
a mixed solution, and all solutions were diluted to parts per million
of the original concentration. The second was to calibrate
the linearity and repeatability of the AIO system. Standard samples of
four biomarkers (octanal, hexyl acetate, perillic aldehyde, and dodecane)
were used for this calibration. Each biomarker reagent was diluted
with its solvent to nine concentrations: 0.025, 0.25, 0.5, 1.0,
1.5, 2.0, 2.5, 25, and 50 mM.
Ethanol was the solvent for octanal, perillic aldehyde, and dodecane,
and methanol was the solvent for hexyl acetate.
The AIO system was then calibrated by injecting samples (0.02
μL) with a microinjector (1 μL, manufactured by Shanghai
High Pigeon Industry and Trade Co. Ltd.). Each concentration was measured
3 times, and linear regression analysis was performed (Section ).
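The calibration procedure above (nine concentrations, three injections each, then a linear fit of the mean response) can be sketched as follows. The concentrations and the 0.02 μL injection volume come from the text. The sensor responses are made-up numbers generated from a hypothetical straight line, used only so that the example runs; they are not measurements from the AIO system.

```python
from statistics import mean

# Nine calibration concentrations from the text, in mM.
CONCENTRATIONS_MM = [0.025, 0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 25.0, 50.0]
INJECTION_VOLUME_UL = 0.02

# Hypothetical triplicate responses in Hz: a 120 Hz/mM line with a 15 Hz
# offset and a fixed spread of -2, 0, and +2 Hz (illustrative only).
triplicates_hz = [[120.0 * c + 15.0 + d for d in (-2.0, 0.0, 2.0)]
                  for c in CONCENTRATIONS_MM]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Average the three injections at each concentration, then fit.
mean_response_hz = [mean(reps) for reps in triplicates_hz]
slope, intercept, r = fit_line(CONCENTRATIONS_MM, mean_response_hz)

# Amount injected per run: 1 mM x 1 uL = 1 nmol.
injected_nmol = [c * INJECTION_VOLUME_UL for c in CONCENTRATIONS_MM]

if __name__ == "__main__":
    print(f"slope = {slope:.3f} Hz/mM, intercept = {intercept:.3f} Hz, R = {r:.4f}")
```

Because the concentrations span more than three orders of magnitude, the two highest points dominate an unweighted fit; whether the original analysis applied any weighting is not stated.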
Clinical Experiment Design
The experiments
were conducted in the Fourth Affiliated Hospital of Zhejiang University
School of Medicine, Hangzhou, China. This study was approved by the
ethics committee of the Fourth Affiliated Hospital of Zhejiang University
School of Medicine (approval no. K2020052, 15 June 2020). A total
of 87 participants were recruited, including 43
PD patients and 44 HCs. The patients were diagnosed with
PD and staged by a neurologist at the hospital according to the Hoehn and Yahr (H–Y)
clinical staging scale. The early stage comprises stages
I and II, and the middle and late stages comprise stages III–V.
Basic clinical data (patients’ ID, age, weight, drug exposure
history, years of education, high-protein diet, smoking and drinking
history, cerebrovascular history, disease duration, and skin condition)
were collected. Each participant signed informed consent.
To ensure the accuracy of the results, subjects were required to comply
with the following requirements: (1) no bathing, body lotion, or other
cosmetics within 12 h before sample collection; (2) no exercise
within 12 h before sample collection;
(3) no perfume within 12 h before sample collection; and
(4) fasting for 4 h before collection.
First, because most of the subjects were elderly, paper
questionnaires were used to record relevant information of the participants,
and the information was organized and stored on the computer. Second,
each participant was swabbed using a medical gauze (7.5 × 7.5
cm) on the upper back to collect sebum samples. The gauze with the
sebum sample was sealed in background-inert plastic bags and transported
from the hospital to the laboratory in ice packs. The samples
were stored at −25 °C. Next, the method described above
(Section ) was used to set up the AIO system for the chromatographic analysis
of the samples. Finally, the data obtained by the analysis were
displayed on the host computer via Bluetooth. The process of the clinical
experiments is shown in Figure 6B.
Data Analysis
VOC data and the clinical data of
each volunteer were recorded in a dataset. Statistical analysis was
performed with the Statistical Package for the Social Sciences (PASW Statistics
18, IBM Corp., Armonk, NY, USA), GraphPad (Prism 5, GraphPad Software,
Inc., La Jolla, CA, USA), and Python (Python 3.7, Python Software
Foundation) for Windows.
Linear regression was used
to fit the average sensor response to the concentrations of the three
calibration reagent solutions (octanal, hexyl acetate, and perillic
aldehyde). The goodness of fit was evaluated using the correlation
coefficient R. The Mann–Whitney U test and the Kolmogorov–Smirnov
Z test were used to select the significant features. Differences
with a p-value less than 0.05 were considered
statistically significant. Python was used to create the five
classifiers (SVM, RF, KNN, AB, and NB). The classifier parameters
were selected with GridSearchCV and were as follows: a
linear kernel for SVM, 25 trees for
RF, and k = 10 for KNN in all experiments.
The area under the receiver operating characteristic (ROC) curve (AUC) was used as a performance indicator for
binary classification.
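The analysis steps described above can be sketched with SciPy and scikit-learn as shown below. This is a minimal illustration on synthetic data, not the authors' pipeline: the peak table is randomly generated, the rule of keeping a peak only when both tests give p < 0.05 is an assumption, and the parameter grids, the 70/30 split, and the 5-fold cross-validation are illustrative choices. Only the group sizes, the two tests, the five classifier types, and the use of GridSearchCV and AUC come from the text.

```python
import numpy as np
from scipy.stats import ks_2samp, mannwhitneyu
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the peak table: 43 PD and 44 HC samples, 6 VOC peaks.
# Only the first three peaks are shifted between groups (demo assumption).
n_pd, n_hc, n_peaks = 43, 44, 6
X = rng.normal(size=(n_pd + n_hc, n_peaks))
y = np.array([1] * n_pd + [0] * n_hc)
X[y == 1, :3] += 1.0

# Univariate screening: keep peaks where both tests give p < 0.05 (assumed rule).
significant = []
for j in range(n_peaks):
    pd_vals, hc_vals = X[y == 1, j], X[y == 0, j]
    p_mwu = mannwhitneyu(pd_vals, hc_vals, alternative="two-sided").pvalue
    p_ks = ks_2samp(pd_vals, hc_vals).pvalue
    if p_mwu < 0.05 and p_ks < 0.05:
        significant.append(j)

# Hold out an evaluation set; the classifiers here use the full peak profile.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Candidate grids are illustrative; the text reports only the selected values
# (linear kernel, 25 trees, k = 10).
models = {
    "SVM": (SVC(probability=True, random_state=0), {"kernel": ["linear", "rbf"]}),
    "RF": (RandomForestClassifier(random_state=0), {"n_estimators": [10, 25, 50]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [5, 10, 15]}),
    "AB": (AdaBoostClassifier(random_state=0), {"n_estimators": [25, 50]}),
    "NB": (GaussianNB(), {"var_smoothing": [1e-9, 1e-7]}),
}

results = {}
for name, (estimator, grid) in models.items():
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    scores = search.predict_proba(X_test)[:, 1]  # probability of the PD class
    results[name] = {
        "best_params": search.best_params_,
        "test_accuracy": search.score(X_test, y_test),
        "test_auc": roc_auc_score(y_test, scores),
    }

for name, res in results.items():
    print(name, res)
```

With a cohort of this size, the held-out estimates vary noticeably with the random split, so repeated or nested cross-validation would give a more stable picture than a single split.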