Junmo Kwon1,2, Hyebin Lee1,2, Soohyun Cho3, Chin-Sang Chung3, Mi Ji Lee4, Hyunjin Park5,6.
Abstract
Classification of headache disorders depends on subjective self-reports from patients and their interpretation by physicians. We aimed to apply objective, data-driven machine learning approaches to analyze patient-reported symptoms and to test the feasibility of automated classification of headache disorders. The self-report data of 2162 patients were analyzed. Headache disorders were merged into five major entities. The patients were divided into training (n = 1286) and test (n = 876) cohorts. We trained a stacked classifier model with four layers of XGBoost classifiers. The first layer distinguished migraine from others, the second distinguished tension-type headache (TTH) from others, the third distinguished trigeminal autonomic cephalalgia (TAC) from others, and the fourth distinguished epicranial headache from thunderclap headache. Each layer selected different features from the self-reports using the least absolute shrinkage and selection operator (LASSO). In the test cohort, our stacked classifier achieved an accuracy of 81%, sensitivities of 88%, 69%, 65%, 53%, and 51%, and specificities of 95%, 55%, 46%, 48%, and 51% for migraine, TTH, TAC, epicranial headache, and thunderclap headache, respectively. We showed that a machine learning-based approach is applicable to analyzing patient-reported questionnaires. Our results could serve as a baseline for future studies in headache research.
Year: 2020 PMID: 32820214 PMCID: PMC7441379 DOI: 10.1038/s41598-020-70992-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Distribution of primary headache subtypes.
| Information | Training cohort (n = 1,286) | Test cohort (n = 876) | p value |
|---|---|---|---|
| Age, mean (SD) | 47 (15) | 45 (15) | 0.1092 |
| Age, range (IQR) | 11–90 (35–57) | 14–88 (34–56) | |
| Sex, male:female | 373:913 | 275:601 | 0.2534 |
| Migraine | 864 | 600 | 0.8329 |
| Tension-type headache | 144 | 91 | |
| Trigeminal autonomic cephalalgia | 79 | 57 | |
| Epicranial headache | 104 | 61 | |
| Thunderclap headache | 95 | 67 | |
Values are reported as mean with standard deviation in parentheses. p values were obtained from the Kolmogorov–Smirnov test for continuous information and the Chi-square test for categorical information.
SD standard deviation, IQR inter-quartile range.
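The categorical p values in the table can be approximately recomputed from the counts alone. The sketch below is a plain-Python Pearson chi-square, not the authors' code. The function names are illustrative, and the continuity (Yates) correction for the 2 x 2 sex table is an assumption, since the source does not say which variant was used.

```python
import math

def chi_square(table, yates=False):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Continuity correction (assumed here for the 2 x 2 table only).
                diff = max(diff - 0.5, 0.0)
            stat += diff * diff / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def chi_square_p(stat, dof):
    """Upper-tail p value; closed forms cover only the two cases needed here."""
    if dof == 1:
        return math.erfc(math.sqrt(stat / 2))
    if dof == 4:
        return math.exp(-stat / 2) * (1 + stat / 2)
    raise ValueError("only dof 1 and 4 are implemented in this sketch")

# Rows are cohorts (training, test); columns are categories from the table.
sex = [[373, 913], [275, 601]]
subtypes = [[864, 144, 79, 104, 95], [600, 91, 57, 61, 67]]

p_sex = chi_square_p(*chi_square(sex, yates=True))
p_subtype = chi_square_p(*chi_square(subtypes))
print(round(p_sex, 4), round(p_subtype, 4))
```

Under these assumptions both values land close to the reported 0.2534 and 0.8329. The Kolmogorov–Smirnov p value for age cannot be checked this way because it requires the individual ages.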
Figure 1. Structure of the stacked classifier model. TTH tension-type headache, TAC trigeminal autonomic cephalalgia.
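One way to read the stacked structure is as a cascade: each layer either assigns its target subtype or passes the patient on to the next layer. The sketch below shows only that routing logic. It assumes "others" are forwarded to the following layer, and the four rule functions are hypothetical placeholders standing in for the trained XGBoost classifiers, not the learned models.

```python
from typing import Callable, Mapping

Features = Mapping[str, int]          # symptom name -> 0/1 answer
BinaryModel = Callable[[Features], bool]

def classify_headache(
    x: Features,
    is_migraine: BinaryModel,
    is_tth: BinaryModel,
    is_tac: BinaryModel,
    is_epicranial: BinaryModel,
) -> str:
    """Route one patient through the four layers in order."""
    if is_migraine(x):        # layer 1: migraine vs. others
        return "migraine"
    if is_tth(x):             # layer 2: TTH vs. others
        return "TTH"
    if is_tac(x):             # layer 3: TAC vs. others
        return "TAC"
    # layer 4: epicranial vs. thunderclap headache
    return "epicranial" if is_epicranial(x) else "thunderclap"

# Hypothetical stand-in rules, for illustration only.
rules = (
    lambda x: x.get("nausea_vomiting", 0) == 1 and x.get("photophobia", 0) == 1,
    lambda x: x.get("vague_cloudy_pain", 0) == 1,
    lambda x: x.get("lacrimation", 0) == 1,
    lambda x: x.get("retroauricular_location", 0) == 1,
)

print(classify_headache({"lacrimation": 1}, *rules))
```

A consequence of this design is that errors made in an early layer cannot be corrected later, so the order of the layers matters.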
Selected features from different layers of the classifier.
| Layer | Selected features |
|---|---|
| First (migraine classifier), 32 features | Mode of onset: gradual (1st), female sex (2nd), absence of lacrimation (3rd), nausea/vomiting, headache triggered by upset stomach, not located in the temple, photophobia, absence of conjunctival injection, absence of brainstem aura: vertigo, ear fullness/tinnitus, not located in the vertex, headache-related disability in daily routines, absence of headache attack during sleep, aggravation by physical activity, not in location: all over the head, not in location: back of the head, osmophobia, not in location: retroauricular, head motion-induced worsening, phonophobia, no pulsating nature, throbbing nature, no stabbing nature, absence of motion sickness, absence of agitation, vertigo, headache-associated ocular pain, nature of pain: vague/cloudy, dull-ache-like nature, general weakness, not in location: forehead, dizziness |
| Second (TTH classifier), 19 features | Mode of onset: gradual (1st), nature of pain: vague/cloudy (2nd), cognitive complaint during headache attack (3rd), hypertension, absence of head motion-induced worsening, absence of avoidance of physical activity, nature of pain: dull-ache, no jabbing nature, nature of pain: drumming, absence of headache-induced awakening during sleep, nature of pain: pulsating, nausea/vomiting, headache attack in the afternoon, absence of headache-associated ocular pain, absence of aggravation by physical activity, absence of ocular pain, absence of disability in daily routines, female sex |
| Third (TAC classifier), 6 features | Headache attack during sleep (1st), headache triggered by upset stomach (2nd), conjunctival injection (3rd), location: periocular, lacrimation, male sex |
| Fourth (epicranial headache classifier), 22 features | Location: retroauricular (1st), nature of pain: electric shock-like (2nd), nature of pain: jabbing (3rd), absence of headache triggered by upset stomach, no explosive nature, no tightening nature, no drumming nature, mode of onset: gradual, allodynia, nature of pain: tingling, absence of alleviation by sleeping, nature of pain: stabbing, location: temple, absence of aggravation by physical activity, nature of pain: vague/cloudy, no dull-ache-like nature, absence of ocular pain, photophobia, absence of nausea/vomiting, nature of pain: twinge, absence of headache-associated gastrointestinal discomfort |
| Fourth (TCH classifier), 22 features | Not in location: retroauricular (1st), no electric shock-like nature (2nd), no jabbing nature (3rd), headache triggered by upset stomach, nature of pain: explosive, nature of pain: tightening, nature of pain: drumming, mode of onset: thunderclap, absence of allodynia, no tingling nature, alleviation by sleeping, no stabbing nature, not in location: temple, aggravation by physical activity, no vague/cloudy nature, nature of pain: dull-ache, ocular pain, absence of photophobia, nausea/vomiting, no twinge nature, headache-associated gastrointestinal discomfort |
The features listed in the right column positively correlated with the target subtype listed in the left column.
TTH tension-type headache, TAC trigeminal autonomic cephalalgia, TCH thunderclap headache.
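LASSO selects features by shrinking the coefficients of uninformative inputs to exactly zero; the features that keep a nonzero coefficient are the ones retained. The sketch below illustrates the idea with a squared-error LASSO solved by coordinate descent in plain Python. It is not the authors' pipeline: the source does not state the loss, the penalty strength, or the software used, and the eight-patient data set, the ±1 coding, and the `alpha` value are invented for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values inside [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    residual = list(y)  # equals y - Xw while all weights are zero
    scale = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if scale[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + scale[j] * weights[j]
            new_w = soft_threshold(rho, alpha) / scale[j]
            step = new_w - weights[j]
            if step:
                for i in range(n):
                    residual[i] -= step * X[i][j]
                weights[j] = new_w
    return weights

# Toy data (hypothetical): +1 = symptom present / migraine, -1 = absent / other.
names = ["nausea/vomiting", "location: temple", "hypertension"]
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1],
     [-1, 1, 1], [-1, -1, 1], [-1, 1, -1], [-1, -1, -1]]
y = [1, 1, 1, -1, -1, -1, -1, -1]

weights = lasso_coordinate_descent(X, y, alpha=0.3)
selected = [name for name, w in zip(names, weights) if w != 0.0]
print(selected)
```

In this toy example only the first feature survives the penalty. Raising `alpha` removes more features and lowering it keeps more, which is how a different subset can be chosen for each layer.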
Classifier performance of both cohorts.
| Cohort | Baseline (%) | Accuracy (%) | Headache subtype | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| Training | 67.19 | 81.80 | Migraine | 87.07 | 93.52 |
| | | | Tension-type headache | 66.10 | 54.17 |
| | | | Trigeminal autonomic cephalalgia | 85.19 | 58.23 |
| | | | Epicranial headache | 64.71 | 63.46 |
| | | | Thunderclap headache | 64.29 | 56.84 |
| Test | 68.49 | 80.71 | Migraine | 88.47 | 94.67 |
| | | | Tension-type headache | 69.44 | 54.95 |
| | | | Trigeminal autonomic cephalalgia | 65.00 | 45.61 |
| | | | Epicranial headache | 52.73 | 47.54 |
| | | | Thunderclap headache | 50.75 | 50.75 |
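The Baseline column is not defined in this excerpt. A short check, using the subtype counts from the cohort table, suggests it is the accuracy of always predicting the most common subtype (migraine). That reading is an inference, and the function name below is illustrative.

```python
cohorts = {
    "training": {"Migraine": 864, "Tension-type headache": 144,
                 "Trigeminal autonomic cephalalgia": 79,
                 "Epicranial headache": 104, "Thunderclap headache": 95},
    "test": {"Migraine": 600, "Tension-type headache": 91,
             "Trigeminal autonomic cephalalgia": 57,
             "Epicranial headache": 61, "Thunderclap headache": 67},
}

def majority_baseline(counts):
    """Accuracy (%) of a classifier that always predicts the largest class."""
    return 100 * max(counts.values()) / sum(counts.values())

baselines = {name: round(majority_baseline(c), 2) for name, c in cohorts.items()}
print(baselines)  # {'training': 67.19, 'test': 68.49}
```

Both values match the Baseline column, so the reported accuracies of about 81% can be read as a gain of roughly 12 to 15 percentage points over the majority-class rule.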
Confusion matrix for the training cohort. The bold numbers in the main diagonal denote correctly classified subjects.
| Headache subtype | Migraine | Tension-type headache | Trigeminal autonomic cephalalgia | Epicranial headache | Thunderclap headache |
|---|---|---|---|---|---|
| Migraine | **808** | 23 | 3 | 13 | 17 |
| Tension-type headache | 46 | **78** | 1 | 13 | 6 |
| Trigeminal autonomic cephalalgia | 18 | 4 | **46** | 6 | 5 |
| Epicranial headache | 25 | 9 | 2 | **66** | 2 |
| Thunderclap headache | 31 | 4 | 2 | 4 | **54** |
Confusion matrix for the test cohort. The bold numbers in the main diagonal denote correctly classified subjects.
| Headache subtype | Migraine | Tension-type headache | Trigeminal autonomic cephalalgia | Epicranial headache | Thunderclap headache |
|---|---|---|---|---|---|
| Migraine | **568** | 5 | 4 | 7 | 16 |
| Tension-type headache | 25 | **50** | 1 | 9 | 6 |
| Trigeminal autonomic cephalalgia | 21 | 2 | **26** | 2 | 6 |
| Epicranial headache | 11 | 13 | 3 | **29** | 5 |
| Thunderclap headache | 17 | 2 | 6 | 8 | **34** |
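The test-cohort metrics can be recomputed from this matrix together with the subtype counts in the cohort table. The sketch below assumes that rows are the reference diagnoses, so each row sums to the cohort size of that subtype and the diagonal count equals that size minus the row's off-diagonal entries. Variable names are illustrative.

```python
labels = ["Migraine", "TTH", "TAC", "Epicranial", "Thunderclap"]
cohort_sizes = [600, 91, 57, 61, 67]  # test-cohort counts per subtype

# Off-diagonal counts from the matrix above; None marks the diagonal cell.
off_diagonal = [
    [None, 5, 4, 7, 16],
    [25, None, 1, 9, 6],
    [21, 2, None, 2, 6],
    [11, 13, 3, None, 5],
    [17, 2, 6, 8, None],
]

# Assumption: each row sums to the cohort size, which fixes the diagonal.
matrix = []
for size, row in zip(cohort_sizes, off_diagonal):
    misclassified = sum(v for v in row if v is not None)
    matrix.append([size - misclassified if v is None else v for v in row])

k = len(labels)
diag = [matrix[i][i] for i in range(k)]
accuracy = 100 * sum(diag) / sum(cohort_sizes)
# Correct predictions as a share of each column total and of each row total.
column_rate = [100 * diag[j] / sum(row[j] for row in matrix) for j in range(k)]
row_rate = [100 * diag[i] / sum(matrix[i]) for i in range(k)]

print(round(accuracy, 2))
print([round(v, 2) for v in column_rate])
print([round(v, 2) for v in row_rate])
```

Under this assumption the accuracy comes out at 80.71%. The column-wise rates reproduce the values reported as sensitivity, and the row-wise rates reproduce those reported as specificity.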
Comparison of the proposed method with other feature selection methods in the test cohort in terms of classifier performance. The bold values indicate the highest score in each performance metric.
| Feature selection method | Accuracy | Minimum sensitivity | Minimum specificity |
|---|---|---|---|
| LASSO | | | |
| SVM-RFE | 0.8014 | 0.4468 | 0.3443 |
| mRMR-MIQ | 0.7180 | 0.1600 | 0.0877 |
| mRMR-MID | 0.7055 | 0.0833 | 0.0597 |
LASSO least absolute shrinkage and selection operator, SVM-RFE support vector machine recursive feature elimination, mRMR-MIQ minimum-redundancy maximum-relevancy mutual information quotient, mRMR-MID minimum-redundancy maximum-relevancy mutual information difference.
Comparison of the proposed method with other classifiers in the test cohort. The bold values indicate the highest score in each performance metric.
| Classifier | Accuracy | Minimum sensitivity | Minimum specificity |
|---|---|---|---|
| XGBoost | | | |
| Random forest | 0.8037 | 0.5179 | 0.4035 |
| SVM | 0.7911 | 0.4730 | 0.4035 |
| k-NN | 0.7717 | 0.4355 | 0.3333 |
SVM support vector machine, k-NN k-nearest neighbor.