Yejin Kim, Jessika Suescun, Mya C Schiess, Xiaoqian Jiang.
Abstract
Our objective is to derive a sequential decision-making rule on the combination of medications to minimize motor symptoms using reinforcement learning (RL). Using an observational longitudinal cohort of Parkinson's disease (PD) patients, the Parkinson's Progression Markers Initiative (PPMI) database, we derived clinically relevant disease states and an optimal combination of medications for each of them by using policy iteration on a Markov decision process (MDP). We focused on 8 combinations of medications, i.e., levodopa, a dopamine agonist, and other PD medications, as possible actions, and on motor symptom severity, based on the Unified Parkinson Disease Rating Scale (UPDRS) section III, as the reward/penalty of each decision. We analyzed a total of 5077 visits from 431 PD patients with a mean follow-up of 55.5 months. We excluded patients without UPDRS III scores or medication records. We derived a medication regimen that is comparable to a clinician's decision. The RL model achieved lower motor symptom severity scores than clinicians did, whereas the clinicians' medication rules were more consistent than the RL model's. The RL model followed the clinicians' medication rules in most cases but also suggested some changes, which led to the difference in lowering symptom severity. This is the first study to investigate RL to improve the pharmacological approach for PD patients. Our results contribute to the development of an interactive machine-physician ecosystem that relies on evidence-based medicine and can potentially enhance PD management.
Year: 2021 PMID: 33927277 PMCID: PMC8085228 DOI: 10.1038/s41598-021-88619-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Workflow data from an optimal medication regimen. We aimed to derive an optimal medication regimen that suggests the best combination of drugs given the current disease state to minimize motor symptoms of PD. (a) The PPMI database consists of trajectories of visits of PD patients. Each visit has clinical assessments that measure the current disease status and a record of the medications used to address the motor symptoms. The UPDRS III score measures the motor response to the medication. (b) Each visit was characterized as one of the discrete disease states, which are defined using a decision tree regressor. (c) We formulated this medication therapy as an MDP. Based on the current disease state, a clinician decides on a combination of medications, which in turn changes motor symptoms and alters the disease state iteratively. (d) We computed the optimal medication option in each state using policy iteration (in reinforcement learning).
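Panels (c) and (d) describe an MDP solved by policy iteration. The following is a minimal sketch of tabular policy iteration, written to minimize an expected penalty rather than maximize a reward. It is not the authors' implementation: the two states, two actions, transition probabilities, penalties, and the discount factor `gamma=0.9` are hypothetical toy values chosen only for illustration.

```python
def policy_iteration(P, R, gamma=0.9, tol=1e-8):
    """Tabular policy iteration that minimizes expected discounted penalty.

    P[s][a] is a list of next-state probabilities and R[s][a] is the expected
    immediate penalty for taking action a in state s (both hypothetical here).
    """
    n_states, n_actions = len(P), len(P[0])
    policy = [0] * n_states
    V = [0.0] * n_states

    def q_value(s, a):
        # Expected penalty of action a in state s, then following V afterwards
        return R[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(P[s][a]))

    while True:
        # Policy evaluation: sweep the Bellman equation until values settle
        while True:
            delta = 0.0
            for s in range(n_states):
                v = q_value(s, policy[s])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: switch only on a strictly lower expected penalty
        stable = True
        for s in range(n_states):
            q = [q_value(s, a) for a in range(n_actions)]
            best = min(range(n_actions), key=q.__getitem__)
            if q[best] < q[policy[s]] - 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Toy MDP (assumed values): state 0 = milder, state 1 = more severe;
# action 0 = no drug, action 1 = drug.
P = [
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.1, 0.9], [0.6, 0.4]],
]
R = [
    [15.0, 11.0],
    [29.0, 20.0],
]

policy, values = policy_iteration(P, R)
print(policy)  # [1, 1]
```

In this toy setting the drug action has both a lower immediate penalty and better transitions in every state, so the procedure settles on it everywhere; with real data the transition and penalty tables would be estimated from observed visits.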
Patient demographics.
| Patients (n = 431) | |
|---|---|
| Age at onset, years (mean, s.d.) | 61.7 (9.8) |
| Num. of visits (mean, s.d.) | 11.8 (3.5) |
| Months of follow-up (mean, s.d.) | 55.5 (26.6) |
| Num. of female | 149 |
| Num. of male | 282 |
| Visit (n = 5077) | |
| Total UPDRS scores (mean, s.d.) | 24.4 (11.6) |
| Total MoCA scores (mean, s.d.) | 26.6 (3.0) |
| Subtype | |
| Akinetic-rigid type | 4568 |
| Tremor-dominant type | 647 |
| Mixed type | 335 |
| Hoehn and Yahr stage | |
| 1 | 1476 |
| 2 | 3800 |
| 3 | 213 |
| 4 | 28 |
| 5 | 6 |
| Medication | |
| No drugs | 1839 |
| Levodopa | 1157 |
| Dopamine agonist | 447 |
| Other PD medication | 442 |
| Levodopa + others | 356 |
| Levodopa + Dopamine agonist | 333 |
| Dopamine agonist + others | 260 |
| Levodopa + Dopamine agonist + others | 243 |
PD Parkinson’s disease, UPDRS Unified Parkinson’s Disease Rating Scales Part III motor score.
Statistical significance of variables used in disease state definition (i.e., clinical assessments) and action (medications).
| Time | Variables | Coef | p-value |
|---|---|---|---|
| | (Variables used in disease state definition) | | |
| | Hoehn and Yahr | 0.8967 | 0 |
| | Age | 0.0864 | 0 |
| | Total MoCA | −0.0534 | 0.01 |
| | Total UPDRS III scores | 0.7816 | 0 |
| | % change of total UPDRS scores | −0.0246 | 0 |
| | Subtype AR (vs. mixed type) | 1.5576 | 0 |
| | Subtype TD (vs. mixed type) | 0.6056 | 0.213 |
| | Levodopa | 5.5486 | 0 |
| | Dopamine agonist | 4.5439 | 0 |
| | Other medicine | 2.2017 | 0 |
| | Levodopa + Other medicine | 5.942 | 0 |
| | Levodopa + Dopamine agonist | 5.5351 | 0 |
| | Dopamine agonist + Other medicine | 3.8986 | 0 |
| | Levodopa + Dopamine agonist + Other medicine | 6.1054 | 0 |
| | (Variables used in actions) | | |
| | Levodopa | −6.9801 | 0 |
| | Dopamine agonist | −5.0293 | 0 |
| | Other medicine | −0.825 | 0.094 |
| | Levodopa + other medicine | −6.9232 | 0 |
| | Levodopa + Dopamine agonist | −7.2501 | 0 |
| | Dopamine agonist + other medicine | −5.1182 | 0 |
| | Levodopa + Dopamine agonist + other medicine | −8.5083 | 0 |
| | (Outcome) total UPDRS III scores | | |
We fitted a prediction model of the form total UPDRS scores at $t+1$ (penalty) ~ variables at $t$ (used in disease state) + variables at $t$ (actions) using multivariate regression (i.e., generalized least squares) and selected the variables that are statistically significant for total UPDRS scores at the next timestep ($t+1$). Note that the % change of total UPDRS scores is 100% × (total UPDRS scores at $t$ − total UPDRS scores at $t-1$)/total UPDRS scores at $t-1$. AR = akinetic-rigid, TD = tremor-dominant.
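The percent-change variable in the note above can be illustrated with a short calculation. This is a minimal sketch that assumes the change is measured from the previous visit to the current one; the function name `pct_change` and the example scores are hypothetical.

```python
def pct_change(prev_score, curr_score):
    """Percent change of total UPDRS III from the previous visit to the current one."""
    if prev_score == 0:
        raise ValueError("percent change is undefined when the previous score is 0")
    return 100.0 * (curr_score - prev_score) / prev_score


# A drop from 20 to 15 points is a 25% improvement (negative change).
print(pct_change(20, 15))  # -25.0
```

Under this reading, a negative value means motor symptoms improved since the last visit, which is how the negative thresholds in the disease state definitions would be interpreted.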
Figure 2. Comparison of estimated penalty distributions (i.e., the estimated cumulative sum of future total UPDRS III scores). The penalty score of an individual patient is $\sum_{t} \gamma^{t} r_{t}$, where $r_{t}$ is the total UPDRS III score at visit $t$ and $\gamma$ is a discount factor. The estimate of the penalty score was computed by importance sampling. We computed the distribution of penalty scores across 500 independent bootstraps with resampled training (80%) and test (20%) sets. The final policy was chosen by majority vote over the 500 bootstraps. (a) Comparison of the penalty scores from four different strategies: the clinicians' policy, the AI's policy, zero drug (i.e., no drugs are given in any state), and random drug (i.e., drugs are given at random). Each box extends from the lower to the upper quartile. The horizontal line in each box is the median. (b) Pairwise comparison of the penalty scores between AI and clinicians.
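The caption above mentions a discounted cumulative penalty estimated by importance sampling. The sketch below shows one way such an estimate can be computed for a deterministic policy from trajectories observed under a different (clinician) policy. The caption does not say which importance sampling variant was used; weighted (self-normalized) importance sampling is assumed here, and the trajectories, clinician action probabilities, and `gamma = 0.5` are hypothetical toy values.

```python
def discounted_penalty(scores, gamma):
    """Discounted cumulative sum of per-visit penalties (e.g., UPDRS III scores)."""
    return sum((gamma ** t) * r for t, r in enumerate(scores))


def weighted_importance_sampling(trajectories, target_policy, behavior_probs, gamma):
    """Estimate the expected penalty of a deterministic target policy.

    trajectories: list of [(state, action, penalty), ...] observed in practice
    target_policy: dict mapping state -> action chosen by the evaluated policy
    behavior_probs: dict mapping (state, action) -> probability of that action
                    under the observed (clinician) behavior
    """
    weights, returns = [], []
    for traj in trajectories:
        w = 1.0
        for state, action, _ in traj:
            # Deterministic target: probability 1 if the actions agree, else 0
            pi = 1.0 if target_policy[state] == action else 0.0
            w *= pi / behavior_probs[(state, action)]
        weights.append(w)
        returns.append(discounted_penalty([r for _, _, r in traj], gamma))
    total = sum(weights)
    if total == 0:
        raise ValueError("no trajectory is consistent with the target policy")
    return sum(w * g for w, g in zip(weights, returns)) / total


# Toy data (assumed): two states, two actions.
trajectories = [
    [(0, 1, 10), (0, 1, 8)],
    [(0, 0, 12), (1, 1, 20)],
    [(1, 1, 20), (0, 1, 10)],
]
behavior_probs = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.2, (1, 1): 0.8}
target_policy = {0: 1, 1: 1}
gamma = 0.5

estimate = weighted_importance_sampling(
    trajectories, target_policy, behavior_probs, gamma
)
print(round(estimate, 3))
```

Trajectories that deviate from the evaluated policy receive zero weight, so the estimate relies only on visits where observed practice matched the recommendation. This is one reason such estimates are usually reported as a distribution over bootstrap resamples rather than as a single number.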
Disease states and recommended actions in each state.
| State | Subtype | Total UPDRS at $t$ | % UPDRS change | Age | Estimated total UPDRS at $t+1$ | Actions marked × (clinicians L, D, O, then AI L, D, O; column positions not recoverable) |
|---|---|---|---|---|---|---|
| 0 | AR | (−∞, 8] | | | 10.2 | × × |
| 1 | AR | [9, 11] | [−25.6, ∞) | | 11.9 | × |
| 2 | AR | [12, 14] | [−25.6, ∞) | | 14.7 | × × × × × |
| 3 | AR | [15, 17] | [−12.7, ∞) | | 17.0 | × × |
| 4 | AR | [9, 14] | (−∞, −25.6] | | 17.1 | × × × × |
| 5 | AR | [18, 19] | [−12.7, ∞) | | 19.3 | × × × × |
| 6 | AR | [20, 24] | [16.2, ∞) | | 20.0 | × × × × |
| 7 | AR | [15, 19] | (−∞, −12.7] | | 21.9 | × × × × |
| 8 | AR | [20, 24] | (−14.6, 16.228] | [65, ∞) | 23.4 | × × × × × × |
| 9 | AR | [20, 24] | (−14.6, 16.228] | (−∞, 64] | 24.0 | × × × |
| 10 | AR | [25, 33] | (39.6, ∞) | | 25.1 | × × × × |
| 11 | AR | [20, 24] | (−∞, −14.6] | | 26.6 | × × × × × × |
| 12 | AR | [25, 28] | (−7.28, 11.8] | | 27.5 | × × × × |
| 13 | AR | [25, 33] | (11.8, 39.6] | | 27.7 | × × × × |
| 14 | AR | [25, 28] | (−∞, −7.28] | | 29.5 | × × × × × |
| 15 | AR | [29, 33] | (−3.1, 11.8] | | 30.5 | × × × × × × |
| 16 | AR | [34, 39] | (10.6, ∞) | | 31.3 | × × × × × × |
| 17 | AR | [29, 33] | (−∞, −3.1] | | 32.5 | × × × × × × |
| 18 | AR | [34, 39] | (−∞, 10.6] | | 36.8 | × × × × |
| 19 | AR | [40, 45] | | | 37.9 | × × × × |
| 20 | AR | [46, ∞) | | | 46.5 | × × × × × × |
| 21 | Mixed | (−∞, 16] | | | 13.5 | × × × |
| 22 | Mixed | [17, 24] | | | 21.3 | × × × × |
| 23 | Mixed | [25, ∞) | | | 32.1 | × × × × |
| 24 | TD | (−∞, 14] | | | 11.8 | × × |
| 25 | TD | [15, 19] | | | 19.0 | × × × × |
| 26 | TD | [20, 28] | | | 23.0 | × × × × |
| 27 | TD | [29, ∞) | | | 33.1 | × × × × × × |
L Levodopa, D dopamine agonist, O other medicine, AR Akinetic-rigid, TD tremor-dominant.
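The disease states in the table behave like the leaves of a decision tree: each visit falls into exactly one state according to its subtype and score thresholds. A minimal sketch of that lookup is shown below for the Mixed and TD subtypes only, whose states depend solely on the total UPDRS III score. It assumes integer scores and reads an open-ended interval as having no bound on that side; the function name `assign_state` is hypothetical, and the AR states, which also depend on % UPDRS change and age, are omitted.

```python
def assign_state(subtype, updrs):
    """Map a visit to a disease state ID using the Mixed and TD rows of the table."""
    # Each entry is (inclusive upper bound on total UPDRS III, state ID);
    # an upper bound of None means the interval is open-ended.
    cutoffs = {
        "Mixed": [(16, 21), (24, 22), (None, 23)],
        "TD": [(14, 24), (19, 25), (28, 26), (None, 27)],
    }
    if subtype not in cutoffs:
        raise ValueError("only Mixed and TD subtypes are covered in this sketch")
    for upper, state in cutoffs[subtype]:
        if upper is None or updrs <= upper:
            return state


print(assign_state("TD", 17))     # 25
print(assign_state("Mixed", 30))  # 23
```

Once every visit carries a state ID of this kind, a patient's record becomes a sequence of (state, action, next state) transitions, which is the input a tabular MDP needs.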
Figure 3. Disease states and suggested medications. Shade intensity is proportional to the estimated subsequent total UPDRS III scores. L levodopa, D dopamine agonist, O other medications.