Fang Li, Frederike Jörg, Xinyu Li, Talitha Feenstra.
Abstract
The most appropriate next step in depression treatment after the initial treatment fails is unclear. This study explores the suitability of the Markov decision process for optimizing sequential treatment decisions for depression. We conducted a formal comparison of a Markov decision process approach and mainstream state-transition models as used in health economic decision analysis to clarify differences in the model structure. We performed two reviews: the first to identify existing applications of the Markov decision process in the field of healthcare and the second to identify existing health economic models for depression. We then illustrated the application of a Markov decision process by reformulating an existing health economic model. This provided input for discussing the suitability of a Markov decision process for solving sequential treatment decisions in depression. The Markov decision process and state-transition models differed in terms of flexibility in modeling actions and rewards. In all, 23 applications of a Markov decision process within the context of somatic disease were included, 16 of which concerned sequential treatment decisions. Most existing health economic models relating to depression have a state-transition structure. The example application replicated the health economic model and added the capacity to make dynamic comparisons of more interventions over time than was possible with traditional state-transition models. Markov decision processes have been successfully applied to address sequential treatment-decision problems, although the results have been published mostly in economics journals that are not related to healthcare. One advantage of a Markov decision process compared with state-transition models is that it allows an extended action space: the possibility of making dynamic comparisons of different treatments over time.
Within the context of depression, although existing state-transition models are too basic to evaluate sequential treatment decisions, the assumptions of a Markov decision process could be satisfied. The Markov decision process could therefore serve as a powerful model for optimizing sequential treatment in depression. This would require a sufficiently elaborate state-transition model at the cohort or patient level.
Year: 2022 PMID: 36100825 PMCID: PMC9550715 DOI: 10.1007/s40273-022-01185-z
Source DB: PubMed Journal: Pharmacoeconomics ISSN: 1170-7690 Impact factor: 4.558
Elements of a Markov decision process (MDP) and comparable structures in a cohort-level state-transition model (STM)
| MDP element | Definition | Analogous STM component |
|---|---|---|
| Decision epoch | The time at which decisions are made | Cycle time (decisions usually made only in Cycle 1/before the start of the model, by defining different scenarios) |
| State space | Set of mutually exclusive, collectively exhaustive conditions that describe the possible state of the model | States |
| Action space | Set of possible decisions that can be made at each decision epoch | No specific analogy |
| Transition probabilities | Probability of each possible state of the system in the following period (conditional on decision and current state) | Transition probabilities (conditional on current state and scenario) |
| Reward function | The immediate benefits of taking a particular decision at each state | Pay-offs: costs and utilities linked to each state |
| Decision rule | A specified decision for each possible state at a specific epoch | No specific analogy |
| Policy | A sequence of decision rules for all epochs following the beginning time point | The treatment strategy is always defined |
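The elements in the table above can be sketched in a few lines of Python. This is a minimal illustration, not the model used in the study: the two states, two actions, transition probabilities, and rewards are all hypothetical values chosen for readability, and the solver is plain finite-horizon backward induction.

```python
from typing import Dict, List, Tuple

# All names and numbers below are hypothetical, chosen only for illustration.
STATES: List[str] = ["depressed", "remission"]   # state space
ACTIONS: List[str] = ["drug_a", "drug_b"]        # action space

# P[action][state][next_state]: transition probabilities, conditional on
# the decision and the current state.
P: Dict[str, Dict[str, Dict[str, float]]] = {
    "drug_a": {"depressed": {"depressed": 0.6, "remission": 0.4},
               "remission": {"depressed": 0.2, "remission": 0.8}},
    "drug_b": {"depressed": {"depressed": 0.5, "remission": 0.5},
               "remission": {"depressed": 0.3, "remission": 0.7}},
}

# R[action][state]: immediate reward of taking the action in the state.
R: Dict[str, Dict[str, float]] = {
    "drug_a": {"depressed": 0.0, "remission": 1.0},
    "drug_b": {"depressed": -0.05, "remission": 0.8},
}


def backward_induction(
    horizon: int, gamma: float = 1.0
) -> Tuple[Dict[str, float], List[Dict[str, str]]]:
    """Solve a finite-horizon MDP with `horizon` decision epochs.

    Returns the state values at the first epoch and the policy, where
    policy[t] is the decision rule (state -> action) at epoch t.
    """
    value = {s: 0.0 for s in STATES}  # value after the last epoch
    policy: List[Dict[str, str]] = []
    for _ in range(horizon):  # work from the last epoch back to the first
        new_value: Dict[str, float] = {}
        rule: Dict[str, str] = {}
        for s in STATES:
            # Expected return of each action: reward now plus discounted
            # expected value of the next state.
            q = {
                a: R[a][s] + gamma * sum(P[a][s][s2] * value[s2] for s2 in STATES)
                for a in ACTIONS
            }
            best = max(q, key=q.get)
            rule[s], new_value[s] = best, q[best]
        policy.insert(0, rule)  # earlier epochs go to the front
        value = new_value
    return value, policy


if __name__ == "__main__":
    values, policy = backward_induction(horizon=2)
    for t, rule in enumerate(policy):
        print(f"epoch {t}: {rule}")
    print("values at epoch 0:", values)
```

In this toy setting the chosen action for the depressed state differs between the two epochs, which is the kind of state- and time-dependent decision rule that a cohort-level STM with a fixed scenario does not express.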
Fig. 1 Methodological framework of the present study. MDP Markov decision process, HE health economic
Fig. 2 Flow chart of study selection for MDP applications in the field of healthcare. MDP Markov decision process
Summary of MDP model applications in healthcare
| Study | Time horizon (decision epoch) | Disease | State space | Action space | Reward function | Aim | Individual level |
|---|---|---|---|---|---|---|---|
| Eghbali-Zarch et al. (2019) [ | Finite (annual) | Type 2 diabetes | 10 states (defined by a discrete set of clinically relevant ranges of HbA1c levels) | Initial and delayed treatment as actions (treatments considered: metformin, sulfonylureas, and insulin) | Expected QALYs | Minimize treatment decision-related adverse drug reactions | No |
| Mason et al. (2012) [ | Finite (annual) | Type 2 diabetes | 4 states (defined by adherence rate) | Initial and delayed treatment as actions (treatments considered: statins, metformin, sulfonylureas, and insulin) | Net monetary benefit | Optimize treatment decision | Yes |
| Meng et al. (2020) [ | Finite (every 3 mo) | Type 2 diabetes | 10 states (defined by HbA1c levels) | Initial and delayed treatment as actions (treatments considered: metformin, sulfonylureas, and insulin) | Expected QALYs | Optimize treatment decision | Yes |
| Oh et al. (2021) [ | Finite (annual) | Type 2 diabetes | 72 states (defined by complications of diabetes, risk of diabetes, time elapsed since the occurrence of diabetes, and fasting plasma glucose) | 1. Initial monotherapy treatment (metformin, sulfonylurea, or others) 2. Initial dual-therapy (metformin + sulfonylureas, metformin + DPP-4 inhibitors or others) 3. Initial triple therapy (metformin + sulfonylureas + α-glucosidase inhibitor) | Expected discounted QALYs | Optimize treatment decision | No |
| Shifrin and Siegelmann (2020) [ | Finite (6 times/d) | Diabetes | Unclear number of states (defined by blood glucose level, carbohydrate intake, and counter of treatment points) | 1. Providing insulin boluses based on sensor data 2. Traditional insulin care | Blood glucose level | Optimize treatment decision | Yes |
| Abdollahian and Das (2015) [ | Finite (annual) | Breast and ovarian cancer | 8492 states (defined by screening status, preventive surgery status, and age) | At ages 30, 40, and 50 y: 1. Do nothing 2. Conduct surgery 3. Start screening At all intermediate ages (31–39, 41–49, and 51–65 y): 1. Do nothing 2. Start screening 3. Stop screening | Costs/QALYs | Optimize treatment decisions | No |
| Akhavan-Tabatabaei et al. (2017) [ | Finite (every 6 mo) | Cervical cancer prevention | 12 states (defined by patient diagnosis, age, and record of last screening test) | 1. Do nothing 2. Pap test 3. Colposcopy without Pap test | Costs | Optimize screening policy | No |
| Kim et al. (2009) [ | Finite (user-defined) | Cancer | Unclear number of states (defined by OAR and tumor) | Choose a non-zero dose in each fraction | Patient utility | Optimize the treatment decision | No |
| Maass and Kim (2020) [ | Finite (user-defined) | Cancer | 11 states (defined by history of treatment, tissue side effect, tumor progression) | 1. Treatment modalities with a high risk 2. Treatment modalities with a lower risk 3. No treatment | Function of side effect and tumor progression | Optimize the treatment decision | No |
| Bazrafshan and Lotfi (2020) [ | Finite (after every 8 treatment cycles) | Gastroesophageal cancer | 3 states (low toxicity, moderate toxicity, high toxicity) | Sequential treatment (5 different types of chemotherapy/5 chemotherapy treatment strategies) | Expected costs | Optimize treatment decision | Yes |
| Alagoz et al. (2004) [ | Infinite (unclear) | Severe liver failure (in need of transplantation) | Unclear number of states (defined by the patient’s end-stage liver score) | 1. Transplant 2. Wait | QALYs | Optimize the timing of transplantation | No |
| Alagoz et al. (2007) [ | Infinite (unclear) | End-stage liver disease | Unclear number of states (defined by the patient’s end-stage liver score) | 1. Accepting the cadaveric offer 2. Accepting the living-donor liver 3. Waiting for one more period | QALYs | Optimize the timing of transplantation | No |
| Liu et al. (2017) [ | Finite (2-year) | Hepatitis C | 8 states (healthy, no fibrosis, portal fibrosis with no septa, portal fibrosis with few septa, numerous septa without cirrhosis, compensated cirrhosis, decompensated cirrhosis, hepatocellular carcinoma, and liver transplantation) | 1. Accepting a specific drug treatment 2. Waiting | QALYs | Optimize sequential treatment | Yes |
| Hauskrecht and Fraser (2000) [ | Infinite (unclear) | Ischemic heart disease | 11 state variables (characterizing multiple combinations of cardiovascular complications and diagnostic test outcomes); unclear number of states | 1. No action 2. Medication treatment 3. Surgical procedure (angioplasty or coronary artery bypass surgery) 4. Investigative procedures (angiogram or stress test) | Costs | Optimize sequential treatment | Yes |
| Marrero et al. (2021) [ | Finite (annual) | ASCVD | 10 states (healthy, history of CHD but no adverse event, history of stroke but no adverse event, history of CHD and stroke but no adverse event, survived a CHD event, survived a stroke, death from non-cardiovascular disease, death from CHD event, death from stroke and dead) | 1. No treatment 2. Moderate-intensity statin drug 3. High-intensity statin drug | QALYs | Optimize treatment plans of genetic testing | Yes |
| Schell et al. (2016) [ | Finite (annual) | Hypertension | 10 states (healthy, history of CHD, history of stroke, history of CHD and stroke, survived a CHD, survived a stroke, death from non-CVD, death from CHD, death from stroke and dead) | Choose an appropriate medication treatment (from thiazide diuretics, β-blockers, calcium channel blockers, angiotensin-converting enzyme inhibitors, and angiotensin II receptor antagonists) | Discounted QALYs | Optimize sequential treatment | Yes |
| Choi et al. (2017) [ | Finite (annual) | Hypertension | 7 states (well, adverse event without CVD history, MI, stroke, post CVD, adverse event with CVD history, death) | 1. Stop medication treatment 2. Remain on current treatment 3. Change to other medication treatment | Discounted QALYs | Optimize sequential treatment | Yes |
| Ibrahim et al. (2016) [ | Finite (every clinical visit) | Stroke prevention in atrial fibrillation | 15 states (defined by patient sensitivity and the international normalized ratio of response to warfarin) | Choose an appropriate dosage of warfarin (0 mg; 2.5 mg; 5 mg; 7.5 mg; 10 mg) | Discounted QALYs | Optimize treatment decision | Yes |
| Tilson and Tilson (2013) [ | Finite (3 mo) | Aneurysms | 20 states (6 defined by recovery from treatment or a SAH event, plus 13 post-recovery states, and death) | 1. No treatment 2. Surgery 3. Endovascular repair | Discounted QALYs | Optimize initial treatment selection | No |
| Wu et al. (2012) [ | Finite (every hospital stay) | Acute ischemic stroke | 30 states (defined by 6 patient characteristics) | 1. Using anticoagulant agents 2. Using TCM treatment of replenishing 3. Using TCM treatments for clearing heat and extinguishing wind 4. Using TCM treatments for relaxing the bowels 5. Using herbal medicine | Neurological functional impairment scores | Compare effectiveness of different treatment combinations | No |
| Shen et al. (2020) [ | Finite (every 7 d) | Stroke | 236 states (defined by age, disease history, complications, Western medicine diagnosis, and TCM syndrome differentiation) | 1. Use rehabilitation therapy 2. Use traditional Chinese medicine decoction 3. Use acupuncture treatment 4. No intervention | Scores for neurological function impairment | Optimize initial treatment selection | No |
| Escandell-Montero et al. (2014) [ | Finite (unclear) | Anemia | Unclear number of states (defined by degree of anemia, hemoglobin trend, dose of darbepoetin alfa, patient group) | Choose an appropriate dosage of darbepoetin alfa (0 µg/kg/wk; 0.25 µg/kg/wk; 0.50 µg/kg/wk; 0.70 µg/kg/wk; 1 µg/kg/wk) | Hemoglobin level | Optimize treatment decision | Yes |
| Suen et al. (2018) [ | Finite (every mo) | Tuberculosis | 7 states (cured of tuberculosis, patients with DS tuberculosis, patients with DR tuberculosis, healthy patients defaulted from treatment, DS patients defaulted from treatment, DR patients defaulted from treatment, death) | 1. Drug-sensitivity tests 2. Waiting and observing 3. Waiting and not observing | Net monetary benefit | Optimize treatment decision | No |
ASCVD atherosclerotic cardiovascular disease, CHD coronary heart disease, CVD cardiovascular disease, DPP-4 dipeptidyl peptidase 4, DR drug resistant, DS drug sensitivity, ESA erythropoiesis-stimulating agents, MDP Markov decision process, MI myocardial infarction, net monetary benefit monetary value of QALYs minus total cost, OAR organs at risk, QALYs quality-adjusted life-years, SAH subarachnoid hemorrhage, TCM traditional Chinese medicine
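The net monetary benefit used as a reward function in some of the studies above can be illustrated with a short calculation. This sketch assumes the common convention that the monetary value of QALYs is the QALYs multiplied by a willingness-to-pay threshold; the function name and all numbers are hypothetical.

```python
def net_monetary_benefit(qalys: float, total_cost: float, wtp: float) -> float:
    """Monetary value of QALYs minus total cost.

    Assumption: the monetary value of QALYs is qalys * wtp, where wtp is
    the willingness to pay per QALY.
    """
    return wtp * qalys - total_cost


if __name__ == "__main__":
    # Hypothetical inputs: 0.5 QALYs gained at a total cost of 4,000.
    for wtp in (20_000, 60_000):
        print(wtp, net_monetary_benefit(qalys=0.5, total_cost=4_000, wtp=wtp))
```

Because the threshold scales the health gain but not the cost, the same treatment can yield a negative reward at a low threshold and a positive one at a higher threshold.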
Fig. 3 Simplification of model structure in the original paper [74]
Fig. 4 Process of the Markov decision process (MDP) model based on the original model by Ssegonja et al. [74]. Note: $s$ denotes the current state; $s'$ denotes the next state; $r_t$ denotes the reward at time $t$. The variable $\gamma$ is a discount factor. The monetary value of quality-adjusted life-years and the total cost are both evaluated in the state $s$ at time $t$, taking the decision $a$; $V(s)$ denotes the state value function, which is the expected monetary return starting from state $s$; $Q_t(s, a)$ indicates the expected monetary return starting from state $s$, taking action $a$ at time $t$; $V^*(s)$ indicates the optimal value function over all decisions in the state $s$; $Q^*(s, a)$ is the optimal value function for action $a$ in the state $s$; $t$ is measured in years
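The quantities named in this caption are linked by the standard Bellman equations. The following is a sketch in conventional textbook notation, not a reproduction of the figure: $s$ is the current state, $s'$ the next state, $a$ the action, $\gamma$ the discount factor, $V$ and $Q$ the state and action value functions, and a star marks the optimal versions. The symbols $r(s, a)$ for the immediate reward, $p(s' \mid s, a)$ for the transition probability, and $A$ for the action space are assumptions introduced here, and the time index is omitted for brevity.

```latex
\begin{align}
  Q(s, a)   &= r(s, a) + \gamma \sum_{s'} p(s' \mid s, a)\, V(s') \\
  V^{*}(s)  &= \max_{a \in A} Q^{*}(s, a) \\
  Q^{*}(s, a) &= r(s, a) + \gamma \sum_{s'} p(s' \mid s, a)\, V^{*}(s')
\end{align}
```

If the reward is taken to be a net monetary benefit, $r(s, a)$ would be the monetary value of quality-adjusted life-years minus the total cost for that state and decision, which is why the resulting values are expressed as expected monetary returns.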
Value of different states with different willingness-to-pay thresholds
| WTP | Healthy | Subthreshold depression | Depression | Recovered | Remission |
|---|---|---|---|---|---|
| 20,000 | 137 | 122 | −52 | −12 | 32 |
| 60,000 | 430 | 407 | 160 | 216 | 280 |
| 100,000 | 722 | 692 | 372 | 445 | 528 |
| This article demonstrates that the Markov decision process (MDP) has the potential to steer the optimization of sequential treatment to facilitate personalized treatment decisions. |
| This article specifically identifies applications of the MDP that have been used to address sequential decision problems in somatic diseases. The results indicate that the MDP could potentially be useful for addressing sequential decision making in depression. |
| Our study reveals that, although the structure of the state-transition model could potentially be suitable for extension into the MDP model, doing so would require a sufficiently extensive model. |