Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings.

Shengpu Tang, Jenna Wiens.

Abstract

Reinforcement learning (RL) can be used to learn treatment policies and aid decision making in healthcare. However, given the need for generalization over complex state/action spaces, the incorporation of function approximators (e.g., deep neural networks) requires model selection to reduce overfitting and improve policy performance at deployment. Yet a standard validation pipeline for model selection requires running a learned policy in the actual environment, which is often infeasible in a healthcare setting. In this work, we investigate a model selection pipeline for offline RL that relies on off-policy evaluation (OPE) as a proxy for validation performance. We present an in-depth analysis of popular OPE methods, highlighting the additional hyperparameters and computational requirements (fitting/inference of auxiliary models) when used to rank a set of candidate policies. We compare the utility of different OPE methods as part of the model selection pipeline in the context of learning to treat patients with sepsis. Among all the OPE methods we considered, fitted Q evaluation (FQE) consistently leads to the best validation ranking, but at a high computational cost. To balance this trade-off between accuracy of ranking and computational efficiency, we propose a simple two-stage approach to accelerate model selection by avoiding potentially unnecessary computation. Our work serves as a practical guide for offline RL model selection and can help RL practitioners select policies using real-world datasets. To facilitate reproducibility and future extensions, the code accompanying this paper is available online.
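The abstract does not spell out what the two stages are. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the first stage ranks all candidates with a cheap estimator (weighted importance sampling is used here as a stand-in) and the second stage runs FQE only on the shortlisted candidates. It also assumes a small tabular problem, so that the regression step of FQE reduces to averaging targets per state-action pair. The names `wis_value`, `fqe_value`, `two_stage_select`, and the `keep` parameter are hypothetical.

```python
import numpy as np


def wis_value(episodes, pi_e, pi_b, gamma=0.99):
    """Weighted importance sampling estimate of a policy's value.

    episodes: list of trajectories, each a list of (state, action, reward).
    pi_e, pi_b: arrays of shape (n_states, n_actions) holding action
    probabilities of the evaluated and behavior policies. Assumes pi_b is
    strictly positive for every logged (state, action) pair.
    """
    weights, returns = [], []
    for ep in episodes:
        rho, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(ep):
            rho *= pi_e[s, a] / pi_b[s, a]  # cumulative importance ratio
            ret += gamma ** t * r           # discounted return
        weights.append(rho)
        returns.append(ret)
    total = float(np.sum(weights))
    if total == 0.0:
        return 0.0  # no logged trajectory is consistent with pi_e
    return float(np.dot(weights, returns) / total)


def fqe_value(episodes, pi_e, n_states, n_actions, gamma=0.99, n_iters=50):
    """Tabular fitted Q evaluation: repeatedly regress Q onto bootstrapped targets.

    In the tabular case the regression is a per-(state, action) average.
    Pairs never seen in the data keep their initial value of zero.
    """
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        targets = np.zeros_like(q)
        counts = np.zeros_like(q)
        for ep in episodes:
            for t, (s, a, r) in enumerate(ep):
                boot = 0.0
                if t + 1 < len(ep):  # non-terminal step: bootstrap under pi_e
                    s_next = ep[t + 1][0]
                    boot = gamma * float(np.dot(pi_e[s_next], q[s_next]))
                targets[s, a] += r + boot
                counts[s, a] += 1
        q = np.where(counts > 0, targets / np.maximum(counts, 1), q)
    # Average the estimated value over the logged initial states.
    return float(np.mean([np.dot(pi_e[ep[0][0]], q[ep[0][0]]) for ep in episodes]))


def two_stage_select(candidates, cheap_score, expensive_score, keep=2):
    """Shortlist with a cheap estimator, then rank the shortlist with a costly one.

    candidates: dict mapping a name to a policy.
    Returns (best_name, stage1_scores, stage2_scores).
    """
    # Stage 1: score every candidate cheaply and keep the top `keep`.
    stage1 = {name: cheap_score(pi) for name, pi in candidates.items()}
    shortlist = sorted(stage1, key=stage1.get, reverse=True)[:keep]
    # Stage 2: spend the expensive estimator only on the shortlist.
    stage2 = {name: expensive_score(candidates[name]) for name in shortlist}
    best = max(stage2, key=stage2.get)
    return best, stage1, stage2
```

With this structure the expensive estimator runs `keep` times rather than once per candidate, which is where the saving would come from. The trade-off is that a good policy wrongly pruned in the first stage can never be recovered, so the choice of `keep` matters.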

Year: 2021    PMID: 35702420    PMCID: PMC9190764

Source DB: PubMed    Journal: Proc Mach Learn Res


