
A Generalization Error for Q-Learning.

Susan A Murphy

Abstract

Planning problems that involve learning a policy from a single training set of finite-horizon trajectories arise in both social science and medical fields. We consider Q-learning with function approximation for this setting and derive an upper bound on the generalization error. This upper bound is expressed in terms of quantities minimized by a Q-learning algorithm, the complexity of the approximation space, and an approximation term due to the mismatch between Q-learning and the goal of learning a policy that maximizes the value function.
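To make the setting concrete, the following is a minimal sketch of batch Q-learning with linear function approximation on a single training set of two-stage trajectories. It is an illustration under stated assumptions, not the paper's own code: the feature map `features`, the binary action set {-1, +1}, the two-stage horizon, and the simulated data in `simulate` are all hypothetical choices made for this example. The least-squares fits stand in for the "quantities minimized by a Q-learning algorithm" mentioned in the abstract.

```python
import numpy as np

ACTIONS = (-1.0, 1.0)  # assumed binary action set, for illustration only


def features(obs, action):
    """Hypothetical linear feature map: intercept, obs, action, obs*action."""
    obs = np.asarray(obs, dtype=float)
    action = np.asarray(action, dtype=float)
    return np.column_stack([np.ones_like(obs), obs, action, obs * action])


def q_values(theta, obs):
    """Fitted Q-value of every candidate action, one column per action."""
    obs = np.atleast_1d(np.asarray(obs, dtype=float))
    return np.column_stack(
        [features(obs, np.full(obs.shape, a)) @ theta for a in ACTIONS]
    )


def fit_batch_q(o1, a1, r1, o2, a2, r2):
    """Backward-induction Q-learning on one batch of two-stage trajectories."""
    # Stage 2: least-squares regression of the final reward on stage-2 features.
    theta2, *_ = np.linalg.lstsq(features(o2, a2), r2, rcond=None)
    # Stage 1 target: immediate reward plus the best fitted stage-2 value.
    target1 = r1 + q_values(theta2, o2).max(axis=1)
    theta1, *_ = np.linalg.lstsq(features(o1, a1), target1, rcond=None)
    return theta1, theta2


def greedy_action(theta, obs):
    """Policy induced by a fitted Q-function: pick the maximizing action."""
    return np.asarray(ACTIONS)[q_values(theta, obs).argmax(axis=1)]


def simulate(n=2000, seed=0):
    """Simulated trajectories (an assumption; rewards favour a = sign(obs))."""
    rng = np.random.default_rng(seed)
    o1 = rng.normal(size=n)
    a1 = rng.choice(ACTIONS, size=n)  # actions randomized uniformly
    r1 = o1 * a1 + 0.1 * rng.normal(size=n)
    o2 = 0.5 * o1 + rng.normal(size=n)
    a2 = rng.choice(ACTIONS, size=n)
    r2 = o2 * a2 + 0.1 * rng.normal(size=n)
    return o1, a1, r1, o2, a2, r2


theta1, theta2 = fit_batch_q(*simulate())
stage1_policy = greedy_action(theta1, [2.0, -2.0])
stage2_policy = greedy_action(theta2, [2.0, -2.0])
```

In this simulation the linear approximation space contains the true stage-2 Q-function, so the greedy policy recovers the reward-maximizing action. The mismatch term in the abstract concerns the general case, where minimizing squared error over a restricted approximation space need not yield the policy with the highest value.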

Year: 2005   PMID: 16763665   PMCID: PMC1475741

Source DB: PubMed   Journal: J Mach Learn Res   ISSN: 1532-4435   Impact factor: 3.654


