
Performance Guarantees for Policy Learning.

Alex Luedtke, Antoine Chambaz.

Abstract

This article gives performance guarantees for the regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes when the data are generated from a fixed data distribution that does not change with sample size, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also present a result giving guarantees on the regret decay of policy estimators for the case that the policy falls within a restricted class and the data are generated from local perturbations of a fixed distribution, where this guarantee is uniform in the direction of the local perturbation. Finally, we give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Three examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value, and the number of possible actions is either two or finitely many.

Keywords:  individualized treatment rules; personalized medicine; policy learning; precision medicine

Year:  2020        PMID: 35321441      PMCID: PMC8939837          DOI: 10.1214/19-aihp1034

Source DB:  PubMed          Journal:  Ann I H P Probab Stat        ISSN: 0246-0203            Impact factor:   1.851


