David Benkeser, Maya Petersen, Mark J. van der Laan.
Abstract
When predicting an outcome is the scientific goal, one must decide on a metric by which to evaluate the quality of predictions. We consider the problem of measuring the performance of a prediction algorithm with the same data that were used to train the algorithm. Typical approaches involve bootstrapping or cross-validation. However, we demonstrate that bootstrap-based approaches often fail and standard cross-validation estimators may perform poorly. We provide a general study of cross-validation-based estimators that highlights the source of this poor performance, and propose an alternative framework for estimation using techniques from the efficiency theory literature. We provide a theorem establishing the weak convergence of our estimators. The general theorem is applied in detail to two specific examples and we discuss possible extensions to other parameters of interest. For the two explicit examples that we consider, our estimators demonstrate remarkable finite-sample improvements over standard approaches.
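For context, a minimal sketch of the standard cross-validated AUC estimator the abstract refers to (the baseline whose finite-sample behavior the paper improves upon). This is an illustrative toy, not the authors' estimator: the data simulation, the class-mean scoring rule, and the fold assignment are all assumptions made for the example.

```python
import numpy as np

def auc(scores, labels):
    # AUC via its Mann-Whitney form: the probability that a randomly
    # chosen positive case is scored above a randomly chosen negative
    # case (ties counted as 1/2).
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def cv_auc(X, y, K=5, seed=1):
    # Standard K-fold cross-validated AUC: fit on K-1 folds, score the
    # held-out fold, and average the out-of-fold AUCs. This is the kind
    # of estimator the abstract notes can perform poorly in small samples.
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % K
    aucs = []
    for k in range(K):
        train, test = folds != k, folds == k
        # Toy "trained model": project onto the difference of class means.
        w = (X[train][y[train] == 1].mean(axis=0)
             - X[train][y[train] == 0].mean(axis=0))
        aucs.append(auc(X[test] @ w, y[test]))
    return float(np.mean(aucs))

# Simulated binary-outcome data with a modest signal.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 3)) + 0.8 * y[:, None]
print(round(cv_auc(X, y), 3))
```

The paper's contribution concerns how to estimate such a metric (and its sampling uncertainty) well; the sketch only shows the naive cross-validated point estimate that serves as the comparison baseline.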
Keywords: AUC; cross-validation; estimating equations; machine learning; prediction; targeted minimum loss-based estimation
Year: 2019 PMID: 33716360 PMCID: PMC7954141 DOI: 10.1080/01621459.2019.1668794
Source DB: PubMed Journal: J Am Stat Assoc ISSN: 0162-1459 Impact factor: 5.033