A Mayr¹, B Hofner, M Schmid
¹ Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstr. 6, 91054 Erlangen, Germany. Andreas.Mayr@imbe.med.uni-erlangen.de
Abstract
OBJECTIVES: Component-wise boosting algorithms have evolved into a popular estimation scheme in biomedical regression settings. The number of iterations is the most important tuning parameter for optimizing their performance. To date, no fully automated strategy for determining the optimal stopping iteration of boosting algorithms has been proposed.

METHODS: We propose a fully data-driven sequential stopping rule for boosting algorithms. It combines resampling methods with a modified version of an earlier stopping approach that depends on AIC-based information criteria. The new "subsampling after AIC" stopping rule is applied to component-wise gradient boosting algorithms.

RESULTS: The newly developed sequential stopping rule outperformed earlier approaches when applied to both simulated and real data. Specifically, it improved on purely AIC-based methods when used for the microarray-based prediction of the recurrence of metastases in stage II colon cancer patients.

CONCLUSIONS: The proposed sequential stopping rule for boosting algorithms can help to identify the optimal stopping iteration during the fitting process itself, at least for the most common loss functions.
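To make the role of the stopping iteration concrete, the following is a minimal sketch of component-wise L2 boosting with a stopping iteration chosen by plain subsampling. It is not the "subsampling after AIC" rule of the paper: the abstract does not specify the AIC step, so it is omitted here. The function names (`componentwise_l2_boost`, `subsampling_stop`), the step length `nu`, the 50/50 split, and the assumption of roughly centered covariate columns are all illustrative choices, not taken from the source.

```python
import numpy as np


def componentwise_l2_boost(X, y, m_max, nu=0.1):
    """Component-wise L2 boosting with simple linear base-learners.

    Assumes the columns of X are roughly centered and none is all zero.
    Returns the coefficient path (row m = coefficients after m iterations)
    and the intercept (the mean of y).
    """
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    path = np.zeros((m_max + 1, p))
    resid = y - intercept
    col_ss = (X ** 2).sum(axis=0)
    for m in range(1, m_max + 1):
        # Least-squares slope of the current residuals on each covariate.
        beta = X.T @ resid / col_ss
        # Residual sum of squares each single-covariate fit would leave.
        rss = ((resid[:, None] - X * beta) ** 2).sum(axis=0)
        j = int(np.argmin(rss))
        # Update only the best-fitting component, shrunken by nu.
        coef[j] += nu * beta[j]
        resid = resid - nu * beta[j] * X[:, j]
        path[m] = coef
    return path, intercept


def subsampling_stop(X, y, m_max, n_splits=25, nu=0.1, seed=0):
    """Pick the iteration that minimizes the average out-of-sample squared error."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    risk = np.zeros(m_max + 1)
    for _ in range(n_splits):
        idx = rng.permutation(n)
        train, test = idx[: n // 2], idx[n // 2:]
        path, intercept = componentwise_l2_boost(X[train], y[train], m_max, nu)
        # Predictions on the held-out half for every iteration at once.
        pred = intercept + X[test] @ path.T
        risk += ((y[test][:, None] - pred) ** 2).mean(axis=0)
    risk /= n_splits
    return int(np.argmin(risk)), risk
```

Calling `subsampling_stop(X, y, m_max=100)` returns the selected iteration together with the averaged risk curve. Stopping early leaves most coefficients at zero, which is why the iteration number acts as the main tuning parameter.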