Unbiased split variable selection for random survival forests using maximally selected rank statistics.

Marvin N Wright, Theresa Dankowski, Andreas Ziegler.

Abstract

The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used.
Copyright © 2017 John Wiley & Sons, Ltd.
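
The split criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes no tied survival times, uses hypothetical function names, restricts cutpoints with an assumed `minprop` parameter, and replaces the p-value approximations discussed in the paper with a crude Bonferroni bound that serves only as a stand-in. For one candidate variable, it computes log-rank scores, standardizes the linear rank statistic at every admissible cutpoint, and takes the maximum.

```python
import math


def logrank_scores(times, status):
    """Log-rank scores; assumes no tied survival times (a simplification)."""
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    scores = [0.0] * n
    cum = 0.0
    for rank, i in enumerate(order):
        # n - rank subjects are still at risk at this ordered time
        cum += status[i] / (n - rank)
        scores[i] = status[i] - cum
    return scores


def max_selected_rank_stat(x, scores, minprop=0.1):
    """Return (max standardized statistic, best cutpoint, number of cutpoints)."""
    n = len(x)
    mean = sum(scores) / n
    ss = sum((a - mean) ** 2 for a in scores)
    order = sorted(range(n), key=lambda i: x[i])
    best_stat, best_cut, n_cuts = 0.0, None, 0
    left_sum = 0.0
    for m in range(1, n):
        i = order[m - 1]
        left_sum += scores[i]
        nxt = x[order[m]]
        if x[i] == nxt:
            continue  # no cutpoint between equal covariate values
        if m < minprop * n or m > (1 - minprop) * n:
            continue  # skip cutpoints that leave a very small group
        # Permutation variance of the sum of m scores drawn without replacement
        var = m * (n - m) / (n * (n - 1)) * ss
        if var <= 0:
            continue
        stat = abs(left_sum - m * mean) / math.sqrt(var)
        n_cuts += 1
        if stat > best_stat:
            best_stat, best_cut = stat, (x[i] + nxt) / 2
    return best_stat, best_cut, n_cuts


def bonferroni_p(stat, n_cuts):
    """Conservative stand-in p-value, NOT one of the paper's approximations."""
    if n_cuts == 0:
        return 1.0
    # erfc(z / sqrt(2)) is the two-sided standard normal tail probability
    return min(1.0, n_cuts * math.erfc(stat / math.sqrt(2)))


# Illustrative data: ten subjects, all events observed, one dichotomous covariate
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
status = [1] * 10
x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = logrank_scores(times, status)
stat, cut, k = max_selected_rank_stat(x, scores)
print(stat, cut, bonferroni_p(stat, k))
```

In a forest, this computation would be repeated for each candidate variable at a node, and the variable with the smallest p-value would be chosen for splitting. Comparing p-values rather than raw statistics is what removes the advantage of variables with many possible split points.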

Keywords:  maximally selected statistics; random forests; rank statistics; survival analysis; trees

Year: 2017    PMID: 28088842    DOI: 10.1002/sim.7212

Source DB: PubMed    Journal: Stat Med    ISSN: 0277-6715    Impact factor: 2.373

