Jaime Lynn Speiser (1), Kathryn E Callahan (2), Denise K Houston (2), Jason Fanning (3), Thomas M Gill (4), Jack M Guralnik (5), Anne B Newman (6), Marco Pahor (7), W Jack Rejeski (3), Michael E Miller (1)
1. Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, North Carolina.
2. Department of Internal Medicine, Section on Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina.
3. Department of Health and Exercise Science, Wake Forest University, Winston-Salem, North Carolina.
4. Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut.
5. Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore.
6. Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pennsylvania.
7. Department of Aging and Geriatric Research, University of Florida, Gainesville.
Abstract
BACKGROUND: Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in the diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains because of poor reproducibility and the difficulty of understanding the complex algorithms that underlie the models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability.
METHOD: We discuss the underlying algorithms of the decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study.
RESULTS: Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. In the LIFE study data, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest).
CONCLUSIONS: Machine learning methods offer an alternative to traditional approaches for modeling outcomes in aging, but their use should be justified and their output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
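The three ideas named in the abstract (a tree as a flow chart, a forest as an aggregate of trees, and the area under the receiver operating characteristic curve as an evaluation metric) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the predictor names (`age`, `gait_speed`, `prior_falls`), the thresholds, the risk values, and the six made-up participants are hypothetical and are not the variables, cut points, or data of the LIFE study. The trees are written by hand rather than fitted, so the bootstrap sampling and random predictor selection used to grow a real random forest are omitted.

```python
from statistics import mean

# Hypothetical predictors and thresholds, chosen only for illustration;
# they are NOT the variables or cut points from the LIFE study models.

def tree_gait(p):
    """One decision tree written as a flow chart: each `if` is a split."""
    if p["gait_speed"] < 0.8:  # slower walkers (m/s)
        return 0.30 if p["age"] >= 80 else 0.15
    return 0.05

def tree_falls(p):
    """A second hand-written tree that splits on fall history first."""
    if p["prior_falls"] >= 1:
        return 0.25
    return 0.10 if p["age"] >= 80 else 0.05

def tree_age(p):
    """A third tree with a single split on age."""
    return 0.20 if p["age"] >= 75 else 0.05

FOREST = [tree_gait, tree_falls, tree_age]

def forest_predict(p, trees=FOREST):
    """Aggregate the trees by averaging their predicted risks."""
    return mean(tree(p) for tree in trees)

def auc(scores, labels):
    """AUC as the chance that a random case outscores a random non-case.

    Ties between a case and a non-case count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Made-up participants: (features, 1 = serious fall injury, 0 = none)
people = [
    ({"age": 82, "gait_speed": 0.60, "prior_falls": 2}, 1),
    ({"age": 78, "gait_speed": 0.90, "prior_falls": 0}, 0),
    ({"age": 71, "gait_speed": 0.70, "prior_falls": 1}, 1),
    ({"age": 85, "gait_speed": 0.70, "prior_falls": 0}, 0),
    ({"age": 73, "gait_speed": 1.10, "prior_falls": 0}, 0),
    ({"age": 76, "gait_speed": 0.75, "prior_falls": 0}, 1),
]
labels = [y for _, y in people]
tree_auc = auc([tree_gait(p) for p, _ in people], labels)
forest_auc = auc([forest_predict(p) for p, _ in people], labels)
print(f"single tree AUC: {tree_auc:.2f}, forest AUC: {forest_auc:.2f}")
```

An AUC of 0.5 corresponds to ranking cases and non-cases no better than chance, which is the yardstick for reading the reported values of 0.54 and 0.66. The numbers printed by this sketch describe only the invented participants and say nothing about the study's models.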