Probability machines: consistent probability estimation using nonparametric learning machines.

J D Malley, J Kruppa, A Dasgupta, K G Malley, A Ziegler.

Abstract

BACKGROUND: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem.
OBJECTIVES: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities.
METHODS: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians.
RESULTS: Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software.
CONCLUSIONS: Random forest algorithms and nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Free implementations are available in R and may be used for applications.
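
As an illustration of the nearest neighbor idea summarized above, the following is a minimal sketch in plain Python. It is not the authors' R code. It treats the binary response as a number (0 or 1) and averages it over the k closest training points, which is nearest neighbor regression applied to a 0/1 outcome. The function names `knn_probability` and `simulate`, the choice of k, the Euclidean distance, and the simulated model P(y = 1 | x) = x are all assumptions made for this example.

```python
import math
import random


def knn_probability(train_x, train_y, query, k):
    """Estimate P(y = 1 | x = query) as the mean of the 0/1 responses
    of the k training points closest to `query` (Euclidean distance)."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training indices by distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    # Averaging binary labels is kNN regression on a 0/1 response.
    return sum(train_y[i] for i in nearest) / k


def simulate(n, rng):
    """Hypothetical toy model: one feature x ~ U(0, 1), P(y = 1 | x) = x."""
    xs = [(rng.random(),) for _ in range(n)]
    ys = [1 if rng.random() < x[0] else 0 for x in xs]
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = simulate(2000, rng)
    for q in (0.2, 0.5, 0.8):
        p_hat = knn_probability(xs, ys, (q,), k=200)
        print(f"true p = {q:.1f}, estimated p = {p_hat:.2f}")
```

The output is a probability estimate for each query point instead of a hard class label. In this sketch k is fixed by hand; in practice it would need to grow with the sample size, more slowly than the sample size itself, for the estimate to behave well.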

Year: 2011; PMID: 21915433; PMCID: PMC3250568; DOI: 10.3414/ME00-01-0052

Source DB: PubMed; Journal: Methods Inf Med; ISSN: 0026-1270; Impact factor: 2.176


