An evaluation of the bootstrap for model validation in mixture models.

Thomas Jaki, Ting-Li Su, Minjung Kim, M Lee Van Horn.

Abstract

Bootstrapping has been used as a diagnostic tool for validating model results for a wide array of statistical models. Here we evaluate the use of the non-parametric bootstrap for model validation in mixture models. We show that the bootstrap is problematic for validating the results of class enumeration and for demonstrating the stability of parameter estimates in both finite mixture and regression mixture models. In only 44% of simulations did bootstrapping detect the correct number of classes in at least 90% of the bootstrap samples for a finite mixture model without any model violations. For regression mixture models and cases with violated model assumptions, the performance was even worse. Consequently, we cannot recommend the non-parametric bootstrap for validating mixture models. The cause of the problem is that, when resampling is used, influential individual observations have a high likelihood of being sampled many times. The presence of multiple replications of even moderately extreme observations is shown to lead to additional latent classes being extracted. To verify that these replications cause the problems, we show that leave-k-out cross-validation, where sub-samples are taken without replacement, does not suffer from the same problem.
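
The difference between the two resampling schemes described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation code: the sample size `n`, the number of held-out observations `k`, the number of replications `reps`, and all function names are hypothetical choices made for illustration, and the mixture-model fitting step is omitted entirely. The sketch only shows that bootstrap samples (drawn with replacement) contain repeated observations, whereas leave-k-out sub-samples (drawn without replacement) never do.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices with replacement (non-parametric bootstrap)."""
    return rng.choices(range(n), k=n)


def leave_k_out_indices(n, k, rng):
    """Draw n - k indices without replacement (leave-k-out sub-sample)."""
    return rng.sample(range(n), n - k)


def max_multiplicity(indices):
    """Largest number of times any single observation appears."""
    return max(Counter(indices).values())


# Illustrative settings only; none of these values come from the paper.
rng = random.Random(0)
n, k, reps = 500, 50, 200

boot_max = [max_multiplicity(bootstrap_indices(n, rng)) for _ in range(reps)]
sub_max = [max_multiplicity(leave_k_out_indices(n, k, rng)) for _ in range(reps)]

print("bootstrap, smallest max multiplicity over replications:", min(boot_max))
print("leave-k-out, largest max multiplicity over replications:", max(sub_max))
```

In a full validation study each index set would be used to refit the mixture model and re-run class enumeration. If an influential observation is among those duplicated in a bootstrap sample, its replicates can look like a small extra cluster, which is the mechanism the abstract gives for the extraction of additional latent classes.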

Keywords:  Finite mixture models; leave-k-out cross-validation; model validation; nonparametric Bootstrap; regression mixture models

Year:  2017        PMID: 30533972      PMCID: PMC6284826          DOI: 10.1080/03610918.2017.1303726

Source DB:  PubMed          Journal:  Commun Stat Simul Comput        ISSN: 0361-0918            Impact factor:   1.118


  8 in total

1.  Distributional assumptions of growth mixture models: implications for overextraction of latent trajectory classes.

Authors:  Daniel J Bauer; Patrick J Curran
Journal:  Psychol Methods       Date:  2003-09

2.  The integration of continuous and discrete latent variable models: potential problems and promising opportunities.

Authors:  Daniel J Bauer; Patrick J Curran
Journal:  Psychol Methods       Date:  2004-03

3.  Using Multilevel Mixtures to Evaluate Intervention Effects in Group Randomized Trials.

Authors:  M Lee Van Horn; Abigail A Fagan; Thomas Jaki; Eric C Brown; J David Hawkins; Michael W Arthur; Robert D Abbott; Richard F Catalano
Journal:  Multivariate Behav Res       Date:  2008 Apr-Jun       Impact factor: 5.923

4.  Using regression mixture models with non-normal data: Examining an ordered polytomous approach.

Authors:  Melissa R W George; Na Yang; M Lee Van Horn; Jessalyn Smith; Thomas Jaki; Dan Feaster; Katherine Masyn; George Howe
Journal:  J Stat Comput Simul       Date:  2013-01-01       Impact factor: 1.424

5.  Not quite normal: Consequences of violating the assumption of normality in regression mixture models.

Authors:  M Lee Van Horn; Jessalyn Smith; Abigail A Fagan; Thomas Jaki; Daniel J Feaster; Katherine Masyn; J David Hawkins; George Howe
Journal:  Struct Equ Modeling       Date:  2012-05-17       Impact factor: 6.125

6.  Assessing differential effects: applying regression mixture models to identify variations in the influence of family resources on academic achievement.

Authors:  M Lee Van Horn; Thomas Jaki; Katherine Masyn; Sharon Landesman Ramey; Jessalyn A Smith; Susan Antaramian
Journal:  Dev Psychol       Date:  2009-09

7.  Differential Effects for Sexual Risk Behavior: An Application of Finite Mixture Regression.

Authors:  Stephanie T Lanza; Kari C Kugler; Charu Mathur
Journal:  Open Fam Stud J       Date:  2011

8.  Evaluating differential effects using regression interactions and regression mixture models.

Authors:  M Lee Van Horn; Thomas Jaki; Katherine Masyn; George Howe; Daniel J Feaster; Andrea E Lamont; Melissa R W George; Minjung Kim
Journal:  Educ Psychol Meas       Date:  2014-10-28       Impact factor: 2.821

  2 in total

1.  ClickGene: an open cloud-based platform for big pan-cancer data genome-wide association study, visualization and exploration.

Authors:  Jia-Hao Bi; Yi-Fan Tong; Zhe-Wei Qiu; Xing-Feng Yang; John Minna; Adi F Gazdar; Kai Song
Journal:  BioData Min       Date:  2019-06-26       Impact factor: 2.522

2.  Analysis of risk factors and establishment of a risk prediction model for post-transplant diabetes mellitus after kidney transplantation.

Authors:  Fang Cheng; Qiang Li; Jinglin Wang; Zhendi Wang; Fang Zeng; Yu Zhang
Journal:  Saudi Pharm J       Date:  2022-06-02       Impact factor: 4.562

