
A comparative study of variable selection methods in the context of developing psychiatric screening instruments.

Feihan Lu, Eva Petkova.

Abstract

The development of screening instruments for psychiatric disorders involves selecting items from a pool of items in existing questionnaires that assess clinical and behavioral phenotypes. A screening instrument should consist of only a few items and classify cases and non-cases accurately. Variable (item) selection methods such as the Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, Classification and Regression Tree, Random Forest, and the two-sample t-test can be used in this context. Unlike the settings where variable selection methods are most commonly applied (e.g., ultra-high-dimensional genetic or imaging data), psychiatric data usually have lower dimensions and are characterized by the following factors: correlations and possible interactions among predictors, unobservability of important variables (i.e., true variables not measured by available questionnaires), the amount and pattern of missing values in the predictors, and the prevalence of cases in the training data. We investigate via simulations how these factors affect the performance of several variable selection methods and compare the methods with respect to selection performance and prediction error rate. Our results demonstrated that (1) for complete data, LASSO and Elastic Net outperformed the other methods with respect to variable selection and future data prediction, and (2) for certain types of incomplete data, Random Forest induced bias in imputation, leading to incorrect ranking of variable importance. We propose the Imputed-LASSO, which combines Random Forest imputation with LASSO; this approach offsets the bias in Random Forest and offers a simple yet efficient item selection approach for missing data. As an illustration, we apply the methods to items from the standard Autism Diagnostic Interview-Revised version.
Copyright © 2013 John Wiley & Sons, Ltd.
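
To make concrete how LASSO performs item selection, the following is a minimal sketch, not the authors' implementation. It assumes complete data (no imputation step), a continuous outcome with squared-error loss rather than the case/non-case classification studied in the paper, and simulated items. The function names (`soft_threshold`, `standardize`, `lasso`), the penalty value `lam=0.2`, and the simulated data are all illustrative assumptions. The sketch uses cyclic coordinate descent, in which each coefficient is updated by soft-thresholding, so that weakly associated items receive a coefficient of exactly zero and drop out of the instrument.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values in [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def standardize(X):
    """Center each column and scale it so that its mean square equals 1."""
    n, p = len(X), len(X[0])
    Z = [row[:] for row in X]
    for j in range(p):
        mean = sum(row[j] for row in X) / n
        sd = (sum((row[j] - mean) ** 2 for row in X) / n) ** 0.5
        for i in range(n):
            Z[i][j] = (X[i][j] - mean) / sd
    return Z


def lasso(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic
    coordinate descent. Assumes standardized columns and centered y."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of item j with the residual that excludes item j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Hypothetical simulated data: 6 items, only items 0 and 1 drive the outcome.
rng = random.Random(0)
n_subjects, n_items = 200, 6
raw = [[rng.gauss(0, 1) for _ in range(n_items)] for _ in range(n_subjects)]
score = [1.0 * r[0] - 0.8 * r[1] + 0.5 * rng.gauss(0, 1) for r in raw]
mean_score = sum(score) / n_subjects
y = [s - mean_score for s in score]
X = standardize(raw)

beta = lasso(X, y, lam=0.2)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected items:", selected)
```

Items with a nonzero coefficient form the candidate screening set. In practice the penalty would usually be chosen by cross-validation, and for incomplete data the abstract's Imputed-LASSO would first fill in missing values with Random Forest imputation before a step of this kind.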

Keywords:  classification and regression tree; elastic net; least absolute shrinkage and selection operator; missing data imputation; random forest; two-sample t-test

Year:  2013        PMID: 23934941      PMCID: PMC4026268          DOI: 10.1002/sim.5937

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


