Using data mining techniques to characterize participation in observational studies.

Ariel Linden, Paul R Yarnold.

Abstract

Data mining techniques are gaining popularity among health researchers for an array of purposes, such as improving diagnostic accuracy, identifying high-risk patients and extracting concepts from unstructured data. In this paper, we describe how these techniques can be applied to another area in the health research domain: identifying characteristics of individuals who do and do not choose to participate in observational studies. In contrast to randomized studies, where individuals have no control over their treatment assignment, participants in observational studies self-select into the treatment arm and may therefore differ in their characteristics from those who elect not to participate. These differences may explain part, or all, of the difference in the observed outcome, making it crucial to assess whether there is differential participation based on observed characteristics. Compared with traditional approaches to this assessment, data mining offers a more precise understanding of these differences. To describe and illustrate the application of data mining in this domain, we use data from a primary care-based medical home pilot programme and compare the performance of commonly used classification approaches (logistic regression, support vector machines, random forests and classification tree analysis (CTA)) in correctly classifying participants and non-participants. We find that CTA is substantially more accurate than the other models. Moreover, unlike the other models, CTA offers transparency in its computational approach, is easy to interpret via the decision rules it produces and provides statistical results familiar to health researchers. Beyond their application to research, data mining techniques could help administrators to identify new candidates for participation who may benefit most from the intervention.
© 2016 John Wiley & Sons, Ltd.
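
As a rough illustration of the kind of decision rule the abstract refers to, the sketch below searches for a single cutpoint on one observed characteristic that best separates participants from non-participants. It is not the authors' CTA procedure or software: the data are synthetic, the characteristic (`prior_visits`), the function name `best_cutpoint` and the scoring criterion (balanced accuracy, the mean of sensitivity and specificity) are all assumptions chosen for the example.

```python
# Illustrative sketch only: synthetic data and a one-split rule,
# NOT the classification tree analysis (CTA) used in the paper.
from typing import List, Tuple


def best_cutpoint(values: List[float], labels: List[int]) -> Tuple[float, str, float]:
    """Return (cutpoint, direction, balanced_accuracy) for the single split
    that best separates participants (1) from non-participants (0).

    direction ">" means "predict participant when value > cutpoint";
    direction "<=" means "predict participant when value <= cutpoint".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both participants and non-participants are required")

    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("the characteristic must take at least two values")

    best = (distinct[0], ">", 0.0)
    # Candidate cutpoints are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        cut = (low + high) / 2
        tp = sum(1 for v, y in zip(values, labels) if v > cut and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v <= cut and y == 0)
        balanced = (tp / n_pos + tn / n_neg) / 2
        # Reversing the rule flips every prediction, giving 1 - balanced.
        for direction, score in ((">", balanced), ("<=", 1 - balanced)):
            if score > best[2]:
                best = (cut, direction, score)
    return best


# Hypothetical characteristic: office visits in the prior year (made-up values).
prior_visits = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9]
participated = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

cut, direction, score = best_cutpoint(prior_visits, participated)
print(f"Rule: participant if prior_visits {direction} {cut} "
      f"(balanced accuracy {score:.3f})")
```

A full tree would apply the same search recursively within each resulting subgroup and across several characteristics; the single split here is only meant to show why such rules are easy to read.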

Keywords:  data mining; machine learning; observational studies; observed characteristics; selection; selection bias

Year:  2016        PMID: 26805004     DOI: 10.1111/jep.12515

Source DB:  PubMed          Journal:  J Eval Clin Pract        ISSN: 1356-1294            Impact factor:   2.431


  8 in total

1.  Predictors of enrollment in individual- and couple-based lifestyle intervention trials for cancer survivors.

Authors:  Emily Cox-Martin; Jaejoon Song; Wendy Demark-Wahnefried; Elizabeth J Lyons; Karen Basen-Engquist
Journal:  Support Care Cancer       Date:  2018-02-08       Impact factor: 3.603

2.  Patterns of attendance to health checks in a municipality setting: the Danish 'Check Your Health Preventive Program'.

Authors:  Anne-Louise Bjerregaard; Helle T Maindal; Niels Henrik Bruun; Annelli Sandbæk
Journal:  Prev Med Rep       Date:  2016-12-21

3.  Evaluation of the diagnostic performance of a decision tree model in suspected acute appendicitis with equivocal preoperative computed tomography findings compared with Alvarado, Eskelinen, and adult appendicitis scores: A STARD compliant article.

Authors:  Hyo Jung Kang; Hyuncheol Kang; Bohyun Kim; Min Seok Chae; Young Rock Ha; Seong Beom Oh; Jung Hwan Ahn
Journal:  Medicine (Baltimore)       Date:  2019-10       Impact factor: 1.889

4.  A review of statistical methods for dietary pattern analysis. [Review]

Authors:  Junkang Zhao; Zhiyao Li; Qian Gao; Haifeng Zhao; Shuting Chen; Lun Huang; Wenjie Wang; Tong Wang
Journal:  Nutr J       Date:  2021-04-19       Impact factor: 3.271

5.  Activity Preferences Among Older People With Dementia Residing in Nursing Homes.

Authors:  Eun-Young Park; Jung-Hee Kim
Journal:  Front Psychol       Date:  2022-01-20

6.  Improving the reproducibility of findings by updating research methodology.

Authors:  Joseph Klein
Journal:  Qual Quant       Date:  2021-07-08

7.  Persistent erectile dysfunction in men exposed to the 5α-reductase inhibitors, finasteride, or dutasteride.

Authors:  Tina Kiguradze; William H Temps; Steven M Belknap; Paul R Yarnold; John Cashy; Robert E Brannigan; Beatrice Nardone; Giuseppe Micali; Dennis Paul West
Journal:  PeerJ       Date:  2017-03-09       Impact factor: 2.984

8.  Using Predictive Analytics to Identify Children at High Risk of Defaulting From a Routine Immunization Program: Feasibility Study.

Authors:  Subhash Chandir; Danya Arif Siddiqi; Owais Ahmed Hussain; Tahira Niazi; Mubarak Taighoon Shah; Vijay Kumar Dharma; Ali Habib; Aamir Javed Khan
Journal:  JMIR Public Health Surveill       Date:  2018-09-04
