Dario Gregori, Michele Petrinco, Simona Bo, Rosalba Rosato, Eva Pagano, Paola Berchialla, Franco Merletti.
Abstract
We aim to evaluate how data-mining statistical techniques can be applied to medical records and administrative data on diabetes, and how they differ in their ability to predict outcomes (e.g. death). We used data on 3,892 outpatients with a diagnosis of type 2 diabetes from the San Giovanni Battista Hospital in Torino. Six statistical classifiers were applied: logistic regression (LR), generalized additive model (GAM), projection pursuit regression (PPR), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and artificial neural networks (ANN). All models selected the same subset of covariates. ANN is the worst-performing model, whereas simpler models such as LR, GAM and LDA seem to perform better. GAM is associated with a very small misclassification rate. The agreement among models in predicting individual outcomes is 0.23 (kappa; SE 0.06). Monitoring on the basis of patients' characteristics is highly dependent on the statistical properties of the chosen model.
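The agreement figure reported in the abstract is a kappa statistic computed across several classifiers. The abstract does not say which kappa variant was used, so the sketch below assumes Fleiss' kappa, a common choice when more than two raters (here, models) label the same subjects. The function name `fleiss_kappa` and the toy predictions are hypothetical and are not data from the study.

```python
from collections import Counter


def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for agreement among several classifiers.

    ratings: list of per-patient lists, one predicted label per model.
    Assumes every patient is labelled by the same number of models.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])

    # For each patient, count how many models chose each category.
    counts = [Counter(row) for row in ratings]

    # Observed agreement per patient, then averaged over patients.
    p_i = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Agreement expected by chance, from the overall category proportions.
    p_j = [
        sum(c[k] for c in counts) / (n_subjects * n_raters)
        for k in categories
    ]
    p_e = sum(p ** 2 for p in p_j)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1 - p_e)


# Toy example (made-up labels): 4 patients, 3 models, 1 = predicted death.
toy_predictions = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
]
kappa = fleiss_kappa(toy_predictions)
print(f"Fleiss' kappa on toy data: {kappa:.2f}")
```

A value near 1 means the models almost always predict the same outcome for the same patient, while a value near 0 means they agree little more than chance would produce. The reported 0.23 lies towards the low end of that range.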
Year: 2009 PMID: 20703563 DOI: 10.1007/s10916-009-9363-9
Source DB: PubMed Journal: J Med Syst ISSN: 0148-5598 Impact factor: 4.460