
Studies in the use of data mining, prediction algorithms, and a universal exchange and inference language in the analysis of socioeconomic health data.

Barry Robson, S Boray.

Abstract

While clinical and biomedical information in digital form has been escalating, it is socioeconomic factors that are important determinants of health on the national and global scale. We show how the collective use of data mining and prediction algorithms to analyze socioeconomic population health data can stand beside classical correlation analysis in routine data analysis. The underlying theoretical basis is the Dirac notation and algebra, a scientific standard that is unusual outside the physical sciences, combined with a theory of expected information first developed for analyzing sparse data but still largely confined to bioinformatics. The latter was important here because the records analyzed (which are for US counties and equivalents, not patients) are very few by contemporary data mining standards. The approach is very unlikely to be familiar to socioeconomic researchers, so the theory and the advantages of our inference nets over the Bayes Net are reviewed here, mostly using socioeconomic examples. Our expertise and focus are in novel analytical methods rather than socioeconomics per se, but a significant negative (countertrending) relationship between population health and equity was initially surprising, at least to the present authors. This encouraged deeper exploration, including of the relationship between our data mining methods and traditional Pearson's correlation. Pearson's correlation can give wrong conclusions if a phenomenon called Simpson's paradox applies, so this is also investigated. Also discussed is that, even for very few records, associative data mining can still demand significant computational resources due to a combinatorial explosion.
Copyright © 2019 Elsevier Ltd. All rights reserved.
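
Illustrative note (not part of the published abstract): the Simpson's paradox issue mentioned above can be sketched with a small, entirely invented dataset. The two groups, their values, and the helper name `pearson` are assumptions made for illustration only; nothing here is taken from the county data or the methods of the paper. Within each hypothetical group the correlation is negative, yet pooling the groups gives a positive Pearson coefficient.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: two subgroups, each with a perfectly negative trend.
group_a_x, group_a_y = [1, 2, 3, 4], [4, 3, 2, 1]
group_b_x, group_b_y = [6, 7, 8, 9], [9, 8, 7, 6]

r_a = pearson(group_a_x, group_a_y)   # negative within group A
r_b = pearson(group_b_x, group_b_y)   # negative within group B

# Pooling ignores group membership; the between-group offset dominates.
r_pooled = pearson(group_a_x + group_b_x, group_a_y + group_b_y)

print(f"group A: {r_a:+.3f}, group B: {r_b:+.3f}, pooled: {r_pooled:+.3f}")
```

The sign reversal arises because the difference between the group means outweighs the trend inside each group, which is why a pooled correlation alone may mislead when a stratifying variable is present.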

Keywords:  Bayes net; Data analytics; Data mining; Decision support; Hyperbolic Dirac net; Inference net; Population health; Socioeconomic; Sparse data

Year:  2019        PMID: 31377681     DOI: 10.1016/j.compbiomed.2019.103369

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  3 in total

Review 1.  Towards faster response against emerging epidemics and prediction of variants of concern.

Authors:  B Robson
Journal:  Inform Med Unlocked       Date:  2022-05-20

Review 2.  A scoping review on the use of machine learning in research on social determinants of health: Trends and research prospects.

Authors:  Shiho Kino; Yu-Tien Hsu; Koichiro Shiba; Yung-Shin Chien; Carol Mita; Ichiro Kawachi; Adel Daoud
Journal:  SSM Popul Health       Date:  2021-06-05

3.  Computers and viral diseases. Preliminary bioinformatics studies on the design of a synthetic vaccine and a preventative peptidomimetic antagonist against the SARS-CoV-2 (2019-nCoV, COVID-19) coronavirus.

Authors:  B Robson
Journal:  Comput Biol Med       Date:  2020-02-26       Impact factor: 4.589
