Hong-Jun Yoon, Georgia Tourassi.
Abstract
Openly available online sources can be very valuable for conducting in silico case-control epidemiological studies. Such studies must adjust for confounding factors to isolate the association between the factor under observation and disease. However, the information needed for this adjustment is not always readily available online. This paper proposes natural language processing methods for extracting socio-demographic information from openly available online content. The feasibility of the proposed method is demonstrated with a case-control study of the association of age, gender, and income level with lung cancer risk. The study finds that older age and lower socioeconomic status are more strongly associated with higher lung cancer risk, which is consistent with findings reported in traditional cancer epidemiology studies.
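To illustrate why adjusting for a confounder matters in a case-control design, the following is a minimal sketch that compares a crude odds ratio with a Mantel-Haenszel odds ratio stratified by age group. The abstract does not state which adjustment technique the authors used, so Mantel-Haenszel is an assumption chosen for illustration. The `Stratum` class, the function names, and all counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stratum:
    """2x2 case-control counts within one level of a confounder (e.g. an age group)."""
    exposed_cases: int
    unexposed_cases: int
    exposed_controls: int
    unexposed_controls: int

    @property
    def total(self) -> int:
        return (self.exposed_cases + self.unexposed_cases
                + self.exposed_controls + self.unexposed_controls)


def crude_odds_ratio(strata: List[Stratum]) -> float:
    """Odds ratio after pooling all strata, i.e. ignoring the confounder."""
    a = sum(s.exposed_cases for s in strata)
    b = sum(s.unexposed_cases for s in strata)
    c = sum(s.exposed_controls for s in strata)
    d = sum(s.unexposed_controls for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled odds ratio, adjusted for the stratifying factor."""
    numerator = sum(s.exposed_cases * s.unexposed_controls / s.total for s in strata)
    denominator = sum(s.unexposed_cases * s.exposed_controls / s.total for s in strata)
    return numerator / denominator


# Hypothetical counts: exposure = lower income, strata = younger / older age group.
# Within each stratum the odds ratio is 1, but exposure and disease are both
# more common in the older group, so pooling produces a spurious association.
strata = [
    Stratum(exposed_cases=5, unexposed_cases=10, exposed_controls=20, unexposed_controls=40),
    Stratum(exposed_cases=40, unexposed_cases=10, exposed_controls=20, unexposed_controls=5),
]

print("crude OR:", crude_odds_ratio(strata))
print("age-adjusted OR:", mantel_haenszel_odds_ratio(strata))
```

With these made-up counts the crude estimate suggests an association that disappears once age is accounted for, which is the kind of distortion that extracted socio-demographic information would let a study correct.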
Year: 2016 PMID: 27754498 PMCID: PMC5049879 DOI: 10.1109/BHI.2016.7455958
Source DB: PubMed Journal: IEEE EMBS Int Conf Biomed Health Inform