The utility of web mining for epidemiological research: studying the association between parity and cancer risk.

Georgia Tourassi, Hong-Jun Yoon, Songhua Xu, Xuesong Han.

Abstract

BACKGROUND: The World Wide Web has emerged as a powerful data source for epidemiological studies related to infectious disease surveillance. However, its potential for cancer-related epidemiological discoveries is largely unexplored.
METHODS: Using advanced web crawling and tailored information extraction procedures, the authors automatically collected and analyzed the text content of 79 394 online obituary articles published between 1998 and 2014. The collected data included 51 911 cancer (27 330 breast; 9470 lung; 6496 pancreatic; 6342 ovarian; 2273 colon) and 27 483 non-cancer cases. With the derived information, the authors replicated a case-control study design to investigate the association between parity (i.e., childbearing) and cancer risk. Age-adjusted odds ratios (ORs) with 95% confidence intervals (CIs) were calculated for each cancer type and compared to those reported in large-scale epidemiological studies.
RESULTS: Parity was found to be associated with a significantly reduced risk of breast cancer (OR = 0.78, 95% CI, 0.75-0.82), pancreatic cancer (OR = 0.78, 95% CI, 0.72-0.83), colon cancer (OR = 0.67, 95% CI, 0.60-0.74), and ovarian cancer (OR = 0.58, 95% CI, 0.54-0.62). A marginal association was found for lung cancer risk (OR = 0.87, 95% CI, 0.81-0.92). The linear trend between increased parity and reduced cancer risk was dramatically more pronounced for breast and ovarian cancer than for the other cancers included in the analysis.
CONCLUSION: This large web-mining study on parity and cancer risk produced findings very similar to those reported in traditional observational studies. Web mining may be a promising strategy for generating study hypotheses to guide and prioritize future epidemiological studies.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
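
The abstract reports odds ratios with 95% confidence intervals but does not give the underlying exposure counts or the adjustment procedure. As a rough illustration only, the sketch below computes a crude (unadjusted) odds ratio with a Wald-type 95% CI from a 2x2 case-control table. The function name, the argument names, and the counts in the usage example are all hypothetical; the age adjustment used by the authors is not reproduced here.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    This is an unadjusted estimate (an assumption for illustration);
    it does not perform the age adjustment described in the abstract.
    z=1.96 corresponds to an approximate 95% interval.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")

    # OR = (a * d) / (b * c) for the table [[a, b], [c, d]]
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )

    # Standard error of log(OR): sqrt(1/a + 1/b + 1/c + 1/d)
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))

    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study:
    # "exposed" = parous, "unexposed" = nulliparous.
    or_, lo, hi = odds_ratio_ci(
        exposed_cases=300, unexposed_cases=200,
        exposed_controls=400, unexposed_controls=150,
    )
    print(f"OR = {or_:.2f}, 95% CI, {lo:.2f}-{hi:.2f}")
```

An odds ratio below 1 with an interval that excludes 1 would be read, as in the RESULTS above, as a reduced risk associated with the exposure. An adjusted estimate would typically come from a stratified or regression-based analysis rather than from this single table.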

Keywords:  cancer risk; digital epidemiology; parity; web mining

Year: 2015; PMID: 26615183; PMCID: PMC4901372; DOI: 10.1093/jamia/ocv141

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497


