Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter.

Ari Z Klein, Abeed Sarker, Haitao Cai, Davy Weissenbacher, Graciela Gonzalez-Hernandez.

Abstract

BACKGROUND: Although birth defects are the leading cause of infant mortality in the United States, methods for observing human pregnancies with birth defect outcomes are limited.
OBJECTIVE: The primary objectives of this study were (i) to assess whether rare health-related events, in this case birth defects, are reported on social media, (ii) to design and deploy a natural language processing (NLP) approach for collecting such sparse data from social media, and (iii) to utilize the collected data to discover a cohort of women whose pregnancies with birth defect outcomes could be observed on social media for epidemiological analysis.
METHODS: To assess whether birth defects are mentioned on social media, we mined 432 million tweets posted by 112,647 users who were automatically identified via their public announcements of pregnancy on Twitter. To retrieve tweets that mention birth defects, we developed a rule-based, bootstrapping approach, which relies on a lexicon, lexical variants generated from the lexicon entries, regular expressions, post-processing, and manual analysis guided by distributional properties. To identify users whose pregnancies with birth defect outcomes could be observed for epidemiological analysis, we applied two inclusion criteria: (i) tweets indicating that the user's child has a birth defect, and (ii) availability of the user's tweets posted during pregnancy. We conducted a semi-automatic evaluation to estimate the recall of the tweet-collection approach, and performed a preliminary assessment of the prevalence of selected birth defects among the pregnancy cohort derived from Twitter.
RESULTS: We manually annotated 16,822 retrieved tweets, distinguishing tweets indicating that the user's child has a birth defect (true positives) from tweets that merely mention birth defects (false positives). Inter-annotator agreement was substantial: κ = 0.79 (Cohen's kappa). Analyzing the timelines of the 646 users whose tweets were true positives resulted in the discovery of 195 users who met the inclusion criteria. Congenital heart defects are the most common type of birth defect reported on Twitter, consistent with findings in the general population. Based on an evaluation of 4,169 tweets retrieved using alternative text mining methods, the recall of the tweet-collection approach was 0.95.
CONCLUSIONS: Our contributions include (i) evidence that rare health-related events are indeed reported on Twitter, (ii) a generalizable, systematic NLP approach for collecting sparse tweets, (iii) a semi-automatic method to identify undetected tweets (false negatives), and (iv) a collection of publicly available tweets by pregnant users with birth defect outcomes, which could be used for future epidemiological analysis. In future work, the annotated tweets could be used to train machine learning algorithms to automatically identify users reporting birth defect outcomes, enabling the large-scale use of social media mining as a complementary method for such epidemiological research.
Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
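
The retrieval step described under METHODS (a lexicon, lexical variants generated from its entries, regular expressions, and post-processing) might look roughly like the following minimal Python sketch. It is not the authors' implementation: the three-term lexicon, the single-letter-deletion variant generator, the hyphen-tolerant pattern, and the retweet filter are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import re

# Hypothetical mini-lexicon for illustration; not the study's lexicon.
LEXICON = ["congenital heart defect", "cleft palate", "spina bifida"]


def lexical_variants(term):
    """Return the term, a naive plural, and all single-letter deletions.

    Single-letter deletion is only a crude stand-in for misspellings;
    the study's actual variant generator is not specified in the abstract.
    """
    variants = {term, term + "s"}
    for i, ch in enumerate(term):
        if ch != " ":
            variants.add(term[:i] + term[i + 1:])
    return variants


def build_pattern(lexicon):
    """Compile one case-insensitive regex covering every variant."""
    variants = set()
    for term in lexicon:
        variants |= lexical_variants(term)
    # Longest variants first, so the full form is tried before shorter ones.
    ordered = sorted(variants, key=lambda v: (-len(v), v))
    # Let words be separated by whitespace or hyphens ("cleft-palate").
    alternatives = [
        r"[\s-]+".join(re.escape(word) for word in v.split()) for v in ordered
    ]
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def retrieve(tweets, pattern):
    """Return (tweet, matched_text) pairs, skipping retweets.

    Dropping retweets is an assumed post-processing rule for this sketch.
    """
    hits = []
    for tweet in tweets:
        if tweet.startswith("RT @"):
            continue
        match = pattern.search(tweet)
        if match:
            hits.append((tweet, match.group(0)))
    return hits
```

A pattern of this kind only finds tweets that mention a birth defect. Deciding whether a tweet indicates that the user's own child has the defect (a true positive) or merely mentions one (a false positive) still requires the manual annotation described under RESULTS.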

Keywords:  Birth defects; Cohort discovery; Epidemiology; Natural language processing; Patient-reported pregnancy outcomes; Social media mining

Year: 2018; PMID: 30292855; PMCID: PMC6295660; DOI: 10.1016/j.jbi.2018.10.001

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317

