Induced lexico-syntactic patterns improve information extraction from online medical forums.

Sonal Gupta, Diana L MacLean, Jeffrey Heer, Christopher D Manning.

Abstract

OBJECTIVE: To reliably extract two entity types, symptoms and conditions (SCs), and drugs and treatments (DTs), from patient-authored text (PAT) by learning lexico-syntactic patterns from data annotated with seed dictionaries.
BACKGROUND AND SIGNIFICANCE: Despite the increasing quantity of PAT (eg, online discussion threads), tools for identifying medical entities in PAT are limited. When applied to PAT, existing tools either fail to identify specific entity types or perform poorly. Identification of SC and DT terms in PAT would enable exploration of efficacy and side effects not only for pharmaceutical drugs, but also for home remedies and components of daily care.
MATERIALS AND METHODS: We use SC and DT term dictionaries compiled from online sources to label several discussion forums from MedHelp (http://www.medhelp.org). We then iteratively induce lexico-syntactic patterns corresponding strongly to each entity type to extract new SC and DT terms.
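The label-then-induce loop described in the methods can be illustrated with a minimal sketch. This is not the authors' system: it assumes pre-tokenized sentences, single-token entities, and a "pattern" that is simply the two tokens preceding a candidate term, whereas the paper's patterns are lexico-syntactic. All function names, thresholds, and example sentences below are hypothetical.

```python
from collections import Counter


def context_pattern(tokens, i):
    """Hypothetical surface pattern: the two lowercased tokens before position i."""
    return (tokens[i - 2].lower(), tokens[i - 1].lower())


def induce_patterns(sentences, dictionary, min_score=0.5, min_count=2):
    """Keep patterns that occur often enough and mostly precede known terms."""
    hits, total = Counter(), Counter()
    for tokens in sentences:
        for i in range(2, len(tokens)):
            pattern = context_pattern(tokens, i)
            total[pattern] += 1
            # Labeling step: a token counts as an entity if the dictionary has it.
            if tokens[i].lower() in dictionary:
                hits[pattern] += 1
    return {
        p for p in total
        if total[p] >= min_count and hits[p] / total[p] >= min_score
    }


def apply_patterns(sentences, patterns, dictionary):
    """Extract tokens that follow a learned pattern but are not yet known."""
    new_terms = set()
    for tokens in sentences:
        for i in range(2, len(tokens)):
            term = tokens[i].lower()
            if context_pattern(tokens, i) in patterns and term not in dictionary:
                new_terms.add(term)
    return new_terms


def bootstrap(sentences, seeds, iterations=3):
    """Alternate pattern induction and term extraction, growing the dictionary."""
    dictionary = {s.lower() for s in seeds}
    for _ in range(iterations):
        patterns = induce_patterns(sentences, dictionary)
        new_terms = apply_patterns(sentences, patterns, dictionary)
        if not new_terms:
            break  # nothing new was learned, so further iterations change nothing
        dictionary |= new_terms
    return dictionary


# Toy forum-style sentences (invented for illustration only).
sentences = [
    "i was prescribed metformin last year".split(),
    "she was prescribed insulin yesterday".split(),
    "he was prescribed cinnamon daily".split(),
    "i was tired all week".split(),
]
learned = bootstrap(sentences, seeds={"metformin", "insulin"})
print(sorted(learned))
```

In this toy run the context "was prescribed" is learned from the two seed terms and then picks up "cinnamon", which was not in the seed dictionary. A real system would need multi-word terms and stricter pattern scoring to limit semantic drift across iterations.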
RESULTS: Our system is able to extract symptom descriptions and treatments absent from our original dictionaries, such as 'LADA', 'stabbing pain', and 'cinnamon pills'. Our system extracts DT terms with 58-70% F1 score and SC terms with 66-76% F1 score on two forums from MedHelp. We show improvements over MetaMap, OBA, a conditional random field-based classifier, and a previous pattern learning approach.
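For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below assumes exact matching over sets of extracted terms; the abstract does not state the matching granularity used in the paper's evaluation, and the example terms are invented.

```python
def f1_score(predicted, gold):
    """F1 over sets of terms: harmonic mean of precision and recall."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0  # also covers empty predictions or an empty gold set
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical extraction result: 2 of 3 predictions are correct,
# and 2 of 4 gold terms are found.
predicted = {"metformin", "cinnamon pills", "water"}
gold = {"metformin", "cinnamon pills", "insulin", "lantus"}
score = f1_score(predicted, gold)
```

Here precision is 2/3 and recall is 1/2, so F1 is 4/7, roughly 57%.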
CONCLUSIONS: Our entity extractor based on lexico-syntactic patterns is a successful and preferable technique for identifying specific entity types in PAT. To the best of our knowledge, this is the first paper to extract SC and DT entities from PAT. We exhibit learning of informal terms often used in PAT but missing from typical dictionaries.

Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

Keywords:  medical entities extraction; natural language processing; online health forums; text mining

Year:  2014        PMID: 24970840      PMCID: PMC4147618          DOI: 10.1136/amiajnl-2014-002669

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497

