Development and application of a high throughput natural language processing architecture to convert all clinical documents in a clinical data warehouse into standardized medical vocabularies.

Majid Afshar, Dmitriy Dligach, Brihat Sharma, Xiaoyuan Cai, Jason Boyda, Steven Birch, Daniel Valdez, Suzan Zelisko, Cara Joyce, François Modave, Ron Price.

Abstract

OBJECTIVE: Natural language processing (NLP) engines such as the clinical Text Analysis and Knowledge Extraction System are a solution for processing notes for research, but optimizing their performance for a clinical data warehouse remains a challenge. We aim to develop a high throughput NLP architecture using the clinical Text Analysis and Knowledge Extraction System and present a predictive model use case.
MATERIALS AND METHODS: The clinical data warehouse (CDW) comprised 1 103 038 patients across 10 years. The architecture was constructed using the Hadoop data repository for source data and 3 large-scale symmetric processing servers for NLP. Each named entity mention in a clinical document was mapped to a Unified Medical Language System (UMLS) concept unique identifier (CUI).
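The mention-to-CUI mapping described above can be illustrated with a minimal sketch. This is not the clinical Text Analysis and Knowledge Extraction System pipeline: it is a plain dictionary lookup over a four-entry toy lexicon, and the names `TOY_LEXICON` and `map_mentions_to_cuis` are hypothetical. It also ignores context such as negation, which a production system would need to handle.

```python
import re

# Toy lexicon: surface form -> UMLS CUI. Illustrative only; a real system
# would load terms from the UMLS Metathesaurus instead of hard-coding them.
TOY_LEXICON = {
    "hypertension": "C0020538",
    "diabetes mellitus": "C0011849",
    "asthma": "C0004096",
    "pneumonia": "C0032285",
}


def map_mentions_to_cuis(text, lexicon=TOY_LEXICON):
    """Return (mention, CUI, start offset) for each lexicon term found in text."""
    hits = []
    lowered = text.lower()
    for term, cui in lexicon.items():
        # Whole-word, case-insensitive match of the lexicon term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((term, cui, match.start()))
    # Report mentions in document order.
    return sorted(hits, key=lambda hit: hit[2])


note = "History of hypertension and diabetes mellitus. No pneumonia on imaging."
for mention, cui, offset in map_mentions_to_cuis(note):
    print(offset, mention, cui)
```

Each document then reduces to a list of CUIs, which is the form a downstream CUI-based model would consume.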
RESULTS: The NLP architecture processed 83 867 802 clinical documents in 13.33 days and produced 37 721 886 606 CUIs across 8 standardized medical vocabularies. Performance of the architecture exceeded 500 000 documents per hour across 30 parallel instances of the clinical Text Analysis and Knowledge Extraction System, including 10 instances dedicated to documents greater than 20 000 bytes. In a use-case example for predicting 30-day hospital readmission, a CUI-based model had similar discrimination to n-grams, with an area under the receiver operating characteristic curve of 0.75 (95% CI, 0.74-0.76).
DISCUSSION AND CONCLUSION: Our health system's high throughput NLP architecture may serve as a benchmark for large-scale clinical research using a CUI-based approach.
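The reported totals imply average rates that can be checked with simple arithmetic. The sketch below uses only the figures stated in the abstract. The function names and the "large"/"standard" pool labels are hypothetical, and the routing rule is one assumed reading of "instances dedicated to documents greater than 20 000 bytes", not the authors' implementation.

```python
# Figures taken from the abstract.
DOCUMENTS = 83_867_802
CUIS = 37_721_886_606
DAYS = 13.33
LARGE_DOC_BYTES = 20_000


def mean_docs_per_hour(documents=DOCUMENTS, days=DAYS):
    """Average throughput over the whole run, in documents per hour."""
    return documents / (days * 24)


def mean_cuis_per_document(cuis=CUIS, documents=DOCUMENTS):
    """Average number of CUIs produced per document."""
    return cuis / documents


def choose_pool(document_bytes, threshold=LARGE_DOC_BYTES):
    """Hypothetical routing rule: documents above the threshold go to the 'large' pool."""
    return "large" if document_bytes > threshold else "standard"


print(round(mean_docs_per_hour()))          # average documents per hour
print(round(mean_cuis_per_document(), 1))   # average CUIs per document
print(choose_pool(len("short note".encode("utf-8"))))
```

The run-wide average comes to roughly 262 000 documents per hour and about 450 CUIs per document. That average is below the "exceeded 500 000 documents per hour" figure, and the abstract does not say whether that figure is a peak rate or covers only part of the run.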
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Keywords:  clinical text and knowledge extraction system; data architecture; natural language processing; unified medical language system; unstructured data

Year:  2019        PMID: 31145455      PMCID: PMC7647210          DOI: 10.1093/jamia/ocz068

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


