Development and application of a high throughput natural language processing architecture to convert all clinical documents in a clinical data warehouse into standardized medical vocabularies.

Majid Afshar^1,2, Dmitriy Dligach^1,2,3, Brihat Sharma^3, Xiaoyuan Cai^4, Jason Boyda^4, Steven Birch^4, Daniel Valdez^4, Suzan Zelisko^4, Cara Joyce^1,2, François Modave^1,2, Ron Price^1,4

Abstract

OBJECTIVE: Natural language processing (NLP) engines such as the clinical Text Analysis and Knowledge Extraction System are a solution for processing notes for research, but optimizing their performance for a clinical data warehouse remains a challenge. We aim to develop a high throughput NLP architecture using the clinical Text Analysis and Knowledge Extraction System and present a predictive model use case.
MATERIALS AND METHODS: The clinical data warehouse (CDW) comprised 1 103 038 patients across 10 years. The architecture was constructed using the Hadoop data repository for source data and 3 large-scale symmetric processing servers for NLP. Each named entity mention in a clinical document was mapped to a Unified Medical Language System concept unique identifier (CUI).
RESULTS: The NLP architecture processed 83 867 802 clinical documents in 13.33 days and produced 37 721 886 606 CUIs across 8 standardized medical vocabularies. Performance of the architecture exceeded 500 000 documents per hour across 30 parallel instances of the clinical Text Analysis and Knowledge Extraction System, including 10 instances dedicated to documents greater than 20 000 bytes. In a use-case example for predicting 30-day hospital readmission, a CUI-based model had similar discrimination to n-grams, with an area under the receiver operating characteristic curve of 0.75 (95% CI, 0.74-0.76).
DISCUSSION AND CONCLUSION: Our health system's high throughput NLP architecture may serve as a benchmark for large-scale clinical research using a CUI-based approach.
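
The totals reported in the RESULTS can be turned into average rates with a few lines of arithmetic. The following Python sketch is illustrative only and is not taken from the paper: it assumes the 13.33 days is continuous wall-clock time and, for the per-instance figure, that work was spread evenly across the 30 instances (the abstract states neither). The function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
DOCUMENTS = 83_867_802      # clinical documents processed
DAYS = 13.33                # total processing time in days
CUIS = 37_721_886_606       # CUIs produced
INSTANCES = 30              # parallel NLP instances

def docs_per_hour(documents: int, days: float) -> float:
    """Average throughput, assuming the run was continuous (an assumption)."""
    return documents / (days * 24)

def cuis_per_document(cuis: int, documents: int) -> float:
    """Mean number of CUIs produced per document."""
    return cuis / documents

avg_rate = docs_per_hour(DOCUMENTS, DAYS)
# Even split across instances is an assumption; 10 of the 30 instances
# were dedicated to large documents, so real per-instance rates differ.
per_instance_rate = avg_rate / INSTANCES
mean_cuis = cuis_per_document(CUIS, DOCUMENTS)

print(f"average documents per hour: {avg_rate:,.0f}")
print(f"average per instance per hour: {per_instance_rate:,.0f}")
print(f"mean CUIs per document: {mean_cuis:,.1f}")
```

Under these assumptions the whole-run average is about 262 000 documents per hour and about 450 CUIs per document. The average sits below the reported rate of more than 500 000 documents per hour, which suggests that the higher figure describes sustained or peak performance rather than the mean over the full 13.33 days.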

Entities: Chemical Species

Keywords: clinical text and knowledge extraction system; data architecture; natural language processing; unified medical language system; unstructured data

Year: 2019 PMID: 31145455 PMCID: PMC7647210 DOI: 10.1093/jamia/ocz068

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497

25 in total

1. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications.

Authors: Guergana K Savova; James J Masanz; Philip V Ogren; Jiaping Zheng; Sunghwan Sohn; Karin C Kipper-Schuler; Christopher G Chute
Journal: J Am Med Inform Assoc Date: 2010 Sep-Oct Impact factor: 4.497

2. Scaling-up NLP Pipelines to Process Large Corpora of Clinical Notes.

Authors: G Divita; M Carter; A Redd; Q Zeng; K Gupta; B Trautner; M Samore; A Gundlapalli
Journal: Methods Inf Med Date: 2015-11-04 Impact factor: 2.176

3. Detecting Opioid-Related Aberrant Behavior using Natural Language Processing.

Authors: Jesse M Lingeman; Priscilla Wang; William Becker; Hong Yu
Journal: AMIA Annu Symp Proc Date: 2018-04-16

4. Casemix adjustment of managed care claims data using the clinical classification for health policy research method.

Authors: M E Cowen; D J Dusseau; B G Toth; C Guisinger; M W Zodet; Y Shyr
Journal: Med Care Date: 1998-07 Impact factor: 2.983

Review 5. Capturing the Patient's Perspective: a Review of Advances in Natural Language Processing of Health-Related Text.

Authors: G Gonzalez-Hernandez; A Sarker; K O'Connor; G Savova
Journal: Yearb Med Inform Date: 2017-09-11

6. Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources.

Authors: Sheng Yu; Katherine P Liao; Stanley Y Shaw; Vivian S Gainer; Susanne E Churchill; Peter Szolovits; Shawn N Murphy; Isaac S Kohane; Tianxi Cai
Journal: J Am Med Inform Assoc Date: 2015-04-29 Impact factor: 4.497

7. Automated feature selection of predictors in electronic medical records data.

Authors: Jessica Gronsbell; Jessica Minnier; Sheng Yu; Katherine Liao; Tianxi Cai
Journal: Biometrics Date: 2019-04-02 Impact factor: 2.571

8. Using natural language processing to identify problem usage of prescription opioids.

Authors: David S Carrell; David Cronkite; Roy E Palmer; Kathleen Saunders; David E Gross; Elizabeth T Masters; Timothy R Hylan; Michael Von Korff
Journal: Int J Med Inform Date: 2015-09-25 Impact factor: 4.046

9. Surrogate-assisted feature extraction for high-throughput phenotyping.

Authors: Sheng Yu; Abhishek Chakrabortty; Katherine P Liao; Tianrun Cai; Ashwin N Ananthakrishnan; Vivian S Gainer; Susanne E Churchill; Peter Szolovits; Shawn N Murphy; Isaac S Kohane; Tianxi Cai
Journal: J Am Med Inform Assoc Date: 2017-04-01 Impact factor: 4.497

Review 10. Data Processing and Text Mining Technologies on Electronic Medical Records: A Review.

Authors: Wencheng Sun; Zhiping Cai; Yangyang Li; Fang Liu; Shengqun Fang; Guoyan Wang
Journal: J Healthc Eng Date: 2018-04-08 Impact factor: 2.682

8 in total

1. Validation of an alcohol misuse classifier in hospitalized patients.

Authors: Daniel To; Brihat Sharma; Niranjan Karnik; Cara Joyce; Dmitriy Dligach; Majid Afshar
Journal: Alcohol Date: 2019-09-28 Impact factor: 2.405

2. Prediction of severe chest injury using natural language processing from the electronic health record.

Authors: Sujay Kulshrestha; Dmitriy Dligach; Cara Joyce; Marshall S Baker; Richard Gonzalez; Ann P O'Rourke; Joshua M Glazer; Anne Stey; Jacqueline M Kruser; Matthew M Churpek; Majid Afshar
Journal: Injury Date: 2020-10-25 Impact factor: 2.586

3. Translational NLP: A New Paradigm and General Principles for Natural Language Processing Research.

Authors: Denis Newman-Griffis; Jill Fain Lehman; Carolyn Rosé; Harry Hochheiser
Journal: Proc Conf Date: 2021-06

Review 4. Desiderata for delivering NLP to accelerate healthcare AI advancement and a Mayo Clinic NLP-as-a-service implementation.

Authors: Andrew Wen; Sunyang Fu; Sungrim Moon; Mohamed El Wazir; Andrew Rosenbaum; Vinod C Kaggal; Sijia Liu; Sunghwan Sohn; Hongfang Liu; Jungwei Fan
Journal: NPJ Digit Med Date: 2019-12-17

5. External validation of an opioid misuse machine learning classifier in hospitalized adult patients.

Authors: Majid Afshar; Brihat Sharma; Sameer Bhalla; Hale M Thompson; Dmitriy Dligach; Randy A Boley; Ekta Kishen; Alan Simmons; Kathryn Perticone; Niranjan S Karnik
Journal: Addict Sci Clin Pract Date: 2021-03-17

Review 6. Epidemiological challenges in pandemic coronavirus disease (COVID-19): Role of artificial intelligence.

Authors: Abhijit Dasgupta; Abhisek Bakshi; Srijani Mukherjee; Kuntal Das; Soumyajeet Talukdar; Pratyayee Chatterjee; Sagnik Mondal; Puspita Das; Subhrojit Ghosh; Archisman Som; Pritha Roy; Rima Kundu; Akash Sarkar; Arnab Biswas; Karnelia Paul; Sujit Basak; Krishnendu Manna; Chinmay Saha; Satinath Mukhopadhyay; Nitai P Bhattacharyya; Rajat K De
Journal: Wiley Interdiscip Rev Data Min Knowl Discov Date: 2022-06-28

7. Publicly available machine learning models for identifying opioid misuse from the clinical notes of hospitalized patients.

Authors: Brihat Sharma; Dmitriy Dligach; Kristin Swope; Elizabeth Salisbury-Afshar; Niranjan S Karnik; Cara Joyce; Majid Afshar
Journal: BMC Med Inform Decis Mak Date: 2020-04-29 Impact factor: 3.298

8. Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies.

Authors: Martijn G Kersloot; Florentien J P van Putten; Ameen Abu-Hanna; Ronald Cornet; Derk L Arts
Journal: J Biomed Semantics Date: 2020-11-16