Boosting ICD multi-label classification of health records with contextual embeddings and label-granularity.

Alberto Blanco¹, Olatz Perez-de-Viñaspre², Alicia Pérez², Arantza Casillas².

Abstract

BACKGROUND AND OBJECTIVE: This work deals with clinical text mining, a field of Natural Language Processing applied to biomedical informatics. The aim is to classify Electronic Health Records with respect to the International Classification of Diseases, which is the foundation for identifying international health statistics and the standard for reporting diseases and health conditions. In data mining terms, this is a multi-label classification task, since each health record is assigned multiple International Classification of Diseases codes. We investigate five Deep Learning architectures on a dataset obtained from the Basque Country Health System, under six different perspectives derived from shifts in the input and the output.
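
To make the multi-label setting concrete, the following minimal sketch encodes each record's set of codes as a multi-hot target vector, a common target format for multi-label classifiers. The records and codes are invented for illustration and are not taken from the Basque Country Health System dataset; `build_label_index` and `to_multi_hot` are hypothetical helper names, not part of the authors' method.

```python
from typing import Dict, List, Sequence


def build_label_index(code_sets: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map every distinct code to a column index (sorted for reproducibility)."""
    labels = sorted({code for codes in code_sets for code in codes})
    return {code: i for i, code in enumerate(labels)}


def to_multi_hot(codes: Sequence[str], index: Dict[str, int]) -> List[int]:
    """Return a 0/1 vector with a 1 for each code assigned to the record."""
    vector = [0] * len(index)
    for code in codes:
        vector[index[code]] = 1
    return vector


# Invented example (assumption): three records, each with one or more codes.
records = [
    ["I10", "E11.9"],
    ["J18.9"],
    ["I10", "J18.9", "N39.0"],
]
label_index = build_label_index(records)
targets = [to_multi_hot(codes, label_index) for codes in records]
```

Each row of `targets` has one column per distinct code, so the width of the target vector grows with the size of the label set.
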
METHODS: We evaluate a Feed Forward Neural Network as the baseline and several recurrent models based on the Bidirectional GRU architecture. Our research focuses on the text representation layer, where we test three variants: standard word embeddings, meta word embedding techniques, and contextual embeddings.
RESULTS: The recurrent models outperformed the non-recurrent model. The meta word embedding techniques were able to beat the standard word embeddings, but the contextual embeddings proved the most robust for the downstream task overall. Additionally, label-granularity alone has an impact on classification performance.
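
As an illustration of what a change in label-granularity can look like, the sketch below collapses fine-grained codes to a coarser category and compares the resulting label-set sizes. It assumes ICD-10-style codes in which the category is the part before the dot; the abstract does not state the ICD version or the granularity levels used, so this is an assumption, and the records and the helper names `to_category` and `label_set` are invented for illustration.

```python
from typing import List, Sequence, Set


def to_category(code: str) -> str:
    """Keep only the part before the dot (assumed to be the coarse category)."""
    return code.split(".", 1)[0]


def label_set(records: Sequence[Sequence[str]]) -> Set[str]:
    """Collect the distinct labels used across all records."""
    return {code for codes in records for code in codes}


# Invented example (assumption): fine-grained codes per record.
fine_records: List[List[str]] = [
    ["E11.9", "I10"],
    ["E11.65", "J18.9"],
    ["E11.9", "J18.1"],
]

# Coarsen each record, dropping duplicates created by the truncation.
coarse_records = [sorted({to_category(c) for c in codes}) for codes in fine_records]

fine_labels = label_set(fine_records)
coarse_labels = label_set(coarse_records)
print(len(fine_labels), len(coarse_labels))
```

In this toy data the coarser level has fewer distinct labels, and each remaining label covers more records, which is one way granularity can interact with label-set size and sparseness.
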
CONCLUSIONS: The contributions of this work are a) a comparison among five classification approaches based on Deep Learning on a Spanish dataset to cope with the multi-label health text classification problem; b) the study of the impact of document length and label-set size and granularity in the multi-label context; and c) the study of measures to mitigate multi-label text classification problems related to label-set size and sparseness.

Entities: Disease

Keywords: Contextual embeddings; Electronic health record; International classification of diseases; Label-granularity; Multi-label classification; Recurrent neural networks

Year: 2019 PMID: 31851906 DOI: 10.1016/j.cmpb.2019.105264

Source DB: PubMed Journal: Comput Methods Programs Biomed ISSN: 0169-2607 Impact factor: 5.428

Cited

4 in total

1. Automatic International Classification of Diseases Coding System: Deep Contextualized Language Model With Rule-Based Approaches.

Authors: Pei-Fu Chen; Kuan-Chih Chen; Wei-Chih Liao; Feipei Lai; Tai-Liang He; Sheng-Che Lin; Wei-Jen Chen; Chi-Yu Yang; Yu-Cheng Lin; I-Chang Tsai; Chi-Hao Chiu; Shu-Chih Chang; Fang-Ming Hung
Journal: JMIR Med Inform Date: 2022-06-29

2. Automatic Identification of Patients With Unexplained Left Ventricular Hypertrophy in Electronic Health Record Data to Improve Targeted Treatment and Family Screening.

Authors: Arjan Sammani; Mark Jansen; Nynke M de Vries; Nicolaas de Jonge; Annette F Baas; Anneline S J M Te Riele; Folkert W Asselbergs; Marish I F J Oerlemans
Journal: Front Cardiovasc Med Date: 2022-04-15

3. AI-based language models powering drug discovery and development. (Review)

Authors: Zhichao Liu; Ruth A Roberts; Madhu Lal-Nag; Xi Chen; Ruili Huang; Weida Tong
Journal: Drug Discov Today Date: 2021-06-30 Impact factor: 7.851

4. Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports.

Authors: Ayoub Bagheri; T Katrien J Groenhof; Folkert W Asselbergs; Saskia Haitjema; Michiel L Bots; Wouter B Veldhuis; Pim A de Jong; Daniel L Oberski
Journal: J Healthc Eng Date: 2021-07-09 Impact factor: 2.682