Gian Maria Zaccaria, Vito Colella, Simona Colucci, Felice Clemente, Fabio Pavone, Maria Carmela Vegliante, Flavia Esposito, Giuseppina Opinto, Anna Scattone, Giacomo Loseto, Carla Minoia, Bernardo Rossini, Angela Maria Quinto, Vito Angiulli, Luigi Alfredo Grieco, Angelo Fama, Simone Ferrero, Riccardo Moia, Alice Di Rocco, Francesca Maria Quaglia, Valentina Tabanelli, Attilio Guarini, Sabino Ciavarella.
Abstract
The unstructured nature of real-world (RW) data from onco-hematological patients and limited access to integrated systems restrict the use of RW information for research purposes. Natural Language Processing (NLP) might help transform unstructured reports into standardized electronic health records. We used NLP to develop an automated tool, ARGO (Automatic Record Generator for Onco-hematology), that recognizes information in pathology reports and populates electronic case report forms (eCRFs) pre-implemented in REDCap. ARGO was applied to hemo-lymphopathology reports of diffuse large B-cell, follicular, and mantle cell lymphomas, and assessed for accuracy (A), precision (P), recall (R) and F1-score (F) on internal (n = 239) and external (n = 93) report series. In total, 326 (98.2%) reports were converted into corresponding eCRFs. Overall, ARGO showed high performance in capturing (1) identification report number (all metrics > 90%), (2) biopsy date (all metrics > 90% in both series), (3) specimen type (86.6% and 91.4% of A, 98.5% and 100.0% of P, 92.5% and 95.5% of F, and 87.2% and 91.4% of R for internal and external series, respectively), and (4) diagnosis (100% of P with A, R and F of 90% in both series). We developed and validated a generalizable tool that generates structured eCRFs from real-life pathology reports.
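For readers unfamiliar with the four metrics reported above, the following is a minimal sketch of how A, P, R and F are conventionally computed from counts of correct and incorrect extractions for a single field. The abstract does not state how ARGO's outputs were classified as true or false positives and negatives, so the `Counts` class, the `field_metrics` function and the example counts are hypothetical illustrations, not the authors' evaluation code or data.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for one extracted field (hypothetical structure)."""
    tp: int  # value extracted and correct
    fp: int  # value extracted but wrong
    fn: int  # value present in the report but not extracted
    tn: int  # value absent and correctly left empty


def field_metrics(c: Counts) -> dict:
    """Return accuracy, precision, recall and F1 using the standard definitions."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"A": accuracy, "P": precision, "R": recall, "F": f1}


# Made-up counts for illustration only (not taken from the study).
example = field_metrics(Counts(tp=8, fp=0, fn=2, tn=0))
print(example)
```

With no false positives, precision is 1.0 while accuracy and recall are both limited by the missed values. A field with 100% precision and lower recall shows this same pattern.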
Year: 2021 PMID: 34893665 PMCID: PMC8664934 DOI: 10.1038/s41598-021-03204-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379