Cong Liu, Fabricio Sampaio Peres Kury, Ziran Li, Casey Ta, Kai Wang, Chunhua Weng.
Abstract
We present Doc2Hpo, an interactive web application that enables efficient phenotype concept curation from clinical text, with automated concept normalization using the Human Phenotype Ontology (HPO). Users can edit the HPO concepts automatically extracted by Doc2Hpo in real time and export the extracted HPO concepts to gene prioritization tools. Our evaluation showed that Doc2Hpo significantly reduced manual effort while achieving high accuracy in HPO concept curation. Doc2Hpo is freely available at https://impact2.dbmi.columbia.edu/doc2hpo/. The source code is available at https://github.com/stormliucong/doc2hpo for local installation when working with protected health data.
Year: 2019 PMID: 31106327 PMCID: PMC6602487 DOI: 10.1093/nar/gkz386
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Architecture of Doc2Hpo. (B) Interactive user interface. 1. Each automated extraction has three parts: the light green highlight marks the clinical entity identified within the text, the term in the appended dark green box is the standardized HPO concept, and the appended ‘N’ button indicates the negation modifier (red for negation). 2–3. The user can highlight any text and add an HPO concept. The user can also double-click to delete an erroneous extraction. 4–5. The user can change the standardized HPO concept by clicking the dark green box, searching for the desired HPO concept, and updating it in real time. 6. The user can click the ‘N’ button to change the negation modifier.
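The three-part annotation described in the Figure 1 caption (highlighted text span, standardized HPO concept, negation modifier) can be pictured as a small record. The following Python sketch is illustrative only and is not taken from the Doc2Hpo source code: the `Annotation` class, the `annotate` and `export_hpo_ids` helpers, and the example note are all hypothetical, and dropping negated concepts on export is an assumption about what a gene prioritization tool would want as input.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    """Hypothetical record mirroring the three parts shown in Figure 1."""
    start: int        # character offset where the highlighted entity begins
    end: int          # character offset just past the highlighted entity
    text: str         # the highlighted clinical entity
    hpo_id: str       # standardized HPO concept identifier
    hpo_name: str     # standardized HPO concept name
    negated: bool = False  # state of the 'N' button


def annotate(note: str, phrase: str, hpo_id: str, hpo_name: str,
             negated: bool = False) -> Annotation:
    """Build an annotation for the first occurrence of `phrase` in `note`."""
    start = note.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in note")
    return Annotation(start, start + len(phrase), phrase, hpo_id, hpo_name, negated)


def export_hpo_ids(annotations: List[Annotation]) -> List[str]:
    """Return unique HPO IDs of non-negated annotations, in order of appearance.

    Assumption: negated findings are excluded from the exported list.
    """
    exported: List[str] = []
    for ann in annotations:
        if not ann.negated and ann.hpo_id not in exported:
            exported.append(ann.hpo_id)
    return exported


# Made-up example note; the HPO IDs are real terms used for illustration.
note = "Patient has seizures and microcephaly. No hypotonia."
annotations = [
    annotate(note, "seizures", "HP:0001250", "Seizure"),
    annotate(note, "microcephaly", "HP:0000252", "Microcephaly"),
    annotate(note, "hypotonia", "HP:0001252", "Hypotonia", negated=True),
]
exported = export_hpo_ids(annotations)
```

Toggling `negated` on a record corresponds to clicking the ‘N’ button in step 6, and replacing `hpo_id` and `hpo_name` corresponds to the concept change in steps 4–5.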
Figure 2. (A) Box-and-whisker plots of the time taken with Doc2Hpo and with manual curation for three different annotators. (B) The accuracy (precision, recall, and F1 score) of the different approaches for each annotator. The error bars indicate the 95% confidence intervals. ‘Auto’ refers to fully automated HPO curation based on MetaMap without negation detection. ‘Auto (negex)’ refers to fully automated HPO curation based on MetaMap with negation detection. ‘Doc2hpo’ refers to Doc2Hpo-aided user curation. ‘Manual’ refers to user curation without using Doc2Hpo.
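For readers unfamiliar with the accuracy measures named in the Figure 2 caption, the sketch below computes precision, recall, and F1 score for one note by comparing a curated set of HPO IDs with a gold-standard set. It is a minimal illustration under assumptions: exact matching on HPO IDs is assumed, the function name and example sets are hypothetical, and the record does not state how the authors defined a match or computed the confidence intervals.

```python
from typing import Iterable, Tuple


def precision_recall_f1(predicted: Iterable[str],
                        gold: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 using exact HPO ID matches."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: two of three curated concepts match the gold standard.
gold = {"HP:0001250", "HP:0000252", "HP:0001263"}
curated = {"HP:0001250", "HP:0000252", "HP:0001252"}
precision, recall, f1 = precision_recall_f1(curated, gold)
```

In this made-up example, two of the three curated concepts are correct and two of the three gold concepts are found, so precision, recall, and F1 are all 2/3.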