Brett R South, Shuying Shen, Wendy W Chapman, Sylvain Delisle, Matthew H Samore, Adi V Gundlapalli.
Abstract
Text classifiers have been used in biosurveillance to identify patients with diseases or conditions of interest. When compared to a clinical reference standard of 280 cases of Acute Respiratory Infection (ARI), a text classifier consisting of simple rules and NegEx plus string matching for specific concepts of interest produced 569 (4%) false positive (FP) cases. Using instance-level manual annotation, we estimate the prevalence of contextual attributes and error types leading to FP cases. Errors were due to (1) deletion errors from abbreviations, spelling mistakes, and missing synonyms (57%); (2) insertion errors from templated document structures such as check boxes and lists of signs and symptoms (36%); and (3) substitution errors from irrelevant concepts and alternate meanings for the same word (6%). We demonstrate that specific concept attributes contribute to false positive cases. These results will inform modifications and adaptations to improve text classifier performance.
Year: 2010 PMID: 21347150 PMCID: PMC3041533
Source DB: PubMed Journal: Summit Transl Bioinform ISSN: 2153-6430
Concepts related to Acute Respiratory Infection
| Semantic concept | Number of synonyms, term variants, abbreviations |
|---|---|
| Cough | 13 |
| Fever | 39 |
| Chills | 14 |
| Night sweats | 12 |
| Pleuritic chest pain | 14 |
| Myalgia | 29 |
| Sore throat | 35 |
| Headache | 30 |
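The table above gives only the number of term variants per concept, and the abstract describes the classifier as simple rules and NegEx plus string matching. As a rough illustration of that general idea, and not the authors' implementation, the sketch below matches a few concept terms and flags a mention as negated when a trigger word appears shortly before it. The term lists, the trigger list, the `WINDOW` size, and the `find_concepts` function are all assumptions made for this example; the negation rule is a simplified NegEx-style check, not the NegEx algorithm itself.

```python
import re

# Hypothetical term variants; the source reports only how many variants
# each concept has, not the variants themselves.
CONCEPT_TERMS = {
    "Cough": ["cough", "coughing"],
    "Fever": ["fever", "febrile", "pyrexia"],
    "Sore throat": ["sore throat", "pharyngitis"],
}

# A small, illustrative subset of negation triggers that precede a concept.
NEGATION_TRIGGERS = ["no", "denies", "without", "negative for"]

WINDOW = 5  # number of tokens scanned before a concept (an assumed value)


def find_concepts(text):
    """Return (concept, status) pairs; status is 'affirmed' or 'negated'."""
    results = []
    # Crude sentence split so a trigger cannot reach across sentences.
    for sentence in re.split(r"[.;\n]+", text.lower()):
        for concept, terms in CONCEPT_TERMS.items():
            for term in terms:
                match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
                if match is None:
                    continue
                # Look only at the last WINDOW tokens before the match.
                preceding = sentence[:match.start()].split()[-WINDOW:]
                context = " " + " ".join(preceding) + " "
                negated = any(
                    " " + trigger + " " in context
                    for trigger in NEGATION_TRIGGERS
                )
                results.append((concept, "negated" if negated else "affirmed"))
                break  # record at most one hit per concept per sentence
    return results
```

On a line such as "Check all that apply: fever cough sore throat", this sketch marks all three concepts as affirmed, which is the kind of templated-text insertion error the abstract describes.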
Figure 1. Concepts identified by the text classifier and manual annotation. IAA = Inter-Annotator Agreement
Concepts identified by manual annotation. IAA = Inter-Annotator Agreement
| Attribute (IAA) | Value | Concepts, n (%) |
|---|---|---|
| Negation (93%) | affirmed | 884 (60%) |
| | hypothetical | 157 (11%) |
| | negated | 427 (29%) |
| Duration (92%) | <= 7 days | 149 (10%) |
| | > 7 days | 112 (8%) |
| | unknown | 1207 (82%) |
| Templating (93%) | Signs and symptoms | 405 (28%) |
| | Instructions | 94 (6%) |
| | Free text only | 968 (66%) |
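The percentages in the table above can be recomputed from the counts as each value's share of its attribute's total. The short sketch below does this; the `counts` dictionary simply transcribes the table, and the `percentages` helper is a name introduced for this example.

```python
# Counts transcribed from the manual-annotation table.
counts = {
    "Negation": {"affirmed": 884, "hypothetical": 157, "negated": 427},
    "Duration": {"<=7 days": 149, "> 7 days": 112, "unknown": 1207},
    "Templating": {
        "Signs and symptoms": 405,
        "Instructions": 94,
        "Free text only": 968,
    },
}


def percentages(values):
    """Return each value's share of the attribute total, rounded to a whole percent."""
    total = sum(values.values())
    return {name: round(100 * n / total) for name, n in values.items()}


for attribute, values in counts.items():
    print(attribute, sum(values.values()), percentages(values))
```

The rounded shares reproduce the reported percentages. The Negation and Duration rows each sum to 1,468 concepts, while the Templating rows sum to 1,467; the source does not explain the difference.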
Figure 2. Example of templated pick list
Figure 3. Example of an out of context concept