Kirsten McKenzie, Margaret A Campbell, Deborah A Scott, Tim R Driscoll, James E Harrison, Roderick J McClure.
Abstract
BACKGROUND: Work-related injuries in Australia are estimated to cost around $57.5 billion annually; however, there are currently insufficient surveillance data available to support an evidence-based public health response. Emergency departments (EDs) in Australia are a potential source of information on work-related injuries, though most EDs do not have an 'Activity Code' to identify work-related cases; information about the presenting problem is instead recorded in a short free text field. This study compared methods for interrogating text fields to identify work-related injuries presenting at emergency departments, in order to inform approaches to surveillance of work-related injury.
Year: 2010 PMID: 20374657 PMCID: PMC3161343 DOI: 10.1186/1472-6947-10-19
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Activity codes by text search methods
| Term search method | Work Activity code, n | Work Activity code, % | Other Activity code, n | Other Activity code, % | Total |
|---|---|---|---|---|---|
| Keyword search | | | | | |
| 'Work' in text string | 12,457 | 58.16 | 1,916 | 1.03 | 14,373 |
| 'Work' not in text string | 8,962 | 41.84 | 184,946 | 98.97 | 193,908 |
| Index term search | | | | | |
| Work index term in string | 13,252 | 61.87 | 23,434 | 12.54 | 36,686 |
| Index term not in string | 8,167 | 38.13 | 163,428 | 87.46 | 171,595 |
| Keyword OR index term search | | | | | |
| Index or keyword in string | 17,004 | 79.4 | 24,416 | 13.1 | 41,420 |
| No index or keyword in string | 4,415 | 20.6 | 162,446 | 86.9 | 166,861 |
| Total | 21,419 | 100 | 186,862 | 100 | 208,281 |
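
The three text search strategies in this table can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the index terms listed are hypothetical placeholders (the actual index term list is not given in this section), and case-insensitive substring matching is an assumption.

```python
# HYPOTHETICAL index terms, for illustration only. The study's actual
# index term list is not reproduced in this section.
INDEX_TERMS = ("employer", "factory", "construction site")


def keyword_hit(text: str) -> bool:
    """Return True if 'work' appears anywhere in the free text.

    Case-insensitive substring matching is an assumption.
    """
    return "work" in text.lower()


def index_hit(text: str) -> bool:
    """Return True if any (hypothetical) index term appears in the free text."""
    lowered = text.lower()
    return any(term in lowered for term in INDEX_TERMS)


def keyword_or_index_hit(text: str) -> bool:
    """Combined strategy: flag the record if either search matches."""
    return keyword_hit(text) or index_hit(text)
```

Plain substring matching of this kind would also flag strings such as "fireworks", which is one way a text search can produce false positives; a real implementation would need to decide how to handle such cases.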
Activity codes by content analytic text mining approaches
| Content analytic text mining method | Work Activity code, n | Work Activity code, % | Other Activity code, n | Other Activity code, % | Total |
|---|---|---|---|---|---|
| Binary classification | | | | | |
| 'Work SPV Tag' | 17,299 | 80.8 | 13,894 | 7.4 | 31,193 |
| Not 'Work SPV Tag' | 4,120 | 19.2 | 172,968 | 92.6 | 177,088 |
| Adjusted probability classification | | | | | |
| Classified as 'Work Activity' | 16,424 | 76.7 | 8,699 | 4.7 | 25,123 |
| Classified as 'Other Activity' | 4,995 | 23.3 | 178,163 | 95.3 | 183,158 |
| Total | 21,419 | 100 | 186,862 | 100 | 208,281 |
Summary of case identification results using each method
| Approach | Number of true cases identified | Number of false positives | Sensitivity | Specificity | PPV |
|---|---|---|---|---|---|
| Text Search Approaches | | | | | |
| Basic keyword search | 12,457 | 1,916 | 0.58 | 0.99 | 0.87 |
| Index search | 13,252 | 23,434 | 0.62 | 0.87 | 0.36 |
| Keyword OR index | 17,004 | 24,416 | 0.79 | 0.87 | 0.41 |
| Content Analytic Approaches | | | | | |
| Binary classification | 17,299 | 13,894 | 0.81 | 0.93 | 0.55 |
| Adjusted probability classification | 16,424 | 8,699 | 0.77 | 0.95 | 0.65 |
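
The sensitivity, specificity and PPV figures in this summary can be reproduced from the cross-tabulated counts, treating the 'Work Activity' code as the reference standard. The sketch below is illustrative only: the `Counts` class, the function names and the `METHODS` dictionary are assumptions introduced here, while the counts themselves are taken from the tables in this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int  # flagged by the method, Work Activity code
    fp: int  # flagged by the method, Other Activity code
    fn: int  # not flagged, Work Activity code
    tn: int  # not flagged, Other Activity code


def sensitivity(c: Counts) -> float:
    """Share of Work Activity coded cases that the method flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Share of Other Activity coded cases that the method leaves unflagged."""
    return c.tn / (c.tn + c.fp)


def ppv(c: Counts) -> float:
    """Share of flagged cases that carry a Work Activity code."""
    return c.tp / (c.tp + c.fp)


# Counts copied from the cross-tabulations in this record.
METHODS = {
    "Basic keyword search": Counts(12457, 1916, 8962, 184946),
    "Index search": Counts(13252, 23434, 8167, 163428),
    "Keyword OR index": Counts(17004, 24416, 4415, 162446),
    "Binary classification": Counts(17299, 13894, 4120, 172968),
    "Adjusted probability classification": Counts(16424, 8699, 4995, 178163),
}

if __name__ == "__main__":
    for name, c in METHODS.items():
        print(f"{name}: sensitivity={sensitivity(c):.2f} "
              f"specificity={specificity(c):.2f} PPV={ppv(c):.2f}")
```

Rounded to two decimal places, these calculations agree with the values reported in the summary table.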
Figure 1. Identification of cases of work-related injuries using 'Work Activity' code, a text search approach and a content analytic text mining approach.