Kirsten Vallmuur. Centre for Accident Research and Road Safety - Queensland, School of Psychology and Counselling, Faculty of Health, Queensland University of Technology, Kelvin Grove 4059, Brisbane, Queensland, Australia. Electronic address: k.vallmuur@qut.edu.au.
Abstract
OBJECTIVE: To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data.
DESIGN: Systematic review.
DATA SOURCES: The electronic databases searched included PubMed, CINAHL, Medline, Google Scholar, and ProQuest. The bibliographies of all relevant articles were examined, and associated articles were identified using a snowballing technique.
SELECTION CRITERIA: For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, and (c) used machine learning approaches to analyse textual data.
METHODS: The papers identified through the search were screened, resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strengths and limitations of different techniques, and the quality assurance approaches used. Due to heterogeneity between studies, meta-analysis was not performed.
RESULTS: Occupational injuries were the focus of half of the machine learning studies, and the most common methods described were Bayesian probability or Bayesian network-based methods used either to predict injury categories or to extract common injury scenarios. Models were evaluated through comparison with gold standard data, content expert evaluation, or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, and to be valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalisability, source data quality, complexity of models, and integration of content and technical knowledge were discussed.
CONCLUSIONS: The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. With advances in data mining techniques, increased capacity for analysis of large databases, and involvement of computer scientists in the injury prevention field, along with more comprehensive use and description of quality assurance methods in text mining approaches, it is likely that we will see continued growth and advancement in knowledge of text mining in the injury field.
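As an illustration of the kind of Bayesian approach the review describes, the following is a minimal sketch of a multinomial naive Bayes classifier that assigns an injury category to a narrative and is then checked against gold standard labels using precision. It is not the method of any reviewed study. The narratives, the category names (`fall`, `burn`), and the helper names (`tokenize`, `train`, `predict`, `precision`) are all hypothetical and chosen only for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a narrative and split it on whitespace (deliberately naive)."""
    return text.lower().split()


def train(narratives, labels, alpha=1.0):
    """Count words per category; alpha is the Laplace smoothing constant."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(narratives, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
    }


def predict(model, text):
    """Return the category with the highest posterior log-probability."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        # Start from the log prior of the category.
        score = math.log(n_docs / total_docs)
        total_words = sum(model["word_counts"][label].values())
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][token]
            # Add the smoothed log-likelihood of the word given the category.
            score += math.log((count + alpha) / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision(predicted, gold, category):
    """Share of cases predicted as `category` that the gold standard agrees with."""
    true_pos = sum(1 for p, g in zip(predicted, gold) if p == category and g == category)
    pred_pos = sum(1 for p in predicted if p == category)
    return true_pos / pred_pos if pred_pos else 0.0


# Hypothetical toy narratives, invented purely for illustration.
train_texts = [
    "worker fell from ladder while painting",
    "slipped on wet floor and fell",
    "hand burned by hot oil in kitchen",
    "burned arm on steam pipe",
]
train_labels = ["fall", "fall", "burn", "burn"]
model = train(train_texts, train_labels)

# Compare model output with (hypothetical) gold standard codes.
test_texts = ["fell off ladder", "burned by hot steam"]
gold_labels = ["fall", "burn"]
predicted_labels = [predict(model, t) for t in test_texts]
print(predicted_labels, precision(predicted_labels, gold_labels, "fall"))
```

A real surveillance dataset would need far more training narratives, proper text normalisation, and evaluation on held-out cases. The abstract's point that performance is best with a small number of categories is one reason to start with coarse groupings like these.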
Authors: Alysha R Meyers; Ibraheem S Al-Tarawneh; Steven J Wurzelbacher; P Timothy Bushnell; Michael P Lampl; Jennifer L Bell; Stephen J Bertke; David C Robins; Chih-Yu Tseng; Chia Wei; Jill A Raudabaugh; Teresa M Schnorr Journal: J Occup Environ Med Date: 2018-01 Impact factor: 2.162
Authors: Kirsten Vallmuur; Helen R Marucci-Wellman; Jennifer A Taylor; Mark Lehto; Helen L Corns; Gordon S Smith Journal: Inj Prev Date: 2016-01-04 Impact factor: 2.399
Authors: Albeliz Santiago-Colón; Carissa M Rocheleau; Stephen Bertke; Annette Christianson; Devon T Collins; Emma Trester-Wilson; Wayne Sanderson; Martha A Waters; Jennita Reefhuis Journal: Ann Work Expo Health Date: 2021-07-03 Impact factor: 2.179
Authors: Frederico M Bublitz; Arlene Oetomo; Kirti S Sahu; Amethyst Kuang; Laura X Fadrique; Pedro E Velmovitsky; Raphael M Nobrega; Plinio P Morita Journal: Int J Environ Res Public Health Date: 2019-10-11 Impact factor: 3.390
Authors: Christopher Thomas Picard; Manal Kleib; Hannah M O'Rourke; Colleen M Norris; Matthew J Douma Journal: BMJ Open Date: 2022-04-13 Impact factor: 3.006
Authors: Mohamed Zul Fadhli Khairuddin; Khairunnisa Hasikin; Nasrul Anuar Abd Razak; Khin Wee Lai; Mohd Zamri Osman; Muhammet Fatih Aslan; Kadir Sabanci; Muhammad Mokhzaini Azizan; Suresh Chandra Satapathy; Xiang Wu Journal: Front Public Health Date: 2022-09-15