
Construction accident narrative classification: An evaluation of text mining techniques.

Yang Miang Goh, C U Ubeynarayana.

Abstract

Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting is encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives can be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, namely support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and with non-linear SVM was also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score ranged from 0.45 to 0.92. The reasons for misclassification are discussed and suggestions on ways to improve the performance are provided.
Copyright © 2017 Elsevier Ltd. All rights reserved.
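
The per-label precision, recall and F1 score reported in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `per_label_scores`, the three label names and the toy predictions are hypothetical and chosen only to show how the three metrics are computed for a single-label classifier.

```python
from collections import Counter


def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted label was assigned wrongly
            fn[truth] += 1  # the true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical toy data, not taken from the OSHA narratives.
y_true = ["falls", "falls", "falls", "struck", "struck", "electrocution"]
y_pred = ["falls", "falls", "struck", "struck", "falls", "electrocution"]

scores = per_label_scores(y_true, y_pred)
for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the metrics per label, as the abstract does, shows which accident types a classifier handles poorly, which a single overall accuracy figure would hide.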

Keywords:  Accident classification; Construction safety; Data mining; Support vector machine; Text mining

Year:  2017        PMID: 28865927     DOI: 10.1016/j.aap.2017.08.026

Source DB:  PubMed          Journal:  Accid Anal Prev        ISSN: 0001-4575


  4 in total

1.  Application of a Machine Learning-Based Decision Support Tool to Improve an Injury Surveillance System Workflow.

Authors:  Jesani Catchpoole; Gaurav Nanda; Kirsten Vallmuur; Goshad Nand; Mark Lehto
Journal:  Appl Clin Inform       Date:  2022-05-29       Impact factor: 2.762

2.  Identifying and Evaluating the Essential Factors Affecting the Incidence of Site Accidents Caused by Human Errors in Industrial Parks Construction Projects.

Authors:  Adel Rafieyan; Hadi Sarvari; Daniel W M Chan
Journal:  Int J Environ Res Public Health       Date:  2022-08-17       Impact factor: 4.614

3.  Predicting occupational injury causal factors using text-based analytics: A systematic review.

Authors:  Mohamed Zul Fadhli Khairuddin; Khairunnisa Hasikin; Nasrul Anuar Abd Razak; Khin Wee Lai; Mohd Zamri Osman; Muhammet Fatih Aslan; Kadir Sabanci; Muhammad Mokhzaini Azizan; Suresh Chandra Satapathy; Xiang Wu
Journal:  Front Public Health       Date:  2022-09-15

4.  Exploring Fatalities and Injuries in Construction by Considering Thermal Comfort Using Uncertainty and Relative Importance Analysis.

Authors:  Minsu Lee; Jaemin Jeong; Jaewook Jeong; Jaehyun Lee
Journal:  Int J Environ Res Public Health       Date:  2021-05-23       Impact factor: 3.390

