Examining imbalanced classification algorithms in predicting real-time traffic crash risk.

Yichuan Peng, Chongyi Li, Ke Wang, Zhen Gao, Rongjie Yu.

Abstract

Active Traffic Management (ATM) systems are widely used in the United States and Europe to improve traffic safety on urban expressways. Accurate real-time crash risk prediction is fundamental to running such systems well. Because crashes are low-probability events, crash data pose a typical imbalanced data classification problem. Most previous studies improved prediction methods only at the data level or only at the algorithm level, which may be inadequate for predicting crash risk accurately, especially in a continuous real-time traffic data environment. This study examined a comprehensive imbalanced classification approach, with optimizations at the output, data, and algorithm levels, to build a more accurate real-time traffic crash risk prediction model. At the output level, the Youden index method proved best at choosing the threshold that divides the prediction results, and a probability calibration method was proposed to further optimize the prediction results. At the data level, under-sampling and the Synthetic Minority Oversampling Technique (SMOTE) were compared as ways to address the imbalance by changing the data distribution. At the algorithm level, the cost-sensitive multilayer perceptron (MLP) algorithm and the Adaboost algorithm were examined. Finally, the random sampling cost-sensitive MLP (RCSMLP) model and the Rusboost model were constructed by combining the optimization methods from all three levels. The RCSMLP model reached a sensitivity of 78.10% and a specificity of 81.44%. The Rusboost model reached an AUC of 0.892, a sensitivity of 0.842, and a specificity of 0.816, outperforming existing prediction models on the imbalanced traffic crash risk prediction problem. The proposed approach to improving prediction accuracy is general and can be applied to many other models for predicting real-time traffic crash risk.
Copyright © 2020 Elsevier Ltd. All rights reserved.
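
The abstract names two steps that are easy to illustrate on a small scale: choosing a classification threshold with the Youden index (output level) and random under-sampling of the majority class (data level). The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy labels and scores, and the choice of equal class sizes after under-sampling are all hypothetical, and label 1 is assumed to mean "crash".

```python
import random


def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity), treating score >= threshold as 'crash'."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Every distinct score is tried as a candidate threshold.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(y_true, scores, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class rows (label 0) until both classes are equal in size.

    Assumes label 1 is the minority class and that there are at least as many
    label-0 rows as label-1 rows.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in kept], [y[i] for i in kept]


# Hypothetical toy data: 8 non-crash samples, 2 crash samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.60, 0.40, 0.70]

# On this toy data the best threshold is 0.40 with J = 0.875.
threshold, j_stat = youden_threshold(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold)

# Balance the toy data set: 2 crash rows plus 2 randomly kept non-crash rows.
X = list(range(len(y_true)))
X_bal, y_bal = random_undersample(X, y_true, seed=0)
```

With rare positives, a default cutoff of 0.5 tends to favor specificity over sensitivity. The Youden index instead weights the two equally, which is one reason it is a common choice for imbalanced problems.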

Keywords:  Continuous data environment; Imbalanced data classification; RCSMLP; Real-time crash risk prediction models; Rusboost model

Year: 2020; PMID: 32559659; DOI: 10.1016/j.aap.2020.105610

Source DB: PubMed; Journal: Accid Anal Prev; ISSN: 0001-4575


  3 in total

1.  Application of Extremely Randomised Trees for exploring influential factors on variant crash severity data.

Authors:  Farshid Afshar; Seyedehsan Seyedabrishami; Sara Moridpour
Journal: Sci Rep; Date: 2022-07-07; Impact factor: 4.996

2.  Lane-Level Regional Risk Prediction of Mainline at Freeway Diverge Area.

Authors:  Nengchao Lyu; Jiaqiang Wen; Wei Hao
Journal: Int J Environ Res Public Health; Date: 2022-05-11; Impact factor: 4.614

3.  Experimental Study and Comparison of Imbalance Ensemble Classifiers with Dynamic Selection Strategy.

Authors:  Dongxue Zhao; Xin Wang; Yashuang Mu; Lidong Wang
Journal: Entropy (Basel); Date: 2021-06-28; Impact factor: 2.524
