Shengda Luo, Alex Po Leung, Xingzhao Qiu, Jan Y K Chan, Haozhi Huang.
Abstract
To monitor road safety, billions of records can be generated by the Controller Area Network (CAN) bus each day on public transportation. Using artificial intelligence or machine learning techniques for big data analytics to automatically determine whether driving behaviour on public transportation is safe has only recently become possible. Because current methods suffer from high false classification rates, our goal is to build a practical and accurate method for road safety prediction that automatically determines whether driving behaviour on public transportation is safe. Our main contributions are (1) a novel feature extraction method, addressing the lack of informative features in raw CAN bus data; (2) a novel boosting method for driving behaviour classification (safe or unsafe) that combines the advantages of deep learning and shallow learning methods with much improved performance; and (3) the first evaluation of our method on a real-world dataset with accurate labels provided by domain experts in the public transportation industry. The experiments show that the proposed boosting method with our proposed features outperforms seven other popular methods on the real-world dataset by 5.9% and 5.5%.
Keywords: controller area network; deep learning; machine learning; transportation
Year: 2020 PMID: 32825008 PMCID: PMC7506606 DOI: 10.3390/s20174671
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Features in our dataset to evaluate road safety.
| Feature Name | Meaning | Used to Train |
|---|---|---|
| LOGID | Bus identifier | No |
| GPSDATE | Time | No |
| VELOCITY | Instantaneous speed | Yes |
| MILEAGE | GPS mileage | Yes |
| TOTAL | Total mileage | Yes |
| FRONT | Front pressure | Yes |
| REAR | Rear pressure | Yes |
| ENGINESPEED | Engine speed | Yes |
| ENGINETEMP | Engine temperature | Yes |
| CARSWITCH | Switches of the bus | No |
| CARLIGHT | Switches of light | No |
| CANALARM | Switches of alarm | No |
| CREATETIME | Time | No |
| GPS | Instantaneous speed | Yes |
| DRIVERID | Driver identifier | No |
| LONGITUDE | Longitude | Yes |
| LATITUDE | Latitude | Yes |
| DIRECTION | Turn | Yes |
| STATIONID | Station identifier | No |
| ROUTEID | Route identifier | No |
| BUSSTATE | Bus status | No |
| ALARM | Alarm light status | No |
| STATION | Mileage | Yes |
| UPDOWN | Up and down | No |
Descriptive statistics on features used for training models.
| Feature Name | Mean | Std. Dev. | Median |
|---|---|---|---|
| VELOCITY | 147.9 | 159.9 | 90 |
| MILEAGE | 65,220,323.4 | 12,868,396.8 | 77,710,500 |
| TOTAL | 66,584,813.3 | 15,653,632.7 | 81,771,740 |
| FRONT | 843.4 | 71.7 | 852 |
| REAR | 794.4 | 32.7 | 796 |
| ENGINESPEED | 976.5 | 342.6 | 803 |
| ENGINETEMP | 81.6 | 7.1 | 83 |
| GPS | 141.2 | 152.8 | 90 |
| LONGITUDE | 113.5 | 0.0066 | 113.5 |
| LATITUDE | 22.1 | 0.0195 | 22.1 |
| DIRECTION | 197.3 | 104.8 | 195 |
| STATION | 346.7 | 589.3 | 170 |
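The raw per-record features above are mostly instantaneous readings, which is what motivates the paper's time-series feature extraction. The exact features are not specified in this excerpt, so the following is a minimal sketch, assuming sliding-window statistics (mean, standard deviation, min, max) computed per bus over the trainable CAN bus signals; the window length and the choice of statistics are illustrative assumptions, not the authors' exact method.

```python
import pandas as pd

# Columns marked "Used to Train" in the feature table above.
TRAIN_COLS = ["VELOCITY", "MILEAGE", "TOTAL", "FRONT", "REAR", "ENGINESPEED",
              "ENGINETEMP", "GPS", "LONGITUDE", "LATITUDE", "DIRECTION", "STATION"]

def window_features(df: pd.DataFrame, window: int = 30) -> pd.DataFrame:
    """Sliding-window statistics per bus (LOGID), ordered by time (GPSDATE).

    Illustrative stand-in for the paper's feature extraction, not the
    authors' exact method.
    """
    out = []
    for _, g in df.sort_values("GPSDATE").groupby("LOGID"):
        roll = g[TRAIN_COLS].rolling(window, min_periods=1)
        stats = pd.concat(
            {stat: roll.agg(stat) for stat in ("mean", "std", "min", "max")},
            axis=1,
        )
        stats.columns = [f"{col}_{stat}{window}" for stat, col in stats.columns]
        out.append(stats)
    return pd.concat(out).fillna(0.0)

# The descriptive statistics in the table above correspond to:
# df[TRAIN_COLS].agg(["mean", "std", "median"])
```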
Seven state-of-the-art machine learning methods used in the proposed boosting method.
| Machine Learning Methods |
|---|
| Support Vector Machine (SVM) |
| Random Forest (RF) |
| Naive Bayes |
| Discriminant Analysis |
| Adaptive Boosting (AdaBoost) |
| k-Nearest Neighbors (KNN) |
| Long Short-Term Memory (LSTM) |
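This excerpt does not spell out how the proposed boosting method combines these base learners, so the sketch below shows one plausible, AdaBoost-style scheme: each round fits a different learner on data resampled by the current sample weights, weighs the learner by its error, and re-weights the samples. Treat every name and choice here as a hypothetical illustration rather than the authors' algorithm; an LSTM can participate if wrapped with the same fit/predict interface.

```python
import numpy as np

def boost_heterogeneous(models, X, y, seed=0):
    """AdaBoost-style combination of heterogeneous base learners.

    Illustrative sketch only; labels are assumed to be in {0, 1}
    (e.g., unsafe vs. safe driving behaviour).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)                     # sample weights
    alphas = []
    for model in models:
        idx = rng.choice(n, size=n, p=w)        # weighted resampling
        model.fit(X[idx], y[idx])
        miss = model.predict(X) != y
        err = float(np.clip(w[miss].sum(), 1e-10, 1 - 1e-10))
        alpha = 0.5 * np.log((1 - err) / err)   # learner weight
        w *= np.exp(np.where(miss, alpha, -alpha))
        w /= w.sum()
        alphas.append(alpha)
    return models, alphas

def boosted_predict(models, alphas, X):
    """Weighted vote of the fitted learners."""
    votes = sum(a * np.where(m.predict(X) == 1, 1.0, -1.0)
                for m, a in zip(models, alphas))
    return (votes > 0).astype(int)
```

With the seven learners above, `models` would hold instances such as `sklearn.svm.SVC()` or `sklearn.ensemble.RandomForestClassifier()`.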
Figure 1. The long short-term memory (LSTM) network used in the proposed boosting method.
Four components of the LSTM network that help avoid the vanishing gradient problem found in standard RNNs.
| Components | Purposes |
|---|---|
| Input gate | Controls how much new information enters the cell state |
| Output gate | Updates the output hidden state |
| Forget gate | Resets (forgets) parts of the cell state |
| Cell candidate | Supplies candidate values for updating the cell state |
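For reference, the four components correspond to the standard LSTM update equations (the notation below, with input weights W, recurrent weights R, and biases b, is the common textbook one and is not taken from the paper's figure):

```latex
\begin{aligned}
i_t &= \sigma\!\left(W_i x_t + R_i h_{t-1} + b_i\right) && \text{input gate} \\
f_t &= \sigma\!\left(W_f x_t + R_f h_{t-1} + b_f\right) && \text{forget gate} \\
g_t &= \tanh\!\left(W_g x_t + R_g h_{t-1} + b_g\right) && \text{cell candidate} \\
o_t &= \sigma\!\left(W_o x_t + R_o h_{t-1} + b_o\right) && \text{output gate} \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t, \qquad h_t = o_t \odot \tanh(c_t)
\end{aligned}
```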
Comparison of the eight methods on the Warrigal dataset without our time-series features, with models trained on 70% and 90% of the examples. The proposed method achieves the highest accuracy and AUC.
| Methods | Accuracy (%), 70% | AUC, 70% | Sensitivity, 70% | Specificity, 70% | Accuracy (%), 90% | AUC, 90% | Sensitivity, 90% | Specificity, 90% |
|---|---|---|---|---|---|---|---|---|
| Our Method | 92.9 | 0.923 | 0.932 | 0.912 | 93.7 | 0.931 | 0.939 | 0.922 |
| AdaBoost | 79.5 | 0.761 | 0.832 | 0.608 | 83.5 | 0.798 | 0.871 | 0.652 |
| Simple Bayes | 77.3 | 0.742 | 0.825 | 0.513 | 80.1 | 0.764 | 0.857 | 0.522 |
| Discriminant | 77.1 | 0.728 | 0.815 | 0.551 | 82.5 | 0.776 | 0.869 | 0.607 |
| KNN | 83.2 | 0.733 | 0.873 | 0.628 | 84.3 | 0.745 | 0.886 | 0.628 |
| RF | 90.7 | 0.903 | 0.923 | 0.828 | 91.8 | 0.910 | 0.935 | 0.830 |
| SVM | 75.9 | 0.756 | 0.722 | 0.896 | 82.3 | 0.815 | 0.796 | 0.906 |
| LSTM | 91.4 | 0.818 | 0.938 | 0.791 | 92.6 | 0.822 | 0.951 | 0.801 |
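Assuming the last two columns in each group are the per-class true-positive and true-negative rates (sensitivity and specificity), which is consistent with the reported accuracies, the metrics in these tables can be computed as in this generic scikit-learn sketch (not code from the paper):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

def report(y_true, y_pred, y_score):
    """Accuracy (%), AUC, sensitivity (TPR), and specificity (TNR)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy_pct": 100.0 * accuracy_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),  # y_score: positive-class scores
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```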
Comparison of the eight methods on the Warrigal dataset with our feature extraction method. Comparing the accuracy and AUC values with those in Table 5 shows that every machine learning method improves with our feature extraction method.
| Methods | Accuracy (%), 70% | AUC, 70% | Sensitivity, 70% | Specificity, 70% | Accuracy (%), 90% | AUC, 90% | Sensitivity, 90% | Specificity, 90% |
|---|---|---|---|---|---|---|---|---|
| Our Method | 94.1 | 0.944 | 0.943 | 0.927 | 95.8 | 0.952 | 0.962 | 0.937 |
| AdaBoost | 80.7 | 0.781 | 0.833 | 0.674 | 84.3 | 0.820 | 0.868 | 0.718 |
| Simple Bayes | 80.2 | 0.759 | 0.828 | 0.669 | 82.7 | 0.804 | 0.858 | 0.672 |
| Discriminant | 79.5 | 0.776 | 0.814 | 0.697 | 82.9 | 0.808 | 0.848 | 0.731 |
| KNN | 83.5 | 0.751 | 0.854 | 0.737 | 87.4 | 0.782 | 0.898 | 0.752 |
| RF | 91.3 | 0.913 | 0.919 | 0.879 | 93.9 | 0.937 | 0.946 | 0.903 |
| SVM | 78.2 | 0.755 | 0.746 | 0.907 | 82.8 | 0.819 | 0.801 | 0.907 |
| LSTM | 92.7 | 0.825 | 0.941 | 0.853 | 94.8 | 0.901 | 0.961 | 0.883 |
Comparison of the eight methods on our real-world dataset without the proposed feature extraction method. In terms of classification accuracy and AUC, our boosting method outperforms the other state-of-the-art methods in all cases.
| Methods | Accuracy (%), 70% | AUC, 70% | Sensitivity, 70% | Specificity, 70% | Accuracy (%), 90% | AUC, 90% | Sensitivity, 90% | Specificity, 90% |
|---|---|---|---|---|---|---|---|---|
| Our Method | 91.2 | 0.870 | 0.921 | 0.838 | 92.9 | 0.881 | 0.940 | 0.844 |
| AdaBoost | 78.1 | 0.740 | 0.808 | 0.574 | 80.0 | 0.753 | 0.827 | 0.592 |
| Simple Bayes | 62.7 | 0.586 | 0.643 | 0.521 | 62.0 | 0.577 | 0.634 | 0.515 |
| Discriminant | 58.4 | 0.539 | 0.589 | 0.543 | 60.0 | 0.564 | 0.604 | 0.568 |
| KNN | 72.1 | 0.648 | 0.726 | 0.579 | 75.2 | 0.685 | 0.769 | 0.619 |
| RF | 89.7 | 0.816 | 0.924 | 0.696 | 91.2 | 0.804 | 0.936 | 0.727 |
| SVM | 57.7 | 0.558 | 0.532 | 0.852 | 61.1 | 0.598 | 0.567 | 0.873 |
| LSTM | 77.3 | 0.757 | 0.786 | 0.676 | 85.2 | 0.790 | 0.873 | 0.694 |
Comparison of the eight methods on our real-world dataset with the proposed feature extraction method. Comparing the classification accuracy and AUC values with those in Table 7 shows that every method improves with our feature extraction method.
| Methods | Accuracy (%), 70% | AUC, 70% | Sensitivity, 70% | Specificity, 70% | Accuracy (%), 90% | AUC, 90% | Sensitivity, 90% | Specificity, 90% |
|---|---|---|---|---|---|---|---|---|
| Our Method | 95.6 | 0.947 | 0.960 | 0.921 | 96.7 | 0.969 | 0.968 | 0.957 |
| AdaBoost | 83.7 | 0.781 | 0.857 | 0.685 | 85.1 | 0.802 | 0.867 | 0.731 |
| Simple Bayes | 66.8 | 0.651 | 0.674 | 0.622 | 72.1 | 0.702 | 0.724 | 0.692 |
| Discriminant | 77.8 | 0.698 | 0.794 | 0.651 | 80.1 | 0.732 | 0.816 | 0.686 |
| KNN | 77.2 | 0.716 | 0.789 | 0.640 | 77.6 | 0.727 | 0.790 | 0.669 |
| RF | 93.9 | 0.903 | 0.958 | 0.790 | 94.1 | 0.904 | 0.956 | 0.827 |
| SVM | 77.9 | 0.689 | 0.753 | 0.857 | 80.4 | 0.732 | 0.781 | 0.881 |
| LSTM | 75.1 | 0.734 | 0.765 | 0.642 | 80.3 | 0.767 | 0.820 | 0.675 |
Classification accuracies of the eight methods on our real-world dataset. In this experiment, the training and test samples were collected in four different periods, so the samples are grouped into four subsets corresponding to those time periods.
| Methods | With Our Features | Without Our Features |
|---|---|---|
| Our Method | 86.4% | 68.7% |
| AdaBoost | 69.3% | 66.4% |
| Simple Bayes | 69.4% | 59.9% |
| Discriminant Analysis | 62.3% | 58.7% |
| KNN | 71.9% | 56.9% |
| RF | 65.6% | 61.1% |
| SVM | 84.1% | 62.0% |
| LSTM | 82.3% | 63.9% |
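The excerpt does not state exactly how the four period subsets are divided into training and test sets; one natural protocol, sketched below under that assumption, is leave-one-period-out evaluation:

```python
from sklearn.model_selection import LeaveOneGroupOut

def cross_period_accuracy(model, X, y, periods):
    """Train on three collection periods, test on the held-out one,
    and average accuracy over the four splits (assumed protocol)."""
    scores = []
    for tr, te in LeaveOneGroupOut().split(X, y, groups=periods):
        model.fit(X[tr], y[tr])
        scores.append((model.predict(X[te]) == y[te]).mean())
    return sum(scores) / len(scores)
```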
A summary of the training time, in seconds, required by each method ('k' denotes thousands of seconds); 70% and 90% refer to the share of examples used for training.
| Methods | New Dataset, With Features, 70% | New Dataset, With Features, 90% | New Dataset, Without Features, 70% | New Dataset, Without Features, 90% | Warrigal, With Features, 70% | Warrigal, With Features, 90% | Warrigal, Without Features, 70% | Warrigal, Without Features, 90% |
|---|---|---|---|---|---|---|---|---|
| Our Method | 12.8 | 17.4 | 7.5 | 10.6 | 42.1k | 62.9k | 21.7k | 34.5k |
| AdaBoost | 2.7 | 3.3 | 1.9 | 2.2 | 13.3k | 17.2k | 9.72k | 14.5k |
| Simple Bayes | 0.11 | 0.13 | 0.04 | 0.05 | 6.6k | 8.3k | 4.52k | 6.1k |
| Discriminant | 0.08 | 0.10 | 0.04 | 0.07 | 1.7k | 2.1k | 1.1k | 1.5k |
| KNN | 0.07 | 0.12 | 0.06 | 0.08 | 5.4k | 7.2k | 3.2k | 5.1k |
| RF | 0.35 | 0.49 | 0.31 | 0.35 | 2.9k | 3.6k | 2.1k | 2.4k |
| SVM | 2.1 | 2.8 | 1.5 | 2.6 | 12.6k | 15.4k | 8.7k | 12.5k |
| LSTM | 1.9 | 2.3 | 1.6 | 2.0 | 11.7k | 14.2k | 7.1k | 10.5k |
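The wall-clock times above could be measured with a simple timer; a minimal sketch:

```python
import time

def timed_fit(model, X, y):
    """Return the wall-clock training time of model.fit in seconds."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start
```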
A summary of the classification accuracy of the proposed boosting method with different values of parameter U.
| Values of U | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| Methods Trained with 70% Examples | 91.2% | 93.5% | 93.9% | 94.1% | 94.0% |
| Methods Trained with 90% Examples | 92.7% | 93.9% | 95.6% | 95.8% | 95.2% |
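The role of U is not defined in this excerpt; assuming it is an integer hyperparameter of the boosting method (for example, the number of boosting rounds), a sweep like the one in the table could be run as follows, where `build_model` is a hypothetical factory returning an unfitted classifier configured with the given U:

```python
def sweep_U(build_model, X_train, y_train, X_test, y_test, values=(2, 3, 4, 5, 6)):
    """Train one model per candidate U and record its test accuracy."""
    results = {}
    for U in values:
        model = build_model(U)
        model.fit(X_train, y_train)
        results[U] = (model.predict(X_test) == y_test).mean()
    return results
```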