Teresa Cristóbal, Gabino Padrón, Alexis Quesada-Arencibia, Francisco Alayón, Gabriel de Blasio, Carmelo R. García.
Abstract
In road-based mass transit systems, travel time is a key factor in quality of service. This article proposes a method for predicting travel time in this type of transport system. The method estimates travel time from two sources: its historical behaviour, represented by historical profiles, and the current behaviour recorded on the public transport vehicle for which the prediction is made. The model uses the k-medoids clustering algorithm to obtain the historical travel time profiles. A relevant feature of the model is that it does not require recent travel time data from other vehicles, so it may be used in intercity transport contexts in which service is planned according to timetables. The model was tested on two real intercity public transport routes; in general, the average prediction error is around 13% of the observed travel time.
Keywords: automatic vehicle location; clustering; intelligent transport systems; road-based mass transit systems; travel time prediction
Year: 2019; PMID: 31261640; PMCID: PMC6650887; DOI: 10.3390/s19132869
Source DB: PubMed; Journal: Sensors (Basel); ISSN: 1424-8220; Impact factor: 3.576
Advantages and disadvantages of short-term TT prediction models.
| Model | Technique | Advantages | Disadvantages |
|---|---|---|---|
| | | Predictive power; ability to discover non-linear relationships | Low interpretability; high volumes of data are required; poor scalability for handling large volumes of data; high cost for applying to all the lines of a transport network |
| | | Predictive power; ability to discover non-linear relationships | Low interpretability; high volumes of data are required; poor scalability for handling large volumes of data; high cost for applying to all the lines of a transport network |
| | | Non-parametric; handling missing data and outliers | Low interpretability; high volumes of data are required |
| | | Non-parametric; high scalability for handling large volumes of data; interpretability | Poor predictive power |
| | | Ability to filter noisy data; ability to react to unexpected events | Based on the most recent data samples |
| | | High computational speed | Highly sensitive to outliers |
| | Smoothing functions | Simplicity | Poor predictive power |
| Flow Conservation and Traffic Dynamic | Queueing theory | Realistic models for complex realities | Independent of the input data distribution |
Figure 1. Overview of the TT prediction model.
Notation used for the model entities.
| Symbol | Description |
|---|---|
| | Travel Time |
| | Generic line of public transport |
| | Specific line of public transport identified by code c |
| | Set of trips of |
| | Set of trips of |
| | Ordered set of bus stops of |
| | Ordered set of points of interest of |
| | Trip of |
| | Arrival times observed at the points of interest on trip |
| | Set comprising the arrival times on all trips of |
| | Dwell time |
| | Dwell time at bus stop |
| | Nonstop running time |
| | Nonstop running time in segment n of the route |
| | Observed arrival time at |
| | Predicted arrival time at |
Figure 2. Schematic representation of a route according to the model.
Figure 3. Representation of trip data and resulting medoids.
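A minimal sketch of how k-medoids could turn trip data into representative profiles, assuming each trip is a tuple of cumulative arrival times (in seconds) at the points of interest and that profiles are compared with the Manhattan distance. The function names and the sample trips are hypothetical, and the exhaustive search is chosen only for clarity; it is not claimed to be the procedure used in the article.

```python
from itertools import combinations


def manhattan(a, b):
    """Sum of absolute differences between two equally long profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(trips, medoids):
    """Sum, over all trips, of the distance to the closest medoid."""
    return sum(min(manhattan(t, m) for m in medoids) for t in trips)


def k_medoids(trips, k):
    """Exhaustive k-medoids: evaluate every k-subset of trips as the medoid set.

    Only practical for tiny inputs; real data would need a heuristic such as PAM.
    """
    best = min(combinations(trips, k), key=lambda meds: total_cost(trips, meds))
    return list(best)


# Hypothetical trips: three faster ones and three slower ones.
trips = [
    (180, 700, 1250), (185, 710, 1260), (190, 720, 1270),
    (300, 900, 1600), (310, 910, 1610), (320, 920, 1620),
]
medoids = k_medoids(trips, 2)
```

Unlike a k-means centroid, each medoid is one of the observed trips, so every profile corresponds to a journey that actually took place.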
Figure 4. Schematic representation of the prediction method.
Numerical illustration of a TT prediction on a route with four points of interest (P) and using three medoids (M). The TT row gives the observed TTs, and D(TT, M) is the Manhattan distance between the observed TTs and medoid M.
| | | | | | |
|---|---|---|---|---|---|
| M1 | (360) | (360,900) | (360,900,1620) | (360,900,1620,1980) | (360,900,1620,1980,2880) |
| M2 | (240) | (240,780) | (240,780,1380) | (240,780,1380,1740) | (240,780,1380,1740,2640) |
| M3 | (240) | (240,720) | (240,720,1200) | (240,720,1200,1500) | (240,720,1200,1500,2340) |
| TT | (180) | (180,720) | (180,720,1260) | (180,720,1260,1620) | (180,720,1260,1620,2460) |
| D(TT, M1) | 180 | 360 | 720 | 1080 | 1500 |
| D(TT, M2) | 60 | 120 | 240 | 360 | 540 |
| D(TT, M3) | 60 | 60 | 120 | 240 | 360 |
| Predicted TT | | 720 | 1200 | 1560 | 2460 |
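The arithmetic of the illustration can be retraced with a short sketch. It takes the first three tuples of the last column as the medoids and the fourth as the observed trip, and it assumes a prediction rule that is consistent with the values shown: pick the medoid closest to the observations so far, then add that medoid's travel time over the next segment to the last observed time. The tie-breaking rule (first medoid listed wins) and all function names are assumptions made for this example.

```python
def manhattan(a, b):
    """Sum of absolute differences between two equally long profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_next(observed, medoids):
    """Predict the cumulative time at the next point of interest.

    observed: cumulative times seen so far on the current trip.
    medoids:  full historical profiles (cumulative times).
    Assumed rule: choose the medoid closest to the observed prefix (ties go
    to the first one listed), then add its duration over the next segment
    to the last observed time.
    """
    n = len(observed)
    closest = min(medoids, key=lambda m: manhattan(observed, m[:n]))
    return observed[-1] + (closest[n] - closest[n - 1])


medoids = [
    (360, 900, 1620, 1980, 2880),
    (240, 780, 1380, 1740, 2640),
    (240, 720, 1200, 1500, 2340),
]
tt = (180, 720, 1260, 1620, 2460)

# Distance from the observed prefix to each medoid, column by column.
distances = [[manhattan(tt[:n], m[:n]) for n in range(1, 6)] for m in medoids]
# Prediction for each point after the first, using only earlier observations.
predictions = [predict_next(tt[:n], medoids) for n in range(1, 5)]
```

Under these assumptions `distances[0]` is `[180, 360, 720, 1080, 1500]` and `predictions` is `[720, 1200, 1560, 2460]`, which agrees with the figures that survive in the illustration.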
Length and number of stops on the lines considered in the use case.
| | | |
|---|---|---|
| Length (km) | 60 | 32 |
| Number of stops | 91 | 42 |
| Points of Interest | 7 | 5 |
Figure 5. Representation of the two lines analysed.
Number of records from each of the datasets for the preparation phase.
| | | |
|---|---|---|
| | 2,038,668 | 615,813 |
| | 11,847 | 9887 |
| | 8419 | 7862 |
Figure 6. Behaviour of the silhouette function for each route depending on the number of clusters used in the clustering process. (a) Route L1; (b) Route L303.
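For reference, the quantity plotted in a silhouette analysis can be computed as in the following sketch, which uses the standard definition s = (b - a) / max(a, b), where a is the mean distance from a point to the rest of its own cluster and b is the mean distance to the nearest other cluster. The use of the Manhattan distance here, the function names, and the sample data are assumptions for illustration.

```python
def manhattan(a, b):
    """Sum of absolute differences between two equally long profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_silhouette(points, labels):
    """Average silhouette coefficient over all points."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, grouped by cluster label.
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(manhattan(p, q))
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            # Singleton cluster, or only one cluster overall:
            # this sketch scores the point as 0.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())  # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical profiles forming two obvious groups.
trips = [(0, 10), (1, 11), (20, 40), (21, 41)]
good = mean_silhouette(trips, [0, 0, 1, 1])  # labels follow the groups
bad = mean_silhouette(trips, [0, 1, 0, 1])   # labels cut across the groups
```

Repeating this for several candidate numbers of clusters and keeping the value with the highest mean silhouette is a common way to choose k.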
Figure 7. Clusters resulting from applying the k-medoids technique with two clusters to the TT observed on the trips.
MAPE values for each of the segments of the analysed routes.
| | | |
|---|---|---|
| | 0.1106 | 0.1325 |
| | 0.1549 | 0.1098 |
| | 0.0868 | 0.2232 |
| | 0.1413 | 0.1029 |
| | 0.0761 | 0.0939 |
| | 0.0978 | |
| | 0.1067 | |
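As a reminder of what these figures measure, the sketch below computes the mean absolute percentage error (MAPE) in its usual form, the average of |observed - predicted| / observed, expressed as a fraction so that 0.13 corresponds to 13%. The function name and the sample values are hypothetical and are not taken from the article's data.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) / o for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical travel times in seconds.
observed = [600, 1200, 1500]
predicted = [660, 1080, 1500]
error = mape(observed, predicted)  # (0.10 + 0.10 + 0.00) / 3
```

Because each error is divided by the observed value, the measure is undefined when an observed travel time is zero.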
Figure 8. MAPE values obtained by applying the three methods. (a) Values obtained for L1. (b) Values obtained for L303.
Figure 9. Average MAPE values for each method on each of the lines studied.
Time required by the methods, in seconds.
| | | |
|---|---|---|
| | 6572 | 9046 |
| | 385.39 | 74,824 |