Lianjie Zhou, Nengcheng Chen, Sai Yuan, Zeqiang Chen.
Abstract
The efficient sharing of spatio-temporal trajectory data is important for understanding traffic congestion in mass data. However, the data volumes of bus networks in urban cities are growing rapidly, reaching daily volumes of one hundred million datapoints. Accessing and retrieving mass spatio-temporal trajectory data in any field is hard and inefficient due to limited computational capabilities and incomplete data organization mechanisms. Therefore, we propose an optimized and efficient spatio-temporal trajectory data retrieval method based on the Cloudera Impala query engine, called ESTRI, to enhance the efficiency of mass data sharing. As a query tool well suited to mass data, Impala can be applied to mass spatio-temporal trajectory data sharing. In ESTRI we extend the spatio-temporal trajectory data retrieval function of Impala and design a suitable data partitioning method. In our experiments, the Taiyuan BeiDou (BD) bus network is selected, containing 2300 buses with BD positioning sensors and producing 20 million records every day, which gives rise to the two difficulties described in the Introduction section. ESTRI and MongoDB are both applied in the experiments. The experiments show that ESTRI retrieves data more efficiently than MongoDB for data volumes of fifty million, one hundred million, one hundred and fifty million, and two hundred million records. The performance of ESTRI is approximately seven times higher than that of MongoDB, which shows that ESTRI is an effective method for retrieving mass spatio-temporal trajectory data. Finally, bus distribution mapping in Taiyuan city is achieved, describing the bus density in different regions at different times throughout the day, which can be applied in future studies of transport, such as traffic scheduling, traffic planning and traffic behavior management in intelligent public transportation systems.
Keywords: Beidou positioning sensor; cloud computing; data retrieval; spatial-temporal data
Year: 2016 PMID: 27801869 PMCID: PMC5134472 DOI: 10.3390/s16111813
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Heterogeneous and typical vehicle monitoring networks.
| Shenzhen taxi network | Wuhan taxi network | Taiyuan bus network | New York taxi network |
| --- | --- | --- | --- |
| Shenzhen, China | Wuhan, China | Taiyuan, China | New York, America |
| Global Positioning System (GPS) | GPS | BeiDou (BD) Navigation Satellite System | GPS |
| February 2011 | March 2012 | August 2013 | June 2009 |
| 25,000 | 12,137 | 2200 | 33,000 |
| Speed, location, direction | Speed, location, direction | Speed, location, direction | Speed, location, direction |
| 60 | 30 | 30 | 30 |
| 800 million | 860 million | 12.67 million | 984 million |
| Oracle | MongoDB cluster | MySQL | Unknown |
Figure 1. Impala query processing diagram.
Figure 2. Hilbert encoding process in the HDFS data blocks.
Figure 3. (a) The mapping of the data types from a relational database to the HDFS; (b) Table design in HDFS for the SOS.
Figure 4. Spatial distributions of bus stations, bus lines and arterial roads in Taiyuan, Shanxi Province, China.
Figure 5. Retrieved BD bus locations in the Taiyuan National High-Tech Industrial Development Zone at different times: (a) 6:20 a.m. to 7:20 a.m.; (b) 11:20 a.m. to 12:20 p.m.; (c) 4:20 p.m. to 5:20 p.m.; and (d) 6:20 p.m. to 7:20 p.m.
Figure 6. Bus distribution map of Taiyuan city: (a) from 7:30 a.m. to 9:00 a.m.; (b) from 11:00 a.m. to 12:00 p.m.; (c) from 12:30 p.m. to 2:00 p.m.; and (d) from 7:30 p.m. to 9:00 p.m.
Figure 7. Time consumption for fifty hundred data records: ESTRI compared with Impala without spatial partitioning.
Figure 8. Time consumption of ESTRI for different data volumes.
Figure 9. Time consumption comparisons between ESTRI (black) and MongoDB (red) for different amounts of data: (a) fifty million records; (b) one hundred million records; (c) one hundred and fifty million records; and (d) two hundred million records.
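
Figure 2 refers to Hilbert encoding of the data blocks, but the details of the ESTRI partitioning scheme are not given in this excerpt. As an illustration only, the following minimal Python sketch shows the standard mapping from a grid cell to its position along a Hilbert curve, which is one common way to derive a spatial partition key that keeps nearby cells close together in storage. The function names, the grid size, and the bounding box are assumptions made for the example and are not taken from the paper.

```python
def hilbert_index(n, x, y):
    """Return the Hilbert curve distance of cell (x, y) in an n-by-n grid.

    n must be a power of two, and 0 <= x, y < n.
    This is the generic textbook mapping, not the ESTRI implementation.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def cell_of(lon, lat, bbox, n):
    """Map a position to a (column, row) cell of an n-by-n grid over bbox.

    bbox is (min_lon, min_lat, max_lon, max_lat); it is a hypothetical
    study-area extent chosen by the caller.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    col = int((lon - min_lon) / (max_lon - min_lon) * n)
    row = int((lat - min_lat) / (max_lat - min_lat) * n)
    # Clamp points on or outside the upper edges into the last cell.
    col = min(n - 1, max(0, col))
    row = min(n - 1, max(0, row))
    return col, row


def partition_key(lon, lat, bbox, n):
    """Hypothetical spatial partition key for one position record."""
    col, row = cell_of(lon, lat, bbox, n)
    return hilbert_index(n, col, row)
```

Records could then be grouped or sorted by `partition_key` before loading, so that a spatial range query touches a small number of contiguous key ranges. Whether ESTRI uses the key in this way is not stated here.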