Yuchao Zhou, Suparna De, Wei Wang, Klaus Moessner, Marimuthu S Palaniswami.
Abstract
Data searching and retrieval is one of the fundamental functionalities in many Web of Things applications, which need to collect, process and analyze huge amounts of sensor stream data. The problem has been well studied for data generated by sensors installed at fixed locations; however, challenges emerge with the growing popularity of opportunistic sensing applications, in which mobile sensors keep reporting observation and measurement data at variable intervals and from changing geographical locations. To address these challenges, we develop the Geohash-Grid Tree, a spatial indexing technique specially designed for searching data integrated from heterogeneous sources in a mobile sensing environment. Results of experiments on a real-world dataset collected from the SmartSantander smart city testbed show that the index structure allows efficient search based on spatial distance, range and time windows in a large time series database.
Keywords: Web of Things (WoT); mobile sensing; mobile sensor data search; opportunistic sensing; spatial indexing
Year: 2017 PMID: 28629156 PMCID: PMC5492522 DOI: 10.3390/s17061427
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Metrics of Indexing and Query.
| Metrics | Descriptions |
|---|---|
| Data Items | Indicates the items managed by the reviewed system. They are the objects used for the indexed domains and corresponding query functionalities. |
| Indexed Domain | Spatial, temporal, or thematic domains indexed by the system. |
| Supported Query | Query functionalities supported by the system. |
| Metadata Update Frequency | Frequency of update for the metadata of sensor data, especially spatial information. Frequently changing spatial information is one of the key characteristics in mobile sensing environments and a good system should support high metadata update frequency. |
| Query for Historical Data | Whether the reviewed system supports queries for historical data. Historical data is important for data analysis. |
Comparison Table of Related Work.
| Search Method | Data Items | Indexed Domain | Supported Query | Metadata Update Frequency | Query for Historical Data |
|---|---|---|---|---|---|
| IoT service resolution framework | IoT Service metadata | Spatial | Point or area-based spatial query | Slow | - |
| Wei et al. | Sensor metadata | Spatial | Point-based spatial query | Very fast | Not supported |
| OSIRIS | Sensor metadata | A spatial index for spatial domain, a temporal index for temporal domain, a full-text index for thematic domain | Sensor instance discovery and sensor service discovery based on search criteria of metadata, including spatial, temporal, and thematic metadata | Medium | Yes |
| SensorMap | Sensor metadata | Spatial | Spatial search for latest values generated by sensors | Medium | No |
| Liveweb | Sensor data | Keywords indexing for thematic domain, binary search tree for values | Search for real-time content based on keywords, category, and value range | Slow | Yes |
| GeoCENS | Sensor Web Service | Spatial filling curve for spatial domain | Geospatial search based on key-value pair queries | Slow | Yes |
| IoT-SVK | Sensors, devices, objects | Spatial-Temporal R-Tree for spatial and temporal domain, B+-Tree for keywords and values | Keyword-based search | Fast | Yes |
| Linked Stream Middleware (LSM) | Sensor streams as RDF triples | - | SPARQL-based continuous query with semantic constraints (including spatial and temporal domain constraints) | Slow | Yes |
| LOST-Tree | Sensor data and geometries | Spatial, Temporal | Spatial and temporal queries | - | - |
| AHS model | Sensor observation data | Spatial | Spatial query with complex geometry | Very fast | Not supported |
| Bouros et al. | Trajectories | Spatial and temporal | Retrieval of the top-k trajectories that pass as close as possible to all query points | Fast | Yes |
| Chan et al. | Haar Wavelets transformed trajectories | Spatial and temporal | Range query and nearest neighbour query for trajectories | Fast | Yes |
| Cai and Ng | Chebyshev approximation of trajectories | Spatial and temporal | K nearest neighbours query for trajectories | Fast | Yes |
| Chen et al. | Trajectories | Spatial and temporal | K nearest neighbours query for trajectories | Fast | Yes |
| Zhu and Gong | Trajectories | Spatio-temporal R-tree for spatial and time domain | Real-time access to latest trajectories, trajectory-based queries | Fast | Yes |
| Proposed Approach | Virtualized objects and sensor data | Geohash-Grid Tree for spatial domain, TSDB for time domain | Query with spatio-temporal ranges and phenomenon, Aggregations on time-series data | Very fast | Yes |
Note: “-” means the feature is unknown or not made explicit in the paper.
Figure 1. FUTS data retrieval framework.
Figure 2. Example of Original Collected Data and its record in InfluxDB. (a) Example of Original Collected Data; (b) Example of Data Record in InfluxDB.
Example of VO instance.
| 3021 | bus3021 | ||
| Yes | |||
| {43.430007, −3.949993, eztpn45wn} | |||
| {Particles, 0.89, mg/m3, 02 January 2015 17:33:19, density of particles with a diameter between 2.5 and 10 micrometres} | |||
| {Humidity, 0.64, percentage, 02 January 2015 17:33:19, Humidity} | |||
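The location entry of the VO instance pairs a latitude and longitude with a Geohash string. As a minimal sketch of how such a string can be derived, the code below implements the standard public Geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character). It is not code from the paper, and the function name `geohash_encode` is an illustrative choice.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a Geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current character
    nbits = 0
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:  # every five bits form one base-32 character
            chars.append(BASE32[value])
            value = 0
            nbits = 0
    return "".join(chars)


# Coordinates taken from the VO instance above.
print(geohash_encode(43.430007, -3.949993))  # eztpn45wn
```

With nine characters, the example coordinates encode to the same string that appears in the VO instance; shorter precisions yield prefixes of it, which name progressively larger cells.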
Storage Mechanism of InfluxDB.
| Database | Measurement | Tag-key | Tag-value | Field-key | Field-value |
|---|---|---|---|---|---|
| mydb | vo_3021 | geohash | eztpn45wn | | 0.64 |
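To make the storage layout more concrete, the sketch below formats the row above as an InfluxDB line-protocol string (`measurement,tag_set field_set timestamp`). The field key is missing from the table row; `Humidity` is assumed here because the value 0.64 matches the humidity observation of the VO instance, and the timestamp is assumed to be in GMT. The helper `to_line_protocol` is hypothetical and omits the escaping and string-field quoting that real data would need.

```python
from datetime import datetime, timezone


def to_line_protocol(measurement, tags, fields, when):
    """Format one point as an InfluxDB line-protocol string (numeric fields only)."""
    tag_part = ",".join(f"{key}={val}" for key, val in sorted(tags.items()))
    field_part = ",".join(f"{key}={val}" for key, val in fields.items())
    ts_ns = int(when.timestamp()) * 1_000_000_000  # nanosecond precision
    return f"{measurement},{tag_part} {field_part} {ts_ns}"


# Assumed mapping: measurement = VO id, tag = geohash, field = observed value.
when = datetime(2015, 1, 2, 17, 33, 19, tzinfo=timezone.utc)
line = to_line_protocol(
    "vo_3021",
    {"geohash": "eztpn45wn"},
    {"Humidity": 0.64},  # field key is an assumption, not given in the table
    when,
)
print(line)
```

The database name (`mydb`) does not appear in the line itself; in InfluxDB it is chosen when the point is written.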
Figure 3. Z-order Filling Curve to Show Geohash Division. Maps are generated by using the Geohash Explorer service [31].
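The division shown in Figure 3 can also be read in reverse: each Geohash character contributes five bits that alternately halve the longitude and latitude intervals, so every prefix names a rectangular cell. The following is a minimal sketch of that decoding using the standard Geohash scheme; `geohash_bbox` is an illustrative name, not a function from the paper.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet


def geohash_bbox(code):
    """Return (lat_min, lat_max, lon_min, lon_max) of the cell named by `code`."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    use_lon = True  # bits alternate, starting with longitude
    for ch in code:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):  # most significant bit first
            bit = (value >> shift) & 1
            rng = lon if use_lon else lat
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half
            else:
                rng[1] = mid  # keep the lower half
            use_lon = not use_lon
    return lat[0], lat[1], lon[0], lon[1]


# Each additional character shrinks the cell around the same area.
for prefix in ("ezt", "eztp", "eztpn45wn"):
    print(prefix, geohash_bbox(prefix))
```

Because a longer code always decodes to a cell nested inside the cell of its prefix, prefix comparison on the strings doubles as a coarse containment test.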
Figure 4. Geohash-Grid Tree Structure. Maps are generated by using the Geohash Explorer service [31].
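Only the caption of Figure 4 survives in this section, so the exact node layout of the Geohash-Grid Tree cannot be recovered from it. As a rough illustration of the general idea of indexing items by Geohash prefix, the sketch below uses a plain character trie. The class `GeohashTrie`, its methods and the identifier `vo_hypothetical` are assumptions for illustration and should not be read as the authors' structure.

```python
class GeohashTrie:
    """Character trie keyed by Geohash strings (illustrative only)."""

    def __init__(self):
        self.children = {}  # next character -> child node
        self.items = set()  # ids stored at exactly this Geohash

    def insert(self, geohash, item_id):
        node = self
        for ch in geohash:
            node = node.children.setdefault(ch, GeohashTrie())
        node.items.add(item_id)

    def remove(self, geohash, item_id):
        """Drop an id from a cell; empty nodes are left in place for simplicity."""
        node = self
        for ch in geohash:
            node = node.children.get(ch)
            if node is None:
                return
        node.items.discard(item_id)

    def query_prefix(self, prefix):
        """Return all ids whose Geohash starts with `prefix`."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()
        found = set()
        stack = [node]
        while stack:  # collect ids from the whole subtree
            current = stack.pop()
            found |= current.items
            stack.extend(current.children.values())
        return found


index = GeohashTrie()
index.insert("eztpn45wn", "vo_3021")      # Geohash from the VO instance
index.insert("eztr0", "vo_hypothetical")  # made-up second entry
print(index.query_prefix("eztp"))

# A mobile VO that reports a new position is removed and re-inserted.
index.remove("eztpn45wn", "vo_3021")
index.insert("eztpn45wp", "vo_3021")
```

In this simplified form, a location update costs one removal and one insertion along a path whose length is the Geohash precision, which is the kind of cheap metadata update a mobile sensing index needs.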
Statistics of the collected SmartSantander Data.
| Dimension | Total | From | To | Distribution |
|---|---|---|---|---|
| VO | 84 | |||
| Environmental Phenomenon | 5 | CO, Humidity, Ozone + NO2, Particles (PM10), and Temperature | ||
| Latitude | 827 | 42.9855 | 43.6636 | Largely distributed between 43.4 to 43.5 |
| Longitude | 9978 | −4.13309 | −3.53594 | Largely distributed between −3.9 to −3.78 |
| Date | 223 distinct days | Friday, 02 January 2015 17:33:19 GMT | Thursday, 13 August 2015 11:59:09 GMT | Almost a uniform distribution |
Figure 5. Index creation time: Geohash-Grid Tree and R-Tree.
Query Constraints and Number of Query Results from Tree Structures.
| Query constraint | Query 1 | Query 2 | Query 3 | Query 4 |
|---|---|---|---|---|
| Point | (43.1702, −3.89954) | (43.4632, −3.80883) | | |
| Location range | | | (43.4, −3.6) to (43.5, −3.5) | (43.4, −3.9) to (43.5, −3.8) |
| Indexed items density in nearby area | sparse | dense | sparse | dense |
| Number of distinct VO_IDs of returned indexed items | 1 | 1 | 1 | 84 |
Figure 6. Tree response time: Geohash-Grid Tree and R-Tree. (a) Query 1: point matching in a sparse area; (b) Query 2: point matching in a dense area; (c) Query 3: range matching in a sparse area; (d) Query 4: range matching in a dense area.
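Range matching against Geohash-indexed items can be pictured as a two-stage filter: a coarse match on a Geohash prefix that covers the query rectangle, followed by an exact coordinate check. The sketch below illustrates this with the Query 4 range; it is a simplified assumption about how such a query could work, not the authors' algorithm. The record ids are hypothetical; the first two coordinates are the Query 2 and Query 1 points, and the third is made up.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet


def geohash_encode(lat, lon, precision=9):
    """Standard Geohash encoding (longitude bit first, five bits per character)."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value = [], 0
    for i in range(precision * 5):
        rng, coord = (lon_rng, lon) if i % 2 == 0 else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        bit = 1 if coord >= mid else 0
        rng[0 if bit else 1] = mid  # keep the half that contains the coordinate
        value = (value << 1) | bit
        if i % 5 == 4:
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


def common_prefix(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def range_query(records, south_west, north_east):
    """Return (prefix, coarse candidates, exact matches) for a rectangular range."""
    (lat_lo, lon_lo), (lat_hi, lon_hi) = south_west, north_east
    # The cell shared by both corners contains the whole rectangle.
    prefix = common_prefix(geohash_encode(lat_lo, lon_lo),
                           geohash_encode(lat_hi, lon_hi))
    coarse = [r for r in records if geohash_encode(r[1], r[2]).startswith(prefix)]
    exact = [r for r in coarse
             if lat_lo <= r[1] <= lat_hi and lon_lo <= r[2] <= lon_hi]
    return prefix, coarse, exact


records = [
    ("vo_a", 43.4632, -3.80883),    # Query 2 point, inside the Query 4 range
    ("vo_b", 43.1702, -3.89954),    # Query 1 point, outside the range
    ("vo_far", 40.0, -3.7),         # made-up point far to the south
]
prefix, coarse, exact = range_query(records, (43.4, -3.9), (43.5, -3.8))
print(prefix, [r[0] for r in coarse], [r[0] for r in exact])
```

A single shared-prefix cell can be far larger than the query rectangle, so the coarse stage may admit many candidates that the exact check then discards; covering the rectangle with several finer cells would tighten the first stage at the cost of more lookups.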
Query Constraints and Number of Query Results.
| Query constraint | Query 1 | Query 2 | Query 3 | Query 4 | Query 5 |
|---|---|---|---|---|---|
| Phenomenon | Temperature | Temperature | Temperature | Temperature | Temperature |
| Time_from | 21 January 2015 10:00 a.m. | 21 January 2015 10:00 a.m. | 21 January 2015 10:00 a.m. | 21 January 2015 10:00 a.m. | 21 January 2015 10:00 a.m. |
| Time_to | 22 January 2015 10:00 a.m. | 26 January 2015 10:00 a.m. | 22 July 2015 10:00 a.m. | 26 July 2015 10:00 a.m. | 22 July 2015 10:00 a.m. |
| Time range | 1 day | 5 days | ~6 months | ~6 months | ~6 months |
| Location range | (43.4, −3.94) to (43.42, −3.93) | (43.47, −3.79) to (43.473, −3.785) | (43.4, −3.94) to (43.42, −3.93) | (43.47, −3.79) to (43.473, −3.785) | (43.467, −3.79) to (43.47, −3.787) |
| Data records density of nearby area | sparse | dense | sparse | dense | dense |
| VOs in indexed trees | 2 | 46 | 2 | 46 | 58 |
| Number of Returned Data Records | |||||
| Geo-coordinates_as_Field | 1 | 4 | 7 | 107 | 1004 |
| Geohash-Grid Tree | 1 | 4 | 7 | 107 | 1004 |
| R-Tree | 1 | 4 | 7 | 107 | 1004 |
| Geohash_as_Tag | 616 | 2774 | 88,864 | 85,459 | 93,404 |
Figure 7. Query Response Time for different methods and selection criteria: (a) Query 1; (b) Query 2; (c) Query 3; (d) Query 4; (e) Query 5.