Yuchao Zhou, Suparna De, Wei Wang, Klaus Moessner.
Abstract
The Web of Things aims to make physical world objects and their data accessible through standard Web technologies to enable intelligent applications and sophisticated data analytics. Due to the amount and heterogeneity of the data, it is challenging to perform data analysis directly, especially when the data is captured from a large number of distributed sources. However, the size and scope of the data can be reduced and narrowed down with search techniques, so that only the most relevant and useful data items are selected according to the application requirements. Search is fundamental to the Web of Things, yet it is challenging by nature in this context because of, e.g., the mobility of objects, opportunistic presence and sensing, continuous data streams with changing spatial and temporal properties, and the need for efficient indexing of historical and real-time data. The research community has developed numerous techniques and methods to tackle these problems, as reported by a large body of literature in the last few years. A comprehensive investigation of the current and past studies is necessary to gain a clear view of the research landscape and to identify promising future directions. This survey reviews the state-of-the-art search methods for the Web of Things, which are classified according to three different viewpoints: basic principles, data/knowledge representation, and contents being searched. Experiences and lessons learned from the existing work and some EU research projects related to the Web of Things are discussed, and an outlook on future research is presented.
Keywords: Internet of Things; Web of Things; entities; linked data; observation and measurement data; search; sensors; streaming data
Year: 2016 PMID: 27128918 PMCID: PMC4883291 DOI: 10.3390/s16050600
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Classifications of the Search Techniques in WoT.
Figure 2. WoT System Model.
Table 1. WoT Applications and Enabling Search Techniques.
| Domain | Applications | Search Requirements | Search Techniques |
|---|---|---|---|
| Smart Cities | Community-based Flood Monitoring | Require search with multiple functionalities (e.g., search by descriptions, by locations, or others) | Real-time Data Retrieval |
| Home Automation | Adaptive Building; Smart Control of Washing Machine | Dynamic selection of the surrounding/affecting resources; Discovery of devices | Entity Search; Sensor Search |
| Manufacturing | Lifecycle Management for Industrial Automation Systems | The discovery needs to be enabled in a dynamic environment where physical resources appear and disappear during lifecycle phases. | Context-based Search |
| Smart Grids | Virtual Power Plants | Require dynamically finding operable distributed energy resources | Entity Search |
| Network Management | Device, Network, and Application Management | Require finding all devices that have a certain set of properties. | Context-based Search |
Table 2. Comparison of Search Techniques According to Basic Principles Classification (Abbreviations are listed in the footer).
| Classification | Search Technique | Data Format | Access Approach | Search Type | Scale of Experiments | D ¹ | Ar ² | Implementation | Remarks |
|---|---|---|---|---|---|---|---|---|---|
| Text Indexing | GSN | Virtual sensors in XML | HTTP REST | Keyword-based search, SQL query with time window | - | Yes | Cen | Java/MySQL | Extended as XGSN |
| Text Indexing | SenseWeb | Sensor ontology (an extension of the namespace defined by OGC SWE) | Map display with text description | Keyword/spatial search | - | Yes | Cen | - | Many applications described but none is accessible |
| Text Indexing | Dyser | Virtual sensors (microformat) | URL/HTTP/HTML | Keyword-based/value-based | 385 sensors over 5 months | Yes | Cen | Java/PHP | Query of state of real world objects |
| Text Indexing | Microsearch | Direct communication with sensor motes with built-in metadata | Virtual sensors in built-in format | Keyword-based top-k search | - | Yes | Dis | TelosB motes | Early search implementations with limited functionalities and limited scalability |
| Text Indexing | LiveWeb | Sensor data in XML | HTML/AJAX | Real-time content search based on keywords, category, and value range | - | Yes | Cen | PHP/Java/Apache/MySQL | Practical implementation without evaluation; should fit global scale |
| Text Indexing | lmDNS-SD | Extended DNS record for sensors and objects | List of sensors | Resource directory based on DNS-SD | - | Yes | P2P | - | Existing Internet standards utilised, thus fits Web scale |
| Text Indexing | DNS Search | Extended DNS record for sensors | HTTP REST/WADL | URL and DNS-based (type/location) | 250,000 zones | Yes | Dis | BIND/MySQL | Utilises DNS structure; fits global scale |
| Text Indexing | Mobile Digcovery | Integration of Multi-types | JSON-based response | Multi-functional JSON-based query | - | Yes | Dis | ElasticSearch | Combination of search engine and Web Infrastructure |
| Text Indexing | IoT-SVKSearch | Sampling values with metadata in Raw Data Store | Data in built-in format | Keyword-based (B+-tree), spatial-temporal (R-tree), value-based (B+-tree) search | 140,800~352,000 sensors | Yes | Dec | PostgreSQL/file system | Multiple search functionalities supported |
| Text Indexing | OSIRIS | OGC SWE Sensor model and data model | SWE services | Spatial, temporal and keyword-based search | - | Yes | Cen | JSI/Apache Lucene | Applications in Smart City domain |
| Spatial Indexing | SensorMap | Direct communication with web accessible sensors | Web services APIs | Spatial search (based on COLR-tree/crawling) | - | - | Cen | MSRSense toolkit | SenseWeb-based application, not accessible currently |
| Spatial Indexing | GeoCENS | OGC SOS, SensorML, OGC O&M | SWE services | Spatial search (Peano space filling curves) | ~40,000 sensors/procedures | Yes | P2P | - | Designed for sensor data from smaller organizations or individuals |
| Spatial Indexing | Geographic Service Discovery | Services with geo-information | HTTP REST | Spatial search (R-tree index/category server) | 3200 services | Yes | Dis | Java | R-tree is combined with distributed architecture |
| Spatial Indexing | FUTS Data Query | Virtual Object ontology in OWL-DL | SenML/JSON | Spatial search with time window/sensor type | 20,000 sensors | Yes | Cen | Java | Cloud database for huge data storage |
| Spatial Indexing | Geospatial Indexing | OWL-S | SPARQL endpoint | Semantic query (R-tree index) | 10,000 services | Yes | Dis | Java | R-tree and Semantic query combined |
| Location-based Clustering | IoT Platform | SSN ontology/OWL-DL | SPARQL endpoint | SPARQL query | - | Yes | Cen | Java/Apache Tomcat | Demo in smart building, could be applied to a larger scale as well |
| Location-based Clustering | Linked Sensor Streams | RDF stream data | SPARQL query response | Type/location based query (based on clustering) | 20,000 stream data | Yes | Dis | Java | Evaluation provided for clustering, not for search |
| Location-based Clustering | IoT Service Search | OWL-S | Service access standard | Semantic-based query (based on clustering) | 7500 services | No | Dis | Java Web Application/Apache Tomcat | Mobile services may crash this architecture |
| Location-based Clustering | Web-based Infrastructure | Microdata | HTTP REST | Keyword matching with scope | 600 simulated sensors | No | Dis | ApacheBench | Demo in smart building, mobile services may crash this architecture |
| Location-based Clustering | Geocasting | - | - | Geolocation-based query | - | Yes | P2P | - | Flexible approach, no reliance on any model or architecture |
| Non-location-based Clustering | WoT Search | Virtual Object ontology in OWL-DL | - | Search with user preference and geo-location (search by application or human) | - | - | - | - | Technique not implemented |
| Non-location-based Clustering | IoT Serv | Web services (DPWS) | REST/SOAP | Functionality clusters, spatial query (SWC-tree), temporal query | 1600 IoT-WS | Yes | Cen | C#/Matlab | Clustering process does not scale when IoT-WSs increase |
| Non-location-based Clustering | AntClust | SSN Ontology | SPARQL query response | Ant clustering based on context | 100,000 sensors | Yes | Cen | - | Good performance on query time but may need a lot of time to deal with incoming data, thus may not be suitable for real-time search |
¹ D is short for Dynamicity. ² Ar is short for Architecture. Other abbreviations include AJAX: Asynchronous JavaScript and XML, Cen: Centralised, Dec: Decentralised, Dis: Distributed, DNS-SD: DNS-based Service Discovery, DPWS: Devices Profile for Web Services, FUTS: Frequently Updated Timestamped and Structured, HTML: HyperText Markup Language, JSI: Java Spatial Index, JSON: JavaScript Object Notation, O&M-S: Semantically annotated O&M, OGC: Open Geospatial Consortium, OWL: Web Ontology Language, OWL-DL: OWL Description Logics, OWL-S: Semantic Markup for Web Services, RDB: Relational Database, RDFS: RDF Schema, SenML: Sensor Markup Language, SML-S: Semantically annotated SensorML, SOAP: Simple Object Access Protocol, SOS: Sensor Observation Service, SWE: Sensor Web Enablement, WADL: Web Application Description Language, XML: eXtensible Markup Language.
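Several rows in the table above combine keyword-based, spatial, and value-based search over sensor metadata. The following is a minimal Python sketch of that combination, not the implementation of any surveyed system. The `Sensor` and `SensorIndex` names, the whitespace tokenisation, and the bounding-box filter are all assumptions made for illustration; the surveyed systems typically use dedicated structures such as B+-trees or R-trees instead of a linear filter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    """Hypothetical sensor record: textual description, location, latest value."""
    sensor_id: str
    description: str
    lat: float
    lon: float
    value: float


class SensorIndex:
    """Toy inverted index over sensor descriptions (illustrative only)."""

    def __init__(self):
        self._sensors = {}
        self._postings = defaultdict(set)  # term -> set of sensor ids

    def add(self, sensor):
        self._sensors[sensor.sensor_id] = sensor
        for term in sensor.description.lower().split():
            self._postings[term].add(sensor.sensor_id)

    def search(self, keywords, bbox=None, value_range=None):
        """Return sorted ids that match ALL keywords and the optional filters.

        bbox is (min_lat, min_lon, max_lat, max_lon); value_range is (low, high).
        """
        terms = [k.lower() for k in keywords]
        if terms:
            ids = set.intersection(*(self._postings.get(t, set()) for t in terms))
        else:
            ids = set(self._sensors)
        results = []
        for sid in sorted(ids):
            s = self._sensors[sid]
            if bbox is not None:
                min_lat, min_lon, max_lat, max_lon = bbox
                if not (min_lat <= s.lat <= max_lat and min_lon <= s.lon <= max_lon):
                    continue
            if value_range is not None:
                low, high = value_range
                if not (low <= s.value <= high):
                    continue
            results.append(sid)
        return results


index = SensorIndex()
index.add(Sensor("s1", "temperature sensor office", 51.24, -0.59, 21.5))
index.add(Sensor("s2", "temperature sensor car park", 51.30, -0.50, 9.0))
index.add(Sensor("s3", "humidity sensor office", 51.24, -0.59, 40.0))

print(index.search(["temperature", "sensor"]))
print(index.search(["temperature"], bbox=(51.2, -0.6, 51.25, -0.55)))
```

The keyword step narrows the candidate set first, so the spatial and value filters only touch sensors that already match the text query.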
Table 3. Comparison of Search Techniques According to Data/Knowledge Representation (Abbreviations are listed in the footer of Table 2).
| Classification | Search Technique | Data Format | Access Approach | Search Type | Scale of Experiments | D ¹ | Ar ² | Implementation | Remarks |
|---|---|---|---|---|---|---|---|---|---|
| Centralised | Linked Sensor Data | RDF | HTML/XML | SPARQL query | 20,000 weather stations | Yes | Cen | Virtuoso RDF | Combination of Linked Data, sensors and sensor data |
| Centralised | IoT-DS | RDF | SPARQL query response | SPARQL query | 100,000 instances | Yes | Cen | - | Proposed IoT directory is similar to IoT Gateway |
| Centralised | SemSOS | RDF | SML-S/O&M-S | SOS query mapping to SPARQL query | 20,000 sensors | No | Cen | Apache Jena | Mobile sensors are ignored |
| Centralised | SemSOS | SWE based model in Linked Data | OGC SWE SOS | Named location based query | 20,000 weather stations | No | Cen | RDF2Go/Sesame | Different implementation of SemSOS |
| Centralised | LinkedGeoData | RDF | SPARQL endpoint | SPARQL-like query | - | Yes | Cen | Java | Enables geolocation query as part of a SPARQL query |
| Centralised | OWLIM-SE | RDF/RDFS/OWL | SPARQL endpoint | SPARQL-like query | - | Yes | Cen | Java | Enables geolocation query as part of a SPARQL query |
| Centralised | GeoSPARQL | RDF | SPARQL endpoint | SPARQL-like query | - | Yes | Cen | Apache Jena | Enables geolocation query as part of a SPARQL query |
| Federated | DARQ | RDF datasets | SPARQL query response | SPARQL query | - | No | Dec | Java | Requires service descriptions of datasets |
| Federated | ANAPSID | RDF datasets | SPARQL query response | SPARQL query | - | No | Dec | Python | Requires predicates of datasets |
| Federated | SPLENDID | RDF datasets | SPARQL query response | SPARQL query | - | No | Dec | Java | Requires VoID of datasets |
| Federated | FedX | RDF datasets | SPARQL query response | SPARQL query | - | Yes | Dec | Java | Easy to extend, as only the endpoint is required |
| Federated | Federated Query Implementation | RDF datasets | SPARQL query response | SPARQL query | - | Yes | Dec | Java | Easy to extend, as only the endpoint is required |
| Federated | SPARQL 1.1 Federated Query | RDF datasets | SPARQL query response | SPARQL query | - | Yes | Dis | - | W3C Recommendation for Federated Query |
| RDB Mapping | GSN | Virtual sensors in XML | HTTP REST | Keyword-based search | - | Yes | Cen | Java/MySQL | Extended as XGSN |
| RDB Mapping | SenseWeb | Sensor ontology (an extension of the namespace defined by OGC SWE) | Map display with text description | Keyword/spatial search | - | Yes | Cen | - | Many applications described but none is accessible |
| RDB Mapping | SPARQLstream | SSN ontology | SPARQL query response | Continuous SPARQL query with time window | 8000 data values/second | Yes | Cen | - | Data mapped to DSMS |
| Semantic Modelling | C-SPARQL | RDF streams | SPARQL query response | Continuous SPARQL query (time window/periodic execution) | - | Yes | Cen | - | Data mapped to DSMS |
| Semantic Modelling | EP-SPARQL | RDF and RDF event streams | SPARQL query response | Continuous SPARQL query | 20,000 triples/20 locations | Yes | Cen | Prolog language | Designed for Event Processing |
| Semantic Modelling | CQELS | RDF streams | SPARQL query response | Continuous SPARQL query | 10 million triples | Yes | Dis | Java | Processing directly on Linked Stream Data |
| Semantic Modelling | LSM | SSN ontology | SPARQL endpoint/HTTP REST | CQELS | 70,000 sensor data sources | Yes | Cen | Java/Virtuoso/Hadoop | Integrated with XGSN |
| Semantic Modelling | Q-ASSF | SSN ontology | SPARQL query response | CQELS | 200 queries/3000 sensors/16,000 triples | Yes | Cen | RabbitMQ/Apache Jena | Filtering algorithm to reduce sensor communications and triple transmission |
| Semantic Modelling | Linked Sensor Streams | RDF streams | SPARQL query response | Type/Location based query (based on clustering) | 20,000 stream data | Yes | Dis | Java | Combination of Linked Data and sensor streams |
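Several stream-oriented entries in the table above evaluate a continuous query over a time window. As a rough illustration of that idea only, the Python sketch below keeps the readings from the most recent window and re-evaluates a filter each time a new reading arrives. It is not SPARQL and does not reproduce the semantics of C-SPARQL, SPARQLstream, or CQELS; the class name `SlidingWindowQuery`, the tuple layout, and the assumption that timestamps arrive in non-decreasing order are all hypothetical.

```python
from collections import deque


class SlidingWindowQuery:
    """Re-evaluates a value filter over the last `window_seconds` of readings."""

    def __init__(self, window_seconds, predicate):
        self.window_seconds = window_seconds
        self.predicate = predicate
        self._buffer = deque()  # (timestamp, sensor_id, value) in arrival order

    def push(self, timestamp, sensor_id, value):
        """Add one reading, evict expired ones, and return the current matches.

        Assumes timestamps are non-decreasing; the window is (t - w, t].
        """
        self._buffer.append((timestamp, sensor_id, value))
        cutoff = timestamp - self.window_seconds
        while self._buffer and self._buffer[0][0] <= cutoff:
            self._buffer.popleft()
        return [r for r in self._buffer if self.predicate(r[2])]


# Hypothetical query: readings above 30 within the last 10 seconds.
query = SlidingWindowQuery(10, lambda v: v > 30)
print(query.push(0, "a", 35))
print(query.push(5, "b", 20))
print(query.push(12, "c", 40))
```

Evicting from the front of the deque keeps each update cheap, which matters when the query is re-evaluated on every incoming reading.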
Table 4. Comparison of Search Techniques According to Content being Searched (Abbreviations are listed in the footer of Table 2).
| Classification | Search Technique | Data Format | Access Approach | Search Type | Scale of Experiments | D ¹ | Ar ² | Implementation | Remarks |
|---|---|---|---|---|---|---|---|---|---|
| Instantaneous Data | LiveWeb | Sensor data in XML | HTML/AJAX | Real-time content search based on keywords, category, and value range | - | Yes | Cen | PHP/Java/Apache/MySQL | Practical implementation without evaluation; should fit global scale |
| Instantaneous Data | SensorMap | Direct communication with web accessible sensors | Web services APIs | Spatial search (based on COLR-tree/crawling) | - | - | Cen | MSRSense toolkit | SenseWeb-based application, currently not accessible |
| Instantaneous Data | GeoCENS | OGC SOS, SensorML, OGC O&M | SWE services | Spatial search (Peano space filling curves) | ~40,000 sensors/procedures | Yes | P2P | - | Designed for sensor data from smaller organizations or individuals |
| Instantaneous Data | CQELS | RDF streams | SPARQL query response | Continuous SPARQL query | 10 million triples | Yes | Dis | Java | Processing performed directly on Linked Stream Data |
| Instantaneous Data | IoT-SVKSearch | Sampling values with metadata in Raw Data Store | Data in built-in format | Keyword-based (B+-tree), spatial-temporal (R-tree), value-based (B+-tree) search | 140,800~352,000 sensors | Yes | Dec | PostgreSQL/file system | Multiple search functionalities supported |
| Instantaneous Data; Historical Data | FUTS Data Query | Virtual Object ontology in OWL-DL | SenML/JSON | Spatial search with time window/sensor type | 20,000 sensors | Yes | Cen | Java | Cloud database for huge data storage |
| Historical Data | LSM | SSN ontology | SPARQL endpoint/HTTP REST | CQELS | 70,000 sensor data sources | Yes | Cen | Java/OpenLink Virtuoso/Apache Hadoop | Integrated with XGSN |
| Context-based Sensor | OSIRIS | OGC SWE Sensor model and data model | SWE services | Spatial, temporal and keyword-based search | - | Yes | Cen | JSI/Apache Lucene | Applications in Smart City domain |
| Context-based Sensor | Microsearch | Direct communication with sensor motes with built-in metadata | Virtual sensors in built-in format | Keyword-based top-k search | - | Yes | Dis | TelosB motes | Early search implementations with limited functionalities and limited scalability |
| Context-based Sensor | Linked Sensor Data | RDF | HTML/XML | SPARQL query | 20,000 weather stations | Yes | Cen | OpenLink Virtuoso | Combination of Linked Data, sensors and sensor data |
| Context-based Sensor | SemSOS | SWE based model in Linked Data | OGC SWE SOS | Named location based query | 20,000 weather stations | No | Cen | RDF2Go/Sesame | Mobile sensors are ignored |
| Context-based Sensor | CASSARAM | SSN ontology | SPARQL query response | SPARQL query | 1,000,000 sensors | Yes | Dis | Apache Jena API | Multiple sensor feature support |
| Context-based Sensor | VCS | Server nodes (objects) | List of nodes | Keyword-based query with multiple features | - | Yes | P2P | - | No evaluation provided |
| Context-based Sensor | OpenIoT | SSN ontology | SPARQL endpoint/HTTP REST | Continuous SPARQL query/Publish and Subscribe model | - | Yes | Dis | XGSN/LSM | Multiple practical applications provided |
| Context-based Sensor | SPITFIRE | Sensor (RDF triple) | SPARQL query response | CQELS | 40 physical sensors | Yes | Cen | Jena Semantic Web Framework | Semantic sensor descriptions |
| Content-based Sensor | Fuzzy-based Sensor Search | Sensor data streams | - | Sensor data stream based search for sensors | 1500 data points/day | Yes | Dis | Java | Ranking based on sensor data prediction |
| Content-based Sensor | Dyser | Virtual sensors (microformat) | URL/HTTP/HTML | Keyword-based/value-based | 385 sensors over 5 months | Yes | Cen | Java/PHP | Query of state of real world objects |
| Content-based Sensor | Sensor Ranking | Sensor with outputs in built-in format | Ranked list of sensors | Prediction based ranking for content-based sensor search | 20 sensors | Yes | Cen | C++ | Local scale deployment |
| Context-based Entity | Gander | Nodes (objects) (tuple space/tuple graph) | HTTP REST | Multi-hop query | 20,000 mobile visitors | Yes | P2P | Java | Mobile applications for smart university |
| Context-based Entity | ISE | Encrypted RFID records | XML/HTML/JSON | Database-based query | 1,000,000 RFID records | Yes | Dis | Nginx/Tokyo Cabinet/C++ | Cryptography for security |
| Content-based Entity | Correlation-based Sensor Search | Sensor with sensor data in built-in format | List of sensors | Sensor search with a given state/output | 384 sensors | Yes | Cen | Java/SMILE reasoning engine | Based on sensor data correlation |
| Context-based Entity | DiscoWoT | Built-in description for resources (objects) | JSON/XML | RESTful interface search | - | Yes | Cen | AutoWoT toolkit | No evaluation provided |
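The content-based sensor entries in the table above search for sensors by their current state or output, and some rank candidates by a prediction of that state. The Python sketch below illustrates the general idea with a deliberately simple heuristic: it ranks sensors by how often the sought state appeared in their past readings. This frequency estimate is an assumption chosen for illustration and is not the prediction model of any surveyed system; the function name `rank_sensors` and the sample data are hypothetical.

```python
def rank_sensors(history, target_state):
    """Rank sensor ids by the fraction of past readings equal to target_state.

    history maps sensor_id -> list of past states. Sensors most likely to be
    in the target state (under this naive estimate) come first, so they can
    be contacted first. Ties are broken by sensor id.
    """
    scores = {}
    for sensor_id, states in history.items():
        scores[sensor_id] = states.count(target_state) / len(states) if states else 0.0
    return sorted(scores, key=lambda sid: (-scores[sid], sid))


# Hypothetical occupancy history for three rooms.
history = {
    "room1": ["occupied", "occupied", "free"],
    "room2": ["free", "free", "free"],
    "room3": ["occupied", "free", "free"],
}
print(rank_sensors(history, "occupied"))
```

Ranking before querying reduces how many sensors must be read to find the first few that actually match.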