Bo Shen, Yi-Chen Liao, Dan Liu, Han-Chieh Chao.
Abstract
Big data gathered by sensor networks from real systems, such as public infrastructure, healthcare, smart homes, and industry, contain enormous value and need to be mined deeply, which depends on a data storage and retrieval service. HBase plays an increasingly important part in the big data environment because it provides a flexible pattern for storing extremely large amounts of unstructured data. Although it reads quickly by RowKey, HBase does not natively support multi-conditional queries, which are a common operation in relational databases and are in particular demand for data analysis in ubiquitous sensing applications. In this paper, we introduce a method that constructs a linear index using a Hilbert space-filling curve. As a RowKey generating scheme, the proposed method maps multiple index columns into a one-dimensional encoded sequence and then constructs a new RowKey. We also provide an R-tree-based optimization to reduce the computational cost of encoding query conditions. Experimental results indicate that the proposed method, which does not use a secondary index, performs better in multi-conditional queries.
Keywords: HBase; Hilbert space-filling curve; multi-conditional query; ubiquitous sensing
Year: 2018 PMID: 30213116 PMCID: PMC6164097 DOI: 10.3390/s18093064
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Three-dimensional Hilbert curve unit coordinate transformation. (a) Initial state; (b) after exchanging; (c) after reversing.
Figure 2. Query process for point query and range query.
Figure 3. Two-dimensional Hilbert index space division.
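The abstract describes mapping several index columns onto a one-dimensional Hilbert code and building the RowKey from it. The following is a minimal two-dimensional sketch of that idea, not the authors' implementation: it uses the widely known iterative Hilbert encoding, and the helper names (`quantize`, `hilbert_index_2d`, `make_rowkey`), the value ranges, the grid order, and the `code_recordid` key layout are all assumptions made for illustration.

```python
def quantize(value: float, lo: float, hi: float, order: int) -> int:
    """Map a column value in [lo, hi] to a cell in 0 .. 2**order - 1 (assumed linear binning)."""
    if not lo <= value <= hi:
        raise ValueError("value outside the assumed column range")
    cells = 1 << order
    cell = int((value - lo) / (hi - lo) * cells)
    return min(cell, cells - 1)  # clamp value == hi into the last cell


def hilbert_index_2d(x: int, y: int, order: int) -> int:
    """Position of grid cell (x, y) along a Hilbert curve on a 2**order x 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def make_rowkey(x_cell: int, y_cell: int, order: int, record_id: str) -> str:
    """Hypothetical RowKey layout: zero-padded Hilbert code, then a unique record id."""
    code = hilbert_index_2d(x_cell, y_cell, order)
    width = len(str((1 << (2 * order)) - 1))  # fixed width keeps byte order == numeric order
    return f"{code:0{width}d}_{record_id}"


if __name__ == "__main__":
    order = 4  # assumed 16 x 16 grid
    rating_cell = quantize(5, 1, 5, order)    # Rating column, assumed range 1..5
    votes_cell = quantize(3, 0, 100, order)   # Votes column, assumed range 0..100
    print(make_rowkey(rating_cell, votes_cell, order, "1559362022"))
```

Zero-padding the code matters because HBase sorts RowKeys lexicographically; with a fixed width, records that are close along the curve are also stored close together, so a range of Hilbert codes can be read with a contiguous scan.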
The structure of data records.
| Field Name | Example Value |
|---|---|
| ASIN | 1559362022 |
| Title | Wake Up and Smell the Coffee |
| Group | Book |
| Time | 13 May 2002 |
| Customer | A2IGOA66Y6O8TQ |
| Rating | 5 |
| Votes | 3 |
| Helpful | 2 |
Figure 4. Results of a single-column query performance experiment.
Figure 5. Results of a multi-column query performance experiment.
Figure 6. Effect of query conditions.
Figure 7. Test results of query optimization.