Mina Younan, Essam H Houssein, Mohamed Elhoseny, Abd El-Mageid Ali.
Abstract
The Internet of Things (IoT) has penetrated the things and objects around us, giving them the ability to interact with the Internet; that is, things become Smart Things (SThs). As a result, SThs produce massive real-time data (i.e., big IoT data). The smartness of IoT applications is based mainly on services such as automatic control, event handling, and decision making. Consumers of IoT services are not only human users but also SThs. Consequently, the potential of IoT applications relies on supporting services such as searching, retrieving, mining, analyzing, and sharing real-time data. To enhance the search service in the IoT, our previous work presented a promising solution, called Cluster Representative (ClRe), for indexing similar SThs in IoT applications. ClRe algorithms could reduce similar indexing by O(K - 1), where K is the number of Time Series (TS) in a cluster. Multiple extensions of the ClRe algorithms were presented in another work to enhance the accuracy of indexed data. In this context, this paper analyzes the performance of ClRe algorithms, proposes two novel execution methods, (a) Linear execution (LE) and (b) Pair-merge execution (PME), and studies the impact of sorting on TS execution for enhancing the similarity rate of some ClRe extensions. The proposed execution methods are evaluated with real examples and validated using the Szeged-weather dataset on ClRe 3.0 and its extensions, where they produce representatives with higher similarities than the other extensions. Evaluation results indicate that PME could improve the performance of ClRe 3.0 by ≈20.5%, ClRe 3.1 by ≈17.7%, and ClRe 3.2 by ≈6.4% on average.
Keywords: Clustering; DTW; Data reduction; Indexing; Internet of things; Searching; Time series
Year: 2021 PMID: 34084921 PMCID: PMC8157125 DOI: 10.7717/peerj-cs.500
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Examples of IoT search engine (IoTSE) prototypes and frameworks.
| Year | References | IoT search engine, framework, or crawler |
|---|---|---|
| 2007 | | SenseWeb |
| 2008 | | Microsearch |
| 2010 | | Snoogle |
| 2010 | | Dyser |
| 2013 | | Shodhan |
| 2014 | | IoT-SVKSearch |
| 2015 | | COBASEN |
| 2016 | | WoTSF |
| 2016 | | WOTS2E |
| 2016 | | ThingSeek |
| 2018 | | CDS |
| 2019 | | SMPKR |
| 2020 | | IoTCrawler |
Performance analysis for ClRe algorithm extensions.
| Algorithm | Run-time complexity | Memory complexity | Length rank | Dissimilarity rank |
|---|---|---|---|---|
| ClRe 1.0 | | | 2 | 3 |
| ClRe 1.1 | | | 2 | 3 |
| ClRe 2.0 | | | 5 | 4 |
| ClRe 3.0 | | | 3 | 2 |
| ClRe 3.1 | | | 4 | 1 |
| ClRe 3.2 | | | 1 | 5 |
Figure 1. Execution methods for ClRe algorithm: (A) Linear and (B) Pair-merge.
Linear execution: generate accumulative representation for cluster datasets.
| // Initialization. |
| NewDS = ClrDSList[0] // initialize with the first dataset in the cluster |
| // Call the ClRe extension to build the accumulative representation. |
| for i = 1 to len(ClrDSList) - 1 |
|     X = NewDS // representative accumulated so far |
|     Y = ClrDSList[i] // next dataset in the cluster |
|     NewDS = ClRe_Ex(X, Y) |
| end for |
| Return NewDS |
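The linear execution loop above can be sketched in Python as a left fold over the cluster. This is a minimal sketch, not the authors' implementation: `mean_merge` is a hypothetical stand-in for a ClRe extension (the real ClRe merge logic is not given in this record), and the names `linear_execution` and `Series` are illustrative.

```python
from typing import Callable, List, Sequence

Series = List[float]


def mean_merge(x: Sequence[float], y: Sequence[float]) -> Series:
    """Hypothetical placeholder for a ClRe extension (NOT the real algorithm).

    Returns the element-wise mean, truncated to the shorter input.
    """
    return [(a + b) / 2 for a, b in zip(x, y)]


def linear_execution(
    cluster: Sequence[Sequence[float]],
    merge: Callable[[Sequence[float], Sequence[float]], Series] = mean_merge,
) -> Series:
    """Fold the cluster into one accumulative representative, left to right."""
    if not cluster:
        raise ValueError("cluster must contain at least one dataset")
    representative = list(cluster[0])  # start from the first dataset
    for dataset in cluster[1:]:
        # Merge the running representative with the next dataset.
        representative = merge(representative, dataset)
    return representative


result = linear_execution([[2, 4], [4, 8], [6, 0]])
```

Only one running representative is kept in memory at any time, which matches the sequential, accumulative character of LE.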
Pair-merge execution: generate representative for each pair at each level.
| // Call the ClRe extension to build inner representatives, level by level. |
| while len(ClrDSList) > 1 |
|     odd = len(ClrDSList) % 2 |
|     NewList = [] |
|     for i = 0 to len(ClrDSList) - odd - 2 step 2 |
|         X = ClrDSList[i] |
|         Y = ClrDSList[i+1] |
|         NewList.append(ClRe_V(X, Y)) |
|     end for |
|     if odd == 1 // fold the unpaired last dataset into the last representative |
|         X = ClrDSList[len(ClrDSList)-1] |
|         Y = NewList[len(NewList)-1] |
|         NewList[len(NewList)-1] = ClRe_V(X, Y) |
|     end if |
|     ClrDSList = NewList // move up one level |
| end while |
| Return ClrDSList[0] |
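A Python sketch of the pair-merge idea follows, under stated assumptions. The outer loop, which repeats until a single representative remains, and the pairing stride are inferred from the surviving pseudocode lines. `mean_merge` is again a hypothetical stand-in for the ClRe extension, and `pair_merge_execution` is an illustrative name.

```python
from typing import Callable, List, Sequence

Series = List[float]


def mean_merge(x: Sequence[float], y: Sequence[float]) -> Series:
    """Hypothetical placeholder for a ClRe extension (NOT the real algorithm)."""
    return [(a + b) / 2 for a, b in zip(x, y)]


def pair_merge_execution(
    cluster: Sequence[Sequence[float]],
    merge: Callable[[Sequence[float], Sequence[float]], Series] = mean_merge,
) -> Series:
    """Merge datasets pairwise, level by level, until one representative is left."""
    if not cluster:
        raise ValueError("cluster must contain at least one dataset")
    level = [list(dataset) for dataset in cluster]
    while len(level) > 1:
        odd = len(level) % 2
        # One temporary representative per adjacent pair at this level.
        merged = [
            merge(level[i], level[i + 1])
            for i in range(0, len(level) - odd, 2)
        ]
        if odd:
            # Fold the unpaired last dataset into the last pair's representative.
            merged[-1] = merge(level[-1], merged[-1])
        level = merged
    return level[0]


result = pair_merge_execution([[0], [4], [8], [12]])
```

The pair merges inside one level do not depend on each other, so they could be dispatched in parallel; this sketch runs them sequentially for clarity.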
A comparison between the proposed execution methods (LE and PME).
| Criteria | Linear execution (LE) | Pair-merge execution (PME) |
|---|---|---|
| Main features | • Sequential execution. | • Pair execution. |
| | • Accumulative building for final representative at each step. | • Temporal representative for each pair. |
| | • One representative per iteration. | • Accumulative representative at each level (sub-tree) |
| Run-time complexity | • | • |
| | • No parallel execution | • Parallel: |
| Pros | • Less memory at each iteration (only one dataset of length N, where n < N < 2n). | • Allow parallel execution. |
| | • Resulting representative < Pair-Merge representative length. | • Throwaway representative. |
| | • Generates | |
| Cons | • Only sequential execution. | • At the level ( |
| | • Average dissimilarity < Pair-merge dissimilarity. | • Resulting dataset length > Linear method. |
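To make the sequential versus level-wise contrast concrete, the sketch below counts merge calls and loop passes for a cluster of K datasets. The counts are derived from the loop structure of the two execution methods (pairs merged per level, with an unpaired last dataset folded into the last pair's result). They are not the paper's complexity formulas, which did not survive in this table. The function names are illustrative.

```python
from typing import Tuple


def le_schedule(k: int) -> Tuple[int, int]:
    """Return (merge calls, sequential steps) for linear execution."""
    merges = max(k - 1, 0)
    return merges, merges  # every merge waits for the previous one


def pme_schedule(k: int) -> Tuple[int, int]:
    """Return (merge calls, levels) for pair-merge execution."""
    merges = 0
    levels = 0
    remaining = k
    while remaining > 1:
        odd = remaining % 2
        merges += remaining // 2 + odd  # pair merges plus the odd fold-in
        levels += 1
        remaining //= 2  # one representative per pair survives
    return merges, levels


le_counts = le_schedule(6)
pme_counts = pme_schedule(6)
```

Under these assumptions both methods issue K - 1 merge calls in total; the difference is that PME groups them into about log2(K) levels whose pair merges are independent of one another.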
Datasets used for forming TS of the four clusters.
| Dataset label | Items |
|---|---|
| a | [4, 5, 6, 19, 18, 5, 17, 14, 6, 11, 10, 8, 8, 9, 8, 8, 6, 5, 8, 9, 19, 23, 24, 18, 14, 23, 21] |
| b | [5, 8, 10, 7, 11, 13, 5, 4, 9, 10, 8, 8, 9, 8, 12, 16, 16, 14, 6, 8, 17, 13, 16, 18] |
| c | [6, 7, 5, 4, 4, 5, 6, 7, 8, 9, 9, 10, 5, 17, 13, 21, 26, 16, 18, 10, 9, 8, 7, 16, 17] |
| d | [4, 19, 18, 5, 14, 14, 16, 11, 10, 8, 8, 9, 8, 8, 6, 5, 8, 9, 19, 23, 24, 14, 25, 21, 24] |
| e | [7, 6, 5, 10, 8, 9, 11, 17, 18, 16, 15, 12, 10, 9, 21, 19, 19, 20, 7, 27, 22, 10, 11, 14] |
| f | [4, 5, 6, 19, 8, 5, 17, 14, 6, 11, 10, 8, 16, 9, 8, 18, 6, 5, 6, 17, 16, 24, 9, 10] |
Real example: pairwise dissimilarities between datasets.
| Dataset label | Items | a | b | c | d | e | f | Average dissimilarity |
|---|---|---|---|---|---|---|---|---|
| a | 27 | 0 | 7.392 | 7.981 | 1.385 | 9.843 | 9.961 | 6.094 |
| b | 24 | | 0 | 5.673 | 9.633 | 7.729 | 7.792 | 6.370 |
| c | 25 | | | 0 | 11.320 | 6.898 | 10.816 | 7.115 |
| d | 25 | | | | 0 | 13.102 | 12.755 | 8.032 |
| e | 24 | | | | | 0 | 7.771 | 7.557 |
| f | 24 | | | | | | 0 | 8.182 |
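The averages in the last column appear to be each dataset's row total (self-distance of 0 included) divided by the six datasets. This reading is inferred from the numbers rather than stated in the text. A minimal Python sketch that rebuilds the symmetric matrix from the upper triangle and recomputes the column (`LABELS`, `UPPER`, and both helper functions are illustrative names):

```python
LABELS = ["a", "b", "c", "d", "e", "f"]

# Upper-triangular dissimilarities copied from the table above.
UPPER = {
    ("a", "b"): 7.392, ("a", "c"): 7.981, ("a", "d"): 1.385,
    ("a", "e"): 9.843, ("a", "f"): 9.961,
    ("b", "c"): 5.673, ("b", "d"): 9.633, ("b", "e"): 7.729, ("b", "f"): 7.792,
    ("c", "d"): 11.320, ("c", "e"): 6.898, ("c", "f"): 10.816,
    ("d", "e"): 13.102, ("d", "f"): 12.755,
    ("e", "f"): 7.771,
}


def dissimilarity(x: str, y: str) -> float:
    """Symmetric lookup; a dataset has zero dissimilarity to itself."""
    if x == y:
        return 0.0
    return UPPER[(x, y)] if (x, y) in UPPER else UPPER[(y, x)]


def average_dissimilarity(label: str, members=LABELS) -> float:
    """Assumed definition: row total divided by the number of members."""
    return sum(dissimilarity(label, m) for m in members) / len(members)


averages = {label: average_dissimilarity(label) for label in LABELS}
lowest = min(averages, key=averages.get)
```

For dataset a this gives about 6.094, the same value listed for a under Min.Dis.CLR_D in the representative-selection table that follows.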
Real example: selected representatives for all clusters using ClRe 1.0.
| Dataset label | Min.Dis.CLR_A | Min.Dis.CLR_B | Min.Dis.CLR_C | Min.Dis.CLR_D |
|---|---|---|---|---|
| a | – | 4.189 | 5.320 | 6.094 |
| b | 4.355 | – | – | – |
| c | – | – | – | – |
| d | | – | – | – |
| e | | | – | – |
| f | | | | – |
Real example: ClRe performance analysis in LE mode.
| Cluster | Min.Dis | Alg. | Dissimilarity Min | Dissimilarity Avg | Dissimilarity 50% | Dissimilarity Max | Dissimilarity Range | Length Min | Length Avg | Length 50% | Length Max | Length Range |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CLS_B | MinDs=a | ClRe 3.0 | 3.31 | 4.09 | 3.98 | 5.05 | 1.74 | 19 | 21 | 22 | 23 | 4 |
| | | ClRe 3.1 | 2.83 | 3.37 | 3.32 | 3.85 | 1.02 | 24 | 26 | 26 | 28 | 4 |
| | | ClRe 3.2 | 5.32 | 7.28 | 7.10 | 9.31 | 4.00 | 7 | 9 | 9 | 11 | 4 |
| CLS_C | MinDs=a | ClRe 3.0 | 3.82 | 5.31 | 5.27 | 7.67 | 3.85 | 16 | 20 | 20 | 23 | 7 |
| | | ClRe 3.1 | 3.31 | 4.38 | 4.40 | 5.41 | 2.10 | 22 | 25 | 25 | 27 | 5 |
| | | ClRe 3.2 | 6.95 | 9.31 | 9.14 | 13.95 | 6.99 | 6 | 8 | 8 | 9 | 3 |
| CLS_D | MinDs=a | ClRe 3.0 | 4.63 | 6.31 | 6.15 | 10.63 | 6.00 | 14 | 19 | 19 | 24 | 10 |
| | | ClRe 3.1 | 3.71 | 5.12 | 5.10 | 6.56 | 2.85 | 20 | 24 | 24 | 30 | 10 |
| | | ClRe 3.2 | 7.59 | 10.20 | 9.88 | 14.54 | 6.94 | 5 | 7 | 7 | 10 | 5 |
Real example: ClRe performance analysis in PME mode.
| Cluster | Min.Dis | Alg. | Dissimilarity Min | Dissimilarity Avg | Dissimilarity 50% | Dissimilarity Max | Dissimilarity Range | Length Min | Length Avg | Length 50% | Length Max | Length Range |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CLS_B | MinDs=a | ClRe 3.0 | 3.27 | 3.56 | 3.43 | 3.97 | 0.70 | 19 | 21 | 21 | 23 | 4 |
| | | ClRe 3.1 | 2.78 | 2.99 | 3.05 | 3.14 | 0.36 | 24 | 26 | 26 | 27 | 3 |
| | | ClRe 3.2 | 6.65 | 8.16 | 8.78 | 9.03 | 2.38 | 7 | 8 | 8 | 9 | 2 |
| CLS_C | MinDs=a | ClRe 3.0 | 4.22 | 5.71 | 5.50 | 9.97 | 5.74 | 16 | 18 | 17 | 23 | 7 |
| | | ClRe 3.1 | 3.32 | 4.09 | 4.01 | 4.91 | 1.59 | 22 | 25 | 25 | 28 | 6 |
| | | ClRe 3.2 | 7.46 | 9.75 | 8.80 | 12.52 | 5.06 | 6 | 8 | 8 | 9 | 3 |
| CLS_D | MinDs=a | ClRe 3.0 | 5.21 | 6.52 | 6.18 | 9.69 | 4.48 | 15 | 17 | 17 | 20 | 5 |
| | | ClRe 3.1 | 4.14 | 4.90 | 4.88 | 5.83 | 1.69 | 21 | 24 | 24 | 28 | 7 |
| | | ClRe 3.2 | 8.38 | 10.42 | 10.66 | 14.04 | 5.66 | 6 | 7 | 8 | 9 | 3 |
Real dataset: pairwise dissimilarities between datasets.
| Dataset label | Items | a | b | c | d | e | f | Average dissimilarity |
|---|---|---|---|---|---|---|---|---|
| a | y_11=577 | 0 | 2.385 | 5.376 | 2.594 | 4.990 | 4.212 | 3.260 |
| b | y_12=577 | | 0 | 3.471 | 1.799 | 2.828 | 3.794 | 2.379 |
| c | y_13=577 | | | 0 | 4.122 | 2.348 | 2.623 | 2.990 |
| d | y_14=577 | | | | 0 | 3.642 | 3.822 | 2.663 |
| e | y_15=577 | | | | | 0 | 3.053 | 2.810 |
| f | y_16=577 | | | | | | 0 | 2.917 |
Real dataset: selected representatives for all clusters using ClRe 1.0.
| Dataset label | Min.Dis.CLR_A | Min.Dis.CLR_B | Min.Dis.CLR_C | Min.Dis.CLR_D |
|---|---|---|---|---|
| a | – | – | – | – |
| b | 1.952 | 1.914 | 2.097 | 2.379 |
| c | – | – | – | – |
| d | | – | – | – |
| e | | | – | – |
| f | | | | – |
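The selection shown in this table can be reproduced by picking, in each cluster, the member with the lowest average dissimilarity to the cluster. The sketch below assumes that clusters CLR_A to CLR_D contain the first three, four, five, and six datasets respectively, and that the average divides by the cluster size with the zero self-distance included. Both assumptions are inferred from the reported values, not stated in the text. All names (`PAIRS`, `CLUSTERS`, `select_representative`) are illustrative.

```python
# Upper-triangular dissimilarities of the six real-dataset series.
PAIRS = {
    ("a", "b"): 2.385, ("a", "c"): 5.376, ("a", "d"): 2.594,
    ("a", "e"): 4.990, ("a", "f"): 4.212,
    ("b", "c"): 3.471, ("b", "d"): 1.799, ("b", "e"): 2.828, ("b", "f"): 3.794,
    ("c", "d"): 4.122, ("c", "e"): 2.348, ("c", "f"): 2.623,
    ("d", "e"): 3.642, ("d", "f"): 3.822,
    ("e", "f"): 3.053,
}

# Assumed cluster membership: growing prefixes of the dataset list.
CLUSTERS = {"CLR_A": "abc", "CLR_B": "abcd", "CLR_C": "abcde", "CLR_D": "abcdef"}


def dist(x: str, y: str) -> float:
    """Symmetric lookup with zero self-distance."""
    if x == y:
        return 0.0
    return PAIRS[(x, y)] if (x, y) in PAIRS else PAIRS[(y, x)]


def select_representative(members: str):
    """Return (label, average) of the member closest to the cluster on average."""
    averages = {
        m: sum(dist(m, other) for other in members) / len(members)
        for m in members
    }
    best = min(averages, key=averages.get)
    return best, averages[best]


selected = {name: select_representative(m) for name, m in CLUSTERS.items()}
```

Under these assumptions dataset b is selected in every cluster, with averages that agree with the table to three decimal places.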
Real dataset: ClRe performance analysis in LE mode.
| Cluster | Min.Dis | Alg. | Dissimilarity Min | Dissimilarity Avg | Dissimilarity 50% | Dissimilarity Max | Dissimilarity Range | Length Min | Length Avg | Length 50% | Length Max | Length Range |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CLS_B | MinDs=b | ClRe 3.0 | 1.47 | 1.66 | 1.65 | 1.93 | 0.47 | 410 | 420 | 420 | 433 | 23 |
| | | ClRe 3.1 | 1.31 | 1.50 | 1.50 | 1.75 | 0.44 | 516 | 534 | 533 | 548 | 32 |
| | | ClRe 3.2 | 3.04 | 3.74 | 3.69 | 4.37 | 1.33 | 62 | 78 | 79 | 88 | 26 |
| CLS_C | MinDs=b | ClRe 3.0 | 1.54 | 1.92 | 1.90 | 2.85 | 1.31 | 390 | 412 | 413 | 445 | 55 |
| | | ClRe 3.1 | 1.44 | 1.74 | 1.73 | 2.50 | 1.07 | 524 | 534 | 534 | 553 | 29 |
| | | ClRe 3.2 | 3.58 | 4.44 | 4.13 | 6.77 | 3.18 | 39 | 61 | 61 | 77 | 38 |
| CLS_D | MinDs=b | ClRe 3.0 | 1.81 | 2.19 | 2.14 | 3.25 | 1.44 | 387 | 414 | 414 | 440 | 53 |
| | | ClRe 3.1 | 1.59 | 1.96 | 1.93 | 2.81 | 1.22 | 519 | 533 | 533 | 555 | 36 |
| | | ClRe 3.2 | 3.93 | 5.20 | 4.99 | 8.07 | 4.15 | 30 | 51 | 51 | 69 | 39 |
Real dataset: ClRe performance analysis in PME mode.
| Cluster | Min.Dis | Alg. | Dissimilarity Min | Dissimilarity Avg | Dissimilarity 50% | Dissimilarity Max | Dissimilarity Range | Length Min | Length Avg | Length 50% | Length Max | Length Range |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CLS_B | MinDs=b | ClRe 3.0 | 1.30 | 1.34 | 1.32 | 1.41 | 0.11 | 532 | 550 | 553 | 564 | 32 |
| | | ClRe 3.1 | 1.20 | 1.24 | 1.24 | 1.30 | 0.10 | 638 | 651 | 654 | 662 | 24 |
| | | ClRe 3.2 | 3.50 | 3.79 | 3.81 | 4.06 | 0.56 | 70 | 74 | 73 | 78 | 8 |
| CLS_C | MinDs=b | ClRe 3.0 | 1.33 | 1.58 | 1.57 | 1.95 | 0.62 | 470 | 518 | 516 | 550 | 80 |
| | | ClRe 3.1 | 1.26 | 1.48 | 1.46 | 1.72 | 0.46 | 621 | 657 | 655 | 686 | 65 |
| | | ClRe 3.2 | 3.67 | 4.04 | 3.92 | 5.66 | 1.99 | 44 | 65 | 64 | 81 | 37 |
| CLS_D | MinDs=b | ClRe 3.0 | 1.49 | 1.67 | 1.67 | 1.80 | 0.31 | 538 | 593 | 593 | 626 | 88 |
| | | ClRe 3.1 | 1.42 | 1.56 | 1.57 | 1.68 | 0.26 | 704 | 751 | 754 | 781 | 77 |
| | | ClRe 3.2 | 4.14 | 4.69 | 4.70 | 5.10 | 0.96 | 47 | 56 | 56 | 69 | 22 |
Real example: a comparison for average dissimilarities and lengths of ClRe representatives in LE and PME modes.
| Evaluation | Cluster | LE ClRe 3.0 | LE ClRe 3.1 | LE ClRe 3.2 | PME ClRe 3.0 | PME ClRe 3.1 | PME ClRe 3.2 |
|---|---|---|---|---|---|---|---|
| Average dissimilarity | Ds=4 | 4.09 | 3.37 | 7.28 | 3.56 | 2.99 | 8.16 |
| | Ds=5 | 5.31 | 4.38 | 9.31 | 5.71 | 4.09 | 9.75 |
| | Ds=6 | 6.31 | 5.12 | 10.20 | 6.52 | 4.90 | 10.42 |
| Average length | Ds=4 | 21 | 26 | 9 | 21 | 26 | 8 |
| | Ds=5 | 20 | 25 | 8 | 18 | 25 | 8 |
| | Ds=6 | 19 | 24 | 7 | 17 | 24 | 7 |
Real dataset: a comparison for average dissimilarities and lengths of ClRe representatives in LE and PME modes.
| Evaluation | Cluster | LE ClRe 3.0 | LE ClRe 3.1 | LE ClRe 3.2 | PME ClRe 3.0 | PME ClRe 3.1 | PME ClRe 3.2 |
|---|---|---|---|---|---|---|---|
| Average dissimilarity | Ds=4 | 1.66 | 1.50 | 3.74 | 1.34 | 1.24 | 3.79 |
| | Ds=5 | 1.92 | 1.74 | 4.44 | 1.58 | 1.48 | 4.04 |
| | Ds=6 | 2.19 | 1.96 | 5.20 | 1.67 | 1.56 | 4.69 |
| Average length | Ds=4 | 420 | 534 | 78 | 550 | 651 | 74 |
| | Ds=5 | 412 | 534 | 61 | 518 | 657 | 65 |
| | Ds=6 | 414 | 533 | 51 | 593 | 751 | 56 |
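One way to relate these averages to the improvement figures quoted in the abstract is to take, for each algorithm, the mean relative reduction in average dissimilarity when moving from LE to PME across the three clusters. That definition is an assumption, since the exact computation is not shown here. The sketch uses the rounded real-dataset values from the table above, and `mean_relative_reduction` is an illustrative name.

```python
# Average dissimilarity per cluster (Ds=4, Ds=5, Ds=6), real dataset.
LE = {
    "ClRe 3.0": [1.66, 1.92, 2.19],
    "ClRe 3.1": [1.50, 1.74, 1.96],
    "ClRe 3.2": [3.74, 4.44, 5.20],
}
PME = {
    "ClRe 3.0": [1.34, 1.58, 1.67],
    "ClRe 3.1": [1.24, 1.48, 1.56],
    "ClRe 3.2": [3.79, 4.04, 4.69],
}


def mean_relative_reduction(le_values, pme_values) -> float:
    """Mean percentage drop in dissimilarity from LE to PME (assumed metric)."""
    reductions = [
        (le_value - pme_value) / le_value * 100
        for le_value, pme_value in zip(le_values, pme_values)
    ]
    return sum(reductions) / len(reductions)


improvement = {alg: mean_relative_reduction(LE[alg], PME[alg]) for alg in LE}
```

With the rounded table values this gives roughly 20.2%, 17.6%, and 5.8% for ClRe 3.0, 3.1, and 3.2, close to the 20.5%, 17.7%, and 6.4% quoted in the abstract; the small gaps may come from rounding in the table or from a slightly different averaging scheme.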
Figure 2. Real dataset: performance analysis for ClRe 3.0 and its extensions using LE and PME methods: (A) real example, (B) real dataset.