Hong Yang, Shanshan Gong, Yaqing Liu, Zhengkui Lin, Yi Qu.
Abstract
Daily activity forecasts play an important role in the daily lives of residents in smart homes. Category forecasts and occurrence time forecasts of daily activity are two key tasks. Category forecasts of daily activity are correlated with occurrence time forecasts; however, existing research has focused on only one of the two tasks, and the performance of daily activity forecasts is low when the two tasks are performed in series. In this paper, a forecast model based on multi-task learning is proposed to forecast the category and occurrence time of daily activity mutually and iteratively. Firstly, raw sensor events are pre-processed to form a feature space of daily activity. Secondly, a parallel multi-task learning model combining a convolutional neural network (CNN) with bidirectional long short-term memory (Bi-LSTM) units is developed as the forecast model. Finally, five distinct datasets are used to evaluate the proposed model. The experimental results show that, compared with state-of-the-art single-task learning models, this model improves accuracy by at least 2.22%, and the metrics of NMAE, NRMSE and R² are improved by at least 1.542%, 7.79% and 1.69%, respectively.
Keywords: daily activity forecast; deep learning; multi-task learning; smart home
Year: 2020 PMID: 32235653 PMCID: PMC7181057 DOI: 10.3390/s20071933
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. A high-level overview of the multi-task daily activity forecast problem. Given features X ∈ F extracted from the current sensor event at time t as input, the forecaster must predict the category of the next daily activity and its relative occurrence time. In this example, the next daily activity after the current sensor event is a (eating), and ta is the time of the event marking the start of activity a. The ground-truth output is therefore y = (a, τ), where τ = ta − t stands for the correct relative occurrence time (in minutes) of the next daily activity a.
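The label construction described in the figure can be sketched as follows (a minimal illustration; the function name and time units are our own, not from the paper):

```python
def make_label(t, next_activity, t_a):
    """Build the ground-truth pair y = (a, tau) for the sensor event at
    time t: 'a' is the category of the next daily activity, and
    tau = t_a - t is its relative occurrence time in minutes."""
    tau = t_a - t
    return (next_activity, tau)

# A sensor event at minute 480 whose next activity, eating, starts at minute 495
label = make_label(480.0, "eating", 495.0)  # -> ("eating", 15.0)
```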
Description of sensor locations, sensor kinds, and daily activity categories.

| Locations of Sensors | Kinds of Sensors | Daily Activity Categories |
|---|---|---|
| Bedroom | Motion sensors | Sleep |
| Office | Temperature sensors | Breakfast |
| Kitchen | Door sensors | Leave_home |
| Dining room | Light sensors | Work_in_office |
| Bathroom | | Lunch |
| | | Dinner |
| | | Wash_Dishes |
| | | Bed_to_toilet |
| | | Enter_Home |
| | | Watch_TV |
Figure 2. Sequence of sensors activated in chronological order. The end of the seventh sensor event marks the start of the eating activity.
Figure 3. An overview of the multi-task learning architecture in the self-boosted forecast model framework.
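The architecture can be sketched as a shared CNN + Bi-LSTM trunk with two parallel output heads, one for the activity category and one for the relative occurrence time. The layer sizes below are illustrative assumptions, not the paper's exact configuration; PyTorch is used here for concreteness:

```python
import torch
import torch.nn as nn

class MultiTaskForecaster(nn.Module):
    """Sketch of a parallel multi-task CNN + Bi-LSTM forecaster: a shared
    1-D convolution extracts local patterns from the sensor-event feature
    sequence, a bidirectional LSTM models temporal context, and two
    parallel heads emit (i) class logits for the next activity category
    and (ii) a scalar for its relative occurrence time."""

    def __init__(self, n_features: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.conv = nn.Conv1d(n_features, hidden, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True,
                              bidirectional=True)
        self.category_head = nn.Linear(2 * hidden, n_classes)  # classification
        self.time_head = nn.Linear(2 * hidden, 1)              # regression

    def forward(self, x):                  # x: (batch, seq_len, n_features)
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.bilstm(h)
        h = h[:, -1, :]                    # last step summarises the window
        return self.category_head(h), self.time_head(h).squeeze(-1)

model = MultiTaskForecaster(n_features=16, n_classes=10)
logits, tau = model(torch.randn(8, 20, 16))  # 8 windows of 20 events
```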
The CASAS smart home datasets comprise sensors, sensor events and daily activities.
| Dataset | Residents and Pets/Participants | Number of Sensors | Daily Activity Categories | Measurement Time | Sensor Events |
|---|---|---|---|---|---|
| MavLab | 6 participants | 51 | 10 | 19 days | 3015 |
| Adlnormal | 20 participants | 25 | 5 | 13 days | 6425 |
| Cairo | 2 residents and 1 pet | 32 | 13 | 57 days | 726534 |
| Tulum2009 | 2 residents | 20 | 10 | 84 days | 486912 |
| Aruba | 1 resident | 39 | 11 | 90 days | 725530 |
Parameter settings of model training.

| Configuration Name | Value |
|---|---|
| Hyperparameter of Huber loss | |
| Gradient descent algorithm | AdamOptimizer |
| Learning rate | 1e-3 |
| Batch size | 200 |
| Epoch number | |
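A plausible form of the training objective combines a cross-entropy category loss with a Huber occurrence-time loss under the loss weights compared in the tables below. Note this exact combination is our assumption, and the Huber δ is not given in this record:

```python
import torch
import torch.nn.functional as F

def multitask_loss(logits, tau_pred, labels, tau_true,
                   w_cat=0.8, w_time=0.2, delta=1.0):
    """Weighted sum of a cross-entropy category loss and a Huber
    occurrence-time loss. The (w_cat, w_time) pairs mirror the loss
    weights explored in the experiments; delta is the Huber
    hyperparameter, whose value is not stated in this record."""
    cat_loss = F.cross_entropy(logits, labels)
    time_loss = F.huber_loss(tau_pred, tau_true, delta=delta)
    return w_cat * cat_loss + w_time * time_loss
```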
The Cairo dataset is compared under different loss weights using six evaluation metrics (Precision, Recall, F-score, NMAE, NRMSE and R-squared (R²)). The best result for each metric is shown in bold.

| Loss Weight | Training Window Size | Precision | Recall | F-score | NMAE | NRMSE | R² |
|---|---|---|---|---|---|---|---|
| 0.5, 0.5 | 1000 | 0.8666 | 0.8565 | 0.8598 | 0.1059 | 0.0292 | 0.9406 |
| | 2000 | 0.8941 | 0.8834 | 0.8873 | 0.0956 | 0.0279 | 0.9457 |
| | 3000 | 0.9046 | 0.8776 | 0.8862 | **0.0837** | 0.0303 | 0.9436 |
| | 4000 | 0.8897 | 0.8798 | 0.8832 | 0.0969 | 0.0299 | 0.9517 |
| | 5000 | 0.8823 | 0.8749 | 0.8768 | 0.1012 | 0.0306 | 0.9377 |
| 0.8, 0.2 | 1000 | 0.9266 | 0.9097 | 0.9174 | 0.1049 | 0.0279 | 0.9599 |
| | 2000 | 0.9177 | 0.9186 | 0.9177 | 0.1079 | 0.0292 | 0.9406 |
| | 3000 | 0.9116 | 0.899 | 0.9044 | 0.1013 | 0.031 | 0.9408 |
| | 4000 | 0.9171 | 0.9086 | 0.9122 | 0.1034 | 0.0289 | 0.9547 |
| | 5000 | 0.9181 | 0.9275 | 0.9222 | 0.1014 | 0.0301 | 0.9399 |
| 1, 0.1 | 1000 | **0.9478** | 0.9441 | **0.9459** | 0.0971 | **0.0224** | **0.965** |
| | 2000 | 0.9469 | 0.9453 | 0.9458 | 0.0974 | 0.0291 | 0.941 |
| | 3000 | 0.9448 | **0.946** | 0.9451 | 0.0849 | 0.0288 | 0.9494 |
| | 4000 | 0.9312 | 0.9223 | 0.9255 | 0.0999 | 0.027 | 0.9606 |
| | 5000 | 0.9414 | 0.9405 | 0.9405 | 0.0845 | 0.0271 | 0.9514 |
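The occurrence-time metrics in these tables can be computed as below. Note that normalisation by the range of the true values is one common convention and is our assumption here; the paper's exact normaliser is not stated in this record:

```python
import numpy as np

def nmae(y_true, y_pred):
    """Mean absolute error normalised by the range of the true values."""
    return np.mean(np.abs(y_true - y_pred)) / (y_true.max() - y_true.min())

def nrmse(y_true, y_pred):
    """Root-mean-square error normalised by the range of the true values."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2)) / (y_true.max() - y_true.min())

def r_squared(y_true, y_pred):
    """Coefficient of determination R^2: 1 minus the ratio of residual
    to total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Perfect predictions give NMAE = NRMSE = 0 and R² = 1, so lower is better for the first two metrics and higher is better for R².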
The Tulum2009 dataset is compared under different loss weights using six evaluation metrics (Precision, Recall, F-score, NMAE, NRMSE and R-squared (R²)). The best result for each metric is shown in bold.

| Loss Weight | Training Window Size | Precision | Recall | F-score | NMAE | NRMSE | R² |
|---|---|---|---|---|---|---|---|
| 0.5, 0.5 | 1000 | 0.6639 | 0.6782 | 0.6627 | 0.1286 | 0.0625 | 0.9097 |
| | 2000 | 0.7005 | 0.6772 | 0.6807 | 0.1126 | 0.0648 | 0.9146 |
| | 3000 | 0.7618 | 0.7518 | 0.7543 | 0.1063 | 0.0536 | 0.942 |
| | 4000 | 0.7789 | 0.7316 | 0.7495 | 0.1047 | 0.0571 | 0.9314 |
| | 5000 | 0.7324 | 0.7071 | 0.7138 | 0.099 | 0.0611 | 0.9169 |
| 0.8, 0.2 | 1000 | 0.79 | 0.7839 | 0.7837 | 0.101 | 0.0554 | 0.9292 |
| | 2000 | 0.8359 | 0.8453 | 0.8402 | 0.0963 | **0.0517** | **0.9457** |
| | 3000 | 0.8598 | 0.8619 | 0.8603 | 0.0934 | 0.0565 | 0.9358 |
| | 4000 | 0.8517 | 0.8453 | 0.8448 | 0.1056 | 0.0553 | 0.9357 |
| | 5000 | 0.8426 | 0.8305 | 0.8353 | 0.1054 | 0.0629 | 0.9116 |
| 1, 0.1 | 1000 | 0.8499 | 0.8529 | 0.8505 | 0.11 | 0.0533 | 0.9344 |
| | 2000 | 0.8994 | 0.878 | 0.8868 | 0.1126 | 0.0613 | 0.9236 |
| | 3000 | **0.9108** | **0.9066** | **0.9081** | **0.0901** | 0.0537 | 0.9421 |
| | 4000 | 0.8877 | 0.9019 | 0.8943 | 0.0912 | 0.0543 | 0.9378 |
| | 5000 | 0.838 | 0.8623 | 0.8478 | 0.1028 | 0.0584 | 0.924 |
The Aruba dataset is compared under different loss weights using six evaluation metrics (Precision, Recall, F-score, NMAE, NRMSE and R-squared (R²)). The best result for each metric is shown in bold.

| Loss Weight | Training Window Size | Precision | Recall | F-score | NMAE | NRMSE | R² |
|---|---|---|---|---|---|---|---|
| 0.5, 0.5 | 1000 | 0.8761 | 0.8024 | 0.8345 | 0.2802 | 0.024 | 0.8468 |
| | 2000 | 0.8765 | 0.7993 | 0.8321 | 0.2868 | 0.023 | 0.8603 |
| | 3000 | 0.8477 | 0.8067 | 0.8226 | 0.299 | 0.0318 | 0.7735 |
| | 4000 | 0.8597 | 0.8004 | 0.8253 | 0.2989 | 0.0255 | 0.8357 |
| | 5000 | 0.8793 | 0.7989 | 0.8337 | **0.2741** | 0.0215 | 0.8706 |
| 0.8, 0.2 | 1000 | 0.8815 | 0.8345 | 0.8556 | 0.2939 | 0.0245 | 0.841 |
| | 2000 | 0.8956 | 0.8686 | 0.8791 | 0.2743 | 0.0239 | 0.8532 |
| | 3000 | 0.8673 | 0.8672 | 0.8664 | 0.2919 | 0.0258 | 0.8506 |
| | 4000 | 0.8588 | 0.8412 | 0.848 | 0.3123 | 0.0243 | 0.8506 |
| | 5000 | 0.8736 | 0.8432 | 0.8566 | 0.2966 | 0.026 | 0.8097 |
| 1, 0.1 | 1000 | **0.9098** | 0.8657 | 0.8838 | 0.2998 | 0.0243 | 0.8425 |
| | 2000 | 0.9045 | **0.8751** | **0.8895** | 0.2808 | **0.0213** | **0.8832** |
| | 3000 | 0.8508 | 0.8745 | 0.8614 | 0.3089 | 0.0274 | 0.832 |
| | 4000 | 0.8702 | 0.8697 | 0.868 | 0.2949 | 0.0234 | 0.8605 |
| | 5000 | 0.8899 | 0.8335 | 0.8575 | 0.3053 | 0.0227 | 0.8552 |
Figure 4. Performance evaluation of the multi-task forecast model with different loss weights on three datasets. The first row shows the average measures of the category forecast of daily activity (Average Precision, Average Recall, Average F-score); the second row shows the average measures of the occurrence time forecast of daily activity (Average NMAE, Average NRMSE, Average R²).
Comparison of the six average metrics across training window sizes (in number of events) on the Cairo dataset. The best result for each metric is shown in bold. All tests advance 20 events per iteration.

| Training Window Size | Precision | Recall | F-score | NMAE | NRMSE | R² |
|---|---|---|---|---|---|---|
| 1000 | 0.9137 | 0.9034 | 0.9077 | 0.1026 | **0.0265** | 0.9552 |
| 2000 | 0.9196 | **0.9158** | **0.9169** | 0.1003 | 0.0287 | 0.9424 |
| 3000 | **0.9203** | 0.9075 | 0.9119 | **0.0900** | 0.03 | 0.9446 |
| 4000 | 0.9127 | 0.9036 | 0.907 | 0.1001 | 0.0286 | **0.9557** |
| 5000 | 0.9139 | 0.9143 | 0.9132 | 0.0957 | 0.0293 | 0.943 |
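The sliding-window evaluation protocol described in these captions (a fixed-size training window advancing a fixed stride of events through the dataset) can be sketched as follows; the index conventions are our assumption:

```python
def sliding_windows(n_events, window_size, stride=20):
    """Yield (start, end) index pairs for a training window of
    `window_size` sensor events that advances `stride` events per
    iteration, as in the comparison tests; the model forecasts the
    events that follow each window."""
    start = 0
    while start + window_size < n_events:
        yield start, start + window_size
        start += stride

# A 1000-event training window sliding over 1100 events in steps of 20
windows = list(sliding_windows(n_events=1100, window_size=1000))
```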
Comparison of the six average metrics across training window sizes (in number of events) on the Tulum2009 dataset. The best result for each metric is shown in bold. All tests advance 20 events per iteration.

| Training Window Size | Precision | Recall | F-score | NMAE | NRMSE | R² |
|---|---|---|---|---|---|---|
| 1000 | 0.7679 | 0.7717 | 0.7656 | 0.1132 | 0.0571 | 0.9244 |
| 2000 | 0.8119 | 0.8002 | 0.8026 | 0.1072 | 0.0593 | 0.9280 |
| 3000 | **0.8441** | **0.8401** | **0.8409** | **0.0966** | **0.0546** | **0.94** |
| 4000 | 0.8394 | 0.8263 | 0.8295 | 0.1005 | 0.0556 | 0.935 |
| 5000 | 0.8043 | 0.8 | 0.7990 | 0.1024 | 0.0608 | 0.9175 |
Comparison of the six average metrics across training window sizes (in number of events) on the Aruba dataset. The best result for each metric is shown in bold. All tests advance 20 events per iteration.

| Training Window Size | Precision | Recall | F-score | NMAE | NRMSE | R² |
|---|---|---|---|---|---|---|
| 1000 | 0.8891 | 0.8342 | 0.858 | 0.2913 | 0.0243 | 0.8434 |
| 2000 | **0.8922** | 0.8477 | **0.8669** | **0.2806** | **0.0227** | **0.8656** |
| 3000 | 0.8553 | **0.8495** | 0.8501 | 0.2999 | 0.0283 | 0.8187 |
| 4000 | 0.8629 | 0.8371 | 0.8471 | 0.3020 | 0.0244 | 0.8489 |
| 5000 | 0.8809 | 0.8252 | 0.8493 | 0.2920 | 0.0234 | 0.8452 |
Accuracy comparison results for the models of category forecast of daily activity. The best result on each dataset is shown in bold. The sliding window sizes for the two datasets are 50 and 20 events, advancing 5 and 1 events per iteration, respectively.

| Method | Adlnormal | MavLab |
|---|---|---|
| SPADE | 0.8047 | 0.8411 |
| LSTM | 0.9030 | 0.8451 |
| CNN+Bi-LSTM | 0.8964 | 0.8401 |
| Multi-task CNN+Bi-LSTM | **0.9323** | **0.8673** |
NMAE, NRMSE and R² comparison results for the models of occurrence time forecast of daily activity. The best result for each metric on each dataset is shown in bold. The sliding window sizes for the three datasets are 1000, 3000 and 2000 events, respectively; all tests advance 20 events per iteration.

| Method | Metric | Cairo | Tulum2009 | Aruba |
|---|---|---|---|---|
| Bi-LSTM | NMAE | 0.1883 | 0.1323 | 0.3158 |
| | NRMSE | 0.0331 | 0.0652 | 0.0236 |
| | R² | 0.9233 | 0.9142 | 0.8569 |
| CNN+Bi-LSTM | NMAE | 0.1504 | 0.1109 | 0.2852 |
| | NRMSE | 0.0296 | 0.0608 | 0.0231 |
| | R² | 0.9379 | 0.9252 | 0.8632 |
| Multi-task CNN+Bi-LSTM | NMAE | **0.0971** | **0.0901** | **0.2808** |
| | NRMSE | **0.0224** | **0.0537** | **0.0213** |
| | R² | **0.9650** | **0.9421** | **0.8832** |
Figure 5. Performance comparison of the different daily activity occurrence time forecast models on three datasets.