Yang Tao, Juan Xu, Zhifang Liang, Lian Xiong, Haocheng Yang.
Abstract
This paper proposes a drift compensation method for electronic noses (e-noses), which often suffer from uncertain and unpredictable sensor drift. Traditional machine learning methods for odor recognition assume that training and test data follow the same distribution, so a model trained on earlier data generalizes poorly to later data. In practice, data collected earlier and data collected later may follow different distributions because of sensor drift. Treating the dataset without sensor drift as the source domain and the dataset with sensor drift as the target domain, the paper proposes a domain correction based on kernel transformation (DCKT) method to compensate for the drift. The method greatly improves the distribution consistency of the two domains by mapping the data into a high-dimensional reproducing kernel space and reducing the domain distance. A public benchmark sensor drift dataset is used to verify the effectiveness and efficiency of DCKT. The experimental results show that the proposed method yields the highest average accuracy among the methods considered.
Keywords: domain correction; drift compensation; electronic nose; transfer learning
Year: 2018 PMID: 30249024 PMCID: PMC6210950 DOI: 10.3390/s18103209
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
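The abstract describes DCKT as mapping both domains into a reproducing kernel space and reducing the domain distance, but this record does not say which distance is used. As a rough illustration of what a kernel-space domain distance can look like, the sketch below computes the maximum mean discrepancy (MMD), a common choice that is swapped in here as an assumption. The function names `rbf_kernel` and `mmd2`, the Gaussian kernel, the value `sigma=3.0`, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """Gaussian kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * sigma^2))."""
    sq_dists = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def mmd2(source, target, sigma):
    """Biased estimate of the squared maximum mean discrepancy (MMD)."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

# Synthetic stand-ins for e-nose feature vectors (assumption: not real sensor data).
rng = np.random.default_rng(0)
source = rng.normal(size=(200, 8))
same_dist = rng.normal(size=(200, 8))        # new samples, no drift
drifted = rng.normal(size=(200, 8)) + 1.0    # new samples with a simulated offset

mmd_no_drift = mmd2(source, same_dist, sigma=3.0)
mmd_drift = mmd2(source, drifted, sigma=3.0)
print(f"no drift: {mmd_no_drift:.4f}, drift: {mmd_drift:.4f}")
```

A drift compensation method in this spirit would look for a transformation that makes the second number small while keeping the classes separable; the sketch only measures the distance and does not perform any correction.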
Figure 1. Schematic diagram of the proposed DCKT method.
Figure 2. Difference between traditional machine learning and transfer learning.
Distribution of benchmark sensor drift dataset from UCSD.
| Batch ID | Month | Acetone | Acetaldehyde | Ethanol | Ethylene | Ammonia | Toluene |
|---|---|---|---|---|---|---|---|
| Batch 1 | 1,2 | 90 | 98 | 83 | 30 | 70 | 74 |
| Batch 2 | 3,4,8–10 | 164 | 334 | 100 | 109 | 532 | 5 |
| Batch 3 | 11–13 | 365 | 490 | 216 | 240 | 275 | 0 |
| Batch 4 | 14,15 | 64 | 43 | 12 | 30 | 12 | 0 |
| Batch 5 | 16 | 28 | 40 | 20 | 46 | 63 | 0 |
| Batch 6 | 17–20 | 514 | 574 | 110 | 29 | 606 | 467 |
| Batch 7 | 21 | 649 | 662 | 360 | 744 | 630 | 568 |
| Batch 8 | 22,23 | 30 | 30 | 40 | 33 | 143 | 18 |
| Batch 9 | 24,30 | 61 | 55 | 100 | 75 | 78 | 101 |
| Batch 10 | 36 | 600 | 600 | 600 | 600 | 600 | 600 |
Figure 3. Principal components (PC1 vs. PC2) of the raw data of the 10 batches obtained with PCA (i.e., the 2-D subspace distribution of each batch), showing the significant changes in data distribution caused by drift.
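A PC1 vs. PC2 projection of this kind can be sketched with plain NumPy. The example below fits the principal axes on one reference batch and reuses them for a later batch, which is one reasonable way to make a shift visible and not necessarily how the figure was produced. The helper names `fit_pca` and `project`, the six-feature synthetic batches, and the simulated offset are assumptions for illustration only.

```python
import numpy as np

def fit_pca(x, n_components=2):
    """Return the feature mean and the top principal directions of x."""
    mean = x.mean(axis=0)
    # Rows of vt are principal directions sorted by decreasing variance.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(x, mean, components):
    """Coordinates of the rows of x on the given principal directions."""
    return (x - mean) @ components.T

# Synthetic batches (assumption: stand-ins for real sensor features).
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0])
offset = np.array([4.0, -2.0, 0.0, 0.0, 0.0, 0.0])  # simulated drift
early_batch = rng.normal(size=(100, 6)) * scales
late_batch = rng.normal(size=(100, 6)) * scales + offset

mean, components = fit_pca(early_batch)
early_scores = project(early_batch, mean, components)
late_scores = project(late_batch, mean, components)
print("early centroid:", early_scores.mean(axis=0).round(2))
print("late centroid:", late_scores.mean(axis=0).round(2))
```

Plotting `early_scores` and `late_scores` as scatter plots would show the later batch displaced from the origin, which is the kind of change in data distribution the caption refers to.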
Figure 4. Principal components (PC1 vs. PC2) of the source domain and the target domain after DCKT (i.e., the 2-D subspace distributions for the 9 tasks). In each task, the top plot shows Batch 1 (source domain) after DCKT and the bottom plot shows Batch i (target domain) after DCKT; the distributions of the source and target domains are more similar after DCKT.
Recognition accuracy (%) under Experimental Setting 1.
| Methods | Batch 2 | Batch 3 | Batch 4 | Batch 5 | Batch 6 | Batch 7 | Batch 8 | Batch 9 | Batch 10 | Average Value |
|---|---|---|---|---|---|---|---|---|---|---|
| PCASVM | 82.40 | 84.80 | 80.12 | 75.13 | 73.57 | 56.16 | 48.64 | 67.45 | 49.14 | 68.60 |
| LDASVM | 47.27 | 57.76 | 50.93 | 62.44 | 41.48 | 37.42 | | 52.34 | 31.17 | 49.91 |
| SVM-rbf | 74.36 | 61.03 | 50.93 | 18.27 | 28.26 | 28.81 | 20.07 | 34.26 | 34.47 | 38.94 |
| SVM-comgfk | 74.47 | 70.15 | 59.78 | 75.09 | 73.99 | 54.59 | 55.88 | 70.23 | 41.85 | 64.00 |
| DS | 69.37 | 46.28 | 41.61 | 58.88 | 48.83 | 32.83 | 23.47 | | 29.03 | 46.98 |
| DRCA | 89.15 | | | | 86.52 | 60.25 | 62.24 | 72.34 | 52.00 | 77.63 |
| DCKT | | 90.29 | 83.23 | 76.14 | | | 66.67 | 71.06 | | |
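The Average Value column appears to be the unweighted mean over the nine target batches. A quick arithmetic check, using only the three Setting 1 rows that are complete in this record (PCASVM, SVM-rbf, SVM-comgfk), is sketched below; the dictionary layout and variable names are illustrative assumptions.

```python
# Per-batch accuracies (Batches 2 to 10) and the reported average, copied from
# the rows that are complete in this record.
rows = {
    "PCASVM": ([82.40, 84.80, 80.12, 75.13, 73.57, 56.16, 48.64, 67.45, 49.14], 68.60),
    "SVM-rbf": ([74.36, 61.03, 50.93, 18.27, 28.26, 28.81, 20.07, 34.26, 34.47], 38.94),
    "SVM-comgfk": ([74.47, 70.15, 59.78, 75.09, 73.99, 54.59, 55.88, 70.23, 41.85], 64.00),
}

computed = {}
for name, (values, reported) in rows.items():
    # Unweighted mean across the nine target batches.
    computed[name] = sum(values) / len(values)
    print(f"{name}: computed {computed[name]:.2f}, reported {reported:.2f}")
```

For these three rows the computed means agree with the reported averages to two decimal places.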
Parameter values of the DCKT under Experimental Setting 1.
| Batch ID | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|
| | 0.001 | 10,000 | 20 | 1000 | 0.001 | 1000 | 10,000 | 1000 | 10,000 |
| m | 16 | 5 | 8 | 11 | 8 | 8 | 4 | 11 | 5 |
Recognition accuracy (%) under Experimental Setting 2.
| Methods | Batch 1⟶2 | Batch 2⟶3 | Batch 3⟶4 | Batch 4⟶5 | Batch 5⟶6 | Batch 6⟶7 | Batch 7⟶8 | Batch 8⟶9 | Batch 9⟶10 | Average Value |
|---|---|---|---|---|---|---|---|---|---|---|
| PCASVM | 82.40 | | 83.23 | 72.59 | 36.70 | 74.98 | 58.16 | 84.04 | 30.61 | 69.06 |
| LDASVM | 47.27 | 46.72 | 70.81 | 85.28 | 48.87 | 75.15 | 77.21 | 62.77 | 30.25 | 60.48 |
| SVM-rbf | 74.36 | 87.83 | 90.06 | 56.35 | 42.52 | | | 62.98 | 22.64 | 68.01 |
| SVM-comgfk | 74.47 | 73.75 | 78.51 | 64.26 | 69.97 | 77.69 | 82.69 | 85.53 | 17.76 | 69.40 |
| DS | 69.37 | 53.59 | 67.08 | 37.56 | 36.30 | 26.57 | 49.66 | 42.55 | 25.78 | 45.38 |
| DRCA | 89.15 | 98.11 | | 69.54 | 50.87 | 78.94 | 65.99 | 84.04 | 36.31 | 74.22 |
| DCKT | | 91.87 | 90.68 | | | 78.88 | 75.22 | | | |
Parameter values of the DCKT under Experimental Setting 2.
| Batch ID | 1⟶2 | 2⟶3 | 3⟶4 | 4⟶5 | 5⟶6 | 6⟶7 | 7⟶8 | 8⟶9 | 9⟶10 |
|---|---|---|---|---|---|---|---|---|---|
| | 0.001 | 10,000 | 0.001 | 0.001 | 10,000 | 10,000 | 10,000 | 1000 | 10,000 |
| m | 16 | 8 | 32 | 32 | 7 | 64 | 8 | 64 | 17 |