Issam Hammad, Kamal El-Sankary.
Abstract
Accuracy evaluation in machine learning is based on splitting the data into a training set and a test set. This critical step is applied when developing machine learning models, including models based on sensor data. For sensor-based problems, comparing the accuracy of machine learning models using the train/test split provides only a baseline comparison under ideal conditions. Such comparisons do not consider practical production problems that can impact inference accuracy, such as the sensors' thermal noise, performance with lower inference quantization, and tolerance to sensor failure. Therefore, this paper proposes a set of practical tests that can be applied when comparing the accuracy of machine learning models for sensor-based problems. First, the impact of the sensors' thermal noise on the models' inference accuracy was simulated. As will be presented, machine learning algorithms have different levels of error resilience to thermal noise. Second, the models' accuracy using lower inference quantization was compared. Lowering inference quantization allows a lower analog-to-digital converter (ADC) resolution, which is cost-effective in embedded designs. Moreover, in custom designs, an ADC's effective number of bits (ENOB) is usually lower than its nominal number of bits due to various design factors. Therefore, it is practical to compare models' accuracy using lower inference quantization. Third, the models' accuracy tolerance to sensor failure was evaluated and compared. For this study, the University of California Irvine (UCI) 'Daily and Sports Activities' dataset was used to present these practical tests and their impact on model selection.
Keywords: ADC; ENOB; deep learning; edge artificial intelligence (AI); low power; low quantization; machine learning; sensor failure; sensor fusion; thermal noise
Year: 2019 PMID: 31404972 PMCID: PMC6719906 DOI: 10.3390/s19163491
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Xsens MTx 3-DOF (degrees of freedom) orientation tracker (photo from [14]).
Baseline test accuracies using k-fold (k = 10).
| Algorithm | Train/Test Sample Size | Test Accuracy without PCA | Test Accuracy with PCA |
|---|---|---|---|
| Deep Neural Network (DNN) | 8208/912 | 99.26% | 97.87% |
| K-Nearest Neighbors (KNN) | 8208/912 | 78.34% | 98.12% |
| Decision Tree Classifier (DTC) | 8208/912 | 90.30% | 90.72% |
| Random Forest Classifier (RFC) | 8208/912 | 98.96% | 98.65% |
| Gaussian Naïve Bayes (GNB) | 8208/912 | 93.49% | 78.55% |
Average inference accuracy with simulated thermal noise.
| SNR | DNN | KNN + PCA | DTC | DTC + PCA | RFC | RFC + PCA | GNB |
|---|---|---|---|---|---|---|---|
| Baseline | 99.26% | 98.12% | 90.30% | 90.72% | 98.96% | 98.65% | 93.49% |
| 40 dB | 99.28% | 98.11% | 89.62% | 90.54% | 98.93% | 98.59% | 93.34% |
| 35 dB | 99.25% | 97.97% | 88.30% | 90.47% | 98.84% | 98.51% | 93.28% |
| 30 dB | 99.25% | 97.98% | 85.70% | 89.89% | 98.35% | 98.44% | 93.03% |
| 25 dB | 99.27% | 98.02% | 81.69% | 88.90% | 97.08% | 98.24% | 85.06% |
| 20 dB | 99.24% | 98.03% | 76.28% | 87.53% | 94.88% | 97.60% | 69.61% |
| 15 dB | 99.25% | 98.01% | 68.33% | 84.79% | 91.51% | 95.74% | 69.60% |
| 10 dB | 99.24% | 97.82% | 55.65% | 80.77% | 85.55% | 92.35% | 68.81% |
| 5 dB | 99.11% | 97.73% | 40.13% | 74.56% | 69.12% | 86.90% | 46.82% |
| 0 dB | 98.43% | 96.37% | 25.46% | 63.98% | 45.40% | 77.61% | 17.07% |
Figure 2. A histogram for a thermal noise sample added to one accelerometer axis in all test instances.
Figure 3. A sample of thermal noise simulation for one accelerometer axis in one instance with a signal-to-noise ratio (SNR) of 5 dB. (a) Original sensor readings. (b) Added white noise. (c) New values with SNR = 5 dB.
Figure 4. Accuracy trend for machine learning models with increasing thermal noise power.
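The thermal noise test scales zero-mean white Gaussian noise so that each channel reaches a target SNR before inference. A minimal sketch of that simulation (the exact noise-injection code is not published; function and variable names here are illustrative):

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Add zero-mean white Gaussian noise scaled to a target SNR in dB."""
    rng = np.random.default_rng(0) if rng is None else rng
    p_signal = np.mean(np.asarray(signal, dtype=float) ** 2)   # signal power
    p_noise = p_signal / (10 ** (snr_db / 10))                 # noise power for target SNR
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise

# Stand-in for one accelerometer axis (125 samples), SNR = 5 dB as in Figure 3
axis = np.sin(np.linspace(0, 4 * np.pi, 125))
noisy = add_noise_at_snr(axis, snr_db=5)
snr_est = 10 * np.log10(np.mean(axis ** 2) / np.mean((noisy - axis) ** 2))
```

Sweeping `snr_db` from 40 dB down to 0 dB over the test set and re-running inference reproduces the kind of degradation curves tabulated above.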
Average inference accuracy with lower inference quantization.
| Resolution | DNN | KNN + PCA | DTC | DTC + PCA | RFC | RFC + PCA | GNB |
|---|---|---|---|---|---|---|---|
| 16 bits (baseline) | 99.26% | 98.12% | 90.30% | 90.72% | 98.96% | 98.65% | 93.49% |
| 14 bits | 99.25% | 97.95% | 90.28% | 90.68% | 98.95% | 98.61% | 93.40% |
| 12 bits | 99.25% | 98.02% | 90.23% | 90.64% | 98.93% | 98.61% | 93.47% |
| 10 bits | 99.25% | 97.99% | 89.62% | 90.31% | 98.89% | 98.56% | 93.44% |
| 8 bits | 99.20% | 97.93% | 88.80% | 87.30% | 98.33% | 97.50% | 93.72% |
| 7 bits | 99.20% | 97.74% | 85.33% | 83.68% | 96.94% | 94.53% | 93.74% |
| 6 bits | 98.90% | 95.48% | 78.63% | 76.33% | 94.89% | 88.11% | 90.65% |
| 5 bits | 98.11% | 89.12% | 71.01% | 63.81% | 90.51% | 76.29% | 86.69% |
| 4 bits | 89.74% | 60.91% | 58.62% | 38.89% | 82.71% | 54.52% | 82.26% |
Figure 5. A sample of low quantization simulation for one accelerometer axis in one instance. (a) Original sensor readings with 16-bit quantization. (b) 5-bit quantization. (c) 6-bit quantization.
Figure 6. Accuracy trend for machine learning models with lower inference quantization.
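The low-quantization test re-quantizes the 16-bit readings onto a coarser uniform grid, emulating an ADC with fewer (effective) bits. A sketch of one plausible implementation, assuming a signed uniform quantizer over the channel's full scale (the paper does not specify its exact quantizer):

```python
import numpy as np

def quantize(x, n_bits, full_scale=None):
    """Re-quantize readings to an n-bit signed uniform grid over ±full_scale."""
    x = np.asarray(x, dtype=float)
    fs = np.max(np.abs(x)) if full_scale is None else full_scale
    levels = 2 ** (n_bits - 1)                 # codes per polarity for a signed ADC
    step = fs / levels                         # quantization step size
    codes = np.clip(np.round(x / step), -levels, levels - 1)
    return codes * step

x = np.linspace(-1.0, 1.0, 9)                  # stand-in sensor readings
x5 = quantize(x, n_bits=5)                     # 5-bit version, as in Figure 5b
```

Applying `quantize` only at test time corresponds to the inference-only table above; applying it to both training and test data corresponds to the next table.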
Average inference accuracy with lower-resolution quantization applied to both training and testing.
| Model | 8 bits | 7 bits | 6 bits | 5 bits | 4 bits |
|---|---|---|---|---|---|
| DNN | 99.19% | 98.72% | 98.54% | 89.29% | 81.87% |
| KNN + PCA | 98.03% | 95.29% | 91.56% | 72.48% | 51.65% |
| DTC | 89.06% | 86.32% | 87.41% | 79.96% | 75.35% |
| DTC + PCA | 86.29% | 81.91% | 79.17% | 50.88% | 36.73% |
| RFC | 98.80% | 97.48% | 97.70% | 92.77% | 89.81% |
| RFC + PCA | 96.24% | 94.59% | 87.91% | 67.73% | 41.31% |
| GNB | 85.62% | 82.55% | 81.67% | 54.04% | 32.11% |
Inference accuracy with a device failure in one tracker.
| Model | Accelerometer | Gyroscope | Magnetometer |
|---|---|---|---|
| DNN | 93.75% | 98.81% | 83.92% |
| KNN + PCA | 64.42% | 94.77% | 94.74% |
| DTC | 63.98% | 90.28% | 66.26% |
| DTC + PCA | 30.46% | 90.72% | 90.48% |
| RFC | 87.55% | 98.82% | 82.58% |
| RFC + PCA | 41.38% | 98.26% | 96.26% |
| GNB | 76.87% | 92.79% | 86.08% |
Inference accuracy with one tracker failure.
| Model | Tracker #1 | Tracker #2 | Tracker #3 | Tracker #4 | Tracker #5 |
|---|---|---|---|---|---|
| DNN | 74.29% | 86.15% | 72.59% | 74.69% | 68.65% |
| KNN + PCA | 78.08% | 48.69% | 51.07% | 71.53% | 72.16% |
| DTC | 38.60% | 66.89% | 55.27% | 64.37% | 16.15% |
| DTC + PCA | 27.89% | 33.43% | 30.63% | 31.56% | 28.90% |
| RFC | 57.73% | 82.69% | 73.26% | 88.59% | 39.62% |
| RFC + PCA | 42.98% | 40.48% | 38.88% | 43.84% | 40.25% |
| GNB | 57.21% | 83.00% | 67.55% | 63.45% | 64.50% |
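Both failure tables can be generated by masking channels in the feature matrix before inference. A sketch under an assumed (illustrative) layout of 5 trackers × 3 devices (accelerometer, gyroscope, magnetometer) × 3 axes = 45 channels, with failed channels zeroed; the paper's actual column ordering and failure encoding may differ:

```python
import numpy as np

# Assumed feature layout: (tracker, device, axis), devices ordered acc, gyro, mag
N_TRACKERS, N_DEVICES, N_AXES = 5, 3, 3

def fail_device(X, device):
    """Zero one device type (0=accelerometer, 1=gyroscope, 2=magnetometer) in all trackers."""
    Xf = X.reshape(-1, N_TRACKERS, N_DEVICES, N_AXES).copy()
    Xf[:, :, device, :] = 0.0
    return Xf.reshape(X.shape)

def fail_tracker(X, tracker):
    """Zero every channel of one orientation tracker (0..4)."""
    Xf = X.reshape(-1, N_TRACKERS, N_DEVICES, N_AXES).copy()
    Xf[:, tracker, :, :] = 0.0
    return Xf.reshape(X.shape)

X = np.ones((2, 45))                      # stand-in for 2 test instances
X_gyro_fail = fail_device(X, device=1)    # gyroscope failure in every tracker
X_t3_fail = fail_tracker(X, tracker=2)    # total failure of tracker #3
```

Running inference on the masked test set, per failure mode, yields per-model robustness scores like those tabulated above.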