Yuan Lei, Shir Li Wang, Caiyu Su, Theam Foo Ng.
Abstract
The Internet of Vehicles (IoV) is an interactive network providing intelligent traffic management, intelligent dynamic information services, and intelligent vehicle control to running vehicles. One of the main problems in the IoV is the reluctance of vehicles to share local data, which prevents the cloud server from acquiring a sufficient amount of data to build accurate machine learning (ML) models. In addition, communication efficiency and ML model accuracy in the IoV are degraded by noise data caused by violent shaking and obscuration of in-vehicle cameras. Therefore, we propose a new Outlier Detection and Exponential Smoothing federated learning (OES-Fed) framework to overcome these problems. More specifically, we filter the noise data of the local ML models in the IoV from both a current perspective and a historical perspective. The noise data filtering combines outlier detection, K-means, Kalman filter, and exponential smoothing algorithms. The experimental results on three datasets show that the proposed OES-Fed framework achieved higher accuracy, lower loss, and a better area under the curve (AUC). The OES-Fed framework can better filter noise data, providing an important reference for the emerging field of federated learning in the IoV.
Keywords: Exponential smoothing; Federated learning; Internet of vehicles; Kalman filter; Noise data filtering; Outlier detection
Year: 2022 PMID: 36262146 PMCID: PMC9575870 DOI: 10.7717/peerj-cs.1101
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
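As a rough illustration of the abstract's two-perspective idea (current-round filtering combined with each client's history), the sketch below keeps a vehicle client only when its latest reported accuracy stays close to the exponentially smoothed prediction of its own history. The threshold, smoothing coefficient, and data values are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch (not the authors' code): drop clients whose current
# accuracy deviates sharply from their exponentially smoothed history.

def exp_smooth(history, alpha=0.5):
    """Single exponential smoothing: S_t = alpha * x_t + (1 - alpha) * S_{t-1}."""
    s = history[0]
    for x in history[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

def filter_clients(acc_history, deviation=0.2, alpha=0.5):
    """Keep clients whose latest accuracy is close to the smoothed
    prediction of their own past accuracies (a stand-in for the paper's
    combined outlier / Kalman / exponential-smoothing filtering)."""
    kept = []
    for client, history in acc_history.items():
        predicted = exp_smooth(history[:-1], alpha)
        if abs(history[-1] - predicted) <= deviation:
            kept.append(client)
    return kept

accs = {
    "v1": [0.60, 0.65, 0.70, 0.72],   # steadily improving client
    "v2": [0.62, 0.64, 0.66, 0.15],   # sudden drop -> likely noisy update
}
print(filter_clients(accs))  # -> ['v1']
```

A client with a genuinely degrading model would eventually pull its own smoothed history down, so a single noisy round is filtered while a persistent trend is not.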
Comparison of federated learning and noise data filtering algorithms for different domains.
| Work | Technologies | Application scenario | Innovations |
|---|---|---|---|
| | Federated learning, Active learning | Natural disaster, refuse classification | Modifies some federated learning model parameters and lets the ML model automatically select and tag the data it learns from. |
| | Federated learning | IoV | Two-dimensional contract theory is used as the distributed framework, with a greedy algorithm added. |
| | Outlier detection | Data filtering | Outlier detection based on graph clustering, in which outliers are permitted. |
| | Outlier detection, K-nearest neighbor | Data filtering | Uses the K-nearest neighbor algorithm to divide outlier attributes into regions, then introduces local outlier factors over those regions. |
| | Kalman filter | Position and trajectory estimation of moving objects | Combines an exponential function with the Kalman gain. |
| | Kalman filter | IoV | A Kalman filter fuses position information; GPS, SINS, DR, and TDOA are selected to simulate the fusion algorithm. |
| Ours | Federated learning, Outlier detection, K-means, Kalman filter | IoV | Outliers are detected by selecting high-quality subsets, combined with the K-means, cubic exponential smoothing, and Kalman filter algorithms. |
Figure 1IoV scenario.
Figure 2System model.
Parameter definition of OES-Fed algorithm.
| Symbols | Definition |
|---|---|
| r | Global communication rounds |
| D | Discard rate |
| V | Vehicle set |
| v | Vehicle |
| m | Vehicle weights |
| | Vehicle accuracy |
| global | Global model |
| | Global model accuracy |
| step | Vehicle training resources |
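Using the symbols above (vehicle set V, vehicle weights m, discard rate D), a FedAvg-style aggregation round might look like the following sketch. The discard rule and helper functions here are hypothetical stand-ins; the actual OES-Fed update rule is not reproduced.

```python
# Hedged sketch of a federated aggregation round: drop the lowest-weighted
# fraction D of vehicles, then take a weighted average of the survivors.
# The weighting and discard criterion are illustrative assumptions.

def aggregate(models, weights):
    """Weighted average of per-vehicle model parameter vectors."""
    total = sum(weights)
    dim = len(models[0])
    return [sum(w * mdl[i] for mdl, w in zip(models, weights)) / total
            for i in range(dim)]

def federated_round(local_models, weights, discard_rate=0.25):
    """Discard the fraction `discard_rate` of vehicles with the lowest
    weights, then aggregate the remaining local models."""
    ranked = sorted(zip(local_models, weights), key=lambda p: p[1])
    keep = ranked[int(len(ranked) * discard_rate):]
    models, ws = zip(*keep)
    return aggregate(list(models), list(ws))

global_model = federated_round(
    [[1.0, 2.0], [1.2, 2.2], [9.0, 9.0]],   # third vehicle is down-weighted
    [0.5, 0.4, 0.1],
    discard_rate=1 / 3,
)
print(global_model)
```

In a real deployment each "model" would be a full parameter tensor per vehicle; flat lists are used here only to keep the sketch self-contained.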
Parameter definition of ESmooth algorithm.
| Symbols | Definition |
|---|---|
| X | Filtered processing value |
| P | The variance value corresponding to X |
| K | Filtering gain value |
| S | Smoothing value |
| R | Smoothing period |
| | Exponential smoothing coefficient |
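The symbols X, P, and K above match a standard scalar Kalman filter. A minimal sketch follows; the process-noise and measurement-noise values (`q`, `r_n`) and the input stream are assumptions, not values taken from the paper.

```python
# Minimal one-dimensional Kalman filter: X is the filtered value, P its
# variance, K the filter gain. Noise variances q and r_n are assumed.

def kalman_1d(measurements, q=1e-3, r_n=0.25, x0=0.0, p0=1.0):
    """Return the stream of filtered estimates X for scalar measurements."""
    x, p = x0, p0
    out = []
    for z in measurements:
        p = p + q             # predict: variance grows by process noise
        k = p / (p + r_n)     # gain K = P / (P + R_n)
        x = x + k * (z - x)   # update X toward the new measurement
        p = (1 - k) * p       # shrink variance after the update
        out.append(x)
    return out

xs = kalman_1d([0.9, 1.1, 1.0, 0.95])
print(xs[-1])  # converges toward the ~1.0 signal level
```

A larger `r_n` trusts the history more and the measurements less, which is the same trade-off the exponential smoothing coefficient controls in the ESmooth stage.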
Figure 3OES-Fed framework.
Outlier algorithm & ESmooth algorithm.
OES-Fed algorithm.
Parameter setting of the experiment.
| Parameter | MNIST / CIFAR-10 | Vehicle classification |
|---|---|---|
| Number of training set images | 60,000 | 2,000 |
| Number of test set images | 10,000 | 200 |
| Number of clients | 40 | 40 |
| Total number of rounds | 30 | 30 |
| Number of local training rounds | 10 | 10 |
| Learning rate | 0.01 | 0.01 |
| Data distribution | non-IID | non-IID |
| Local data batch size | 64 | 64 |
| Convolution kernel | 5 × 5 | 5 × 5 |
Figure 4Statistics on the number of actual clients in the OES-Fed model using the MNIST dataset, the CIFAR-10 dataset and the vehicle classification dataset.
Figure 5Comparison of the accuracy of each model for MNIST, CIFAR-10 and the vehicle classification datasets using different models and non-iid data settings.
Figure 6Comparison of loss values for each model for MNIST, CIFAR-10 and the vehicle classification datasets using different models and non-iid data settings.
Figure 7Comparison of AUC values for each model for MNIST using different models and non-iid data settings.
Figure 9Comparison of AUC values for each model for the vehicle classification dataset using different models and non-iid data settings.
Figure 10Accuracy comparison of all clients in the last round for the MNIST dataset, CIFAR-10 dataset and the vehicle classification dataset using the OES-Fed model and FedAVG model.
Comparison of accuracy, loss values and AUC values for the MNIST dataset, CIFAR-10 and the vehicle classification dataset using FedAVG, FedSGD and OES-Fed.
| Model | Metric | MNIST | CIFAR-10 | Vehicle classification |
|---|---|---|---|---|
| FedSGD | Accuracy (%) | 54.20 | 14.15 | 9.9 |
| | Loss | 2.18 | 2.30 | 2.30 |
| | AUC | 0.53 | 0.55 | 0.57 |
| FedAVG | Accuracy (%) | 96.30 | 19.73 | 45.5 |
| | Loss | 0.30 | 2.24 | 1.62 |
| | AUC | 0.63 | 0.58 | 0.63 |
| OES-Fed | Accuracy (%) | 98.66 | 61.75 | 77.9 |
| | Loss | 0.07 | 1.86 | 1.03 |
| | AUC | 0.85 | 0.72 | 0.90 |
Parameter definition of Outlier algorithm.
| Symbols | Definition |
|---|---|
| e | Clustering parameters |
| D | Sample distance formula |
| d | Sample dimension |
| | Arbitrary real numbers |
| k | Number of clustering centers |
| | Center of clustering |
| | Number of samples of class j |
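A minimal sketch of K-means-based outlier marking using the table's symbols (k cluster centers, sample distance D). Flagging points that land in under-populated clusters is one common criterion; it is an illustrative assumption, not the paper's exact rule.

```python
# Tiny 1-D K-means plus a cluster-size outlier criterion: samples assigned
# to a cluster holding fewer than min_frac of all points are flagged.

def kmeans_1d(samples, k=2, iters=20):
    """Return k cluster centres for a 1-D sample list (naive Lloyd's loop)."""
    centres = sorted(samples)[::max(1, len(samples) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:
            j = min(range(k), key=lambda i: abs(s - centres[i]))
            groups[j].append(s)
        centres = [sum(g) / len(g) if g else centres[i]
                   for i, g in enumerate(groups)]
    return centres

def mark_outliers(samples, k=2, min_frac=0.2):
    """Flag samples whose nearest centre owns < min_frac of the data."""
    centres = kmeans_1d(samples, k)
    labels = [min(range(k), key=lambda i: abs(s - centres[i]))
              for s in samples]
    sizes = [labels.count(i) for i in range(k)]
    return [s for s, l in zip(samples, labels)
            if sizes[l] < min_frac * len(samples)]

data = [1.0, 1.0, 1.0, 5.0, 5.0, 50.0]
print(mark_outliers(data))  # -> [50.0]
```

A far outlier tends to capture a centre of its own under K-means, so a distance-to-centre test alone can miss it; checking cluster population sizes catches exactly that case.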