Cen-Jhih Li, Pin-Han Huang, Yi-Ting Ma, Hung Hung, Su-Yun Huang.
Abstract
Federated learning is a framework in which multiple devices or institutions, called local clients, collaboratively train a global model without sharing their data. In federated learning with a central server, an aggregation algorithm integrates the model information sent from local clients to update the parameters of the global model. The sample mean is the simplest and most commonly used aggregation method. However, it is not robust to outliers in the data or to the Byzantine problem, in which Byzantine clients send malicious messages to interfere with the learning process. Several robust aggregation methods have been introduced in the literature, including the marginal median, the geometric median, and the trimmed mean. In this article, we propose an alternative robust aggregation method, named γ-mean, which is the minimum divergence estimator based on a robust density power divergence. The γ-mean aggregation mitigates the influence of Byzantine clients by assigning them smaller weights. The weighting scheme is data-driven and controlled by the value of γ. Robustness is discussed from the viewpoint of the influence function, and some numerical results are presented.
Keywords: byzantine problem; density power divergence; federated learning; influence function; robustness; γ-divergence
Year: 2022 PMID: 35626569 PMCID: PMC9141408 DOI: 10.3390/e24050686
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.738
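The aggregation rules named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the isotropic Gaussian working model with a fixed scale `sigma2`, the marginal-median starting point, the trimming fraction, and the synthetic client updates are all assumptions made for illustration. Under that working model, the γ-weighted mean is a fixed point of a reweighting step in which each client's weight decays exponentially with its squared distance from the current estimate.

```python
import numpy as np

def sample_mean(X):
    """Plain average of client updates (rows of X)."""
    return X.mean(axis=0)

def marginal_median(X):
    """Coordinate-wise median of client updates."""
    return np.median(X, axis=0)

def trimmed_mean(X, trim=0.1):
    """Coordinate-wise mean after dropping the k smallest and k largest values."""
    n = X.shape[0]
    k = int(trim * n)
    Xs = np.sort(X, axis=0)
    return Xs[k:n - k].mean(axis=0)

def geometric_median(X, n_iter=100, eps=1e-8):
    """Weiszfeld iteration for the point minimizing the sum of Euclidean distances."""
    mu = np.median(X, axis=0)
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(X - mu, axis=1), eps)
        w = 1.0 / d
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
    return mu

def gamma_mean(X, gamma=0.5, sigma2=1.0, n_iter=100, tol=1e-10):
    """Fixed-point iteration for a gamma-weighted mean.

    Assumption: isotropic Gaussian working model with known variance sigma2,
    so that w_i is proportional to exp(-gamma * ||x_i - mu||^2 / (2 * sigma2)).
    """
    mu = np.median(X, axis=0)  # robust starting point (an assumption)
    for _ in range(n_iter):
        sq = ((X - mu) ** 2).sum(axis=1)
        logw = -gamma * sq / (2.0 * sigma2)
        w = np.exp(logw - logw.max())  # shift by the max for numerical stability
        new_mu = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

# Hypothetical data: 18 honest clients near the origin, 2 Byzantine clients far away.
rng = np.random.default_rng(0)
honest = rng.normal(0.0, 1.0, size=(18, 5))
byzantine = np.full((2, 5), 100.0)
X = np.vstack([honest, byzantine])

for name, agg in [("mean", sample_mean), ("marginal median", marginal_median),
                  ("trimmed mean", trimmed_mean), ("geometric median", geometric_median),
                  ("gamma-mean", gamma_mean)]:
    print(name, np.round(agg(X), 3))
```

In this toy setting the two distant updates receive essentially zero weight in `gamma_mean`, whereas `sample_mean` is pulled strongly toward them. A larger `gamma` downweights distant updates more aggressively, at the cost of discarding more information from honest clients.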
Figure 1. Comparison of different aggregators across different dimensions p. (a) Case (no Byzantine client). (b) Case (10% Byzantine clients).
Figure 2. Comparison of different aggregators across different values with fixed . The two subplots in the 3rd row are zoomed-in views of the subplots in the 2nd row.
Figure 3. Effect of different values across different dimensions with .
Figure 4. Comparison of γ-mean versus simple γ-mean with and .
Figure A1. Model used in MNIST and fashion MNIST.
Figure A2. Model used in chest X-ray images.
Figure 5. Testing process and comparison of testing accuracy for different aggregators on MNIST.
Figure 6. Testing process and comparison of testing accuracy for different aggregators on fashion MNIST.
Figure 7. Testing process and comparison of testing accuracy for different aggregators on chest X-ray images. The mean and geometric median aggregators could not tolerate the Byzantine attack, and both failed during model training, so no results are reported for these two methods.
Pneumonia prediction on test data.
| Byz | Aggregator | TN | FN | FP | TP | Prec | Sens (Type II Error) | Acc |
|---|---|---|---|---|---|---|---|---|
| | single machine | 156 | 23 | 78 | 367 | 0.8247 | 0.9410 (0.0590) | 0.8381 |
| No | mean | 212 | 103 | 22 | 287 | 0.9288 | 0.7359 (0.2641) | 0.7997 |
| No | marginal median | 190 | 63 | 44 | 327 | 0.8814 | 0.8385 (0.1615) | 0.8285 |
| No | simple γ-mean | 126 | 8 | 108 | 382 | 0.7796 | 0.9795 (0.0205) | 0.8141 |
| No | GeoMed | 177 | 30 | 57 | 360 | 0.8633 | 0.9231 (0.0769) | 0.8606 |
| Yes | mean | – | – | – | – | – | – | – |
| Yes | marginal median | 228 | 271 | 6 | 119 | 0.9520 | 0.3051 (0.6949) | 0.5561 |
| Yes | simple γ-mean | 140 | 11 | 94 | 379 | 0.8013 | 0.9718 (0.0282) | 0.8317 |
| Yes | GeoMed | – | – | – | – | – | – | – |
† The symbol “–” indicates that the model failed during training.
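The derived columns of the table follow from the four confusion-matrix counts. The sketch below uses the standard definitions of precision, sensitivity, Type II error rate, and accuracy; the function name `confusion_metrics` is an assumption for illustration, and the counts are those of the single machine row.

```python
def confusion_metrics(tn, fn, fp, tp):
    """Derive summary metrics from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),      # share of predicted positives that are correct
        "sensitivity": tp / (tp + fn),    # share of actual positives that are detected
        "type_ii_error": fn / (tp + fn),  # equals 1 - sensitivity
        "accuracy": (tp + tn) / (tn + fn + fp + tp),
    }

# Counts from the single machine row of the table.
metrics = confusion_metrics(tn=156, fn=23, fp=78, tp=367)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Each row has 390 actual positives (TP + FN) and 624 test cases in total, so the same function applies to the other rows unchanged.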