Dun Li, Dezhi Han, Tien-Hsiung Weng, Zibin Zheng, Hongzhi Li, Han Liu, Arcangelo Castiglione, Kuan-Ching Li.
Abstract
Federated learning (FL) is a promising decentralized deep learning technology that allows users to update models cooperatively without sharing their data. FL is reshaping existing industry paradigms for mathematical modeling and analysis, enabling a growing number of industries to build privacy-preserving, secure distributed machine learning models. However, the inherent characteristics of FL lead to problems in actual operation, such as privacy protection, communication cost, systems heterogeneity, and unreliable model uploads. Integration with Blockchain technology offers an opportunity to further improve FL security and performance while widening its scope of applications. We refer to this integration of Blockchain and FL as the Blockchain-based federated learning (BCFL) framework. This paper presents an in-depth survey of BCFL and discusses the insights of this new paradigm. In particular, we first briefly introduce FL technology and discuss the challenges it faces. Then, we summarize the Blockchain ecosystem. Next, we highlight the structural design and platforms of BCFL. Furthermore, we present attempts at improving FL performance with Blockchain and several combined applications of incentive mechanisms in FL. Finally, we summarize the industrial application scenarios of BCFL.
Keywords: Blockchain; Federated learning; Incentive mechanism; Industrial Applications; Smart Contract
Year: 2021 PMID: 34840525 PMCID: PMC8605788 DOI: 10.1007/s00500-021-06496-5
Source DB: PubMed Journal: Soft comput ISSN: 1432-7643 Impact factor: 3.732
Fig. 1 The architecture of FL
Fig. 2 The loosely related research of Yang's work and Zheng's work
The summary of selected overviews and surveys for FL
| Category | Ref. no | Author(s) | Topic | Published |
|---|---|---|---|---|
| Fundamental architecture, algorithm, and model | Konečný et al. | McMahan et al. | Concept and applications | 2017.6-9 |
| | Kairouz et al. | Kairouz et al. | Advances and open problems | 2019.1 |
| | Yang et al. | Yang et al. | Concept and applications | 2019.2 |
| | Bonawitz et al. | Bonawitz et al. | System design | 2019.3 |
| | Li et al. | Tian Li et al. | Challenges, methods, and future directions | 2019.8 |
| | Gu et al. | Gu et al. | Distributed machine learning | 2019.9 |
| | Li et al. | Qinbin Li et al. | Data privacy and protection | 2019.11 |
| | Mothukuri et al. | Mothukuri et al. | Security and privacy | 2020.10 |
| | Shen et al. | Sheng Shen et al. | Data privacy and security | 2020.10 |
| | Lo et al. | SK Lo et al. | A software engineering perspective | 2020.12 |
| | Lyu et al. | Lyu et al. | Threats | 2020.3 |
| | Bellavista et al. | Bellavista et al. | Deployment environments | 2021.2 |
| | Zhan et al. | Yufeng Zhan et al. | Incentive mechanism design | 2021.3 |
| Performance improvement | Kulkarni et al. | Kulkarni et al. | Personalization techniques | 2020.3 |
| | Jin et al. | Yilun Jin et al. | Utilizing unlabeled data | 2020.5 |
| | Hu et al. | Sixu Hu et al. | Benchmark suite | 2020.10 |
| Embedding technology and application | Cui et al. | Cui et al. | FL for Internet of things | 2018.6 |
| | Lim et al. | Bryan Lim et al. | FL in mobile edge networks | 2020.2 |
| | Du et al. | Du et al. | FL for vehicular Internet of things | 2020.4 |
| | Saputra et al. | Saputra et al. | FL for electric vehicle networks | 2020.4 |
| | Aledhari et al. | Aledhari et al. | Enabling technologies, protocols, and applications | 2020.8 |
| | Tan et al. | Tan et al. | FL in vehicular networks | 2020.8 |
| | Wahab et al. | Wahab et al. | FL in communication and networking systems | 2021.2 |
The summary of acronyms and definitions
| Acronym | Definition |
|---|---|
| FL | Federated learning |
| | Horizontal federated learning |
| | Vertical federated learning |
| | Federated transfer learning |
| BCFL | The integration of Blockchain and federated learning |
| AI | Artificial intelligence |
| DDoS | Distributed denial of service |
| | Single point of failure |
| PoW | Proof of work |
| PoS | Proof of stake |
| | Delayed proof-of-work |
| DPoS | Delegated proof-of-stake |
| PBFT | Practical byzantine fault tolerance |
| | Delegated byzantine fault tolerance |
| | Verify the pooling |
| IoV | Internet of vehicles |
| IoT | Internet of things |
| | Digital twin wireless network |
| 5G | 5th Generation mobile networks |
| 6G | 6th Generation mobile networks |
Fig. 3 The category of data partition for FL
Fig. 4 The workflow of FL
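The figure itself is not reproduced in this record. As a rough illustration of the cooperative update described in the abstract, the sketch below runs federated averaging on a toy linear model: each client trains locally on its own data, and only the resulting weights are combined, weighted by local sample count. Federated averaging is one common aggregation rule and is an assumption here, not necessarily the exact workflow in the figure; the function names, toy data, learning rate, and round counts are all hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Client step: gradient descent on mean squared error for y = w*x + b."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return [w, b]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Server step: average the client weights, weighted by sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: Sequence[Sequence[Sample]], rounds: int = 30) -> List[float]:
    """Repeat: broadcast global weights, train locally, aggregate."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


# Hypothetical toy data: both clients observe points on y = 2x + 1,
# but neither ever sends its raw samples to the server.
CLIENTS = [
    [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)],
    [(-2.0, -3.0), (2.0, 5.0)],
]

if __name__ == "__main__":
    print(run_rounds(CLIENTS))
```

Only the two-element weight lists cross the client boundary in this sketch, which is the property the abstract highlights. Real deployments typically add secure aggregation, client sampling, and handling for non-IID data.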
The summary of open-source frameworks of FL
| Project | Publisher | Framework | Open source | Refs. | Github |
|---|---|---|---|---|---|
| | | TensorFlow | Code blocks | XXXX | |
| | Ryffel et al. | PyTorch | Code blocks | Ryffel et al. | |
| | WeBank | KubeFATE | API | XXXX | |
| | Baidu | PaddlePaddle | API | Ma et al. | |
| | University of Southern California | Worker-oriented program | API | He et al. | |
Fig. 5 The architecture of Blockchain
Taxonomy of Blockchain systems
| Blockchain | Participants | Characteristics | TPS |
|---|---|---|---|
| Public | Anyone | Decentralized | 3–20 data writes per second |
| Consortium | Authorized nodes | Partially centralized | 1000 data writes per second |
| Private | Authorized nodes | Centralized | 1000 data writes per second |
The summary of Consensus in Blockchain
| Consensus | Merits | Weakness |
|---|---|---|
| | Complete centralization, nodes free access | Waste of energy and difficult to reduce the confirmation time of blocks |
| | Simple algorithm | Prone to forking and needs to wait for multiple forks to reach consistency |
| | The cost of destruction is huge (the destroyer must exceed 50%) | |
| | Low performance requirements for nodes | No final consistency; needs a checkpoint mechanism to compensate for finality |
| | Short consensus time | |
| | Significantly reduces the number of nodes involved in validation | Sacrifices the concept of decentralization; not suitable for public chains |
| | Energy conservation | Slightly more centralized, e.g., participants with high equity can vote to make themselves validators |
| | Rapidity | |
| | High consensus efficiency for high-frequency trading | The existence of cryptocurrency and the incentive mechanism creates a Matthew effect in the community, making the poor poorer and the rich richer |
| | | The system will stop when only 33% of the nodes are left running |
| | Highly fault-tolerant, with bookkeeping done by multiple nodes | The system cannot provide services when more than one-third of the bookkeepers stop working |
| | Every block has finality | |
| | The algorithm has a strict mathematical proof that it will not bifurcate | |
| | No cryptocurrency required | Less decentralized |
| | Second-level consensus verification | |
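The one-third limit listed among the weaknesses above can be made concrete with a little arithmetic. The sketch below assumes the standard bound for Byzantine-fault-tolerant protocols of the PBFT family, namely that n nodes tolerate at most f faulty nodes when n >= 3f + 1, and it uses the general quorum size of floor((n + f) / 2) + 1, which equals 2f + 1 when n = 3f + 1. The function names are hypothetical, and the table does not say which protocol each row refers to, so this is an illustration rather than a description of any specific row.

```python
def max_tolerated_faults(n: int) -> int:
    """Largest f satisfying n >= 3f + 1 for a network of n nodes."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Votes needed so that any two quorums overlap in an honest node."""
    f = max_tolerated_faults(n)
    return (n + f) // 2 + 1


def can_make_progress(n: int, stopped: int) -> bool:
    """True if enough nodes are still running to form a quorum."""
    if not 0 <= stopped <= n:
        raise ValueError("stopped must be between 0 and n")
    return n - stopped >= quorum_size(n)


if __name__ == "__main__":
    for nodes in (4, 7, 10):
        print(nodes, max_tolerated_faults(nodes), quorum_size(nodes))
```

With four bookkeepers, for example, one may stop without halting the system, but two stopped nodes leave fewer than the three votes a quorum requires.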
Fig. 6 Smart contract
Fig. 7 The architecture of BCFL
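The architecture figure is not reproduced in this record. As one simplified reading of how a Blockchain can address the unreliable model uploads mentioned in the abstract, the sketch below records a digest of each client's model update in a hash-chained ledger, so that a later alteration of any recorded entry becomes detectable. This is an assumption for illustration only: it omits consensus, smart contracts, and incentives, and the names `Block`, `UpdateLedger`, and `digest` are hypothetical rather than taken from the surveyed systems.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

GENESIS_HASH = "0" * 64  # placeholder previous-hash for the first block


def digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class Block:
    index: int
    prev_hash: str
    client_id: str
    update_hash: str  # digest of the model update, not the update itself

    def block_hash(self) -> str:
        return digest({
            "index": self.index,
            "prev_hash": self.prev_hash,
            "client_id": self.client_id,
            "update_hash": self.update_hash,
        })


class UpdateLedger:
    """Append-only list of blocks, each linked to its predecessor's hash."""

    def __init__(self) -> None:
        self.blocks: List[Block] = []

    def record(self, client_id: str, update: List[float]) -> Block:
        prev = self.blocks[-1].block_hash() if self.blocks else GENESIS_HASH
        block = Block(len(self.blocks), prev, client_id, digest({"update": update}))
        self.blocks.append(block)
        return block

    def is_consistent(self) -> bool:
        """Check that every block points at the hash of the block before it."""
        expected_prev = GENESIS_HASH
        for position, block in enumerate(self.blocks):
            if block.index != position or block.prev_hash != expected_prev:
                return False
            expected_prev = block.block_hash()
        return True

    def matches(self, index: int, update: List[float]) -> bool:
        """Check a claimed update against the digest recorded at `index`."""
        return self.blocks[index].update_hash == digest({"update": update})
```

An aggregator could call `matches` before averaging an update, rejecting any upload whose digest differs from the recorded one; in a real BCFL system that check would typically live in a smart contract and the ledger would be replicated across nodes.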
The summary of Deployed Platform for BCFL
| Platform | Blockchain type | Consensus | Identity | Recent studies | Refs. |
|---|---|---|---|---|---|
| Ethereum | Public | | Anonymity | Nagar | |
| Hyperledger Fabric | Consortium | | Known identity | Zhang et al. | |
| EOS | Consortium | DPoS, BFT | Known identity | Martinez et al. | |
| Custom Blockchain | Private | PBFT | Known identity | Kim et al. | |
The summary of performance enhancements in BCFL
| Reinforcement | Proposed Model | Solutions | Simulation | Refs. |
|---|---|---|---|---|
| Performance | | Classification accuracy enhancement | MNIST digit recognition task; CIFAR-10 image classification task | Korkmaz et al. |
| | | Solving non-IID issues | Non-IID MNIST dataset | Jeong et al. |
| Efficiency | | Replace Oracle service with chaincode | Synthetic 2D dataset; EEG Eye State dataset | Drungilas et al. |
| | | Setting weight parameter | MNIST dataset | Kim and Hong |
| Security | | Re-encryption algorithms | FEMNIST dataset | Li et al. |
| | | Improved consensus | MNIST dataset | Kang et al. |
The summary of industrial applications of BCFL in emerging domains
| Application domains | Applicable data | Benefits | Related studies |
|---|---|---|---|
| Data processing in health care | Covid-19 data | Data security, auditability, and incentives | Dp et al. |
| | Transaction metadata | Addressing data heterogeneity | |
| | Medical data | Model robustness | |
| Anomaly detection in network security | Automatic encoder for anomaly detection | Data auditability | Preuveneers et al. |
| Device failure and anomaly detection in IoT | The movement data | High testing accuracy; high communication efficiency; complete privacy and anonymity | Zhao et al. |
| Internet of vehicles for trustworthy vehicular networks | Train running data; vehicle localization application data | High-quality parameter collection; high test accuracy; high communication efficiency; anti-attack | Otoum et al. |
| 5G & 6G secure communication | LP solver with GMI; communication network data | System reliability and security; improved data privacy; incentives and fairness | Liu et al. |
| Intelligent edge computing | MovieLens datasets; CASIA-WebFace | High communication efficiency; bandwidth optimization; privacy protection | Rehman et al. |
| Fog computing | Fog servers data | Decentralized privacy; poisoning attack proof; high efficiency | Qu et al. |
| Cognitive computing | CIFAR-10 dataset | Advanced validation; fast convergence | Qu et al. |
| Defence framework for sustainable society | "Airplane," "Bird," "Drone," and "Ship" from the different sources | Advanced validation; privacy protection | Sharma et al. |