CrowdSFL: A Secure Crowd Computing Framework Based on Blockchain and Federated Learning
Abstract
1. Introduction
- Requester. Typically, it is a server that hosts the crowd task. It is generally responsible for standardizing a task before releasing it. In some special tasks, it can even participate in the project as a unique worker.
- Workers. Crowdworkers lend their devices to execute the jobs. In crowd computing, the entities of workers are generally idle devices contributed by crowdworkers, such as mobile phones or laptops. When a worker chooses to participate in a crowdsourcing task, their devices will execute the tasks according to an agreement. When the devices are found to be idle, the jobs are processed following the (CPU) cycle-stealing scheme.
- Crowdsourcing Platform. Middleware for job management, together with the server and client applications. The server application is responsible for creating tasks, discovering suitable workers, and assigning and scheduling tasks to the designated workers. It also collects the results from multiple workers, assembles them, and updates them in the server application, among other purposes. The platform handles a large amount of work, so it usually needs strong computing performance or is deployed on the cloud.
- Network. All devices communicate through the Internet or a local area network. Traditional crowdsourcing applications generally use the Internet as the network and a platform as the center that controls all devices. This paper proposes a blockchain-based crowdsourcing paradigm, which is explained in Section 4.
- We transplant the entire crowdsourcing system onto the blockchain, and each participant is an account in the blockchain. To make full use of the advantages of the blockchain, a data interaction mode controlled by smart contracts running on Ethereum is proposed. The code in the contract ensures that the data are uploaded and saved in blocks in the correct format.
- We propose crowd computing with federated learning as the computing paradigm. Unlike previous paradigms, the content exchanged in this crowdsourcing system is a gradient value or a model. Using the intermediate training values of federated learning as the exchanged content can effectively protect user privacy. At the same time, in each round of federated learning, the gradients submitted by workers are judged by the requester, and the result is fed back to the platform for the distribution of rewards. Every worker who effectively participates in crowdsourcing can get a certain reward. This provides another way of distributing rewards for future crowdsourcing systems. As far as we know, this is the first time that federated learning has been adopted in crowd computing.
- We propose a new re-encryption algorithm based on the blinding property of ElGamal. With this algorithm, as long as the requester and the platform do not collude, the gradient or model values submitted by the workers are not leaked to the platform or the requester. It enables the decentralized blockchain platform to ensure data confidentiality, anonymity, and accountability during data interaction. Our algorithm does not rely on ciphertext homomorphism to protect privacy, so compared with some recent similar works based on homomorphic encryption [16,17], it does not add much computation overhead.
2. Related Work
2.1. Crowdsourcing in Adversarial Settings
2.2. Privacy Preserving in Distributed Computing
2.3. Blockchain Meets Traditional Schemes
3. Main Components
3.1. Crowdsourcing System
3.2. Federated Learning
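Algorithm 1: FederatedAveraging. C is the global batch size; B is the local batch size; the K clients are indexed by k; E is the number of local epochs; and $\eta$ is the learning rate. Here $n_k$ denotes the number of local samples on client k and $n = \sum_k n_k$. |
ServerAggregation: // Run on server |
initialize $w_0$ |
for each round t = 1, 2, ... do |
for each client k in parallel do |
$w_{t+1}^{k} \leftarrow$ ClientUpdate$(k, w_t)$ |
$w_{t+1} \leftarrow \sum_{k=1}^{K} \frac{n_k}{n} w_{t+1}^{k}$ |
ClientUpdate$(k, w)$: // Run on client k |
$\mathcal{B} \leftarrow$ (split the local data into batches of size B) |
for each local epoch i from 1 to E do |
for batch $b \in \mathcal{B}$ do |
$w \leftarrow w - \eta \nabla \ell(w; b)$ |
return w to server |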
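To make the two procedures of Algorithm 1 concrete, the following is a minimal sketch in plain Python. It assumes a one-parameter model y = w * x with squared loss, three hypothetical clients, and illustrative hyperparameters; none of these choices come from the paper, and the function names are ours.

```python
import random

def client_update(w, data, epochs, batch_size, lr):
    """Local SGD on one client for the toy model y = w * x (squared loss)."""
    data = list(data)
    for _ in range(epochs):
        random.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean of (w*x - y)^2 over the batch, w.r.t. w.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w

def server_aggregation(client_weights, client_sizes):
    """Average the client models, weighting each by its local sample count."""
    total = sum(client_sizes)
    return sum(w * n / total for w, n in zip(client_weights, client_sizes))

def federated_averaging(clients, rounds=20, epochs=2, batch_size=4, lr=0.05):
    w = 0.0  # initial global model
    for _ in range(rounds):
        local = [client_update(w, d, epochs, batch_size, lr) for d in clients]
        w = server_aggregation(local, [len(d) for d in clients])
    return w

random.seed(0)
# Three hypothetical clients holding noiseless samples of y = 3x.
clients = [[(x / 10, 3 * x / 10) for x in range(1, n + 1)] for n in (8, 12, 20)]
w_final = federated_averaging(clients)
```

Only the model parameter leaves each client; the raw samples in `clients` are never passed to `server_aggregation`, which is the property the framework relies on.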
3.3. Blockchain
- Distributed database. Every node in the blockchain network keeps a copy of all transactions, and others can verify any transactions between two parties. This redundant storage strategy improves tamper-proofing of data and traceability by sacrificing storage space.
- Righteous programmatic platform. There are two types of accounts in Ethereum: externally owned accounts, controlled by private keys, and contract accounts, controlled by contract code [35]. The subject of a contract account is a smart contract, which allows an external account to write input in the transaction data and obtain the output by calling the contract. A contract cannot be changed once published, so the creator must carefully examine its contents before publishing it.
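The tamper-evidence property mentioned above can be illustrated with a hash-linked list of blocks. This is a simplified sketch, not Ethereum's actual block structure: the field names and the `block_hash`, `append_block`, and `is_valid` helpers are hypothetical, and consensus and replication are omitted.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder predecessor hash for the first block

def block_hash(index, prev_hash, transactions):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps([index, prev_hash, transactions], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, transactions):
    """Link a new block to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })

def is_valid(chain):
    """Recompute every hash and check every link."""
    prev_hash = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True

chain = []
append_block(chain, ["worker_1 submits ciphertext"])
append_block(chain, ["CSP submits blinded ciphertext"])
ok_before = is_valid(chain)
chain[0]["transactions"][0] = "forged entry"  # tamper with an old block
ok_after = is_valid(chain)
```

Because each block stores the hash of its predecessor, changing an old transaction invalidates the stored hash of that block, and any node holding a copy can detect it by recomputation.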
3.4. Secure Re-Encryption Scheme
- KeyGen: In the key generation algorithm, a multiplicative cyclic group of prime order p with a generator g is chosen such that the discrete logarithm problem over the group is hard. Then, the CSP and the REQ each choose a secret number as their private key and compute the corresponding public key, so that there exist two public keys. Moreover, the two parties negotiate a special Diffie–Hellman public key.
- Enc: The encryption algorithm takes the public key and a message as inputs, randomly chooses a number, and outputs a ciphertext.
- ReEnc: The CSP uses its private key for the re-encryption operation, which is similar to the common decryption process in the ElGamal encryption system, and outputs a new ciphertext.
- Blinding: The blinding algorithm takes a temporary random number as the random factor and the corresponding ciphertext as inputs, and the CSP outputs a new blinded ciphertext.
- Dec: The REQ executes the decryption algorithm. It takes its private key and the blinded ciphertext as inputs and outputs the message.
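The five algorithms above can be sketched as follows. The exact formulas of the scheme are not reproduced in this text, so this is only one ElGamal-style construction that matches the description, not necessarily the authors' scheme. It assumes that the Diffie–Hellman key is g^(ab), that ReEnc raises the first ciphertext component to the CSP key, and that Blinding re-randomizes under the REQ public key. The parameters `P`, `Q`, `G` are toy values chosen for readability and are far too small for real use; messages are plain integers in [1, P-1].

```python
import secrets

# Toy group: P = 2Q + 1 with P and Q prime; G = 4 generates the
# subgroup of prime order Q in the multiplicative group modulo P.
P, Q, G = 1019, 509, 4

def keygen():
    a = 1 + secrets.randbelow(Q - 1)   # CSP private key
    b = 1 + secrets.randbelow(Q - 1)   # REQ private key
    pk_csp, pk_req = pow(G, a, P), pow(G, b, P)
    pk_dh = pow(pk_req, a, P)          # assumed Diffie-Hellman key g^(ab)
    return a, b, pk_csp, pk_req, pk_dh

def enc(pk_dh, m):
    r = 1 + secrets.randbelow(Q - 1)
    return pow(G, r, P), (m * pow(pk_dh, r, P)) % P   # (g^r, m * g^(abr))

def re_enc(a, c):
    c1, c2 = c
    return pow(c1, a, P), c2                          # (g^(ar), m * g^(abr))

def blinding(pk_req, s, c):
    c1, c2 = c
    # Re-randomize so the output cannot be matched to the uploaded ciphertext.
    return (c1 * pow(G, s, P)) % P, (c2 * pow(pk_req, s, P)) % P

def dec(b, c):
    c1, c2 = c
    return (c2 * pow(pow(c1, b, P), -1, P)) % P       # c2 / c1^b

a, b, pk_csp, pk_req, pk_dh = keygen()
message = 123
uploaded = enc(pk_dh, message)                        # worker side
s = 1 + secrets.randbelow(Q - 1)                      # temporary random factor
blinded = blinding(pk_req, s, re_enc(a, uploaded))    # CSP side
recovered = dec(b, blinded)                           # REQ side
```

In this sketch the CSP never sees the message, since removing the mask requires b, and the REQ only sees a re-randomized ciphertext. A real deployment would also need an encoding from gradient values to group elements, which is outside the scope of this example.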
4. Preliminaries
4.1. Overview
- Quality issues. Since crowdsourcing workers are usually Internet crowds, it is hard to ensure that each worker will comply with the agreement. In some adversarial environments, malicious participants violate the agreement and launch attacks, such as submitting a large number of low-quality solutions or stealing others' solutions and claiming them as their own. These behaviors often lead to poor quality of crowdsourcing solutions.
- Rewards distribution issues. The ultimate purpose of workers is to get rewards by completing tasks. Some of the currently known distribution mechanisms reward only the worker who submitted the best solution, but this is unfair to other workers who have worked hard. The most reasonable distribution is one in which every worker gets the reward they deserve according to their own workload and the quality of their solution.
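As an illustration of the second point, a reward pool can be split in proportion to a per-worker score instead of being paid to a single winner. The proportional rule, the `distribute_rewards` helper, and the example scores below are assumptions made for this sketch; they are not the evaluation mechanism of the paper.

```python
def distribute_rewards(pool, scores):
    """Split `pool` in proportion to each worker's non-negative score."""
    total = sum(scores.values())
    if total <= 0:
        # Nobody contributed anything useful, so nothing is paid out.
        return {worker: 0.0 for worker in scores}
    return {worker: pool * score / total for worker, score in scores.items()}

# Hypothetical quality scores, e.g. derived from test accuracy.
scores = {"worker_a": 0.9, "worker_b": 0.6, "worker_c": 0.0}
rewards = distribute_rewards(100.0, scores)
```

Under such a rule, a worker whose submission has no value receives nothing, while every other worker receives a share that grows with the measured quality of their contribution.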
4.2. Smart Contract
- (1) The data submitted by the participants conform to the correct format.
- (2) The data are stored publicly, and participants can retrieve the entire data from the blockchain.
Algorithm 2: Submitting Data. |
(Function 1) SubmitMetadata |
(Function 2) WorkerSubmit: // round t, worker i, worker’s account number |
If now < then |
(Function 3) CSPSubmit: // after Blinding, each ciphertext is uniquely numbered by CSP |
If then |
else |
(Function 4) REQSubmit: // stands for model test accuracy results |
If then |
else |
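The two guarantees of the contract (format checking and public retrieval), together with the deadline test in WorkerSubmit, can be mimicked off-chain as below. This is an in-memory stand-in written in Python rather than contract code. The class name, the field names, and the rule that a ciphertext is a pair of non-negative integers are assumptions made for illustration.

```python
class SubmissionContract:
    """In-memory stand-in for the submission contract (hypothetical names)."""

    def __init__(self, deadline):
        self.deadline = deadline  # worker uploads are accepted only before this time
        self.records = []         # public, append-only storage

    @staticmethod
    def _well_formed(ciphertext):
        # Assumed format: a pair of non-negative integers.
        return (isinstance(ciphertext, tuple) and len(ciphertext) == 2
                and all(isinstance(part, int) and part >= 0 for part in ciphertext))

    def worker_submit(self, round_t, worker_id, account, ciphertext, now):
        """Store the upload if it arrives in time and has the expected format."""
        if not now < self.deadline:
            return False
        if not self._well_formed(ciphertext):
            return False
        self.records.append({"round": round_t, "worker": worker_id,
                             "account": account, "ciphertext": ciphertext})
        return True

    def retrieve(self, round_t):
        """Anyone can read back every record stored for a given round."""
        return [record for record in self.records if record["round"] == round_t]

contract = SubmissionContract(deadline=100)
accepted = contract.worker_submit(1, 0, "0xA1", (17, 42), now=50)
late = contract.worker_submit(1, 1, "0xB2", (5, 9), now=150)
malformed = contract.worker_submit(1, 2, "0xC3", "not-a-ciphertext", now=60)
```

On a real chain the current time would come from the block timestamp rather than a `now` argument, and rejected calls would revert instead of returning `False`.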
4.3. Model Data Interaction via Blockchain
Algorithm 3: Model-Data-Interaction. |
Workers: |
for each worker in parallel do |
ClientUpdate |
Enc |
Call the contract to upload to the chain |
CSP: |
ReEnc |
Blinding |
Call the contract to upload to the chain |
REQ: |
Dec |
If is available, then |
ServerAggregation |
Call the contract to upload to the chain |
5. Framework Architecture
5.1. System Model
5.2. Evaluation Mechanism
5.3. Threat Model
- No ciphertext can be decrypted without the corresponding private key.
- Participants cannot find the rules of the information on the blockchain. For example, REQ can find the corresponding gradient plaintext through the blinded gradient ciphertext.
- Malicious REQ. REQ should release the reward pool he provides before the task is performed. Since the purpose of a malicious REQ is to obtain a useful solution while reducing his asset consumption, he may exploit some features of crowdsourcing to achieve it. For example, a malicious REQ can recruit fake workers to participate in crowdsourcing and misrepresent the solutions submitted by these workers as high-quality solutions. After crowdsourcing, these recruited workers can theoretically receive more rewards, so that the malicious REQ recovers some assets from the reward pool.
- Malicious Workers. Malicious workers attempt to obtain rewards without paying sufficient effort, which is a free-riding attack. Workers who scored low during the evaluation may try to change their scores in other ways, such as denying or even forking the existing blockchain. In addition, malicious workers can directly grab other workers' solutions and claim them as their own, so as to reap without sowing.
- Malicious Miners. Malicious miners attempt to disrupt the normal execution of programs on the blockchain by forking chains or collaborating with malicious participants, thereby achieving their attack goals.
6. Performance Evaluation and Analysis
6.1. Experiment Design
6.2. Time Overhead
6.3. Communication Overhead
6.4. Analysis
7. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
Abbreviations
P2P network | Peer-to-Peer network |
BP-NN | Back Propagation Neural Network |
CSP | Crowdsourcing platform |
REQ | Requester |
References
- Howe, J. The rise of crowdsourcing. Wired Mag. 2006, 53, 1–4.
- Albarqouni, S.; Baur, C.; Achilles, F.; Belagiannis, V.; Demirci, S.; Navab, N. Aggnet: Deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Trans. Med. Imaging 2016, 35, 1313–1321.
- Yuan, D.; Li, Q.; Li, G.; Wang, Q.; Ren, K. PriRadar: A privacy-preserving framework for spatial crowdsourcing. IEEE Trans. Inf. Forensics Secur. 2019, 15, 299–314.
- Beberg, A.L.; Ensign, D.L.; Jayachandran, G.; Khaliq, S.; Pande, V.S. Folding@home: Lessons from eight years of volunteer distributed computing. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, Rome, Italy, 23–29 May 2009; pp. 1–8.
- Balicki, J.; Brudlo, P.; Szpryngier, P. Crowdsourcing and volunteer computing as distributed approach for problem solving. In Proceedings of the 13th International Conference on Software Engineering, Parallel and Distributed Systems, Athens, Greece, 15 May 2014; pp. 115–121.
- Murray, D.G.; Yoneki, E.; Crowcroft, J.; Hand, S. The case for crowd computing. In Proceedings of the Second ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds, New York, NY, USA, 30 August 2010; pp. 39–44.
- Uber Conceals 57 Million User Information Leaks Last Year. 2017. Available online: https://cn.technode.com/post/2017--11--22/uber--cyberattack--users--data--exposed--57--million/ (accessed on 10 April 2020).
- Elance and Odesk Hit by Ddos. 2014. Available online: https://gigaom.com/2014/03/18/elance--hit--by--major--ddos--attack--downing--service--for--many--freelancers/ (accessed on 1 May 2019).
- Hao, J.; Huang, C.; Chen, G.; Xian, M.; Shen, X.S. Privacy-Preserving Interest-Ability Based Task Allocation in Crowdsourcing. In Proceedings of the IEEE International Conference on Communications 2019, Shanghai, China, 20–24 May 2019; pp. 1–6.
- Rong, H.; Wang, H.M.; Liu, J.; Xian, M. Privacy-preserving k-nearest neighbor computation in multiple cloud environments. IEEE Access 2016, 4, 9589–9603.
- Wu, W.; Liu, J.; Rong, H.; Xian, M. Efficient k-nearest neighbor classification over semantically secure hybrid encrypted cloud database. IEEE Access 2018, 6, 41771–41784.
- Tang, F.; Wu, W.; Liu, J.; Wang, H.; Xian, M. Privacy-preserving distributed deep learning via homomorphic re-encryption. Electronics 2019, 8, 411.
- Yang, D.; Xue, G.; Fang, X.; Tang, J. Crowdsourcing to smartphones: Incentive mechanism design for mobile phone sensing. In Proceedings of the 18th Annual International Conference on Mobile Computing and Networking, New York, NY, USA, 22–26 August 2012; pp. 173–184.
- Peng, D.; Wu, F.; Chen, G. Pay as how well you do: A quality based incentive mechanism for crowdsensing. In Proceedings of the 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing, New York, NY, USA, 22–25 June 2015; pp. 177–186.
- Li, M.; Weng, J.; Yang, A.; Lu, W.; Zhang, Y.; Deng, R.H. Crowdbc: A blockchain-based decentralized framework for crowdsourcing. IEEE Trans. Parallel Distrib. Syst. 2018, 30, 1251–1266.
- Shen, M.; Tang, X.; Zhu, L.; Du, X.; Guizani, M. Privacy-preserving support vector machine training over blockchain-based encrypted IoT data in smart cities. IEEE Internet Things J. 2019, 6, 7702–7712.
- Shen, M.; Zhang, J.; Zhu, L.; Xu, K.; Tang, X. Secure svm training over vertically-partitioned datasets using consortium blockchain for vehicular social networks. IEEE Trans. Veh. Technol. 2019.
- Wang, W.; He, Z.; Shi, P.; Wu, W.; Jiang, Y.; An, B.; Chen, B. Strategic social team crowdsourcing: Forming a team of truthful workers for crowdsourcing in social networks. IEEE Trans. Mobile Comput. 2018, 18, 1419–1432.
- Li, M.; Sun, X.; Wang, H.; Zhang, Y.; Zhang, J. Privacy-aware access control with trust management in web service. World Wide Web 2011, 14, 407–430.
- Evfimievski, A.; Srikant, R.; Agrawal, R. Privacy preserving mining of association rules. Inf. Syst. 2004, 29, 343–364.
- Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Oliveira, R.G. Advances and open problems in federated learning. arXiv 2019, arXiv:1912.04977.
- Aono, Y.; Hayashi, T.; Wang, L.; Moriai, S. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 2018, 13, 1333–1345.
- Ding, W.; Yan, Z.; Deng, R.H. Encrypted data processing with homomorphic re-encryption. Inf. Sci. 2017, 409, 35–55.
- Lu, Y.; Tang, Q.; Wang, G. Zebralancer: Private and anonymous crowdsourcing system atop open blockchain. In Proceedings of the 2018 IEEE 38th International Conference on Distributed Computing Systems, Vienna, Austria, 2–5 July 2018; pp. 853–865.
- Zheng, B.K.; Zhu, L.H.; Shen, M.; Gao, F.; Zhang, C.; Li, Y.D.; Yang, J. Scalable and privacy-preserving data sharing based on blockchain. J. Comput. Sci. Technol. 2018, 33, 557–567.
- Kuo, T.T.; Gabriel, R.A.; Ohno-Machado, L. Fair compute loads enabled by blockchain: Sharing models by alternating client and server roles. J. Am. Med. Inf. Assoc. 2019, 26, 392–403.
- Preuveneers, D.; Rimmer, V.; Tsingenopoulos, I.; Spooren, J.; Joosen, W.; Ilie-Zudor, E. Chained anomaly detection models for federated learning: An intrusion detection case study. Appl. Sci. 2018, 8, 2663.
- Weng, J.; Weng, J.; Zhang, J.; Li, M.; Zhang, Y.; Luo, W. Deepchain: Auditable and privacy-preserving deep learning with blockchain-based incentive. IEEE Trans. Dependable Secur. Comput. 2019.
- Chen, X.; Ji, J.; Luo, C.; Liao, W.; Li, P. When machine learning meets blockchain: A decentralized, privacy-preserving and secure design. In Proceedings of the 2018 IEEE International Conference on Big Data, Seattle, WA, USA, 10–13 December 2018; pp. 1178–1187.
- Zhang, S.; Wu, J.; Lu, S. Minimum makespan workload dissemination in DTNs: Making full utilization of computational surplus around. In Proceedings of the Fourteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing, New York, NY, USA, 30 July 2013; pp. 293–296.
- Cheung, M.H.; Southwell, R.; Hou, F.; Huang, J. Distributed time-sensitive task selection in mobile crowdsensing. In Proceedings of the 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing, New York, NY, USA, 13–18 June 2015; pp. 157–166.
- Konečný, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated optimization: Distributed machine learning for on-device intelligence. arXiv 2016, arXiv:1610.02527.
- McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S. Communication-efficient learning of deep networks from decentralized data. arXiv 2016, arXiv:1602.05629.
- Nakamoto, S. A Peer-to-Peer Electronic Cash System. Bitcoin 2008. Available online: https://bitcoin.org/bitcoin.pdf (accessed on 15 March 2020).
- Buterin, V. Ethereum White Paper: A Next Generation Smart Contract & Decentralized Application Platform. Ethereum White Paper. 2014, pp. 1–36. Available online: https://blockchainlab.com/pdf/Ethereum_white_paper-a_next_generation_smart_contract_and_decentralized_application_platform-vitalik-buterin.pdf (accessed on 12 April 2020).
- Chen, N. A Comparison of El Gamal and Paillier. Cryptosystems. 2018. Available online: https://pdfs.semanticscholar.org/2939/17317f092818c23392b8be9bc662a5bb9f9e.pdf (accessed on 2 March 2020).
- Szabo, N. Formalizing and securing relationships on public networks. First Monday 1997, 2.
- Wu, W.; Tsai, W.T.; Li, W. An evaluation framework for software crowdsourcing. Front. Comput. Sci. 2013, 7, 694–709.
- Han, S.; Mao, H.; Dally, W.J. Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv 2015, arXiv:1510.00149.
- Konečný, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated learning: Strategies for improving communication efficiency. arXiv 2016, arXiv:1610.05492.
Dataset | CrowdSFL | BP-NN | SecureSVM [16] |
---|---|---|---|
BCWD | 91.42% | 94.29% | 90.35% |
HDD | 93.94% | 95.65% | 93.89% |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, Z.; Liu, J.; Hao, J.; Wang, H.; Xian, M. CrowdSFL: A Secure Crowd Computing Framework Based on Blockchain and Federated Learning. Electronics 2020, 9, 773. https://doi.org/10.3390/electronics9050773