Coded Parallel Transmission for Half-Duplex Distributed Computing
Abstract
:1. Introduction
1.1. Related Work
1.2. Our Contribution
2. System Model and Problem Definitions
2.1. Network Model
2.2. Mapreduce Process Description
2.2.1. Map Phase
2.2.2. Shuffle Phase
2.2.3. Reduce Phase
2.3. Examples: The Uncoded Scheme and the CDC Scheme
3. Main Results
4. The Proposed Coded Distributed Computing Scheme
4.1. Map Phase
4.2. Shuffle Phase
Group Partitioning Strategy
4.3. Data Shuffle Strategy
Algorithm 1 Distributed computing process of parallel coding in lossless scenarios. |
1: , , if , and |
2: for do |
3: for do |
4: |
5: |
6: Split as |
7: for do |
8: Node k sends the to the nodes in with |
9: end for |
10: for do |
11: Node j decodes the desired parts as follows: |
12: end for |
13: end for |
14: end for |
15: for , do |
16: . |
17: end for |
4.4. Reduce Phase
4.5. Analysis of Communication Delay
4.6. Illustrative Examples
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Cristea, V.; Dobre, C.; Stratan, C.; Pop, F.; Costan, A. Large-Scale Distributed Computing and Applications: Models and Trends Information Science Reference; IGI Publishing: Hershey, PA, USA, 2010. [Google Scholar]
- Nikoletseas, S.; Rolim, J.D. Theoretical Aspects of Distributed Computing in Sensor Networks; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
- Corbett, J.C.; Dean, J.; Epstein, M.; Fikes, A.; Frost, C.; Furman, J.J.; Ghemawat, S.; Gubarev, A.; Heiser, C.; Hochschild, P.; et al. Spanner: Google’s globally distributed database. ACM Trans. Comput. Syst. 2013, 31, 1–22. [Google Scholar] [CrossRef] [Green Version]
- Valentini, G.L.; Lassonde, W.; Khan, S.U.; Min-Allah, N.; Madani, S.A.; Li, J.; Zhang, L.; Wang, L.; Ghani, N.; Kolodziej, J.; et al. An overview of energy efficiency techniques in cluster computing systems. Clust. Comput. 2013, 16, 3–15. [Google Scholar] [CrossRef] [Green Version]
- Sadashiv, N.; Kumar, S.D. Cluster, grid and cloud computing: A detailed comparison. In Proceedings of the 2011 6th International Conference on Computer Science Education (ICCSE), Singapore, 3–5 August 2011; pp. 477–482. [Google Scholar]
- Hussain, H.; Malik, S.U.R.; Hameed, A.; Khan, S.U.; Bickler, G.; Min-Allah, N.; Qureshi, M.B.; Zhang, L.; Yongji, W.; Ghani, N.; et al. A survey on resource allocation in high performance distributed computing systems. Parallel Comput. 2013, 39, 709–736. [Google Scholar] [CrossRef] [Green Version]
- Idrissi, H.K.; Kartit, A.; El Marraki, M. A taxonomy and survey of cloud computing. In Proceedings of the 2013 National Security Days (JNS3), Rabat, Morocco, 26–27 April 2013; pp. 1–5. [Google Scholar]
- Dean, J.; Ghemawat, S. Mapreduce: Simplified data processing on large clusters. Commun. ACM 2008, 51, 107–113. [Google Scholar] [CrossRef]
- Jiang, D.; Ooi, B.C.; Shi, L.; Wu, S. The performance of mapreduce: An in-depth study. Proc. VLDB Endow. 2010, 3, 472–483. [Google Scholar] [CrossRef] [Green Version]
- Ahmad, F.; Chakradhar, S.T.; Raghunathan, A.; Vijaykumar, T.N. Tarazu: Optimizing mapreduce on heterogeneous clusters. ACM SIGARCH Comput. Archit. News 2012, 40, 61–74. [Google Scholar] [CrossRef]
- Guo, Y.; Rao, J.; Cheng, D.; Zhou, X. ishuffle: Improving hadoop performance with shuffle-on-write. IEEE Trans. Parallel Distrib. Syst. 2016, 28, 1649–1662. [Google Scholar] [CrossRef]
- Chowdhury, M.; Zaharia, M.; Ma, J.; Jordan, M.I.; Stoica, I. Managing data transfers in computer clusters with orchestra. ACM SIGCOMM Comput. Commun. Rev. 2011, 41, 98–109. [Google Scholar] [CrossRef]
- Zhang, Z.; Cherkasova, L.; Loo, B.T. Performance modeling of mapreduce jobs in heterogeneous cloud environments. In Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, Washington, DC, USA, 9–12 December 2013; pp. 839–846. [Google Scholar]
- Georgiou, Z.; Symeonides, M.; Trihinas, D.; Pallis, G.; Dikaiakos, M.D. Streamsight: A query-driven framework for streaming analytics in edge computing. In Proceedings of the 2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing (UCC), Zurich, Switzerland, 17–20 December 2018. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
- Attia, M.A.; Tandon, R. Combating computational heterogeneity in large-scale distributed computing via work exchange. arXiv 2017, arXiv:1711.08452. [Google Scholar]
- Wang, D.; Joshi, G.; Wornell, G. Using straggler replication to reduce latency in large-scale parallel computing. ACM Sigmetrics Perform. Eval. Rev. 2015, 43, 7–11. [Google Scholar] [CrossRef]
- Ahmad, F.; Lee, S.; Thottethodi, M.; Vijaykumar, T.N. Mapreduce with communication overlap (marco). J. Parallel Distrib. Comput. 2013, 73, 608–620. [Google Scholar] [CrossRef] [Green Version]
- Nicolae, B.; Costa, C.H.; Misale, C.; Katrinis, K.; Park, Y. Leveraging adaptive i/o to optimize collective data shuffling patterns for big data analytics. IEEE Trans. Parallel Distrib. Syst. 2016, 28, 1663–1674. [Google Scholar] [CrossRef] [Green Version]
- Yu, W.; Wang, Y.; Que, X.; Xu, C. Virtual shuffling for efficient data movement in mapreduce. IEEE Trans. Comput. 2013, 64, 556–568. [Google Scholar] [CrossRef]
- Zaharia, M.; Borthakur, D.; Sen Sarma, J.; Elmeleegy, K.; Shenker, S.; Stoica, I. Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling. In Proceedings of the 5th European Conference on Computer Systems, Paris, France, 13–16 April 2010; pp. 265–278. [Google Scholar]
- Maddah-Ali, M.A.; Niesen, U. Fundamental limits of caching. IEEE Trans. Inf. Theory 2014, 60, 2856–2867. [Google Scholar] [CrossRef] [Green Version]
- Maddah-Ali, M.A.; Niesen, U. Decentralized coded caching attains order-optimal memory-rate tradeoff. IEEE/ACM Trans. Netw. 2015, 23, 1029–1040. [Google Scholar] [CrossRef] [Green Version]
- Li, S.; Maddah-Ali, M.A.; Yu, Q.; Avestimehr, A.S. A fundamental tradeoff between computation and communication in distributed computing. IEEE Trans. Inf. Theory 2018, 64, 109–128. [Google Scholar] [CrossRef]
- Shariatpanahi, S.P.; Motahari, S.A.; Khalaj, B.H. Multi-server coded caching. IEEE Trans. Inf. Theory 2016, 62, 7253–7271. [Google Scholar] [CrossRef] [Green Version]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zai, Q.; Yuan, K.; Wu, Y. Coded Parallel Transmission for Half-Duplex Distributed Computing. Information 2022, 13, 342. https://doi.org/10.3390/info13070342
Zai Q, Yuan K, Wu Y. Coded Parallel Transmission for Half-Duplex Distributed Computing. Information. 2022; 13(7):342. https://doi.org/10.3390/info13070342
Chicago/Turabian StyleZai, Qixuan, Kai Yuan, and Youlong Wu. 2022. "Coded Parallel Transmission for Half-Duplex Distributed Computing" Information 13, no. 7: 342. https://doi.org/10.3390/info13070342