Parallel Particle Swarm Optimization Based on Spark for Academic Paper Co-Authorship Prediction
Abstract
1. Introduction
2. Background and Related Work
2.1. Particle Swarm Optimization Algorithm
2.2. Co-Authorship Prediction
2.3. Apache Spark
2.4. Related Work
3. Experimental Design
3.1. Data
3.2. Experimental Environment
3.3. Fitness Evaluation
3.4. Spark Implementation of Particle Swarm Algorithm
Algorithm 1 Spark-PSO Algorithm
Algorithm 2 calculateFitness Algorithm
Algorithm 3 updateParticle Algorithm
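As a rough illustration of what a particle update step involves, the following is a minimal serial sketch of the canonical inertia-weight PSO update. It is not the Spark implementation named in the algorithm captions above. The function names, the parameter values `w`, `c1`, and `c2`, and the sphere objective are all assumptions chosen for illustration; the co-authorship fitness function is not reproduced.

```python
import random


def update_particle(position, velocity, pbest, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """Canonical PSO update for one particle (w, c1, c2 are assumed values)."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Inertia term + pull toward personal best + pull toward global best.
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


def sphere(x):
    """Hypothetical stand-in objective to minimize (not the paper's fitness)."""
    return sum(xi * xi for xi in x)


def run_pso(fitness, dim=3, swarm_size=10, iterations=50, seed=0):
    """Serial PSO loop; returns the best position found and its score."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm_size)]
    velocities = [[0.0] * dim for _ in range(swarm_size)]
    pbests = [p[:] for p in positions]
    pbest_scores = [fitness(p) for p in positions]
    best = min(range(swarm_size), key=lambda i: pbest_scores[i])
    gbest, gbest_score = pbests[best][:], pbest_scores[best]
    for _ in range(iterations):
        for i in range(swarm_size):
            positions[i], velocities[i] = update_particle(
                positions[i], velocities[i], pbests[i], gbest, rng
            )
            score = fitness(positions[i])
            if score < pbest_scores[i]:
                pbests[i], pbest_scores[i] = positions[i][:], score
                if score < gbest_score:
                    gbest, gbest_score = positions[i][:], score
    return gbest, gbest_score


best_position, best_score = run_pso(sphere)
print(best_position, best_score)
```

Given a fixed global best, each particle's update and fitness evaluation is independent of the others, which is what makes the step a natural candidate for a parallel map over particles. This serial sketch refreshes the global best inside the loop; a parallel version would more likely refresh it once per iteration after collecting all particle results.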
4. Experiment and Result Analysis
Experimental Evaluation Methods and Results
5. Summary and Outlook
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Topological Similarity Index | Abbreviation |
---|---|
Jaccard Index | J |
Adamic-Adar Coefficient | A |
Common Neighbors | C |
Average Path Weight | P |
Katz | K |
Preferential Attachment | Pr |
Resource Allocation | R |
Hub Promoted Index | Hp |
Hub Depressed Index | Hd |
Leicht-Holme-Newman Index | L |
Salton Index | Sa |
Sørensen Index | So |
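Several of the neighborhood-based indices in the table above can be sketched in a few lines, assuming their widely used textbook definitions in terms of neighbor sets and node degrees. The toy graph and the function name `similarity_indices` are hypothetical. Katz and Average Path Weight are omitted because they depend on path enumeration rather than on immediate neighbors.

```python
import math

# Hypothetical toy co-authorship graph: author -> set of co-authors (undirected).
GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}


def similarity_indices(graph, x, y):
    """Neighborhood-based similarity scores for the node pair (x, y).

    Keys follow the abbreviations in the table; formulas are the standard
    textbook definitions (an assumption, not taken from the source).
    """
    nx, ny = graph[x], graph[y]
    common = nx & ny
    union = nx | ny
    kx, ky = len(nx), len(ny)
    cn = len(common)
    return {
        "C": cn,
        "J": cn / len(union) if union else 0.0,
        # A common neighbor of two distinct nodes has degree >= 2, so log > 0;
        # the guard only protects against malformed input.
        "A": sum(1.0 / math.log(len(graph[z])) for z in common if len(graph[z]) > 1),
        "R": sum(1.0 / len(graph[z]) for z in common),
        "Pr": kx * ky,
        "Sa": cn / math.sqrt(kx * ky) if kx * ky else 0.0,
        "So": 2.0 * cn / (kx + ky) if kx + ky else 0.0,
        "Hp": cn / min(kx, ky) if min(kx, ky) else 0.0,
        "Hd": cn / max(kx, ky) if max(kx, ky) else 0.0,
        "L": cn / (kx * ky) if kx * ky else 0.0,
    }


scores = similarity_indices(GRAPH, "b", "d")
for name, value in scores.items():
    print(name, round(value, 4))
```

In a link prediction setting, scores like these are computed for candidate author pairs that are not yet connected, and higher scores are read as a higher likelihood of a future co-authorship.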
Year | File Name | File Size | Number of Nodes | Number of Edges | Sparse Matrix File Size |
---|---|---|---|---|---|
1986 | 1986.txt | 1018.9 KB | 21,776 | 68,179 | 24.1 GB |
1987 | 1987.txt | 1.2 MB | 25,224 | 80,253 | 32.4 GB |
1988 | 1988.txt | 1.4 MB | 29,746 | 95,299 | 45.0 GB |
1989 | 1989.txt | 1.5 MB | 32,368 | 102,639 | 53.3 GB |
1990 | 1990.txt | 1.8 MB | 39,004 | 124,185 | 77.4 GB |
Category | Item | Specification |
---|---|---|
Hardware | CPU | 40 Intel(R) Xeon(R) Gold 5215 CPU @ 2.50 GHz |
Hardware | Memory | 240 GB |
Software | Operating System | Ubuntu 18.04.1 |
Software | Spark | 3.0.0 |
Software | Scala | 2.12.10 |
Software | Hadoop | 2.10.0 |
Software | Java Development Kit | 1.8.0_131 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, C.; Zhu, T.; Zhang, Y.; Ning, H.; Chen, L.; Liu, Z. Parallel Particle Swarm Optimization Based on Spark for Academic Paper Co-Authorship Prediction. Information 2021, 12, 530. https://doi.org/10.3390/info12120530