Topical Collection "Parallel and Distributed Computing: Algorithms and Applications"

Editors

Dr. Charalampos Konstantopoulos
Collection Editor
Department of Informatics, University of Piraeus, 185 34 Pireas, Greece
Interests: design and analysis of algorithms; parallel and distributed computing; mobile ad hoc networks; sensor networks
Prof. Dr. Grammati Pantziou
Collection Editor
Department of Informatics and Computer Engineering, University of West Attica, 122 43 Athens, Greece
Interests: design of algorithms; parallel and distributed computing; pervasive computing; sensor networks; security and privacy issues in pervasive environments

Topical Collection Information

Dear Colleagues,

Parallel and distributed computing is now ubiquitous in nearly all computational scenarios, ranging from mainstream computing to high-performance and/or distributed architectures such as cloud architectures and supercomputers. The ever-increasing complexity of parallel/distributed systems requires effective algorithmic techniques for unleashing the enormous computational power of these systems and attaining the performance that parallel/distributed computing promises. Moreover, the new possibilities offered by high-performance systems pave the way to a new genre of applications that were considered far-fetched a short while ago.

This Topical Collection is focused on all algorithmic aspects of parallel and distributed computing and applications. Essentially, every scenario where multiple operations or tasks are executed at the same time is within the scope of this Topical Collection. Topics of interest include (but are not limited to) the following:

  • Theoretical aspects of parallel and distributed computing;
  • Design and analysis of parallel and distributed algorithms;
  • Algorithm engineering in parallel and distributed computing;
  • Load balancing and scheduling techniques;
  • Green computing;
  • Algorithms and applications for big data, machine learning and artificial intelligence;
  • Game-theoretic approaches in parallel and distributed computing;
  • Algorithms and applications on GPUs and multicore or manycore platforms;
  • Cloud computing, edge/fog computing, IoT and distributed computing;
  • Scientific computing;
  • Simulation and visualization;
  • Graph and irregular applications.

Dr. Charalampos Konstantopoulos
Prof. Dr. Grammati Pantziou
Collection Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the collection website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Parallel algorithms 
  • Distributed algorithms 
  • GPUs 
  • Multicore and manycore architectures 
  • Supercomputing 
  • Data centers 
  • Big data 
  • Cloud architectures 
  • IoT

Published Papers (9 papers)

2022

Article
A Dynamic Distributed Deterministic Load-Balancer for Decentralized Hierarchical Infrastructures
Algorithms 2022, 15(3), 96; https://doi.org/10.3390/a15030096 - 18 Mar 2022
Viewed by 836
Abstract
In this work, we propose D3-Tree, a dynamic distributed deterministic structure for data management in decentralized networks, by engineering and extending an existing decentralized structure. Conducting an extensive experimental study, we verify that the implemented structure outperforms other well-known hierarchical tree-based structures since it provides better complexities regarding load-balancing operations. More specifically, the structure achieves an O(log N) amortized bound (N is the number of nodes present in the network), using an efficient deterministic load-balancing mechanism, which is general enough to be applied to other hierarchical tree-based structures. Moreover, our structure achieves O(log N) worst-case search performance. Last but not least, we investigate the structure’s fault tolerance, which has not been sufficiently tackled in previous work, both theoretically and through rigorous experimentation. We prove that D3-Tree is highly fault-tolerant and achieves O(log N) amortized search cost under massive node failures, accompanied by a significant success rate. Afterwards, by incorporating this novel balancing scheme into the ART (Autonomous Range Tree) structure, we go one step further to achieve sub-logarithmic complexity and propose the ART+ structure. ART+ achieves an O(log_b^2 log N) communication cost for query and update operations (b is a double-exponential power of 2 and N is the total number of nodes). Moreover, ART+ is a fully dynamic and fault-tolerant structure, which supports the join/leave node operations in O(log log N) expected WHP (with high probability) number of hops and performs load-balancing in O(log log N) amortized cost.

Article
Tries-Based Parallel Solutions for Generating Perfect Crosswords Grids
Algorithms 2022, 15(1), 22; https://doi.org/10.3390/a15010022 - 13 Jan 2022
Cited by 1 | Viewed by 733
Abstract
General crossword grid generation is considered an NP-complete problem, and theoretically it could be a good candidate for use in cryptographic algorithms. In this article, we propose a new algorithm for generating perfect crossword grids (with no black boxes) that relies on trie data structures, which are very important for reducing the time needed to find solutions and also offer a good opportunity for parallelisation. The algorithm uses a special trie representation and is very efficient, but through parallelisation the performance is improved to a level that allows the solution to be obtained extremely fast. The experiments were conducted using a dictionary of almost 700,000 words, and the solutions were obtained using the parallelised version with an execution time in the order of minutes. We demonstrate here that a perfect crossword grid can be found faster than has been estimated before, if we use tries as supporting data structures together with parallelisation. Still, if the size of the dictionary is greatly increased (e.g., by considering a set of dictionaries for several languages rather than only one), or if the problem is generalised to a 3D space or multidimensional spaces, then it could still be investigated for possible use in cryptography.
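To illustrate why tries suit this kind of search, the following is a minimal sketch of a plain character trie with prefix and whole-word lookups. It is not the special trie representation used in the article; the class and method names (`Trie`, `has_prefix`, `contains`) are hypothetical and chosen only for this example.

```python
class TrieNode:
    """One node of a character trie; children are keyed by letter."""

    def __init__(self):
        self.children = {}
        self.is_word = False  # True if a dictionary word ends at this node


class Trie:
    """Plain dictionary trie (illustrative only, not the article's variant)."""

    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _walk(self, prefix):
        """Follow prefix from the root; return the final node or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def has_prefix(self, prefix):
        """True if at least one stored word starts with prefix."""
        return self._walk(prefix) is not None

    def contains(self, word):
        """True if word itself was inserted."""
        node = self._walk(word)
        return node is not None and node.is_word
```

A grid-filling search can call `has_prefix` on each partially filled row or column and abandon a branch as soon as no dictionary word can complete it, at a cost proportional to the prefix length rather than the dictionary size.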

2021

Article
Parallel Computing of Edwards–Anderson Model
Algorithms 2022, 15(1), 13; https://doi.org/10.3390/a15010013 - 27 Dec 2021
Cited by 2 | Viewed by 884
Abstract
A scheme for parallel computation of the two-dimensional Edwards–Anderson model based on the transfer matrix approach is proposed. Free boundary conditions are considered. The method may find application in calculations related to spin glasses and in quantum simulators. Performance data are given. The parallelisation scheme is tested for various numbers of threads. Application to a quantum computer simulator is considered in detail, in particular a parallelisation scheme for the operation of a quantum computer simulator.

Article
An O(log_2 N) Fully-Balanced Resampling Algorithm for Particle Filters on Distributed Memory Architectures
Algorithms 2021, 14(12), 342; https://doi.org/10.3390/a14120342 - 26 Nov 2021
Cited by 1 | Viewed by 1397
Abstract
Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications becomes increasingly long. Parallel computing can help to address this. However, resampling (and, hence, PFs as well) necessarily involves a bottleneck, the redistribution step, which is notoriously challenging to parallelize using textbook parallel computing techniques. A state-of-the-art redistribution takes O((log_2 N)^2) computations on Distributed Memory (DM) architectures, which most supercomputers adopt, whereas redistribution can be performed in O(log_2 N) on Shared Memory (SM) architectures, such as GPUs or mainstream CPUs. In this paper, we propose a novel parallel redistribution for DM that achieves an O(log_2 N) time complexity. We also present empirical results that indicate that our novel approach outperforms the O((log_2 N)^2) approach.
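For readers unfamiliar with the redistribution step, the sketch below shows a sequential, single-process version: systematic resampling decides how many copies of each particle to keep, and redistribution then materialises those copies. This is a textbook O(N) illustration only, not the parallel algorithm proposed in the paper; the function names and the optional `u` parameter are assumptions made for this example.

```python
import random


def systematic_copies(weights, u=None):
    """Return ncopies, where ncopies[i] is how often particle i is kept.

    Uses systematic resampling: the N sample points (u + j) / N,
    j = 0..N-1, are compared against the normalised cumulative weights.
    The counts always sum to N.
    """
    n = len(weights)
    total = sum(weights)
    if u is None:
        u = random.random()  # single uniform offset in [0, 1)
    ncopies = [0] * n
    cumulative = 0.0
    j = 0  # index of the next sample point to assign
    for i, w in enumerate(weights):
        cumulative += w / total
        while j < n and (u + j) / n < cumulative:
            ncopies[i] += 1
            j += 1
    # Guard against floating-point rounding leaving points unassigned.
    ncopies[-1] += n - j
    return ncopies


def redistribute(particles, ncopies):
    """Sequential redistribution: repeat particle i exactly ncopies[i] times."""
    out = []
    for particle, count in zip(particles, ncopies):
        out.extend([particle] * count)
    return out
```

The difficulty the abstract refers to is that `ncopies` is typically very unbalanced, so on a distributed-memory machine the copies must be moved between nodes so that each node again holds the same number of particles.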

Article
Parallel Implementation of the Algorithm to Compute Forest Fire Impact on Infrastructure Facilities of JSC Russian Railways
Algorithms 2021, 14(11), 333; https://doi.org/10.3390/a14110333 - 15 Nov 2021
Cited by 1 | Viewed by 803
Abstract
Forest fires have a negative impact on the economy in a number of regions, especially in Wildland Urban Interface (WUI) areas. An important link in the fight against fires in WUI areas is the development of information and computer systems for predicting the fire safety of infrastructural facilities of Russian Railways. In this work, a numerical study of heat transfer processes in the enclosing structure of a wooden building near the forest fire front was carried out using the technology of parallel computing. The novelty of the development is explained by the creation of its own program code, which is planned to be put into operation either in the Information System for Remote Monitoring of Forest Fires ISDM-Rosleskhoz, or in the information and computing system of JSC Russian Railways. In the Russian Federation, it is forbidden to use foreign systems in the security services of industrial facilities. The implementation of the deterministic model of heat transfer in the enclosing structure with the complexity of the algorithm O(2N^2 + 2K) is presented. The program is implemented in Python 3.x using the NumPy and Concurrent libraries. Calculations were carried out on a multiprocessor cluster in the Sirius University of Science and Technology. The results of calculations and the acceleration coefficient for operating modes for 1, 2, 4, 8, 16, 32, 48 and 64 processes are presented. The developed algorithm can be applied to assess the fire safety of infrastructure facilities of Russian Railways. The main merit of the new development should be noted, which is explained by the ability to use large computational domains with a large number of computational grid nodes in space and time. The use of caching intermediate data in files made it possible to distribute a large number of computational nodes among the processors of a computing multiprocessor system. However, one should also note a drawback; namely, a decrease in the acceleration of computational operations with a large number of involved nodes of a multiprocessor computing system, which is explained by the write and read cycles in cache files.

Article
Load Balancing Strategies for Slice-Based Parallel Versions of JEM Video Encoder
Algorithms 2021, 14(11), 320; https://doi.org/10.3390/a14110320 - 01 Nov 2021
Viewed by 573
Abstract
The proportion of video traffic on the internet is expected to reach 82% by 2022, mainly due to the increasing number of consumers and the emergence of new video formats with more demanding features (depth, resolution, multiview, 360, etc.). Efforts are therefore being made to constantly improve video compression standards to minimize the necessary bandwidth while retaining high video quality levels. In this context, the Joint Collaborative Team on Video Coding has been analyzing new video coding technologies to improve the compression efficiency with respect to the HEVC video coding standard. A software package known as the Joint Exploration Test Model has been proposed to implement and evaluate new video coding tools. In this work, we present parallel versions of the JEM encoder that are particularly suited for shared memory platforms, and can significantly reduce its huge computational complexity. The proposed parallel algorithms are shown to achieve high levels of parallel efficiency. In particular, in the All Intra coding mode, the best of our proposed parallel versions achieves an average efficiency value of 93.4%. They also had high levels of scalability, as shown by the inclusion of an automatic load balancing mechanism.
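As a reminder of what a parallel efficiency figure such as the one above measures, the usual definitions are speedup = T_serial / T_parallel and efficiency = speedup / number of workers. The helper below is a small sketch of those standard definitions; the function names are hypothetical and no timings from the paper are reproduced here.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Speedup divided by the number of workers (1.0 means ideal scaling)."""
    return speedup(t_serial, t_parallel) / workers
```

For instance, with made-up timings of 80 s serially and 12.5 s on 8 threads, `efficiency(80.0, 12.5, 8)` gives a speedup of 6.4 spread over 8 workers, i.e. an efficiency of about 0.8.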

Article
A Parallel Algorithm for Dividing Octonions
Algorithms 2021, 14(11), 309; https://doi.org/10.3390/a14110309 - 24 Oct 2021
Viewed by 587
Abstract
The article presents a parallel hardware-oriented algorithm designed to speed up the division of two octonions. The advantage of the proposed algorithm is that the number of real multiplications is halved compared with the naive method for implementing this operation. In the synthesis of the discussed algorithm, the matrix representation of this operation was used, which allows us to present the division of octonions by means of a vector–matrix product. Taking into account the specific structure of the matrix multiplicand allows the number of real multiplications necessary for the octonion division procedure to be reduced.
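For context, a naive baseline can be sketched as follows: an octonion is stored as a list of eight reals, multiplication follows one common convention of the Cayley–Dickson construction, and division is computed as a · conj(b) / |b|². This is only an illustrative baseline under those assumptions, not the reduced-multiplication algorithm of the article; all function names are hypothetical.

```python
def add(x, y):
    return [u + v for u, v in zip(x, y)]


def sub(x, y):
    return [u - v for u, v in zip(x, y)]


def conj(x):
    """Cayley-Dickson conjugate: (a, b)* = (a*, -b); a real is its own conjugate."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d*b, da + bc*).

    Lists of length 1, 2, 4, 8 act as reals, complex numbers,
    quaternions and octonions respectively.
    """
    n = len(x)
    if n == 1:
        return [x[0] * y[0]]
    h = n // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    first = sub(mul(a, c), mul(conj(d), b))
    second = add(mul(d, a), mul(b, conj(c)))
    return first + second


def norm_sq(x):
    """Squared norm: the sum of squared components."""
    return sum(v * v for v in x)


def divide(a, b):
    """Right division a * b^(-1), with b^(-1) = conj(b) / |b|^2."""
    n2 = norm_sq(b)
    if n2 == 0:
        raise ZeroDivisionError("division by the zero octonion")
    return [v / n2 for v in mul(a, conj(b))]
```

Because octonion multiplication is neither commutative nor associative, left and right division differ in general; the sketch implements right division, so `mul(divide(a, b), b)` recovers `a` up to rounding.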

Article
Rough Estimator Based Asynchronous Distributed Super Points Detection on High Speed Network Edge
Algorithms 2021, 14(10), 277; https://doi.org/10.3390/a14100277 - 25 Sep 2021
Viewed by 678
Abstract
Super points detection plays an important role in network research and application. As network scale increases, distributed super points detection has become a hot research topic. The key point of super points detection in a multi-node distributed environment is how to reduce communication overhead. Therefore, this paper proposes a three-stage communication algorithm to detect super points in a distributed environment, the Rough Estimator based Asynchronous Distributed super points detection algorithm (READ). READ uses a lightweight estimator, the Rough Estimator (RE), which is fast in computation and takes less memory, to generate candidate super points. Meanwhile, the well-known Linear Estimator (LE) is applied to accurately estimate the cardinality of each candidate super point, so as to detect the super point correctly. In READ, each node scans IP address pairs asynchronously. When reaching the time window boundary, READ starts three-stage communication to detect the super point. This paper proves that the accuracy of READ in a distributed environment is no less than that in the single-node environment. Four groups of 10 Gb/s and 40 Gb/s real-world high-speed network traffic are used to test READ. The experimental results show that READ not only has high accuracy in a distributed environment, but also incurs less than 5% of the communication burden of existing algorithms.
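The cardinality estimation mentioned above can be illustrated with the classic linear-counting estimator: hash each item to one bit of an m-bit map and estimate the number of distinct items as -m · ln(z / m), where z is the number of bits still zero. The sketch below assumes that the paper's Linear Estimator is of this kind; the function name, the choice of SHA-256 as the hash, and the default m are assumptions made for this example only.

```python
import hashlib
import math


def linear_estimate(items, m=1024):
    """Estimate the number of distinct items with an m-bit linear counter."""
    bits = bytearray(m)  # one byte per bit, for readability
    for item in items:
        digest = hashlib.sha256(str(item).encode()).digest()
        bits[int.from_bytes(digest[:8], "big") % m] = 1
    zeros = m - sum(bits)
    if zeros == 0:
        return float("inf")  # counter saturated; m is too small
    return -m * math.log(zeros / m)
```

For a host under observation, `items` would be the peer addresses it contacts within a time window; repeated contacts with the same peer set the same bit, so only distinct peers affect the estimate.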

Article
Accelerating In-Transit Co-Processing for Scientific Simulations Using Region-Based Data-Driven Analysis
Algorithms 2021, 14(5), 154; https://doi.org/10.3390/a14050154 - 12 May 2021
Viewed by 956
Abstract
Increasing processing capabilities and input/output constraints of supercomputers have increased the use of co-processing approaches, i.e., visualizing and analyzing data sets of simulations on the fly. We present a method that evaluates the importance of different regions of simulation data and a data-driven approach that uses the proposed method to accelerate in-transit co-processing of large-scale simulations. We use the importance metrics to simultaneously employ multiple compression methods on different data regions to accelerate the in-transit co-processing. Our approach strives to adaptively compress data on the fly and uses load balancing to counteract memory imbalances. We demonstrate the method’s efficiency through a fluid mechanics application, a Richtmyer–Meshkov instability simulation, showing how to accelerate the in-transit co-processing of simulations. The results show that the proposed method can quickly identify regions of interest, even when using multiple metrics. Our approach achieved a speedup of 1.29× in a lossless scenario. The data decompression time was sped up by 2× compared to using a single compression method uniformly.
