Sign in to use this feature.

Years

Between: -

Subjects

remove_circle_outline
remove_circle_outline
remove_circle_outline
remove_circle_outline
remove_circle_outline

Journals

Article Types

Countries / Regions

Search Results (2)

Search Parameters:
Keywords = TeraSort

Order results
Result details
Results per page
Select all
Export citation of selected articles as:
25 pages, 3186 KB  
Article
Adaptive Multi-Criteria Selection for Efficient Resource Allocation in Frugal Heterogeneous Hadoop Clusters
by Basit Qureshi
Electronics 2024, 13(10), 1836; https://doi.org/10.3390/electronics13101836 - 9 May 2024
Viewed by 1982
Abstract
Efficient resource allocation is crucial in clusters with frugal Single-Board Computers (SBCs) possessing limited computational resources. These clusters are increasingly being deployed in edge computing environments in resource-constrained settings where energy efficiency and cost-effectiveness are paramount. A major challenge in Hadoop scheduling is [...] Read more.
Efficient resource allocation is crucial in clusters with frugal Single-Board Computers (SBCs) possessing limited computational resources. These clusters are increasingly being deployed in edge computing environments in resource-constrained settings where energy efficiency and cost-effectiveness are paramount. A major challenge in Hadoop scheduling is load balancing, as frugal nodes within the cluster can become overwhelmed, resulting in degraded performance and frequent occurrences of out-of-memory errors, ultimately leading to job failures. In this study, we introduce an Adaptive Multi-criteria Selection for Efficient Resource Allocation (AMS-ERA) in Frugal Heterogeneous Hadoop Clusters. Our criterion considers CPU, memory, and disk requirements for jobs and aligns the requirements with available resources in the cluster for optimal resource allocation. To validate our approach, we deploy a heterogeneous SBC-based cluster consisting of 11 SBC nodes and conduct several experiments to evaluate the performance using Hadoop wordcount and terasort benchmark for various workload settings. The results are compared to the Hadoop-Fair, FOG, and IDaPS scheduling strategies. Our results demonstrate a significant improvement in performance with the proposed AMS-ERA, reducing execution time by 27.2%, 17.4%, and 7.6%, respectively, using terasort and wordcount benchmarks. Full article
Show Figures

Figure 1

19 pages, 788 KB  
Article
Optimization Techniques for a Distributed In-Memory Computing Platform by Leveraging SSD
by June Choi, Jaehyun Lee, Jik-Soo Kim and Jaehwan Lee
Appl. Sci. 2021, 11(18), 8476; https://doi.org/10.3390/app11188476 - 13 Sep 2021
Cited by 1 | Viewed by 3799
Abstract
In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system, “Apache Spark”. Despite its distributed memory management capability for iterative jobs and intermediate data, Spark has a significant performance degradation problem when the [...] Read more.
In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system, “Apache Spark”. Despite its distributed memory management capability for iterative jobs and intermediate data, Spark has a significant performance degradation problem when the available amount of main memory (DRAM, typically used for data caching) is limited. To address this problem, we leverage an SSD (solid-state drive) to supplement the lack of main memory bandwidth. Specifically, we present an effective optimization methodology for Apache Spark by collectively investigating the effects of changing the capacity fraction ratios of the shuffle and storage spaces in the “Spark JVM Heap Configuration” and applying different “RDD Caching Policies” (e.g., SSD-backed memory caching). Our extensive experimental results show that by utilizing the proposed optimization techniques, we can improve the overall performance by up to 42%. Full article
(This article belongs to the Special Issue Big Data Management and Analysis with Distributed or Cloud Computing)
Show Figures

Figure 1

Back to TopTop