Open Access Article
Algorithms 2018, 11(12), 190; https://doi.org/10.3390/a11120190

Best Trade-Off Point Method for Efficient Resource Provisioning in Spark

P. P. Nghiem
Department of Computer Engineering, School of Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, CA 95053, USA
Received: 20 September 2018 / Revised: 11 November 2018 / Accepted: 16 November 2018 / Published: 22 November 2018
(This article belongs to the Special Issue MapReduce for Big Data)

Abstract

Given the recent exponential growth in the volume of information processed as Big Data, the high energy consumed by data processing engines in datacenters has become a major issue, underlining the need for efficient resource allocation and more energy-efficient computing. We previously proposed the Best Trade-off Point (BToP) method, a general approach and set of techniques, based on an algorithm with mathematical formulas, for finding the best trade-off point on an elbow curve of performance versus resources for efficient resource provisioning in Hadoop MapReduce. The BToP method is expected to work for any application or system that relies on a trade-off elbow curve, non-inverted or inverted, for making good decisions. In this paper, we apply the BToP method to the emerging cluster-computing framework Apache Spark and show that it delivers better performance and lower energy consumption than Spark with its built-in dynamic resource allocation enabled. Our Spark-Bench tests confirm the effectiveness of using the BToP method with Spark to determine the optimal number of executors for any workload in production environments, where job profiling for behavioral replication leads to the most efficient resource provisioning.
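The abstract turns on locating the best trade-off point on an elbow curve of performance versus resources. As a rough illustration of that idea only (the paper's BToP algorithm uses its own formulas, which are not reproduced here), the hypothetical Python sketch below picks the elbow of a measured runtime-versus-executor curve with a simple maximum-distance-from-chord heuristic; the executor counts and runtimes are made-up profiling data.

```python
import numpy as np

def best_tradeoff_point(executors, runtimes):
    """Return the executor count at the 'elbow' of a runtime-vs-resources curve.

    Illustrative heuristic only: picks the measurement with the maximum
    perpendicular distance from the straight line (chord) joining the first
    and last points. This is a stand-in for the paper's BToP formulas.
    """
    x = np.asarray(executors, dtype=float)
    y = np.asarray(runtimes, dtype=float)

    # Normalize both axes so distances on each axis are comparable.
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())

    # Unit vector along the chord from the first point to the last point.
    p1 = np.array([xn[0], yn[0]])
    p2 = np.array([xn[-1], yn[-1]])
    chord = (p2 - p1) / np.linalg.norm(p2 - p1)

    # Perpendicular distance of every point from that chord.
    vecs = np.stack([xn, yn], axis=1) - p1
    proj = np.outer(vecs @ chord, chord)
    dists = np.linalg.norm(vecs - proj, axis=1)

    return int(x[np.argmax(dists)])

# Synthetic profiling data: job runtimes (s) measured with 2..40 executors.
executors = [2, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40]
runtimes  = [980, 520, 300, 225, 190, 172, 165, 160, 158, 157, 156]
print(best_tradeoff_point(executors, runtimes))  # prints 8 for this synthetic curve
```

The point farthest from the chord is where adding more executors stops buying a meaningful runtime reduction, which is the intuition behind provisioning at the trade-off point rather than at peak parallelism.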
Keywords: Apache Spark; Hadoop MapReduce; YARN; algorithm for best trade-off point; optimization; resource provisioning; performance efficiency; energy efficiency; elbow curve

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cite This Article

MDPI and ACS Style

Nghiem, P.P. Best Trade-Off Point Method for Efficient Resource Provisioning in Spark. Algorithms 2018, 11, 190.


