TASK Quarterly

INFLUENCE OF YARN SCHEDULERS ON POWER CONSUMPTION AND PROCESSING TIME FOR VARIOUS BIG DATA BENCHMARKS

Abstract

Climate change caused by human activity can affect the lives of everyone on the planet. Environmental concerns must therefore be taken into consideration by all fields of study, including ICT. Green Computing aims to reduce the negative effects of IT on the environment while maintaining the benefits it provides. Several Big Data platforms, such as Apache Spark and YARN, have become widely used in analytics and High-Performance Computing systems owing to the reliability and usability of MapReduce implementations. The authors investigate the power consumption and energy efficiency of Hadoop YARN schedulers running Apache Spark under three different workloads: sorting large binary files, counting unique words in large text files, and processing satellite imagery from the Sentinel-2 mission. The presented results show small (2%–11%) but distinct differences in power consumption between the FIFO and FAIR schedulers.

Keywords:

Apache Spark, YARN, Big Data, Green Computing, Sentinel, TeraSort, word count, benchmarks, scheduler

Details

Issue: Vol. 22 No. 4 (2018)
Section: Research article
Published: 2018-12-29
DOI: https://doi.org/10.17466/tq2018/22.4/c
License:
This work is licensed under a Creative Commons Attribution 4.0 International License.

Author Biography

ANDRZEJ STEPNOWSKI
Gdansk University of Technology, Faculty of Electronics, Telecommunications and Informatics



Authors

  • KRZYSZTOF DRYPCZEWSKI

    Gdansk University of Technology, Faculty of Electronics, Telecommunications and Informatics
  • JERZY PROFICZ

    Gdansk University of Technology, Centre of Informatics – Tricity Academic Supercomputer & Network
  • ANDRZEJ STEPNOWSKI

    Gdansk University of Technology, Faculty of Electronics, Telecommunications and Informatics
