Comparing Spark vs MPI/OpenMP On Word Count MapReduce

11/12/2018
by Junhao Li, et al.

Spark provides an in-memory implementation of MapReduce that is widely used in the big data industry. MPI/OpenMP is a popular framework for high-performance parallel computing. This paper presents a high-performance MapReduce design in MPI/OpenMP and compares it with Spark on the classic word count MapReduce task. My results show that the MPI/OpenMP MapReduce outperforms Apache Spark by about 300
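For readers unfamiliar with the word count task named in the abstract, the following is a minimal single-process sketch of its three MapReduce phases (map, shuffle, reduce). It is not the paper's MPI/OpenMP design and not Spark's implementation. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`), the `num_workers` parameter, and the use of CRC32 as the partitioning hash are all assumptions made for illustration.

```python
from collections import Counter
from zlib import crc32


def map_phase(lines):
    """Map: count the words in one worker's chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def shuffle_phase(local_counts, num_reducers):
    """Shuffle: send each (word, count) pair to a reducer chosen by a hash.

    CRC32 is an illustrative choice; any hash that every worker computes
    identically would do.
    """
    buckets = [[] for _ in range(num_reducers)]
    for counts in local_counts:
        for word, n in counts.items():
            index = crc32(word.encode("utf-8")) % num_reducers
            buckets[index].append((word, n))
    return buckets


def reduce_phase(bucket):
    """Reduce: sum the partial counts that arrived for each word."""
    totals = Counter()
    for word, n in bucket:
        totals[word] += n
    return totals


def word_count(lines, num_workers=2):
    """Run the three phases sequentially over hypothetical workers."""
    # Split the input round-robin; a parallel version would give each
    # chunk to a separate process or thread.
    chunks = [lines[i::num_workers] for i in range(num_workers)]
    local_counts = [map_phase(chunk) for chunk in chunks]
    buckets = shuffle_phase(local_counts, num_workers)
    result = Counter()
    for bucket in buckets:
        # Each word lands in exactly one bucket, so the merge is disjoint.
        result.update(reduce_phase(bucket))
    return result


if __name__ == "__main__":
    print(word_count(["a b a", "b c", "a"]))
```

Because the hash sends every occurrence of a word to the same reducer, the reducers can work independently and their outputs merge without conflicts. In a distributed setting the shuffle is the step that moves data between workers.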


