Research About MapReduce

In this post, I discuss three important research issues for the MapReduce framework:

  • Job scheduling for minimizing the total response time
  • Data locality issue
  • Speculative execution

For each issue, I group the work into two categories: optimization based on theoretical analysis, and heuristic-based algorithm design. I hope you find something useful in this summary.


Job Scheduling

Theoretical Analysis based:

Heuristic based:

I have not yet read any papers that present a heuristic-based algorithm for optimizing job completion time in a MapReduce system.


Data locality

Theoretical Analysis based:

Heuristic based:


Speculative execution

Heuristic based:

In our future research, we could consider optimizing job scheduling in a heterogeneous environment, where the machines are not identical.
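
To make the heterogeneous setting concrete, here is a minimal sketch of a greedy heuristic that assigns each task to the machine that would finish it earliest. It is only an illustration and is not taken from any particular paper or from Hadoop itself. The function name `greedy_schedule`, the task sizes, and the machine speeds are all assumptions made up for the example.

```python
def greedy_schedule(task_sizes, machine_speeds):
    """Assign tasks to non-identical machines with a simple greedy rule.

    Hypothetical model: a task of size s takes s / speed time units on a
    machine with the given speed. Tasks are considered largest first, and
    each one goes to the machine that would complete it earliest.
    Returns (assignment, makespan), where assignment maps task index to
    machine index.
    """
    finish = [0.0] * len(machine_speeds)  # current finish time per machine
    assignment = {}
    # Largest tasks first; ties keep their original order (sort is stable).
    ordered = sorted(enumerate(task_sizes), key=lambda t: -t[1])
    for task_id, size in ordered:
        # Pick the machine with the earliest completion time for this task.
        best = min(
            range(len(machine_speeds)),
            key=lambda m: finish[m] + size / machine_speeds[m],
        )
        finish[best] += size / machine_speeds[best]
        assignment[task_id] = best
    return assignment, max(finish)


# Example: machine 0 is assumed to be twice as fast as machine 1.
assignment, makespan = greedy_schedule([4, 2, 2], [2.0, 1.0])
print(assignment, makespan)
```

On identical machines this rule reduces to sending each task to the least-loaded machine. With different speeds, the least-loaded machine is not always the best choice, which is one reason the heterogeneous case is harder to optimize.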
