Correct Answer: In a MapReduce job, the master pings each worker periodically. If a worker does not respond to the ping within a certain time, the master marks that worker as failed. Any task in progress on that worker is rescheduled, and even completed map tasks are rescheduled, because their output was stored on the local disk of the failed worker and is no longer reachable. Hence MapReduce handles large-scale worker failures simply by re-executing the affected tasks on other workers. The master also saves its state at periodic checkpoints, so if the master fails, it can be restarted from the last checkpoint.
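
The failure-handling logic described in the answer can be sketched roughly as follows. This is a minimal single-process illustration, not how any real MapReduce framework is implemented: the `Task` and `Master` classes, the `ping` callable, and the dictionary-based checkpoint are all hypothetical names chosen for this example. It also assumes, as in the original MapReduce design, that completed reduce output lives in a shared file system and therefore survives a worker failure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

IDLE, IN_PROGRESS, COMPLETED = "idle", "in_progress", "completed"


@dataclass
class Task:
    task_id: int
    kind: str                      # "map" or "reduce"
    state: str = IDLE
    worker: Optional[str] = None   # worker currently or last assigned


class Master:
    """Hypothetical master that tracks task state and worker liveness."""

    def __init__(self, tasks: Iterable[Task], workers: Iterable[str],
                 ping: Callable[[str], bool]):
        self.tasks: Dict[int, Task] = {t.task_id: t for t in tasks}
        self.alive = set(workers)
        self.ping = ping           # assumed: returns False if no response

    def check_workers(self) -> List[str]:
        """Ping every live worker once; return the workers newly marked failed."""
        failed = [w for w in sorted(self.alive) if not self.ping(w)]
        for w in failed:
            self.alive.discard(w)
            self._reschedule_tasks_of(w)
        return failed

    def _reschedule_tasks_of(self, worker: str) -> None:
        for t in self.tasks.values():
            if t.worker != worker:
                continue
            # Completed map output sat on the failed worker's local disk,
            # so it is lost. Completed reduce output is assumed to be in
            # shared storage and is kept.
            lost_output = t.state == COMPLETED and t.kind == "map"
            if t.state == IN_PROGRESS or lost_output:
                t.state, t.worker = IDLE, None   # eligible for rescheduling

    def checkpoint(self) -> Dict[int, Tuple[str, str, Optional[str]]]:
        """Snapshot of the master's task table (would be written to disk)."""
        return {tid: (t.kind, t.state, t.worker)
                for tid, t in self.tasks.items()}

    @classmethod
    def restore(cls, snapshot, workers, ping) -> "Master":
        """Start a new master from the last checkpoint."""
        tasks = [Task(tid, kind, state, worker)
                 for tid, (kind, state, worker) in snapshot.items()]
        return cls(tasks, workers, ping)
```

In this sketch, a periodic loop would call `check_workers()` at a fixed interval and then hand every `IDLE` task to a live worker. Restarting the master amounts to calling `Master.restore()` on the most recent `checkpoint()` result.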