About today’s class

Scaling out: MapReduce, Hadoop, distributed filesystems, Hadoop Streaming.

Readings

Readings for this lecture (to be completed before this class):
- Wolohan Ch.7
- Ghemawat et al. - The Google File System
- Dean, Ghemawat - MapReduce

Slides

The slides for today’s lesson are available online as an HTML file. You can also click on the slides below and navigate through them with the left and right arrow keys.

Lab

Starting a cluster and running a Hadoop job with EMR on AWS. The lab for today’s lesson is available online as an HTML file.

Assignment

GitHub Classroom Link