About today’s class
Scaling out: MapReduce, Hadoop, distributed filesystems, Hadoop Streaming.
Readings
Readings for this lecture (to be completed before this class):
- Wolohan Ch.7
- Ghemawat et al. - The Google File System
- Dean, Ghemawat - MapReduce
Slides
The slides for today’s lesson are available online as an HTML file. You can also click on the slides below and navigate through them with your left and right arrow keys.
Lab
Starting a cluster and running a Hadoop job with EMR on AWS. The lab for today’s lesson is available online as an HTML file.