Introduction and Getting Started
  • Introduction
  • Udemy 101: Getting the Most From This Course
  • Note: Alternate download link for the MovieLens data set
  • Getting Started - Run your First MapReduce Program!
Understanding MapReduce
  • MapReduce Basic Concepts
  • A Quick Note on File Names
  • Walkthrough of Rating Histogram Code
  • Understanding How MapReduce Scales / Distributed Computing
  • Average Friends by Age Example: Part 1
  • Average Friends by Age Example: Part 2
  • Minimum Temperature By Location Example
  • Maximum Temperature By Location Example
  • Word Frequency in a Book Example
  • Making the Word Frequency Mapper Better with Regular Expressions
  • Sorting the Word Frequency Results Using Multi-Stage MapReduce Jobs
  • Activity: Design a Mapper and Reducer for Total Spent by Customer
  • Activity: Write Code for Total Spent by Customer
  • Compare Your Code to Mine. Activity: Sort Results by Amount Spent
  • Compare Your Code to Mine for Sorted Results
  • Combiners
Advanced MapReduce Examples
  • Example: Most Popular Movie
  • Including Ancillary Lookup Data in the Example
  • Example: Most Popular Superhero, Part 1
  • Example: Most Popular Superhero, Part 2
  • Example: Degrees of Separation: Concepts
  • Degrees of Separation: Preprocessing the Data
  • Degrees of Separation: Code Walkthrough
  • Degrees of Separation: Running and Analyzing the Results
  • Example: Similar Movies Based on Ratings: Concepts
  • Similar Movies: Code Walkthrough
  • Similar Movies: Running and Analyzing the Results
  • Learning Activity: Improving our Movie Similarities MapReduce Job
Using Hadoop and Elastic MapReduce
  • Fundamental Concepts of Hadoop
  • The Hadoop Distributed File System (HDFS)
  • Apache YARN
  • Hadoop Streaming: How Hadoop Runs your Python Code
  • Setting Up Your Amazon Elastic MapReduce Account
  • Linking Your EMR Account with MRJob
  • Exercise: Run Movie Recommendations on Elastic MapReduce
  • Analyze the Results of Your EMR Job
Advanced Hadoop and EMR
  • Distributed Computing Fundamentals
  • Activity: Running Movie Similarities on Four Machines
  • Analyzing the Results of the 4-Machine Job
  • Troubleshooting Hadoop Jobs with EMR and MRJob, Part 1
  • Troubleshooting Hadoop Jobs, Part 2
  • ml-1m Dataset: Alternate Download Link
  • Analyzing One Million Movie Ratings Across 16 Machines, Part 1
  • Analyzing One Million Movie Ratings Across 16 Machines, Part 2
Other Hadoop Technologies
  • Introducing Apache Hive
  • Introducing Apache Pig
  • Apache Spark: Concepts
  • Spark Example: Part 1
  • Spark Example: Part 2
  • Congratulations!
Where to Go from Here
  • Bonus Lecture: More Courses to Explore!