- Course Structure
- Tools & Setup (Windows)
- Tools & Setup (Linux)
- What is Big Data?
- Understanding the Big Data Problem
- History of Hadoop
- Test your understanding of Big Data
- HDFS - Why Another Filesystem?
- Blocks
- Working With HDFS
- HDFS - Read & Write
- HDFS - Read & Write (Program)
- Test your understanding of HDFS
- HDFS Assignment
- Introduction to MapReduce
- Dissecting MapReduce Components
- Dissecting MapReduce Program (Part 1)
- Dissecting MapReduce Program (Part 2)
- Combiner
- Counters
- Facebook - Mutual Friends
- New York Times - Time Machine
- Test your understanding of MapReduce
- MapReduce Assignment
- Introduction to Apache Pig
- Loading & Projecting Datasets
- Solving a Problem
- Complex Types
- Pig Latin - Joins
- Million Song Dataset (Part 1)
- Million Song Dataset (Part 2)
- Page Ranking (Part 1)
- Page Ranking (Part 2)
- Page Ranking (Part 3)
- Test your understanding of Apache Pig
- Apache Pig Assignment
- Introduction to Apache Hive
- Dissect a Hive Table
- Loading Hive Tables
- Simple Selects
- Managed Table vs. External Table
- Order By vs. Sort By vs. Cluster By
- Partitions
- Buckets
- HiveQL - Joins
- Twitter (Part 1)
- Twitter (Part 2)
- Test your understanding of Apache Hive
- Apache Hive Assignment
- Introduction to Hive Window and Analytical Functions
- Kickstarter campaign duplicates and top campaigns
- Kickstarter campaign bands and user sessions
- HDFS Architecture
- Secondary Namenode
- Highly Available Hadoop
- MRv1 Architecture
- YARN
- Test your understanding of Hadoop Architecture
- Vendors & Hosting
- Cluster Setup (Part 1)
- Cluster Setup (Part 2)
- Cluster Setup (Part 3)
- Amazon EMR
- Test your understanding of Cluster Setup
- Cloudera Manager - Introduction
- Cloudera Manager - Installation
- Compression
- Sequence File
- Avro
- File Formats - Pig
- File Formats - Hive
- Introduction to RCFile
- Working with RCFile
- Introduction to ORC
- Working with ORC
- Parquet - Another Columnar Format
- Avro Schema and Its Importance
- Schema Evolution in Avro (Part 1)
- Schema Evolution in Avro (Part 2)
- Test your understanding of File Formats
- Exploring Logs
- MRUnit
- MapReduce Tuning
- Pig Join Optimizations (Part 1)
- Pig Join Optimizations (Part 2)
- Hive Join Optimizations
- Test your understanding of Troubleshooting & Optimizations