Thank You and Let's Get Started
  • Course Structure
  • Tools & Setup (Windows)
  • Tools & Setup (Linux)
Introduction To Big Data
  • What is Big Data?
  • Understanding the Big Data Problem
  • History of Hadoop
  • Test your understanding of Big Data
HDFS
  • HDFS - Why Another Filesystem?
  • Blocks
  • Working With HDFS
  • HDFS - Read & Write
  • HDFS - Read & Write (Program)
  • Test your understanding of HDFS
  • HDFS Assignment
MapReduce
  • Introduction to MapReduce
  • Dissecting MapReduce Components
  • Dissecting MapReduce Program (Part 1)
  • Dissecting MapReduce Program (Part 2)
  • Combiner
  • Counters
  • Facebook - Mutual Friends
  • New York Times - Time Machine
  • Test your understanding of MapReduce
  • MapReduce Assignment
Apache Pig
  • Introduction to Apache Pig
  • Loading & Projecting Datasets
  • Solving a Problem
  • Complex Types
  • Pig Latin - Joins
  • Million Song Dataset (Part 1)
  • Million Song Dataset (Part 2)
  • Page Ranking (Part 1)
  • Page Ranking (Part 2)
  • Page Ranking (Part 3)
  • Test your understanding of Apache Pig
  • Apache Pig Assignment
Apache Hive
  • Introduction to Apache Hive
  • Dissect a Hive Table
  • Loading Hive Tables
  • Simple Selects
  • Managed Table vs. External Table
  • Order By vs. Sort By vs. Cluster By
  • Partitions
  • Buckets
  • Hive QL - Joins
  • Twitter (Part 1)
  • Twitter (Part 2)
  • Test your understanding of Apache Hive
  • Apache Hive Assignment
Hive Window and Analytical Functions
  • Introduction to Hive Window and Analytical Functions
  • Kickstarter campaign duplicates and top campaigns
  • Kickstarter campaign bands and user sessions
Architecture
  • HDFS Architecture
  • Secondary Namenode
  • Highly Available Hadoop
  • MRv1 Architecture
  • YARN
  • Test your understanding of Hadoop Architecture
Cluster Setup
  • Vendors & Hosting
  • Cluster Setup (Part 1)
  • Cluster Setup (Part 2)
  • Cluster Setup (Part 3)
  • Amazon EMR
  • Test your understanding of Cluster Setup
Hadoop Administrator In Real World (Preview)
  • Cloudera Manager - Introduction
  • Cloudera Manager - Installation
File Formats
  • Compression
  • Sequence File
  • Avro
  • File Formats - Pig
  • File Formats - Hive
  • Introduction to RCFile
  • Working with RCFile
  • Introduction to ORC
  • Working with ORC
  • Parquet - Another Columnar Format
  • Avro Schema and Its Importance
  • Schema Evolution in Avro (Part 1)
  • Schema Evolution in Avro (Part 2)
  • Test your understanding of File Formats
Troubleshooting and Optimizations
  • Exploring Logs
  • MRUnit
  • MapReduce Tuning
  • Pig Join Optimizations (Part 1)
  • Pig Join Optimizations (Part 2)
  • Hive Join Optimizations
  • Test your understanding of Troubleshooting & Optimizations
Apache Sqoop