You, Us & This Course
  • You, Us & This Course
Introducing Hive
  • Hive: An Open-Source Data Warehouse
  • Hive and Hadoop
  • Hive vs Traditional Relational DBMS
  • HiveQL and SQL
Hadoop and Hive Install
  • Hadoop Install Modes
  • Hadoop Install Step 1: Standalone Mode
  • Hadoop Install Step 2: Pseudo-Distributed Mode
  • Hive Install
  • Code-Along: Getting started
Hadoop and HDFS Overview
  • What is Hadoop?
  • HDFS or the Hadoop Distributed File System
Hive Basics
  • Primitive Datatypes
  • Collections: Arrays and Maps
  • Structs and Unions
  • Create Table
  • Insert Into Table
  • Insert into Table 2
  • Alter Table
  • HDFS
  • HDFS CLI - Interacting with HDFS
  • Code-Along: Create Table
  • Code-Along: Hive CLI
Built-in Functions
  • Three types of Hive functions
  • The Case-When statement, the Size function, the Cast function
  • The Explode function
  • Code-Along: Hive Built-in Functions
Sub-Queries
  • Quirky Sub-Queries
  • More on subqueries: Exists and In
  • Inserting via subqueries
  • Code-Along: Use Subqueries to Work with Collection Datatypes
  • Views
Partitioning
  • Indices
  • Partitioning Introduced
  • The Rationale for Partitioning
  • How Tables are Partitioned
  • Using Partitioned Tables
  • Dynamic Partitioning: Inserting data into partitioned tables
  • Code-Along: Partitioning
Bucketing
  • Introducing Bucketing
  • The Advantages of Bucketing
  • How Tables are Bucketed
  • Using Bucketed Tables
  • Sampling
Windowing
  • Windowing Introduced
  • Windowing - A Simple Example: Cumulative Sum
  • Windowing - A More Involved Example: Partitioning
  • Windowing - Special Aggregation Functions
Understanding MapReduce
  • The basic philosophy underlying MapReduce
  • MapReduce - Visualized and Explained
  • MapReduce - Digging a little deeper at every step
MapReduce logic for queries: Behind the scenes
  • MapReduce Overview: Basic Select-From-Where
  • MapReduce Overview: Group-By and Having
  • MapReduce Overview: Joins
Join Optimizations in Hive
  • Improving Join performance with tables of different sizes
  • The Where clause in Joins
  • The Left Semi Join
  • Map Side Joins: The Inner Join
  • Map Side Joins: The Left, Right and Full Outer Joins
  • Map Side Joins: The Bucketed Map Join and the Sorted Merge Join
Custom Functions in Python
  • Custom functions in Python
  • Code-Along: Custom Function in Python
Custom Functions in Java
  • Introducing UDFs - you're not limited by what Hive offers
  • The Simple UDF: The standard function for primitive types
  • The Simple UDF: Java implementation for replacetext()
  • Generic UDFs, the Object Inspector and DeferredObjects
  • The Generic UDF: Java implementation for containsstring()
  • The UDAF: Custom aggregate functions can get pretty complex
  • The UDAF: Java implementation for max()
  • The UDAF: Java implementation for Standard Deviation
  • The Generic UDTF: Custom table generating functions
  • The Generic UDTF: Java implementation for namesplit()
SQL Primer - Select Statements
  • Select Statements
  • Select Statements 2
  • Operator Functions
SQL Primer - Group By, Order By and Having
  • Aggregation Operators Introduced
  • The Group By Clause
  • More Group By Examples
  • Order By
  • Having
SQL Primer - Joins
  • Introduction to SQL Joins
  • Cross Joins aka Cartesian Joins