Introduction
  • Breaking the Ice with a Warm Welcome!
  • Course Curriculum - A Journey to Excellence!
Section 1 - Apache Spark Introduction and Architecture Deep Dive
  • Apache Spark in the context of Hadoop Evolution
  • Say Hello to Apache Spark - A Thorough Examination of Its Capabilities
  • In-Depth Understanding of Spark's Ecosystem of High-Level Libraries
  • Apache Spark and its integration within Enterprise Lambda Architecture
  • Apache Spark and Where It Fits in the Wider Hadoop Ecosystem
Working with Text Files to create Resilient Distributed Datasets (RDDs) in Spark
  • Setting up the Development Environment
  • A Better Development Environment Using Databricks - Part 1 (**New Lecture**)
  • A Better Development Environment Using Databricks - Part 2 (**New Lecture**)
  • Loading Text Files (in HDFS) in Spark to create RDDs
  • Loading All Files in a Directory (in HDFS) Simultaneously in Spark and the Implications
  • Loading Text Files (in HDFS) in Spark - Continued
  • Using Wildcards to selectively load text files (in HDFS) in Spark and use-cases
  • Real Life Challenge: Different Record Delimiters in Text Files in Spark
  • Solution: Handling Different Record Delimiters in Text Files in Spark
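A minimal Scala sketch of the text-file loading techniques listed above; `sc` is the SparkContext provided by spark-shell or a Databricks notebook, the HDFS paths are placeholders, and the custom record delimiter shown here is one common way to handle non-newline delimiters, not necessarily the exact approach used in the lectures:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Single file, whole directory, and wildcard loads (all paths are placeholders)
val lines     = sc.textFile("hdfs:///data/retail/orders.txt")
val allFiles  = sc.textFile("hdfs:///data/retail/")            // every file in the directory
val someFiles = sc.textFile("hdfs:///data/retail/part-*.txt")  // wildcard selection

// Records delimited by something other than a newline, e.g. '~'
val conf = new Configuration(sc.hadoopConfiguration)
conf.set("textinputformat.record.delimiter", "~")
val customRecords = sc
  .newAPIHadoopFile("hdfs:///data/retail/tilde_delimited.txt",
    classOf[TextInputFormat], classOf[LongWritable], classOf[Text], conf)
  .map { case (_, text) => text.toString }
```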
Creating RDDs by Distributing Scala Collections in Spark
  • The semantics and implications behind parallelizing Scala Collections
  • Hands-on: Distributing/Parallelizing Scala Collections
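A minimal sketch of distributing a local Scala collection as an RDD (the data is illustrative):

```scala
// sc is the SparkContext available in spark-shell or a Databricks notebook
val sales = List(120.0, 45.5, 310.0, 78.25)

// Distribute the local collection across the cluster as an RDD
val salesRdd = sc.parallelize(sales)

// Optionally fix the number of partitions explicitly
val salesRdd8 = sc.parallelize(sales, 8)

println(salesRdd.sum())              // an action: triggers the distributed computation
println(salesRdd8.getNumPartitions)  // 8
```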
Understanding the Partitioning and Distributed Nature of RDDs in Spark
  • How Data gets Partitioned and Distributed in Spark Cluster
  • Accessing Hadoop YARN RM and AM Web UIs to Understand RDD Partitioning
  • Manually Changing Partitions of RDDs in Spark and Implications
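A short sketch of inspecting and changing RDD partitioning, assuming a placeholder HDFS file:

```scala
val lines = sc.textFile("hdfs:///data/retail/orders.txt")
println(lines.getNumPartitions)   // derived from the HDFS input splits

// Increase parallelism (incurs a full shuffle)
val more = lines.repartition(16)

// Reduce the partition count without a full shuffle, e.g. before writing output
val fewer = more.coalesce(4)

println(more.getNumPartitions)    // 16
println(fewer.getNumPartitions)   // 4
```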
Developing Mastery in Spark's Map Transformations and lazy DAG Execution Model
  • Demystifying Spark's Directed Acyclic Graph (DAG) and Lazy Execution Model
  • Introducing Map Transformation - the Swiss Army Knife of Transformations
  • Hands-on: Map Transformation via Scala's Functional Programming constructs
  • Understanding the Potential of the Map Transformation to Alter RDD Types
  • Using Your Own Functions, in addition to Anonymous ones, in Map Transformations
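A minimal sketch of the map transformation and the lazy DAG model it participates in; the file path is a placeholder:

```scala
val lines = sc.textFile("hdfs:///data/retail/orders.txt")   // nothing executes yet

// Anonymous function: map turns RDD[String] into RDD[Int] (it can change the element type)
val lengths = lines.map(line => line.length)

// A named function works just as well as an anonymous one
def toUpper(line: String): String = line.toUpperCase
val shouted = lines.map(toUpper)

// The transformations above only build the DAG; this action triggers execution
println(lengths.take(5).mkString(", "))
```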
Assignment - Using Map Transformation on Real World Big Data Retail Analytics
  • Introducing the Real World Online Retail Data-set and Assignment Challenges
  • Detailed Hands-on Walkthrough of the Assignment Challenges' Solutions
  • Conceptual Understanding of Distributing Scala Collections and Implications
  • Hands-on Understanding of Distributing Scala Collections and use-cases
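A hedged sketch of the kind of map-based parsing this assignment calls for; the file name and the column position of the unit price are assumptions, not the dataset's confirmed layout, and the header line is assumed to be absent:

```scala
val records = sc.textFile("hdfs:///data/online_retail.csv")

// Assumed layout: comma-delimited, unit price at column index 5, header already removed
val unitPrices = records.map { line =>
  val cols = line.split(",")
  cols(5).toDouble          // map changes RDD[String] into RDD[Double]
}
println(unitPrices.take(5).mkString(", "))
```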
Developing Mastery in Spark's Filter Transformation
  • Introducing Filter Transformation and its Powerful Use-Cases
  • Hands-on: Spark's Filter Transformation in Action
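A minimal sketch of the filter transformation on a toy collection:

```scala
val numbers = sc.parallelize(1 to 20)

// filter keeps only the elements for which the predicate returns true
val evens = numbers.filter(n => n % 2 == 0)
println(evens.collect().mkString(", "))   // 2, 4, 6, ..., 20
```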
Assignment - Using Filter and Map on Apache Web Server Logs and Retail Dataset
  • Introducing the Data-sets and Real-World Assignment Challenges
  • Challenge 1: Removing Empty Lines in Web Logs Data-set
  • Challenge 2: Removing Header Line in Retail Data-set
  • Challenge 3: Selecting rows in Retail Data-set Containing Specific Countries
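A hedged sketch of the three challenges above; the paths, the naive comma split, and the position of the country column are assumptions made for illustration:

```scala
val logs   = sc.textFile("hdfs:///data/web/access_log")
val retail = sc.textFile("hdfs:///data/online_retail.csv")

// Challenge 1: drop empty lines from the web logs
val nonEmptyLogs = logs.filter(_.trim.nonEmpty)

// Challenge 2: drop the header line from the retail data
val header   = retail.first()
val noHeader = retail.filter(_ != header)

// Challenge 3: keep rows whose country column (assumed to be the last field) matches
val countries = Set("United Kingdom", "France")
val selected  = noHeader.filter { line =>
  val cols = line.split(",")              // naive split; quoted fields ignored for brevity
  countries.contains(cols.last.trim)
}
println(selected.count())
```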
Developing Mastery in RDDs of Scala Collections
  • Introducing RDDs of Scala Collections and their Relational Analytics use-cases
  • Transforming Scala Collections using Functional Programming Constructs
  • Creating and Manipulating RDDs of Arrays of String from Different Data Sources
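A short sketch of working with an RDD of arrays of String, which is what the split below produces; the path and column positions are placeholders:

```scala
val retail = sc.textFile("hdfs:///data/online_retail.csv")

// Splitting each delimited line yields an RDD[Array[String]]
val rows = retail.map(_.split(","))

// Column-oriented ("relational") processing then becomes simple array indexing
// (the country is assumed to be the last field, purely for illustration)
val countries = rows.map(cols => cols.last.trim).distinct()
println(countries.take(10).mkString(", "))
```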
Assignment - Customer Churn Analytics using Apache Spark
  • Introducing the Context, Challenges and Data-set of Customer Churn Use-Case
  • Challenge 1: Finding Number of Unique States in the Data-set
  • Challenge 2: Performing Data Integrity Check on Individual Columns of Data-Set
  • Challenge 3: Finding Summary Statistics on number of Voice Mail Messages
  • Challenge 4: Finding Summary Statistics on Voice Mail in Selected States
  • Challenge 5: Finding Average Value of Total Night Calls Minutes
  • Challenge 6: Finding Total Day Calls for Customers Matching a Condition
  • Challenge 7: Using Scala Functions and Pattern Matching for advanced processing
  • Challenge 8: Finding Churned Customers with International and Voice Mail Plan
  • Challenge 9: Performing Data Quality and Type Checks on Individual Columns
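A hedged sketch of a few of the churn challenges; the file name, the absence of a header line, and every column index are assumptions to be adjusted to the real dataset:

```scala
val churn = sc.textFile("hdfs:///data/churn.csv")
val rows  = churn.map(_.split(","))   // header removal omitted for brevity

// Assumed layout: state at index 0, voice-mail messages at index 6, night minutes at index 12
// Challenge 1: number of unique states
println(rows.map(cols => cols(0)).distinct().count())

// Challenges 3 and 5: summary statistics on numeric columns
println(rows.map(cols => cols(6).toDouble).stats())   // count, mean, stdev, min, max
println(rows.map(cols => cols(12).toDouble).mean())
```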
Developing Mastery in Spark's Key-Value (Pair) RDDs
  • Introduction
  • Developing Intuition for Solving Big Data Problems Using the Key-Value Pair Construct
  • Developing Hands-on Understanding of Working with Key-Value RDDs in Spark
  • Proof: Certain Transformations Are Exclusive to Key-Value RDDs
  • Transforming Text File Data to Pair RDDs for Key-Value Based Data Processing
  • The Case of Different Data Types of "Values" in Key-Value RDDs
  • Transforming a Complex Delimited Text File to Pair RDDs for Key-Value Processing
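A minimal sketch of building and transforming a key-value (pair) RDD; the retail file, column positions, and prior header removal are assumptions:

```scala
val retail = sc.textFile("hdfs:///data/online_retail.csv")

// Build (key, value) pairs, e.g. (country, quantity); header assumed already removed
val pairs = retail.map { line =>
  val cols = line.split(",")
  (cols.last.trim, cols(3).toInt)
}

// These transformations exist only on pair RDDs (RDDs of two-element tuples)
val totalPerCountry = pairs.reduceByKey(_ + _)
val doubled         = pairs.mapValues(_ * 2)
println(totalPerCountry.take(5).mkString(", "))
```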
Assignment - Analyzing Video Games (Kaggle Dataset) using Spark's Key-Value RDDs
  • Challenge 1: Determining the Frequency Distribution of Video Game Platforms
  • Challenge 2: Finding Total Sales of Each Video Game Platform
  • Challenge 3: Finding Global Sales of Each Video Game Platform
  • Challenge 4: Maximum Sales Value of Each Gaming Console
  • Challenge 5: Data Ranking - Top 10 platforms by global sales
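A hedged sketch of the video-games challenges; the file name, header handling, and the column indices for platform and global sales are assumptions for illustration:

```scala
val games = sc.textFile("hdfs:///data/vgsales.csv")
val rows  = games.map(_.split(","))   // header removal and quoted commas omitted for brevity

// Assumed columns: platform at index 2, global sales (millions) at index 10
// Challenge 1: frequency of each platform
val freq = rows.map(cols => (cols(2), 1)).reduceByKey(_ + _)

// Challenges 2/3: total global sales per platform
val sales = rows.map(cols => (cols(2), cols(10).toDouble)).reduceByKey(_ + _)

// Challenge 4: maximum single-title global sales per platform
val maxSale = rows.map(cols => (cols(2), cols(10).toDouble))
  .reduceByKey((a, b) => math.max(a, b))

// Challenge 5: top 10 platforms by global sales
val top10 = sales.sortBy({ case (_, total) => total }, ascending = false).take(10)
top10.foreach(println)
```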
Developing Mastery in Join Operations on Key Value Pair RDDs in Apache Spark
  • Introducing Join Operations on Relational Data with Examples
  • Getting Started with the Join Operation on Key-Value Pair RDDs in Spark
  • Working towards complex Join Operations in Apache Spark with advanced indexing
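A minimal sketch of joining two pair RDDs keyed by the same id (the data is illustrative):

```scala
// Two small pair RDDs keyed by a customer id
val customers = sc.parallelize(Seq((1, "Alice"), (2, "Bob"), (3, "Carol")))
val orders    = sc.parallelize(Seq((1, 250.0), (1, 75.0), (3, 120.0)))

// Inner join: keeps only keys present on both sides -> RDD[(Int, (String, Double))]
val joined = customers.join(orders)

// Outer variants keep unmatched keys from one side -> RDD[(Int, (String, Option[Double]))]
val withAll = customers.leftOuterJoin(orders)

joined.collect().foreach(println)
```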
Assignment - A Real Life Relational Dataset about Retail Customers
  • Setting context and developing understanding of relationships in the dataset
  • Challenge 1 - Top 5 States with the Most Cancelled Orders
  • Challenge 2 - Top 5 Cities in the State of CA with Cancelled Orders
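A hedged sketch of challenge 1; the file locations, the column layout of the orders and customers files, and the exact status string are assumptions to be checked against the real dataset:

```scala
// Assumed layouts: orders(order_id, date, customer_id, status),
// customers(customer_id at index 0, ..., city at index 6, state at index 7)
val orders    = sc.textFile("hdfs:///data/retail_db/orders").map(_.split(","))
val customers = sc.textFile("hdfs:///data/retail_db/customers").map(_.split(","))

val cancelled = orders
  .filter(cols => cols(3) == "CANCELED")
  .map(cols => (cols(2).toInt, 1))                 // key by customer_id

val custByState = customers.map(cols => (cols(0).toInt, cols(7)))

// Top 5 states by number of cancelled orders
val top5 = cancelled.join(custByState)
  .map { case (_, (one, state)) => (state, one) }
  .reduceByKey(_ + _)
  .sortBy({ case (_, n) => n }, ascending = false)
  .take(5)
top5.foreach(println)
```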
Apache Spark - Advanced Concepts
  • Introducing Caching in RDDs, Motivation and Relation to DAG Based Execution
  • Caching and Persistence in RDDs in Action
  • Technique: Finding and Filtering Dirty Records in Data-Set using Apache Spark
  • Sentiment Analysis of Trump's Tweets using Azure Cognitive Services & Databricks
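A short sketch of caching and of the dirty-record filtering technique mentioned above; the file and column index are placeholders, and the Azure-based sentiment lecture is not sketched here to avoid guessing at that API:

```scala
val retail = sc.textFile("hdfs:///data/online_retail.csv")

// Cache an RDD that several downstream actions reuse, so the lineage (DAG)
// is not re-executed from the source file each time
val rows = retail.map(_.split(",")).cache()   // or .persist(StorageLevel.MEMORY_AND_DISK)

// Technique: isolate dirty records, e.g. rows whose quantity column is not numeric
// (column index assumed for illustration)
val dirty = rows.filter(cols => scala.util.Try(cols(3).toInt).isFailure)
val clean = rows.filter(cols => scala.util.Try(cols(3).toInt).isSuccess)

println(s"dirty = ${dirty.count()}, clean = ${clean.count()}")
```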
Bonus Section
  • My lecture to University of Tromso students - When Databases Meet Hadoop
  • Bonus Lecture: Exceptional Discount on My Course(s)/Book(s)