Introduction to Course
  • Introduction
  • Course Overview
  • Frequently Asked Questions
  • What is Spark? Why Python?
Setting up Python with Spark
  • Set-up Overview
  • Note on Installation Sections
Databricks Setup
  • Recommended Setup
  • Databricks Setup
Local VirtualBox Set-up
  • Local Installation VirtualBox Part 1
  • Local Installation VirtualBox Part 2
  • Setting up PySpark
AWS EC2 PySpark Set-up
  • AWS EC2 Set-up Guide
  • Creating the EC2 Instance
  • SSH with Mac or Linux
  • Installations on EC2
AWS EMR Cluster Setup
  • AWS EMR Setup
Python Crash Course
  • Introduction to Python Crash Course
  • Jupyter Notebook Overview
  • Python Crash Course Part One
  • Python Crash Course Part Two
  • Python Crash Course Part Three
  • Python Crash Course Exercises
  • Python Crash Course Exercise Solutions
Spark DataFrame Basics
  • Introduction to Spark DataFrames
  • Spark DataFrame Basics
  • Spark DataFrame Basics Part Two
  • Spark DataFrame Basic Operations
  • Groupby and Aggregate Operations
  • Missing Data
  • Dates and Timestamps
Spark DataFrame Project Exercise
  • DataFrame Project Exercise
  • DataFrame Project Exercise Solutions
Introduction to Machine Learning with MLlib
  • Introduction to Machine Learning and ISLR
  • Machine Learning with Spark and Python with MLlib
Linear Regression
  • Linear Regression Theory and Reading
  • Linear Regression Documentation Example
  • Regression Evaluation
  • Linear Regression Example Code Along
  • Linear Regression Consulting Project
  • Linear Regression Consulting Project Solutions
Logistic Regression
  • Logistic Regression Theory and Reading
  • Logistic Regression Example Code Along
  • Logistic Regression Code Along
  • Logistic Regression Consulting Project
  • Logistic Regression Consulting Project Solutions
Decision Trees and Random Forests
  • Tree Methods Theory and Reading
  • Tree Methods Documentation Examples
  • Decision Trees and Random Forest Code Along Examples
  • Random Forest Classification Consulting Project
  • Random Forest Classification Consulting Project Solutions
K-means Clustering
  • K-means Clustering Theory and Reading
  • K-means Clustering Documentation Example
  • Clustering Example Code Along
  • Clustering Consulting Project
  • Clustering Consulting Project Solutions
Collaborative Filtering for Recommender Systems
  • Introduction to Recommender Systems
  • Recommender System - Code Along Project
Natural Language Processing
  • Introduction to Natural Language Processing
  • NLP Tools Part One
  • NLP Tools Part Two
  • Natural Language Processing Code Along Project
Spark Streaming with Python
  • Introduction to Streaming with Spark!
  • Spark Streaming Documentation Example
  • Spark Streaming Twitter Project - Part One
  • Spark Streaming Twitter Project - Part Two
  • Spark Streaming Twitter Project - Part Three
Bonus
  • Bonus Lecture: