Introduction
  • Instructor and Course Introduction
  • Pre-requisites - What you'll need for this course
  • Course Objectives
  • Course Content, Convention and Resources
AWS Serverless Analytics and Data Lake Basics
  • Section Agenda
  • What is Serverless Computing?
  • Basics of AWS Serverless Data Lake Architecture
Amazon S3 - Test-Data Setup
  • Section Agenda
  • Lab: Sample Data Setup on Amazon S3
  • Lab: Amazon S3 - Analytics Configuration
Amazon Redshift - Cluster and Sample Data Setup
  • Section Agenda
  • Amazon Redshift - Introduction and Pre-requisites
  • Amazon Redshift - Developing a Redshift Cluster
  • Amazon Redshift - Installing Client Tools
  • Amazon Redshift - Installing Sample Data
AWS Glue - Architecture and Setup
  • Section Agenda
  • AWS Glue - Architecture
  • AWS Glue - Terminology
  • AWS Glue - Applications
  • AWS Glue - Internals
  • AWS Glue - Cost
  • Lab: AWS Glue - Security and Privileges Setup
  • AWS Glue - Advanced Network Configuration
  • Lab: AWS Glue - Advanced Network Configuration
AWS Glue - Database Objects
  • Section Agenda
  • AWS Glue - Data Catalog
  • Lab: AWS Glue - Databases
  • AWS Glue - Tables
  • AWS Glue - Designing Tables
AWS Glue - Crawlers
  • Section Agenda
  • AWS Glue - Introduction to Crawlers
  • Lab - Introduction to AWS Glue Classifiers
  • Lab 1 - AWS Glue - Developing Data Catalog with Crawlers
  • Lab 2 - AWS Glue - Developing Data Catalog with Crawlers
  • Lab 3 - AWS Glue - Developing Data Catalog with Crawlers
  • Lab 4 - AWS Glue - Developing Data Catalog with Crawlers
  • Lab 5 - AWS Glue - Developing Data Catalog with Crawlers
  • Lab 6 - AWS Glue - Developing Data Catalog with Crawlers
  • Lab 7 - AWS Glue - Developing Data Catalog with Crawlers
AWS Glue - ETL Jobs
  • Section Agenda
  • Introduction to AWS Glue Jobs
  • Lab 1 - Developing AWS Glue Jobs
  • AWS Glue Job Properties
  • Lab 2 - Developing AWS Glue Jobs
  • Lab 3 - Assignment : Importing Data from Redshift
  • Lab 4 - Developing AWS Glue Jobs
  • AWS Glue Job Scripts and Properties
  • Lab 5 - Developing AWS Glue Jobs
  • AWS Glue - Built-in ETL Transformations and Job Bookmarks
AWS Glue - Triggers
  • Section Agenda
  • Introduction to AWS Glue Triggers
  • Lab 1 - Developing AWS Glue Triggers
  • Lab 2 - Developing AWS Glue Triggers
AWS Glue - DevOps Setup
  • Section Agenda
  • Lab: Creating an AWS Glue Development Endpoint
  • Lab: Installing and Configuring Apache Zeppelin
  • Lab: Port Forwarding Configuration
  • Lab: Integrating AWS Glue Development Endpoint with Apache Zeppelin
  • AWS Glue Monitoring
AWS Glue New Features and Releases : 2018, 2019, 2020
  • 10-Apr-2018 : AWS Glue supports timeout values for ETL Jobs
  • 10-Jul-2018 : AWS Glue supports reading from Amazon DynamoDB Tables
  • 13-Jul-2018 : AWS Glue provides additional ETL Job metrics
  • 04-Sep-2018 : AWS Glue supports data encryption at rest
  • 05-Oct-2018 : AWS Glue supports connecting SageMaker notebooks to dev endpoints
  • 15-Oct-2018 : AWS Glue supports resource based policies and permissions
  • 22-Jan-2019 : AWS Glue introduces Python Shell Jobs
  • 04-Feb-2019 : Download source code of AWS Glue Data Catalog Client for Hive Metastore
  • 14-Mar-2019 : AWS Glue enables running Apache Spark SQL Queries
  • 20-Mar-2019 : AWS Glue supports resource tagging
  • 05-Apr-2019 : AWS Glue supports additional options for memory-intensive jobs
  • 10-May-2019 : AWS Glue crawlers support existing Data Catalog tables as sources
  • 28-May-2019 : AWS Glue enables continuous logging for Spark ETL Jobs
  • 06-Jun-2019 : AWS Glue supports scripts compatible with Python 3.6 in Shell Jobs
  • 20-Jun-2019 : AWS Glue provides workflows to orchestrate ETL workloads
  • 25-Jul-2019 : AWS Glue supports running ETL Jobs on Spark 2.4.3 with Python 3
  • 25-Jul-2019 : AWS Glue supports additional options for memory-intensive jobs
  • 26-Jul-2019 : AWS Glue supports bookmarking Parquet and ORC Files using ETL Jobs
  • 06-Aug-2019 : Launch AWS Glue, EMR and Aurora Serverless Clusters in Shared VPCs
  • 09-Aug-2019 : AWS Glue provides FindMatches ML Transform
  • 28-Aug-2019 : AWS Glue releases binaries of Glue ETL libraries for Glue Jobs
  • 19-Sep-2019 : AWS Glue provides Apache Spark UI to monitor Glue ETL Jobs
  • 22-Oct-2019 : AWS Glue provides ability to rewind Spark ETL Job bookmarks
  • 22-Nov-2019 : AWS Glue supports FindMatches ML Transform on Spark 2.4.3 & Glue 1.0
  • 25-Nov-2019 : AWS Glue supports bringing your own JDBC driver for Spark ETL Jobs
  • 16-Jan-2020 : Glue adds new transforms - Purge, Transition and Merge
  • 03-Apr-2020 : Glue supports reading & writing to DocumentDB & MongoDB Collections
  • 03-Apr-2020 : AWS Glue supports new tables, update schema & partitions from Jobs
  • 27-Apr-2020 : AWS Glue supports serverless streaming ETL
AWS Athena - Architecture and Setup