Introduction
  • CCA 175 Spark and Hadoop Developer - Curriculum
  • Using labs for preparation
  • Setup Development Environment (Windows 10) - Introduction
  • Setup Development Environment - Python and Spark - Pre-requisites
  • Setup Development Environment - Python Setup on Windows
  • Setup Development Environment - Configure Environment Variables
  • Setup Development Environment - Setup PyCharm for developing Python applications
  • Setup Development Environment - Pass run-time arguments or parameters
  • Setup Development Environment - Download the Spark compressed tarball
  • Setup Development Environment - Install 7z to uncompress and untar on Windows
  • Setup Development Environment - Setup Spark
  • Setup Development Environment - Install JDK
  • Setup Development Environment - Configure environment variables for Spark
  • Setup Development Environment - Install WinUtils - integrate Windows and HDFS
  • Setup Development Environment - Integrate PyCharm and Spark on Windows 10
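The setup module ends with pointing PyCharm at a locally extracted Spark so that pyspark code can be developed and run from the IDE. A minimal sketch of that wiring is below; the install locations (C:\spark-1.6.3-bin-hadoop2.6 and C:\winutils) are assumptions and should be replaced with wherever the Spark tarball and winutils.exe were actually extracted.

    # Make the pyspark and py4j packages shipped with Spark importable,
    # then spin up a local SparkContext as a smoke test of the setup.
    # Paths are assumptions - adjust to the actual install locations.
    import glob
    import os
    import sys

    os.environ.setdefault("SPARK_HOME", r"C:\spark-1.6.3-bin-hadoop2.6")
    os.environ.setdefault("HADOOP_HOME", r"C:\winutils")  # folder containing bin\winutils.exe

    spark_home = os.environ["SPARK_HOME"]
    sys.path.insert(0, os.path.join(spark_home, "python"))
    sys.path.extend(glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")))

    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("SetupCheck"))
    print(sc.version)
    sc.stop()
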
Python Fundamentals
  • Introduction and Setting up Python
  • Basic Programming Constructs
  • Functions in Python
  • Python Collections
  • Map Reduce operations on Python Collections
  • Setting up Data Sets for Basic I/O Operations
  • Basic I/O operations and processing data using Collections
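A minimal sketch of the map, filter and reduce style operations on plain Python collections that this module covers, using a small hand-made list of (order_id, product_id, subtotal) tuples in place of the course data sets.

    from functools import reduce

    # (order_id, product_id, subtotal) records, standing in for order_items
    order_items = [
        (1, 957, 299.98),
        (2, 1073, 199.99),
        (2, 502, 250.00),
        (2, 403, 129.99),
        (4, 897, 49.98),
    ]

    # filter: keep the items for order id 2; map: project the subtotal
    subtotals = map(lambda item: item[2],
                    filter(lambda item: item[0] == 2, order_items))

    # reduce: add the subtotals up to get the revenue for the order
    revenue = reduce(lambda total, cur: total + cur, subtotals)
    print(round(revenue, 2))  # 579.98
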
Getting Started
  • Get revenue for given order id - as application
  • Setup Environment - Options
  • Setup Environment - Locally
  • Setup Environment - using Cloudera Quickstart VM
  • Using Itversity platforms - Big Data Developer labs and forum
  • Using Itversity's big data labs
  • Using Windows - PuTTY and WinSCP
  • Using Windows - Cygwin
  • HDFS Quick Preview
  • YARN Quick Preview
  • Setup Data Sets
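The "Get revenue for given order id - as application" lesson turns the same collection processing into a standalone script driven by run-time arguments. A minimal sketch, assuming a local copy of the comma-delimited retail_db order_items file with the order id in the second field and the subtotal in the fifth:

    import sys
    from functools import reduce

    def get_order_revenue(order_items_path, order_id):
        # Read the file into a collection, keep the items for the given
        # order id, project the subtotal and add the subtotals up
        with open(order_items_path) as fh:
            order_items = fh.read().splitlines()
        subtotals = [float(rec.split(",")[4])
                     for rec in order_items
                     if int(rec.split(",")[1]) == order_id]
        return reduce(lambda total, cur: total + cur, subtotals, 0.0)

    if __name__ == "__main__":
        # e.g. python get_order_revenue.py /data/retail_db/order_items/part-00000 2
        print(round(get_order_revenue(sys.argv[1], int(sys.argv[2])), 2))
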
Apache Spark 1.6 - Transform, Stage and Store
  • Introduction
  • Introduction to Spark
  • Setup Spark on Windows
  • Quick overview about Spark documentation
  • Connecting to the environment
  • Initializing Spark job using pyspark
  • Create RDD from HDFS files
  • Create RDD from collection - using parallelize
  • Read data from different file formats - using sqlContext
  • Row level transformations - String Manipulation
  • Row Level Transformations - map
  • Row Level Transformations - flatMap
  • Filtering data using filter
  • Joining Data Sets - Introduction
  • Joining Data Sets - Inner Join
  • Joining Data Sets - Outer Join
  • Aggregations - Introduction
  • Aggregations - count and reduce - Get revenue for order id
  • Aggregations - reduce - Get order item with minimum subtotal for order id
  • Aggregations - countByKey - Get order count by status
  • Aggregations - understanding combiner
  • Aggregations - groupByKey - Get revenue for each order id
  • Aggregations - groupByKey - Get order items sorted by order_item_subtotal for each order id
  • Aggregations - reduceByKey - Get revenue for each order id
  • Aggregations - aggregateByKey - Get revenue and count of items for each order id
  • Sorting - sortByKey - Sort data by product price
  • Sorting - sortByKey - Sort data by category id and then by price descending
  • Ranking - Introduction
  • Ranking - Global Ranking using sortByKey and take
  • Ranking - Global Ranking using takeOrdered or top
  • Ranking - By Key - Get top N products by price per category - Introduction
  • Ranking - By Key - Get top N products by price per category - Python collections
  • Ranking - By Key - Get top N products by price per category - using flatMap
  • Ranking - By Key - Get top N priced products - Introduction
  • Ranking - By Key - Get top N priced products - using Python collections API
  • Ranking - By Key - Get top N priced products - Create Function
  • Ranking - By Key - Get top N priced products - integrate with flatMap
  • Set Operations - Introduction
  • Set Operations - Prepare data
  • Set Operations - union and distinct
  • Set Operations - intersect and minus
  • Saving data into HDFS - text file format
  • Saving data into HDFS - text file format with compression
  • Saving data into HDFS using Data Frames - json
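A minimal pyspark sketch (Spark 1.6 RDD API) of the core pattern this module builds up to: a row-level transformation followed by a reduceByKey aggregation to get revenue per order id. The HDFS path and the comma-delimited retail_db field layout (order id in the second field, subtotal in the fifth) are assumptions.

    from pyspark import SparkConf, SparkContext

    # local[*] is for a local run; on a cluster the master comes from spark-submit
    conf = SparkConf().setAppName("Revenue per order id").setMaster("local[*]")
    sc = SparkContext(conf=conf)

    order_items = sc.textFile("/public/retail_db/order_items")

    # Row-level transformation: (order_item_order_id, order_item_subtotal) pairs
    order_item_subtotals = order_items.map(
        lambda rec: (int(rec.split(",")[1]), float(rec.split(",")[4]))
    )

    # Aggregation: sum the subtotals for each order id
    revenue_per_order = order_item_subtotals.reduceByKey(lambda total, cur: total + cur)

    for order_id, revenue in revenue_per_order.sortByKey().take(10):
        print(order_id, round(revenue, 2))

    sc.stop()
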
Apache Spark 1.6 - Core Spark APIs - Get Daily Revenue Per Product
  • Problem Statement
  • Launching pyspark
  • Reading data from HDFS and filtering
  • Joining orders and order_items
  • Aggregate to get daily revenue per product id
  • Load products and convert into RDD
  • Join and sort the data
  • Save to HDFS and validate in text file format
  • Saving data in avro file format
  • Get data to local file system using get or copyToLocal
  • Develop as application to get daily revenue per product
  • Run as application on the cluster
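A minimal end-to-end sketch of the daily revenue per product problem with the Spark 1.6 RDD API: filter orders, join with order_items, aggregate per date and product id, join with products for the name, sort and save. The HDFS paths and the retail_db field layouts are assumptions; adjust them to the environment.

    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("Daily revenue per product").setMaster("local[*]"))

    orders = sc.textFile("/public/retail_db/orders")
    order_items = sc.textFile("/public/retail_db/order_items")
    products = sc.textFile("/public/retail_db/products")

    # Keep only COMPLETE or CLOSED orders -> (order_id, order_date)
    orders_filtered = orders. \
        filter(lambda rec: rec.split(",")[3] in ("COMPLETE", "CLOSED")). \
        map(lambda rec: (int(rec.split(",")[0]), rec.split(",")[1]))

    # (order_id, (product_id, subtotal))
    order_items_map = order_items. \
        map(lambda rec: (int(rec.split(",")[1]),
                         (int(rec.split(",")[2]), float(rec.split(",")[4]))))

    # Join, then aggregate revenue per (order_date, product_id)
    daily_revenue_per_product_id = orders_filtered.join(order_items_map). \
        map(lambda rec: ((rec[1][0], rec[1][1][0]), rec[1][1][1])). \
        reduceByKey(lambda total, cur: total + cur)

    # (product_id, product_name) for the final lookup
    products_map = products.map(lambda rec: (int(rec.split(",")[0]), rec.split(",")[2]))

    # (order_date, product_name, revenue), sorted by date and revenue descending
    daily_revenue_per_product = daily_revenue_per_product_id. \
        map(lambda rec: (rec[0][1], (rec[0][0], rec[1]))). \
        join(products_map). \
        map(lambda rec: (rec[1][0][0], rec[1][1], round(rec[1][0][1], 2))). \
        sortBy(lambda rec: (rec[0], -rec[2]))

    # Output path is an assumption - validate afterwards with hdfs dfs -cat
    daily_revenue_per_product. \
        map(lambda rec: ",".join(str(f) for f in rec)). \
        saveAsTextFile("/user/training/daily_revenue_per_product")

    sc.stop()
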
Apache Spark 1.6 - Data Analysis - Spark SQL or HiveQL using Spark Context
  • Different interfaces to run SQL - Hive, Spark SQL
  • Create database and tables in text file format - orders and order_items
  • Create database and tables in ORC file format - orders and order_items
  • Running SQL/Hive Commands using pyspark
  • Functions - Getting Started
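A minimal sketch of the workflow this module covers: creating a database and a text-format table and querying it through a HiveContext from pyspark (Spark 1.6 style). Database, table and path names are assumptions.

    from pyspark import SparkConf, SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(conf=SparkConf().setAppName("Spark SQL over Hive").setMaster("local[*]"))
    sqlContext = HiveContext(sc)

    sqlContext.sql("CREATE DATABASE IF NOT EXISTS retail_db_txt")
    sqlContext.sql("USE retail_db_txt")
    sqlContext.sql("""
        CREATE TABLE IF NOT EXISTS orders (
          order_id INT,
          order_date STRING,
          order_customer_id INT,
          order_status STRING
        ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    """)

    # Local path to the orders data is an assumption
    sqlContext.sql("LOAD DATA LOCAL INPATH '/data/retail_db/orders' OVERWRITE INTO TABLE orders")

    # Order count by status, as a quick validation query
    sqlContext.sql("""
        SELECT order_status, count(1) AS order_count
        FROM orders
        GROUP BY order_status
    """).show()
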