What Does the Course Cover?
  • What Does the Course Cover?
  • How to Download the Data Files and Job Files?
TALEND OVERVIEW
  • Introduction to Talend Open Studio for Big Data
  • Installing Talend Open Studio for Big Data on Windows/Mac/Linux
BIG DATA OVERVIEW
  • The Three Vs of Big Data
  • About Hadoop
  • The Hadoop Ecosystem
  • HDFS - Understanding Block Storage, NameNode and DataNode
  • HDFS - Architecture
  • MapReduce - Overview of MapReduce
  • MapReduce - Understanding MapReduce
  • MapReduce - The Key/Value Pairs of MapReduce
  • HDFS - HDFS Federation & NameNode High Availability in Hadoop 2
  • YARN - The Components of YARN
  • YARN - Lifecycle of a YARN Application
  • Big Data Overview - Quiz
Getting Started
  • Installing Cloudera CDH VM
  • Installing Hortonworks Sandbox VM
  • Opening a Talend Project
  • Creating Hadoop Cluster Metadata in Talend for HDP
  • Creating Hadoop Cluster Metadata in Talend for CDH
HDFS Components
  • HDFS - Basic Commands Using Unix Shell
  • How to create a reusable connection to HDFS?
  • How to copy a source file or folder into a target directory on HDFS?
  • How to retrieve a list of files or folders based on a filemask pattern?
  • How to copy files from HDFS to HDFS?
  • How to get files from HDFS into a local directory?
  • How to rename the selected files or a specified directory on HDFS?
  • How to check whether a file exists in a specific directory in HDFS?
  • How to delete a file located on a given HDFS?
  • How to read a file located on a given HDFS and assign a schema to it?
  • How to count the number of rows in a file in HDFS?
  • How to present the properties of a file processed in HDFS?
  • How to transfer data flows into a given HDFS file system?
  • How to transfer data in the form of a single column into a given HDFS?
  • How to compare two files on HDFS?
HIVE Components
  • What is Hive?
  • Hive Architecture
  • HiveQL vs. SQL
  • How to Connect to Hive Shell
  • How to Create Hive Managed and External Tables Using Hive Shell
  • How to Load data from HDFS & Local File System to Hive table using Hive Shell
  • How to Load data from one Hive table to another Hive table using Hive Shell
  • How to join two Hive tables using Hive Shell
  • How to read data from a Hive table and filter data using Hive Shell
  • How to open a connection to a Hive database using Talend?
  • How to close a connection to a Hive database using Talend?
  • How to create a Hive table using Talend?
  • How to extract data from Hive using Talend?
  • How to write data of different formats into a given Hive table using Talend?
  • How to execute the HiveQL query using Talend?
  • Hive - Quiz
PIG Components
  • What is Pig?
  • What are the different data types supported by Pig?
  • How to assign a schema to an input file using the Grunt shell?
  • What are aliases and relations, and how to load a file into a Pig alias?
  • Pig - GROUP, GROUP ALL, DUMP, STORE, FILTER, LIMIT Operators
  • Pig - FOREACH, COUNT, MAX Operators
  • Pig - ORDER BY, DISTINCT, JOIN, COGROUP Operators
  • How to load input data to an output stream in one single transaction?
  • How to filter data from a relation based on conditions?
  • How to select one or more columns from a relation?
  • How to store the result of your Pig Job into a defined data storage space?
  • How to remove duplicate tuples in a relation?
  • How to perform the Pig COGROUP operation?
  • How to perform aggregations on input data to create data to be used by Pig?
  • How to perform join of two files based on join keys?
  • How to sort a relation based on one or more defined sort keys?
  • How to duplicate the incoming schema into identical output flows as needed?
  • How to compute the cross data of two or more relations?
  • How to transform data from multiple sources to multiple targets?
  • How to integrate personalized Pig Code into a Talend Job?
  • Pig - Quiz
SQOOP Components
  • What is Sqoop?
  • How to transfer data from an RDBMS into HDFS? - Part 1
  • How to transfer data from an RDBMS into HDFS? - Part 2
  • How to transfer all of the tables of an RDBMS into HDFS?
  • How to import incremental data? - Part 1
  • How to import incremental data? - Part 2
  • How to import incremental data? - Part 3
  • How to transfer data from HDFS to an RDBMS?
  • Sqoop - Quiz
HCATALOG Components
  • What is HCatalog?
  • How to perform operations on an HCatalog-managed Hive database/table/partition
  • How to load data into a Hive table from a file on HDFS using HCatalog?
  • How to load data into a Hive table from a file on the local system using HCatalog?
  • How to read/extract data from Hive tables using HCatalog?
HBASE Components
  • What is HBase?
  • How to open a connection to an HBase database?
  • How to close an active connection to an HBase database?