During the class you will cover the following Hadoop topics.
- Linux (Ubuntu/CentOS) – Tips and Tricks
- Introduction to Big Data and Hadoop
- Hadoop ecosystem concepts
- Hadoop MapReduce concepts and features
- Hive concepts
- Oozie workflow concepts
- Sqoop Data Ingestion
- Flume Agents
- Real-world tools such as Hue, PuTTY, FileZilla, and Cloudera Manager
- Real-world projects
Hadoop Cluster Understanding
- Hadoop 2.x Architecture
- Typical workflow
- HDFS Commands
- Writing files to HDFS
- Reading files from HDFS
- Rack awareness
- Hadoop daemons
MapReduce
- MapReduce overview
- Word count problem
- Word count flow and solution
- MapReduce flow
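The word count flow listed above can be sketched in plain Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine, not Hadoop's Java API; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` and the sample input are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle and sort: group all emitted values by key, ordered by key."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return dict(sorted(grouped.items()))


def reduce_phase(word, counts):
    """Reducer: sum the list of counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in order over an iterable of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical two-line input standing in for a file stored in HDFS.
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
    # prints {'data': 2, 'hadoop': 2, 'processes': 1, 'stores': 1}
```

In a real cluster the mappers and reducers run in parallel on different nodes and the framework performs the shuffle; here all three steps run sequentially so the data flow is easy to follow.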
Hive
- Hive Architecture
- Types of Metastore
- Hive Data Types
- HiveQL
- File Formats – Comparison of Parquet, ORC, SequenceFile, and Avro
- Partitioning & Bucketing
- Hive JDBC Client
- Hive UDFs
- Hive SerDes
- Hands-on exercises
Pig Understanding
- Pig Architecture
- Pig Data Types
- Load/Store Functions
- Pig Latin
- Pig UDFs
Sqoop
- Sqoop Architecture
- Sqoop Import Command Arguments, Incremental Import
- Sqoop Export
- Sqoop Jobs
Flume
- Flume Architecture
- Flume Agent Setup
- Types of sources, channels, and sinks
- Multi-agent flow
- Hands-on exercises
Oozie
- Oozie Fundamentals
- Oozie workflow creation
- Oozie Job submission, monitoring, debugging
- Concepts on Coordinators and Bundles
- Hands-on exercises
Hadoop Real-Time Project Implementation
Thanks for visiting this post. Please contact us if you are looking for training.