Hadoopexpress - Big Data Training, Consulting and Development

Hadoop Professional Certificate (Batch 1)

Time
  • EST: Saturday 21:00 - 00:00
  • IST: Sunday 07:30 - 10:30

This course is designed for programmers and architects who want to learn Hadoop from scratch or deepen their understanding of Hadoop and its ecosystem. All essential topics of the core Hadoop framework, as well as popular tools such as Pig, Hive and Sqoop, are included as essential components of the course modules. The course aims to turn new Hadoop learners into practitioners.


Course Syllabus

The course moves quickly from the basic concepts of Hadoop and Big Data to technical topics: installing Hadoop clusters, creating Map Reduce programs, file systems, streaming, data movement and storage in Hadoop clusters, practical demonstrations of Pig and Hive, and essential concepts of Sqoop and Zookeeper.


It’s an ideal course for getting up to speed quickly on Big Data and Hadoop in order to start writing useful programs using the Hadoop framework.


  • Session 1 : Introduction to Hadoop and Big Data

  • This session introduces concepts of Hadoop and Big Data and their relevance in the industry. Uses of Big Data are explained. The architecture of Hadoop and an explanation of Hadoop ecosystem are part of this introduction.


  • Session 2 : Installing and Configuring a Hadoop Cluster

  • This unit deals with concepts of Hadoop installation. Detailed explanation of each step is provided with demos from an actual installation. Commands for Hadoop cluster startup, shutdown and general monitoring are explained.


  • Session 3 : Hands-on: Creating a Hadoop Cluster on Ubuntu


  • Participants get a chance to practice the installation steps learned in the previous session, with guidance from the instructor.


  • Session 4 : File System Commands

  • Common commands of the UNIX file system and the Hadoop file system are covered in this session. Differences between the local file system and the Hadoop Distributed File System (HDFS) are explained.
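As an illustration of the kinds of commands this session covers, here are a few common HDFS shell operations alongside a note on their local UNIX counterparts. The paths are hypothetical, and on older Hadoop releases the command is `hadoop fs` rather than `hdfs dfs`:

```shell
# Create a directory in HDFS and copy a local file into it
hdfs dfs -mkdir -p /user/alice/input
hdfs dfs -put data.txt /user/alice/input/

# List and view files in HDFS
hdfs dfs -ls /user/alice/input
hdfs dfs -cat /user/alice/input/data.txt

# Local UNIX equivalents: mkdir -p, cp, ls, cat
```

These commands require a running Hadoop cluster; they are shown here only to give a flavor of the session.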


  • Session 5 : Map Reduce Framework

  • The Map Reduce framework is explained with emphasis on the key-value pair concept and on key terms involved in Map Reduce, such as input splits, combiners, mappers, reducers and sort. The word count example is explained in detail to illustrate Map Reduce.
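The word-count flow described above can be sketched as a small simulation in plain Python. This is not actual Hadoop API code, and the input lines are invented for illustration; it only mirrors the map, shuffle/sort and reduce phases conceptually:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle/sort phase: sort the pairs and group them by key,
    # as Hadoop does between the map and reduce phases.
    return groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))

def reducer(word, pairs):
    # Reduce phase: sum the 1s emitted for each occurrence of the word.
    return word, sum(count for _, count in pairs)

lines = ["the quick brown fox", "the lazy dog"]
mapped = [pair for line in lines for pair in mapper(line)]
result = dict(reducer(word, group) for word, group in shuffle(mapped))
print(result)  # {'brown': 1, 'dog': 1, 'fox': 1, 'lazy': 1, 'quick': 1, 'the': 2}
```

In a real job, the mapper and reducer would run in parallel across the cluster, and Hadoop itself would perform the shuffle over the network.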


  • Session 6 : Streaming

  • Hadoop streaming is explained in detail with a demo of a streaming example in Python. The lesson covers how streaming is executed in Hadoop, the mechanism Hadoop follows, and a practical demo of the command options.
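As a sketch of what a streaming word-count job looks like, here are a mapper and a reducer written as plain Python functions. In a real job each would be a separate script reading sys.stdin, submitted via the hadoop-streaming jar with -mapper and -reducer options; the driver at the bottom just simulates that pipeline locally:

```python
def map_stream(lines):
    # Mapper: emit "word<TAB>1" for each word, as Hadoop streaming expects
    # on standard output.
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reduce_stream(lines):
    # Reducer: streaming input arrives sorted by key, so equal words are
    # adjacent and can be summed with a running total.
    current, total = None, 0
    for line in lines:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                yield f"{current}\t{total}"
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield f"{current}\t{total}"

# Simulate the framework: map, then sort (Hadoop's shuffle), then reduce.
mapped = sorted(map_stream(["to be or not to be"]))
print(list(reduce_stream(mapped)))  # ['be\t2', 'not\t1', 'or\t1', 'to\t2']
```

The tab-separated key/value convention is what lets any language (Python, Ruby, even shell) plug into the Map Reduce framework.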


  • Session 7 : Practise Creating Streaming Mappers and Reducers

  • Participants get a chance to practice writing programs in non-Java languages and submitting them to Hadoop streaming for execution.


  • Session 8 : File I/O

  • A detailed explanation of how files may be handled programmatically in Hadoop using Java programs. Sample programs are explained to give a deep understanding of the classes used for reading files, writing to files, and querying file contents as well as file metadata.


  • Session 9 : Hands-On Writing File I/O programs

  • Students get a chance to practise file handling using Java programs. Exercises are provided by the instructor.


  • Session 10 : Distcp and Archive

  • Copying files over the distributed file system is explained in detail and Hadoop Archive system commands are dealt with in this session.
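A sketch of the kind of commands this session deals with; the cluster addresses and paths below are hypothetical:

```shell
# Copy a directory between clusters with DistCp (runs as a Map Reduce job)
hadoop distcp hdfs://nn1:8020/data/logs hdfs://nn2:8020/backup/logs

# Pack many small files into a Hadoop Archive (.har) to reduce NameNode load
hadoop archive -archiveName logs.har -p /data/logs /archives

# Browse the archive through the har:// filesystem scheme
hdfs dfs -ls har:///archives/logs.har
```

Both tools require a running cluster; they are shown only to preview the session.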


  • Session 11 : PIG

  • The PIG language is explained and demonstrated. Commands and examples are provided to show how PIG is used to transform unstructured data into structured, formatted forms in Hadoop clusters.
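A small illustrative Pig Latin script in the spirit of this session; the file paths and field names are invented for the example:

```pig
-- Load a space-delimited log file into a structured relation
raw = LOAD 'input/access.log' USING PigStorage(' ')
      AS (ip:chararray, url:chararray);

-- Count hits per URL and rank them
by_url = GROUP raw BY url;
hits = FOREACH by_url GENERATE group AS url, COUNT(raw) AS n;
ranked = ORDER hits BY n DESC;

STORE ranked INTO 'output/url_counts';
```

Each statement builds a named relation, and Pig compiles the whole script into Map Reduce jobs behind the scenes.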


  • Session 12 : PIG Continued with Hands-On

  • This session continues the discussion of PIG commands and provides hands-on exercises for students to practice basic commands in the PIG language.

  • Session 13 : HIVE

  • A detailed explanation of the Hive language is covered in this session. Participants learn to use Hive to query data with SQL-like commands.
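An illustrative HiveQL snippet of the kind covered in this session; the table and column names are invented for the example:

```sql
-- Define a table over tab-separated data already sitting in HDFS
CREATE TABLE page_views (ip STRING, url STRING, ts STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/user/alice/views.tsv' INTO TABLE page_views;

-- A familiar SQL-style aggregation, executed as Map Reduce jobs
SELECT url, COUNT(*) AS hits
FROM page_views
GROUP BY url
ORDER BY hits DESC
LIMIT 10;
```

This SQL-like surface is what makes Hive approachable for analysts who are not Java programmers.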

  • Session 14 : HIVE Continued with Hands-On

  • In this instructor-led session participants gain further knowledge of Hive and are provided challenge exercises to get a hands-on experience of Hive. This equips the attendees to use Hive in real world situations.


  • Session 15 : Map Reduce Programming

  • This session provides a deeper understanding of Map Reduce programming by touching upon more advanced concepts and by examining live code examples.


  • Session 16 : Sqoop and Zookeeper

  • Participants learn how data residing in external sources, such as MySQL databases, may be imported into Hadoop clusters, as well as how to export data out of Hadoop to external sources. A conceptual understanding of Zookeeper is also provided, covering the features Zookeeper offers to handle partial failures during the transfer of messages between nodes.
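For illustration, typical Sqoop import and export invocations; the hostnames, credentials and paths below are hypothetical:

```shell
# Import a MySQL table into HDFS, splitting the work across 4 map tasks
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username etl --password-file /user/etl/.pw \
  --table orders \
  --target-dir /user/etl/orders \
  --num-mappers 4

# Export processed results from HDFS back into a MySQL table
sqoop export \
  --connect jdbc:mysql://dbhost/sales \
  --username etl --password-file /user/etl/.pw \
  --table order_summary \
  --export-dir /user/etl/summary
```

Sqoop generates Map Reduce jobs under the hood, so transfers scale with the cluster rather than running on a single machine.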

Course Structure

All sessions are 1.5 hours long. Quizzes and/or exercises follow each session.

  • Session 1 : Introduction to Hadoop and Big Data

  • Session 2 : Installing and Configuring a Hadoop Cluster

  • Session 3 : Hands-on: Creating a Hadoop Cluster on Ubuntu

  • Session 4 : File System Commands

  • Session 5 : Map Reduce Framework

  • Session 6 : Streaming

  • Session 7 : Practise Creating Streaming Mappers and Reducers

  • Session 8 : File I/O

  • Session 9 : Hands-On Writing File I/O programs

  • Session 10 : Distcp and Archive

  • Session 11 : PIG

  • Session 12 : PIG Continued with Hands-On

  • Session 13 : HIVE

  • Session 14 : HIVE Continued with Hands-On

  • Session 15 : Map Reduce Programming

  • Session 16 : Sqoop and Zookeeper

Course Logistics

The course is delivered over the internet by an instructor through live web sessions.


Students are provided a web conference link allowing them to join the course online at the scheduled date and time. Students can dial in by phone or use their computer microphone and speakers (or a headset).


The course is divided into 16 lectures, each two hours long. This includes time for initial setup (getting connected over the internet and dialing into the web conference) as well as a recap of previous lecture(s) and/or Q&A.

The instructor sessions are recorded and made available to students from their online dashboard. This is helpful for students who miss a lecture or wish to revisit a lecture at their own pace after the live sessions.


Further, students receive soft copies of a student guide, sample code, and answers to quizzes and exercises. These are made available for download before the first lecture of the course.

Hands-on experience is provided through access to Linux computers for the entire duration of the course. An online discussion forum allows students to discuss topics among themselves as well as with the instructor. Access to the recorded videos, course material and discussion forum is disabled one month after the last lecture.

Visit the FAQ page for commonly asked questions; feel free to submit a message, chat online with a specialist, request a call back, or simply call our corporate office at the number provided.

Opportunities after the course

Hadoop is an emerging technology that has made rapid progress. It has already been adopted by a majority of Fortune 100 companies and is considered a technology of the future for the storage, retrieval and analysis of massive amounts of data. Naturally, careers in this technology are on an upswing, and the demand for professionals has started to outweigh the supply of knowledgeable practitioners.


The salaries of Hadoop programmers are in the top bracket in IT. Career opportunities exist in large companies across industry segments, including social media, banking, pharmaceuticals, energy, insurance, airlines, railways and many others.


The demand for Hadoop professionals is growing at over 20% a year and is expected to peak over the next one or two years. Large companies like Yahoo, Google, Facebook and Twitter are leading users of this relatively new technology.


Opportunities exist in IT Consulting companies as well as Fortune 500 companies.


Sessions

India: Sat, Sun 6:00 PM - 9:00 PM IST | USA: Sat, Sun 7:30 AM - 10:30 AM EST

Course at a Glance
  • Language: English
  • Skill Level: Intermediate
  • Online Classes
  • Assignments: 6
  • Project: 1
System Requirements

A high-speed internet connection; a laptop or PC with good screen resolution and the ability to connect to the internet; and a headset with microphone, or a built-in speaker and microphone on the laptop or PC.

Prerequisites
  • Prior programming experience in any language is required. Knowledge of an object-oriented programming language is highly recommended, though not essential; Java in particular would be very useful for the course.

Testimonials

" The course was very interactive and easy to understand even for a beginner like me! It helped me prepare and pass my certification soon after completing the course!! "

- Priyam

" I really loved this course. It was fast paced, very hands on with fun filled exercises. Not only do I have lifetime access to lectures and notes, I can also email the instructor any time for help! Awesome!! "

- Samuel Adlekha

" Loved the the course. The instructor was patient and provided great demos and examples. I am new to programming but felt so comfortable since it was well explained. Awesome! "

- Shveta

" It was a pleasure and great learning experience with Net Serpents under the guidance of Mr. Shashi Prakash. "

- Aijaz

Contact Us:

Hadoop is a registered trademark of the Apache Software Foundation (ASF), and Hadoop is a product owned by Apache. Hadoop Express is not affiliated in any way with the ASF. All educational material, resources, videos and other content available on this site are created and owned by Net Serpents and are intended only to provide training. This website does not own any of the products on which it provides training; many are owned by Apache, while others are owned by other organizations such as SAS and Oracle. Net Serpents LLC is committed to education and online learning. All recognizable terms and names of software, tools and programming languages that appear on this site belong to their respective copyright and/or trademark owners.