Hadoopexpress - Big Data Training, Consulting and Development

Hadoop Professional - Instructor Led

About this Course

Duration: 2 to 4 weeks
Flex timing and duration
$1499

The course is now offered on an individual-need basis. The planned time for completing the course is twenty hours, spread over two to four weeks depending on the student's needs and availability. If required, the course will be extended beyond twenty hours so that participants gain confidence and good hands-on experience. Students may take this course online (remotely) or at our facility.

The course is designed for programmers who wish to learn Hadoop from scratch or to improve their understanding of Hadoop and its ecosystem. All essential topics of the core Hadoop framework, as well as popular tools such as MapReduce, PIG, HIVE, Sqoop and OOZIE, are included in the course modules. The course aims to provide hands-on experience to those who are new to Hadoop and its ecosystem or have had minimal exposure to them. Apache Hadoop will be used on an Amazon cloud-based Linux server, to which students will be granted access for hands-on learning until the course ends.

Course Syllabus

The course covers an explanation of Big Data and Hadoop, writing MapReduce programs, using file system commands, streaming, data movement and storage in Hadoop clusters, practical demonstrations of Pig and Hive, and the essential concepts of Sqoop and Oozie.


It’s an ideal course for getting up to speed quickly on Big Data and Hadoop so that you can start writing useful programs with the Hadoop framework.


  • Topic 1 : Introduction to Hadoop and Big Data

  • Conceptual understanding of Hadoop and Big Data and their relevance in the industry. Uses of Big Data. Architecture of Hadoop and an explanation of the Hadoop ecosystem.


  • Topic 2 : Installing and Configuring a Hadoop Cluster

  • Hadoop single and multi-node cluster installation. Live demo of an actual installation. Commands for cluster startup, shutdown and monitoring.


  • Topic 3 : File System Commands

  • Common Unix and Hadoop file system commands. Differences between the local file system and the Hadoop Distributed File System (HDFS).


  • Topic 4 : Distcp and Archive

  • Detailed usage of the distcp and har commands for copying files across Hadoop file systems and archiving files in Hadoop.


  • Topic 5 : MapReduce Introduction

  • MapReduce framework with emphasis on key-value pair concept. Essential concepts of MapReduce such as input splits, combiners, mappers, reducers, shuffle and sort. The word count example with relevance to MapReduce.


  • Topic 6 : MapReduce: I/O

  • Detailed explanation of input and output types used with mappers and reducers. Reading and writing files using Java APIs within Hadoop. Sample programs for programmatically reading files, writing files and querying file contents as well as file metadata.


  • Topic 7 : MapReduce: Advanced Concepts

  • Definitions of predefined mappers and reducers such as Identity Mapper, Identity Reducer, Inverse Mapper, Chain Mapper, Token Counter Mapper and Regex Mapper. Using the Configuration API, Tool, ToolRunner and GenericOptionsParser for command line options. Writing programs that interact with the file system and its metadata. Understanding sequence files and Avro files.


  • Topic 8 : Streaming

  • Hadoop streaming explained in detail, with a demo of a streaming example using Python. The lesson covers how Hadoop executes a streaming job, the mechanism it follows, and a practical demo of the command options.


  • Topic 9 : PIG

  • PIG language demonstrated with commands and examples. Demos of using PIG to load unstructured data into Hadoop clusters in a structured, formatted form.


  • Topic 10 : HIVE

  • Demos and examples of Hive usage for querying the data using SQL commands.


  • Topic 11 : SQOOP and OOZIE

  • Sqoop examples and demos for importing data from external sources such as MySQL databases into Hadoop clusters, as well as exporting data out of Hadoop to external sources. Live demo of Oozie for scheduling scripts and jobs in Hadoop.


  • Topic 12 : Introduction to NoSQL databases and HBASE
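
As an illustration of the word count example from Topic 5 and the Python streaming demo from Topic 8, here is a minimal sketch of the map, shuffle-and-sort, and reduce steps. It is not taken from the course material: the function names (mapper, shuffle_and_sort, reducer, word_count) are hypothetical, and everything runs in a single local process. In an actual Hadoop streaming job the mapper and reducer would typically be separate scripts that read lines from standard input and write tab-separated key-value pairs to standard output, with the framework performing the sort between them.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit a (word, 1) key-value pair for every word."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def shuffle_and_sort(pairs):
    """Stand-in for the framework's shuffle and sort: order pairs by key
    so that all values for the same key end up next to each other."""
    return sorted(pairs, key=lambda pair: pair[0])


def reducer(sorted_pairs):
    """Reduce step: sum the counts for each run of identical keys."""
    for word, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Chain the three steps and return a {word: count} dictionary."""
    return dict(reducer(shuffle_and_sort(mapper(lines))))


if __name__ == "__main__":
    # Hypothetical sample input, used only for this illustration.
    sample = ["Hadoop stores big data", "Hadoop processes big data"]
    for word, total in word_count(sample).items():
        print(f"{word}\t{total}")
```

The reducer relies on its input being sorted by key, which is why the sort sits between the two steps; a combiner, also mentioned in Topic 5, would apply the same summing logic to each mapper's output before the shuffle.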

Course Logistics

How the course is delivered:

An instructor delivers the course live, either at our facility or online using a web conference. If joining remotely, you will be provided a link in your student dashboard which will take you to the live class at the time scheduled with you. Students have two choices to join the lectures:

  • Join the lecture remotely
  • Classroom training at our facility

If you prefer to attend the lectures at our facility, you must book a spot at the facility one week before the start of the course. You may do so by email or phone. Make sure you have a confirmation email from us for your booking before you arrive at the facility. After receiving a confirmation, you may arrive at the facility with or without a laptop.

Steps to join the lecture remotely:

  • If you haven’t done so already, create an account by clicking Register at the top right of the home page.
  • Log in with your user-id and password and click Enroll Now on the course card on the home page. Click Enroll Now again in the pop-up window. You will be taken to the course order page.
  • On successful payment you will receive a confirmation email.
  • On the scheduled date and time of the course, go to hadoopexpress.com and log in with your user-id and password.
  • Click on "Go to course" or "My Dashboard".
  • On your dashboard page, click the Go to course button.
  • Click on Go to live class on the right-hand side of the page.
  • You will land on a Zoom meeting page where you will be able to join the meeting.
  • Make sure you have a microphone and speaker on your laptop or a headset connected to it.


Steps to join the classroom at our facility:

  • Create an online account and enroll in the course online by paying for the course.
  • On successful payment you will receive a confirmation email.
  • Call or email us to schedule your time for the class.
  • Please email or call if you have additional questions.

Discussion Forum:

A discussion forum is available online where students can post queries or discuss any topic with other students or the instructor.

Course Material and Videos

Recordings are made available in your dashboard for self-study before, during and after the course. You may also download student guides, examples, exercises and videos.

Opportunities after the course

If you are taking the course purely for gaining knowledge of Hadoop without a Hadoop career objective, you can take this course anytime.

As a career choice, the Hadoop course should be taken by those who already have some programming or IT background. If you do not have prior knowledge of programming, software architecture or database administration, you should first gain experience in those areas. This is not a course for novices.

Hadoop is an emerging technology that has made rapid progress. It has already been adopted by a majority of Fortune 100 companies and is considered the technology of the future for the storage, retrieval and analysis of massive amounts of data. Naturally, careers in this technology are on an upswing, and demand for knowledgeable professionals has started to outweigh supply.


Salaries of Hadoop programmers are in the top bracket in IT. Career opportunities exist in large companies across industry segments, e.g., social media, banking, pharmaceuticals, energy, insurance, airlines, railways and many others.


The demand for Hadoop professionals is growing at over 20% a year and is expected to peak over the next one or two years. Large companies like Yahoo, Google, Facebook and Twitter are leading users of this relatively new technology.


Opportunities exist in IT consulting companies as well as Fortune 500 companies.


Delivery Method

Self Paced: $499

Course at a Glance

  • Language: English
  • Skill Level: Intermediate
  • Online Classes
  • Assignments: 6
  • Project: 1
  • Lifetime Access
  • Certificates

System Requirements

If taking the course remotely, you need a high-speed internet connection, a laptop or PC with good screen resolution, and either a headset with a microphone or a built-in speaker and microphone. If taking the course at our facility, a laptop with Windows or Linux installed is recommended though not essential; we can provide a workstation for running hands-on examples and exercises.

Prerequisites
  • Hadoop is written mainly in Java and is typically deployed in production environments that run UNIX, so knowledge of Java and UNIX is beneficial but not essential. We also provide free Java/UNIX training to those enrolling for the course. Prior programming background is required; it may be in any programming language.

Testimonials

" The course was very interactive and easy to understand even for a beginner like me! It helped me prepare and pass my certification soon after completing the course!! "

- Priyam

" I really loved this course. It was fast paced, very hands on with fun filled exercises. Not only do I have lifetime access to lectures and notes, I can also email the instructor any time for help! Awesome!! "

- Samuel Adlekha

" Loved the course. The instructor was patient and provided great demos and examples. I am new to programming but felt so comfortable since it was well explained. Awesome! "

- Shveta

" It was a pleasure and great learning experience with Net Serpents under the guidance of Mr. Shashi Prakash. "

- Aijaz

Hadoop is a registered trademark of the Apache Software Foundation (ASF), and Hadoop is a product owned by Apache. Hadoop Express is not affiliated in any way with ASF. All educational material, resources, videos and other content available on this site are created and owned by Net Serpents and are intended only to provide training. This website does not own any of the products on which it provides training, many of which are owned by Apache while others are owned by companies such as SAS, Python and Oracle. Net Serpents LLC is committed to education and online learning. All recognizable terms and names of software, tools and programming languages that appear on this site belong to their respective copyright and/or trademark owners.