

Hadoop Introduction

What is Hadoop?

Today, big data is a major problem, and Hadoop is a popular solution to it. Hadoop is a framework that can handle a huge amount of data on a low-cost cluster of simple hardware. A cluster here means a collection of multiple computers, and low cost means the cluster is built from commodity servers, so the hardware cost stays low; that is why Hadoop gives us an inexpensive solution to big data problems. Hadoop is scalable: we can divide the workload across multiple servers. Hadoop is also fault-tolerant: if a server or node goes down, the other servers or nodes can still process the data. Hadoop is a storage system, and we can also process data with it.

Why should we use Hadoop?

1. Hadoop covers a large share of the big data market (by some estimates 90% or more).
2. The cost is low: Hadoop runs on cheap commodity hardware.
3. Hadoop is fault-tolerant and scalable.
4. We can store structured, unstructured, and semi-structured data in it.

Different Hadoop modules

Hadoop Distributed File System (HDFS):
HDFS is the primary storage system for Hadoop, so whatever data we store in Hadoop is kept in HDFS.
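
As a concrete but minimal sketch, the Java program below writes a local file into HDFS using Hadoop's FileSystem API. The NameNode address hdfs://namenode-host:9000 and the file paths are made-up placeholders for illustration, not values from this tutorial.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // fs.defaultFS points at the NameNode; "namenode-host" is a placeholder.
        conf.set("fs.defaultFS", "hdfs://namenode-host:9000");
        FileSystem fs = FileSystem.get(conf);

        // Create a directory in HDFS and copy a local file into it.
        Path dir = new Path("/user/demo/input");
        fs.mkdirs(dir);
        fs.copyFromLocalFile(new Path("data.txt"), new Path(dir, "data.txt"));

        // List what is now stored in that HDFS directory.
        for (FileStatus status : fs.listStatus(dir)) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
        fs.close();
    }
}

The same operations are also available from the command line through the hdfs dfs shell, for example hdfs dfs -put and hdfs dfs -ls.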
MapReduce v2:
Hadoop MapReduce is used to process the data stored in HDFS, and we can write different types of applications that work with that data. The input data is split into multiple portions, and those portions are processed in parallel across the distributed environment.
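
The classic illustration of this model is word counting. The sketch below is adapted from the standard Hadoop WordCount example: the mapper emits a (word, 1) pair for every word in its input split, and the reducer sums those counts for each word.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every word in an input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

After the classes are packaged into a jar, the job is typically launched with the hadoop jar command, passing the input and output HDFS paths as arguments; the output directory must not already exist.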
Yet Another Resource Negotiator (YARN):
YARN is the resource-management and job-scheduling technology in Hadoop. It allocates system resources to the various applications running in the Hadoop cluster, and it schedules the tasks to be executed on the different cluster nodes.
Common Utilities:
These are shared modules and libraries that are used by the other modules of the Hadoop system.
