Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1. Introduction
2. Hadoop Yarn
3. Resource Manager
4. Node Manager
5. Application Master
6. Frequently Asked Questions
6.1. What do you mean by Hadoop?
6.2. What do you mean by Yarn?
6.3. What is the role of the Resource Manager in Hadoop?
6.4. What is the role of the Node Manager in Hadoop?
6.5. What does the Application Master do in Hadoop?
7. Conclusion
Last Updated: Mar 27, 2024

Managing Applications and Resources with Hadoop Yarn

Author Aniket Majhi

Introduction

Welcome readers! We hope you are doing well.

The amount of data produced is growing rapidly every day. Storing and managing this much data is a complex and challenging task for traditional data processing software.

Apache Hadoop is a collection of open-source software that solves problems involving massive amounts of data computation. 

If you want to learn more about Hadoop, you can refer to our article on Hadoop.

Hadoop Yarn is the Apache Hadoop module responsible for managing the computing resources of a cluster and using them to schedule users' applications.
In this blog, we will discuss managing applications and resources with Hadoop Yarn.
 

So, without further ado, let’s start our discussion.


Hadoop Yarn

Hadoop is an open-source framework used to efficiently store and process large datasets, ranging in size from gigabytes to petabytes.

Yarn stands for “Yet Another Resource Negotiator”. It is the cluster resource management layer of Hadoop and was introduced in Hadoop 2.0.

Yarn allows different kinds of data processing engines, such as interactive, graph, stream, and batch processing engines, to run on and process data stored in HDFS (Hadoop Distributed File System). Apart from this, it also handles job scheduling.

The fundamental idea of Yarn is to split the functions of resource management and job scheduling/monitoring into separate services. It provides two major services:

  • Global resource management (ResourceManager)
  • Per-application management (ApplicationMaster)


 


Resource Manager

The Resource Manager is the core component of Hadoop Yarn. It is the central authority for resource management and coordinates with the Node Manager running on each node of a Hadoop cluster.

The Resource Manager has two main components, the Scheduler and the Application Manager (called the ApplicationsManager in the Hadoop documentation):

  • Scheduler: The Scheduler allocates system resources to the running applications. It is a pure scheduler: it does not monitor or track the status of the applications.
     
  • Application Manager: The Application Manager accepts job submissions and negotiates the first container, in which the application's Application Master runs. If the Application Master fails, it restarts it.
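
The division of labour between these two components can be illustrated with a small Python model. This is only a toy sketch and not Yarn's API: the class names `ToyScheduler` and `ToyApplicationsManager`, the memory-only resource model, and the attempt limit of 2 are all assumptions made for illustration.

```python
class ToyScheduler:
    """Hands out memory from a fixed pool; keeps no per-application status."""

    def __init__(self, total_mb):
        self.free_mb = total_mb

    def allocate(self, requested_mb):
        # Grant the request only if enough memory is still free.
        if requested_mb > self.free_mb:
            return False
        self.free_mb -= requested_mb
        return True

    def release(self, mb):
        self.free_mb += mb


class ToyApplicationsManager:
    """Accepts submissions and restarts a (hypothetical) Application Master."""

    def __init__(self, scheduler, am_mb=512, max_attempts=2):
        self.scheduler = scheduler
        self.am_mb = am_mb                # memory for the Application Master
        self.max_attempts = max_attempts  # assumed retry limit
        self.attempts = {}                # app_id -> attempts so far

    def submit(self, app_id):
        # Negotiate the first container, which runs the Application Master.
        if not self.scheduler.allocate(self.am_mb):
            return "REJECTED"
        self.attempts[app_id] = 1
        return "ACCEPTED"

    def on_am_failure(self, app_id):
        # Restart in place (the container's memory is reused) until the
        # attempt limit is reached, then give the memory back.
        if self.attempts[app_id] >= self.max_attempts:
            self.scheduler.release(self.am_mb)
            return "FAILED"
        self.attempts[app_id] += 1
        return "RESTARTED"
```

Note that the scheduler in this sketch only counts free memory. Everything that concerns the life cycle of an application lives in the other class, which mirrors the split described above.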
     

A resource container represents the system resources granted to an application on a particular node. It describes the CPU, memory, disk, network, and other resource attributes needed to run application tasks on that node.
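
To make the idea of a container concrete, here is a minimal Python sketch. It models only two resource dimensions, memory and virtual cores, and the names `Resource`, `Container`, and `try_allocate` are hypothetical helpers chosen for this example rather than Yarn classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int

    def fits_in(self, other):
        # True if this request can be satisfied from `other`.
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores

    def minus(self, other):
        return Resource(self.memory_mb - other.memory_mb,
                        self.vcores - other.vcores)


@dataclass(frozen=True)
class Container:
    container_id: str
    node: str
    resource: Resource


def try_allocate(container_id, node, free, wanted):
    """Return (container, remaining), or (None, free) if the request does not fit."""
    if not wanted.fits_in(free):
        return None, free
    return Container(container_id, node, wanted), free.minus(wanted)
```

For example, calling `try_allocate("c1", "node-1", Resource(4096, 4), Resource(1024, 1))` returns a container bound to `node-1` together with the resources still free on that node.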

Node Manager

Each node runs a Node Manager that reports to the global Resource Manager of the cluster. It takes care of its individual compute node in the Hadoop cluster.

A few functionalities of the Node Manager are listed below:

  • The Node Manager is responsible for launching the application containers for app execution.
     
  • The Node Manager monitors the application’s CPU, disk, network, and memory usage and reports back to the Resource Manager. 
     
  • All nodes of the cluster have a certain number of containers. Containers are computing units and wrappers for node resources to perform user application tasks. 
     
  • It is also responsible for tracking the status and progress of the containers running on its node.
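
The responsibilities listed above can be sketched as a small Python class. This is a simplified model under stated assumptions: `ToyNodeManager` is a hypothetical name, only memory is tracked, and the `heartbeat` dictionary is an invented report format, not the message a real Node Manager sends.

```python
class ToyNodeManager:
    """Toy model of one node: launches containers and reports usage."""

    def __init__(self, node_id, capacity_mb):
        self.node_id = node_id
        self.capacity_mb = capacity_mb
        self.containers = {}  # container_id -> {"mb": int, "state": str}

    def used_mb(self):
        # Only running containers count against the node's memory.
        return sum(c["mb"] for c in self.containers.values()
                   if c["state"] == "RUNNING")

    def launch(self, container_id, mb):
        if self.used_mb() + mb > self.capacity_mb:
            raise ValueError("not enough free memory on " + self.node_id)
        self.containers[container_id] = {"mb": mb, "state": "RUNNING"}

    def complete(self, container_id):
        self.containers[container_id]["state"] = "COMPLETE"

    def heartbeat(self):
        # Report that would be sent back to the Resource Manager.
        used = self.used_mb()
        return {
            "node": self.node_id,
            "used_mb": used,
            "free_mb": self.capacity_mb - used,
            "states": {cid: c["state"] for cid, c in self.containers.items()},
        }
```

A Resource Manager model could call `heartbeat()` periodically to learn how much capacity each node still has and which containers have finished.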

Application Master

For each application running on the cluster, there is a corresponding Application Master.

A few essential points regarding the Application Master are listed below:

  • It is a per-application entity.

  • If more resources are necessary to support the running application, the Application Master requests them directly from the Resource Manager.

  • It works with the Node Managers to execute and monitor the application's component tasks.
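
As the FAQ in this article puts it, the Application Master negotiates resources from the Resource Manager and then works with the Node Managers to run the tasks. The following Python sketch illustrates that request flow. The classes `ToyResourceManager` and `ToyApplicationMaster`, the memory-only model, and the fill-nodes-in-name-order policy are assumptions for illustration, not Yarn behaviour.

```python
class ToyResourceManager:
    """Tracks free memory per node and grants containers on request."""

    def __init__(self, free_mb_by_node):
        self.free_mb_by_node = dict(free_mb_by_node)

    def allocate(self, count, mb_each):
        """Grant up to `count` containers of `mb_each`; return their node names."""
        granted = []
        for node in sorted(self.free_mb_by_node):
            while len(granted) < count and self.free_mb_by_node[node] >= mb_each:
                self.free_mb_by_node[node] -= mb_each
                granted.append(node)
        return granted


class ToyApplicationMaster:
    """Per-application entity: asks for containers and assigns tasks to them."""

    def __init__(self, rm):
        self.rm = rm
        self.launched = []  # (task, node) pairs

    def run(self, tasks, mb_each):
        pending = list(tasks)
        # Negotiate with the Resource Manager for one container per task.
        nodes = self.rm.allocate(len(pending), mb_each)
        # Hand each granted container a task (a real Application Master
        # would ask the Node Manager on that node to launch it).
        for task, node in zip(pending, nodes):
            self.launched.append((task, node))
        # Tasks without a container have to wait for a later request.
        return pending[len(nodes):]
```

In this model, tasks that do not receive a container are returned to the caller, which could retry once other containers have finished and released their memory.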

 


Frequently Asked Questions

What do you mean by Hadoop?

Hadoop is an open-source framework used to efficiently store and process large datasets, ranging in size from gigabytes to petabytes.
 

What do you mean by Yarn?

Yarn stands for “Yet Another Resource Negotiator”. It is the cluster resource management layer of Hadoop.
 

What is the role of the Resource Manager in Hadoop?

The Resource Manager is responsible for tracking the resources in a cluster and scheduling applications on them.
 

What is the role of the Node Manager in Hadoop?

The Node Manager is responsible for launching and monitoring the containers on its node and reporting their resource usage to the Resource Manager.
 

What does the Application Master do in Hadoop?

The Application Master is responsible for negotiating resources from the Resource Manager and working with the Node Manager to execute and monitor the containers and their resource consumption. 

Conclusion

In this article, we have extensively discussed Managing Applications and Resources with Hadoop Yarn.

We started with the basic introduction, and then we discussed:

  • Hadoop Yarn
  • Resource Manager in Hadoop Yarn
  • Node Manager in Hadoop Yarn
  • Application Master in Hadoop Yarn
     

We hope that this blog has helped you enhance your knowledge of managing applications and resources with Hadoop Yarn. If you would like to learn more, check out our articles on Hadoop Interview Questions, Spark vs Hadoop, and Big Data Engineer Salary. Do upvote our blog to help other ninjas grow.

Head over to our practice platform Coding Ninjas Studio to practice top problems, attempt mock tests, read interview experiences, follow our guided paths, and prepare for interviews at product-based companies with the Interview Bundle.

Happy Reading!
