Last Updated: Sep 15, 2024

Hadoop Commands

Author Rahul Singh

Introduction

Hadoop is a popular open-source framework designed for distributed storage and processing of large datasets. It relies on its distributed file system, HDFS, and a range of commands to manage data efficiently. 

In this article, we’ll cover essential Hadoop commands, including file management, permissions, and more. By the end, you'll be well-versed in the commonly used commands to interact with HDFS.

Hadoop FS Command Line

The Hadoop File System (FS) command line interface is a vital tool for managing files and directories within Hadoop’s distributed file system (HDFS). It allows users to store, retrieve, and manage files effectively, mimicking traditional Unix-style file operations.

List of HDFS Commands:

  1. Hadoop Touchz Command
  2. Hadoop Test Command
  3. Hadoop Text Command
  4. Hadoop Find Command
  5. Hadoop Getmerge Command
  6. Hadoop Count Command
  7. Hadoop AppendToFile Command
  8. Hadoop ls Command
  9. Hadoop mkdir Command
  10. Hadoop chmod Command
  11. Hadoop copyFromLocal Command
  12. Hadoop copyToLocal Command
  13. Hadoop cat Command
  14. Hadoop mv Command
  15. Hadoop rm Command
  16. Hadoop moveFromLocal Command
  17. Hadoop cp Command
  18. Hadoop rmr Command
  19. Hadoop du Command
  20. Hadoop dus Command
  21. Hadoop stat Command
  22. Hadoop setrep Command

1. Hadoop Touchz Command

The hadoop touchz command creates an empty file in HDFS, similar to how the touch command works in Unix-based systems.

Syntax:

hadoop fs -touchz <HDFS file path>

Example:

hadoop fs -touchz /user/hadoop/newfile.txt


This will create an empty file named newfile.txt in HDFS.

2. Hadoop Test Command

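The hadoop test command checks a condition on an HDFS path, such as whether it exists, whether it is a directory, or whether it is empty. It prints nothing; the result is reported through the exit status, which is 0 when the condition is true.

Syntax:

hadoop fs -test -[defsz] <HDFS path>

  • -d: Check if the path is a directory.

  • -e: Check if the path exists.

  • -f: Check if the path is a file.

  • -s: Check if the path is not empty.

  • -z: Check if the file is zero-length.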

Example:

hadoop fs -test -e /user/hadoop/newfile.txt
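
Because -test reports its result only through the exit status, it is most useful inside a script. The following is a minimal sketch of that pattern; hdfs_exists and report_path are hypothetical helper names chosen for illustration, not part of Hadoop.

```shell
# Hypothetical helper: succeeds when the given HDFS path exists.
hdfs_exists() {
    # -test -e exits with status 0 when the path exists, non-zero otherwise
    hadoop fs -test -e "$1"
}

# Hypothetical helper: print a one-line report for a path.
report_path() {
    if hdfs_exists "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}
```

A call such as report_path /user/hadoop/newfile.txt would then print either a "found" or a "missing" line, depending on the state of the cluster.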

3. Hadoop Text Command

The hadoop text command prints a file stored in HDFS in text form. It is typically used for compressed files, which -cat would print as raw bytes.

Syntax:

hadoop fs -text <HDFS file path>

Example:

hadoop fs -text /user/hadoop/logfile.gz

4. Hadoop Find Command

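The hadoop find command locates files and directories under an HDFS path whose names match a pattern. Unlike the Unix find command, it supports only a few expressions, chiefly -name and its case-insensitive variant -iname; it cannot filter by size.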

Syntax:

hadoop fs -find <HDFS path> -name <filename>

 

Example:

hadoop fs -find /user/hadoop/ -name 'datafile.txt'

5. Hadoop Getmerge Command

The hadoop getmerge command merges multiple files in an HDFS directory into a single file on the local file system.

Syntax:

hadoop fs -getmerge <HDFS directory> <local file>

Example:

hadoop fs -getmerge /user/hadoop/logs /local/logs/mergedlogs.txt

6. Hadoop Count Command

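The hadoop count command displays the number of directories, the number of files, and the total content size in bytes under a specified HDFS path. The output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME.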

Syntax:

hadoop fs -count <HDFS path>

Example:

hadoop fs -count /user/hadoop/
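
The output of -count is a row of whitespace-separated columns: directory count, file count, content size in bytes, and path name. As a rough sketch, the row can be turned into a readable sentence with awk; summarize_count is a hypothetical helper name, and the sketch assumes the path contains no spaces.

```shell
# Hypothetical helper: reads `hadoop fs -count` output on standard input.
# Assumed column order: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
summarize_count() {
    # %s is used for the numbers so that very large sizes print unchanged
    awk '{ printf "%s: %s dirs, %s files, %s bytes\n", $4, $1, $2, $3 }'
}
```

It could be used as: hadoop fs -count /user/hadoop/ | summarize_count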

7. Hadoop AppendToFile Command

The hadoop appendToFile command appends content from a local file to an existing HDFS file.

Syntax:

hadoop fs -appendToFile <local file> <HDFS file>

Example:

hadoop fs -appendToFile /local/data.txt /user/hadoop/datafile.txt
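
The local source does not have to be a file on disk: when it is given as a single dash, -appendToFile reads from standard input. A small sketch of how a script might append one line this way; append_line is a hypothetical helper name.

```shell
# Hypothetical helper: append one line of text to an HDFS file.
# $1 = text to append, $2 = HDFS file
append_line() {
    # "-" tells -appendToFile to read the data from standard input
    printf '%s\n' "$1" | hadoop fs -appendToFile - "$2"
}
```

For example: append_line "job finished" /user/hadoop/datafile.txt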

8. Hadoop ls Command

The hadoop ls command lists the contents of a directory in HDFS.

Syntax:

hadoop fs -ls <HDFS directory>

Example:

hadoop fs -ls /user/hadoop/

9. Hadoop mkdir Command

The hadoop mkdir command creates a new directory in HDFS. Add the -p option to create any missing parent directories along the way.

Syntax:

hadoop fs -mkdir <HDFS directory>

Example:

hadoop fs -mkdir /user/hadoop/newdirectory

10. Hadoop chmod Command

The hadoop chmod command modifies the permissions of files and directories in HDFS.

Syntax:

hadoop fs -chmod <permissions> <HDFS file/directory>

Example:

hadoop fs -chmod 755 /user/hadoop/newfile.txt
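
The octal mode follows the usual Unix convention: one digit each for the owner, the group, and others, where read is 4, write is 2, and execute is 1. The sketch below decodes a three-digit mode into its rwx form to make that mapping concrete; digit_to_rwx and octal_to_symbolic are hypothetical helper names and do not call Hadoop at all.

```shell
# Hypothetical helper: map one octal digit to its rwx triplet.
digit_to_rwx() {
    case "$1" in
        0) echo '---' ;;
        1) echo '--x' ;;
        2) echo '-w-' ;;
        3) echo '-wx' ;;
        4) echo 'r--' ;;
        5) echo 'r-x' ;;
        6) echo 'rw-' ;;
        7) echo 'rwx' ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: convert a three-digit octal mode such as 755.
octal_to_symbolic() {
    mode=$1
    # reject anything that is not exactly three octal digits
    case "$mode" in
        [0-7][0-7][0-7]) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    owner=${mode%??}     # first digit
    rest=${mode#?}
    group=${rest%?}      # second digit
    others=${mode#??}    # third digit
    echo "$(digit_to_rwx "$owner")$(digit_to_rwx "$group")$(digit_to_rwx "$others")"
}
```

With this helper, octal_to_symbolic 755 prints rwxr-xr-x: full access for the owner, read and execute for everyone else.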

11. Hadoop copyFromLocal Command

The hadoop copyFromLocal command transfers a file from the local file system to HDFS.

Syntax:

hadoop fs -copyFromLocal <local file path> <HDFS file path>


Example:

hadoop fs -copyFromLocal /local/data.txt /user/hadoop/data.txt


This command copies data.txt from your local machine to HDFS.
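
By default -copyFromLocal fails if the destination already exists (the -f option overwrites it instead). A script that should leave existing files alone can check first with -test. This is only a sketch; upload_if_absent is a hypothetical helper name.

```shell
# Hypothetical helper: upload a local file unless the HDFS target exists.
# $1 = local file, $2 = HDFS destination
upload_if_absent() {
    src=$1
    dest=$2
    if hadoop fs -test -e "$dest"; then
        echo "skipped: $dest already exists"
    else
        # report success only when the copy itself succeeded
        hadoop fs -copyFromLocal "$src" "$dest" && echo "uploaded: $dest"
    fi
}
```

For example: upload_if_absent /local/data.txt /user/hadoop/data.txt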

12. Hadoop copyToLocal Command

The hadoop copyToLocal command transfers a file from HDFS to the local file system.

Syntax:

hadoop fs -copyToLocal <HDFS file path> <local file path>


Example:

hadoop fs -copyToLocal /user/hadoop/data.txt /local/data.txt

13. Hadoop cat Command

The hadoop cat command prints the content of a file in HDFS to the console.

Syntax:

hadoop fs -cat <HDFS file path>

 

Example:

hadoop fs -cat /user/hadoop/data.txt

14. Hadoop mv Command

The hadoop mv command moves or renames a file or directory within HDFS.

Syntax:

hadoop fs -mv <source path> <destination path>

Example:

hadoop fs -mv /user/hadoop/oldfile.txt /user/hadoop/newfile.txt

15. Hadoop rm Command

The hadoop rm command deletes files from HDFS. With the -r option, it also deletes directories and their contents.

Syntax:

hadoop fs -rm [-r] <HDFS path>

  • -r: Recursively deletes the directory and everything under it.

Example:
 

hadoop fs -rm /user/hadoop/data.txt
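
Recursive deletes are easy to get wrong in scripts, for instance when a variable holding the path turns out to be empty. One possible guard is sketched below; safe_rm is a hypothetical wrapper, not a Hadoop command, and the checks it performs are only an illustration.

```shell
# Hypothetical wrapper: refuse obviously dangerous targets before deleting.
safe_rm() {
    target=$1
    # refuse an empty argument and the filesystem root
    case "$target" in
        ''|/) echo "refusing to delete '$target'" >&2; return 1 ;;
    esac
    # -r removes directories recursively; plain files are removed as well
    hadoop fs -rm -r "$target"
}
```

For example: safe_rm /user/hadoop/old_dir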

16. Hadoop moveFromLocal Command

Moves a file or directory from the local filesystem to HDFS. Unlike -copyFromLocal, this command moves the file, which means it will be deleted from the local filesystem after the move.

Syntax: 

hadoop fs -moveFromLocal <local path> <HDFS path>

Example:

hadoop fs -moveFromLocal /home/user/file.txt /user/hadoop/

17. Hadoop cp Command

Copies files or directories from one location to another within HDFS. It does not move the files but creates a copy of them.

Syntax: 

hadoop fs -cp <source path> <destination path>

Example:

hadoop fs -cp /user/hadoop/file.txt /user/hadoop/backup/file.txt

18. Hadoop rmr Command (Deprecated in Newer Versions)

Deletes a file or directory recursively from HDFS. This command is deprecated in favor of -rm -r.

Syntax: 

hadoop fs -rmr <HDFS path>

Example:

hadoop fs -rmr /user/hadoop/old_dir

19. Hadoop du Command

Displays the disk usage of the specified directory or file in HDFS. It provides the size of each file or directory.

Syntax: 

hadoop fs -du <HDFS path>

Example:

hadoop fs -du /user/hadoop/
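
Each line of -du output starts with the size in bytes of one entry, so a grand total can be computed by summing the first column. The sketch below assumes that layout (newer Hadoop versions print an extra column before the path, which does not affect the first one); total_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: sum the first column of `hadoop fs -du` output.
total_bytes() {
    # %.0f prints the sum as a whole number, even for very large totals
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}
```

It could be used as: hadoop fs -du /user/hadoop/ | total_bytes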

20. Hadoop dus Command

Provides a summary of disk usage for a directory, showing only the total size. This command is deprecated in favor of -du -s.

Syntax: 

hadoop fs -dus <HDFS path>

Example:

hadoop fs -dus /user/hadoop/

21. Hadoop stat Command

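Displays information about a file or directory, such as size, modification time, and permissions. The optional format string selects which details to display, for example %n for the name, %b for the size in bytes, %r for the replication factor, and %y for the modification time.

Syntax: 

hadoop fs -stat [format] <HDFS path>

Example:

hadoop fs -stat "%n %b" /user/hadoop/file.txt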
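
Since -stat prints exactly the fields named in the format string, its output is convenient to capture in a script. The following sketch reads a file's size with the %b specifier (size in bytes) and compares it with a threshold; hdfs_size and is_larger_than are hypothetical helper names.

```shell
# Hypothetical helper: print the size of an HDFS file in bytes.
hdfs_size() {
    hadoop fs -stat "%b" "$1"
}

# Hypothetical helper: succeed when the file is larger than $2 bytes.
is_larger_than() {
    # return 2 if the size could not be read at all
    size=$(hdfs_size "$1") || return 2
    [ "$size" -gt "$2" ]
}
```

For example: is_larger_than /user/hadoop/file.txt 1048576 && echo "over 1 MiB"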

22. Hadoop setrep Command

Sets the replication factor for a file in HDFS. If the path is a directory, the command changes the replication factor of all files under it. The -w option waits for the replication to complete.

Syntax:

hadoop fs -setrep [-w] <replication factor> <HDFS path>

Example:

hadoop fs -setrep -w 3 /user/hadoop/file.txt
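
The replication factor directly affects how much raw disk space a file consumes: for a conventionally replicated file, every block is stored once per replica, so the raw footprint is roughly the file size multiplied by the replication factor. A back-of-the-envelope sketch of that arithmetic follows; raw_bytes is a hypothetical helper name and does not query the cluster.

```shell
# Hypothetical helper: estimate raw storage used by a replicated file.
# $1 = logical file size in bytes, $2 = replication factor
raw_bytes() {
    echo $(( $1 * $2 ))
}
```

For instance, raw_bytes 134217728 3 prints 402653184, meaning a 128 MiB file with replication factor 3 occupies about 384 MiB across the cluster.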

Frequently Asked Questions

What is the purpose of hadoop fs -ls?

The hadoop fs -ls command shows the files and folders in a directory on HDFS, including file permissions, size, and modification date.

What does the hadoop getmerge command do?

The hadoop getmerge command merges several files in HDFS into one file on your local system, which helps combine smaller files into a single large one.

How can I copy a file from my local machine to HDFS?

You can use the hadoop fs -copyFromLocal command to transfer files from your local system to HDFS.

Conclusion

This article discussed various essential Hadoop commands for managing and interacting with HDFS. From directory creation to transferring files between local and HDFS, these commands are fundamental for efficient data management in a Hadoop environment. Understanding them is vital for developers and data engineers working with big data systems.

You can also check out our other blogs on Code360.
