Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Pandas
3.
Pandas read excel
3.1.
Code 
3.2.
Output 
4.
Parameters of pandas read excel
5.
Understanding the parameter of panda read excel 
5.1.
io                 
5.1.1.
Code 
5.1.2.
Output
5.1.3.
Explanation
5.2.
sheet_name
5.2.1.
Code
5.2.2.
Output
5.2.3.
Explanation
5.3.
header
5.3.1.
Code
5.3.2.
Output
5.3.3.
Explanation
5.4.
use_col
5.4.1.
Method 1
5.4.2.
Method 2
5.4.3.
Output 1
5.4.4.
Output 2
5.4.5.
Explanation
5.5.
index_col
5.5.1.
Code
5.5.2.
Output
5.5.3.
Explanation
5.6.
dtype
5.6.1.
Code
5.6.2.
Output
5.6.3.
Explanation
5.7.
skiprow
5.7.1.
Code
5.7.2.
Output
5.7.3.
Explanation
5.8.
nrows
5.8.1.
Code
5.8.2.
 Output
5.8.3.
 Explanation
5.9.
skipfooter     
5.9.1.
Code
5.9.2.
Output
5.9.3.
Explanation 
6.
Frequently Asked Question
6.1.
Can pandas read csv read excel?
6.2.
Which Python package read Excel file?
6.3.
What is the difference between read CSV and read Excel?
7.
Conclusion
Last Updated: Mar 27, 2024
Easy

Pandas Read Excel

Author Muskan Sharma
0 upvote

Introduction

Hey Readers!!

You guys have worked multiple times on Excel sheets. The data in the Excel sheet is present in table form. But now, if you want to use that in your Python project. What will you do?

To do this, we have a Python library called pandas. In Pandas, there is a function called pandas read excel.

Pandas read excel function helps you parse the data in the Excel sheet into the pandas DataFrame object, and then you can use it where you want.

Pandas Read Excel

This article will help you understand the function pandas.read_excel in depth.

So, let us dive into the topic and explore more about it.

Pandas

Pandas is a commonly used Python Library. Pandas provides high-performance data structures and data analysis tools that can be used for data manipulation and analysis. Data scientists mostly use Pandas for working with data sets. Pandas are generally used in finance, retail, accounting, and academic sectors.

Now let us look at how pandas read excel files using the pandas read excel function.

Pandas read excel

Pandas read excel ( pandas.read_excel() ) is a pandas function that reads an excel sheet with the extension xlsx into a pandas DataFrame object. In the Excel sheet, the data is in two-dimensional table form. In the DataFrame object, the data is also in a two-dimensional tabular form.

Let us understand this with the help of an example:


Excel sheet

Example Excel sheet

Code
 

import pandas
r=pandas.read_excel('Student_data.xlsx')
print(r)
You can also try this code with Online Python Compiler
Run Code


Output 

Output example

In the above example, we have an Excel file named Student_data and we print the data into DataFrame objects with the help of pandas.read_excel function.
 

The pandas read excel function has different parameters to read the excel file as per the user's requirement. So, let us look into some of the parameters.

Parameters of pandas read excel

This section of the article will give you a brief overview of the different parameters of the pandas read excel function:
 

Parameters

Description

io It is a valid string path directing to the workbook. The string could be any URL, local path, filename, or file-like object.
sheet_name It reads the sheet name of the workbook. A sheet name can only be a string, an integer can be used in zero-indexed sheet positions.
header It reads the header row. If there is no header, then use none. The dtype is an object. dtype is nothing but the datatype.
use_col It parses all the columns if none is mentioned, and it will parse the column of that index that is mentioned. If the list of string or integer is mentioned, it will parse all the columns mentioned.
index_col It reads the column mentioned in the index_col and will print the column at the index position. It passes None if there is no such column.
dtype It tells the type of data present in the file and mentions the type of data present in the column.
skiprow It skips the row which is mentioned in the skiprow; it takes an integer value.
nrows It parses upto the given row mentioned in the nrows parameter.
skipfooter It skips the number of rows from the bottom of the file which is being parsed and contains this parameter.

Now let us understand these parameters more with the help of examples.

Understanding the parameter of panda read excel 

We have an Excel file having 2 sheets named "Student" and "Article". Using these sheets, we will see examples of the parameters we’ve discussed above.

io                 

It is a valid string path directing to the workbook. The string could be any URL, local path, filename, or file-like object. The path directing to the workbook must be in single quotes [  ' '  ].  

Code
 

import pandas

# printing file data using read_excel

read1=pandas.read_excel('C:/Users/Public/Pandas/Student_data.xlsx')

print(read1)
You can also try this code with Online Python Compiler
Run Code

Output

io example

Explanation

As we see, we are reading the data from the Student_data.xlsx file using read_excel and printing the data into pandas DataFrame object. So in the above example, the valid string path directs to the file "C:/Users/Public/Pandas/Student_data.xlsx " i.e., "io"  in the code. We can see that the Student data is printed, i.e., the first sheet only, so if "sheet_name" is not mentioned, it will print the first sheet only by default.

sheet_name

It reads the sheet name of the workbook. A sheet name can only be a string, and an integer can be used in zero-indexed sheet positions.
 

Code

import pandas

#printing file data with sheet name

read2=pandas.read_excel('Student_data.xlsx',sheet_name='Article')

print(read2)
You can also try this code with Online Python Compiler
Run Code


Output

Output sheet_name


Explanation

The sheet_name will print all the data present in that particular sheet. In the above example, we have used the Article as the sheet name, and then as you can see, all the data of the articles sheet is printed.

header

It reads the header row. If there is no header, then use none. The dtype is an object. dtype is nothing but the datatype.
 

Code

import pandas

read1=pandas.read_excel('Student_data.xlsx')

# To print the heading row

print(read1.columns.ravel())
You can also try this code with Online Python Compiler
Run Code


Output

output (header)


Explanation

"read1.columns.ravel()" is used to get the data of the header row, which is printed using print(). We are using the read1.columns.ravel() to print the header as it will print the column names at index 1, which eventually is the header row.

use_col

It parses all the columns if none is mentioned, and it will parse the column of that index that is mentioned. If the list of string or integer is mentioned, it will parse all the columns mentioned.
 

Method 1

import pandas

#using usecols to prints the given columns by column number

read1=pandas.read_excel('Student_data.xlsx',usecols=[1,2])

print(read1)
You can also try this code with Online Python Compiler
Run Code


Method 2

import pandas

#using usecols to print the given columns, you can use the heading name also

read1=pandas.read_excel('Student_data.xlsx',usecols=['Student ID','Student Name','Student Marks'])

print(read1)
You can also try this code with Online Python Compiler
Run Code

 

Output 1

output 1 usecol


Output 2

output 2 usecol


Explanation

usecol=[ ]  to parse through the specific column and get the desired data. It can be done in two different ways:

  1. Using column number
  2. Using column name

index_col

It reads the column mentioned in the index_col and will print the column at the index position. We’ll pass None if there is no such column.

Code

import pandas

#using student ID as index_col

read1=pandas.read_excel('Student_data.xlsx',index_col='Student ID')

print(read1)
You can also try this code with Online Python Compiler
Run Code


Output

indexcol output

Explanation

It will read the column mentioned in the index_col and will print the column at the index position. As you see in the above example, the "Student ID" has become the index of the data present.

dtype

It tells the type of data present in the file and also mentions the type of data present in the column.


Code

import pandas

read1=pandas.read_excel('Student_data.xlsx')

#using read1.dtypes to get the datatype of the read data

print(read1.dtypes)
You can also try this code with Online Python Compiler
Run Code


Output

dtype output


Explanation

This is used to check the file's data type and each column's data type. Look at the above output; you will see that different columns have different datatypes, and the file's data type is also mentioned as an "object".

skiprow

It skips the row which is mentioned in the skiprow, it takes an integer value.
 

Code

import pandas

#using the row number to skip the row

read1=pandas.read_excel('Student_data.xlsx',skiprows=[1])

print(read1) 
You can also try this code with Online Python Compiler
Run Code


Output

skiprow output


Explanation

This will skip the parsing of the data of the row mentioned; like in the above code, row 1 is skipped as the "skiprow=1".  

nrows

It will parse upto the given row mentioned in the nrows parameter.
 

Code

import pandas

# Use the number of rows to print only desired number of rows

read1=pandas.read_excel('Student_data.xlsx',nrows=3)

print(read1)
You can also try this code with Online Python Compiler
Run Code

 
Output

nrows output

 
Explanation

This will parse into the desired number of rows only, as you can see in the above code "nrow=3"so it parses up to three rows.

skipfooter     

It is used to skip the number of rows from the bottom of the file which is being parsed and contains this parameter.
 

Code

import pandas

# Use the number of rows to skip the row from the last using skipfooter

read1=pandas.read_excel('Student_data.xlsx',skipfooter=1)

print(read1)
You can also try this code with Online Python Compiler
Run Code


Output

skipfooter output


Explanation 

skipfooter is used to skip the data at the bottom of the page or the row present at the bottom. In the above example, the last row is skipped.

Frequently Asked Question

Can pandas read csv read excel?

The pandas library can read data from CSV or excel files. The pandas function that can be used is read_csv() or read_excel(). The file path can only contain a string and must include the file name and extension.

Which Python package read Excel file?

Openpyxl is a tool that lets you read from or write to Excel files using Python. Data scientists use it to analyze and manipulate data, copy information, mine for insights, make charts, format sheets, add formulas, and automate tasks for data processing.

What is the difference between read CSV and read Excel?

CSV is a plain text format with a series of values separated by commas, whereas Excel is a binary file containing data for all worksheets in a workbook. CSV files are unable to perform data operations, but Excel can.

Conclusion

In this article, we learned the pandas library in Python, the pandas read excel function, and the different parameters of the pandas read excel function. We hope this article helped you in knowing more about the pandas read excel.

If you want to learn more, refer to these articles:

You may refer to our Guided Path on Code Studios for enhancing your skill set on DSACompetitive ProgrammingSystem Design, etc. Check out essential interview questions, practise our available mock tests, look at the interview bundle for interview preparations, and so much more!

Happy Learning, Ninja!

Live masterclass