Table of contents
1.
Introduction
2.
What is Pandas query() Method?
2.1.
Syntax
2.2.
Sample Data Frame 
2.3.
Python
3.
Basic Queries
3.1.
1. Query rows with Courses equals “R Programming”
3.2.
Python
3.3.
Python
3.4.
2. Query rows with Courses Fees greater than equals 23000 
3.5.
Python
4.
Multiple Conditions Query
4.1.
Python
5.
Query Rows using apply()
5.1.
Python
6.
Query using index 
6.1.
Python
7.
Frequently Asked Questions
7.1.
Pandas query() method for index filtering be used?
7.2.
Pandas query() method suitable for large datasets? 
7.3.
Variables be used in the Pandas query() method?
8.
Conclusion
Last Updated: Mar 27, 2024
Easy

Pandas query() Method

Author Dhruv Rawat
0 upvote
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Introduction

In the world of data analysis and manipulation, Pandas have become an important library that comes to the rescue. It contains many powerful tools, such as the query() method, which is a versatile way to filter out and extract data from the DataFrames. 

pandas query() method

Keep reading the blog to understand how the query() method can make our life easier when working with large datasets. We will discuss the Pandas query(), its syntax and several working examples with some faqs at last. 

What is Pandas query() Method?

Before we start looking into what the Pandas query() method does and its working, let's understand the basics. A DataFrame is like a table of data with rows and columns exactly similar to an Excel sheet. It's a popular data structure that s used in Pandas, which allows the storage and manipulation of data efficiently.

The Pandas query() method is like a smart assistant that lets users ask questions about the data. Instead of writing lengthy lines of code to filter data based on certain conditions, we can use query() to express the conditions in a more intuitive way.

In a formal way, it queries the rows based on the expression, whether single or multiple column conditions given and then returns a new DataFrame. In case we want to update the existing referring DataFrame, we can use the inplace=True argument to do so.

Syntax

Below is the syntax:

DataFrame.query(expr, inplace=False, **kwargs)
You can also try this code with Online Python Compiler
Run Code
  • expr – expression takes conditions in order to query rows
     
  • inplace – Default value is False. When set to True, it updates the referring DataFrame and the query() method returns None
     
  • **kwargs –  Keyword arguments that work with eval()


DataFrame.query() takes condition in expression to select rows from a DataFrame. This expression can have one or multiple conditions.

Also, the dataframe is the name of the DataFrame, and the expression is written as a string enclosed in single quotes. We can use operators like ==, <, >, <=, and >= to create custom conditions.

Let’s see with sample data and run through some of the examples and explore the output to understand the query() method better. But first, we need to create a pandas DataFrame.

Sample Data Frame 

Refer below code in order to create it:

  • Python

Python

import pandas as pd
import numpy as np

dsa_courses = {
   'Courses': ["Machine Learning", "Deep Learning", "Big Data", "R Programming", "SQL Fundamentals"],
   'Fee': [22000, 25000, 23000, 24000, 26000],
   'Duration': ['40days', '50days', '30days', '20days', '10days'],
   'Discount': [1000, 2300, 1000, 1200, 2500]
}

df = pd.DataFrame(dsa_courses)
print(df)
You can also try this code with Online Python Compiler
Run Code


Output:

sample data frame output

We will be using the above sample data in the upcoming examples. So let us start.

Basic Queries

1. Query rows with Courses equals “R Programming”

  • Python

Python

df2=df.query("Courses == “R Programming")
print(df2)
You can also try this code with Online Python Compiler
Run Code


Output:

rows equals "R programming" output

The output shows all the details related to the R programming course.

In the above code example, a new DataFrame is returned after filtering the rows. What if we wanted to update the existing DataFrame only, use inplace=True.

Below is the example code:

  • Python

Python

df = pd.DataFrame(dsa_courses)
df2=df.query("Courses == 'R Programming'",inplace=True)
You can also try this code with Online Python Compiler
Run Code

2. Query rows with Courses Fees greater than equals 23000
 

  • Python

Python

df2=df.query("Fee >= 23000")
print(df2)
You can also try this code with Online Python Compiler
Run Code


Output:

rows where fees greater than equals 23000

The above output shows all the course details having fees greater than or equal to 23000.

Multiple Conditions Query

Most time, we need to select the rows based on multiple conditions, and by using multiple columns, we can do it as below:

  • Python

Python

print(df.query(" Fee >= 25000 and Fee <= 26000"))
You can also try this code with Online Python Compiler
Run Code


Output

multiple conditions query output

The above output displays the list of all the courses whose fees are in the range of 25k to 26k

Query Rows using apply()

The DataFrame.apply() method is used to apply the expression row by row and return the rows that have values that match the expression. The below example will return every match when Courses contains a list of specified string values.

  • Python

Python

print(df.apply(lambda row: row[df['Courses'].isin(['Big Data','Deep Learning'])]))
You can also try this code with Online Python Compiler
Run Code


Output:

query rows using apply() output


The above code examples use a lambda expression, which is nothing but a small anonymous function that can take any number of arguments and execute an expression.

So the lambda function is applied to each row and yields the output shown above.

Query using index 

We can use the query() method in Pandas to filter out the rows based on conditions involving the index of the DataFrame. Let's see an example:

  • Python

Python

print(df.query('index%2 == 0'))
You can also try this code with Online Python Compiler
Run Code


Output:

query using index output


In the above code, the query 'index%2 == 0' selects all the rows from the data frame df where the index values are even. So, the output will display the list of courses that have even-number indices.

Frequently Asked Questions

Pandas query() method for index filtering be used?

The pandas query() method can be easily used with an index, like 'index%2 == 0' . This will select all the rows where the index meets the given condition, such as even values.

Pandas query() method suitable for large datasets? 

Pandas query() is suitable for large datasets, but then it may be slower. It is recommended to balance its usage with other Pandas techniques for faster analysis of large datasets.

Variables be used in the Pandas query() method?

Yes, variables can be used in the query by using @. For example, query('Fee > @max_fee') filters rows with fees higher than the value stored in the max_fee variable.

Conclusion

Congratulations, you did a fantastic job!!. This article has gone through a comprehensive guide to Pandas query() and its syntax and discussed several working examples. At last, some frequently asked questions have been discussed.

Here are some more related articles:
 

Check out The Interview Guide for Product Based Companies and some famous Interview Problems from Top Companies, like AmazonAdobeGoogle, etc., on CodeStudio.

Also, check out some of the Guided Paths on topics such as Data Structure and AlgorithmsCompetitive ProgrammingOperating SystemsComputer Networks, DBMSSystem Design, etc., as well as some Contests, Test SeriesInterview Bundles, and some Interview Experiences curated by top Industry Experts only on CodeStudio.

We hope you liked this article.

"Have fun coding!”

Live masterclass