Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1. Introduction
2. Syntax of pandas.concat
3. Examples of Using pandas.concat
3.1. Example 1: Concatenating DataFrames Vertically
3.2. Example 2: Concatenating DataFrames Horizontally
3.3. Example 3: Handling Different Columns
3.4. Example 4: Using Hierarchical Indexing
4. Key Features of pandas.concat
5. Frequently Asked Questions
5.1. Can pandas.concat handle DataFrames with different columns?
5.2. How do I concatenate DataFrames horizontally?
5.3. What does the ignore_index parameter do?
5.4. How can I add a hierarchical index?
5.5. What if I need to handle missing data after concatenation?
6. Conclusion
Last Updated: Jul 20, 2024

pandas.concat in Python

Author Sinki Kumari

Introduction

In data analysis, combining data from different sources or structures is a common task. The pandas library in Python provides a function called concat that makes this task easy. pandas.concat allows you to concatenate two or more DataFrame or Series objects along a specified axis. This function is useful for stacking datasets on top of each other, placing them side by side, or aligning data from various sources.

Syntax of pandas.concat

The basic syntax of pandas.concat is:

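pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)

Only objs (a sequence or mapping of Series or DataFrame objects) is required. The remaining parameters are optional and are best passed by keyword; the exact defaults, copy in particular, depend on the pandas version installed.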
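As a rough illustration of one of the less obvious parameters in this signature, the sketch below shows what verify_integrity does when index labels overlap. The frames left and right are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical frames whose indices overlap at label 1
left = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 1])
right = pd.DataFrame({'A': ['A2', 'A3']}, index=[1, 2])

# Default (verify_integrity=False): duplicate index labels are kept silently
stacked = pd.concat([left, right])
print(stacked.index.tolist())

# verify_integrity=True raises ValueError when labels would repeat
try:
    pd.concat([left, right], verify_integrity=True)
    overlap_detected = False
except ValueError:
    overlap_detected = True
print(overlap_detected)
```

The check costs an extra pass over the index, which is presumably why it is off by default.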

Examples of Using pandas.concat

Example 1: Concatenating DataFrames Vertically

Let’s start with a basic example where we concatenate two DataFrames vertically (one below the other).

Python

import pandas as pd

# Two DataFrames with the same columns and non-overlapping index labels
df1 = pd.DataFrame({
   'A': ['A0', 'A1', 'A2'],
   'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
   'A': ['A3', 'A4', 'A5'],
   'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])

# axis=0 is the default, so the rows of df2 are stacked below those of df1
result = pd.concat([df1, df2])
print(result)


Output

    A   B
0  A0  B0
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5


In this example, df1 and df2 are concatenated along rows (axis=0), which stacks the rows of df2 below df1.

Example 2: Concatenating DataFrames Horizontally

You can also concatenate DataFrames horizontally (side by side).

Python

# df1 comes from Example 1; df3 shares its index labels 0, 1, 2
df3 = pd.DataFrame({
   'C': ['C0', 'C1', 'C2']
}, index=[0, 1, 2])

# axis=1 places the columns of df3 to the right of df1, matching rows by index
result = pd.concat([df1, df3], axis=1)
print(result)


Output

    A   B   C
0  A0  B0  C0
1  A1  B1  C1
2  A2  B2  C2


In this example, df1 and df3 are concatenated along columns (axis=1), which places df3 next to df1.
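The example above works cleanly because both DataFrames share the same index labels. The sketch below, which uses two hypothetical frames named left and right, suggests what happens when the labels only partly overlap: rows are matched by label, not by position, and the join parameter decides which labels survive.

```python
import pandas as pd

# Hypothetical frames: labels 1 and 2 are shared, 0 and 3 are not
left = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=[0, 1, 2])
right = pd.DataFrame({'C': ['C1', 'C2', 'C3']}, index=[1, 2, 3])

# join='outer' (the default) keeps every label and fills the gaps with NaN
outer = pd.concat([left, right], axis=1)
print(outer)

# join='inner' keeps only the labels present in both frames
inner = pd.concat([left, right], axis=1, join='inner')
print(inner)
```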

Example 3: Handling Different Columns

If the DataFrames have different columns, pandas.concat keeps the union of all columns by default (join='outer') and fills the missing entries with NaN.

Python

# df1 comes from Example 1; df4 shares column A with it but has D instead of B
df4 = pd.DataFrame({
   'A': ['A6', 'A7'],
   'D': ['D6', 'D7']
}, index=[6, 7])

# sort=False keeps the columns in order of first appearance: A, B, D
result = pd.concat([df1, df4], sort=False)
print(result)


Output

    A    B    D
0  A0   B0  NaN
1  A1   B1  NaN
2  A2   B2  NaN
6  A6  NaN   D6
7  A7  NaN   D7


Here, column D is NaN for the rows that came from df1, and column B is NaN for the rows that came from df4, because neither DataFrame defines the other's column.
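If the NaN padding is not wanted, one option is join='inner', which keeps only the columns common to every input. The following is a minimal sketch that rebuilds df1 and df4 so that it runs on its own; the variable name shared is an assumption made for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df4 = pd.DataFrame({
    'A': ['A6', 'A7'],
    'D': ['D6', 'D7']
}, index=[6, 7])

# join='inner' keeps only the columns present in both frames (here just A)
shared = pd.concat([df1, df4], join='inner')
print(shared)
```

Columns B and D are dropped entirely, so this is only appropriate when the unshared columns are not needed.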

Example 4: Using Hierarchical Indexing

You can use the keys parameter to create a hierarchical index.

Python

# df1 and df2 come from Example 1; each key labels the rows of one input
result = pd.concat([df1, df2], keys=['df1', 'df2'])
print(result)


Output

        A   B
df1 0  A0  B0
    1  A1  B1
    2  A2  B2
df2 3  A3  B3
    4  A4  B4
    5  A5  B5


In this example, the keys parameter adds an outer index level (df1 or df2) that records which DataFrame each row came from.
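One practical use of the hierarchical index is pulling a single input back out of the combined result. The sketch below assumes the same df1 and df2 as in Example 1 (rebuilt here so it runs standalone) and uses the names parameter to label the two index levels; the level names 'source' and 'row' are arbitrary choices for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'A': ['A3', 'A4', 'A5'],
    'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])

# keys labels each input; names gives the two index levels readable names
result = pd.concat([df1, df2], keys=['df1', 'df2'], names=['source', 'row'])

# Selecting on the outer level returns only the rows that came from df1
from_df1 = result.loc['df1']
print(from_df1)
```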

Key Features of pandas.concat

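  1. Flexible Concatenation: Concatenate along rows or columns, and combine any number of DataFrame and Series objects in one call.
     
  2. Handling Different Indices: pandas.concat aligns the objects on the axis that is not being concatenated (the columns when axis=0, the index when axis=1). With the default join='outer', labels missing from an object are filled with NaN.
     
  3. Hierarchical Indexing: Use the keys parameter to add hierarchical indexing, making it easier to identify which DataFrame each row came from.
     
  4. Handling Missing Data: pandas.concat does not fill gaps itself beyond the choice between join='outer' and join='inner'; missing values are filled or dropped after concatenation with methods such as fillna or dropna.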
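The examples in this article use DataFrames only, so here is a small sketch of the Series case mentioned in the first feature. The Series s1 and s2 are hypothetical; the point is that the axis argument decides whether the result is a longer Series or a DataFrame.

```python
import pandas as pd

# Hypothetical named Series with the same default index (0, 1)
s1 = pd.Series(['A0', 'A1'], name='A')
s2 = pd.Series(['B0', 'B1'], name='B')

# axis=0 (default): the values are stacked into one longer Series
stacked = pd.concat([s1, s2])
print(type(stacked).__name__, len(stacked))

# axis=1: each Series becomes a column, named after the Series
side_by_side = pd.concat([s1, s2], axis=1)
print(side_by_side)
```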

Frequently Asked Questions

Can pandas.concat handle DataFrames with different columns?

Yes. By default it includes all columns and fills missing values with NaN; pass join='inner' to keep only the columns shared by every DataFrame.

How do I concatenate DataFrames horizontally?

Set axis=1 in the pandas.concat function.

What does the ignore_index parameter do?

When set to True, the existing index labels along the concatenation axis are discarded and the result is relabeled 0, 1, ..., n-1.
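A short sketch of the difference, using two hypothetical frames a and b that both carry the default index 0, 1:

```python
import pandas as pd

# Hypothetical frames that both use the default index labels 0 and 1
a = pd.DataFrame({'A': ['A0', 'A1']})
b = pd.DataFrame({'A': ['A2', 'A3']})

# Default: the original labels are kept, so 0 and 1 each appear twice
kept = pd.concat([a, b])
print(kept.index.tolist())

# ignore_index=True: the labels are replaced with a fresh 0..n-1 range
reset = pd.concat([a, b], ignore_index=True)
print(reset.index.tolist())
```

Resetting is usually the safer choice when the original labels carry no meaning, since duplicate labels can make later label-based selection ambiguous.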

How can I add a hierarchical index?

Use the keys parameter to add a hierarchical index.

What if I need to handle missing data after concatenation?

Use methods like fillna or dropna to manage missing data.
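As a minimal sketch of both approaches, the code below rebuilds the df1 and df4 frames from Example 3 and then cleans up the concatenated result. The placeholder string 'missing' and the variable names are assumptions made for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df4 = pd.DataFrame({
    'A': ['A6', 'A7'],
    'D': ['D6', 'D7']
}, index=[6, 7])

combined = pd.concat([df1, df4])

# Replace every NaN with a placeholder value
filled = combined.fillna('missing')
print(filled)

# Drop every column that contains at least one NaN (only A survives)
complete_columns = combined.dropna(axis=1)
print(complete_columns)

# Drop every row that contains at least one NaN (here, all of them)
complete_rows = combined.dropna()
print(len(complete_rows))
```

Note that dropna with its default settings removes every row in this case, because each row is missing either B or D.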

Conclusion

The pandas.concat function is a powerful tool for combining DataFrame and Series objects. It allows for flexible and efficient concatenation along rows or columns. By understanding how to use pandas.concat, you can effectively manage and analyze your data, making it easier to work with datasets from different sources.
