Examples of Using pandas.concat
Example 1: Concatenating DataFrames Vertically
Let’s start with a basic example where we concatenate two DataFrames vertically (one below the other).
Python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'A': ['A3', 'A4', 'A5'],
    'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])

result = pd.concat([df1, df2])
print(result)
Output
    A   B
0  A0  B0
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5
In this example, df1 and df2 are concatenated along rows (axis=0), which stacks the rows of df2 below df1.
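Example 1 works cleanly because the two indices do not overlap. As a minimal sketch of what happens when they do, the snippet below uses two hypothetical frames (`left` and `right`, not part of the original example) that both carry the default index, and compares the default behavior with `ignore_index=True`.

```python
import pandas as pd

# Hypothetical frames for illustration; both get the default index 0..1.
left = pd.DataFrame({'A': ['A0', 'A1']})
right = pd.DataFrame({'A': ['A2', 'A3']})

# By default the original labels are kept, so 0 and 1 each appear twice.
kept = pd.concat([left, right])
print(kept.index.tolist())        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([left, right], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Duplicate labels are legal in pandas, but label-based lookups such as `kept.loc[0]` then return more than one row, so renumbering is often the safer choice when the original labels carry no meaning.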
Example 2: Concatenating DataFrames Horizontally
You can also concatenate DataFrames horizontally (side by side).
Python
# Reuses df1 and the pandas import from Example 1.
df3 = pd.DataFrame({
    'C': ['C0', 'C1', 'C2']
}, index=[0, 1, 2])

result = pd.concat([df1, df3], axis=1)
print(result)
Output
    A   B   C
0  A0  B0  C0
1  A1  B1  C1
2  A2  B2  C2
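In this example, df1 and df3 are concatenated along columns (axis=1), which places df3 next to df1. Rows are matched by index label, not by position, so the result lines up here because both DataFrames share the index 0, 1, 2.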
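Horizontal concatenation pairs rows by index label. The sketch below, using two hypothetical frames whose indices only partly overlap, illustrates how labels found in just one input end up with NaN in the other input's columns.

```python
import pandas as pd

# Hypothetical frames for illustration; labels 1 and 2 are shared,
# label 0 exists only in `left` and label 3 only in `right`.
left = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=[0, 1, 2])
right = pd.DataFrame({'C': ['C1', 'C2', 'C3']}, index=[1, 2, 3])

# With the default join='outer', the result index is the union of labels.
outer = pd.concat([left, right], axis=1)
print(outer)

# Row 0 has no value for 'C' and row 3 has no value for 'A'.
print(outer.isna())
```

If the rows should instead be paired by position, one option is to call `reset_index(drop=True)` on each input before concatenating.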
Example 3: Handling Different Columns
If the DataFrames have different columns, pandas.concat by default (join='outer') keeps the union of all columns and fills the cells that have no data with NaN.
Python
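# Reuses df1 and the pandas import from Example 1.
df4 = pd.DataFrame({
    'A': ['A6', 'A7'],
    'D': ['D6', 'D7']
}, index=[6, 7])

# sort=False (the default in recent pandas versions) keeps the columns
# in order of first appearance instead of sorting them.
result = pd.concat([df1, df4], sort=False)
print(result)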
Output
    A    B    D
0  A0   B0  NaN
1  A1   B1  NaN
2  A2   B2  NaN
6  A6  NaN   D6
7  A7  NaN   D7
Here, column D is NaN for the rows that came from df1 (which has no D column), and column B is NaN for the rows that came from df4 (which has no B column).
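When the NaN-filled columns are not wanted, the `join` parameter offers an alternative. The following sketch rebuilds two small frames with the same column layout as df1 and df4 (named `top` and `bottom` here purely for illustration) and passes `join='inner'`, which keeps only the columns present in every input.

```python
import pandas as pd

# Illustrative frames mirroring the column layout of df1 and df4.
top = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=[0, 1])
bottom = pd.DataFrame({'A': ['A6', 'A7'], 'D': ['D6', 'D7']}, index=[6, 7])

# join='inner' keeps the intersection of the column labels, here only 'A'.
shared = pd.concat([top, bottom], join='inner')
print(shared.columns.tolist())  # ['A']
print(shared)
```

The trade-off is that columns B and D are dropped entirely, so this suits cases where only the common fields matter.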
Example 4: Using Hierarchical Indexing
You can use the keys parameter to create a hierarchical index.
Python
# Reuses df1, df2, and the pandas import from Example 1.
result = pd.concat([df1, df2], keys=['df1', 'df2'])
print(result)
Output
        A   B
df1 0  A0  B0
    1  A1  B1
    2  A2  B2
df2 3  A3  B3
    4  A4  B4
    5  A5  B5
In this example, each key becomes the outer level of a MultiIndex, so every row is labeled with the DataFrame it came from while keeping its original index as the inner level.
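A hierarchical index is most useful once you start selecting from it. The sketch below uses hypothetical frames and level names (`source` and `row` are chosen for illustration) to show one way of pulling back the rows of a single input and of turning the outer level into an ordinary column.

```python
import pandas as pd

# Hypothetical frames for illustration.
first = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 1])
second = pd.DataFrame({'A': ['A2', 'A3']}, index=[2, 3])

# keys labels each input; names gives the two index levels readable names.
combined = pd.concat(
    [first, second],
    keys=['first', 'second'],
    names=['source', 'row'],
)

# Select every row that came from the second input.
print(combined.loc['second'])

# Move the outer level into a regular column named 'source'.
flat = combined.reset_index(level='source')
print(flat)
```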
Key Features of pandas.concat
- Flexible Concatenation: Concatenate along rows or columns, or combine multiple DataFrames and Series.
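- Handling Different Indices: pandas.concat aligns its inputs on the axis that is not being concatenated (column labels when stacking rows, index labels when placing DataFrames side by side). With the default join='outer', labels missing from an input are filled with NaN.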
- Hierarchical Indexing: Use the keys parameter to add hierarchical indexing, making it easier to identify which DataFrame each row came from.
- Handling Missing Data: pandas.concat has no fill option of its own. Use join='inner' to keep only the labels shared by all inputs, or call fillna or dropna on the result.
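The examples so far only combine DataFrames, so here is a brief sketch of the Series case mentioned in the first feature. The Series names and values are made up for illustration; the point is that stacking Series along rows yields a Series, while axis=1 yields a DataFrame whose columns take the Series names.

```python
import pandas as pd

# Hypothetical Series for illustration.
price = pd.Series([1.5, 2.0], index=['x', 'y'], name='price')
stock = pd.Series([10, 20], index=['x', 'y'], name='stock')
more_prices = pd.Series([3.0], index=['z'], name='price')

# Along rows (the default), Series are stacked into a longer Series.
all_prices = pd.concat([price, more_prices])
print(type(all_prices).__name__)  # Series

# Along columns, each Series becomes a column named after the Series.
table = pd.concat([price, stock], axis=1)
print(table.columns.tolist())     # ['price', 'stock']
```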
Frequently Asked Questions
Can pandas.concat handle DataFrames with different columns?
Yes. By default it includes all columns and fills the cells that have no data with NaN.
How do I concatenate DataFrames horizontally?
Set axis=1 in the pandas.concat function.
What does the ignore_index parameter do?
When set to True, the existing index labels along the concatenation axis are discarded and the result is numbered 0, 1, ..., n-1.
How can I add a hierarchical index?
Use the keys parameter to add a hierarchical index.
What if I need to handle missing data after concatenation?
Use methods like fillna or dropna to manage missing data.
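To make the last answer concrete, the following sketch concatenates two hypothetical single-row frames with different columns and then cleans the result in three ways. The placeholder string 'missing' is an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical frames for illustration; only column 'A' is shared.
a = pd.DataFrame({'A': ['A0'], 'B': ['B0']})
b = pd.DataFrame({'A': ['A1'], 'D': ['D1']})

combined = pd.concat([a, b], ignore_index=True)

# Replace every NaN with a placeholder value.
filled = combined.fillna('missing')
print(filled)

# Drop the columns that contain any NaN; only 'A' survives.
complete_cols = combined.dropna(axis=1)
print(complete_cols.columns.tolist())  # ['A']

# Drop the rows that contain any NaN; here every row has one, so none remain.
complete_rows = combined.dropna()
print(len(complete_rows))              # 0
```

Which option fits depends on the data: filling preserves every row and column, while dropping can silently discard a large share of the result, as the last call shows.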
Conclusion
The pandas.concat function is a powerful tool for combining DataFrame and Series objects. It allows for flexible and efficient concatenation along rows or columns. By understanding how to use pandas.concat, you can effectively manage and analyze your data, making it easier to work with datasets from different sources.