Introduction
iloc is an indexer provided by the Pandas library. Although it is commonly called the iloc function, it is used with square brackets rather than called with parentheses. It is a powerful tool that enables users to select specific rows and columns of a DataFrame by their integer position. It can extract a subset of a DataFrame, index with a boolean mask, or assign values to particular rows and columns. Understanding how to use the iloc function can be challenging for beginners in Python, but don't worry, we have your back!
This article will discuss how to use the iloc function in Python. We will cover its syntax, use cases, limitations, and unique features. So without further ado, let's get started!
iloc stands for "integer location". The iloc function is used to access and manipulate data in a tabular format. It is used mainly for selecting specific rows and columns of a data frame based on their integer indices.
With the use of the iloc function in Python, you can easily retrieve, filter, and transform data from a data frame. This feature makes the iloc function a powerful tool for data analysis. Whether you're working with large datasets or need to perform complex data operations, iloc can help you streamline your workflow and achieve your desired results.
To use the iloc function, you first have to install the pandas library. You can do this with one of the commands below:
pip install pandas
(or)
pip3 install pandas
Syntax of iloc() function
Following is the syntax of the iloc function in python:
df.iloc[row_selection, column_selection]
Parameter of iloc() function
The parameter used in the syntax of the iloc function in python:
row_selection
It is an optional parameter.
This parameter specifies the rows to be selected based on their integer index.
It can be a single integer, a list of integers, or a slice object.
For example, to select the first three rows of a DataFrame, you can use df.iloc[0:3].
column_selection
It is an optional parameter.
This parameter specifies the columns to be selected based on their integer index.
It can be a single integer, a list of integers, or a slice object.
For example, to select the first two columns of a DataFrame, you can use df.iloc[:, 0:2].
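Both selectors can also be combined in a single call. The following is a minimal sketch, assuming a small sample DataFrame with hypothetical name, age, and city columns, that picks the first two rows and the first and third columns at once:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    'name': ['Alankrit', 'Vasu', 'Ayush'],
    'age': [22, 22, 23],
    'city': ['Jaipur', 'Chandigarh', 'Lucknow'],
})

# Rows at positions 0 and 1 (the end of the slice is excluded),
# columns at positions 0 and 2
subset = df.iloc[0:2, [0, 2]]
print(subset)
```

The row selection is a slice and the column selection is a list, which shows that the two selectors do not have to be of the same kind.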
Return Value of iloc() function
The iloc function in Python returns different kinds of objects depending on the selectors we provide.
First, if we provide a single integer for only the row or only the column, the iloc function returns a Pandas Series.
If we provide two integers, one for the row and one for the column, the iloc function returns the contents of the specified cell as a single value. If we supply a list of integers or a slice as the row selection, without narrowing the columns to a single integer, the iloc function returns a Pandas DataFrame.
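The return types described above can be checked directly. This is a small sketch on a hypothetical sample DataFrame; the variable names are chosen only for illustration:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    'name': ['Alankrit', 'Vasu', 'Ayush'],
    'age': [22, 22, 23],
    'city': ['Jaipur', 'Chandigarh', 'Lucknow'],
})

row = df.iloc[1]       # single integer: a Series holding the second row
cell = df.iloc[1, 0]   # two integers: the value of one cell
frame = df.iloc[[1]]   # list of integers: a DataFrame with one row
sliced = df.iloc[0:2]  # slice: a DataFrame with the first two rows

for result in (row, cell, frame, sliced):
    print(type(result).__name__)
```

Note the difference between df.iloc[1] and df.iloc[[1]]: wrapping the position in a list keeps the result two-dimensional.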
Use Cases of iloc function
The iloc function in Python is a versatile tool that can be used for a wide range of data analysis and manipulation tasks. Here are some common use cases for the iloc function:
Selecting subsets of data: One of the most common use cases for iloc is to extract subsets of data from a DataFrame or Series for further analysis. For example, you might use iloc to extract the top 10 rows of a DataFrame or select only the relevant columns for a particular analysis.
Filtering data based on specific criteria: iloc can also be used to filter data based on specific criteria. For example, you might build a boolean mask from a condition (e.g., all rows where the value in column A is greater than 10) and pass it to iloc as a boolean array to keep only the matching rows. For conditions expressed as a boolean Series, loc is usually the more direct choice.
Modifying data in place: Another common use case for the iloc function in Python is to modify values in a DataFrame or Series. For example, you might use iloc to set all values in a certain row or column to a specific value or to replace missing values with a specific value.
Combining multiple DataFrames or Series: iloc does not combine DataFrames by itself, but it is often used to pick the pieces that are then combined. For example, you might use iloc to select rows or columns from two DataFrames by position and then pass the results to a function such as pd.concat.
Reshaping data: Finally, iloc can support reshaping tasks. It does not pivot, stack, or unstack data itself (pandas provides separate methods for that), but you might use iloc to select or reorder the rows and columns by position before or after reshaping a DataFrame from a long format to a wide format.
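As an illustration of the "modifying data in place" use case, here is a minimal sketch that assigns through iloc. The DataFrame and its column names A and B are hypothetical:

```python
import pandas as pd

# Hypothetical numeric data, used only for illustration
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Overwrite a single cell: row position 0, column position 1
df.iloc[0, 1] = 40

# Overwrite every value in the last row
df.iloc[-1, :] = 0

print(df)
```

Assigning directly through df.iloc[...] changes df itself, so no reassignment of the result is needed.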
Types of indexing with iloc function
There are different types of indexing available with iloc:
Single integer indexing: Here, you use a single integer index to select a single row or column.
List indexing: For a list of integer indices, you can use the iloc list indexing to select multiple rows or columns.
Slice indexing: Here, you use a slice object to select a range of rows or columns.
Boolean indexing: With boolean indexing, you can select rows or columns by passing a boolean array or list whose length matches the number of rows or columns.
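Boolean indexing is the least obvious of the four types, so a short sketch may help. It assumes a hypothetical sample DataFrame. The mask is converted to a NumPy array first because, in the pandas versions we are aware of, iloc accepts boolean arrays and lists but raises an error when given a boolean Series directly:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    'name': ['Alankrit', 'Vasu', 'Ayush'],
    'age': [22, 22, 23],
    'city': ['Jaipur', 'Chandigarh', 'Lucknow'],
})

# Build the condition as a Series, then convert it to a plain boolean array
mask = (df['age'] > 22).to_numpy()

# Keep only the rows where the mask is True
older = df.iloc[mask]
print(older)

# A boolean list works for columns too: keep the first and third columns
name_city = df.iloc[:, [True, False, True]]
print(name_city)
```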
Working of Python's iloc() function
Python's iloc() function is an important tool in Pandas for data manipulation. It allows the selection and retrieval of specific rows and columns in DataFrames or Series using integer-based indexing.
iloc() identifies data by numeric row and column positions. For instance, df.iloc[0] retrieves the first row, while df.iloc[:, 2] fetches the third column for all rows. It employs zero-based indexing, meaning the first row or column is at index 0, the second at index 1, and so on.
The function provides versatility by taking single integers, slices, or lists of integers for indexing. It returns a new DataFrame or Series, facilitating precise data extraction and analysis in Pandas, essential for data manipulation tasks.
Examples of using the iloc function
Following are some practical examples of using iloc to access specific rows and columns of a data frame.
The data frame we use in the examples below is:
Name       Age   City
Alankrit   22    Jaipur
Vasu       22    Chandigarh
Ayush      23    Lucknow
1. Accessing a single row and all columns
Below is an example of accessing a single row and all the columns of the particular data associated with it.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'name': ['Alankrit', 'Vasu', 'Ayush'], 'age': [22, 22, 23], 'city': ['Jaipur', 'Chandigarh', 'Lucknow']})
# Access the second row of the DataFrame
second_row = df.iloc[1, :]
# Output the second row
print(second_row)
Output:
name          Vasu
age             22
city    Chandigarh
Name: 1, dtype: object
In this example, we are accessing the row at position 1 (the second row, since indexing is 0-based) and all columns of that row.
2. Accessing multiple rows and all columns
Below is an example of accessing multiple rows and all the columns associated with them.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'name': ['Alankrit', 'Vasu', 'Ayush'], 'age': [22, 22, 23], 'city': ['Jaipur', 'Chandigarh', 'Lucknow']})
# Access the first and third rows of the DataFrame
first_third_rows = df.iloc[[0, 2], :]
# Output the first and third rows
print(first_third_rows)
Output:
       name  age     city
0  Alankrit   22   Jaipur
2     Ayush   23  Lucknow
In this example, we are accessing the rows at positions 0 and 2, and all the columns of these rows are fetched.
3. Accessing a single column and all rows
Below is an example of accessing a single column and all associated rows using the iloc function in python.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'name': ['Alankrit', 'Vasu', 'Ayush'], 'age': [22, 22, 23], 'city': ['Jaipur', 'Chandigarh', 'Lucknow']})
# Access the 'name' column of the DataFrame
name_column = df.iloc[:, 0]
# Output the 'name' column
print(name_column)
Output:
0    Alankrit
1        Vasu
2       Ayush
Name: name, dtype: object
In this example, we are only accessing the first column of all the rows.
4. Accessing multiple columns and all rows
Below is an example of accessing multiple columns and all the rows using the iloc function in python.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'name': ['Alankrit', 'Vasu', 'Ayush'], 'age': [22, 22, 23], 'city': ['Jaipur', 'Chandigarh', 'Lucknow']})
# Access the 'name' and 'city' columns of the DataFrame
name_city_columns = df.iloc[:, [0, 2]]
# Output the 'name' and 'city' columns
print(name_city_columns)
Output:
       name        city
0  Alankrit      Jaipur
1      Vasu  Chandigarh
2     Ayush     Lucknow
In this example, we are fetching all the rows and in each row, we are fetching its 0th and 2nd column.
Features and Capabilities of iloc function
The iloc function in pandas provides powerful features and capabilities for selecting data from a DataFrame based on its integer-location index. Here are some of its key features and capabilities:
Integer-based Indexing: iloc allows you to select rows and columns from a DataFrame using integer-based indexing, which means you can specify the exact integer positions of the rows and columns you want to select.
Single and Multiple Selection: You can use iloc to select single rows or columns by providing their integer positions, or you can select multiple rows or columns by passing a list or slice of integer positions.
Positional Indexing: iloc enables precise selection of data based on its position in the DataFrame, regardless of the index labels or column names.
Slicing and Subsetting: You can use slicing with iloc to select a range of rows or columns based on their integer positions. This allows for efficient subsetting of large datasets.
Integer Ranges and Steps: iloc supports specifying integer ranges and steps, allowing for flexible selection of consecutive rows or columns.
Data Manipulation and Analysis: Once data is selected using iloc, you can perform various data manipulation and analysis operations, such as computation, aggregation, filtering, and visualization.
Compatibility and Consistency: iloc follows the same zero-based, end-exclusive slicing conventions as Python sequences, and it can be used alongside other DataFrame indexing and selection methods, such as loc and boolean indexing.
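The slicing, step, and range features listed above can be sketched as follows. The single-column DataFrame is hypothetical and exists only to make the selected positions easy to see:

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({'value': [10, 20, 30, 40, 50]})

every_other = df.iloc[::2]     # step of 2: positions 0, 2, 4
last_two = df.iloc[-2:]        # negative start: the last two rows
reversed_rows = df.iloc[::-1]  # negative step: all rows in reverse order

print(every_other)
print(last_two)
print(reversed_rows)
```

Negative positions count from the end, just as they do with Python lists.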
Limitations of iloc function
While iloc is a handy function for indexing a DataFrame by integer position, it has some limitations too. Here are some of the limitations of the iloc function in Python:
1. Cannot index by label: iloc is designed to index a DataFrame by integer position only, which means it cannot be used to index by the label.
For example, suppose we have a DataFrame with integer labels. We can use iloc to select specific rows or columns using integer-based indexing. However, we cannot use iloc to select rows or columns using label-based indexing:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=[0, 1, 2])
# selects the first row
df.iloc[0]
# selects the second column
df.iloc[:, 1]
# trying iloc for label indexing raises a TypeError
# (in recent pandas versions: "Cannot index by location index with a non-integer key")
try:
    df.iloc['A']
except TypeError as err:
    print(err)
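When only a label is known, one possible workaround is to translate the label into a position first. The sketch below uses Index.get_loc on the column index for this; the DataFrame is hypothetical:

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Look up the integer position of the column labeled 'B'
b_pos = df.columns.get_loc('B')

# Combine a positional row slice with the looked-up column position
first_two_b = df.iloc[0:2, b_pos]
print(first_two_b)
```

This keeps the selection purely positional while still starting from a column name.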
2. Less convenient for non-integer indexes: If a DataFrame has a non-integer index, such as a DateTime index, iloc still works, because it looks only at positions and ignores the labels. The drawback is that you have to know the position of the row you want, which can be more complicated and error-prone.
Suppose we have a DataFrame with a DateTime index. iloc selects rows and columns by position as usual, but it cannot look a row up by its date. To select a row by its date, we need to use loc with the index label:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5, 2), index=pd.date_range('20230319', periods=5), columns=list('AB'))
# selects the first row by position; this works with any index type
df.iloc[0]
# selects the row with index label '2023-03-19'
df.loc['2023-03-19']
# selects the first column using integer-based indexing
df.iloc[:, 0]
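Since iloc looks only at positions, it can be used on a DateTime index once the position of a date is known. The following is a minimal sketch, assuming a hypothetical DataFrame with fixed values instead of random ones, that converts a date label into its position with Index.get_loc:

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex, used only for illustration
df = pd.DataFrame(
    {'A': [1, 2, 3, 4, 5]},
    index=pd.date_range('2023-03-19', periods=5),
)

# Positional access does not depend on the index type
first_row = df.iloc[0]

# Translate a date label into its integer position, then use iloc
pos = df.index.get_loc(pd.Timestamp('2023-03-21'))
same_row = df.iloc[pos]

print(pos)
print(same_row)
```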
3. Can be ambiguous: If a DataFrame has integer labels that do not match the row positions, using iloc to index by integer position can easily be confused with indexing by label and lead to unexpected results.
Suppose we have a DataFrame with non-consecutive integer labels. If we use iloc while thinking in terms of labels, we may get a different row or column from the one we intended, because iloc counts positions and ignores the labels.
The DataFrame used in the following example looks like this:
   A  B
0  1  4
2  2  5
4  3  6
To avoid ambiguity, we need to use loc with the index labels instead:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=[0, 2, 4])
# returns the row at position 1, which has index label 2 (there is no label 1)
df.iloc[1]
# returns the column at position 1, which is 'B', not 'A'
df.iloc[:, 1]
# selects the row with index label 2
df.loc[2]
# selects the column with label 'A'
df.loc[:, 'A']
4. Needs care with mixed-type DataFrames: If a DataFrame has columns of mixed types, such as both numeric and string columns, selecting a row with iloc may not produce the dtype you expect.
Suppose we have a DataFrame with columns of mixed types. A single column selected with iloc keeps its own dtype, but a single row spans several dtypes, so pandas returns it as an object dtype Series:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': ['hello', 'hey', 'hii']})
# selects the second column; it keeps the dtype pandas uses for the strings
df.iloc[:, 1]
# selects the first row; it mixes an integer and a string, so the Series has object dtype
df.iloc[0]
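The dtype behaviour on mixed-type DataFrames can be inspected with a short sketch like the one below, which reuses the same hypothetical A and B columns:

```python
import pandas as pd

# Hypothetical mixed-type data: one integer column, one string column
df = pd.DataFrame({'A': [1, 2, 3], 'B': ['hello', 'hey', 'hii']})

col = df.iloc[:, 0]  # a single column keeps its own integer dtype
row = df.iloc[0]     # a row spans both columns, so it is stored as object

print(col.dtype)
print(row.dtype)
```

If the numeric dtype matters, selecting the numeric columns first and then the row avoids the conversion to object.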
Frequently Asked Questions
What is the iloc function in Python?
The iloc function in Python (Pandas library) is used to select and access data in DataFrames or Series using integer-based indexing. It allows you to specify rows and columns by their numerical indices.
How are iloc() and loc() different?
iloc() and loc() are different in their indexing methods. iloc() uses integer-based indexing, selecting rows and columns by their integer positions, while loc() uses label-based indexing, selecting rows and columns by their index labels or column names.
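One practical consequence of this difference shows up with slices: iloc excludes the end position, while loc includes the end label. A minimal sketch on a hypothetical DataFrame with the default integer index:

```python
import pandas as pd

# Hypothetical data with the default integer index 0, 1, 2
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

by_position = df.iloc[0:1]  # positions 0 up to, but not including, 1
by_label = df.loc[0:1]      # labels 0 through 1, both included

print(len(by_position))
print(len(by_label))
```

The two expressions look almost identical, yet the first returns one row and the second returns two.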
What is the position of iloc?
The iloc() function in pandas is positioned as a method for integer-location-based indexing, allowing precise selection of rows and columns based on their integer positions within the DataFrame.
Why do we use the iloc function?
We use the iloc() function in pandas to perform integer-location-based indexing, which allows for precise selection of rows and columns based on their integer positions within the DataFrame, regardless of index labels or column names.
Is loc or iloc faster?
In general, iloc() tends to be faster than loc() for integer-based indexing operations because it directly accesses data by integer position without having to perform label-based lookups. However, the difference in speed may vary depending on the specific use case and size of the DataFrame.
Conclusion
This article briefly discussed the iloc function in python. We saw its syntax, parameters, types and various examples. We also discussed the uses and limitations of the iloc function in python.
We hope this article has helped you understand the iloc function in python and all about it.