Table of contents

Introduction

1. Indexing

2. Hashing

3.1.

Python

Difference between Indexing and Hashing in DBMS:

10.

11.

Frequently Asked Questions

11.1.

Can indexing and hashing be used together in a database?

11.2.

How do I choose between indexing and hashing for a specific database table?

11.3.

What are the trade-offs of using indexing and hashing in terms of storage space?

12.

Conclusion

Last Updated: Aug 18, 2024

Easy

Indexing and Hashing in DBMS

Author Ravi Khorwal

Introduction

Databases are essential for storing and managing large amounts of data efficiently. To quickly access and retrieve specific information from a database, we use two important techniques: indexing and hashing. These methods help optimize database performance by reducing the time it takes to search for and locate desired data.

In this article, we will talk about the concepts of indexing and hashing in database management systems (DBMS) and understand their differences with the help of proper examples.

1. Indexing

Indexing is a technique used in databases to improve the speed of data retrieval. It works by creating a separate data structure, called an index, which contains a copy of selected columns from a table along with a pointer to the corresponding row in the original table. This allows the database to quickly locate the desired data without having to scan the entire table.

Imagine you have a library with thousands of books. To find a specific book, you could either search through each shelf one by one, or you could use a catalog that lists all the books along with their locations. The catalog acts as an index, making it faster to find the book you need.

In a database, an index is created on one or more columns of a table. For example, let's say we have a table called "students" with columns "id", "name", and "age". We can create an index on the "name" column to speed up searches based on student names.

Let’s see an example of how to create an index in SQL

CREATE INDEX idx_student_name ON students (name);

This statement creates an index named "idx_student_name" on the "name" column of the "students" table.

When we execute a query that includes a condition on the indexed column, like

SELECT * FROM students WHERE name = 'Rahul';

The database can use the index to quickly locate the rows that match the condition, rather than scanning the entire table.

Indexing is particularly useful for tables with a large number of rows and for queries that frequently search based on specific columns. However, it's important to note that indexes also consume additional storage space and can slow down insert, update, and delete operations, as the index needs to be updated along with the table.

2. Hashing

Hashing is another technique used in databases to enable fast data retrieval. It involves using a hash function to convert a key value (such as a primary key) into a unique index or memory address where the corresponding data is stored. This allows for constant-time (O(1)) lookup, regardless of the size of the dataset.

A hash function takes an input value and produces a fixed-size output, called a hash code or hash value. The hash function should be deterministic, meaning that for the same input, it always generates the same hash code. The goal is to distribute the keys evenly across the available memory addresses to minimize collisions (when multiple keys map to the same hash code).

For example :

Python

Python

class HashTable:

    def __init__(self, size):

        self.size = size

        self.table = [[] for _ in range(size)]



    def _hash_function(self, key):

        return hash(key) % self.size


    def insert(self, key, value):

        hash_index = self._hash_function(key)

        self.table[hash_index].append((key, value))


    def search(self, key):

        hash_index = self._hash_function(key)

        for item in self.table[hash_index]:

            if item[0] == key:

                return item[1]

        return None


# Example usage

hash_table = HashTable(10)

hash_table.insert("Rahul", 25)

hash_table.insert("Rinki", 30)

hash_table.insert("Harsh", 27)

print(hash_table.search("Rahul"))  

print(hash_table.search("Sinki"))

You can also try this code with Online Python Compiler

Run Code

Output

25
None

In this example, we create a simple hash table class with a fixed size. The `_hash_function` method takes a key and returns the corresponding index in the table using the built-in `hash` function and modulo operation. The `insert` method adds a key-value pair to the table, while the `search` method retrieves the value associated with a given key.

When we search for a key, the hash function quickly computes the index where the data should be stored, allowing for fast retrieval. If there are collisions (multiple keys mapping to the same index), we can handle them using techniques like chaining (storing multiple key-value pairs in a linked list at the same index) or open addressing (probing for the next empty slot).

Hashing is commonly used in databases for indexing, as well as in other data structures like dictionaries and sets. It provides excellent performance for lookups, especially when the dataset is large. However, we need to remember that it may not be suitable for range queries or scenarios where the keys are not known in advance.

Difference between Indexing and Hashing in DBMS:

Feature	Indexing	Hashing
Data Structure	Balanced tree (e.g., B-tree)	Hash table
Key Distribution	Keys are stored in sorted order	Keys are distributed uniformly across buckets
Search Complexity	O(log n) for balanced trees	O(1) on average, O(n) in worst case (collisions)
Range Queries	Efficient for range queries	Not suitable for range queries
Collision Handling	Not applicable	Chaining or open addressing
Insertion/Deletion	Requires rebalancing the tree	Requires rehashing when load factor exceeds threshold
Ordering	Preserves key order	Does not preserve key order
Uniqueness	Allows duplicate keys	Requires unique keys or additional handling for duplicates
Size Overhead	Moderate (depends on tree height)	High (depends on load factor and collision resolution)
Suitability	Suitable for range queries and ordered access	Suitable for fast lookups and unordered access

Frequently Asked Questions

Can indexing and hashing be used together in a database?

Yes, indexing and hashing can be used together in a database. Indexing can be used for columns that require range queries or ordered access, while hashing can be used for columns that need fast point lookups.

How do I choose between indexing and hashing for a specific database table?

The choice between indexing and hashing depends on the query patterns and requirements of your application. If you need to perform range queries or retrieve data in sorted order, indexing is a good choice. If you primarily need fast lookups based on unique keys, hashing is more suitable.

What are the trade-offs of using indexing and hashing in terms of storage space?

Indexing and hashing both require additional storage space compared to storing the data without any indexing. Indexing overhead depends on the size of the index and the chosen index structure, while hashing overhead depends on the load factor and collision resolution mechanism used. It's important to consider the storage trade-offs when designing your database schema.

Conclusion

In this article, we explained the concepts of indexing and hashing in database management systems. We learned that indexing is a technique that uses balanced tree structures to facilitate efficient data retrieval and is suitable for range queries and ordered access. On the other hand, hashing uses a hash function to distribute keys across a hash table, that provides fast lookups but not supporting range queries.

You can also check out our other blogs on Code360.

Live masterclass

Zomato Data Analysis Case Study: Ace 25L+ Roles in FoodTech

by Abhishek Soni

16 Mar, 2026

01:30 PM

40+ registered

Data Analysis for 20L+ CTC@Flipkart: End-Season Sales dataset

by Sumit Shukla

15 Mar, 2026

06:30 AM

267+ registered

Beginner to GenAI Engineer Roadmap for 30L+ CTC at Amazon

by Shantanu Shubham

15 Mar, 2026

08:30 AM

55+ registered

Multi-Agent AI Systems: Live Workshop for 25L+ CTC at Google

by Saurav Prateek

16 Mar, 2026

03:00 PM

8+ registered

Zomato Data Analysis Case Study: Ace 25L+ Roles in FoodTech

by Abhishek Soni

16 Mar, 2026

01:30 PM

40+ registered

Data Analysis for 20L+ CTC@Flipkart: End-Season Sales dataset

by Sumit Shukla

15 Mar, 2026

06:30 AM

267+ registered

View more events

Indexing and Hashing in DBMS

Are you ready for your Dream Job?

Introduction

1. Indexing

2. Hashing

Python

Difference between Indexing and Hashing in DBMS:

Frequently Asked Questions

Can indexing and hashing be used together in a database?

How do I choose between indexing and hashing for a specific database table?

What are the trade-offs of using indexing and hashing in terms of storage space?

Conclusion