Table of contents
1. Introduction
2. What is an inverted index?
3. Utility of indexing
4. Example
4.1. Inverted index for table flower
5. Types of inverted index
5.1. Record level inverted index
5.2. Word level inverted index
6. FAQs
7. Key takeaways
Last Updated: Mar 27, 2024

Inverted index

Author Ankit Kumar

Introduction

Let us start this article with an activity. Try to find your name in this blog in the next three minutes!

Did you find your name? No? Maybe you did not check the blog thoroughly, so check it again. Did you find your name this time? Still no? Your name is not in the blog, yet you searched for it, and that took a lot of time. Now imagine a tool in which you type the word you are looking for and immediately get the word's location. That tool is indexing, or more precisely, an inverted index.

We often face this kind of situation while handling databases, where searching for a particular word or group of words is very slow without an inverted index. In indexing, we store the content in an index data structure so that, whenever we need the information, it can be accessed smoothly and quickly.

Not only this, but indexing also helps reduce the input/output (I/O) cost of fetching data.

So let us dive deeper into the topic of indexing without wasting more time, as you have already spent enough of it searching for your name.


What is an inverted index?

An inverted index is a data structure, similar to a hash map, that maps each piece of content (such as a word) to the locations where it occurs (such as documents or table rows), so that users can access it quickly.

It is handy for finding where a particular item occurs in a database. Let us now see how it is helpful.


Utility of indexing

As you have already seen in the introduction, searching for a word by reading through everything is expensive, while finding a word with the help of an inverted index is easy. That is the first utility of indexing. Another utility is that it reduces the input/output (I/O) cost of moving data between the disk and main memory. Let us see how.

We know that in a computer's architecture, the CPU does not work directly on data in secondary memory (the hard disk), because secondary memory is far slower than the CPU. So we use main memory to hold the data temporarily. Whenever the CPU needs data that is not in main memory, that data is read in from secondary memory, and data that is no longer required is moved out. This transfer is costly. With indexing, we create indexes for the database, so the required data can be located without reading through everything, which reduces the number of transfers.

Now that we have seen the utility of indexing, let us understand it with an example.

Example

Let us consider the following table, flower, in the database.

Flower_id   Flower_name   Flower_color
7           Rose          Red
9           Lotus         White, Pink
11          China rose    Red, Orange

Suppose we want to search for the word rose in the above table. The traditional way is to traverse the whole table, which takes a tremendous amount of time when the table is large. If we create an inverted index for the table, which will look like the following, the word can be found easily.

Inverted index for table flower

Flower_id   Term
7, 11       Rose
9           Lotus
11          China
7, 11       Red
9           White
9           Pink
11          Orange

By looking up a term in the above inverted index, one can directly find the rows that contain it. For example, Rose occurs in the rows with Flower_id 7 and 11.
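
To make the example concrete, here is a minimal Python sketch that builds the inverted index for the flower table and looks up a word. The names FLOWERS, tokenize, build_inverted_index and search are made up for this illustration, and lowercasing the terms is an assumption of the sketch, not something the table itself requires.

```python
from collections import defaultdict

# Rows of the flower table: (Flower_id, Flower_name, Flower_color)
FLOWERS = [
    (7, "Rose", "Red"),
    (9, "Lotus", "White, Pink"),
    (11, "China rose", "Red, Orange"),
]


def tokenize(text):
    """Split a cell into lowercase terms, treating commas as separators."""
    return text.replace(",", " ").lower().split()


def build_inverted_index(rows):
    """Map each term to the sorted list of Flower_ids whose row contains it."""
    index = defaultdict(set)
    for flower_id, name, color in rows:
        for term in tokenize(name) + tokenize(color):
            index[term].add(flower_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, word):
    """Look a word up directly instead of scanning every row."""
    return index.get(word.lower(), [])


index = build_inverted_index(FLOWERS)
print(search(index, "rose"))    # [7, 11]
print(search(index, "Orange"))  # [11]
print(search(index, "tulip"))   # []
```

The index is built once by reading every row, but after that each search is a single dictionary lookup, no matter how many rows the table has.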

Now let us see the types of inverted index.


Types of inverted index

Briefly, there are two types of inverted index:

  1. Record level inverted index
  2. Word level inverted index

Record level inverted index

For each word, it contains a list of references to the documents in which that word appears. It is less functional than the other type, but it takes less time to build and requires less maintenance.
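
A record level index can be sketched in a few lines of Python. The documents in DOCS and the function name build_record_level_index are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical documents, keyed by document id.
DOCS = {
    1: "red rose and white lotus",
    2: "china rose is red",
    3: "white lotus",
}


def build_record_level_index(docs):
    """Map each word to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


record_index = build_record_level_index(DOCS)
print(record_index["rose"])   # [1, 2]
print(record_index["lotus"])  # [1, 3]
```

Note that this index tells us which documents contain a word, but not where in the document the word appears.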

Word level inverted index

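For each word, it also contains the positions at which the word appears within each document. It is more functional (for example, it can support phrase searches), but it takes more time to build and requires more maintenance.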
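
Here is a minimal Python sketch of a word level index, under the assumption that a position is simply the word's offset in the document, counted from 0. The documents in DOCS and the names build_word_level_index and phrase_search are hypothetical. The two-word phrase search is included only to show one thing that stored positions make possible.

```python
from collections import defaultdict

# Hypothetical documents, keyed by document id.
DOCS = {
    1: "red rose and white lotus",
    2: "china rose is red",
    3: "white lotus",
}


def build_word_level_index(docs):
    """Map each word to {doc_id: [positions]}, with positions counted from 0."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    return dict(index)


def phrase_search(index, first, second):
    """Return ids of documents where `second` appears right after `first`."""
    hits = []
    first_postings = index.get(first, {})
    second_postings = index.get(second, {})
    for doc_id, positions in first_postings.items():
        following = set(second_postings.get(doc_id, []))
        if any(p + 1 in following for p in positions):
            hits.append(doc_id)
    return sorted(hits)


word_index = build_word_level_index(DOCS)
print(word_index["rose"])                           # {1: [1], 2: [1]}
print(phrase_search(word_index, "white", "lotus"))  # [1, 3]
print(phrase_search(word_index, "rose", "red"))     # []
```

Document 2 contains both rose and red, but not next to each other, so the phrase search correctly skips it. The price is that every occurrence of every word has to be stored and kept up to date.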

Let us now see some frequently asked questions on this topic.


FAQs

  1. Is there any disadvantage of indexing?
    Though we discussed many advantages of inverted indexing, it has certain limitations. An inverted index takes up a large amount of space, and creating it is itself an expensive process.
     
  2. How can we transform the data before searching and saving it?
    This can be done with the help of two methods:
    Dropping stop words - We may drop the most common words in the database, like “I”, “in”, and “is”.
    Stemming - The process of transforming a word into its root form by clipping the ending of the word.
     
  3. What is lemmatization?
    It is the process of changing a word into its dictionary form. For example, “Running” becomes “Run” after lemmatization.
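
The two transformation methods from the FAQs, dropping stop words and stemming, can be sketched in Python as follows. This is only an illustration: the stop word list is just the three examples mentioned above, and naive_stem with its tiny SUFFIXES rule set is a made-up clipper, far simpler than the stemmers used in real search systems.

```python
STOP_WORDS = {"i", "in", "is"}  # the FAQ's examples; real lists are much longer
SUFFIXES = ("ing", "ed", "s")   # hypothetical, deliberately tiny rule set


def naive_stem(word):
    """Clip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase the text, drop stop words, then stem the remaining words."""
    words = text.lower().split()
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


print(preprocess("I planted roses in the garden"))
# ['plant', 'rose', 'the', 'garden']
print(naive_stem("running"))  # runn
```

The last line shows why clipping is not the same as lemmatization: this naive clipper turns running into runn, while lemmatization would give the dictionary form run.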

Key takeaways

In this article, we started with a fun yet time-consuming activity, and with its help, we learned about the utility of indexing in DBMS, followed by an example and the types of inverted index. Finally, we saw some frequently asked questions on the topic. But that is not all. To get ahead of your competitors, you can visit Coding Ninjas Studio and practice some important problems from our list of the top 100 SQL problems.

Till then, happy learning!
