Inverted Index
The inverted index is a data structure or a search method that allows index storing as a mapping from data such as words or numbers to its locations in a database or a document or a set of documents. Its primary purpose is to allow a fast full-text search.
Read About - Specialization and Generalization in DBMS and Checkpoint in DBMS.
Working of Inverted Indexing:
Let's suppose you have a website or want one so that each page can be easily searched via keywords located on every page. Following are the steps of inverted indexing:
- Collect all the keywords by going to the first page.
- Check if the keyword is already present in the index for every keyword.
- Adds the reference of that page to that index if it is already present.
- If it's not present, create a new entry for that particular keyword
- Add additional information such as the total number of times it occurs in the document location in the document.
- Go to the next page and repeat step 1 until all pages are indexed.
- Sort the keywords.
Results of searching:
| Keywords | Pages |
| come-on | http://www.example.com/page2 |
| Cpu | http://www.example.com/page3 |
| darwin | http://www.example.com/page3 |
| greetings | http://www.example.com/page2 |
| hello | http://www.example.com/page1 |
Also See, Multiple Granularity in DBMS and Recursive Relationship in DBMS
Forward Index
Forward indexing is a data structure that stores are mapping from documents to words. It means that the document ID is the key, and the number of occurrences of each keyword is recorded in the table. When searching, the information of each document in the table is scanned until all documents containing the query keyword are found.
Working of Forward Indexing:
Let's suppose you have a website or want one so that each page can be easily searched via keywords located on every page. Following are the steps of forward indexing:
- Collect all the keywords by going to the first page
- Add the keywords in the index entry for that respective document
- Repeat step 1 until all pages are indexed by going to the next page.
After all the pages are indexed, a list of all pages will be there, having their associated keywords attached to it. So you'll search the index itself, and you will come up with the list of pages associated with it whenever you need to find all pages that match a particular keyword.
| Documents | Keywords |
| http://www.example.com/page1 | hello, world, hey, wait |
| http://www.example.com/page2 | greetings, world, tube, come-on |
| http://www.example.com/page3 | Mr, darwin, CPU, world |
The Key differences between Inverted Index and Forward Index
| Inverted Index | Forward Index |
| Maps words or text to documents. | Maps document to words. |
| Indexing is slower since a check needs to validate if the word exists. | Indexing is faster since adding texts/ words can be appended only. |
| Searching for information retrieval is faster. | Searching for information is slower. |
You can also read about the Multiple Granularity Locking.
Advantages of inverted indexing over forward indexing:
- It stops replicated keywords from being stored. It records some information by checking first if the keyword is already present.
- Each keyword has to process the previous indexes, making the indexing a bit slower.
- But on the other hand, this makes querying VERY fast. Since the search engine only needs to look up a single record, all pages are already there.





