Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1. Introduction
2. What is Hash Function in Data Structure?
3. Types Of Hash Function In Data Structures
3.1. Division Method
3.1.1. Advantage:
3.1.2. Disadvantage:
3.2. Mid Square Method
3.2.1. Syntax:
3.2.2. Code:
3.2.3. Advantages:
3.2.4. Disadvantages:
3.3. Digit Folding Method
3.3.1. Syntax:
3.3.2. Code:
3.3.3. Advantages:
3.3.4. Disadvantages:
3.4. Multiplication Method
3.4.1. Syntax:
3.4.2. Code:
3.4.3. Advantages:
3.4.4. Disadvantages:
4. Frequently Asked Questions
4.1. What is a hash function used for?
4.2. What is the best hash function?
4.3. What is the Key in hashing?
4.4. What is a hash function for binary data?
4.5. What is the formula for a hash function?
5. Conclusion
Last Updated: Mar 27, 2024

Hash Function In Data Structure

Author Lokesh Sharma

Introduction

Hashing is a powerful and efficient method of storing and retrieving data, often used in computer science and programming. It is the backbone of data structures such as hash tables, sets, and associative arrays. It helps us find the required data in almost constant time.

In this blog, we'll dive deep into the world of hashing. We will explore how hashing works, its various applications, and the techniques used to apply the hash function in data structure. 

Let us start by understanding what hashing is.


What is Hash Function in Data Structure?

A hash function is a function that takes an input (a key, or 'message') of arbitrary size and returns a fixed-size value, called a hash code or hash value. The same input always produces the same output, and a good hash function gives different outputs for most different inputs, although two inputs can occasionally share a hash. Cryptographic hash functions are additionally one-way: it is easy to compute the hash for any given input, but computationally infeasible to recover the original input from the hash.

In data structures, the hash function converts a key into an index (hash code) that is used to access the data stored in a hash table. Ideally, each unique key would map to its own location in the table; in practice, two keys can map to the same location (a collision), so the goal is to keep collisions rare. The hash code is computed using a mathematical operation that takes the key as input and returns an integer value.

A good hash function should have the following characteristics:

  • It should be deterministic. This means that a given input should always produce the same output.
     
  • It should be fast to compute.
     
  • It should produce a uniformly distributed output for a random set of inputs.
     
  • For cryptographic uses, it should be hard to predict the output for a given input.
     
  • It should produce as few collisions as possible.

Types Of Hash Function In Data Structures

There are several types of hash functions in data structures. We will focus on some of the most common hashing techniques.

  • Division Hashing
  • Mid-Square Hashing
  • Digit Folding Method
  • Multiplication Hashing

Division Method

This method uses the modulo operator to compute the hash code for a key. The modulo operator returns the remainder of the key divided by the number of buckets in the table. For example, if the key is an integer and the number of buckets is 10, the hash function could be:

int hash(int key) {
    return key % 10;
}

 

Input: 

99

 

Output: 

9

 

Syntax:

F(k) = k mod M

 

Here, M is the table’s size, and k is the key value.

Advantage:

It is a very fast method as it requires just one basic operation.

Disadvantage:

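The distribution of hash codes depends heavily on the value of M. A prime number is usually recommended for M because it tends to spread keys more evenly; a poor choice, such as a power of 10, makes the hash code depend only on the last digits of the key.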

Mid Square Method 

This method involves squaring the key and extracting some digits from the middle of the resulting value. This is followed by taking the modulo of that value with the number of buckets in the table. This method is used when the key is a non-negative integer.

Syntax:

F(k) = h(k × k)

 

Here, k is the key value, and h extracts the middle digits of k × k.

Code:


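#include <iostream>
using namespace std;

// Mid-square hash. numDigits is the number of middle digits to keep; the
// original listing left it undefined, so a default of 1 is assumed here.
// The function is not named hash because that clashes with std::hash.
int midSquareHash(int key, int numBuckets, int numDigits = 1) {

// square the key (long long avoids overflow for large int keys)
   long long square = 1LL * key * key;

// count the number of digits in the square (0 counts as one digit)
   int digits = 1;
   for (long long t = square; t >= 10; t /= 10) digits++;

// calculate how many low-order digits to drop to reach the middle
   int offset = (digits > numDigits) ? (digits - numDigits) / 2 : 0;

// powers of ten used to drop the low digits and to keep numDigits digits
   long long drop = 1, keep = 1;
   for (int i = 0; i < offset; i++) drop *= 10;
   for (int i = 0; i < numDigits; i++) keep *= 10;

// extract the middle digits
   long long middle = (square / drop) % keep;

// take the modulo of the middle digits with the number of buckets
   return (int)(middle % numBuckets);
}

int main() {
   int key;
   cout << "Enter a key: ";
   cin >> key;

   int numBuckets = 10;
   int h = midSquareHash(key, numBuckets);
   cout << "Hash code: " << h << endl;
   return 0;
}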

Output:

Enter a key: 12
Hash code: 4

 

Advantages:

  • It is simple to implement.
     
  • It can be used with non-negative integer keys.
     
  • It can handle non-uniformly distributed keys.

 

Disadvantages:

  • The resulting hash code highly depends on the key's number of digits.
     
  • It is not suitable for use with keys that are not integers.
     
  • It can lead to poor distribution of keys if the key values are not large enough.

Digit Folding Method

This method involves taking the key, breaking it down into groups of digits, and then adding the groups together. The resulting sum is then taken modulo the number of buckets in the table. This method can be used when the key is an integer or a string. The code below uses groups of a single digit.

Syntax:

k = k1, k2, k3, k4, ….., kn
s = k1 + k2 + k3 + k4 + …. + kn
F(k) = s mod M

 

Here, k1, k2, ….., kn are the parts of the key k, s is obtained by adding those parts, and M is the number of buckets in the table.

Code:

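#include <iostream>
using namespace std;

// Digit folding with one-digit groups: add the decimal digits of the key,
// then reduce the sum modulo the number of buckets. Assumes key >= 0.
// The function is not named hash because that clashes with std::hash.
int digitFoldHash(int key, int numBuckets) {
    int sum = 0;
    while (key > 0) {
        sum += key % 10;   // add the last digit
        key /= 10;         // drop the last digit
    }
    return sum % numBuckets;
}

int main()
{
	int numBuckets = 10;
	int key = 123;
	cout << digitFoldHash(key, numBuckets);   // 1 + 2 + 3 = 6, and 6 % 10 = 6
	return 0;
}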

 

Output: 

6

 

Advantages:

  • It is simple to implement.
     
  • It can be used with both integer and string keys.
     
  • It can handle non-uniformly distributed keys.

 

Disadvantages:

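  • The resulting hash code may not be unique for all keys. For example, keys made of the same digits in a different order, such as 123 and 321, give the same sum.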
     
  • It is not suitable for use with large keys.
     
  • It can lead to poor distribution of keys if the key values are not large enough.

Multiplication Method

This method involves multiplying the key by a constant value and then taking the fractional part of the result. The fractional part is multiplied by the number of buckets in the table, and the floor of that value is taken as the hash code.

Syntax:

h(k) = floor(M × (kA mod 1))

 

Here,

M is the size of the hash table.

k is the key value.

A is a constant value between 0 and 1.

kA mod 1 is the fractional part of kA.

Code: 

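#include <iostream>
#include <cmath>
using namespace std;

// Multiplication method: floor(tableSize * fractional part of key * A).
// The table size is passed in, since it was not visible inside the function before.
int multiplicationHash(int key, int tableSize) {
    double A = 0.6180339887;          // approximately (sqrt(5) - 1) / 2
    double val = key * A;
    val = val - floor(val);           // keep only the fractional part
    return (int)(tableSize * val);
}

int main() {
    int table_size = 10;
    int key = 123;
    cout << multiplicationHash(key, table_size);
    return 0;
}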

 

Output:

0

 

Advantages:

  • It produces a good distribution of keys.
     
  • It is suitable for use with both integer and real keys.
     
  • It can handle non-uniformly distributed keys.
     

Disadvantages:

  • It is more complex to implement than the division and folding methods.
     
  • It may not work well with small table sizes.
     
  • It may not produce unique hash codes for all keys.

 

The choice of which hash function to use depends on the application's specific requirements, such as the type of keys, the size of the table, and the desired distribution of keys. 

The Mid-Square Method is simple to implement and can handle non-uniformly distributed keys, but it is not suitable for use with keys that are not integers. 

The Digit Folding Method can be used with both integer and string keys, but it may not produce unique hash codes for all keys. 

In comparison, the Multiplication Method is often a good choice, as it produces a good distribution of keys and can handle non-uniformly distributed keys. However, it is more complex to implement than the other methods.


Frequently Asked Questions

What is a hash function used for?

Developers use Hash Functions to convert data into numeric representations. This number is called a hash code or hash value. Hash codes find application in both hash tables and cryptographic algorithms. The main aim of using these functions is to ensure that the data is spread evenly in the hash table to avoid collisions.

What is the best hash function?

There is no single best hash function; the right choice depends on the application. A good hash function has few collisions, a good distribution, and efficient computation. Where security matters, cryptographic hash functions such as SHA-256 and BLAKE2 are used, while hash tables typically rely on simpler, faster functions.

What is the Key in hashing? 

In hashing, the key is the input data or value provided to the hash function. The hash function turns the key into a hash code, which gives the index used to store or retrieve the data in a hash table. The key therefore determines both the hash value and the corresponding index in the hash table.

What is a hash function for binary data?

A hash function for binary data is a mathematical algorithm that takes binary input and produces a fixed-size string of characters as output. It is used for data integrity verification and indexing binary data efficiently.

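What is the formula for a hash function?

There is no single formula: a hash function converts input data (such as a string or file) into a fixed-size value or hash code, and each method defines its own formula. For example, the division method uses F(k) = k mod M, and the multiplication method uses h(k) = floor(M × (kA mod 1)). Practical hash functions typically combine arithmetic and bitwise operations, aiming for uniform distribution and few collisions.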

Conclusion

In this article, we discussed hashing in detail. We looked at what a hash function in data structure is and how it lets us store and search data in constant time on average. There are several types of hash function in data structure, and we can choose one according to our needs.
