Types Of Hash Function In Data Structures
Several types of hash functions are used in data structures. We will focus on four of the most common hashing techniques:
- Division Method
- Mid-Square Method
- Digit Folding Method
- Multiplication Method
Division Method
This method uses the modulo operator to compute the hash code for a key. The modulo operator returns the remainder of the key divided by the number of buckets in the table. For example, if the key is an integer and the number of buckets is 10, the hash function could be:
int hash(int key) {
    return key % 10;
}
Input:
99
Output:
9
Syntax:
F(k) = k mod M
Here, M is the table’s size, and k is the key value.
Advantage:
It is a very fast method as it requires just one basic operation.
Disadvantage:
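The quality of the distribution depends heavily on the choice of M. A prime M is usually recommended, because keys that share a common factor or pattern are then spread across more buckets. With M = 10, for example, every key that is a multiple of 10 lands in bucket 0.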
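To make the effect of the table size concrete, here is a minimal sketch that counts how many distinct buckets a set of patterned keys occupies. The helper names `divisionHash`, `bucketsUsed`, and `multiplesOfTen` are hypothetical and chosen only for this illustration, and the keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Division-method hash: remainder of the key divided by the table size.
// Assumes a non-negative key and a positive table size.
int divisionHash(int key, int tableSize) {
    assert(key >= 0 && tableSize > 0);
    return key % tableSize;
}

// Hypothetical helper: number of distinct buckets the given keys land in.
int bucketsUsed(const std::vector<int>& keys, int tableSize) {
    std::set<int> used;
    for (int key : keys) {
        used.insert(divisionHash(key, tableSize));
    }
    return static_cast<int>(used.size());
}

// Hypothetical sample data: the keys 10, 20, ..., 10 * count.
std::vector<int> multiplesOfTen(int count) {
    std::vector<int> keys;
    for (int i = 1; i <= count; ++i) {
        keys.push_back(10 * i);
    }
    return keys;
}
```

With the eleven keys 10, 20, ..., 110, a table of size 10 puts every key in the same bucket, while a table of size 11 (a prime) uses all eleven buckets.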
Mid-Square Method
This method involves squaring the key and extracting some digits from the middle of the resulting value. This is followed by taking the modulo of that value with the number of buckets in the table. This method is used when the key is a non-negative integer.
Syntax:
F(k) = h(k × k)
Here, k is the key value, and h extracts the middle digits of the square.
Code:
C++
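#include <iostream>

// numDigits: how many middle digits to keep (assumed default: 1)
int hash(int key, int numBuckets, int numDigits = 1) {
    // square the key (64-bit to avoid overflow for large keys)
    long long square = 1LL * key * key;
    // count the number of digits in the square
    int digits = 1;
    for (long long t = square; t >= 10; t /= 10) {
        ++digits;
    }
    // calculate the offset: how many low-order digits to drop
    int offset = (digits - numDigits) / 2;
    if (offset < 0) offset = 0;
    // extract the middle digits
    long long divisor = 1, window = 1;
    for (int i = 0; i < offset; ++i) divisor *= 10;
    for (int i = 0; i < numDigits; ++i) window *= 10;
    long long middle = (square / divisor) % window;
    // take the modulo of the middle digits with the number of buckets
    return static_cast<int>(middle % numBuckets);
}

int main() {
    int key;
    std::cout << "Enter a key: ";
    std::cin >> key;
    int numBuckets = 10;
    int h = hash(key, numBuckets);
    std::cout << "Hash code: " << h << std::endl;
    return 0;
}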
Output:
Enter a key: 12
Hash code: 4
Advantages:
- It is simple to implement.
- It can be used with non-negative integer keys.
- It can handle non-uniformly distributed keys.
Disadvantages:
- The resulting hash code highly depends on the key's number of digits.
- It is not suitable for use with keys that are not integers.
- It can lead to poor distribution of keys if the key values are not large enough.
Digit Folding Method
This method breaks the key into groups of digits and adds the groups together. The sum is then taken modulo the number of buckets in the table. The code below uses groups of one digit, so it simply adds the individual digits of the key. This method can be used when the key is an integer or a string.
Syntax:
k = k1, k2, k3, k4, ..., kn
s = k1 + k2 + k3 + k4 + ... + kn
F(k) = s mod M
Here, s is obtained by adding the parts k1, ..., kn of the key k, and M is the table's size.
Code:
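#include <iostream>

int hash(int key, int numBuckets) {
    int sum = 0;
    // add up the decimal digits of the key, lowest digit first
    while (key > 0) {
        sum += key % 10;
        key /= 10;
    }
    // reduce the digit sum to a bucket index
    return sum % numBuckets;
}

int main() {
    int numBuckets = 10;
    int key = 123;
    std::cout << hash(key, numBuckets);
    return 0;
}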
Output:
6
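The listing above folds the key one digit at a time. As a rough sketch of the more general case, the function below folds a key in groups of several decimal digits. The name `foldingHash` and the `groupSize` parameter are assumptions made for this illustration, and the key is assumed to be non-negative.

```cpp
#include <cassert>

// Folding hash with groups of groupSize decimal digits.
// Groups are taken from the low-order end of the key.
int foldingHash(long long key, int groupSize, int numBuckets) {
    assert(key >= 0 && groupSize > 0 && groupSize <= 9 && numBuckets > 0);

    // groupBase = 10^groupSize, e.g. 100 for two-digit groups
    long long groupBase = 1;
    for (int i = 0; i < groupSize; ++i) {
        groupBase *= 10;
    }

    long long sum = 0;
    while (key > 0) {
        sum += key % groupBase;  // add the lowest group
        key /= groupBase;        // drop it from the key
    }
    return static_cast<int>(sum % numBuckets);
}
```

For the key 123456 with two-digit groups, the groups are 56, 34, and 12, which add up to 102. With 100 buckets the hash code is 2. A group size of 1 reproduces the digit sum used in the listing above.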
Advantages:
- It is simple to implement.
- It can be used with both integer and string keys.
- It can handle non-uniformly distributed keys.
Disadvantages:
- Keys that contain the same digits in a different order, such as 123 and 321, produce the same hash code.
- It is not suitable for use with large keys.
- It can lead to poor distribution of keys if the key values are not large enough.
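For string keys, one simple way to apply folding is to treat each character as one part of the key and add up the character codes. The sketch below assumes this interpretation; the name `foldStringHash` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Folding hash for strings: each character code is one part of the key.
int foldStringHash(const std::string& key, int numBuckets) {
    assert(numBuckets > 0);
    unsigned long long sum = 0;
    for (unsigned char c : key) {
        sum += c;  // add the numeric code of each character
    }
    return static_cast<int>(sum % numBuckets);
}
```

In ASCII, "abc" gives 97 + 98 + 99 = 294, so with 10 buckets the hash code is 4. The string "cba" gives the same sum and therefore the same hash code, because the order of the parts does not affect the result.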
Multiplication Method
This method involves multiplying the key by a constant value and then taking the fractional part of the result. The fractional part is multiplied by the number of buckets in the table, and the floor of that value is taken as the hash code.
Syntax:
h(k) = floor(M × (kA mod 1))
Here,
M is the size of the hash table.
k is the key value.
A is a constant with 0 < A < 1. A common choice is (√5 − 1) / 2 ≈ 0.6180339887.
Code:
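#include <cmath>
#include <iostream>

int hash(int key, int table_size) {
    // A is a constant in (0, 1); this value approximates (sqrt(5) - 1) / 2
    double A = 0.6180339887;
    double val = key * A;
    // keep only the fractional part of key * A
    val = val - std::floor(val);
    // scale the fraction to the table size and truncate
    return (int)(table_size * val);
}

int main() {
    int table_size = 10;
    int key = 123;
    std::cout << hash(key, table_size);
    return 0;
}
Output:
0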
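The multiplication method can also be written with integer arithmetic only, which avoids floating-point rounding. The sketch below is one common fixed-point variant, not the listing above: it assumes 32-bit unsigned keys and a table size of 2^p, and it uses the integer 2654435769, which is the golden-ratio constant A scaled by 2^32 and rounded down. The name `multiplicativeHash` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Fixed-point multiplication hash for a table with 2^p slots.
// key * A wraps around modulo 2^32, which keeps the "fractional part"
// of key * (A / 2^32); the top p bits of that value are the slot index.
std::uint32_t multiplicativeHash(std::uint32_t key, unsigned p) {
    assert(p >= 1 && p <= 32);
    const std::uint32_t A = 2654435769u;  // floor(2^32 * 0.6180339887...)
    return (key * A) >> (32 - p);
}
```

With p = 4 (16 slots), the key 1 maps to slot 9, which matches floor(16 × 0.618...) from the formula above.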
Advantages:
- It produces a good distribution of keys.
- It is suitable for use with both integer and real keys.
- It can handle non-uniformly distributed keys.
Disadvantages:
- It is more complex to implement than the division and folding methods.
- It may not work well with small table sizes.
- It may not produce unique hash codes for all keys.
The choice of which hash function to use depends on the application's specific requirements, such as the type of keys, the size of the table, and the desired distribution of keys.
The Mid-Square Method is simple to implement and can handle non-uniformly distributed keys, but it is not suitable for use with keys that are not integers.
The Digit Folding Method can be used with both integer and string keys, but it may not produce unique hash codes for all keys.
In comparison, the Multiplication Method is considered to be one of the best hash functions as it produces a good distribution of keys and can handle non-uniformly distributed keys. However, it is more complex to implement than the other methods.
Frequently Asked Questions
What is a hash function used for?
Developers use hash functions to convert data into a numeric representation. This number is called a hash code or hash value. Hash codes are used in both hash tables and cryptographic algorithms. In a hash table, the main aim is to spread the data evenly across the table so that collisions are rare.
What is the best hash function?
There is no single best hash function; the right choice depends on the application. Good hash functions produce few collisions, distribute keys evenly, and are cheap to compute. For security-sensitive uses, cryptographic hash functions such as SHA-256 and BLAKE2 are common choices, although they are slower than the simple methods described above.
What is the key in hashing?
In hashing, the key is the input value passed to the hash function. The hash function maps the key to a hash code, which determines the index used to store or retrieve data in a hash table. Different keys can map to the same index, which is called a collision.
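As a rough illustration of how a key leads to an index, here is a minimal hash table sketch that uses the division method to pick a bucket and keeps colliding entries in a list. The class `TinyHashTable` and its methods are hypothetical names for this example, and keys are assumed to be non-negative integers.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

class TinyHashTable {
public:
    explicit TinyHashTable(std::size_t numBuckets) : buckets_(numBuckets) {
        assert(numBuckets > 0);
    }

    // Division method: the key modulo the number of buckets is the index.
    std::size_t indexFor(int key) const {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % buckets_.size();
    }

    // Stores the value under the key, replacing any existing value.
    void put(int key, const std::string& value) {
        auto& bucket = buckets_[indexFor(key)];
        for (auto& entry : bucket) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        bucket.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* get(int key) const {
        const auto& bucket = buckets_[indexFor(key)];
        for (const auto& entry : bucket) {
            if (entry.first == key) {
                return &entry.second;
            }
        }
        return nullptr;
    }

private:
    std::vector<std::list<std::pair<int, std::string>>> buckets_;
};
```

With 10 buckets, the keys 99 and 9 both map to index 9. Both values can still be retrieved, because each bucket stores the full key next to its value.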
What is a hash function for binary data?
A hash function for binary data takes a sequence of bytes of any length and produces a fixed-size value as output. It is used for verifying data integrity and for indexing binary data efficiently.
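One widely used non-cryptographic example is 32-bit FNV-1a, which is not one of the four methods covered in this article. The sketch below is given only to show the general shape of a byte-oriented hash function that combines a bitwise operation with a multiplication; the function name `fnv1a32` is chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
std::uint32_t fnv1a32(const unsigned char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (std::size_t i = 0; i < length; ++i) {
        hash ^= data[i];               // mix in one byte
        hash *= 16777619u;             // FNV prime, wraps modulo 2^32
    }
    return hash;
}
```

The output is always a 32-bit value, regardless of how many bytes are hashed. To use it as a hash table index, the result can be reduced modulo the number of buckets.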
What is the formula for a hash function?
There is no single formula. A hash function converts input data (such as a string or file) into a fixed-size value or hash code, typically by combining arithmetic and bitwise operations so that outputs are spread uniformly and collisions are rare. For example, the division method uses F(k) = k mod M.
Conclusion
In this article, we discussed the types of hash function in data structures. A good hash function lets us locate data in a hash table in constant time on average. Each method has its own trade-offs, so we can choose one according to our needs.
Keep Grinding! 🦾
Happy Coding! 💻