Key Properties of Hash Functions
A hash function is an algorithm that converts input data of arbitrary size into a fixed-size value (the hash). It is crucial for data indexing, cryptography, and integrity verification.
1. Deterministic
A good hash function always produces the same output for the same input. This property ensures consistent hashing results, making it reliable for data indexing and cryptographic functions.
2. Fixed Output Size
Regardless of the input size, a hash function produces a fixed-length output (e.g., 256-bit hash for SHA-256). This uniformity is essential for efficient storage and comparison.
3. Efficiency
Hash functions should compute outputs quickly, even for large inputs. High efficiency is crucial for applications like databases, cryptography, and distributed systems.
4. Uniformity
A well-designed hash function evenly distributes outputs across its range, reducing the likelihood of clustering and improving performance in hash tables and load balancing.
5. Pre-image Resistance
Given a hash output, it should be computationally infeasible to reverse-engineer the original input. This property is critical for secure password storage and cryptography.
6. Collision Resistance
It should be extremely difficult to find two different inputs that produce the same hash value. Collision resistance is vital for digital signatures and cryptographic security.
7. Avalanche Effect
A small change in input should drastically alter the output hash. This property enhances security by making it hard to predict hash values based on similar inputs.
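The determinism and avalanche properties described above can be observed with a small experiment. The following is only a sketch under stated assumptions: `mix64` is the finalizer step of the SplitMix64 generator, picked here because it is short, and it is not a cryptographic hash. `differingBits` is a hypothetical helper introduced for this example.

```cpp
#include <bitset>
#include <cstdint>

// 64-bit mixing function (the SplitMix64 finalizer). Not cryptographic;
// used here only to illustrate determinism and the avalanche effect.
std::uint64_t mix64(std::uint64_t x) {
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Hypothetical helper: number of bit positions in which two hashes differ.
int differingBits(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}
```

Calling `mix64(1000)` twice returns the same value, which is the deterministic property. `differingBits(mix64(1000), mix64(1001))` reports how many of the 64 output bits changed after a one-bit change in the input; for a function with a good avalanche effect, this is typically close to half of them.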
Applications of Hash Functions
Hash functions play a crucial role in data retrieval, security, and integrity verification, enabling fast access, consistency checks, and authentication.
1. Hash Tables
Hash tables (also called hash maps) use a hash function to map each key to a bucket, so key-value pairs can be looked up, inserted, and deleted in constant time on average. This makes them essential for databases and caching.
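As a brief illustration of lookups and insertions in a hash table, the sketch below uses `std::unordered_map`, the hash table in the C++ standard library. The function name `countWords` and its input are assumptions made for this example.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Count how often each word occurs. Each lookup and insertion hashes the
// word to find its bucket, which takes constant time on average.
std::unordered_map<std::string, int> countWords(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];  // inserts the key with value 0 if it is absent, then increments
    }
    return counts;
}
```

For the input `{"a", "b", "a"}`, the returned map holds two keys, with `"a"` mapped to 2 and `"b"` mapped to 1. Deleting a key works the same way through `erase`.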
2. Data Integrity
Checksums and hash functions verify data consistency by detecting corruption or tampering in files, databases, and network communications.
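To make the idea concrete, here is a minimal sketch of an integrity check based on the Adler-32 checksum. Adler-32 is used only because it is short; it detects accidental corruption but offers no protection against deliberate tampering, for which a cryptographic hash such as SHA-256 would be needed. The helper name `isIntact` is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Adler-32 checksum of a byte string.
std::uint32_t adler32(const std::string& data) {
    const std::uint32_t MOD = 65521;  // largest prime below 2^16
    std::uint32_t a = 1, b = 0;
    for (unsigned char c : data) {
        a = (a + c) % MOD;  // running sum of the bytes
        b = (b + a) % MOD;  // running sum of the running sums
    }
    return (b << 16) | a;
}

// Hypothetical helper: recompute the checksum and compare it with the
// value that was stored or transmitted alongside the data.
bool isIntact(const std::string& received, std::uint32_t expectedChecksum) {
    return adler32(received) == expectedChecksum;
}
```

The sender stores `adler32(data)` next to the data; the receiver calls `isIntact` on what arrived. A single changed character, such as "Wikipedla" instead of "Wikipedia", produces a different checksum and is reported as corrupted.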
3. Cryptography
Hashing underpins password storage, digital signatures, and message authentication. Secure hash functions like SHA-256 protect data integrity in cryptographic applications; confidentiality is provided by encryption, not by hashing.
4. Data Structures
Hash functions enable efficient data organization in Bloom filters, hash sets, and Merkle trees, optimizing memory usage and lookup performance in large-scale applications.
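A Bloom filter shows how several hash functions can trade a small error rate for memory savings. The class below is a minimal sketch, not a production design: the sizes `kBits` and `kHashes` are arbitrary assumptions, and the second hash is derived from a salted copy of the item purely for brevity.

```cpp
#include <bitset>
#include <cstddef>
#include <functional>
#include <string>

// Minimal Bloom filter sketch: a bit array plus several hash functions.
// It may report false positives but never false negatives.
class BloomFilter {
public:
    void add(const std::string& item) {
        for (std::size_t i = 0; i < kHashes; ++i) bits_.set(index(item, i));
    }
    bool mightContain(const std::string& item) const {
        for (std::size_t i = 0; i < kHashes; ++i) {
            if (!bits_.test(index(item, i))) return false;  // definitely absent
        }
        return true;  // present, or a false positive
    }

private:
    static constexpr std::size_t kBits = 1024;  // assumed size for this example
    static constexpr std::size_t kHashes = 3;   // assumed number of hash functions

    // Derive the i-th bit position from two base hashes (double hashing).
    static std::size_t index(const std::string& item, std::size_t i) {
        std::size_t h1 = std::hash<std::string>{}(item);
        std::size_t h2 = std::hash<std::string>{}(item + "#");  // salted copy
        return (h1 + i * h2) % kBits;
    }

    std::bitset<kBits> bits_;
};
```

After `add("apple")`, `mightContain("apple")` always returns true. For an item that was never added, the answer is usually false, but it can occasionally be true when other items happen to set the same bits.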
Types of Hash Functions in Data Structures
There are several types of hash functions in data structures. We will focus on some of the most common hashing techniques:
- Division Method
- Mid-Square Method
- Digit Folding Method
- Multiplication Method
Division Method
This method uses the modulo operator to compute the hash code for a key. The modulo operator returns the remainder of the key divided by the number of buckets in the table. For example, if the key is an integer and the number of buckets is 10, the hash function could be:
int hash(int key) {
return key % 10;
}
Input:
99
Output:
9
Formula:
h(k) = k mod M
Here, M is the table’s size, and k is the key value.
Advantage:
- It is a very fast method as it requires just one basic operation.
Disadvantage:
- The quality of the distribution depends heavily on the choice of M. M should be a prime number, because keys that share a common factor with a non-prime M cluster in a few buckets.
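The effect of choosing a prime M can be checked with a short experiment. This is a sketch under simple assumptions: keys are non-negative, and `bucketsUsed` is a hypothetical helper that counts how many distinct buckets a set of keys occupies.

```cpp
#include <set>
#include <vector>

// Division method: h(k) = k mod M (keys are assumed non-negative).
int divisionHash(int key, int M) {
    return key % M;
}

// Hypothetical helper: how many distinct buckets a set of keys occupies.
int bucketsUsed(const std::vector<int>& keys, int M) {
    std::set<int> used;
    for (int k : keys) used.insert(divisionHash(k, M));
    return static_cast<int>(used.size());
}
```

For the keys 0, 10, 20, ..., 90, `bucketsUsed(keys, 10)` is 1 because every key is a multiple of 10 and lands in bucket 0. With the prime table size 11, `bucketsUsed(keys, 11)` is 10, so each key gets its own bucket.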
Mid-Square Method
This method squares the key and extracts some digits from the middle of the result. The extracted value is then taken modulo the number of buckets in the table. This method is used when the key is a non-negative integer.
Formula:
h(k) = (middle digits of k × k) mod M
Here, k is the key value, and M is the number of buckets in the table.
Code:
C++
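#include <iostream>
#include <string>
using namespace std;

// Mid-square hash: square the key, keep the middle numDigits digits,
// then reduce the result modulo the number of buckets.
// (Named midSquareHash because "hash" clashes with std::hash here.)
int midSquareHash(int key, int numBuckets, int numDigits = 1) {
    // square the key in 64 bits so that large keys do not overflow
    long long square = 1LL * key * key;
    // count the number of digits in the square
    int digits = (int)to_string(square).length();
    // calculate how many low-order digits to drop to reach the middle
    int offset = (digits - numDigits) / 2;
    if (offset < 0) offset = 0;
    // build the powers of ten with integer arithmetic (pow returns a double)
    long long drop = 1, keep = 1;
    for (int i = 0; i < offset; i++) drop *= 10;
    for (int i = 0; i < numDigits; i++) keep *= 10;
    // extract the middle digits
    long long middle = (square / drop) % keep;
    // take the modulo of the middle digits with the number of buckets
    return (int)(middle % numBuckets);
}
int main() {
    int key;
    cout << "Enter a key: ";
    cin >> key;
    int numBuckets = 10;
    int h = midSquareHash(key, numBuckets);
    cout << "Hash code: " << h << endl;
    return 0;
}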

Output:
Enter a key: 12
Hash code: 4
Advantages:
- It is simple to implement.
- It can be used with non-negative integer keys.
- It can handle non-uniformly distributed keys.
Disadvantages:
- The resulting hash code highly depends on the key's number of digits.
- It is not suitable for use with keys that are not integers.
- It can lead to poor distribution of keys if the key values are not large enough.
Digit Folding Method
This method splits the key into groups of digits (in the simplest case, single digits), adds the groups together, and takes the sum modulo the number of buckets in the table. It can be used when the key is an integer or, by adding character codes, a string.
Formula:
k = k1, k2, k3, k4, ..., kn
s = k1 + k2 + k3 + k4 + ... + kn
h(k) = s mod M
Here, k1, ..., kn are the parts of the key k, s is their sum, and M is the number of buckets in the table.
Code:
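#include <iostream>
using namespace std;

// Digit folding with one-digit groups: add up the decimal digits of the key.
int foldingHash(int key, int numBuckets) {
    int sum = 0;
    while (key > 0) {
        sum += key % 10;  // take the last digit
        key /= 10;        // drop the last digit
    }
    return sum % numBuckets;
}
int main() {
    int numBuckets = 10;
    int key = 123;
    cout << foldingHash(key, numBuckets);  // 1 + 2 + 3 = 6, and 6 % 10 = 6
}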
Output:
6
Advantages:
- It is simple to implement.
- It can be used with both integer and string keys.
- It can handle non-uniformly distributed keys.
Disadvantages:
- The resulting hash code may not be unique for all keys.
- It is not suitable for use with large keys.
- It can lead to poor distribution of keys if the key values are not large enough.
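Folding is not limited to single digits. The sketch below folds an integer in groups of several digits and folds a string by adding its character codes. Both function names and the right-to-left grouping are assumptions chosen for this example.

```cpp
#include <cstdint>
#include <string>

// Fold an integer key in groups of digitsPerGroup decimal digits,
// starting from the right, and reduce the sum modulo numBuckets.
int foldDigits(std::uint64_t key, int digitsPerGroup, int numBuckets) {
    std::uint64_t groupSize = 1;
    for (int i = 0; i < digitsPerGroup; ++i) groupSize *= 10;
    std::uint64_t sum = 0;
    while (key > 0) {
        sum += key % groupSize;  // take the rightmost group
        key /= groupSize;        // drop it
    }
    return static_cast<int>(sum % numBuckets);
}

// Fold a string key by adding the character codes.
int foldString(const std::string& key, int numBuckets) {
    unsigned long sum = 0;
    for (unsigned char c : key) sum += c;
    return static_cast<int>(sum % numBuckets);
}
```

With two-digit groups, the key 123456 folds to 12 + 34 + 56 = 102, so `foldDigits(123456, 2, 10)` is 2. Because addition ignores order, `foldString("abc", 10)` and `foldString("cab", 10)` are equal, which illustrates why folding does not give unique hash codes.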
Multiplication Method
This method multiplies the key by a constant A between 0 and 1 and takes the fractional part of the result. The fractional part is multiplied by the number of buckets in the table, and the floor of that value is taken as the hash code.
Formula:
h(k) = floor(M × (kA mod 1))
Here,
M is the size of the hash table.
k is the key value.
A is a constant with 0 < A < 1. A common choice is (√5 − 1) / 2 ≈ 0.6180339887.
Code:
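#include <cmath>
#include <iostream>
using namespace std;

// Multiplication method: scale the fractional part of key * A to the table size.
int multiplicationHash(int key, int tableSize) {
    double A = 0.6180339887;  // (sqrt(5) - 1) / 2, a common choice for A
    double val = key * A;
    val = val - floor(val);   // keep only the fractional part
    return (int)(tableSize * val);
}
int main() {
    int tableSize = 10;
    int key = 123;
    // 123 * A = 76.0181..., the fractional part is 0.0181..., and 10 * 0.0181... = 0.18...
    cout << multiplicationHash(key, tableSize);
}
Output:
0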
Advantages:
- It produces a good distribution of keys.
- It is suitable for use with both integer and real keys.
- It can handle non-uniformly distributed keys.
Disadvantages:
- It is more complex to implement than the division and folding methods.
- It may not work well with small table sizes.
- It may not produce unique hash codes for all keys.
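When the table size is a power of two, the multiplication method can be computed with integer arithmetic only. The following sketch assumes 32-bit keys and a table of size 2^p with 1 <= p <= 32; the function name is hypothetical.

```cpp
#include <cstdint>

// Multiplication method in integer arithmetic for a table of size 2^p.
// 2654435769 is floor(2^32 * A) for A = (sqrt(5) - 1) / 2.
// Assumes 1 <= p <= 32.
std::uint32_t multiplicativeHash(std::uint32_t key, unsigned p) {
    const std::uint32_t scaledA = 2654435769u;
    // The 32-bit product wraps around, which keeps only the fractional
    // part of key * A; its top p bits are the bucket index.
    return (key * scaledA) >> (32 - p);
}
```

For a table with 16 buckets (p = 4), the keys 1, 2, and 3 map to buckets 9, 3, and 13, so consecutive keys are spread across the table instead of landing in neighbouring buckets.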
The choice of which hash function to use depends on the application's specific requirements, such as the type of keys, the size of the table, and the desired distribution of keys.
The Mid-Square Method is simple to implement and can handle non-uniformly distributed keys, but it is not suitable for use with keys that are not integers.
The Digit Folding Method can be used with both integer and string keys, but it may not produce unique hash codes for all keys.
In comparison, the Multiplication Method generally produces a good distribution of keys and can handle non-uniformly distributed keys, and the choice of table size matters less than in the Division Method. However, it is more complex to implement than the other methods.
Frequently Asked Questions
What is a hash function used for?
Developers use hash functions to convert data into a numeric representation. This number is called a hash code or hash value. Hash codes are used in both hash tables and cryptographic algorithms. In a hash table, the main aim is to spread the data evenly across the table to minimize collisions.
What is the best hash function?
There is no single best hash function; the right choice depends on the application. Good hash functions have few collisions, good distribution, and efficient computation. For cryptographic use, SHA-256 and BLAKE2 are common choices, while hash tables typically use faster non-cryptographic functions.
What is the key in hashing?
In hashing, the key is the input data or value provided to the hash function. The hash function maps the key to a hash code or index used for storing or retrieving data in a hash table. Different keys can map to the same index, which is called a collision.
What is a hash function for binary data?
A hash function for binary data is a mathematical algorithm that takes binary input and produces a fixed-size string of characters as output. It is used for data integrity verification and indexing binary data efficiently.
What is the formula for a hash function?
There is no single formula. A hash function converts input data (such as a string or file) into a fixed-size value or hash code, typically through a combination of arithmetic and bitwise operations designed to distribute values uniformly and minimize collisions. The division method, for example, uses h(k) = k mod M.
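As one concrete instance of combining arithmetic and bitwise operations, the sketch below implements the 32-bit FNV-1a hash for strings. It is a non-cryptographic function shown for illustration only; the function name `fnv1a` is chosen for this example.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, 32-bit: for each byte, XOR it into the hash, then multiply by a prime.
std::uint32_t fnv1a(const std::string& data) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    for (unsigned char byte : data) {
        h ^= byte;        // bitwise step
        h *= 16777619u;   // arithmetic step (FNV prime), wraps modulo 2^32
    }
    return h;
}
```

To use the result as a hash table index, reduce it to the table size, for example with `fnv1a(key) % M`.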
Conclusion
In this article, we discussed hashing in detail and looked at the hash function in data structures. A good hash function lets us store and search data in constant time on average. There are several types of hash functions in data structures, and we can choose one according to our needs.