Heap Sort is an efficient sorting algorithm based on the binary heap data structure. It divides the input into a sorted and an unsorted region, repeatedly removing the largest element from the heap and placing it in the sorted region.
In this article, we will explore Heap Sort, its process, and how it efficiently organizes data.
What is meant by heap sort?
Heap Sort is a comparison-based sorting algorithm that uses a binary heap data structure to organize and sort elements. The algorithm consists of two main phases:
Building the Heap: First, the input array is transformed into a binary heap, which is a complete binary tree where each parent node is greater than or equal to its child nodes (for a max heap) or less than or equal to its child nodes (for a min heap). This property allows the largest (or smallest) element to be easily accessed.
Sorting the Elements: After the heap is built, the largest element (the root of the heap) is repeatedly removed and placed at the end of the sorted array. The heap is then restructured to maintain its properties. This process is repeated until all elements are sorted.
Heap Sort has a time complexity of O(n log n) in the average and worst cases, making it efficient for large datasets. It also has the advantage of being an in-place sorting algorithm, meaning it requires a constant amount of additional space. However, it is not a stable sort, which means that the relative order of equal elements may not be preserved.
Working of Heap Sort using max-heap
Suppose we have an array of [15, 20, 7, 9, 30].
15
20
7
9
30
0 1 2 3 4
We will be sorting the data in the array in ascending order using the heap sort method. There are two possible ways to sort the array-
Building Max Heap
Deleting the data from the max heap
After deleting all the data, we find out that the array is sorted in ascending order.
1) Building max heap
When building a tree, the first element to be stored is 15.
After inserting the second element 20, the tree will look like this-
As we know, in a max heap, the parent should be greater than the child element. Therefore we will swap 15 and 20.
Then we will insert seven as the right child element as the heap tree is a complete binary tree(must fill all the levels except the last level).
This satisfies the max heap property, and the level is filled. Hence it will do no swapping here.
Now we will insert 9 in the tree.
Here also, we do not need any swapping and we will be inserting 30 now.
Here we can see that the parent element is less than the child element(15 < 30). Hence we will be swapping both elements.
After checking 30 with its parent element, we see that 20 < 30. Therefore we will swap again.
Now, this is our max heap after sorting, and the array has been sorted like this-
30
20
7
9
15
0 1 2 3 4
2) Deletion of data from the max heap
We have our max heap,
We will delete the first element, and the element to be deleted first in a heap is the root element. After deleting the root element, we will move the last element data into the root node and delete the last element.
15
20
7
9
30
0 1 2 3 4
After each deletion, we have to check if the tree satisfies the max heap property. As we can see, the left child element of the root is more than the root data(15 < 20), therefore we will be needing swapping here.
20
15
7
9
30
0 1 2 3 4
Again, we will check if the next child element is satisfying the max heap property or not. We will continue this step till we check with the last element. Now, as the tree is a max heap, the next element to be deleted is 20, and the last element data will be in the root node.
Updating the array-
9
15
7
20
30
0 1 2 3 4
Now we know it will do swapping as the root element is smaller than the left child element.
15
9
7
20
30
0 1 2 3 4
The next element to be deleted is the root element 15. The last element to be moved into the top will be the right child element 7.
7
9
15
20
30
0 1 2 3 4
Since the tree is not satisfying the max heap property, we will swap the elements.
9
7
15
20
30
0 1 2 3 4
We will be deleting the root element and shifting the last element to the root node.
7
9
15
20
30
0 1 2 3 4
After deleting the last element 7, the array goes like this-
7
9
15
20
30
0 1 2 3 4
Heap Sort using Heapify Method
Let's take a binary tree, for e.g.-
14
4
20
1
16
10
30
0 1 2 3 4 5 6
In the heapify method, we start from last, considering the child elements first, and then moving up to one level and comparing the parent elements and the comparison goes on till we reach the root node.
For finding out the child elements in an array, we can use the formula-
[ N / 2 ] + 1 to N
For e.g.-
In the above array, there are seven elements.
Using the formula, we have
[7 / 2] + 1 to 7
=> 4 to 7
=> 4th element to 7th element ( index 3 to 6 )
(In the above array, 1, 16, 10, and 30 are child elements)
In the heapify method, we consider the leaf elements as the max heap. Therefore we work on the non-leaf elements.
For finding out the first non-leaf element, we use the formula-
N / 2
Note: If the starting index is 0, then the formula will be ( N / 2) - 1.
Therefore, in the above element, the first non-leaf element we should consider is
6 / 2 - 1 = 2nd index.
We will start the heapify method from the 2nd index with data 20(i.e., Index i=2).
We will first consider this portion and apply heapify here. Here 10 < 20, but 30 > 20. Therefore, we will swap 20 and 30.
14
4
30
1
16
10
20
0 1 2 3 4 5 6
Since the right portion is sorted, we will decrement index i, which was at 2.
Now, i = 1. The portion to consider now is-
Here, 1 < 4 but 16 > 4. Therefore, we will swap 16 and 4.
14
16
30
1
4
10
20
0 1 2 3 4 5 6
Again, the index i is decremented.
Therefore, i = 0.
Now we will consider this whole portion.
We can see that the left and right child elements are more than the root data. 14 < 16 and 14 < 30, but since 30 is the greatest, we will swap the right child with the root node.
We can see that the left subtree is sorted, but the right subtree is not. Therefore we will perform a comparison here.
14 > 10 and 14 < 20.
Hence we will swap 14 and 20.
30
16
20
1
4
10
14
0 1 2 3 4 5 6
Now, the tree has been sorted, and it is a max heap done by the heapify method. For deletion, we will use heapify method only.
Algorithm of Heap Sort
Build the heap using the given elements.
The heap must always be a complete binary tree.
We use the max heap/ min-heap property to sort the elements in ascending/descending order, respectively.
Once the heap has been created satisfying all conditions, we swap the last element with the root element and delete the last node.
Code for Heap Sort in C++
// To implement heap sort algorithm
#include <bits/stdc++.h>
using namespace std;
// Function to print the array
void printArray(int arr[], int N){
if(N <= 0){
return;
}
for(int i = 0; i < N ; i++){
cout << arr[i] << ” ”;
}
cout << endl;
}
// Function to swap two elements
void swap(int *i, int *j){
int temp;
temp = *i;
*i = *j;
*j = temp;
}
// Function for max-heapify
void heapify(int arr[], int N, int i){
// The root is the largest element
int root = i;
int leftChild = 2 * i + 1;
int rightChild = 2 * i + 2;
// To check if the left child is largest
if(leftChild < N && arr[leftChild] > arr[root]){
root = leftChild;
}
// To check if the right child is largest
if(rightChild < N && arr[rightChild] > arr[root]){
root = rightChild;
}
// To check if the largest element is not root
if(root != i){
swap(arr[i], arr[root]);
heapify(arr, N, root);
}
}
// Function for heap sorting
void heapSort(int arr[], int N){
for(int i = N/2-1; i >= 0 ; i--){
// For heap build
heapify(arr, N, i);
}
for(int i = N-1; i >= 0 ; i--){
// To move current root to end node
swap(arr[0], arr[i]);
// To call the max heapify function
heapify(arr, i, 0);
}
}
// Driver code
int main(){
int arr[] = {14, 4, 20, 1, 16, 10, 30};
int N = 7;
cout<<"Original array : ";
printArray(arr, N);
heapSort(arr, N);
cout<<"Sorted array : ";
printArray(arr, N);
return 0;
}
You can also try this code with Online C++ Compiler
Time Complexity Analysis of Heap Sort(Only max-heap method)
1) For Insertion
To insert an element, the time taken = O(log N) since we insert the element to the last and compare with its root element according to the tree height. The tree's height is log N; therefore, we need O(log N) for the insertion of one element.
To insert N number of elements, the time taken will be O(N log N).
2) For Deletion
To delete an element, the time taken = O(log N)
To delete N number of elements, the time taken will be O(N log N).
3) Total time complexity
Taking the total time of insertion and deletion respectively and adding up the two, we get,
O(N log N) + O(N log N) = O(2N log N)
= O(N log N)
The time complexity of the heap sort is O(N log N).
Time Complexity Analysis of Heap Sort(Heapify method)
Here, the time taken is O(N) since we would not be changing the height of the heap. This is a much more efficient and quick method.
Space Complexity Analysis of Heap Sort
Space taken in heap sort is O(1) as only one extra space is required as the heap is built inside the array to be sorted.
Advantages of Heap Sort
Efficient Time Complexity: Heap Sort has a time complexity of O(n log n) for both average and worst-case scenarios, making it efficient for large datasets.
In-Place Sorting: The algorithm requires a constant amount of additional space (O(1)), as it sorts the array without needing extra storage for another array.
No Worst-Case Scenario: Unlike some other sorting algorithms, Heap Sort does not degrade to O(n²) in the worst case, ensuring consistent performance.
Easy to Implement: The concept of heaps is relatively straightforward, making the implementation of Heap Sort simpler compared to other algorithms like Quick Sort.
Good for Priority Queues: Heap Sort leverages the properties of heaps, making it suitable for applications involving priority queues.
Disadvantages of Heap Sort
Not Stable: Heap Sort is not a stable sorting algorithm, meaning that the relative order of equal elements may change after sorting.
Poor Cache Performance: The non-contiguous access pattern of Heap Sort can lead to poor cache performance, making it slower on modern architectures compared to other algorithms like Quick Sort.
Complex Implementation: While the concept is simple, implementing Heap Sort can be more complex than other straightforward algorithms like Insertion Sort or Selection Sort.
Constant Factors: Despite its O(n log n) time complexity, the constant factors can make it slower than other algorithms like Quick Sort for smaller datasets due to overhead.
Frequently Asked Questions
Why is heap sort helpful?
The heap sort algorithm is very efficient compared to other sorting methods when performed on large datasets. The time required for heap sort increases logarithmically, whereas other sorting algorithms are exponentially slower as the number of items increases.
What is a priority queue?
A priority queue is an extension of a queue where every item has a priority associated with it. An element having a high priority is always taken out first before the rest of the elements. Heap sort is implemented using the heap, which is an implementation of a priority queue.
What is the advantage of heaps over sorted arrays?
Sorting arrays can take high time complexity. Therefore we use heaps to find the largest or smallest element as a way to lessen the time and increase efficiency.
How is heap sort different and better from selection sort?
Heap sort uses a much more efficient method to locate the array values that have to be moved. It is also called an improved version of the selection sort as it does not waste time with a linear time traversal of the unsorted elements. In heap sort, the elements are maintained in the unsorted region in a heap data structure, enabling us to find the largest element in each step more quickly.
Conclusion
In this blog, we learned what heapsort is, how it is performed, its working and how to implement its algorithm. Understanding this will help you perform sorting more efficiently.
Try some sorting problems here on Coding Ninjas Studio to have more understanding of the working of algorithms. To be more confident in data structures and algorithms, try out our DS and Algo Course.