To grasp the concept of bucket sort, it is essential first to understand the technique and the series of operations required to make the array sorted using counting sort.
In this blog, we have also used an example and its code snippet to understand the concept better.
Working of Bucket sort
After dividing the elements, each bucket can use any sorting algorithm or recursively apply the same bucket algorithm to sort elements within the bucket.
The sorted buckets are finally combined and made into a sorted array.
Approach
The bucket sort algorithm is a distribution sort algorithm that operates on a collection of elements by dividing the elements into several "buckets". Each bucket is then sorted individually, either using a different sorting algorithm or recursively applying the bucket sort algorithm. Finally, the elements in each bucket are concatenated to obtain the final sorted list.
Bucket sort has an average time complexity of O(n) if the elements are uniformly distributed, making it an efficient sorting algorithm for uniformly distributed data. However, if the data distribution is not uniform, the performance of bucket sort can degrade to O(n^2), making it less efficient than other sorting algorithms such as quicksort.
Bucket sort is useful for sorting floating-point numbers because it is relatively straightforward to divide the numbers into buckets based on their decimal places, making it easy to sort the numbers with a high degree of accuracy.
Bucket Sort Algorithm
The bucket sort algorithm works as follows:
STEP 1: Create an array of n empty buckets.
STEP 2: Iterate through the input array and place each element into a bucket based on its value (for example, if the array contains numbers from 0 to 1, you can divide each number by n and use the result as the index of the bucket).
STEP 3: Sort the elements in each non-empty bucket using another sorting algorithm, such as insertion sort.
STEP 4: Concatenate the elements in each bucket to obtain the sorted array.
Note: The number of buckets (n) should be chosen based on the range of values in the input array and the desired level of accuracy in the sorting process.
Dry Run
Let's consider a simple example to perform a dry run of the bucket sort algorithm:
Suppose we have an array of 6 floating-point numbers arr = 0.167, 0.234, 0.444, 0.566, 0.754, 0.958
Create an array of buckets. In this example, we'll use six buckets. We'll initialise each bucket as an empty list. The number of buckets depends on you. We are taking it as 6 because this is a small example, and we can take the number of buckets the same as the number of elements in an array.
Divide the range of values into equal parts: We'll determine the minimum and maximum value in the array and divide the range between them into 6 equal parts. For instance, the minimum value in this example is 0.167 and the maximum value is 0.958. We'll divide this range into six equal parts, so each part represents the range for one bucket.
Place the elements into the appropriate bucket. For example, 0.167 will go into the second bucket, 0.234 will go into the second bucket, and similarly, 0.444 will go into the third bucket and so on.
Sort the elements in each bucket: We can use any sorting algorithm for this step. In this example, let's assume we're using insertion sort. We'll sort the elements in each bucket so that they are in ascending order.
Concatenate the sorted buckets: Finally, we'll concatenate the sorted buckets to obtain the final sorted array. In this example, the sorted array would be [0.167, 0.234, 0.444, 0.566, 0.754, 0.958].
That's a simple dry run of the bucket sort algorithm using the example of the array [0.167, 0.234, 0.444, 0.566, 0.754, 0.958].
C++ Implementation
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
void bucketSort(float arr[], int n) {
vector<vector<float>> buckets(n);
// place elements into buckets based on value
for (int i = 0; i < n; i++) {
int index = n * arr[i];
buckets[index].push_back(arr[i]);
}
// sort the elements in each bucket
for (auto& bucket : buckets) {
sort(bucket.begin(), bucket.end());
}
// concatenate the elements in each bucket to obtain the sorted array
int index = 0;
for (auto& bucket : buckets) {
for (float item : bucket) {
arr[index++] = item;
}
}
}
int main() {
float arr[] = {0.566, 0.234, 0.958, 0.754, 0.444, 0.167};
int n = sizeof(arr) / sizeof(arr[0]);
bucketSort(arr, n);
cout << "Sorted array:" << endl;
for (int i = 0; i < n; i++) {
cout << arr[i] << " ";
}
return 0;
}
Output
Java Implementation
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class BucketSort {
public static void sort(float[] arr) {
// Create a list of buckets
List<List<Float>> buckets = new ArrayList<>();
for (int i = 0; i < arr.length; i++) {
// Initialize the buckets
buckets.add(new ArrayList<>());
}
// Place each element of the array into a bucket
for (float item : arr) {
int index = (int) (item * arr.length);
// Add the item to the corresponding bucket
buckets.get(index).add(item);
}
// Sort the elements in each bucket
for (List<Float> bucket : buckets) {
Collections.sort(bucket);
}
// Concatenate the elements in each bucket to obtain the sorted array
int index = 0;
for (List<Float> bucket : buckets) {
for (float item : bucket) {
arr[index++] = item;
}
}
}
public static void main(String[] args) {
float[] arr = {0.566f, 0.234f, 0.958f, 0.754f, 0.444f, 0.167f};
sort(arr);
System.out.println("Sorted array:");
for (float item : arr) {
System.out.print(item + " ");
}
}
}
Output
Note: The above implementation of the Bucket Sort algorithm assumes that the input array contains decimal values between 0 and 1. If the input array contains integers or values outside the range of 0 to 1, the algorithm will not work correctly.
To modify the implementation to handle arrays of integers, you can change the number of buckets to be equal to the range of the values in the input array and then use a different function to determine the bucket index for each element.
Approach
To sort an array of integers using the bucket sort algorithm, you can create an array of empty buckets, where a range of integers indexes each bucket. For example, if you are sorting a sequence of integers between 0 and 99, you can create an array of 10 buckets, where each bucket is indexed by the value of the integer divided by 10. Then, you can place each integer in the appropriate bucket by computing the index of the bucket as the integer divided by 10. Once all the integers are placed in the buckets, you can sort each bucket individually using a sorting algorithm of your choice and then concatenate the sorted buckets to obtain the final sorted array of integers.
Below is the implementation for integers also:
C++ Implementation
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
void bucketSort(int arr[], int n, int num_buckets) {
int min_val = *min_element(arr, arr + n);
int max_val = *max_element(arr, arr + n);
vector<vector<int>> buckets(num_buckets,vector<int>());
// place elements into buckets based on value
for (int i = 0; i < n; i++) {
int index = (float(arr[i] - min_val) / float(max_val-min_val))*num_buckets;
if (arr[i]==max_val){
index-=1;
}
buckets[index].push_back(arr[i]);
}
// sort the elements in each bucket
for (auto& bucket : buckets) {
sort(bucket.begin(), bucket.end());
}
// concatenate the elements in each bucket to obtain the sorted array
int index = 0;
for (auto& bucket : buckets) {
for (int item : bucket) {
arr[index++] = item;
}
}
}
int main() {
int arr[] = {8, 4, 2, 10, 3, 1, 9, 6, 5, 7};
int n = sizeof(arr) / sizeof(arr[0]);
int num_buckets=5;
bucketSort(arr, n,num_buckets);
cout << "Sorted array:" << endl;
for (int i = 0; i < n; i++) {
cout << arr[i] << " ";
}
return 0;
}
Output
Java Implementation
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
public class BucketSort {
public static void bucketSort(int[] arr, int numBuckets) {
// determine minimum and maximum values of input array
int min_val =Arrays.stream(arr).min().getAsInt();
int max_val =Arrays.stream(arr).max().getAsInt();
// create empty buckets
ArrayList<ArrayList<Integer>> buckets = new ArrayList<>(numBuckets);
for (int i =0; i < numBuckets; i++) {
buckets.add(new ArrayList<Integer>());
}
float divisor = (float)(max_val-min_val)/numBuckets;
// place elements into buckets based on value
for (int i = 0; i < arr.length; i++) {
float index1 = (float)(arr[i] - min_val) / divisor ;
if(arr[i]==max_val){
index1-=1;
}
int index=(int)index1;
buckets.get(index).add(arr[i]);
}
// sort the elements in each bucket
for (ArrayList<Integer> bucket : buckets) {
Collections.sort(bucket);
}
// concatenate the elements in each bucket to obtain the sorted array
int index = 0;
for (ArrayList<Integer> bucket : buckets) {
for (int item : bucket) {
arr[index++] = item;
}
}
}
public static void main(String[] args) {
int[] arr = {8,4,2,10,3,1,9,6,5,7};
int numBuckets=5;
bucketSort(arr,numBuckets);
System.out.println("Sorted array:");
for (int i =0; i <arr.length; i++) {
System.out.print(arr[i] +" ");
}
}
}
Output
Note: The time complexity of Bucket Sort remains the same whether the input values are integers or decimals.
Complexity
Time Complexity
Best Case
O(n), Where n is the size of the array.
Reason
Hear the algorithm performs a linear number of operations for each element in the input array. The operation of placing an element into a bucket takes constant time, and the operation of sorting the elements in each bucket takes O(n) time in the best case. Since the number of elements in each bucket is proportional to n, the total time complexity is O(n).
Worst Case
O(n^2), Where n is the size of the array.
Reason
The reason for the worst-case time complexity of O(n^2) is that if the data is not uniformly distributed, the elements in a single bucket may become a large fraction of the total number of elements, causing the sorting algorithm for that bucket to take O(n^2) time. In this case, the time complexity would be dominated by the sorting step, and the overall time complexity would be O(n^2).
Space Complexity
O(n), Where n is the size of the array.
Reason
It requires an array of n buckets and n elements to store the sorted data. Additionally, it requires O(n) space for temporary storage in each bucket during the sorting process. The space complexity is linear because it is proportional to the number of elements in the input array.
Frequently Asked Questions
When should bucket sort be used?
Bucket sort can be best used when data is distributed over a range uniformly, such as in a large array of integers.
Is bubble sort faster than bucket sort?
Dividing data into small buckets capable of being stored individually means lesser comparisons are needed. Therefore a significant advantage of bucket sort is faster than bubble sort.
Can bucket sort be used to sort strings?
Bucket sort cannot be used to sort arbitrary strings, but can efficiently sort a set of uniformly distributed floating-point numbers in an array.
What is the difference between bucket sort and counting sort?
Counting sort stores single elements in a bucket, i.e., the count of items. Bucket sort, on the other hand, requires dynamic arrays, linked lists or pre-allocated memory to hold collections of elements in every bucket.
What is the disadvantage of bucket sorting?
Bucket sort can only be used for some data types because it requires the ideal bucketing scheme. Moreover, it is inefficient in cases with tightly clustered values because of its sensitivity towards input distribution.
Conclusion
In this article, we learned how the bucket sort algorithm works and how it is implemented, along with an example.
The bucket sort technique is very efficient for cases where input values where the range of elements is wide. You can go to Coding Ninjas Studio and try solving problems. Share this blog with your friends if you found it helpful!
Want to explore more questions like this, check this out: