Bucket Sort Algorithm
Learn via video course

Overview
Consider we are given an array of numbers, and we want to sort it. There are a number of sorting algorithms available to accomplish this task.
The best sorting algorithm is supposed to be quick sort or merge sort because they take the least time complexity of O(nlog n).
Bucket sort is another sorting algorithm that can perform sorting in O(n) time complexity, but only in specific cases. Let us see when and how.
Takeaways
Bucket sort can be exceptionally fast because of the way elements are assigned to buckets, typically using an array where the index is the value.
Introduction
Whenever it comes to sorting, we think of bubble sort, quicksort, merge sort, etc because they are some of the most commonly used sorting algorithms.
They all, at some point perform the actual comparison between every two numbers to reach the final goal, i.e., the sorted array. In comparison, bucket sort works a bit differently.
It works on uniformly distributed data (explained below) and puts numbers into buckets to sort and gather them all to form the sorted array.
While conventional sorting algorithms can reach the best case time complexity of O(nlogn), Bucket sort can do the same in O(n).
Uniformly Distributed Data
- Uniformly distributed data has difference between every adjacent element almost the same.
- Examples and graph to represent uniformly distributed data.
An array is said to be uniformly distributed if the difference between each adjacent element inside the array is almost the same. Or, in other terms, the elements of the array spread evenly throughout the whole span/range of the array.
For example:
-
Consider this array: [10, 21, 29, 41, 52] The difference between each adjacent term is almost equal to 10. Hence, this array has uniformly distributed data and can be sorted using the bucket sort algorithm.
-
Consider another array: [1,4,23,5,44,9,6,43] This array is not uniformly distributed because the number of elements between the range [0-10] = 5 (i.e., 1,4,5,9, and 6), whereas the number of elements between [10-20] is 0, and the same for the range [30-40]. The data is scattered over different data ranges but is concentrated within some specific ranges, whereas sparse among the others. Hence, this array is not uniformly distributed.
This is how a uniform distribution data looks like: The complete data is scattered equally within each range and hence depicts uniformly distributed data.
Scatter-Gather-Approach
Solving large and complex problems can be a little hard at times. Scatter-gather-approach tries to simplify such complex problems by first scattering the complete data into clusters, solving these sub-problems individually, and finally gathering them all together to get the final result. This is how Bucket sort uses scatter-gather-approach:
Bucket Sort Algorithm
- Works on Uniformly Distributed data
- Follows Scatter-Gather-Approach
- Sorting technique with time complexity O(n)
Bucket sort is a sorting technique that uses the Scatter-Gather-Approach to sort the array.
It divides the unsorted array into separate groups and calls them buckets. Sort the individual buckets, and then gather them all together to form the final sorted array.
- If the elements of the array are floats ranging between 0 and 1, we primarily make 10 buckets, numbered from 0 to 9, and then insert elements into these buckets depending upon their most significant number. The bucket index is calculated as: (int) (elementNumber * 10)
- If the elements of the array are integers, as shown in the figure below, we simply calculate the range: range = (maximumNumber - minimumNumber) / noOfBuckets and divide the whole range into buckets and then perform bucket sorting.
Graphically, this is how the bucket sort with integer elements look like:
Working of Bucket Sort
Let us understand how the algorithm works with the help of the above example. Consider this unsorted array
Step 1:
Since we have the array with integer elements, we'll calculate the range first.
Hence, we'll have buckets of the following ranges: [1-5] [6-10] [11-15] [16-20] [21-25]
Step 2: Scatter
Now, iterate through the unsorted array and keep inserting the numbers in the bucket of their corresponding range.
For example:
and so on..
Finally, the buckets will look like this:
Step 3:
Now sort all the elements in each of the buckets, and the sorted buckets look like this:
Step 4: Gather
At last, visit each bucket and gather all the numbers together. Merge them all, and we'll get the sorted array.
Hence, the sorted array is:
PseudeCode
Hence, we have the sorted array.
Bucket Sort Algorithm for floating points numbers
Below written is the complete code for bucket sort.
At first, we create a vector, bucket and then create n buckets inside this vector. The whole array is then traversed, and elements are placed in the bucket corresponding to their range.
JAVA Code
Output:
Note: This applies if the elements of the array range between 0 and 1. What if the elements are integers? In that case, we'll set the range of each bucket a little differently, which we will learn in the next section.
Following images show the working of code with the inputs taken above:
Bucket Sorting for Integer elements
- Find range = (max - min) / noOfBuckets
- bucketIndex = (array[i] - minimum) / range
- Finally, sort and gather
For integer elements, we need to follow the following algorithm:
- Calculate the maximum and the minimum element of the array
- Calculate the range: range = (maximum - minimum) / n, where n is the number of buckets (Given as parameter)
- Create n empty buckets and initialize them with 0
- Loop through the unsorted array and perform the following: a) Calculate bucketIndex bucketIndex = (array[i] - minimum) / range b) Insert the ith element of the array into the bucket[bucketIndex]
- Sort the individual buckets
- Gather all the elements together
Hence, we have the sorted array with us.
This is how it works:
Python code - Bucket Sort with Integer elements
Output
Bucket Sort time complexity
- Best Case Time Complexity: O(n+k)
- Average Case Time Complexity: O(n)
- Worst Case Time Complexity: O(n^2^)
Best Case Time Complexity:
If the array elements are uniformly distributed, bucket size will almost be the same for all the buckets. Hence, this will be the best case which will take up the least amount of time.
Sorting time complexity will reduce even further if all the elements inside each bucket are already sorted.
To create n buckets and scatter each element from the array, time complexity = O(n). If we use Insertion sort to sort each bucket, time complexity = O(k). Hence, best case time complexity for bucket sort = O(n+k), where n = number of elements, and k = number of buckets
Worst Case Time Complexity
If the array elements are not uniformly distributed, i.e., elements are concentrated within specific ranges.
This will result in one or more buckets having more elements than other buckets, making bucket sort like any other sorting technique, where every element is compared to the other. Time complexity increases even further if the elements in the array are present in the reverse order. If insertion sort is used, the worst-case time complexity can go up to O(n^2^).
Bucket Sort Space Complexity
- Space Complexity : O(n+k)
Space Complexity for bucket sort is O(n+k), where n = number of elements in the array, and k = number of buckets formed Space taken by each bucket is O(k), and inside each bucket, we have n elements scattered. Hence, the space complexity becomes O(n+k).
Applications
- Uniformly Distributed data
- Floating point numbers between range 0.0 to 1.0.
- Bucket Sort is a very different type of sorting algorithm as it does not involve direct comparison between the numbers. It can only be applied to uniformly distributed data.
- Whenever we have floating-point numbers between 0 and 1, bucket sort might be the best sorting approach. Reason - if we use merge-sort, quick-sort, heap-sort, etc, the problem will take a minimum of O(nlogn) time complexity. Also, counting sort cannot be used because floating point numbers cannot be used as index. Hence, bucket sort is best suited for sorting the array with floating point numbers of range [0.0-1.0].
Conclusion
- The bucket Sort algorithm sorts the elements of the array by first segregating the array into a number of buckets, sorting each bucket, and then gathering the elements back to form the sorted array.
- Bucket Sort is used to sort an array where elements are uniformly distributed, or where the elements of the array range between 0 and 1.
- Bucket sort can exhibit the best case time complexity of O(n+k), where n is the number of buckets and k is the bucket size.
- The way buckets are provided ranges differ in cases when the elements of the array are floats and integers. This is discussed in detail above.