Bucket Sort


Bucket sort is mainly useful when the input is uniformly distributed over a range. For example, consider the following problem:
Sort a large set of floating-point numbers that lie in the range 0.0 to 1.0 and are uniformly distributed across that range. How do we sort the numbers efficiently?

A simple way is to apply a comparison-based sorting algorithm. The lower bound for comparison-based sorting algorithms (Merge Sort, Heap Sort, Quick Sort, etc.) is Ω(n log n) comparisons in the worst case, i.e., they cannot do better than n log n.
Can we sort the array in linear time? Counting sort cannot be applied here because it uses keys as array indices, and here the keys are floating-point numbers.
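To make the indexing problem concrete, here is a simplified sketch of counting sort for small non-negative integers. The function name countingSort and the parameter maxKey are illustrative choices, not part of the original article. Note how each key is used directly as an index into the count array, which has no direct equivalent for a key such as 0.565.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified counting sort for integer keys in [0, maxKey].
// Each key is used directly as an index into the count array,
// which is why this approach needs integer keys.
std::vector<int> countingSort(const std::vector<int>& keys, int maxKey)
{
    assert(maxKey >= 0);
    std::vector<std::size_t> count(static_cast<std::size_t>(maxKey) + 1, 0);
    for (int k : keys) {
        assert(k >= 0 && k <= maxKey);
        ++count[static_cast<std::size_t>(k)];   // key used as array index
    }

    // Emit each key as many times as it was counted, in increasing order.
    std::vector<int> sorted;
    sorted.reserve(keys.size());
    for (int k = 0; k <= maxKey; ++k)
        sorted.insert(sorted.end(), count[static_cast<std::size_t>(k)], k);
    return sorted;
}
```

This version only reorders the keys themselves; a full counting sort that carries associated records would need prefix sums over the counts.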
The idea is to use bucket sort. The bucket sort algorithm is as follows.

bucketSort(arr[], n)
1) Create n empty buckets (or lists).
2) Do the following for every array element arr[i]:
.......a) Insert arr[i] into bucket[floor(n*arr[i])]
3) Sort individual buckets using insertion sort.
4) Concatenate all sorted buckets.
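Step 2 is the heart of the algorithm, so here is a small sketch of just the index computation. The helper names bucketIndex and bucketIndices are hypothetical, and clamping to n - 1 is an added safeguard (an assumption, not part of the pseudocode) so that a value of exactly 1.0 does not index past the last bucket.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Bucket index for a value x in [0.0, 1.0] when there are n buckets.
// The cast truncates n * x toward zero (a floor for non-negative x).
// Clamping to n - 1 is an added safeguard so x == 1.0 stays in range.
int bucketIndex(float x, int n)
{
    assert(n > 0 && x >= 0.0f && x <= 1.0f);
    int bi = static_cast<int>(n * x);
    return std::min(bi, n - 1);
}

// Bucket index of every element, using one bucket per element.
std::vector<int> bucketIndices(const std::vector<float>& values)
{
    const int n = static_cast<int>(values.size());
    std::vector<int> idx;
    idx.reserve(values.size());
    for (float x : values)
        idx.push_back(bucketIndex(x, n));
    return idx;
}
```

For the six values 0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434 this gives bucket indices 5, 3, 3, 0, 3, 2, so bucket 3 receives three elements and buckets 1 and 4 stay empty.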

[Diagram, taken from the CLRS book, demonstrating the working of bucket sort.]


Time Complexity: If we assume that insertion into a bucket takes O(1) time, then steps 1 and 2 of the above algorithm clearly take O(n) time. O(1) insertion is easy to achieve if a linked list represents each bucket (in the following code, a C++ vector is used for simplicity). Step 4 also takes O(n) time, as there are n items in total across all buckets.
The main step to analyze is step 3. This step also takes O(n) time on average if the numbers are uniformly distributed (please refer to the CLRS book for more details). In the worst case, when all numbers fall into a single bucket, insertion sort takes O(n^2) time.
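The average-case claim for step 3 rests on the bucket sizes: insertion sort on a bucket holding k items costs O(k^2) in the worst case, and CLRS shows that the expected value of k^2 is 2 - 1/n for uniform input, so the expected total is O(n). The following is a rough empirical check rather than a proof. The function name squaredBucketSizesPerElement, the seed parameter, and the choice of std::mt19937 are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns (sum over buckets of size^2) / n for n uniform random values
// spread over n buckets. Insertion sort on a bucket of size k costs
// O(k^2) in the worst case, so this ratio approximates the per-element
// cost of step 3.
double squaredBucketSizesPerElement(std::size_t n, unsigned seed)
{
    assert(n > 0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);

    // Only the bucket sizes matter here, so the values are not stored.
    std::vector<std::size_t> size(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t bi = static_cast<std::size_t>(n * dist(gen));
        ++size[std::min(bi, n - 1)];   // clamp as a safeguard
    }

    double sumSquares = 0.0;
    for (std::size_t s : size)
        sumSquares += static_cast<double>(s) * static_cast<double>(s);
    return sumSquares / static_cast<double>(n);
}
```

For large n the returned ratio should come out close to 2, independent of n, which is consistent with step 3 taking linear time on average.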

Following is a C++ implementation of the above algorithm. It sorts each bucket with std::sort instead of insertion sort.

```cpp
// C++ program to sort an array using bucket sort
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// Function to sort arr[] of size n using bucket sort.
// Assumes every element lies in the range [0.0, 1.0].
void bucketSort(float arr[], int n)
{
    // 1) Create n empty buckets
    vector<vector<float>> b(n);

    // 2) Put array elements in different buckets
    for (int i = 0; i < n; i++)
    {
        int bi = static_cast<int>(n * arr[i]); // Index in bucket (fraction is truncated)
        bi = min(bi, n - 1);                   // Keeps arr[i] == 1.0 inside the last bucket
        b[bi].push_back(arr[i]);
    }

    // 3) Sort individual buckets
    for (int i = 0; i < n; i++)
        sort(b[i].begin(), b[i].end());

    // 4) Concatenate all buckets into arr[]
    int index = 0;
    for (int i = 0; i < n; i++)
        for (size_t j = 0; j < b[i].size(); j++)
            arr[index++] = b[i][j];
}

/* Driver program to test above function */
int main()
{
    float arr[] = {0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434};
    int n = sizeof(arr) / sizeof(arr[0]);
    bucketSort(arr, n);

    cout << "Sorted array is \n";
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}
```



Output:

Sorted array is
0.1234 0.3434 0.565 0.656 0.665 0.897
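
The implementation above calls the library sort on each bucket, while the pseudocode and the time-complexity analysis assume insertion sort. As a hedged sketch of that variant, the following version sorts each bucket with a hand-written insertion sort. The names insertionSort and bucketSortInsertion, the use of std::vector for the input, and the clamp to the last bucket are choices made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain insertion sort on one bucket.
void insertionSort(std::vector<float>& bucket)
{
    for (std::size_t i = 1; i < bucket.size(); ++i) {
        float key = bucket[i];
        std::size_t j = i;
        // Shift larger elements one slot to the right.
        while (j > 0 && bucket[j - 1] > key) {
            bucket[j] = bucket[j - 1];
            --j;
        }
        bucket[j] = key;
    }
}

// Bucket sort for values in [0.0, 1.0], with insertion sort per bucket.
void bucketSortInsertion(std::vector<float>& arr)
{
    const std::size_t n = arr.size();
    if (n == 0) return;

    // Steps 1 and 2: create n buckets and distribute the elements.
    std::vector<std::vector<float>> buckets(n);
    for (float x : arr) {
        assert(x >= 0.0f && x <= 1.0f);
        std::size_t bi = std::min(static_cast<std::size_t>(n * x), n - 1);
        buckets[bi].push_back(x);
    }

    // Steps 3 and 4: sort each bucket, then copy it back in order.
    std::size_t index = 0;
    for (auto& bucket : buckets) {
        insertionSort(bucket);
        for (float x : bucket)
            arr[index++] = x;
    }
}
```

With uniformly distributed input most buckets hold only a few elements, which is the case where insertion sort is cheap.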