I've been working on web based projects built mainly with PHP and JavaScript, where I mostly use Zend Framework and jQuery. I am interested in any webpage optimizations techniques - for a faster web! Stoimen is a DZone MVB and is not an employee of DZone and has posted 96 posts at DZone. You can read more from them at their website. View Full User Profile

Algorithm of the Week: Radix Sort

03.20.2012
| 30005 views |
  • submit to reddit

Algorithms always depend on the input. We saw that general purpose sorting algorithms like insertion sort, bubble sort and quicksort can be very efficient in some cases and inefficient in others. Indeed, insertion and bubble sort are considered slow, with a best-case complexity of O(n2), but they are quite effective when the input is fairly sorted. So, when you have a sorted array and you add some “new” values to the array you can sort it quite effectively with insertion sort. On the other hand, quicksort is considered one of the best general purpose sorting algorithms, but while it’s a great algorithm when the data is randomized, it’s practically as slow as bubble sort when the input is almost or fully sorted.

Now we see that the effectiveness of algorithms depends greatly on the input. For input that is almost sorted, insertion sort may be preferred instead of quicksort, which is generally a faster algorithm.

Because the input is so important for an algorithm's efficiency, we may ask if there are any sorting algorithms that are faster than O(n.log(n)), which is the average-case complexity for merge sort and quicksort. And the answer is yes there are faster, linear complexity algorithms, that can sort data faster than quicksort, merge sort and heapsort. But there are some constraints!

Everything sounds great but we can’t sort any particular data with linear complexity, so the question is what rules must the input follow in order to be sorted in linear time?

Such an algorithm that is capable of sorting data in linear O(n) time is radix sort and the domain of the input is restricted – it must consist only of integers.

Overview

Let’s say we have an array of integers which is not sorted. Because it consists only of integers and because array keys are integers in programming languages we can implement radix sort.

First for each value of the input array we put the value of “1” on the key-th place of the temporary array as explained on the following diagram.





If there are repeating values in the input array, we increment the corresponding value in the temporary array. After “initializing” the temporary array with one pass (with linear complexity) we can sort the input.


Implementation

Implementing radix sort is in fact very easy, which is great. The thing is that old-school programming languages weren’t very flexible and we needed to initialize the entire temporary array. That leads to another problem – we must know the interval of values from the input. Fortunately, modern programming languages and libraries are more flexible so we can initialize our temporary array even if we don’t know the interval of input values, as in the example bellow. Indeed, PHP is flexible enough to build-up arrays in the memory without knowing their size in advance.

$list = array(4, 3, 5, 9, 7, 2, 4, 1, 6, 5);
 
function radix_sort($input)
{
    $temp = $output = array();
	$len = count($input);
 
    for ($i = 0; $i < $len; $i++) {
		$temp[$input[$i]] = ($temp[$input[$i]] > 0) 
			? ++$temp[$input[$i]]
			: 1;
    }
 
    ksort($temp);
 
    foreach ($temp as $key => $val) {
		if ($val == 1) {
			$output[] = $key; 
		} else {
			while ($val--) {
				$output[] = $key;
			}
        }
    }
 
    return $output;
}
 
// 1, 2, 3, 4, 4, 5, 5, 6, 7, 9
print_r(radix_sort($list));
The problem is that PHP needs ksort – which is completely foolish as we’re trying to sort an array using “another” sorting method, but to overcome this you must know the interval of values in advance and initialize a temporary array with 0s, as in the example bellow.
define(MIN, 1);
define(MAX, 9);
$list = array(4, 3, 5, 9, 7, 2, 4, 1, 6, 5);
 
function radix_sort(&$input)
{
    $temp = array();
	$len = count($input);
 
	// initialize with 0s
    $temp = array_fill(MIN, MAX-MIN+1, 0);
 
    foreach ($input as $key => $val) {
    	$temp[$val]++;
    }
 
    $input = array();
    foreach ($temp as $key => $val) {
	if ($val == 1) {
		$input[] = $key;
	} else {
		while ($val--) {
			$input[] = $key;
		}
	}
    }
}
 
// 4, 3, 5, 9, 7, 2, 4, 1, 6, 5
var_dump($list);
 
radix_sort(&$list);
 
// 1, 2, 3, 4, 5, 5, 6, 7, 8, 9
var_dump($list);
Here the input is modified during the sorting process and it’s used as a result.

Complexity

The complexity of radix sort is linear, which in terms of omega means O(n). That is a great benefit in performance compared to O(n.log(n)) or even worse with O(n2) as we can see in the following chart.

Why Use Radix Sort


1. It’s fast

Radix sort is very fast compared to other sorting algorithms as we saw on the diagram above. This algorithm is very useful in practice because in practice we often sort sets of integers.


2. It’s easy to understand and implement

Even a beginner can understand and implement radix sort, which is great. You need no more than a few loops to implement it.

Why NOT using radix sort


1. Works only with integers

If you’re not sure about the input, you're better off not using radix sort. We may think that our input consists only of integers and we can go for radix sort, but what if in the future someone passes floats or strings to our routine.


2. Requires additional space

Radix sort needs additional space – at least as much as the input.

Final Words

Radix sort is restricted by the input’s domain, but I must say that in practice there are tons of cases where only integers are sorted. This is when we get some data from the db based on primary keys – typically primary in database tables are integers as well. So practically there are lots of cases of sorting integers, so radix sort may be one very, very useful algorithm and it is so cool that it is also easy to implement.

Published at DZone with permission of Stoimen Popov, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Yaron Levy replied on Sun, 2012/06/10 - 10:18am

I think this is counting sort. Radix sort sorts the data based on the digits. Eg: sort all the digits based on its LSB then move towards sorting based on its MSB.
Radix sort internally uses counting sort(the one you’ve mentioned above) to sort the data. Reference: ‘Introduction To Algorithms’ by CLRS

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.