Radix sort
Radix sort is a fast, stable sorting algorithm that can be used to sort items identified by unique keys. Each key is a string or a number, and radix sort orders these keys in a lexicographic-like order. It is also known as postal sort.
The algorithm operates in O(n · k) time, where n is the number of items and k is the average key length. Radix sort was originally used to sort punched cards in several passes; a computer algorithm for radix sort was developed in 1954 at MIT by Harold H. Seward. In applications that demand high speed and can afford extra memory, radix sort is an improvement on the slower comparison sorts.
Radix sort has resurfaced as an alternative to high-performance comparison-based sorting algorithms (like heapsort and mergesort) that require Ω(n · log n) comparisons, where n is the number of items to be sorted. Comparison sorts can do no better than Ω(n · log n), but they offer the additional flexibility of sorting with respect to more complicated orderings than a lexicographic one; however, this ability is of little importance in many practical applications.
A radix sort algorithm works as follows:
- take the least significant digit (or group of bits, both being examples of radices) of each key.
- sort the list of elements based on that digit, but keep the order of elements with the same digit (this is the definition of a stable sort).
- repeat the sort with each more significant digit.
The sort in the second step is usually done using bucket sort or counting sort, which are efficient in this case since there are usually only a small number of possible digit values (this counting approach is also used in the sample implementations below).
In practice, O(n) time is only observed when the number of items is much greater than the size of the key space.
If the keys used are short integers, it is practical to complete the sorting with only two passes, and comparisons can be done with only a few bit operations that operate in constant time. In this case, and even when sorting 32-bit floating point numbers, radix sort can in practice be significantly faster than any other sorting algorithm.
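For 32-bit floating point numbers, the usual approach is to first map each float's bit pattern to an unsigned integer whose unsigned ordering matches the numeric ordering of the floats, then radix sort the unsigned keys. A minimal C++ sketch of this key transformation, assuming IEEE-754 floats (the function name float_to_sortable_key is our own illustration; see also the external link on radix sorting IEEE floating-point numbers):
#include <cstdint>
#include <cstring>
// Map the bit pattern of an IEEE-754 float to an unsigned integer whose
// unsigned ordering matches the float's numeric ordering, so an unsigned
// radix sort on the keys sorts the floats (illustrative sketch).
static uint32_t float_to_sortable_key(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);            // safe type pun
    // Negative floats: flip all bits, so more-negative values sort first.
    // Non-negative floats: set the sign bit, so they sort above negatives.
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}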
The greatest disadvantages of radix sort are that it usually cannot be made to run in place, so O(n) additional memory space is needed, and that it requires one pass over the input list for each symbol in the key, so it is very slow for potentially-long keys. Multiple histogramming techniques, where all the required histograms are constructed in a single pass, can reduce the latter problem.
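As a sketch of the multiple histogramming idea (the function name build_histograms is our own illustration), a single pass over 32-bit keys can fill all four per-byte histograms at once:
#include <cstring>
// Build all four byte histograms of 32-bit keys in one pass over the
// input, instead of one counting pass per digit (illustrative sketch).
static void build_histograms(const unsigned input[], unsigned length,
                             unsigned count[4][256])
{
    std::memset(count, 0, 4 * 256 * sizeof(unsigned));
    for (unsigned i = 0; i < length; ++i) {
        unsigned v = input[i];
        ++count[0][ v        & 0xFFu];
        ++count[1][(v >>  8) & 0xFFu];
        ++count[2][(v >> 16) & 0xFFu];
        ++count[3][(v >> 24) & 0xFFu];
    }
}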
Radix sort typically uses the following sorting order: short keys come before longer keys, and keys of the same length are sorted lexicographically. This coincides with the normal order of numbers, if the numbers are represented as digit strings.
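For illustration, this ordering (sometimes called shortlex) can be expressed as a simple comparison function, sketched here in C++:
#include <string>
// Shortlex comparison: shorter keys come first, and keys of equal length
// are compared lexicographically. For digit strings without leading
// zeros, this coincides with numeric order (illustrative sketch).
static bool shortlex_less(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return a.size() < b.size();
    return a < b;
}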
An example
Sort the list:
170, 45, 75, 90, 2, 24, 802, 66
- sorting by least significant digit (1s place) gives:
170, 90, 2, 802, 24, 45, 75, 66
Notice that we keep 2 before 802, because 2 occurred before 802 in the original list, and similarly for 170 and 90.
- sorting by next digit (10s place) gives:
2, 802, 24, 45, 66, 170, 75, 90
- sorting by most significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802
It is important to realise that each of the above steps requires just a single pass over the data, since each item can be placed in its correct bucket without having to be compared with other items.
Iterative version using queues
A simple version of radix sort can be achieved using queues as buckets. The following process is repeated once for each digit position, up to the greatest number of digits in any of the integers to be sorted, with k being the number of passes through the loop already completed:
- the integers are enqueued into an array of ten separate queues based on their kth digit
So, using the numbers from the previous example, the queues for the 1st pass would be:
0: 170, 90; 1: none; 2: 2, 802; 3: none; 4: 24; 5: 45, 75; 6: 66; 7 - 9: none
- the queues are dequeued back into an array of integers, in increasing order
Using the same numbers, the array will look like this after the 1st pass:
170, 90, 2, 802, 24, 45, 75, 66
For the second pass:
Queues:
0: 2, 802; 1: none; 2: 24; 3: none; 4: 45; 5: none; 6: 66; 7: 170, 75; 8: none; 9: 90
Array:
2, 802, 24, 45, 66, 170, 75, 90 (note that at this point only the three-digit numbers are out of order)
For the third:
Queues:
0: 2, 24, 45, 66, 75, 90; 1: 170; 2 - 7: none; 8: 802; 9: none
Array:
2, 24, 45, 66, 75, 90, 170, 802 (sorted)
While this may not be the most efficient Radix sort algorithm, it is relatively simple, and still quite efficient.
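A minimal C++ sketch of this queue-based version for non-negative integers (the function name radix_sort_queues and the max-finding loop are our own illustration):
#include <cstddef>
#include <queue>
#include <vector>
// Queue-based LSD radix sort for non-negative integers, following the
// enqueue/dequeue process described above (illustrative sketch).
static void radix_sort_queues(std::vector<unsigned>& a)
{
    unsigned max = 0;
    for (unsigned v : a)
        if (v > max) max = v;                     // determines the pass count

    for (unsigned long long div = 1; max / div > 0; div *= 10) {
        std::queue<unsigned> buckets[10];
        for (unsigned v : a)                      // enqueue by current digit
            buckets[(v / div) % 10].push(v);
        std::size_t i = 0;
        for (auto& q : buckets)                   // dequeue back, in order
            while (!q.empty()) { a[i++] = q.front(); q.pop(); }
    }
}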
Recursion
A recursively subdividing radix sort algorithm works as follows:
- take the most significant digit of each key.
- sort the list of elements based on that digit, grouping elements with the same digit into one bucket.
- recursively sort each bucket, starting with the next most significant digit.
- concatenate the buckets together in order.
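A C++ sketch of these steps for 32-bit unsigned integers, bucketing one byte per level of recursion (the function name msd_radix_sort is our own illustration):
#include <cstddef>
#include <vector>
// Recursive MSD radix sort on 32-bit unsigned integers, one byte per
// level: bucket by the current byte, recurse on each bucket with the
// next byte, then concatenate the buckets in order (illustrative sketch).
static void msd_radix_sort(std::vector<unsigned>& a, int shift = 24)
{
    if (a.size() < 2 || shift < 0)
        return;                                   // nothing left to split
    std::vector<unsigned> buckets[256];
    for (unsigned v : a)                          // group by the current byte
        buckets[(v >> shift) & 0xFFu].push_back(v);
    std::size_t i = 0;
    for (auto& b : buckets) {
        msd_radix_sort(b, shift - 8);             // sort each bucket recursively
        for (unsigned v : b)
            a[i++] = v;                           // concatenate in order
    }
}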
Implementation
A two-pass-per-digit method can be used: the first pass collects counts and sizes the buckets, and the second pass places each row (or a pointer to the row) into the appropriate bucket. A single-pass scheme can also be used, where each bucket is dynamically allocated as needed, but this runs the risk of serious memory fragmentation, which may degrade performance. Fragmentation could be avoided by using a fixed allocation for each bucket, but even for a single 8-bit digit this would require 256 times the memory allocated to the original array, so this approach quickly uses up all available memory and spills into paging space, which radically degrades performance. It would only make sense if each digit were very small, such as a single bit.
Application to Parallel Computing
Note that this recursive sorting algorithm has particular application to parallel computing, as each of the subdivisions can be sorted independently of the rest. In this case, each recursion is passed to the next available processor. A single processor would be used at the start (the most significant digit). Then, by the second or third digit, all available processors would likely be engaged. Finally, near the end, as each subdivision is fully sorted, fewer and fewer processors would be utilized.
A Recursive Forward Radix Sort Example
Sort the list:
170, 45, 75, 90, 2, 24, 802, 66
- Sorting by most significant digit (100s place) gives:
Zero hundreds bucket: 45, 75, 90, 2, 24, 66
One hundreds bucket: 170
Eight hundreds bucket: 802
- Sorting by next digit (10s place) is only needed for those numbers in the zero hundreds bucket (no other buckets contain more than one item):
Zero tens bucket: 2
Twenties bucket: 24
Forties bucket: 45
Sixties bucket: 66
Seventies bucket: 75
Nineties bucket: 90
- Sorting by least significant digit (1s place) is not needed, as there is no tens bucket with more than one number. Therefore, the now sorted zero hundreds bucket is concatenated with the one hundreds bucket and eight hundreds bucket to give:
2, 24, 45, 66, 75, 90, 170, 802
This example used base ten digits to be easily human readable, but of course base two or perhaps base 256 digits might make more sense for a binary computer.
Efficiency
The forward recursive radix sort is most efficient when most rows need to be sorted only a few digits deep. In that case, only a few digits are processed, versus all digits in the key for the nonrecursive reverse radix sort. On the other hand, if most rows must be sorted down to the least significant digit of each key (such as when many rows have duplicate keys), then the overhead of passing data to each recursion increases processing time relative to the nonrecursive reverse radix sort, which should be used for such cases.
Neither version of the radix sort is very efficient when the data is almost completely sorted to begin with, since both would ignore this fact and re-sort all the data. However, the completely sorted case can easily be detected in the first pass through the data (when building histograms).
Incremental Trie-based Radix sort
Another way to proceed with a radix sort is to use more memory to create a trie representing the keys, and then traverse the trie to visit each key in order. A trie essentially represents a set of strings or numbers, so a radix sort that uses a trie structure is not necessarily stable: because a set does not contain duplicate elements, the original order of duplicate keys is not automatically preserved. If keeping track of that order matters for a particular application, additional information must be associated with each key to indicate the population count or original order of any duplicates. It may even be desirable to discard duplicate strings as the trie is built, if the goal is to find only the unique strings in sorted order; sorting a list of strings first and then making a separate pass through the sorted list to discard duplicates can be less efficient.

A depth-first traversal of the trie starting from the root node then visits each key in order during movements away from the root. A depth-first traversal of a trie, or of any other kind of acyclic tree structure, is equivalent to traversing a maze via the right-hand rule.

One advantage of maintaining the trie structure is that it makes it possible to determine quickly whether a particular key is a member of the set, in a time proportional to the length of that key and independent of the total length of all of the keys. The sorted lists produced by other, non-incremental radix sorting algorithms, which sort all of the keys in a batch, must instead be searched with binary search, linear search, or some other method whose cost depends in some way on the total length of all of the keys. Maintaining the trie structure also makes it possible to insert new keys into the set, or delete keys from it, incrementally while maintaining sorted order, again in a time independent of the total length of all of the keys; in contrast, other radix sorting algorithms must re-sort the entire list of keys each time a key is added to or deleted from an existing list.
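As a minimal illustration, here is a C++ sketch of a trie-based sort of strings (the names TrieNode, insert, and visit_sorted are our own): std::map keeps each node's children in character order, so a pre-order depth-first traversal visits the keys in ascending order, and a per-node count records duplicates.
#include <map>
#include <memory>
#include <string>
#include <vector>
// Trie node: std::map keeps children in character order, so a pre-order
// depth-first traversal visits the stored keys in ascending order.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    unsigned count = 0;   // population count, to retain duplicate keys
};
static void insert(TrieNode& root, const std::string& key)
{
    TrieNode* node = &root;
    for (char c : key) {
        auto& child = node->children[c];          // create the path as needed
        if (!child)
            child = std::make_unique<TrieNode>();
        node = child.get();
    }
    ++node->count;                                // record the key
}
static void visit_sorted(const TrieNode& node, std::string& prefix,
                         std::vector<std::string>& out)
{
    for (unsigned i = 0; i < node.count; ++i)
        out.push_back(prefix);                    // emit key (and duplicates)
    for (const auto& [c, child] : node.children) {
        prefix.push_back(c);
        visit_sorted(*child, prefix, out);        // descend in ascending order
        prefix.pop_back();
    }
}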
Snow White analogy
If the nodes were rooms connected by hallways, then here is how a woman named Snow White might proceed to visit all of the dwarfs if the place were dark, keeping her right hand on a wall at all times:
- She travels down hall B to find Bashful. She asks him where the other dwarfs are, but he is too shy to say anything.
- She continues moving forward with her right hand on the wall, because it is dark, which takes her around the room and back up hall B.
- She moves down halls D, O, and C to find Doc, who is too busy reading a book by candlelight to help her find the other dwarfs.
- Continuing to follow the wall with her right hand, she goes back up hall C, then down hall P.
- She finds Dopey, but he just shrugs when she asks him where the other dwarfs are.
- She continues back up halls P, O, D, and then goes down hall G to find Grumpy, who is too grumpy to help her.
- She continues back up hall G, with her right hand still on the wall, and goes down hall H to the room where Happy is. He is happy to see her but does not know where the other dwarfs are.
- She travels back up hall H and turns right down halls S and L, where she finds Sleepy, who is sleeping.
- She goes back up hall L (with her right hand still on the wall), down hall N, where she finally finds Sneezy.
- She travels back up halls N and S to her starting point and knows that she is done.
This series of steps illustrates the path taken through the trie by Snow White via a depth-first traversal to visit the dwarfs in the ascending order of their names: Bashful, Doc, Dopey, Grumpy, Happy, Sleepy, and Sneezy. The algorithm that first performs some operation on the data associated with each node, such as printing the data, and then moves deeper into the tree is called a pre-order traversal, which is a kind of depth-first traversal. A pre-order traversal is used to process the contents of a trie in ascending order. If Snow White wanted to visit the dwarfs in the descending order of their names, then she could walk backwards while following the wall with her right hand, or, alternatively, walk forward while following the wall with her left hand. The algorithm that first moves deeper into the tree and then performs some operation on the data associated with each node is called a post-order traversal, which is another kind of depth-first traversal. A post-order traversal is used to process the contents of a trie in descending order.
The root node of the trie essentially represents a null string, an empty string, which can be useful for keeping track of the number of blank lines in a list of words. The null string can be associated with a circularly linked list, with the null string initially as its only member and the forward and backward pointers both initially pointing to the null string. The circularly linked list can then be expanded as each new key is inserted into the trie.
If a new key, other than the null string, is inserted into a leaf node of the trie, then the computer can go to the last preceding node where there was a key or a bifurcation to perform a depth-first search to find the lexicographic successor or predecessor of the inserted key for the purpose of splicing the new key into the circularly linked list. The last preceding node where there was a key or a bifurcation, a fork in the path, is a parent node in the type of trie shown here, where only unique string prefixes are represented as paths in the trie. If there is already a key associated with the parent node that would have been visited during a movement away from the root during a right-hand, forward-moving, depth first traversal, then that immediately ends the depth-first search, as that key is the predecessor of the inserted key. For example, if Bashful is inserted into the trie, then the predecessor is the null string in the parent node, which is the root node in this case. In other words, if the key that is being inserted is on the leftmost branch of the parent node, then any string contained in the parent node is the lexicographic predecessor of the key that is being inserted, else the lexicographic predecessor of the key that is being inserted exists down the parent node's branch that is immediately to the left of the branch where the new key is being inserted. For example, if Grumpy were the last key inserted into the trie, then the computer would have a choice of trying to find either the predecessor, Dopey, or the successor, Happy, with a depth-first search starting from the parent node of Grumpy. With no additional information to indicate which path is longer, the computer might traverse the longer path, D, O, P. If Dopey were the last key inserted into the trie, then the depth-first search starting from the parent node of Dopey would soon find the predecessor, Doc, because that would be the only choice.
If a new key is inserted into an internal node, then a depth-first search can be started from the internal node to find the lexicographic successor. For example, if the literal string, "DO", were inserted in the node at the end of the path D, O, then a depth-first search could be started from that internal node to find the successor, "DOC", for the purpose of splicing the new string into the circularly linked list.
Forming the circularly linked list requires more memory but allows the keys to be visited more directly in either ascending or descending order via a linear traversal of the linked list rather than a depth-first traversal of the entire trie. This concept of a circularly linked trie structure is similar to the concept of a threaded binary tree. Let's call this structure a circularly threaded trie.
When a trie is used to sort numbers, the numbers must all be the same length unless you are willing to perform a breadth-first traversal. When the numbers are visited via depth-first traversal, as in the example above, they will always be on the leaf nodes of the trie. Note how similar in concept this particular example of a trie is to the recursive forward radix sort example, which uses buckets instead of a trie. Performing a radix sort with the buckets is like creating a trie and then discarding the non-leaf nodes.
Sample implementations
C#
Here we sort an integer array that is passed to the RadixSort method (note: all arrays in C#/.NET are reference types, so a refers to the caller's array).
public void RadixSort(int[] a)
{
    // our helper array
    int[] t = new int[a.Length];
    // length in bits of each digit group
    int r = 4; // try to set this also to 2, 8 or 16 to see if it is quicker or not
    // number of bits of a C# int
    int b = 32;
    // counting and prefix arrays
    // (note dimension 1 << r, the number of all possible values of an r-bit number)
    int[] count = new int[1 << r];
    int[] pref = new int[1 << r];
    // number of groups
    int groups = (int)Math.Ceiling((double)b / (double)r);
    // the mask to identify groups
    int mask = (1 << r) - 1;
    // the algorithm:
    for (int c = 0, shift = 0; c < groups; c++, shift += r)
    {
        // reset count array
        for (int j = 0; j < count.Length; j++)
            count[j] = 0;
        // counting elements of the c-th group
        for (int i = 0; i < a.Length; i++)
            count[(a[i] >> shift) & mask]++;
        // calculating prefixes
        pref[0] = 0;
        for (int i = 1; i < count.Length; i++)
            pref[i] = pref[i - 1] + count[i - 1];
        // from a[] to t[], elements ordered by the c-th group
        for (int i = 0; i < a.Length; i++)
            t[pref[(a[i] >> shift) & mask]++] = a[i];
        // copy t[] back into a[] and start again until the last group
        // (copying in place keeps the caller's array reference valid)
        t.CopyTo(a, 0);
    }
    // a is sorted (note: values are treated as unsigned bit patterns,
    // so negative numbers end up after the positive ones)
}
C/C++
typedef struct slist_ {
    long val;
    struct slist_ *next;
} slist;

/* LSD radix sort of a linked list of non-negative values;
   t is the number of decimal digits in the longest key */
slist *radix_list(slist *l, int t)
{
    int i, j, d;
    long m = 1;
    slist *temp, *head[10], *tail[10];

    if (l == NULL)
        return NULL;
    for (j = 1; j <= t; j++) {
        for (i = 0; i <= 9; i++)
            head[i] = tail[i] = NULL;
        /* distribute the nodes over ten queues by the current digit */
        while (l != NULL) {
            d = (int)(l->val / m) % 10;
            temp = tail[d];
            if (head[d] == NULL)
                head[d] = l;
            else
                temp->next = l;
            temp = tail[d] = l;
            l = l->next;
            temp->next = NULL;
        }
        /* concatenate the queues back into a single list */
        for (i = 0; i <= 9; i++)
            if (head[i] != NULL)
                break;
        l = head[i];
        temp = tail[i];
        for (d = i + 1; d <= 9; d++) {
            if (head[d] != NULL) {
                temp->next = head[d];
                temp = tail[d];
            }
        }
        m *= 10;
    }
    return l;   /* head of the sorted list */
}
C++ example of non-recursive implementation
/* Copyright (C) David Garcia 2006
* THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.
* You may copy, distribute and create derivative works of it as you want.
*/
#include <assert.h>
/* Counting sort for 4-byte unsigned integers.
* This implementation tries to be as easy to understand as possible even
* when that means some performance penalties.
*
* Preconditions:
* * the unsigned data type must be 4 bytes long
* * 'input' must be an array of size 'length'
* * 'output' must be an array of size 'length'
* * 'SHIFT' must be one of [0,8,16,24]
* Postconditions:
* * 'output' contains all the elements in 'input'
* * the elements in 'output' are sorted in increasing order according
* to the SHIFT/8 byte
* * this sort is stable
*/
template <unsigned SHIFT>
inline static void radix_sort_helper(const unsigned input[/*length*/], const unsigned length,
unsigned output[/*length*/], unsigned count[/*256*/])
{
register unsigned i, bucket;
register const unsigned MASK= 0xFF;
// Initialise count (could have used memset instead)
for(i=0; i < 256; ++i) {
count[i]= 0;
}
// Count how many elements lie in each bucket
for(i=0; i < length; ++i) {
bucket= (input[i]>>SHIFT)&MASK;
assert( bucket < 256 );
count[bucket]++;
assert( count[bucket] <= length );
}
// Count how many elements lie in each bucket or the buckets before it
for(i=1; i < 256; ++i) {
count[i]+= count[i-1];
assert( count[i] <= length );
}
// Store each element in sorted order, iterating backwards to keep the
// sort stable (i is unsigned, so the loop ends when i wraps past zero)
for(i=length-1; i < length; --i) {
bucket= (input[i]>>SHIFT)&MASK;
assert( bucket < 256 );
assert( count[bucket] > 0 );
// ...updating the counter for the next element that lies in this bucket
count[bucket]--;
assert( count[bucket] < length );
output[count[bucket]]= input[i];
}
}
/* Radix sort for 4-byte unsigned integers.
* This implementation tries to be as easy to understand as possible even
* when that means some performance penalties.
*
* Preconditions:
* * the unsigned data type must be 4 bytes long
* * 'array' must be an array of size 'length'
* Postconditions:
* * the elements in 'array' are sorted in increasing order
*/
void radix_sort(unsigned array[/*length*/], const unsigned length)
{
// Note that we must allocate a temporary array as long as the input,
// whereas quicksort and others sort in place.
unsigned* temporal= new unsigned[length];
unsigned count[256];
// Call the helper four times, once for each byte.
// Note that the input and output are swapped in each call
radix_sort_helper<0>(array, length, temporal, count);
radix_sort_helper<8>(temporal, length, array, count);
radix_sort_helper<16>(array, length, temporal, count);
radix_sort_helper<24>(temporal, length, array, count);
delete[] temporal;
}
See also
- IBM 80 Electric Punched Card Sorting Machine
External links
- Demonstration and comparison of Radix sort with Bubble sort, Merge sort and Quicksort implemented in JavaScript
- Page with visual demonstrations of sorting algorithms including Radix sort, all implemented in Java.
- Article about Radix sorting IEEE floating-point numbers with implementation. Note: Individual hexadecimal digits represent 4 bits, not the 8 bits that the author of the article indicates. Pairs of hexadecimal digits, such as 8A, represent 8 bits.
- Faster Floating Point Sorting and Multiple Histogramming with implementation