Login | Register

The Diverting Fast Radix Algorithm

Title:

The Diverting Fast Radix Algorithm

Thiel, Stuart ORCID: https://orcid.org/0000-0002-2172-3916 (2021) The Diverting Fast Radix Algorithm. PhD thesis, Concordia University.

[thumbnail of Thiel_PhD_S2021.pdf]
Preview
Text (application/pdf)
Thiel_PhD_S2021.pdf - Accepted Version
Available under License Creative Commons Attribution.
1MB

Abstract

Radix sort is a classical algorithm to sort $N$ records with integer keys. The keys are represented using digits of a radix. The classical algorithm executes a pass through the records to count the frequency of each digit to determine the digit’s bucket size, and then a pass to deal the records to the corresponding bucket. The algorithm may proceed from the most significant digit to the least significant, or from the least significant digit to the most significant.

This thesis presents the Fast Radix and the Diverting Fast Radix algorithms, which replace
all but one count pass with estimation of bucket sizes and correction of overflow, and may,
in addition, divert to another sorting algorithm such as insertion sort when the subproblem
size is below a threshold.

Extensive performance experiments compare the implementations of the two algorithms against the native std::sort in the C++ Standard Library, radix sort, and RADULS2, which is currently considered the fastest radix sort available.

We develop a mathematical model of the average-case performance, when keys have a uniform distribution of digits, that support the observed performance improvements. The modeling demonstrates that the costs of estimation and correction quickly approach zero, falling under 5\% of the costs of traditional counting for $N$ as low as 11100. The modeling also provides a formula to calculate the thresholds for diversion that conform to the experimental results.

Our focus is on algorithms, so our implementation is fixed-digit, single-threaded and avoids optimizations such as parallelism, cache, translation lookaside buffer (TLB), or the specific input distribution, which are utilised by RADULS2. Nevertheless, Diverting Fast Radix is more than twice as fast as std::sort on inputs of size up to one billion (and beyond), and approximately 10\% faster than RADULS2 for one million records (even faster on smaller inputs). It is notable that the compiled code size of Diverting Fast Radix is two orders of magnitude smaller than RADULS2.

Divisions:Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:Thesis (PhD)
Authors:Thiel, Stuart
Institution:Concordia University
Degree Name:Ph. D.
Program:Computer Science
Date:29 January 2021
Thesis Supervisor(s):Butler, Gregory
Keywords:sort, sorting, algorithm, analysis, radix sort, LSD
ID Code:988207
Deposited By: STUART THIEL
Deposited On:13 Jul 2021 18:30
Last Modified:13 Jul 2021 18:30
Additional Information:The repository for algorithms developed in this thesis is: https://github.com/ramou/dfr
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.

Repository Staff Only: item control page

Downloads per month over past year

Research related to the current document (at the CORE website)
- Research related to the current document (at the CORE website)
Back to top Back to top