Towards a Realistic Analysis of Some Popular Sorting Algorithms

J. CLÉMENT; T. H. NGUYEN THI; B. VALLÉE

doi:10.1017/S0963548314000649

Published online by Cambridge University Press: 11 December 2014

Affiliation (J. CLÉMENT, T. H. NGUYEN THI and B. VALLÉE):
Université de Caen/ENSICAEN/CNRS – GREYC – Caen, France (e-mails: Julien.Clement@unicaen.fr, thu-hien.nguyen-thi@unicaen.fr, brigitte.vallee@unicaen.fr)

Abstract

We describe a general framework for realistic analysis of sorting algorithms, and we apply it to the average-case analysis of three basic sorting algorithms (QuickSort, InsertionSort, BubbleSort). Usually the analysis deals with the mean number of key comparisons, but here we view keys as words produced by the same source, which are compared via their symbols in lexicographic order. The ‘realistic’ cost of the algorithm is now the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons used by the algorithm. For sorting algorithms, and with respect to key comparisons, the average-case complexity of QuickSort is asymptotic to $2n \log n$, InsertionSort to $n^2/4$ and BubbleSort to $n^2/2$. With respect to symbol comparisons, we prove that their average-case complexity becomes $\Theta(n \log^2 n)$, $\Theta(n^2)$, $\Theta(n^2 \log n)$. In these three cases, we describe the dominant constants which exhibit the probabilistic behaviour of the source (namely entropy and coincidence) with respect to the algorithm.
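The distinction between the two cost measures can be made concrete with a small sketch. It is not taken from the paper: the names `Tally`, `less` and `insertion_sort` are hypothetical, the keys are finite Python strings rather than words emitted by a source, and one symbol comparison is counted for every pair of symbols inspected before the first mismatch.

```python
class Tally:
    """Hypothetical helper holding the two cost measures."""
    def __init__(self):
        self.key = 0      # number of key (word) comparisons
        self.symbol = 0   # number of symbol comparisons


def less(u, v, tally):
    """Return True if u < v in lexicographic order, updating the tally."""
    tally.key += 1
    for a, b in zip(u, v):
        tally.symbol += 1          # one symbol pair inspected
        if a != b:
            return a < b           # first mismatch decides the order
    return len(u) < len(v)         # one word is a prefix of the other


def insertion_sort(words, tally):
    """Plain InsertionSort on a copy of words, counting both costs."""
    out = list(words)
    for i in range(1, len(out)):
        j = i
        # Move out[j] left while it is smaller than its left neighbour.
        while j > 0 and less(out[j], out[j - 1], tally):
            out[j], out[j - 1] = out[j - 1], out[j]
            j -= 1
    return out


tally = Tally()
result = insertion_sort(["abb", "aba", "baa", "aab"], tally)
print(result, tally.key, tally.symbol)
```

On this four-word input the sort performs 5 key comparisons but 9 symbol comparisons, because words sharing a prefix need several symbol inspections before their order is decided. The abstract's results quantify how this gap grows on average for words drawn from a source.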

Keywords

Primary: 68W32, 68P10, 68P05, 68W40, 68Q25, 68Q87, 68Q17. Secondary: 68W05, 94A15, 94A17, 97I80, 30D10, 30E15, 30E20.

Type: Paper
Information: Combinatorics, Probability and Computing, Volume 24, Special Issue 1: Honouring the Memory of Philippe Flajolet - Part 3, January 2015, pp. 104-144

DOI: https://doi.org/10.1017/S0963548314000649
Copyright: Copyright © Cambridge University Press 2014
