
Towards a Realistic Analysis of Some Popular Sorting Algorithms

Published online by Cambridge University Press:  11 December 2014

J. CLÉMENT
T. H. NGUYEN THI
B. VALLÉE
Affiliation:
Université de Caen/ENSICAEN/CNRS – GREYC – Caen, France (e-mails: Julien.Clement@unicaen.fr, thu-hien.nguyen-thi@unicaen.fr, brigitte.vallee@unicaen.fr)

Abstract

We describe a general framework for the realistic analysis of sorting algorithms, and we apply it to the average-case analysis of three basic sorting algorithms (QuickSort, InsertionSort, BubbleSort). Usually the analysis deals with the mean number of key comparisons, but here we view keys as words produced by the same source, which are compared via their symbols in lexicographic order. The ‘realistic’ cost of the algorithm is now the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons used by the algorithm. For sorting algorithms, and with respect to key comparisons, the average-case complexity of QuickSort is asymptotic to $2n \log n$, InsertionSort to $n^2/4$ and BubbleSort to $n^2/2$. With respect to symbol comparisons, we prove that their average-case complexity becomes $\Theta(n \log^2 n)$, $\Theta(n^2)$ and $\Theta(n^2 \log n)$, respectively. In these three cases, we describe the dominant constants which exhibit the probabilistic behaviour of the source (namely entropy and coincidence) with respect to the algorithm.
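
To make the distinction between key comparisons and symbol comparisons concrete, the following is a minimal sketch and not the authors' method. It runs the three algorithms on words and tallies both costs. All names (`ComparisonCounter`, `random_words`, and the sort functions) are hypothetical. The sketch also assumes finite words of fixed length drawn from a uniform memoryless binary source, which is a simplification of the general sources considered in the paper.

```python
import random


class ComparisonCounter:
    """Tallies key comparisons and the symbol comparisons they trigger."""

    def __init__(self):
        self.keys = 0
        self.symbols = 0

    def less(self, u, v):
        """Return True if word u precedes word v in lexicographic order."""
        self.keys += 1
        for a, b in zip(u, v):
            self.symbols += 1      # one pair of symbols examined
            if a != b:
                return a < b       # first mismatch decides the order
        return len(u) < len(v)     # one word is a prefix of the other


def insertion_sort(words, counter):
    a = list(words)
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        # Shift larger words right until the slot for x is found.
        while j >= 0 and counter.less(x, a[j]):
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
    return a


def bubble_sort(words, counter):
    a = list(words)
    # Plain variant without early exit: n(n-1)/2 key comparisons.
    for end in range(len(a) - 1, 0, -1):
        for j in range(end):
            if counter.less(a[j + 1], a[j]):
                a[j], a[j + 1] = a[j + 1], a[j]
    return a


def quick_sort(words, counter):
    a = list(words)
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]      # first word as pivot (an assumption)
    smaller, larger = [], []
    for w in rest:
        (smaller if counter.less(w, pivot) else larger).append(w)
    return quick_sort(smaller, counter) + [pivot] + quick_sort(larger, counter)


def random_words(n, length=64, alphabet="ab", seed=0):
    """Hypothetical source: n words with independent uniform symbols."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(n)]


if __name__ == "__main__":
    words = random_words(200)
    for sort in (quick_sort, insertion_sort, bubble_sort):
        c = ComparisonCounter()
        assert sort(words, c) == sorted(words)
        print(f"{sort.__name__}: {c.keys} key comparisons, "
              f"{c.symbols} symbol comparisons")
```

Each call to `less` costs one key comparison but as many symbol comparisons as the length of the common prefix of the two words plus one, so algorithms that tend to compare words sharing long prefixes pay more per key comparison. That extra factor is what the symbol-comparison estimates in the abstract account for.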

Type
Paper
Copyright
Copyright © Cambridge University Press 2014 
