A subgraph of a hypergraph H is even if all its degrees are positive even integers, and b-bounded if it has maximum degree at most b. Let f_b(n) denote the maximum number of edges in a linear n-vertex 3-uniform hypergraph which does not contain a b-bounded even subgraph. In this paper, we show that if b ≥ 12, then [upper and lower bounds on f_b(n) hold] for some absolute constant B, thus establishing f_b(n) up to polylogarithmic factors. This leaves open the interesting case b = 2, which is the case of 2-regular subgraphs. We are able to show, for some constants c, C > 0, that [bounds on f_2(n) hold]. We conjecture that f_2(n) = n^{1+o(1)} as n → ∞.
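To make these definitions concrete, here is a small brute-force checker (a Python sketch of my own, not code from the paper; the sample hypergraphs are illustrative):

    from itertools import combinations
    from collections import Counter

    def has_b_bounded_even_subgraph(edges, b):
        # An even subgraph is a nonempty set of edges in which every covered
        # vertex has positive even degree; b-bounded caps that degree at b.
        # Exponential in the number of edges: a toy checker for tiny instances.
        for size in range(2, len(edges) + 1):
            for subset in combinations(edges, size):
                deg = Counter(v for e in subset for v in e)
                if all(d % 2 == 0 and d <= b for d in deg.values()):
                    return True
        return False

    # Two triples of a linear hypergraph cannot form an even subgraph on their
    # own, but four triples covering all six vertices twice each can.
    print(has_b_bounded_even_subgraph([(1, 2, 3), (3, 4, 5)], 2))  # False
    print(has_b_bounded_even_subgraph(
        [(1, 2, 3), (3, 4, 5), (5, 6, 1), (2, 4, 6)], 2))          # True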
R. H. Schelp conjectured that if G is a graph with |V(G)| = R(P_n, P_n) such that δ(G) > 3|V(G)|/4, then in every 2-colouring of the edges of G there is a monochromatic P_n. In other words, the Ramsey number of a path does not change if the graph to be coloured is not complete but has large minimum degree.
Here we prove Ramsey-type results that imply the conjecture in a weakened form: first we replace the path by a matching, showing that the star–matching–matching Ramsey number satisfies R(S_n, nK_2, nK_2) = 3n − 1. This extends R(nK_2, nK_2) = 3n − 1, an old result of Cockayne and Lorimer. Then we extend this further from matchings to connected matchings, and outline how this implies Schelp's conjecture in an asymptotic sense through a standard application of the Regularity Lemma.
It is sad that we are unable to hear Dick Schelp's reaction to our work generated by his conjecture.
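The identity R(S_n, nK_2, nK_2) = 3n − 1 can at least be checked exhaustively in the smallest case. Below is a brute-force verifier (my own Python sketch, not the paper's method; I read S_n as the star with n edges, and the search is only feasible for n = 2):

    from itertools import combinations, product

    def has_star(coloured, colour, n, vertices):
        # S_n: some vertex incident to at least n edges of the given colour
        return any(sum(1 for e, c in coloured.items() if c == colour and v in e) >= n
                   for v in vertices)

    def has_matching(coloured, colour, n):
        # nK_2: n pairwise disjoint edges of the given colour
        edges = [e for e, c in coloured.items() if c == colour]
        return any(len({v for e in sub for v in e}) == 2 * n
                   for sub in combinations(edges, n))

    def ramsey_holds(N, n):
        # check every 3-colouring of K_N for S_n in colour 0,
        # or nK_2 in colour 1 or in colour 2
        vertices = range(N)
        edge_list = list(combinations(vertices, 2))
        for colours in product(range(3), repeat=len(edge_list)):
            coloured = dict(zip(edge_list, colours))
            if not (has_star(coloured, 0, n, vertices)
                    or has_matching(coloured, 1, n)
                    or has_matching(coloured, 2, n)):
                return False
        return True

    n = 2
    print(ramsey_holds(3 * n - 1, n))  # expect True: K_5 forces one of the targets
    print(ramsey_holds(3 * n - 2, n))  # expect False: K_4 admits a good colouring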
Upper and lower bounds are proved for the maximum number of triangles in C_{2k+1}-free graphs. The bounds involve extremal numbers related to appropriate even cycles.
Let 1 ≤ p ≤ r + 1, with r ≥ 2 an integer, and let G be a graph of order n. Let d(v) denote the degree of a vertex v ∈ V(G). We show that if [a stated lower bound on ∑_{v ∈ V(G)} d^p(v) holds], then G has more than [a stated number of] (r + 1)-cliques sharing a common edge. From this we deduce that if [a corresponding bound holds], then G contains more than [a stated number of] cliques of order r + 1.
In turn, this statement is used to strengthen the Erdős–Stone theorem by using ∑_{v ∈ V(G)} d^p(v) instead of the number of edges.
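The quantities in play can be computed directly for small graphs. A Python sketch of my own (not from the paper; graphs are given as a dict mapping each vertex to its set of neighbours):

    from itertools import combinations

    def degree_power_sum(adj, p):
        # sum of deg(v)^p over all vertices of the graph
        return sum(len(nbrs) ** p for nbrs in adj.values())

    def max_cliques_on_an_edge(adj, r):
        # largest number of (r+1)-cliques sharing one edge uv; each such
        # clique is an (r-1)-clique inside the common neighbourhood of u and v
        best = 0
        for u in adj:
            for v in adj[u]:
                if u < v:
                    common = sorted(adj[u] & adj[v])
                    count = sum(1 for S in combinations(common, r - 1)
                                if all(y in adj[x] for x, y in combinations(S, 2)))
                    best = max(best, count)
        return best

    K5 = {v: {u for u in range(5) if u != v} for v in range(5)}
    print(degree_power_sum(K5, 2))        # 5 * 4^2 = 80
    print(max_cliques_on_an_edge(K5, 3))  # C(3, 2) = 3 four-cliques per edge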
Richard Schelp completed his PhD in lattice theory in 1970 at Kansas State University. However, he did not take a traditional route to a PhD in mathematics and an outstanding career as a professor and a mathematical researcher. He grew up in rural northeast Missouri. He received his BS in mathematics and physics from the University of Central Missouri. After completing his master's degree in mathematics at Kansas State University, he spent five years as an associate mathematician in the Applied Science Laboratory at Johns Hopkins University. To start his PhD programme at Kansas State University, he had to quit a well-paying position; he was also already married to his wife Billie (Swopes) Schelp, and he had a family – a daughter Lisa and a son Rick. It was a courageous step to take, and it says something about who Dick Schelp was.
In 1965 Erdős conjectured a formula for the maximum number of edges in a k-uniform n-vertex hypergraph without a matching of size s. We prove this conjecture for k = 3 and all s ≥ 1 and n ≥ 4s.
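For reference, the conjectured extremal value (recalled here from the standard statement of Erdős's conjecture, not quoted from the paper) is the larger of the two natural constructions: all k-sets inside ks − 1 vertices, or all k-sets meeting a fixed set of s − 1 vertices:

    \max\left\{ \binom{ks-1}{k},\ \binom{n}{k} - \binom{n-s+1}{k} \right\}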
A family ℱ of sets is said to be intersecting if A ∩ B ≠ ∅ for all A, B ∈ ℱ. It is a well-known and simple fact that an intersecting family of subsets of [n] = {1, 2, . . ., n} can contain at most 2^{n−1} sets. Katona, Katona and Katona ask the following question. Suppose instead ℱ is a family of subsets of [n] satisfying |ℱ| = 2^{n−1} + i for some fixed i > 0. Create a new family ℱ_p by choosing each member of ℱ independently with some fixed probability p. How do we choose ℱ to maximize the probability that ℱ_p is intersecting? They conjecture that there is a nested sequence of optimal families for i = 1, 2, . . ., 2^{n−1}. In this paper, we show that the families [n]^{(≥ r)} = {A ⊂ [n]: |A| ≥ r} are optimal for the appropriate values of i, thereby proving the conjecture for this sequence of values. Moreover, we show that for intermediate values of i there exist optimal families lying between those we have found. It turns out that the optimal families we find simultaneously maximize the number of intersecting subfamilies of each possible order.
Standard compression techniques appear inadequate to solve the problem as they do not preserve intersection properties of subfamilies. Instead, our main tool is a novel compression method, together with a way of ‘compressing subfamilies’, which may be of independent interest.
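For very small n the optimization can be carried out exhaustively. A Python sketch (my own illustration, not the paper's method): it computes the exact probability that ℱ_p is intersecting by summing over all subfamilies, and compares [3]^{(≥2)} against a same-size family containing a singleton:

    from itertools import chain, combinations

    def subfamilies(family):
        return chain.from_iterable(combinations(family, k)
                                   for k in range(len(family) + 1))

    def is_intersecting(family):
        return all(a & b for a, b in combinations(family, 2))

    def prob_intersecting(family, p):
        # exact probability that the random subfamily F_p is intersecting,
        # by brute force over all 2^|F| subfamilies (tiny families only)
        return sum(p ** len(sub) * (1 - p) ** (len(family) - len(sub))
                   for sub in subfamilies(family) if is_intersecting(sub))

    n, p = 3, 0.5
    big = [frozenset(s) for k in (2, 3)
           for s in combinations(range(1, n + 1), k)]   # [3]^(>= 2): 4 sets
    rival = big[:-1] + [frozenset({1})]                  # same size, one singleton
    print(prob_intersecting(big, p), prob_intersecting(rival, p))  # 1.0 > rival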
For each of us who appear to have had a successful experiment there are many to whom their own experiments seem barren and negative.
Melvin Calvin, 1961 Nobel Lecture
An experiment is not considered “barren and negative” when it disproves your conjecture: an experiment fails by being inconclusive.
Successful experiments are partly the product of good experimental designs, as described in Chapter 2; there is also an element of luck (or savvy) in choosing a well-behaved problem to study. Furthermore, computational research on algorithms provides unusual opportunities for “tuning” experiments to yield more successful analyses and stronger conclusions. This chapter surveys techniques for building better experiments along these lines.
We start with a discussion of what makes a data set good or bad in this context. The remainder of this section surveys strategies for tweaking experimental designs to yield more successful outcomes.
If tweaks are not sufficient, stronger measures can be taken; Section 6.1 surveys variance reduction techniques, which modify test programs to generate better data, and Section 6.2 describes simulation shortcuts, which produce more data per unit of computation time.
The key idea is to exploit the fact, pointed out in Section 5.1, that the application program that implements an algorithm for practical use is distinct from the test program that describes algorithm performance. The test program need not resemble the application program at all; it is only required to reproduce faithfully the algorithm properties of interest.
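One technique in the variance-reduction family is common random numbers: when comparing two algorithms, feed both the same random instances, so that instance-to-instance noise cancels in the estimate of their cost difference. A sketch of my own (not the book's example; the bin-packing heuristics and all names are illustrative):

    import random

    def first_fit(items, cap=1.0):
        # place each item in the first bin with room, opening bins as needed
        bins = []
        for x in items:
            for i, load in enumerate(bins):
                if load + x <= cap:
                    bins[i] += x
                    break
            else:
                bins.append(x)
        return len(bins)

    def best_fit(items, cap=1.0):
        # place each item in the fullest bin that still has room
        bins = []
        for x in items:
            fits = [i for i, load in enumerate(bins) if load + x <= cap]
            if fits:
                i = max(fits, key=lambda i: bins[i])
                bins[i] += x
            else:
                bins.append(x)
        return len(bins)

    def variance(xs):
        mu = sum(xs) / len(xs)
        return sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)

    random.seed(7)
    trials, n = 200, 100
    indep = [first_fit([random.random() for _ in range(n)])
             - best_fit([random.random() for _ in range(n)])
             for _ in range(trials)]
    crn = []
    for _ in range(trials):
        items = [random.random() for _ in range(n)]   # common random instance
        crn.append(first_fit(items) - best_fit(items))
    print(variance(indep), variance(crn))  # the CRN variance is much smaller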
The purpose of computing is insight, not numbers.
Richard Hamming, Numerical Methods for Scientists and Engineers
Some questions:
You are a working programmer given a week to reimplement a data structure that supports client transactions, so that it runs efficiently when scaled up to a much larger client base. Where do you start?
You are an algorithm engineer, building a code repository to hold fast implementations of dynamic multigraphs. You read papers describing asymptotic bounds for several approaches. Which ones do you implement?
You are an operations research consultant, hired to solve a highly constrained facility location problem. You could build the solver from scratch or buy optimization software and tune it for the application. How do you decide?
You are a Ph.D. student who just discovered a new approximation algorithm for graph coloring that will make your career. But you're stuck on the average-case analysis. Is the theorem true? If so, how can you prove it?
You are the adviser to that Ph.D. student, and you are skeptical that the new algorithm can compete with state-of-the-art graph coloring algorithms. How do you find out?
One good way to answer all these questions is: run experiments to gain insight.
This book is about experimental algorithmics, which is the study of algorithms and their performance by experimental means. We interpret the word algorithm very broadly, to include algorithms and data structures, as well as their implementations in source code and machine code.
In almost every computation a great variety of arrangements for the succession of the processes is possible, and various considerations must influence the selection amongst them for the purposes of a Calculating Engine. One essential object is to choose that arrangement which shall tend to reduce to a minimum the time necessary for completing the calculation.
Ada Byron, Memoir on the Analytical Engine, 1843
This chapter considers an essential question raised by Ada Byron in her famous memoir: how do we make it run faster?
This question can be addressed at all levels of the algorithm design hierarchy sketched in Figure 1.1 of Chapter 1, including systems, algorithms, code, and hardware. Here we focus on tuning techniques that lie between the algorithm design and hardware levels. We start with the assumption that the system analysis and abstract algorithm design work has already taken place, and that a basic implementation of an algorithm with good asymptotic performance is in hand. The tuning techniques in this chapter are meant to improve upon the abstract design work, not replace it.
Tuning exploits the gaps between practical experience and the simplifying assumptions necessary to theory, by focusing on constant factors instead of asymptotics, secondary instead of dominant costs, and performance on “typical” inputs rather than theoretical classes. Many of the ideas presented here are known in the folklore under the general rubric of “code tuning.”
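A small example of the flavour (a sketch of my own, not from the chapter): both functions below run the same O(n) nearest-point scan, but the tuned version compares squared distances, dropping the square root, and hoists the query coordinates into locals, a pure constant-factor win:

    import math
    import random
    import timeit

    def nearest_naive(points, q):
        # recomputes a square root for every candidate point
        best, best_d = None, float("inf")
        for p in points:
            d = math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
            if d < best_d:
                best, best_d = p, d
        return best

    def nearest_tuned(points, q):
        # same algorithm: squared distances preserve the ordering,
        # and the query coordinates are bound to locals once
        qx, qy = q
        best, best_d2 = None, float("inf")
        for px, py in points:
            d2 = (px - qx) ** 2 + (py - qy) ** 2
            if d2 < best_d2:
                best, best_d2 = (px, py), d2
        return best

    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(100_000)]
    q = (0.5, 0.5)
    assert nearest_naive(pts, q) == nearest_tuned(pts, q)
    for f in (nearest_naive, nearest_tuned):
        print(f.__name__, timeit.timeit(lambda: f(pts, q), number=10))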
This guidebook is written for anyone – student, researcher, or practitioner – who wants to carry out computational experiments on algorithms (and programs) that yield correct, general, informative, and useful results. (We take the wide view and use the term “algorithm” to mean “algorithm or program” from here on.)
Whether the goal is to predict algorithm performance or to build faster and better algorithms, the experiment-driven methodology outlined in these chapters provides insights into performance that cannot be obtained by purely abstract means or by simple runtime measurements. The past few decades have seen considerable developments in this approach to algorithm design and analysis, both in the number of participants and in methodological sophistication.
In this book I have tried to present a snapshot of the state of the art in this field (which is known as experimental algorithmics and empirical algorithmics), at a level suitable for the newcomer to computational experiments. The book is aimed at a reader with some undergraduate computer science experience: you should know how to program, and ideally you have had at least one course in data structures and algorithm analysis. Otherwise, no previous experience is assumed regarding the other topics addressed here, which range widely from architectures and operating systems, to probability theory, to techniques of statistics and data analysis.
A note to academics: The book takes a nuts-and-bolts approach that would be suitable as a main or supplementary text in a seminar-style course on advanced algorithms, experimental algorithmics, algorithm engineering, or experimental methods in computer science.
Strategy without tactics is the slowest route to victory. Tactics without strategy is the noise before defeat.
Sun Tzu, The Art of War
W. I. B. Beveridge, in his classic guidebook for young scientists [7], likens scientific research “to warfare against the unknown”:
The procedure most likely to lead to an advance is to concentrate one's forces on a very restricted sector chosen because the enemy is believed to be weakest there. Weak spots in the defence may be found by preliminary scouting or by tentative attacks.
This chapter is about developing small- and large-scale plans of attack in algorithmic experiments.
To make the discussion concrete, we consider algorithms for the graph coloring (GC) problem. The input is a graph G containing n vertices and m edges. A coloring of G is an assignment of colors to vertices such that no two adjacent vertices have the same color. Figure 2.1 shows an example graph with eight vertices and 10 edges, colored with four colors. The problem is to find a coloring that uses a minimum number of colors – is 4 the minimum in this case?
When restricted to planar graphs, this is the famous map coloring problem, which is to color the regions of a map so that adjacent regions have different colors. Only four colors are needed for any map, but in the general graph problem, as many as n colors may be required.
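A natural first experimental subject is the greedy heuristic: scan the vertices in some order and give each the smallest color not already used by a neighbor. A Python sketch (the eight-vertex instance below is made up; it is not the graph of Figure 2.1):

    def greedy_coloring(adj, order=None):
        # assign each vertex the smallest color absent from its colored
        # neighbors; uses at most max_degree + 1 colors, not necessarily
        # the minimum, so it gives an upper bound to experiment against
        color = {}
        for v in (order or list(adj)):
            taken = {color[u] for u in adj[v] if u in color}
            c = 0
            while c in taken:
                c += 1
            color[v] = c
        return color

    # a made-up 8-vertex, 10-edge instance (not the graph of Figure 2.1)
    adj = {
        0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 5],
        4: [2, 5, 6], 5: [3, 4, 7], 6: [4, 7], 7: [5, 6],
    }
    coloring = greedy_coloring(adj)
    print(coloring, "->", 1 + max(coloring.values()), "colors")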
Really, the slipshod way we deal with data is a disgrace to civilization.
M. J. Moroney, Facts from Figures
Information scientists tell us that data, alone, have no value or meaning [1]. When organized and interpreted, data become information, which is useful for answering factual questions: Which is bigger, X or Y? How many Z's are there? A body of information can be further transformed into knowledge, which reflects understanding of how and why, at a level sufficient to direct choices and make predictions: which algorithm should I use for this application? How long will it take to run?
Data analysis is a process of inspecting, summarizing, and interpreting a set of data to transform it into something useful: information is the immediate result, and knowledge the ultimate goal.
This chapter surveys some basic techniques of data analysis and illustrates their application to algorithmic questions. Section 7.1 presents techniques for analyzing univariate (one-dimensional) data samples. Section 7.2 surveys techniques for analyzing bivariate data samples, which are expressed as pairs of (X, Y) points. No statistical background is required of the reader.
One chapter is not enough to cover all the data analysis techniques that are useful to algorithmic experiments – something closer to a few bookshelves would be needed. Here we focus on describing a small collection of techniques that address the questions most commonly asked about algorithms, and on knowing which technique to apply in a given scenario.
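One workhorse bivariate analysis is fitting a power law to (input size, running time) pairs on a log-log scale. A sketch of my own (sorting stands in for the algorithm under test; all names are illustrative):

    import math
    import random
    import time

    def running_time(n, reps=5):
        # best-of-reps timing of Python's sort on a random input of size n
        xs = [random.random() for _ in range(n)]
        best = float("inf")
        for _ in range(reps):
            start = time.perf_counter()
            sorted(xs)
            best = min(best, time.perf_counter() - start)
        return best

    def power_law_fit(pairs):
        # least-squares line through (log n, log t) gives t ~ c * n^b
        logs = [(math.log(n), math.log(t)) for n, t in pairs]
        k = len(logs)
        mx = sum(x for x, _ in logs) / k
        my = sum(y for _, y in logs) / k
        b = (sum((x - mx) * (y - my) for x, y in logs)
             / sum((x - mx) ** 2 for x, _ in logs))
        return math.exp(my - b * mx), b

    random.seed(3)
    pairs = [(n, running_time(n)) for n in (1_000, 4_000, 16_000, 64_000, 256_000)]
    c, b = power_law_fit(pairs)
    print(f"t ~ {c:.2e} * n^{b:.2f}")  # for sorting, expect b a bit above 1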
Write your workhorse program well; instrument your program; your experimental results form a database: treat it with respect; keep a kit full of sharp tools.
Jon Louis Bentley, Ten Commandments for Experiments on Algorithms
They say the workman is only as good as his tools; in experimental algorithmics the workman must often build his tools.
The test environment is the collection of programs and files assembled to support computational experiments on algorithms. This collection includes test programs that implement the algorithms of interest; code to generate input instances and files containing instances; scripts to control and document tests; tools for measuring performance; and data analysis software.
This chapter presents tips for assembling and building these components to create a reliable, efficient, and flexible test environment. We start with a survey of resources available to the experimenter. Section 5.1 surveys aspects of test program design, and Section 5.2 presents a cookbook of methods for generating random numbers and combinatorial objects to use as test inputs or inside randomized algorithms.
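In the spirit of that cookbook, here are two standard generators (a sketch; the function names are mine, and the seeded generator is just one way to make test inputs reproducible):

    import random

    def random_permutation(n, rng=random):
        # Fisher-Yates shuffle: every one of the n! orderings is equally likely
        a = list(range(n))
        for i in range(n - 1, 0, -1):
            j = rng.randrange(i + 1)
            a[i], a[j] = a[j], a[i]
        return a

    def random_graph(n, p, rng=random):
        # Erdos-Renyi G(n, p): keep each of the C(n, 2) edges independently
        return [(u, v) for u in range(n) for v in range(u + 1, n)
                if rng.random() < p]

    rng = random.Random(42)  # fixed seed makes the test inputs reproducible
    print(random_permutation(8, rng))
    print(len(random_graph(10, 0.3, rng)))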
Most algorithm researchers prefer to work in Unix-style operating systems, which provide excellent tools for conducting experiments, including:
Utilities such as time and gprof for measuring elapsed and CPU times.
Shell scripts and makefiles. Shell scripting makes it easy to automate batches of tests, and makefiles make it easy to mix and match compilation units. Scripts and makefiles also create a document trail that records the history of an experimental project (a batch-driver sketch follows below).
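The same batch automation can be had from a short Python driver. A hypothetical sketch: the solver path, its arguments, and the log format are placeholders for your own test program, not a real tool's interface:

    import csv
    import subprocess
    import sys
    import time
    from datetime import datetime

    def run_batch(solver, instances, log_path="results.csv"):
        # run the solver on each instance file, time it, and append one CSV
        # row per run, so the experiment leaves a document trail
        with open(log_path, "a", newline="") as f:
            writer = csv.writer(f)
            for inst in instances:
                start = time.perf_counter()
                proc = subprocess.run([solver, inst], capture_output=True)
                elapsed = time.perf_counter() - start
                writer.writerow([datetime.now().isoformat(), solver, inst,
                                 f"{elapsed:.4f}", proc.returncode])

    if __name__ == "__main__":
        run_batch(sys.argv[1], sys.argv[2:])  # e.g. python driver.py ./solver a.in b.in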
Let k_r(n, δ) be the minimum number of r-cliques in graphs with n vertices and minimum degree at least δ. We evaluate k_r(n, δ) for δ ≤ 4n/5 and some other cases. Moreover, we give a construction which we conjecture to give all extremal graphs (subject to certain conditions on n, δ and r).
Simple families of increasing trees were introduced by Bergeron, Flajolet and Salvy. They include random binary search trees, random recursive trees and random plane-oriented recursive trees (PORTs) as important special cases. In this paper, we investigate the number of subtrees of size k on the fringe of some classes of increasing trees, namely generalized PORTs and d-ary increasing trees. We use a complex-analytic method to derive precise expansions of mean value and variance as well as a central limit theorem for fixed k. Moreover, we propose an elementary approach to derive limit laws when k is growing with n. Our results have consequences for the occurrence of pattern sizes on the fringe of increasing trees.