The problem of increasing the edge- or vertex-connectivity of a given graph up to a specified target value k by adding the smallest number of new edges is called connectivity augmentation. These problems were first studied in 1976 by Eswaran and Tarjan [64] and Plesnik [275], who showed them to be polynomially solvable for k = 2. The problems have important applications, such as the network construction problem [279], the rigidity problem in grid frameworks [13, 99], the data security problem [110, 172], and the rectangular dual graph problem in floor planning [303]. We refer the reader to the surveys [81, 241] on this subject.
In this chapter, we mainly treat the edge-connectivity augmentation problem for a given target value k. For a general k, Watanabe and Nakamura [308] established in 1987 a min-max theorem, based on which they gave an O(k²(kn + m)n⁴) time algorithm. Afterward, Frank [78] gave a unified approach to various edge-connectivity augmentation problems by making use of the edge-splitting theorems of Lovász [200, 202] and Mader [206, 208]. Then Nagamochi and Ibaraki [236] proposed an O((nm + n² log n) log n) time algorithm by combining the minimum-cut algorithm in Section 3.2 and the approach of Frank. If the graph under consideration is weighted by real numbers, this algorithm can be further simplified and can be extended to solve the edge-connectivity augmentation problem for the entire range of target values k in O(nm + n² log n) time [238], as will be explained in Section 8.4. By using the extreme vertex sets of Section 1.5.3, Benczúr and Karger [20] gave an O(n² log⁵ n) time randomized algorithm of Monte Carlo type for optimally augmenting a multigraph.
Up to now we have described many specific data structures. There are also general methods that add capabilities or properties to a given data structure. Without any further knowledge about the structure, there is not much we can do, so in each case we need some assumptions about the operations supported by the structure or its implementation. The two well-studied problems here are how to make a static structure dynamic and how to allow queries in old states of a dynamic data structure.
Making Structures Dynamic
Several of the structures we have discussed were static structures, like the interval trees: they are built once and then allow queries, but no changes of the underlying data. To make them dynamic, we want to allow changes in the underlying data. In this generality, there is not much we can do, but with some further assumptions, there are efficient construction methods that take the static data structure as a black box, which is used to build the new dynamic structure.
The most important such class is the decomposable searching problems. Here, the underlying abstract object is some set X, and in our queries we wish to evaluate some function f(X, query), and this function has the property that for any partition X = X1 ∪ X2, the function value f(X, query) can be constructed from f(X1, query) and f(X2, query).
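Membership testing is a simple instance of a decomposable searching problem: whether x lies in X = X1 ∪ X2 is just the OR of the answers for X1 and X2. The classical way to exploit this is the Bentley–Saxe logarithmic method, sketched below in Python with illustrative names: the set is partitioned into static blocks of power-of-two sizes (here, sorted arrays standing in for the static black-box structure), and only small parts are rebuilt on insertion.

```python
import bisect

class DynamicMembership:
    """Logarithmic-method sketch for the decomposable query "is x in X?":
    the answer over X = X1 ∪ X2 is the OR of the answers over X1 and X2,
    so the set may be kept as several independent static parts."""

    def __init__(self):
        # blocks[i] is either None or a sorted list of size 2**i;
        # each sorted list plays the role of one static structure.
        self.blocks = []

    def insert(self, x):
        carry = [x]
        i = 0
        # Merge like binary addition: two blocks of size 2**i
        # combine into one block of size 2**(i+1).
        while i < len(self.blocks) and self.blocks[i] is not None:
            carry = sorted(carry + self.blocks[i])
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        self.blocks[i] = carry

    def contains(self, x):
        # Query every static part and combine the partial answers.
        for block in self.blocks:
            if block is not None:
                j = bisect.bisect_left(block, x)
                if j < len(block) and block[j] == x:
                    return True
        return False
```

Each insertion behaves like adding 1 in binary to the sequence of block sizes, so any element takes part in O(log n) rebuilds overall, and a query combines at most O(log n) partial answers.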
A search tree is a structure that stores objects, each identified by a key value, in a tree structure. The key values of the objects are from a linearly ordered set (typically integers); two keys can be compared in constant time, and these comparisons are used to guide the access to a specific object by its key. The tree has a root, where any search starts. Each node contains some key value for comparison with the query key, so one can go to a different next node depending on whether the query key is smaller or larger than the key in the node, until one reaches a node that contains the right key.
This type of tree structure is fundamental to most data structures; it allows many variations and is also a building block for most more complex data structures. For this reason we will discuss it in great detail.
Search trees are one method to implement the abstract structure called dictionary. A dictionary is a structure that stores objects, identified by keys, and supports the operations find, insert, and delete. A search tree usually supports at least these operations of a dictionary, but there are also other ways to implement a dictionary, and there are applications of search trees that are not primarily dictionaries.
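As a concrete illustration, here is a minimal unbalanced binary search tree in Python supporting the dictionary operations find and insert; this is only a sketch with made-up names, not the text's own code, and delete is omitted for brevity.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None

class SearchTree:
    """Minimal unbalanced binary search tree implementing the
    dictionary operations find and insert (delete omitted)."""

    def __init__(self):
        self.root = None

    def find(self, key):
        node = self.root
        while node is not None:
            if key == node.key:
                return node.value
            # Comparisons with the node's key guide the descent.
            node = node.left if key < node.key else node.right
        return None

    def insert(self, key, value):
        if self.root is None:
            self.root = Node(key, value)
            return
        node = self.root
        while True:
            if key == node.key:
                node.value = value   # overwrite an existing key
                return
            if key < node.key:
                if node.left is None:
                    node.left = Node(key, value)
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key, value)
                    return
                node = node.right
```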
Two Models of Search Trees
In the outline just given, we suppressed an important point that at first seems trivial but in fact leads to two different models of search trees. Either model can be combined with much of the following material, but one of them is strongly preferable.
The concept of extreme vertex sets, defined in Section 1.5.3, was first introduced by Watanabe and Nakamura [308] to solve the edge-connectivity augmentation problem. The fastest deterministic algorithm currently known for computing all extreme vertex sets was given by Naor, Gusfield, and Martel [259]. Their algorithm first computes the Gomory–Hu cut tree of the graph and then finds all maximal k-edge-connected components for some k, from which all extreme vertex sets are identified by Lemma 1.42, taking O(n · nm log(n²/m)) running time. Benczúr and Karger [20] have given a Monte Carlo–type randomized algorithm, which runs in O(n² log⁵ n) time but is rather involved. Notice that computing all extreme vertex sets is not easier than finding a minimum cut, since at least one of the extreme vertex sets is a minimum cut.
In this chapter, we give a simple and efficient algorithm for computing all extreme vertex sets in a given graph, and we show some applications of extreme vertex sets. The algorithm will be used to solve the edge-connectivity augmentation problem in Section 8.3. In Section 6.1, we design a deterministic O(nm + n² log n) time algorithm for computing the family χ(G) of extreme vertex sets in a given edge-weighted graph G, which is a laminar family, as observed in Lemma 1.41. As a new application of extreme vertex sets, in Section 6.2 we consider a dynamic graph G in which the weight of edges incident to a designated vertex may increase or decrease with time, and we give a dynamic minimum cut algorithm that reports a minimum cut of the current G whenever the weight of an edge is updated.
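For intuition, the defining property can be checked by brute force on tiny graphs: a nonempty proper subset X of vertices is extreme if d(Y) > d(X) for every nonempty proper subset Y of X. The Python sketch below (exponential time, for illustration only; the example graph in the test, two triangles joined by a bridge, is our own and not from the text) enumerates all extreme sets this way.

```python
from itertools import combinations

def cut_weight(edges, S):
    """Total weight d(S) of edges with exactly one endpoint in S."""
    return sum(w for (u, v, w) in edges if (u in S) != (v in S))

def extreme_sets(vertices, edges):
    """Brute force over all nonempty proper subsets X: X is extreme
    if every nonempty proper subset Y of X has d(Y) > d(X).
    Exponential time; intended only for tiny illustration graphs."""
    result = []
    vs = list(vertices)
    for r in range(1, len(vs)):
        for X in combinations(vs, r):
            dX = cut_weight(edges, set(X))
            if all(cut_weight(edges, set(Y)) > dX
                   for q in range(1, len(X))
                   for Y in combinations(X, q)):
                result.append(frozenset(X))
    return result
```

On two unit-weight triangles {0,1,2} and {3,4,5} joined by the bridge 2–3, the extreme family consists of the six singletons plus the two triangle vertex sets (each with d = 1), and one can verify directly that the family is laminar.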
This is a textbook for a second course on formal languages and automata theory.
Many undergraduates in computer science take a course entitled “Introduction to Theory of Computing,” in which they learn the basics of finite automata, pushdown automata, context-free grammars, and Turing machines. However, few students pursue advanced topics in these areas, in part because there is no really satisfactory textbook.
For almost 20 years I have been teaching such a second course for fourth-year undergraduate majors and graduate students in computer science at the University of Waterloo: CS 462/662, entitled “Formal Languages and Parsing.” For many years we used Hopcroft and Ullman's Introduction to Automata Theory, Languages, and Computation as the course text, a book that has proved very influential. (The reader will not have to look far to see its influence on the present book.)
In 2001, however, Hopcroft and Ullman released a second edition of their text that, in the words of one professor, “removed all the good parts.” In other words, their second edition is geared toward second- and third-year students, and omits nearly all the advanced topics suitable for fourth-year students and beginning graduate students.
Because the first edition of Hopcroft and Ullman's book is no longer easily available, and because I have been regularly supplementing their book with my own handwritten course notes, it occurred to me that it was a good time to write a textbook on advanced topics in formal languages.
In this chapter we investigate methods for parsing and recognition in context-free grammars (CFGs). Both problems have significant practical applications. Parsing, for example, is an essential feature of a compiler, which translates from one computer language (the “source”) to another (the “target”). Typically, the source is a high-level language, while the target is machine language.
The first compilers were built in the early 1950s. Computing pioneer Grace Murray Hopper built one at Remington Rand during 1951–1952. At that time, constructing a compiler was a black art that was very time consuming. When John Backus led the project that produced a FORTRAN compiler in 1955–1957, it took 18 person-years to complete.
Today, modern parser generators, such as Yacc (which stands for “yet another compiler-compiler”) and Bison, allow a single person to construct a compiler in a few hours or days. These tools are based on LALR(1) parsing, a variant of one of the parsing methods we will discuss here. Parsing is also a feature of natural language recognition systems.
In Section 5.1 we will see how to accomplish parsing in an arbitrary CFG in polynomial time. More precisely, if the grammar G is in Chomsky normal form, we can parse an arbitrary string w ∈ L(G) of length n in O(n³) time. While a running time of O(n³) is often considered tractable in computer science, as programs get bigger and bigger, it becomes more and more essential that parsing be performed in linear time.
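The O(n³) bound is achieved by the classical CYK dynamic-programming algorithm for grammars in Chomsky normal form. Here is a minimal Python sketch; the rule encoding and the example grammar for {aⁿbⁿ : n ≥ 1} in the usage below are our own illustrative choices, not taken from the text.

```python
def cyk(word, start, unit_rules, pair_rules):
    """CYK recognition for a grammar in Chomsky normal form.
    unit_rules: dict terminal a  -> set of variables A with A -> a.
    pair_rules: dict pair (B, C) -> set of variables A with A -> BC.
    Runs in O(n^3) table updates for a word of length n."""
    n = len(word)
    if n == 0:
        return False  # CNF as used here derives no empty string
    # table[(i, j)]: set of variables deriving the substring word[i:j]
    table = {}
    for i, a in enumerate(word):
        table[(i, i + 1)] = set(unit_rules.get(a, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = set()
            for k in range(i + 1, j):          # try every split point
                for B in table[(i, k)]:
                    for C in table[(k, j)]:
                        cell |= pair_rules.get((B, C), set())
            table[(i, j)] = cell
    return start in table[(0, n)]
```

For example, with unit_rules = {'a': {'A'}, 'b': {'B'}} and pair_rules = {('A','B'): {'S'}, ('A','T'): {'S'}, ('S','B'): {'T'}}, the start variable 'S' derives exactly the strings aⁿbⁿ with n ≥ 1.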
Hash tables are a dictionary structure of great practical importance and can be very efficient. The underlying idea is quite simple: we have a universe U and want to store a set of objects with keys from U. We also have s buckets and a function h from U to S = {0, …, s − 1}. Then we store the object with key u in the h(u)th bucket. If several objects that we want to store are mapped to the same bucket, we have a collision between these objects. If there are no collisions, then we can realize the buckets just as an array, each array entry having space for one object. The theory of hash tables mainly deals with the questions of what to do about the collisions and how to choose the function h in such a way that the number of collisions is small.
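A minimal sketch of the simplest collision strategy, chaining, where each bucket holds a list of the colliding entries; the class and parameter names are illustrative, and Python's built-in hash stands in for the function h.

```python
class ChainedHashTable:
    """Hash table with s buckets, resolving collisions by chaining:
    each bucket holds a list of (key, object) pairs."""

    def __init__(self, s=1024):
        self.s = s
        self.buckets = [[] for _ in range(s)]

    def _h(self, key):
        # The hash function maps the universe U into {0, ..., s-1}.
        return hash(key) % self.s

    def insert(self, key, obj):
        bucket = self.buckets[self._h(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, obj)   # replace the existing entry
                return
        bucket.append((key, obj))        # collision: the chain grows

    def find(self, key):
        for k, obj in self.buckets[self._h(key)]:
            if k == key:
                return obj
        return None
```

If there are no collisions, every chain has length at most one and find costs a single probe; with s much smaller than the number of stored keys, the chains grow and search degrades toward a linear scan, which is why the choice of s and h matters.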
The idea of hash tables is quite old, apparently starting in several groups at IBM in 1953 (Knott 1972). For a long time the main reason for the popularity of hash tables was the simple implementation; the hash functions h were chosen ad hoc as some unintelligible function mapping the large universe to the small array allocated for the table. It was the practical programmer's dictionary structure of choice, easily written and conceptually understood, with no performance guarantees, and it still exists in this style in many texts aimed at that group.
In this chapter, we investigate structures and algorithms for cactus representations, which were introduced in Section 1.5.4 to represent all minimum cuts in an edge-weighted graph G. Throughout this chapter, we assume that λ(G) > 0 for a given graph G, which implies that G is connected. Let C(G) denote the set of all minimum cuts in G. In Section 5.1, we define a canonical form of cactus representations. In Section 5.2, we show that the subset of C(G) consisting of the minimum cuts separating two given vertices s and t can be represented by a simple cactus structure. In Section 5.3, we design an O(mn + n² log n) time algorithm for constructing a cactus representation R of C(G).
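For intuition about C(G), on small graphs all minimum cuts can be enumerated by brute force over bipartitions, as in the Python sketch below (exponential time, illustration only; the function name and the cycle example are ours). For the 4-cycle with unit weights it finds λ = 2 and all C(4,2) = 6 minimum cuts, one for each pair of cycle edges, far more than a single tree-like structure could encode, which is what motivates the cactus.

```python
from itertools import combinations

def minimum_cuts(vertices, edges):
    """Enumerate all minimum cuts of a small graph by brute force:
    try every bipartition (S, V - S) and keep those of minimum weight.
    Exponential time; intended only for tiny illustration graphs."""
    vs = list(vertices)
    best, cuts = float('inf'), []
    # Keep vs[0] on one fixed side so each cut is generated once.
    for r in range(len(vs)):
        for rest in combinations(vs[1:], r):
            S = frozenset((vs[0],) + rest)
            if len(S) == len(vs):
                continue  # S must be a proper subset
            w = sum(wt for (u, v, wt) in edges if (u in S) != (v in S))
            if w < best:
                best, cuts = w, [S]
            elif w == best:
                cuts.append(S)
    return best, cuts
```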
Canonical Forms of Cactus Representations
In this section, we discuss cactus representations for a subset of minimum cuts, and we prove the existence of two canonical forms, which we call the cycle-type and junction-type normal cactus representations. Such a canonical representation is useful in designing an efficient algorithm that constructs a cactus representation for all the minimum cuts of a given graph [244]. It also helps to efficiently test whether two given graphs have the same “structure” with respect to their minimum cuts, which is based on a planar isomorphism algorithm due to Hopcroft and Tarjan [126].
A cactus representation for a given subset C ⊆ C(G), if one exists, may not be unique unless we impose further structural restrictions.
We study the number of subtrees on the fringe of random recursive trees and random binary search trees whose limit law is known to be either normal or Poisson or degenerate depending on the size of the subtree. We introduce a new approach to this problem which helps us to further clarify this phenomenon. More precisely, we derive optimal Berry–Esseen bounds and local limit theorems for the normal range and prove a Poisson approximation result as the subtree size tends to infinity.
We show that for each α > 0 every sufficiently large oriented graph G with δ⁺(G), δ⁻(G) ≥ 3|G|/8 + α|G| contains a Hamilton cycle. This gives an approximate solution to a problem of Thomassen [21]. In fact, we prove the stronger result that G is still Hamiltonian if δ(G) + δ⁺(G) + δ⁻(G) ≥ 3|G|/2 + α|G|. Up to the term α|G|, this confirms a conjecture of Häggkvist [10]. We also prove an Ore-type theorem for oriented graphs.