Let ${\mathcal H}$ denote a collection of subsets of $\{1,2,\ldots,n\}$, and assign independent random variables uniformly distributed over [0,1] to the n elements. Declare an element p-present if its corresponding value is at most p. In this paper, we quantify how much the observation of the r-present (r>p) set of elements affects the probability that the set of p-present elements is contained in ${\mathcal H}$. In the context of percolation, we find that this question is closely linked to the near-critical regime. As a consequence, we show that for every r>1/2, bond percolation on the subgraph of the square lattice given by the set of r-present edges is almost surely noise sensitive at criticality, thus generalizing a result due to Benjamini, Kalai and Schramm.
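The p-present construction above can be sketched in a few lines of Python. This is an illustrative simulation only (the function name and parameters are ours, not the paper's), but it makes the monotone coupling between the p-present and r-present sets explicit:

```python
import random

def present_sets(n, p, r, seed=0):
    """Assign one uniform [0,1] value per element; an element is q-present if its value <= q."""
    rng = random.Random(seed)
    values = [rng.random() for _ in range(n)]
    p_present = {i for i, v in enumerate(values) if v <= p}
    r_present = {i for i, v in enumerate(values) if v <= r}
    return p_present, r_present

p_set, r_set = present_sets(10, 0.3, 0.6)
# With r > p the two sets are coupled monotonically:
# every p-present element is automatically r-present.
assert p_set <= r_set
```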
For d ≥ 2, let Hd(n,p) denote a random d-uniform hypergraph with n vertices in which each of the $\binom{n}{d}$ possible edges is present with probability p=p(n) independently, and let Hd(n,m) denote a uniformly distributed d-uniform hypergraph with n vertices and m edges. Let either H=Hd(n,m) or H=Hd(n,p), where m/n and $\binom{n-1}{d-1}p$ need to be bounded away from $(d-1)^{-1}$ and 0 respectively. We determine the asymptotic probability that H is connected. This yields the asymptotic number of connected d-uniform hypergraphs with given numbers of vertices and edges. We also derive a local limit theorem for the number of edges in Hd(n,p), conditioned on Hd(n,p) being connected.
Let Hd(n,p) signify a random d-uniform hypergraph with n vertices in which each of the $\binom{n}{d}$ possible edges is present with probability p=p(n) independently, and let Hd(n,m) denote a uniformly distributed d-uniform hypergraph with n vertices and m edges. We derive local limit theorems for the joint distribution of the number of vertices and the number of edges in the largest component of Hd(n,p) and Hd(n,m) in the regime $(d-1)\binom{n-1}{d-1}p>1+\varepsilon$, resp. $d(d-1)m/n>1+\varepsilon$, where $\varepsilon>0$ is arbitrarily small but fixed as n → ∞. The proofs are based on a purely probabilistic approach.
This chapter reduces the inference problem in probabilistic graphical models to an equivalent maximum weight stable set problem on a graph. We discuss methods for recognizing when the latter problem can be solved efficiently by appealing to perfect graph theory. Furthermore, practical solvers based on convex programming and message-passing are presented.
Tractability is the study of computational tasks with the goal of identifying which problem classes are tractable or, in other words, efficiently solvable. The class of tractable problems is traditionally assumed to be solvable in polynomial time by a deterministic Turing machine and is denoted by P. The class contains many natural tasks such as sorting a set of numbers, linear programming (the decision version), determining if a number is prime, and finding a maximum weight matching. Many interesting problems, however, lie in another class that generalizes P and is known as NP: the class of languages decidable in polynomial time on a non-deterministic Turing machine. We trivially have that P is a subset of NP (many researchers also believe that it is a strict subset). It is believed that many problems in the class NP are, in the worst case, intractable and do not admit efficient inference. Problems such as maximum stable set, the traveling salesman problem and graph coloring are known to be NP-hard (at least as hard as the hardest problems in NP). It is, therefore, widely suspected that there are no polynomial-time algorithms for NP-hard problems.
This chapter covers methods for identifying islands of tractability for NP-hard combinatorial problems by exploiting suitable properties of their graphical structure. Acyclic structures are considered, as well as nearly-acyclic ones identified by means of so-called structural decomposition methods. In particular, the chapter focuses on the tree decomposition method, which is the most powerful decomposition method for graphs, and on the hypertree decomposition method, which is its natural counterpart for hypergraphs. These problem-decomposition methods give rise to corresponding notions of width of an instance, namely, treewidth and hypertree width. It turns out that many NP-hard problems can be solved efficiently over classes of instances of bounded treewidth or hypertree width: deciding whether a solution exists, computing a solution, and even computing an optimal solution (if some cost function over solutions is specified) are all polynomial-time tasks. Example applications include problems from artificial intelligence, databases, game theory, and combinatorial auctions.
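As a small illustration of why bounded treewidth helps: maximum independent set is NP-hard in general, but on trees (treewidth 1) it is a linear-time dynamic program over the decomposition. The sketch below is our own, not taken from the chapter:

```python
def max_independent_set_tree(adj, root=0):
    """DP over a tree: for each vertex, the best independent set size
    in its subtree, with and without taking the vertex itself."""
    def dp(v, parent):
        take, skip = 1, 0
        for u in adj[v]:
            if u == parent:
                continue
            t, s = dp(u, v)
            take += s          # if v is taken, its children must be skipped
            skip += max(t, s)  # otherwise pick the better option per child
        return take, skip
    return max(dp(root, None))

# A path on 5 vertices 0-1-2-3-4: the optimum is {0, 2, 4}, of size 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
assert max_independent_set_tree(path) == 3
```

The same take/skip table, indexed by the bags of a tree decomposition instead of single vertices, is what yields polynomial-time algorithms for bounded-treewidth instances.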
Many NP-hard problems in different areas such as AI [42], Database Systems [6, 81], Game Theory [45, 31, 20], and Network Design [34] are known to be efficiently solvable when restricted to instances whose underlying structures can be modeled via acyclic graphs or acyclic hypergraphs. For such restricted classes of instances, solutions can usually be computed via dynamic programming. In practice, however, the (graphical) structures arising from real applications are in most relevant cases not properly acyclic. Yet they are often not very intricate and exhibit a rather limited degree of cyclicity, which suffices to retain most of the nice properties of acyclic instances.
Machine learning and data analysis have driven explosive growth of interest in the methods of large-scale optimization. Many commonly used techniques, such as stochastic gradient methods, date back several decades, but owing to their practical success they have gained great importance in machine learning. Before interior-point methods came to dominate the field of optimization, first-order methods had already been studied and theoretically analyzed in substantial detail, but interest in these techniques skyrocketed after the prolific rise of applications in machine learning, signal processing, and related areas. This chapter is a brief introduction to this vast and flourishing area of large-scale optimization.
Introduction
Machine Learning (ML) broadly encompasses a variety of adaptive, autonomous, and intelligent tasks where one must “learn” to predict from observations and feedback. Throughout its evolution, ML has drawn heavily and successfully on optimization algorithms; this relation to optimization is not surprising as “learning” and “adapting” ultimately involve problems where some quality function must be optimized.
But the interaction between ML and optimization is now undergoing rapid change. The increased size, complexity, and variety of ML problems not only prompt a refinement of existing optimization techniques, but also spur the development of new methods tuned to the specific needs of ML applications.
In particular, ML applications must usually cope with large-scale data, which forces us to prefer “simpler,” perhaps less accurate but more scalable algorithms. Such methods can also crunch through more data, and may actually be better suited for learning – for a more precise characterization see [11]. The use of possibly less accurate methods is also grounded in pragmatic concerns: modeling limitations, observational noise, uncertainty, and computational errors are pervasive in real data.
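As a minimal sketch of the "simpler but more scalable" methods discussed above, here is one-dimensional stochastic gradient descent for least squares. All names and constants are illustrative, not taken from the chapter:

```python
import random

def sgd_linear(data, lr=0.05, epochs=200, seed=0):
    """Minimise the mean squared error (w*x - y)^2 using one-sample
    stochastic gradients instead of full-batch gradients."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        x, y = rng.choice(data)          # a single random sample per step
        grad = 2 * (w * x - y) * x       # gradient of the single-sample loss
        w -= lr * grad
    return w

# Noise-free data generated by y = 3x: SGD should approach w = 3.
data = [(x, 3.0 * x) for x in [-2.0, -1.0, 0.5, 1.0, 2.0]]
w = sgd_linear(data)
assert abs(w - 3.0) < 0.1
```

Each step touches only one data point, which is exactly what makes such methods attractive when the dataset is too large for full-gradient computations.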
Optimization problems are often hard to solve exactly. However, solutions that are only nearly optimal are often good enough in practical applications, and approximation algorithms can find such solutions efficiently for many interesting problems. Profound theoretical results additionally help us understand which problems are approximable. This chapter gives an overview of existing approximation techniques, organized into five broad categories: greedy algorithms, linear and semi-definite programming relaxations, metric embeddings, and special techniques. It concludes with an overview of the main inapproximability results.
Introduction
NP-hard optimization problems are ubiquitous, and unless P=NP, we cannot expect algorithms that find optimal solutions on all instances in polynomial time. This intractability thus forces us to relax one of these three requirements: optimality, generality over all instances, or polynomial running time. Approximation algorithms relax the optimality requirement, and aim to do so by as small an amount as possible. We shall concern ourselves with discrete optimization problems, where the goal is to find, amongst the set of feasible solutions, the one that minimizes (or maximizes) the value of the objective function. Usually, the space of feasible solutions is defined implicitly, e.g. the set of cuts in a graph on n vertices. The objective function associates with each feasible solution a real value; this usually has a succinct representation as well, e.g. the number of edges in the cut. We measure the performance of an approximation algorithm on a given instance by the ratio of the value of the solution output by the algorithm to that of the optimal solution.
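A classic concrete example of such a guaranteed ratio is the maximal-matching algorithm for minimum vertex cover, which always returns a cover at most twice the optimal size. The sketch below is illustrative:

```python
def vertex_cover_2approx(edges):
    """Take both endpoints of a greedily built maximal matching.
    Any cover must contain one endpoint of each matched edge,
    so the result has at most twice the optimal size."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))   # edge not yet covered: add both endpoints
    return cover

# A star K_{1,3}: the optimum cover is {0}, of size 1.
edges = [(0, 1), (0, 2), (0, 3)]
cover = vertex_cover_2approx(edges)
assert all(u in cover or v in cover for u, v in edges)   # it is a cover
assert len(cover) <= 2 * 1                               # within factor 2 of optimal
```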
In this chapter we will introduce submodularity and some of its generalizations, illustrate how it arises in various applications, and discuss algorithms for optimizing submodular functions.
Submodularity is a property of set functions with deep theoretical consequences and far-reaching applications. At first glance it seems very similar to concavity; in other ways it resembles convexity. It appears in a wide variety of applications: in Computer Science it has recently been identified and utilized in domains such as viral marketing [39], information gathering [44], image segmentation [10, 40, 36], document summarization [56], and speeding up satisfiability solvers [73]. Our emphasis in this chapter is on maximization; there are many important results and applications related to minimizing submodular functions that we do not cover.
As a concrete running example, we will consider the problem of deploying sensors in a drinking water distribution network (see Figure 3.1) in order to detect contamination. In this domain, we may have a model of how contaminants, accidentally or maliciously introduced into the network, spread over time. Such a model then allows us to quantify the benefit f(A) of deploying sensors at a particular set A of locations (junctions or pipes in the network) in terms of the detection performance (such as the average time to detection).
Based on this notion of utility, we then wish to find an optimal subset A ⊆ V of locations maximizing the utility, $\max_A f(A)$, subject to some constraints (such as bounded cost). This application requires solving a difficult real-world optimization problem that can be handled with the techniques discussed in this chapter. (Krause et al. [49] show in detail how submodular optimization can be applied in this domain.)
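The workhorse for such problems is the greedy algorithm, which repeatedly adds the element with the largest marginal gain and achieves a (1 − 1/e)-approximation for monotone submodular maximization under a cardinality constraint. A toy sketch, with a hypothetical detection model standing in for the water-network simulator:

```python
def greedy_max_coverage(sets, k):
    """Greedy maximisation of the (monotone submodular) coverage function
    under a cardinality constraint: pick k sets, each time the one with
    the largest marginal gain over what is already covered."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(sets, key=lambda name: len(sets[name] - covered))
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Hypothetical sensor locations and the contamination events each would detect.
detects = {'a': {1, 2, 3}, 'b': {3, 4}, 'c': {4, 5, 6, 7}}
chosen, covered = greedy_max_coverage(detects, k=2)
assert covered == {1, 2, 3, 4, 5, 6, 7}   # greedy picks 'c' then 'a'
```

Submodularity is precisely the diminishing-returns property that makes this myopic strategy provably good.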
This chapter discusses recent advances in modern coding theory, in particular the use of popular graph-based codes and their low complexity decoding algorithms. We describe absorbing sets as the key object for characterizing the performance of iteratively-decoded graph-based codes and we propose several directions for future investigation in this thriving discipline.
Chapter Overview
Every engineered communication system, ranging from satellite communications to hard disk drives to Ethernet, must operate under noisy conditions. The key to reliable communication and storage is to add an appropriate amount of redundancy. The field of channel coding is concerned with constructing channel codes and their decoding algorithms: controlled redundancy is introduced into a message prior to its transmission over a noisy channel (the encoding step), and this redundancy is removed from the received noisy string to unveil the intended message (the decoding step). The encoded message is referred to as the codeword. The collection of all codewords is a channel code. Assuming all the messages have the same length, and all the codewords have the same length, the ratio of message length to codeword length is the code rate. To make coding systems implementable in practice, channel codes must provide the best possible protection against noise while their decoding algorithms must be of acceptable complexity.
There is a clear tension between these dual goals: if a channel code protects a fairly long encoded message with relatively few but carefully derived redundancy bits (necessary for high performance), the optimal, maximum-likelihood decoding algorithm has exponential complexity.
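The simplest illustration of encoding, decoding, and rate is the repetition code. It is far weaker than the graph-based codes this chapter studies, but it makes the terminology concrete; the sketch below is ours:

```python
def encode(bits, n=3):
    """Rate-1/n repetition code: each message bit is transmitted n times."""
    return [b for b in bits for _ in range(n)]

def decode(received, n=3):
    """Majority vote per block of n: the maximum-likelihood decoder
    for a binary symmetric channel with crossover probability < 1/2."""
    return [int(sum(received[i:i + n]) > n // 2)
            for i in range(0, len(received), n)]

msg = [1, 0, 1]
word = encode(msg)        # code rate = len(msg) / len(word) = 1/3
noisy = word[:]
noisy[1] ^= 1             # the channel flips one bit in the first block
assert decode(noisy) == msg
```

Here decoding is trivial only because the code has very low rate; the chapter's subject is codes that keep the rate high while still admitting low-complexity (iterative) decoding.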
Boolean Satisfiability (SAT) can be considered a success story of Computer Science. Since the mid-90s, SAT has evolved from a decision problem of mainly theoretical interest to a problem with key practical benefits, finding a wide range of practical applications. From the early 60s until the mid 90s, SAT solvers were only able to solve small instances with a few tens of variables and hundreds of clauses. In contrast, modern SAT solvers are able to solve practical instances with hundreds of thousands of variables and millions of clauses. This chapter describes the techniques that are implemented in SAT solvers, aiming to explain why SAT solvers work (so well) in practice. These techniques range from efficient search techniques to dedicated data structures, among others. Whereas some techniques are commonly implemented in modern SAT solvers, others are more specific in the sense that only some instances benefit from their implementation. Furthermore, a tentative glimpse of the future is presented.
Introduction
Boolean Satisfiability (SAT) is an NP-complete decision problem [14]; indeed, SAT was the first problem to be shown NP-complete. There are no known polynomial-time algorithms for SAT. Moreover, it is believed that any algorithm that solves SAT is exponential in the number of variables in the worst case.
Although SAT is in theory an NP-complete problem, in practice it can be seen as a success story of Computer Science. There have been remarkable improvements since the mid 90s, namely clause learning and unique implication points (UIPs) [43], search restarts [15, 26], lazy data structures [48], adaptive branching heuristics [48], clause minimization [59] and preprocessing [18].
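At the core of these solvers is the DPLL procedure: unit propagation interleaved with branching. A minimal sketch (without the clause learning, restarts, or lazy data structures listed above), using our own convention of literals as signed integers:

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL: unit propagation plus chronological branching.
    Literals are non-zero ints; -v means variable v is false."""
    assignment = dict(assignment or {})
    while True:
        simplified, unit = [], None
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                       # clause already satisfied
            rest = [l for l in clause if abs(l) not in assignment]
            if not rest:
                return None                    # conflict: clause falsified
            if len(rest) == 1:
                unit = rest[0]                 # unit clause found
            simplified.append(rest)
        clauses = simplified
        if not clauses:
            return assignment                  # all clauses satisfied
        if unit is None:
            break                              # no more propagation possible
        assignment[abs(unit)] = unit > 0       # unit propagation
    v = abs(clauses[0][0])                     # branch on an unassigned variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, v: value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x2) and (not x2 or x3): forces x2 and x3 true.
model = dpll([[1, 2], [-1, 2], [-2, 3]])
assert model[2] and model[3]
```

Modern CDCL solvers replace the chronological backtracking here with conflict analysis, learned clauses, and non-chronological backjumping.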
In this chapter, we will survey recent results on the broad family of optimisation problems that can be cast as valued constraint satisfaction problems (VCSPs). We discuss general methods for analysing the complexity of such problems, and give examples of tractable cases.
Introduction
Computational problems from many different areas involve finding values for variables that satisfy certain specified restrictions and optimise certain specified criteria.
In this chapter, we will show that it is useful to abstract the general form of such problems to obtain a single generic framework. Bringing all such problems into a common framework draws attention to common aspects that they all share, and allows very general analytical approaches to be developed. We will survey some of these approaches, and the results that have been obtained by using them.
The generic framework we shall use is the valued constraint satisfaction problem (VCSP), defined formally in Section 4.3. We will show that many combinatorial optimisation problems can be conveniently expressed in this framework, and we will focus on finding restrictions to the general problem which are sufficient to ensure tractability.
An important and well-studied special case of the VCSP is the constraint satisfaction problem (CSP), which deals with combinatorial search problems which have no optimisation criteria. We give a brief introduction to the CSP in Section 4.2, before defining the more general VCSP framework in Section 4.3. Section 4.4 then presents a number of examples of problems that can be seen as special cases of the VCSP.
The remainder of the chapter discusses what happens to the complexity of the valued constraint satisfaction problem when we restrict it in various ways.
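To make the framework concrete before any restrictions are imposed, a brute-force VCSP solver fits in a few lines. The MAX-CUT encoding below is our own illustrative example, not the chapter's, and exhaustive search is of course exponential in general:

```python
from itertools import product

def solve_vcsp(variables, domain, constraints):
    """Exhaustive VCSP solver: minimise the sum of the constraint cost
    functions. Each constraint is (scope, cost_fn); an infinite cost
    would encode a hard (crisp) constraint."""
    best, best_cost = None, float('inf')
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        cost = sum(fn(*(assignment[v] for v in scope))
                   for scope, fn in constraints)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost

# MAX-CUT on a triangle as a VCSP: cost 1 for each edge left uncut.
edges = [('x', 'y'), ('y', 'z'), ('x', 'z')]
constraints = [((u, v), lambda a, b: 0 if a != b else 1) for u, v in edges]
best, cost = solve_vcsp(['x', 'y', 'z'], [0, 1], constraints)
assert cost == 1   # a triangle cannot be fully cut: one edge always remains
```

The tractability results surveyed in the chapter identify restrictions on the allowed cost functions under which this exponential search can be replaced by polynomial-time algorithms.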
Satisfiability Modulo Theories (SMT) extends Propositional Satisfiability with logical theories that allow us to express relations over various types of variables, such as arithmetic constraints, or equalities over uninterpreted functions. SMT solvers are widely used in areas such as software verification, where they are able to solve, surprisingly efficiently, some problems that appear hard, if not undecidable. This chapter presents a general introduction to SMT solving. It then focuses on one important theory, equality, and gives both a detailed understanding of how it is solved and a theoretical justification of why the procedure is practically effective.
Introduction
Our starting point is research and experiences in the context of the state-of-the-art SMT solver Z3 [13], developed by the authors at Microsoft Research. We first cover a selection of the main challenges and techniques for making SMT solving practical, integrating algorithms for tractable subproblems, and pragmatics and heuristics used in practice. We then take a proof-theoretical perspective on the power and scope of the engines used by SMT solvers. Most modern SMT solvers are built around a tight integration with efficient SAT solving. The framework is commonly referred to as DPLL(T), where T refers to a theory or a combination of theories. The theoretical result we present compares DPLL(T) with unrestricted resolution. A straightforward adaptation of DPLL(T) provides a weaker proof system than unrestricted resolution, and we investigate an extension we call Conflict Directed Theory Resolution as a candidate method for bridging this gap. Our results apply to the case where T is the theory of equality.
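The core of a decision procedure for the (function-free fragment of the) theory of equality is union-find: merge the equivalence classes forced by the asserted equalities, then check each disequality against the resulting partition. A minimal sketch, not Z3's implementation:

```python
class UnionFind:
    """Union-find with path halving: the backbone of decision
    procedures for the theory of equality."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def equality_sat(equalities, disequalities):
    """Decide a conjunction of equalities and disequalities over constants:
    satisfiable iff no disequality connects two merged terms."""
    uf = UnionFind()
    for a, b in equalities:
        uf.union(a, b)
    return all(uf.find(a) != uf.find(b) for a, b in disequalities)

# a = b and b = c is inconsistent with a != c; dropping b = c restores it.
assert not equality_sat([('a', 'b'), ('b', 'c')], [('a', 'c')])
assert equality_sat([('a', 'b')], [('a', 'c')])
```

Handling uninterpreted function symbols additionally requires congruence closure (if x = y then f(x) = f(y)), which extends this union-find with a congruence-propagation loop.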