Let $P_1, \ldots, P_k$ be $k$ vertex-disjoint paths in a graph $G$, where the ends of $P_i$ are $x_i$ and $y_i$. Let $H$ be the subgraph induced by the vertex sets of the paths. We find edge bounds $E_1(n)$, $E_2(n)$ such that:
(1) if $e(H) \geq E_1(|V(H)|)$, then there exist disjoint paths $P_1', \ldots, P_k'$, where the ends of $P_i'$ are $x_i$ and $y_i$, such that $|\bigcup_i V(P_i)| > |\bigcup_i V(P_i')|$;
(2) if $e(H) \geq E_2(|V(H)|)$, then there exist disjoint paths $P_1', \ldots, P_k'$, where the ends of $P_i'$ are $x_i'$ and $y_i'$, such that $|\bigcup_i V(P_i)| > |\bigcup_i V(P_i')|$, $\{ x_1, \ldots, x_k \} = \{ x_1', \ldots, x_k' \}$ and $\{ y_1, \ldots, y_k \} = \{ y_1', \ldots, y_k' \}$.
The bounds are best possible, in that there exist arbitrarily large graphs $H'$ with $e(H') = E_i(|V(H')|) - 1$ that lack the properties stipulated in (1) and (2).
High-performance computation was first realized in the form of SIMD parallelism with the introduction of the Cray and Cyber computers. At first these were single-processor machines, but starting with the Cray X-MP series, multiprocessor vector computers gained the further advantage of MIMD parallelism. Today, vector processing can be incorporated into the architecture of the CPU chip itself, as was the case with the AltiVec unit used in the Macintosh.
The UNIX operating system introduced a design for shared memory MIMD parallel programming. The components of the system included multitasking, time slicing, semaphores, and the fork function. If the computer itself had only one CPU, then parallel execution was only apparent, so-called concurrent execution; nevertheless, the C programming language allowed the creation of parallel code. Later, multiprocessor machines came online, and these parallel codes executed in true parallel.
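As a minimal sketch of the fork model (assuming a POSIX system; the division of work shown is only a placeholder), a C program can split itself into a parent and a child process that then run concurrently, or in true parallel on a multiprocessor:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* duplicate the current process */
        if (pid < 0) {                   /* fork failed */
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {                  /* child branch */
            printf("child:  doing half of the work\n");
            _exit(EXIT_SUCCESS);
        }
        printf("parent: doing the other half\n");   /* parent branch */
        wait(NULL);                      /* wait for the child to finish */
        return EXIT_SUCCESS;
    }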
Although these tools continue to be supported by operating systems today, the fork model of parallel programming proved too “expensive” in terms of startup time, memory usage, context switching, and overhead. Threads arose in the search for a better solution and resulted in a software revolution. The threads model neatly solves most of the low-level hardware and software implementation issues, leaving the programmer free to concentrate on the essential logical or synchronization issues of a parallel program design. Today, all popular operating systems support thread-style concurrent/parallel processing.
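For concreteness, the following POSIX-threads sketch in C creates four threads and waits for them all; the worker function and its message are illustrative placeholders, not code from the text (compile with -pthread):

    #include <pthread.h>
    #include <stdio.h>

    /* each thread runs this function, here on a notional slice of the data */
    void *worker(void *arg) {
        int id = *(int *)arg;
        printf("thread %d working on its slice\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t tid[4];
        int id[4];
        for (int i = 0; i < 4; i++) {        /* create four threads */
            id[i] = i;
            pthread_create(&tid[i], NULL, worker, &id[i]);
        }
        for (int i = 0; i < 4; i++)          /* wait for all of them */
            pthread_join(tid[i], NULL);
        return 0;
    }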
In this chapter we will explore vector and parallel programming in the context of scientific and engineering numerical applications. The threads model, and indeed parallel programming in general, is most easily implemented on a shared memory multiprocessor architecture.
We analyse Jim Propp's $P$-machine, a simple deterministic process that simulates a random walk on ${\mathbb Z}^d$ to within a constant. The proof of the error bound relies on several estimates in the theory of simple random walks and some careful summing. We mention three intriguing conjectures concerning sign-changes and unimodality of functions in the linear span of $\{p(\cdot,{\bf x}) : {\bf x} \in {\mathbb Z}^d\}$, where $p(n,{\bf x})$ is the probability that a walk beginning from the origin arrives at ${\bf x}$ at time $n$.
This paper is devoted to an online variant of the minimum spanning tree problem in randomly weighted graphs. We assume that the input graph is complete and the edge weights are uniformly distributed over [0,1]. An algorithm receives the edges one by one and has to decide immediately whether to include the current edge into the spanning tree or to reject it. The corresponding edge sequence is determined by some adversary. We propose an algorithm which achieves $\mathbb{E}[ALG]/\mathbb{E}[OPT]=O(1)$ and $\mathbb{E}[ALG/OPT]=O(1)$ against a fair adaptive adversary, i.e., an adversary which determines the edge order online and is fair in the sense that it does not know more about the edge weights than the algorithm. Furthermore, we prove that no online algorithm performs better than $\mathbb{E}[ALG]/\mathbb{E}[OPT]=\Omega(\log n)$ if the adversary knows the edge weights in advance. This lower bound is tight, since there is an algorithm which yields $\mathbb{E}[ALG]/\mathbb{E}[OPT]=O(\log n)$ against the strongest-imaginable adversary.
We show that if a graph contains few copies of a given graph, then its edges are distributed rather unevenly.
In particular, for all $\varepsilon > 0$ and $r\geq2$, there exist $\xi =\xi (\varepsilon,r) > 0$ and $k=k (\varepsilon,r)$ such that, if $n$ is sufficiently large and $G=G(n)$ is a graph with fewer than $\xi n^{r}$ $r$-cliques, then there exists a partition $V(G) =\bigcup_{i=0}^{k}V_{i}$ such that \[ \vert V_{i}\vert =\lfloor n/k\rfloor \quad \text{and} \quad e(V_{i}) <\varepsilon\vert V_{i}\vert ^{2}\] for every $i\in [k]$.
We deduce the following slightly stronger form of a conjecture of Erdős.
For all $c>0$ and $r\geq3$, there exist $\xi=\xi (c,r) >0$ and $\beta=\beta(c,r)>0$ such that, if $n$ is sufficiently large and $G=G(n,\lceil cn^{2} \rceil)$ is a graph with fewer than $\xi n^{r}$ $r$-cliques, then there exists a partition $V(G) =V_{1}\cup V_{2}$ with $\vert V_{1} \vert = \lfloor n/2 \rfloor$ and $\vert V_{2} \vert = \lceil n/2 \rceil$ such that \[ e(V_{1},V_{2}) > (1/2+\beta) e (G).\]
Consider the problem of searching for the extremal values of an objective function f defined on a domain Ω and, equally important, for the points x ∈ Ω where these values occur. An extremal value is called an optimum (maximum or minimum), while a point where an optimum occurs is called an optimizer (maximizer or minimizer).
If the domain is a subset of Euclidean space, we will assume f is differentiable. In this case gradient descent (or ascent) methods are used to locate local minima (or maxima). Whether or not a global extremum has been found depends upon the starting point of the search. Each local minimum (maximum) has its own basin of attraction and so it becomes a matter of starting in the right basin. Thus there is an element of chance involved if globally extreme values are desired.
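As a small illustration (the function, step size, and starting point below are placeholders, not taken from the text), a gradient descent iteration in C on f(x) = (x² − 1)², which has minimizers at x = −1 and x = 1; the starting point determines which basin of attraction, and hence which minimizer, the iteration settles into:

    #include <stdio.h>

    /* f(x) = (x*x - 1)^2 has local minimizers at x = -1 and x = 1 */
    static double fprime(double x) { return 4.0 * x * (x * x - 1.0); }

    int main(void) {
        double x = 0.3;                  /* starting point selects the basin */
        double step = 0.05;              /* fixed step size */
        for (int k = 0; k < 1000; k++)
            x = x - step * fprime(x);    /* move downhill along -f'(x) */
        printf("local minimizer found near x = %f\n", x);
        return 0;
    }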
On the other hand, we allow the possibility that Ω is a discrete, and possibly large, finite set. In this case downhill/uphill directional information is nonexistent and the search is forced to make do with objective values only. As the search proceeds from one point to the next, selecting the next point to try is often best left to chance.
A search process in which the next point or next starting point to try is randomly determined and may depend on the current location is, mathematically, a finite Markov Chain. Although the full resources of that theory may be brought to bear on the problem, only general assertions will be possible without knowing the nature of the specific objective function.
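A minimal C sketch of such a randomized search, under the assumption that Ω is the finite set {0, ..., N−1} and that the objective f is supplied by the user (the f below is only a placeholder); the sequence of best-so-far points it produces is a finite Markov chain:

    #include <stdio.h>
    #include <stdlib.h>

    #define N 1000                 /* size of the discrete domain {0,...,N-1} */

    /* illustrative objective; replace with the function of interest */
    static double f(int x) { double t = x - 637.0; return t * t; }

    int main(void) {
        srand(12345);
        int best = rand() % N;                 /* random starting point */
        for (int k = 0; k < 10000; k++) {
            int trial = rand() % N;            /* next point to try, chosen by chance */
            if (f(trial) < f(best))            /* keep it only if it improves */
                best = trial;
        }
        printf("best point %d with value %f\n", best, f(best));
        return 0;
    }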
Many problems in scientific computation can be solved by reducing them to a problem in linear algebra. This turns out to be an extremely successful approach. Linear algebra problems often have a rich mathematical structure, which gives rise to a variety of highly efficient and well-optimized algorithms. Consequently, scientists frequently consider linear models or linear approximations to nonlinear models simply because the machinery for solving linear problems is so well developed.
Basic linear algebraic operations are so fundamental that many current computer architectures are designed to maximize performance of linear algebraic computations. Even the list of the top 500 fastest computers in the world (maintained at www.top500.org) uses the HPL benchmark for solving dense systems of linear equations as the main performance measure for ranking computer systems.
In 1973, Hanson, Krogh, and Lawson described the advantages of adopting a set of basic routines for problems in linear algebra. These basic linear algebra subprograms are commonly referred to as the Basic Linear Algebra Subprograms (BLAS), and they are typically divided into three hierarchical levels: level 1 BLAS consists of vector–vector operations, level 2 BLAS are matrix–vector operations, and level 3 BLAS are matrix–matrix operations. The BLAS have been standardized with an application programming interface (API). This allows hardware vendors, compiler writers, and other specialists to provide programmers with access to highly optimized kernel routines adapted to specific architectures. Profiling tools indicate that many scientific computations spend most of their time in those sections of the code that call the BLAS. Thus, even small improvements in the BLAS can yield substantial speedups.
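To give the flavor of the three levels through the standard C interface to the BLAS (CBLAS; the header name and link flags vary by vendor, so treat this as a sketch), the calls below perform a level 1 axpy (y ← αx + y), a level 2 matrix–vector product (y ← αAx + βy), and a level 3 matrix–matrix product (C ← αAB + βC):

    #include <cblas.h>

    int main(void) {
        double x[3] = {1, 2, 3}, y[3] = {4, 5, 6};
        double A[9] = {1, 0, 0,  0, 1, 0,  0, 0, 1};   /* 3 x 3, row major */
        double B[9] = {1, 2, 3,  4, 5, 6,  7, 8, 9};
        double C[9] = {0};

        /* level 1: y <- 2.0*x + y */
        cblas_daxpy(3, 2.0, x, 1, y, 1);

        /* level 2: y <- 1.0*A*x + 0.0*y */
        cblas_dgemv(CblasRowMajor, CblasNoTrans, 3, 3, 1.0, A, 3, x, 1, 0.0, y, 1);

        /* level 3: C <- 1.0*A*B + 0.0*C */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    3, 3, 3, 1.0, A, 3, B, 3, 0.0, C, 3);
        return 0;
    }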
We study negative dependence properties of a sampling process due to Srinivasan to produce distributions on level sets with given marginals. We give a simple proof that the distribution satisfies negative association. We also show that under a linear match schedule it satisfies the stronger condition of conditional negative association via a non-trivial application of the Feder–Mihail theorem. This method involves the notion of a variable of positive influence. We give some results and related counter-examples which might shed some light on its role in a theory of negative dependence.
A simple first moment argument shows that in a randomly chosen $k$-SAT formula with $m$ clauses over $n$ boolean variables, the fraction of satisfiable clauses is $1-2^{-k}+o(1)$ as $m/n\rightarrow\infty$ almost surely. In this paper, we deal with the corresponding algorithmic strong refutation problem: given a random $k$-SAT formula, can we find a certificate that the fraction of satisfiable clauses is $1-2^{-k}+o(1)$ in polynomial time? We present heuristics based on spectral techniques that in the case $k=3$ and $m\geq\ln(n)^6n^{3/2}$, and in the case $k=4$ and $m\geq Cn^2$, find such certificates almost surely. In addition, we present heuristics for bounding the independence number (resp. the chromatic number) of random $k$-uniform hypergraphs from above (resp. from below) for $k=3,4$.
The problem of solving linear systems of equations is central in scientific computation. Systems of linear equations arise in many areas of science, engineering, finance, commerce, and other disciplines. They emerge directly through mathematical models in these areas or indirectly in the numerical solution of mathematical models as for example in the solution of partial differential equations. Because of the importance of linear systems, a great deal of work has gone into methods for their solution.
A system of m equations in n unknowns may be written in matrix form as Ax = b, in which the coefficient matrix A is m × n, the unknown vector x is n-dimensional, and the right-hand side b is m-dimensional. The most important case is when the coefficient matrix is square, corresponding to the same number of unknowns as equations. The more general m × n case can be reduced to this.
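As a small illustration (not taken from the text), the two equations 2x_1 + x_2 = 5 and x_1 − 3x_2 = −1 form a system with m = n = 2 and take the matrix form
\[ A=\begin{pmatrix}2 & 1\\ 1 & -3\end{pmatrix},\qquad x=\begin{pmatrix}x_{1}\\ x_{2}\end{pmatrix},\qquad b=\begin{pmatrix}5\\ -1\end{pmatrix},\qquad Ax=b, \]
whose solution is x_1 = 2, x_2 = 1.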
There are two main methods for solving these systems, direct and iterative. If arithmetic were exact, a direct algorithm would solve the system exactly in a predetermined finite number of steps. In the reality of inexact computation, a direct method still stops in the same number of steps but accepts some level of numerical error. A major consideration with respect to direct methods is mitigating this error. Direct methods are typically used on moderately sized systems in which the coefficient matrix is dense, meaning most of its elements are nonzero. By contrast, iterative methods are typically used on very large, sparse systems. Iterative methods asymptotically converge to the solution and so run until the approximation is deemed acceptable.
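As one concrete instance of the iterative approach (a sketch only, using an assumed small diagonally dominant system, not a method the text necessarily develops), the Jacobi iteration in C repeatedly updates each unknown from the previous approximation and stops when the change is deemed acceptable:

    #include <math.h>
    #include <stdio.h>

    #define N 3

    int main(void) {
        /* small diagonally dominant system A*x = b, with exact solution (1, 1, 1) */
        double A[N][N] = {{4, 1, 1}, {1, 5, 2}, {1, 2, 6}};
        double b[N] = {6, 8, 9};
        double x[N] = {0, 0, 0}, xnew[N];

        for (int k = 0; k < 100; k++) {
            double diff = 0.0;
            for (int i = 0; i < N; i++) {        /* update each unknown from old values */
                double s = b[i];
                for (int j = 0; j < N; j++)
                    if (j != i) s -= A[i][j] * x[j];
                xnew[i] = s / A[i][i];
                diff += fabs(xnew[i] - x[i]);
            }
            for (int i = 0; i < N; i++) x[i] = xnew[i];
            if (diff < 1e-12) break;             /* stop when the change is acceptable */
        }
        printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
        return 0;
    }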
Just as a graph is an effective way to understand a function, a directed acyclic graph is an effective way to understand a parallel computation. Such a graph shows when each calculation is done, which others can be done at the same time, what prior calculations are needed for it and into what subsequent calculations it feeds.
Starting from a directed acyclic graph and given a set of processors, a schedule can be worked out. A schedule assigns each calculation to a specific processor to be done at a specified time.
From a schedule, the total time for a computation follows and, from this, we get the difficulty or complexity of the computation.
A Directed Acyclic Graph Defines a Computation
A computation can be accurately depicted by means of a directed acyclic graph (DAG), G = (N, A), consisting of a set of vertices N and a set of directed arcs A. In such a portrayal the vertices, or nodes, of the graph represent subtasks to be performed on the data, while the directed arcs indicate the flow of data from one subtask to the next. In particular, a directed arc (i, j) ∈ A from node i to j indicates that calculation j requires the result of calculation i.
The input data is shown at the top of the graph. We take the flow of data from the top to the bottom of the graph (or, less often, from left to right). This also becomes the flow of time; it follows that the graph can have no cycles. With this convention, we may omit the direction indicators on the arcs.
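As a small example (not from the text), the sum s = a + b + c + d can be computed by a DAG with three nodes, numbered from 0: node 0 computes t1 = a + b, node 1 computes t2 = c + d, and node 2 computes s = t1 + t2. The arcs (0, 2) and (1, 2) record that node 2 needs both earlier results, while nodes 0 and 1 are independent and may run at the same time. A minimal C sketch of this representation:

    #include <stdio.h>

    /* arc (from, to): node "to" requires the result of node "from" */
    typedef struct { int from, to; } Arc;

    int main(void) {
        const char *node[] = { "t1 = a + b", "t2 = c + d", "s = t1 + t2" };
        Arc arc[] = { {0, 2}, {1, 2} };      /* node 2 depends on nodes 0 and 1 */
        int narcs = 2;

        for (int j = 0; j < 3; j++) {        /* list each node with its prerequisites */
            printf("node %d: %s  needs:", j, node[j]);
            for (int k = 0; k < narcs; k++)
                if (arc[k].to == j) printf(" node %d", arc[k].from);
            printf("\n");
        }
        return 0;
    }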