As explained in more detail in Section 1.3, our next model was inspired by the popular concept of “six degrees of separation,” which is based on the notion that everyone in the world is connected to everyone else through a chain of at most six mutual acquaintances. Now an Erdös–Rényi random graph for n = 6 billion people in which each individual has an average of μ = 42.62 friends would have average pairwise distance (log n)/(log μ) = 6, but it would have very few triangles, while in social networks, if A and B are friends and A and C are friends, then it is fairly likely that B and C are also friends.
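The arithmetic behind the figure μ = 42.62 is easy to check directly; a short computation (illustrative only, not part of the original text):

```python
import math

n = 6e9     # world population in the example
mu = 42.62  # average number of friends

# In an Erdos-Renyi graph with mean degree mu, the average pairwise
# distance grows like (log n)/(log mu); with these values it is ~ 6.
d = math.log(n) / math.log(mu)
print(round(d, 3))
```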
To construct a network with small diameter and a positive density of triangles, Watts and Strogatz (1998) started from a ring lattice with n vertices and k edges per vertex, and then rewired each edge with probability p, connecting one end to a vertex chosen at random. This construction interpolates between regularity (p = 0) and disorder (p = 1). The disordered graph is not quite an Erdös–Rényi graph, since the degree of a node is the sum of a Binomial(k, 1/2) and an independent Poisson(k/2).
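The rewiring step can be sketched in a few lines of Python. This is a simplified version of the construction (it moves one endpoint of each selected edge and re-samples to avoid self-loops and repeated rewired edges), not a faithful reproduction of Watts and Strogatz's algorithm:

```python
import random

def watts_strogatz(n, k, p, seed=0):
    """Ring lattice on n vertices with k edges per vertex (k even),
    each edge rewired with probability p to a uniformly chosen endpoint.
    A sketch of the Watts-Strogatz construction, not the exact original."""
    rng = random.Random(seed)
    # Ring lattice: connect each i to its k/2 nearest neighbors on each side.
    edges = {(i, (i + j) % n) for i in range(n) for j in range(1, k // 2 + 1)}
    rewired = set()
    for (u, v) in edges:
        if rng.random() < p:
            # Move the far endpoint to a random vertex, avoiding
            # self-loops and edges we have already created.
            w = rng.randrange(n)
            while w == u or (u, w) in rewired or (w, u) in rewired:
                w = rng.randrange(n)
            rewired.add((u, w))
        else:
            rewired.add((u, v))
    return rewired

G = watts_strogatz(100, 4, 0.1)
print(len(G))
```

With p = 0 the lattice is untouched; with p = 1 every edge has one random endpoint, which is the source of the Binomial-plus-Poisson degree distribution mentioned above.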
When I was a graduate student in the mid-1960s, the mathematical theory underlying analysis of variance and regression became clear to me after I read a draft of William Kruskal's monograph on the so-called coordinate-free, or geometric, approach to these subjects. Alas, with Kruskal's demise, this excellent treatise will never be published.
From time to time during the 1970s, 80s, and early 90s, I had the good fortune to teach the coordinate-free approach to linear models, more precisely, to Model I analysis of variance and linear regression with nonrandom predictors. While doing so, I evolved my own set of lecture notes, presented here. With regard to inspiration and content, my debt to Kruskal is clear. However, my notes differ from Kruskal's in many ways. To mention just a few, my notes are intended for a one- rather than three-quarter course. The notes are aimed at statistics graduate students who are already familiar with the basic concepts of linear algebra, such as linear subspaces and linear transformations, and who have already had some exposure to the matricial formulation of the GLM, perhaps through a methodology course, and who are interested in the underlying theory. I have also included Tjur experimental designs and some of the highlights of the optimality theory for estimation and testing in linear models under the assumption of normality, feeling that the elegant setting provided by the coordinate-free approach is a natural one in which to place these jewels of mathematical statistics.
Chapter 1 will explain what this book is about. Here I will explain why I chose to write the book, how it is written, when and where the work was done, and who helped.
Why. It would make a good story if I had been inspired to write this book by an image of Paul Erdös magically appearing on a cheese quesadilla, which I later sold for thousands of dollars on eBay. However, that is not true. The three main events that led to this book were (i) the use of random graphs in the solution of a problem that was part of Nathanael Berestycki's thesis; (ii) a talk that I heard Steve Strogatz give on the CHKNS model, which inspired me to prove some rigorous results about their model; and (iii) a book review I wrote on the books by Watts and Barabási for the Notices of the American Mathematical Society.
The subject of this book was attractive for me, since many of the papers were outside the mathematics literature, so the rigorous proofs of the results were, in some cases, interesting mathematical problems. In addition, since I had worked for a number of years on the properties of stochastic spatial models on regular lattices, there was the natural question of how the behavior of these systems changed when one introduced long-range connections between individuals or considered power law degree distributions.
In this chapter we will introduce and study the random graph model introduced by Erdös and Rényi in the late 1950s. This example has been extensively studied and a very nice account of many of the results can be found in the classic book of Bollobás (2001), so here we will give a brief account of the main results on the emergence of a giant component, in order to prepare for the analysis of more complicated examples. In contrast to other treatments, we mainly rely on methods from probability and stochastic processes rather than combinatorics.
To define the model, we begin with the vertex set V = {1, 2, …, n}. For 1 ≤ x < y ≤ n let the η_{x,y} be independent random variables that equal 1 with probability p and 0 otherwise, and set η_{y,x} = η_{x,y}. If η_{x,y} = 1 there is an edge between x and y. Here we will be primarily concerned with the situation p = λ/n, and in particular with showing that when λ < 1 all of the components are small, the largest having O(log n) vertices, while for λ > 1 there is a giant component with ~ g(λ)n vertices. The intuition behind this result is that each site has a Binomial(n − 1, λ/n) number of neighbors, which has mean ≈ λ.
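The dichotomy at λ = 1 is easy to observe in simulation. The following sketch (parameter choices are illustrative) samples the graph and finds the largest component by depth-first search; it is a numerical illustration, not the book's proof:

```python
import random
from collections import defaultdict

def largest_component(n, lam, seed=0):
    """Sample an Erdos-Renyi graph with p = lam/n and return the
    size of its largest connected component."""
    rng = random.Random(seed)
    p = lam / n
    adj = defaultdict(list)
    for x in range(n):
        for y in range(x + 1, n):
            if rng.random() < p:
                adj[x].append(y)
                adj[y].append(x)
    # Depth-first search over all vertices, tracking component sizes.
    seen, best = set(), 0
    for s in range(n):
        if s not in seen:
            stack, size = [s], 0
            seen.add(s)
            while stack:
                v = stack.pop()
                size += 1
                for w in adj[v]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            best = max(best, size)
    return best

n = 2000
sub = largest_component(n, 0.5)  # subcritical: largest component is O(log n)
sup = largest_component(n, 2.0)  # supercritical: a positive fraction of n
print(sub, sup)
```

For λ = 2 the fraction g(2) solves 1 − g = e^{−2g}, i.e. g ≈ 0.8, so the supercritical run should return a component of roughly 1600 vertices.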
In an Erdös–Rényi random graph, the degrees of the vertices are asymptotically Poisson distributed. However, as discussed in Section 1.4, in social and communication networks the distribution of degrees is very different from the Poisson and in many cases has a power law form, that is, the fraction of vertices of degree k satisfies p_k ~ C k^{−β} as k → ∞. Molloy and Reed (1995) were the first to construct graphs with specified degree distributions. We will use the approach of Newman, Strogatz, and Watts (2001, 2002) to define the model.
Let d_1, …, d_n be independent with P(d_i = k) = p_k. Since we want d_i to be the degree of vertex i, we condition on the event E_n = {d_1 + … + d_n is even}. If P(E_1) ∈ (0, 1), then P(E_n) → 1/2 as n → ∞, so the conditioning has little effect on the finite-dimensional distributions. If d_1 is always even, then P(E_n) = 1 for all n, while if d_1 is always odd, P(E_{2n}) = 1 and P(E_{2n+1}) = 0 for all n.
To build the graph, we think of d_i half-edges attached to vertex i and then pair the half-edges at random. The picture gives an example with eight vertices.
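The half-edge pairing can be sketched as follows. The degree sequence below is hypothetical, chosen only so that there are eight vertices and the degree sum is even; as in the text, self-loops and multiple edges are allowed:

```python
import random

def configuration_model(degrees, seed=0):
    """Pair half-edges uniformly at random, as in the construction of
    Newman, Strogatz, and Watts. Assumes sum(degrees) is even; the
    result may contain self-loops and multi-edges."""
    rng = random.Random(seed)
    # One "stub" (half-edge) per unit of degree, labeled by its vertex.
    stubs = [i for i, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # A uniform pairing: read the shuffled stubs off two at a time.
    return [(stubs[2 * j], stubs[2 * j + 1]) for j in range(len(stubs) // 2)]

deg = [3, 1, 2, 2, 1, 1, 2, 2]  # hypothetical degrees, sum = 14 (even)
edges = configuration_model(deg)
print(len(edges))  # sum(deg)/2 = 7 edges
```

By construction every vertex i ends up with exactly deg[i] edge endpoints, whatever the random pairing turns out to be.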
The theory of random graphs began in the late 1950s in several papers by Erdös and Rényi. However, the introduction at the end of the twentieth century of the small world model of Watts and Strogatz (1998) and the preferential attachment model of Barabási and Albert (1999) has led to an explosion of research. Querying the Science Citation Index in early July 2005 produced 1154 citations for Watts and Strogatz (1998) and 964 for Barabási and Albert (1999). Survey articles by Albert and Barabási (2002), Dorogovtsev and Mendes (2002), and Newman (2003) each have hundreds of references. A book edited by Newman, Barabási, and Watts (2006) contains some of the most important papers. Books by Watts (2003) and Barabási (2002) give popular accounts of the new science of networks, which explains “how everything is connected to everything else and what it means for science, business, and everyday life.”
While this literature is extensive, many of the papers lie outside the mathematical literature, which makes writing this book both a challenge and an opportunity. A number of articles have appeared in Nature and Science. These journals, with their impressive impact factors, are, at least in the case of random graphs, the home of 10-second sound-bite science.