This chapter deals with inference procedures for the models of Chapter 3: model selection and parameter estimation; estimating confidence intervals; tests of hypotheses for choosing between alternative models; evaluating goodness of fit.
Maximum likelihood methods will be emphasised, but not to the exclusion of alternatives. Small samples are often encountered in spatial data analysis, boundary effects are severe, and the bilateral nature of many spatial interaction models gives rise to awkward numerical problems. Maximum likelihood estimation is often very sensitive to correct model specification. For these reasons inference via either exact or approximate maximum likelihood may in some cases seem more trouble than it is worth, and other methods that lack some of the apparently desirable properties of maximum likelihood estimators but are computationally easier may be worth exploring. Further, most of these desirable properties are asymptotic. In the case of spatial data, letting n (lattice size) increase to infinity has an ambiguous interpretation: does it imply more points within a fixed area or an increasing size of the study area? The former changes the nature of inter-site relationships; the latter may be inappropriate where the study area is of fixed size with a specified border. Below we refer to ‘large’ and ‘small’ sample situations. However, there is no natural continuity between the two as there is in the case of a time series with an increasing number of observations.
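To make the numerical awkwardness concrete, the following is a minimal sketch (in Python, not from the original text; the lattice, the simulated data and the grid search are all hypothetical choices for illustration) of profile maximum likelihood for a first-order simultaneous autoregressive scheme. The determinant term in the log-likelihood is precisely what makes exact maximum likelihood laborious on large lattices:

```python
import numpy as np

# Illustrative example: profile log-likelihood of a SAR scheme
# (I - rho*W) y = e on a hypothetical 4 x 4 lattice with
# rook (edge-sharing) neighbours.
n_side = 4
n = n_side * n_side

# Row-standardised rook contiguity matrix W.
W = np.zeros((n, n))
for i in range(n_side):
    for j in range(n_side):
        k = i * n_side + j
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < n_side and 0 <= jj < n_side:
                W[k, ii * n_side + jj] = 1.0
W = W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
rho_true = 0.5
y = np.linalg.solve(np.eye(n) - rho_true * W, rng.standard_normal(n))

def profile_loglik(rho):
    """Profile log-likelihood in rho, with sigma^2 concentrated out."""
    A = np.eye(n) - rho * W
    e = A @ y
    sigma2 = e @ e / n
    sign, logdet = np.linalg.slogdet(A)  # the awkward determinant term
    return logdet - 0.5 * n * np.log(sigma2)

# Crude grid search: feasible only because n is tiny, which is the point.
grid = np.linspace(-0.9, 0.9, 181)
rho_hat = grid[np.argmax([profile_loglik(r) for r in grid])]
print(round(float(rho_hat), 2))
```

Even this toy example must evaluate a determinant at every trial value of the interaction parameter; on realistic lattice sizes this cost is what motivates the approximate methods discussed below.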
This book is about methods of analysing spatially referenced data where quantitative observations are associated with fixed points or areas on a map. My feeling was that there was a place for a book that tried to identify the main approaches to analysing spatial data and brought them within a coherent framework. Specifically, I wanted to examine the substantive justification for certain modes of analysis, to try to lay out some of the more important parts of statistical theory as it applied to spatial data analysis and consider important problems and discuss applications. The aim was to produce a text that proceeded from data collection to preliminary or exploratory analysis to uni-variate and multi-variate data analysis.
The purpose of Chapter 2 is to try to answer questions about what one is doing in spatial data analysis. As the introduction makes clear, I am not, however, advocating any form of separatism; in fact I see many of the problems associated with spatial data as amenable to treatment by currently available methods widely used in other branches of data analysis.
Chapters 3 and 4 lay out those areas of theoretical statistics which I think are relevant to the problems of representing spatial variation in the social and environmental sciences. No doubt there will be those who feel that the models are far too simple and the techniques far too difficult to justify application.
This chapter examines the problems that arise in sampling a surface for purposes of estimating its properties. We shall assume that the surface is continuous or very nearly so. Often data are made available on a predetermined grid or framework such as a network of established weather stations, pixels on a remotely sensed image, areas or tracts in a census survey. In this chapter, however, we consider the situation where the analyst can choose the framework for data collection. We consider three categories of spatial sampling problem.
Category I: Problems concerned with estimating some non-spatial characteristic of a spatial population, for example the frequency distribution of areal values, or an areal mean, or a total, proportion or intensity value. So, we might wish to estimate the mean or total level of precipitation in a basin, average levels of household income, the proportion of infected individuals in an area during an epidemic, or the proportion of an area with high levels of pollution.
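A minimal sketch of a Category I problem, using a hypothetical simulated surface (the particular surface, sample size and designs are illustrative assumptions, not prescriptions from the text), comparing a simple random sample with a systematic grid sample for estimating an areal mean:

```python
import numpy as np

# Hypothetical illustration: estimating the areal mean of a smooth
# surface from a limited number of sampled sites.
rng = np.random.default_rng(1)

side = 100
x, y = np.meshgrid(np.arange(side), np.arange(side))
surface = (np.sin(x / 15.0) + np.cos(y / 20.0)
           + 0.1 * rng.standard_normal((side, side)))
true_mean = surface.mean()

n = 50  # sample size

# Design 1: simple random sample of sites.
idx = rng.choice(side * side, size=n, replace=False)
srs_est = surface.ravel()[idx].mean()

# Design 2: systematic (regular grid) sample.
step = side // int(np.sqrt(n))
sys_est = surface[::step, ::step].mean()

print(float(true_mean), float(srs_est), float(sys_est))
```

For a spatially autocorrelated surface such as this one, a systematic design typically spreads the sample more evenly over the region than a random design of the same size; the relative merits of the two are taken up later in the chapter.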
Category II: Problems where the spatial variation of some variable is specifically required, either in the form of a map or in the form of a summary measure (such as a variogram or correlogram) to highlight important scales of variation. Included here are situations where the objective is to ensure efficient spatial interpolation (in geology, geomorphology or meteorology for example) for purposes of converting scattered point data to map form.
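As an illustration of the kind of summary measure a Category II design must support, here is a sketch (with hypothetical simulated point data; the bin width and range are arbitrary choices) of an empirical semi-variogram computed from scattered observations:

```python
import numpy as np

# Hypothetical sketch: empirical semi-variogram from scattered point data.
rng = np.random.default_rng(2)

n = 200
pts = rng.uniform(0, 100, size=(n, 2))
# Values with spatial structure: a gentle trend plus independent noise.
z = 0.05 * pts[:, 0] + rng.standard_normal(n)

# All pairwise distances and squared value differences.
d = np.hypot(pts[:, None, 0] - pts[None, :, 0],
             pts[:, None, 1] - pts[None, :, 1])
sq = (z[:, None] - z[None, :]) ** 2
iu = np.triu_indices(n, k=1)      # each pair counted once
d, sq = d[iu], sq[iu]

# Semi-variogram: half the mean squared difference within distance bins.
bins = np.arange(0, 60, 10)
gamma = [0.5 * sq[(d >= lo) & (d < lo + 10)].mean() for lo in bins]
for lo, g in zip(bins, gamma):
    print(f"lag {lo:2d}-{lo + 10:2d}: {g:.2f}")
```

The rising values at longer lags reflect the trend built into the simulated data; the reliability of each binned estimate depends on how many site pairs the sampling design places at that separation, which is why design matters here.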
Chapter 1 defined three important objectives: the presentation of spatial data analysis as a part of general data analysis while alerting the reader to the special difficulties that spatial data may create; the presentation of a wide range of models for representing spatial variation; the examination of the role of subject matter theory in spatial analysis. Good data analysis seeks a balance between being theoretically informed and letting the data speak. Balance is essential in order to avoid the twin problems of, on the one hand, using data analysis merely to confirm existing prejudices and, on the other, reporting ambiguous data patterns. We offer some observations to conclude the book that relate to these issues.
From a technical point of view some of the most serious difficulties facing the analyst of spatial data concern the wide range of possible data models and the need to implement awkward and laborious fitting procedures. The availability of specialist software that implements the fitting and evaluation of spatial models is desirable. But important though it is to develop specialist software for confirmatory spatial data analysis it should be evident that useful progress can be made by carrying out sensible exploratory analyses with standard software. Some pre-processing of the data may be necessary but thereafter simple graphical and resistant techniques of analysis can offer useful insights into the data, indicating the types of models that should be explored and hence determining the extent to which specialist software is needed in order to carry out confirmatory analyses.
Section 7.1 considers the problem of finding a good description of a spatial surface. We begin by considering how to fit a trend surface model in which spatial variation is assumed to contain three scale components: a large scale or regional trend, a local component of continuous (spatially correlated) variation and a site or area level random (or noise) component. Comparisons with geostatistical approaches are made. Models for non-normal and presence/absence data are briefly considered. Good description of spatial variation is important for summarising data and for interpolation. If data are available on the same variable in other areas it may be of interest to compare surfaces. If data are available on the same variable in the same area but over a succession of time periods comparative study might be concerned with identifying how the surface is changing through time and to relate this to changes in conditions. Over a relatively short span of time this might take the form of a ‘before and after’ study, such as monitoring and assessing the effects of an anti-pollution campaign.
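The large-scale component of such a decomposition can be sketched as follows (in Python, with hypothetical simulated data; the quadratic form of the surface and the coefficient values are illustrative assumptions): a trend surface fitted by ordinary least squares, with the local and noise components left in the residuals.

```python
import numpy as np

# Hypothetical sketch: fitting a quadratic trend surface z = f(x, y)
# by ordinary least squares.
rng = np.random.default_rng(3)

n = 100
x, y = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
z = 2.0 + 0.5 * x - 0.3 * y + 0.05 * x * y + 0.2 * rng.standard_normal(n)

# Design matrix for a full quadratic surface in x and y.
X = np.column_stack([np.ones(n), x, y, x * y, x**2, y**2])
beta, *_ = np.linalg.lstsq(X, z, rcond=None)

trend = X @ beta          # large-scale (regional) component
residual = z - trend      # local + site-level components remain here
print(np.round(beta, 2))
```

In the three-component model of the text, ordinary least squares is only a first step: spatially correlated residuals violate its independence assumption, which is where the comparison with geostatistical approaches becomes relevant.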
In section 7.2 spatial interpolation problems are discussed. Observations on a continuous surface are made at a number (n) of point sites in an area. These might be soil or vegetation measurements, or data from geological samples. Estimates may be required for sites that have not been visited (together with error estimates); perhaps a map is required.
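One simple answer to the interpolation problem is inverse distance weighting, sketched below (in Python; the sites and values are hypothetical). Unlike the geostatistical methods discussed in section 7.2, it supplies no error estimates, which is one reason for preferring kriging when a variogram can be estimated:

```python
import numpy as np

# Hypothetical sketch: inverse-distance-weighted (IDW) interpolation
# of point observations to an unvisited site.
def idw(sites, values, target, power=2.0):
    """Estimate the surface at `target` from `values` observed at `sites`."""
    d = np.hypot(*(sites - target).T)
    if np.any(d == 0):                 # target coincides with a data site
        return float(values[np.argmin(d)])
    w = 1.0 / d**power                 # nearer sites get heavier weights
    return float(np.sum(w * values) / np.sum(w))

sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([1.0, 2.0, 3.0, 4.0])

# Centre of the unit square: all four sites equidistant, so the
# estimate is the plain mean of the observed values.
print(idw(sites, values, np.array([0.5, 0.5])))
```

The `power` parameter controls how quickly influence decays with distance; it is a tuning choice with no counterpart in the data, which is another respect in which model-based interpolation is more principled.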
In this chapter and in Chapter 5 we shall present many mathematical applications of computer graphics. In order to draw the line somewhere (pardon the pun) we shall restrict ourselves to the mathematics associated with the plane and in particular with curves in the plane. There is a whole other realm, just as fascinating, connected with objects, such as curves, polyhedra and surfaces, in three-dimensional space. Nevertheless what the computer actually draws is, in these cases too, a curve or system of curves in the plane – that of the screen. The extra complications come from taking a three-dimensional object and associating with it a curve or system of curves – for example its outline when seen from a distance, or a sequence of such outlines or a sequence of plane sections of the object. We touch on this in a discussion of the swallowtail surface in Section 4.14.
The computer can (with our help) draw curves and collections of curves which are just too complicated to attempt by hand. For some purposes a rough sketch of a curve does very well – you will probably have drawn many such sketches by hand, and we are certain that the art of curve-sketching by hand is still an art well worth acquiring. However, for some other purposes, such as the illustration or discovery of facts connected with the differential geometry of curves and families of curves (not to mention surfaces), accurate drawings are essential.
This book is intended for anyone who has some mathematical knowledge and a little experience with programming a microcomputer in BASIC (or any other language). The book shows how simple programs can be used to do significant mathematics.
To spell out our mathematical prerequisites in more detail, some of the chapters assume no more mathematical knowledge than whole numbers, but for the most part we assume some calculus, and the rudiments of algebra (polynomials, equations) and trigonometry (sines, cosines and tangents). Thus, British readers with A or A/S level in mathematics and American readers with a Freshman calculus course behind them will, we hope, have little difficulty in following most of the mathematics here. We have, naturally, included some material for the more mathematically sophisticated reader: those sections requiring closer inspection are, appropriately, printed in smaller type. (We hope that readers who do not immediately recognise the small type material will be intrigued rather than frightened. Surely one of the charms of mathematics is the glimpses it affords of mysterious and fascinating territory which is for the moment just out of reach.)
As for programming, the knowledge we assume is very small, and most programs are given full listings in the text. It seems to us that a very effective way to learn programming is to use it to solve interesting mathematical problems. We have regarded the mathematics as the pre-eminent interest, and have not tried too hard to make the sample programs beautiful or elegant, or even particularly efficient.
Sir Isaac Newton is certainly one of the greatest scientists ever to have lived. He is generally reckoned to have been one of the three most outstanding mathematicians of all time, along with Archimedes and Gauss, and his discoveries in physics are unrivalled in their width and influence. What was Newton's secret? How did he achieve as much as he did? Obviously there is no simple answer, but Newton had one secret, which he guarded jealously, and which he believed to be vital. It was ‘Data aequatione quotcunque fluentes quantitates involvente fluxiones invenire et vice versa’ or in English ‘solve differential equations’.
Nowadays this ‘secret’ is entirely unremarkable; we are all aware that many processes and phenomena in the world are governed by differential equations. The very fact that Newton's secret is now common knowledge clearly indicates its worth and power. Of course his secret was rather hard won; he did have to invent differential equations before pronouncing his dictum concerning solving them!
In this chapter we shall see what the microcomputer can do for those intending to follow Newton's advice. Our eventual viewpoint will be considerably more modern than Newton's. It turns out that in certain circumstances solving differential equations is not as useful as watching them.
Differential equations and tangent segments
Much of science is devoted to the problems of predicting the future behaviour of some physical system or other. Often the underlying physical law will describe the rate at which the system evolves; what we require is a description of how it evolves.
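The tangent-segment idea can be put into computational form at once: for dy/dx = f(x, y) each point (x, y) carries a slope f(x, y), and following short tangent segments gives Euler's method for tracing an approximate solution. Here is a minimal sketch (the particular equation dy/dx = y, with exact solution y = eˣ, is our illustrative choice):

```python
import math

# Sketch of Euler's method: follow short tangent segments of a
# direction field to approximate a solution curve.

def f(x, y):
    return y  # illustrative equation dy/dx = y, exact solution y = e^x

def euler(x0, y0, h, steps):
    """Follow `steps` tangent segments of horizontal extent h from (x0, y0)."""
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)   # move along the tangent segment at (x, y)
        x += h
    return y

approx = euler(0.0, 1.0, 0.01, 100)   # approximate y(1)
print(approx, math.e)                 # Euler estimate against the exact value
```

Shrinking the step h improves the approximation at the cost of more arithmetic, exactly the kind of trade the microcomputer is suited to; ‘watching’ the whole field of tangent segments, rather than any single solution, is the more modern viewpoint we come to later.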
One of the most important and useful ways in which mathematics can help us to solve problems is by the solution of equations. ‘Let x be the length of the piece of string; then x satisfies the equation x² − 2x − 3 = 0 and solving the equation gives x = 3.’ We are sure that you have solved many problems using equations; unfortunately, only the simplest equations can be solved exactly.
There are two reasons for this. In the first place even for a quadratic equation, unless the solutions are rational numbers (as in the above example), there is a square root such as √2 to be evaluated, and this cannot be done exactly. The decimal expansion does not terminate or recur, so we must be satisfied either with the formal ‘√2’ or with an approximation to so-many decimal places.
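Obtaining such an approximation is itself a short computation. As an illustration (our choice of method, not a claim about the text's programs), Newton's own iteration applied to x² − 2 = 0 doubles the number of correct figures of √2 at each step:

```python
# Sketch: approximating sqrt(2), whose decimal expansion neither
# terminates nor recurs, by Newton's iteration for x^2 - 2 = 0:
#   x_next = (x + 2/x) / 2

x = 1.0
for _ in range(6):
    x = (x + 2.0 / x) / 2.0
print(x)  # agrees with sqrt(2) to machine precision
```

Six steps from the crude starting guess 1 already exhaust the precision of ordinary floating-point arithmetic, which is all the ‘so-many decimal places’ one can ask of the machine.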
The second reason is more profound. Exact formulae analogous to the famous quadratic formula do exist for equations of degrees 3 and 4 – of course these formulae involve cube roots and so on, so are open to the same difficulty as we noted above for quadratics. On the other hand no algebraic formula exists at all for equations of degree 5 or more! In a precise sense, the equation x⁵ − 6x + 3 = 0 cannot be solved algebraically at all. This is a difficult statement and has an even more difficult proof, in which computers won't help in the least.
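Computers cannot help with the proof, but they solve the equation numerically without difficulty. A sketch by bisection (our choice of method for illustration), starting from an interval on which the quintic changes sign:

```python
# Sketch: solving x^5 - 6x + 3 = 0, which has no algebraic solution,
# by bisection on an interval where the sign changes.

def p(x):
    return x**5 - 6*x + 3

a, b = 0.0, 1.0        # p(0) = 3 > 0 and p(1) = -2 < 0: a root lies between
for _ in range(60):
    m = (a + b) / 2.0
    if p(a) * p(m) <= 0:   # root lies in the left half
        b = m
    else:                  # root lies in the right half
        a = m
print((a + b) / 2.0)   # one real root of the quintic
```

Each step halves the interval, so sixty steps pin the root down far beyond the precision of the arithmetic; the impossibility of an algebraic formula is no obstacle at all to computing the root to any desired accuracy.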