Selecting and classifying geospatial data on the basis of their location and attributes starts the process of data exploration, pattern recognition and the interpretation of spatial data. The first part of this chapter examines queries as part of the analytical process in GIS. A query is a formal request for a subset of data based on one or more selection criteria and forms a core function of GIS. The second part of this chapter then considers the subject of classification, which refers to the grouping or placing of data into categories on the basis of shared qualitative or quantitative characteristics. This chapter also discusses methods for the classification of multispectral satellite image data, which is an important process in the interpretation of remotely sensed imagery.
The query
There are three types of query performed in GIS: (i) phenomenal or attribute queries, which question the related non-spatial data tables of spatial objects (e.g. ‘select all sites that have obsidian artefacts’); (ii) topological queries, which question the geometric configuration of an object or relationship between objects (e.g. ‘select all sites within Smith County’); (iii) distance queries, which ask something about the spatial location of objects (e.g. ‘select all sites within 100 km of an obsidian source’).
In Chapter 4 the concept of the relational model was introduced for managing both attribute and spatial datasets. This is the most commonly encountered data structure in GIS.
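The three query types can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the site records, the rectangular stand-in for ‘Smith County’, the obsidian source location and the use of planar coordinates in kilometres. A real GIS would run such queries against its own attribute tables and geometries.

```python
from math import hypot

# Hypothetical site records: planar coordinates in km plus one attribute.
sites = [
    {"name": "A", "x": 10.0, "y": 20.0, "obsidian": True},
    {"name": "B", "x": 90.0, "y": 10.0, "obsidian": False},
    {"name": "C", "x": 60.0, "y": 90.0, "obsidian": True},
    {"name": "D", "x": 150.0, "y": 40.0, "obsidian": True},
]
# Hypothetical county boundary (a simple square) and obsidian source.
county = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
source = (0.0, 0.0)


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count polygon edges crossed by a ray running in +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# (i) Attribute query: select all sites that have obsidian artefacts.
attribute_hits = [s["name"] for s in sites if s["obsidian"]]

# (ii) Topological query: select all sites within the county polygon.
topological_hits = [
    s["name"] for s in sites if point_in_polygon(s["x"], s["y"], county)
]

# (iii) Distance query: select all sites within 100 km of the obsidian source.
distance_hits = [
    s["name"]
    for s in sites
    if hypot(s["x"] - source[0], s["y"] - source[1]) <= 100.0
]

print(attribute_hits, topological_hits, distance_hits)
```

Each query returns a different subset of the same four sites, which shows that the selection criterion, not the data, determines the result.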
The power of GIS, as with other computer programs, can be deceptive: visually impressive but ultimately meaningless results can appear unassailable because of the sophisticated technologies used to produce them (Eiteljorg 2000). The familiar adage ‘garbage in, garbage out’ is particularly applicable to GIS, and one of our primary aims throughout this book is to provide guidance on how to use this technology in ways to strengthen and extend our understanding of the human past, rather than to obfuscate it. In this chapter we start by providing an overview of the ‘first principles’ of GIS: the software and hardware requirements, geodetic and cartographic principles, and GIS data models. These provide the conceptual building blocks that are essential for understanding what GIS is, how it works, and what its strengths and limitations are. Although some of these ‘first principles’ may be familiar to readers who are experienced in cartography and computer graphics, we nevertheless provide a thorough review of each as they yield the foundation on which we build in later chapters.
The basics
GIS functionality
What does a GIS do? Simply providing a definition of GIS and referring to its abilities to capture and manipulate spatial data doesn't provide much insight into its functionality. More informative is to break some of the basic tasks of a GIS into five groups: data acquisition, spatial data management, database management, data visualisation and spatial analysis.
The study of geographical information systems (GIS) has now matured to the point where non-specialists can take advantage of relatively user-friendly software to help them solve real archaeological problems. No longer is it the preserve of experts who – in the eyes of cynics – chose their archaeological case studies solely to illustrate solutions to GIS problems. This is, of course, a good thing, because GIS has so much to offer archaeology. Nevertheless, the widespread adoption of GIS brings with it several attendant dangers. The most problematic is that modern GIS packages offer users a variety of powerful tools that are easily applied, without providing much guidance on their appropriateness for the data or questions at hand. For example, many current GIS software packages require just a few mouse clicks to create an elevation model from a set of contour lines, but none that we know of would warn that the application of this method to widely spaced contours is likely to produce highly unsatisfactory results that could lead to a host of interpretative errors further down the line. Conversely, there is a risk that researchers who become overdependent on the data management abilities of GIS may shy away from tackling more analytical questions simply because it is not immediately obvious which buttons to push. It is our ambition that no archaeologist who keeps this manual near his or her computer will make such mistakes, nor be hesitant about tackling the sorts of questions that can only be answered with some of the more advanced tools that GIS packages offer.
It is often appropriate to model the spatial organisation of human activity in terms of point locations and the relationships between them; for example the movement of goods between settlements or the intervisibility between forts. This chapter discusses the various network analysis tools that can be used to study such relationships. It also discusses techniques for predicting the likely path of an unknown route between point locations, as well as the flow of water and watersheds.
Given that the bulk of archaeological data is ultimately point based it is surprising that network analysis has not featured more prominently in the archaeological application of GIS. Of course, what is a point at one scale of analysis may be a region at another, and it is thus important to recognise that the applicability of network analysis is determined by the way in which the problem is framed rather than the geographical extent of a particular study. A few published archaeological network analyses have investigated subjects ranging in scale from the colonisation of new territory (Allen 1990; Zubrow 1990) and the location of ‘centres’ (Bell and Church 1985; Mackie 2001) to the connectivity of rooms in individual buildings (Foster 1989). There is no reason why this range could not be extended to even smaller extents: to, for example, investigate patterns of refitting among lithic artefacts in a single stratigraphic unit.
Surface modelling is an important analytical tool and, particularly in the case of elevation modelling, is often the final stage of GIS project development. Constructing a digital elevation model (DEM) from secondary sources such as digitised contour lines and/or spot heights, or from primary data such as LiDAR or DGPS survey, is a frequent objective of surface modelling (Atkinson 2002). Surface models can also be derived from a wide range of point-based environmental and anthropogenic data, such as artefact counts or soil chemistry (e.g. Robinson and Zubrow 1999; Lloyd and Atkinson 2004). The derivation of a continuous surface from a set of discrete observations involves a process called interpolation and the selection of an appropriate interpolation technique depends on the structure of the sample data plus the desired outcome and characteristics of the surface model. This chapter begins by reviewing some of the more common interpolation methods and is followed by a more detailed review of techniques for building DEMs from contour data.
Interpolation
Interpolation is a mathematical technique of ‘filling in the gaps’ between observations. More precisely, interpolation can be defined as predicting data using surrounding observations. It can be contrasted to extrapolation, which is the process of predicting values beyond the limits of a distribution of known points. To use a simple example, if n and m are unknown values within the set of numbers {2, 4, n, 8, 10, m}, then using a simple model of linear change n could be interpolated as being equal to 6, whereas m, which lies beyond the last known value, could be extrapolated as being equal to 12.
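The number-series example can be worked through in a few lines of Python. This is a minimal sketch under the same assumption of linear change; the function name `estimate` and the use of `None` to mark unknown values are illustrative choices, not part of any GIS package.

```python
def estimate(series):
    """Fill unknown (None) entries assuming linear change along the series.

    Unknowns that lie between known values are interpolated from their two
    nearest known neighbours; unknowns beyond the known range are
    extrapolated from the two nearest known values on that side.
    """
    known = [(i, v) for i, v in enumerate(series) if v is not None]
    first, last = known[0][0], known[-1][0]
    result = []
    for i, v in enumerate(series):
        if v is not None:
            result.append((v, "observed"))
            continue
        if first < i < last:
            i0, v0 = max(k for k in known if k[0] < i)  # nearest known before i
            i1, v1 = min(k for k in known if k[0] > i)  # nearest known after i
            kind = "interpolated"
        elif i > last:
            (i0, v0), (i1, v1) = known[-2], known[-1]
            kind = "extrapolated"
        else:
            (i0, v0), (i1, v1) = known[0], known[1]
            kind = "extrapolated"
        value = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
        result.append((value, kind))
    return result


series = [2, 4, None, 8, 10, None]  # n is at position 2, m at position 5
result = estimate(series)
print(result[2])  # estimate for n
print(result[5])  # estimate for m
```

The estimate for m rests on the untested assumption that the trend continues past the last observation, which is why extrapolated values generally deserve more caution than interpolated ones.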
This chapter examines the different ways in which spatial datasets are acquired and structured to take advantage of the visualisation and analytical abilities of GIS. It is conventional to distinguish between primary and secondary data sources because acquisition methods, data formats and structuring processes differ considerably between the two. Primary data consist of measurements or information collected from field observations, survey, excavation and remote sensing. Secondary data refer to information that has already been processed and interpreted, available most often as paper or digital maps. Many users of GIS wish to integrate primary and secondary datasets (for example, to plot the location of primary survey data across an elevation model obtained from a data supplier). Both types of data have advantages and disadvantages, which this chapter examines in some detail. By the end of this chapter you will be familiar with the ways in which both primary and secondary data are obtained, and the issues and procedures for assessing the quality of combined datasets.
Primary geospatial data
Primary, or ‘raw’, geospatial data has not been significantly processed or transformed since the information was first captured. Archaeologists generate vast quantities of primary data during excavation and survey, such as the location of settlements, features and artefacts, geoarchaeological and palaeo-environmental data and the location of raw material sources within the landscape. Raw data may also be available from databases of information compiled by other agencies: the location of archaeological sites, for example, can be obtained from Sites and Monuments Records and published site ‘gazetteers’.
A GIS can be used to create, represent and analyse many kinds of region. Some regions have an objective reality, at least to the extent that they are widely recognised and have a readily detectable influence on aspects of human behaviour. The most obvious examples of this kind are sociopolitical regions such as the territories of modern nation states. Other regions have an objective reality in another sense: that they are defined by some natural process. A good example of a natural region is the watershed; that is, the area within which all rainfall drains to some specified point in a drainage network. A third kind of region is essentially just analytical in the sense that it is created for a specific short-lived purpose and may never be recognised by anyone other than the analyst. For example, an archaeologist might determine the region containing all land within 100 m of a proposed high-speed railway line in order to identify at-risk archaeological sites, but it is the list of sites and their locations, not the region, that is fed back into the planning process.
Regions are readily represented as polygons in a vector map, or less efficiently as cells coded in such a way as to distinguish between inside and outside a region in a raster map. Where the extents of regions are known in advance of GIS-based analysis, as is often the case with sociopolitical regions, their generation and manipulation within a GIS is mostly an issue of data capture and map query.
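The railway example of a purely analytical region can be sketched without building the region at all: a site is at risk if its distance to the nearest segment of the line is no more than 100 m. The sketch below assumes planar coordinates in metres; the railway vertices, site names and site positions are hypothetical.

```python
from math import hypot


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(p, line, radius):
    """True if p lies within `radius` of any segment of the polyline."""
    return any(
        distance_to_segment(p, a, b) <= radius for a, b in zip(line, line[1:])
    )


# Hypothetical railway centre line and site locations (metres).
railway = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 500.0)]
sites = {
    "S1": (500.0, 80.0),
    "S2": (500.0, 150.0),
    "S3": (1090.0, 250.0),
    "S4": (1060.0, 590.0),
}

at_risk = sorted(
    name for name, xy in sites.items() if within_buffer(xy, railway, 100.0)
)
print(at_risk)
```

As in the example, the output is the list of at-risk sites rather than the buffer region itself.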
In this chapter we introduce a number of point and spatial operations that can be performed on continuous field data. We begin with the use of map algebra, before moving on to the calculation of derivatives (e.g. slope and aspect) and spatial filtering (e.g. smoothing and edge detection), all of which are widely used by archaeologists. In the final section we introduce more specialised techniques that have archaeological potential.
Map algebra is a point operation, whereas the other techniques discussed in this chapter are spatial operations. Point operations compute the new attribute value of a location with coordinates (x, y) from the attribute values in other maps at the same location (x, y) (Fig. 9.1b). In contrast, spatial operations compute the new attribute value of a location from the attribute values in the same map, but at other locations – those in the neighbourhood (Fig. 9.1a). The neighbourhood used in a spatial operation may or may not be spatially contiguous. For example, slope is usually calculated using the elevation values in a neighbourhood comprising the four or eight map cells immediately adjacent to the location in question (see below), but we saw in Chapter 6 how inverse distance weighting interpolates elevation values from some number of nearest spot heights, irrespective of how far away those spot heights actually are.
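The distinction can be made concrete with two small functions on rasters stored as nested lists. This is a sketch under stated assumptions: `map_algebra_add` is a hypothetical name for a cell-by-cell sum of two maps, and `slope_degrees` uses a simple central-difference estimate from the four adjacent cells, which is only one of several slope algorithms and not necessarily the one any particular GIS package implements.

```python
from math import atan, degrees, hypot


def map_algebra_add(a, b):
    """Point operation: each output cell depends only on the cells at the
    same (row, column) position in the two input maps."""
    return [[va + vb for va, vb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def slope_degrees(dem, cell_size):
    """Spatial operation: each output cell depends on the four adjacent
    cells of the same map. Edge cells are left as None because they lack
    a complete neighbourhood."""
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            out[r][c] = degrees(atan(hypot(dz_dx, dz_dy)))
    return out


# Hypothetical elevation model: a plane rising 10 m per 10 m cell eastwards.
dem = [
    [10.0, 20.0, 30.0],
    [10.0, 20.0, 30.0],
    [10.0, 20.0, 30.0],
]
# Hypothetical second map, e.g. a uniform 1 m deposit depth.
deposit = [[1.0, 1.0, 1.0] for _ in range(3)]

surface = map_algebra_add(dem, deposit)
slope = slope_degrees(dem, cell_size=10.0)
print(surface[0])
print(slope[1][1])
```

A rise of 10 m over a run of 10 m gives a gradient of 1, so the slope at the centre cell is about 45 degrees.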
Cartographers have long recognised the influence that maps have on the shaping of spatial consciousness (Monmonier 1991; Wood 1992; Lewis and Wigen 1997). The purpose of this chapter is to explore the way maps, whether paper or digital, may be used to present spatial information and to highlight some design principles to maximise their effectiveness at this task. In doing so we describe a range of mapping techniques appropriate for the different sorts of data routinely handled by archaeologists. We also consider some major cartographic principles and design conventions that help make maps effective communication devices, and discuss the growing importance of the Internet and interactive mapping for the publication of spatial data.
Designing an effective map
As defined in Chapter 2, maps are traditionally divided into two categories: topographic and thematic. The former term describes maps that contain general information about features of the Earth's surface, whereas thematic maps are limited to single subjects, such as soils, geology, historic places, or some other single class of phenomena. Both types of map must contain some basic pieces of information so that the reader is able to comprehend and contextualise the data that is being presented. The most basic of these, without which a map is difficult if not impossible to understand, are: (i) a title; (ii) a scale; (iii) a legend and (iv) an orientation device, such as a north arrow (Table 12.1).
This chapter describes the way that spatial and attribute data are structured and stored for use within a GIS. It provides the necessary information about data models and database design to enable archaeologists unfamiliar with computer databases to make appropriate decisions about how best to construct a system that will work well and efficiently.
A database is a collection of information that is structured and recorded in a consistent manner. A card catalogue that records information about archaeological sites, such as their location and date, is as much a database as a full-fledged web-searchable digital sites and monuments record. Digital databases differ from their paper counterparts mainly in that they are dependent on database software for searching and retrieving records. The complexity of the data structure will also be increased as digital databases are often broken into several different related files. This reduces the amount of duplicated information in a database, improves access speed and also enables the retrieval of small subsets of data rather than complete records. Software that is used to store, manage and manipulate data is referred to as a Database Management System (DBMS). The objectives of a DBMS are to store and retrieve data records in the most efficient way possible, from both the perspective of the overall size of the database and also the speed at which that data can be accessed.
The technology of DBMS is a major research focus in computer science.
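The idea of breaking a database into related files can be sketched with Python's built-in `sqlite3` module. The table names, column names and records below are hypothetical. Site details are stored once in a `sites` table, and each find refers to its site by `site_id`, so the location is not repeated for every find.

```python
import sqlite3

# An in-memory database, used here purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sites (
        site_id  INTEGER PRIMARY KEY,
        name     TEXT,
        easting  REAL,
        northing REAL
    );
    CREATE TABLE finds (
        find_id  INTEGER PRIMARY KEY,
        site_id  INTEGER REFERENCES sites(site_id),
        material TEXT,
        quantity INTEGER
    );
    """
)
conn.executemany(
    "INSERT INTO sites VALUES (?, ?, ?, ?)",
    [(1, "Site A", 1000.0, 2000.0), (2, "Site B", 1500.0, 2500.0)],
)
conn.executemany(
    "INSERT INTO finds VALUES (?, ?, ?, ?)",
    [(1, 1, "obsidian", 12), (2, 1, "chert", 40), (3, 2, "chert", 7)],
)

# Retrieve a small subset: the sites with obsidian finds, and how many.
rows = conn.execute(
    """
    SELECT s.name, f.quantity
    FROM sites AS s
    JOIN finds AS f ON f.site_id = s.site_id
    WHERE f.material = ?
    ORDER BY s.name
    """,
    ("obsidian",),
).fetchall()
print(rows)
conn.close()
```

The join reassembles only the fields that the question needs, which is the kind of small-subset retrieval described above.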
Spatial analysis lies at the core of GIS and builds on a long history of quantitative methods in archaeology. Many of the foundations of spatial analysis were established by quantitative geographers in the 1950s and 1960s, and adopted and modified by archaeologists in the 1970s and 1980s. For a variety of reasons, spatial analysis fell out of fashion both in archaeology and in the other social sciences. In part this was because of the perceived overgeneralisation of certain types of mathematical models, but also because of a shift towards more contextually orientated and relativist studies of human behaviour. Recently, however, there has been a renewed interest in the techniques of spatial analysis for understanding the spatial organisation of human behaviour that takes on board these criticisms. In the last decade there have been several advances within the social sciences, particularly geography and economics, in their ability to reveal and interpret complex patterns of human behaviour at a variety of scales, from the local to the general, using spatial statistics. Archaeology has participated somewhat less in these recent developments, although there is a growing literature that demonstrates a renewed interest in the application of these techniques to the study of past human behaviour. In this chapter we review some historically important methods (e.g. linear regression, spatial autocorrelation, cluster analysis) and also highlight more recent advances in the application of spatial analysis to archaeology (e.g. Ripley's K, kernel density estimates, linear logistic regression).
The most valuable (non-human) resource that any organisation possesses is its data. Hardware and software are easily replaceable but the loss of data can be catastrophic for an organisation. Information loss, whether full or partial, is easily avoided through the routine taking of backups and the storage of data off-site. As there is plenty of readily available information on how best to implement a backup and data-recovery procedure, we do not consider it in any detail in this book. What is less obvious, particularly to those new to GIS and digital data, is the similarly important task of data maintenance. Consider, for example, the following three scenarios:
An employee in a cultural resource management (CRM) unit is assigned the task of updating site locations from newly acquired GPS data. How should the fact that a few site locations have been updated be documented and where and how should the old data be stored?
An aerial photograph of a portion of landscape has been rectified and georeferenced, and is ready to be used to delineate features of archaeological significance. How and where should information about the degree of error in the georeferencing be documented? Where and how should the errors for the newly digitised archaeological features be documented?
A research student is collecting data on soil types for Eastern Europe from several different national agencies that each have different scales and recording systems. How is this student able to search and compare and ultimately integrate datasets in a manner that ensures the data will be appropriate for his/her needs?