
Regression with Archaeological Count Data

Published online by Cambridge University Press:  18 September 2024

Brian F. Codding*
Affiliation:
Department of Anthropology, University of Utah, Salt Lake City, UT, USA, and Archaeological Center, University of Utah, Salt Lake City, UT, USA
Simon C. Brewer
Affiliation:
Department of Geography, University of Utah, Salt Lake City, UT, USA, and Archaeological Center, University of Utah, Salt Lake City, UT, USA
*(brian.codding@anthro.utah.edu, corresponding author)

Abstract

Archaeological data often come in the form of counts. Understanding why counts of artifacts, subsistence remains, or features vary across time and space is central to archaeological inquiry. Regression is a key statistical method for modeling such variation, yet despite sophisticated advances in computational approaches to archaeology, practitioners lack a standard approach for building, validating, and interpreting count regression models. Drawing on advances in ecology, we outline a framework for evaluating regressions with archaeological count data that includes suggestions for model fitting, diagnostics, and interpreting results. We hope these suggestions provide a foundation for advancing regression with archaeological count data to further our understanding of the past.

Los datos arqueológicos a menudo vienen en forma de conteos. Comprender por qué los recuentos de artefactos, restos de subsistencia o características varían a lo largo del tiempo y el espacio es fundamental para la investigación arqueológica. Un método estadístico central para modelar dicha variación es a través de la regresión, pero a pesar de los avances sofisticados en los enfoques computacionales de la arqueología, los profesionales no tienen un enfoque estándar para construir, validar o interpretar los resultados de la regresión de conteo. Basándonos en los avances en ecología, aquí describimos un marco para evaluar regresiones con datos de conteo arqueológico que incluye sugerencias para el ajuste de modelos, diagnósticos e interpretación de resultados. Esperamos que estas sugerencias proporcionen una base para avanzar en la regresión con datos de conteos arqueológicos para mejorar nuestra comprensión del pasado.

Information

Type
How to Series
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press on behalf of Society for American Archaeology

FIGURE 1. Results of (a) linear (ordinary least squares [OLS]), (b) log-linear (OLS with a logged response variable), and (c) Poisson regression predicting counts of obsidian artifacts across hypothetical archaeological sites as a function of the distance from the volcanic source. Black dots show the observed values at each site. Gray solid lines show the predicted model fit. Black vertical lines show the distance between the predicted and observed value for each site (the residuals), which the model is trying to minimize. Gray horizontal dashed lines indicate zero. See Supplemental Text 1 for a more formal comparison of model fits.
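The contrast among the three fits in Figure 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the distances and counts are invented, the function names `fit_ols` and `fit_poisson` are hypothetical, and adding 1 before logging the response is only one common workaround for zero counts, assumed here for convenience.

```python
import math

# Hypothetical data, invented for illustration (not from the article):
# distance from an obsidian source (km) and artifact counts at eight sites.
distance = [5.0, 10.0, 20.0, 30.0, 40.0, 60.0, 80.0, 100.0]
counts = [48, 40, 22, 15, 9, 3, 1, 0]

def fit_ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def fit_poisson(x, y, iters=50, tol=1e-10):
    """Poisson regression log(mu) = a + b*x by Newton-Raphson; returns (a, b)."""
    a, b = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: negative Hessian of the log-likelihood.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# (a) Linear model: predictions are unbounded and can fall below zero.
a_ols, b_ols = fit_ols(distance, counts)
pred_ols = [a_ols + b_ols * x for x in distance]

# (b) Log-linear model: OLS on log(count + 1); the "+ 1" is an assumption
# made here because log(0) is undefined.
a_log, b_log = fit_ols(distance, [math.log(c + 1) for c in counts])
pred_log = [math.exp(a_log + b_log * x) - 1 for x in distance]

# (c) Poisson model: predictions are expected counts and always positive.
a_pois, b_pois = fit_poisson(distance, counts)
mu_pois = [math.exp(a_pois + b_pois * x) for x in distance]

for x, y, p1, p2, p3 in zip(distance, counts, pred_ols, pred_log, mu_pois):
    print(f"{x:6.1f} km  observed={y:3d}  linear={p1:7.2f}  "
          f"log-linear={p2:7.2f}  poisson={p3:7.2f}")
```

With these invented values, the linear fit predicts a negative count at the most distant site, whereas the Poisson fit keeps every expected count above zero. In practice an established statistical package would normally replace the hand-written fitting routine.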


Table 1. Definitions.


FIGURE 2. Flowchart outlining recommended procedures for fitting and evaluating count regression models. At each box or node, practitioners are directed to complete a specific step. Where a decision is needed based on the results of that step, practitioners are directed onward according to whether the answer to the question in the box is “yes” or “no.” If practitioners follow a loop and return to that node, they may be presented with an “or” option for further model revision. The flowchart is not exhaustive, but it provides guidance on the model-fitting and diagnostic issues that archaeologists are likely to encounter. An example flowchart for models with multiple predictors is available in Supplemental Text 1.


FIGURE 3. Examples of exploratory graphical data analysis examining the number of houses in Yurok villages and how they vary by village size (data from Cook and Treganza 1950; Waterman 1920).


FIGURE 4. Examples of residual by fitted plots to examine (a) acceptable patterning in Poisson residuals, (b) patterning structured by between-group variation (dashed lines show group-level mean residuals), and (c) patterning structured by a nonlinear relationship between y and x not accounted for in the model.


FIGURE 5. Data from Shott (2022) showing the relationship between pottery inventory size and the number of household adults in Michoacán overlaid with graphical representations of (a) a null model, where only the mean of y is known; (b) the proposed fitted model; and (c) a saturated model with a parameter for every data point (i.e., a perfect fit). Generalized linear models assess how well the proposed model accounts for variation in y relative to the null (i.e., the log-likelihood of each value of y given x and the unknown parameters compared to the log-likelihood of each y value given only the y-intercept).
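The three reference models in Figure 5 can be compared numerically through their Poisson log-likelihoods. The sketch below is a hedged illustration only: the household data are invented (they are not the Shott 2022 data), the function names are hypothetical, and the summary it reports, deviance and the proportion of null deviance explained, is a standard generalized linear model quantity that the caption itself does not name.

```python
import math

# Hypothetical data, invented for illustration (not the Shott 2022 data):
# number of household adults and number of pottery vessels per household.
adults = [1, 2, 2, 3, 3, 4, 4, 5, 6, 7]
pots = [8, 12, 9, 15, 18, 17, 24, 22, 30, 41]

def poisson_loglik(y, mu):
    """Sum of Poisson log-probabilities of y given expected counts mu."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(mi) if yi > 0 else 0.0  # treat 0 * log(0) as 0
        total += term - mi - math.lgamma(yi + 1)
    return total

def fit_poisson(x, y, iters=50, tol=1e-10):
    """Poisson regression log(mu) = a + b*x by Newton-Raphson; returns (a, b)."""
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(a + b * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

n = len(pots)

# (a) Null model: every household gets the overall mean count.
mu_null = [sum(pots) / n] * n
# (b) Fitted model: expected count depends on the number of adults.
a_hat, b_hat = fit_poisson(adults, pots)
mu_fit = [math.exp(a_hat + b_hat * x) for x in adults]
# (c) Saturated model: one parameter per observation, so mu equals y.
mu_sat = [float(y) for y in pots]

ll_null = poisson_loglik(pots, mu_null)
ll_fit = poisson_loglik(pots, mu_fit)
ll_sat = poisson_loglik(pots, mu_sat)

# Deviance measures how far a model falls short of the saturated fit.
d_null = 2.0 * (ll_sat - ll_null)
d_resid = 2.0 * (ll_sat - ll_fit)
explained = 1.0 - d_resid / d_null

print(f"log-likelihoods: null={ll_null:.2f} fitted={ll_fit:.2f} saturated={ll_sat:.2f}")
print(f"null deviance={d_null:.2f} residual deviance={d_resid:.2f}")
print(f"proportion of null deviance explained={explained:.3f}")
```

The fitted model's log-likelihood always lies between the null and saturated values, so the residual deviance can be read as the share of the null model's shortfall that the predictor fails to recover.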


FIGURE 6. Partial response plots showing the predicted number of projectile point types per time period in Texas as a function of (a) regional precipitation inferred from sedimentary stable carbon isotope ratios and (b) global temperature inferred from atmospheric stable oxygen isotope ratios (data from Buchanan et al. 2016). Each panel shows the predicted response of projectile point type counts to the focal climate variable while holding the other climate variable constant at the minimum observed value. This is done to illustrate how regional precipitation (a) does not influence technological investment even under the coolest conditions, whereas (b) global temperatures promote increasing technological investment even under wetter conditions.
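The construction behind a partial response plot, predicting across one variable while fixing the other at its minimum observed value, can be sketched as follows. Everything numeric here is an assumption: the coefficients and the standardized predictor values are invented for illustration and are not estimates from the article or from the Buchanan et al. data, and the names `predict` and `partial_response` are hypothetical.

```python
import math

# Hypothetical coefficients for log(mu) = b0 + b_precip*precip + b_temp*temp.
# Invented for illustration; not estimates from the article.
COEF = {"intercept": 1.2, "precip": 0.02, "temp": 0.35}

def predict(precip, temp, coef=COEF):
    """Expected count under a Poisson model with a log link."""
    eta = coef["intercept"] + coef["precip"] * precip + coef["temp"] * temp
    return math.exp(eta)

def partial_response(focal, focal_values, other_values, coef=COEF):
    """Predicted counts across focal_values, other predictor held at its minimum."""
    held = min(other_values)
    if focal == "precip":
        return [predict(v, held, coef) for v in focal_values]
    if focal == "temp":
        return [predict(held, v, coef) for v in focal_values]
    raise ValueError("focal must be 'precip' or 'temp'")

# Hypothetical standardized predictor values (assumed, unitless).
precip_obs = [-1.0, -0.5, 0.0, 0.5, 1.0]
temp_obs = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Panel (a): vary precipitation, hold temperature at its observed minimum.
curve_a = partial_response("precip", precip_obs, temp_obs)
# Panel (b): vary temperature, hold precipitation at its observed minimum.
curve_b = partial_response("temp", temp_obs, precip_obs)

for p, m in zip(precip_obs, curve_a):
    print(f"precip={p:5.1f}  predicted types={m:6.2f}")
for t, m in zip(temp_obs, curve_b):
    print(f"temp={t:5.1f}  predicted types={m:6.2f}")
```

Because the model is linear on the log scale, each one-unit increase in the focal variable multiplies the predicted count by the exponential of its coefficient, which is why the curves bend rather than run straight on the count scale.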

Supplementary material: File

Codding and Brewer supplementary material

File 671.1 KB