Archaeological data come in all sizes, shapes, and quantities, ranging from Egyptian pyramids (large in size, few in number) to micro-debitage from a lithic workshop or molecular residues in a ceramic bowl. Because we ask different questions of the data, we represent them in different ways. One representation dominates, however, because it is so flexible: a rectangular arrangement in which each row represents an observation and each column represents a measurement on that observation. Some of those measurements can be counts, and each item counted is a potential observation in another data table.
For example, we may have located a variety of archaeological sites in a river valley. One data table could consist of the grid units that were surveyed, so that each row of the table is a grid square (e.g., 100 m on a side). The columns of the data set include the coordinates of the unit and the number of sites and isolated artifact finds discovered during the survey. Other columns could record when the unit was surveyed and describe its setting, such as the dominant soil type and its position relative to major waterways, lakes, and so on. This data set would be relevant to exploring questions about site density. For example, are there more sites near water features and fewer in upland areas away from any water source?
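The survey table described above can be sketched in a few lines of Python. This is only an illustration: the column names (`unit_id`, `near_water`, `n_sites`, and so on) and every value are invented for the example, and the helper `mean_sites` is hypothetical, not part of any library.

```python
# Hypothetical survey table: one row per 100 m grid unit.
# Every column name and value here is invented for illustration.
grid_units = [
    {"unit_id": "G01", "easting": 0,   "northing": 0,   "near_water": True,  "n_sites": 3, "n_isolated_finds": 5},
    {"unit_id": "G02", "easting": 100, "northing": 0,   "near_water": True,  "n_sites": 2, "n_isolated_finds": 1},
    {"unit_id": "G03", "easting": 200, "northing": 0,   "near_water": True,  "n_sites": 4, "n_isolated_finds": 7},
    {"unit_id": "G04", "easting": 0,   "northing": 100, "near_water": False, "n_sites": 0, "n_isolated_finds": 2},
    {"unit_id": "G05", "easting": 100, "northing": 100, "near_water": False, "n_sites": 1, "n_isolated_finds": 0},
    {"unit_id": "G06", "easting": 200, "northing": 100, "near_water": False, "n_sites": 0, "n_isolated_finds": 1},
]


def mean_sites(rows, near_water):
    """Mean number of sites per grid unit among units with the given water flag."""
    counts = [row["n_sites"] for row in rows if row["near_water"] == near_water]
    return sum(counts) / len(counts)


# Compare site density in units near water with units away from water.
print("near water:", mean_sites(grid_units, True))
print("away from water:", mean_sites(grid_units, False))
```

Each dictionary is one row (a grid unit) and each key is one column (a measurement on that unit). A real analysis would also need to account for survey coverage and visibility, which this sketch ignores.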
Each of the sites counted in this data set is a potential row in another data set. That data set consists of a row for each site and columns for the location of the site, the area of the site, the physical characteristics around the site (e.g., slope, elevation, aspect, soil type), and the numbers of each kind of artifact and feature found on the site. This data set would be relevant to questions regarding where sites are located and how the artifacts and features found on sites differ.
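To make the link between the two tables concrete, the following sketch builds a small site table and recovers the per-unit site counts from it. The column names, the `unit_id` key that ties a site to its grid unit, and all values are assumptions made for the example.

```python
from collections import Counter

# Hypothetical site table: one row per site.
# "unit_id" is an assumed key linking each site to a survey grid unit;
# all names and values are invented for illustration.
sites = [
    {"site_id": "S1", "unit_id": "G01", "area_m2": 450.0,  "elevation_m": 212.0, "n_points": 4, "n_flakes": 120},
    {"site_id": "S2", "unit_id": "G01", "area_m2": 90.0,   "elevation_m": 214.5, "n_points": 0, "n_flakes": 35},
    {"site_id": "S3", "unit_id": "G02", "area_m2": 1300.0, "elevation_m": 210.0, "n_points": 9, "n_flakes": 410},
    {"site_id": "S4", "unit_id": "G05", "area_m2": 60.0,   "elevation_m": 251.0, "n_points": 1, "n_flakes": 12},
]

# Counting site rows by grid unit reproduces the kind of count column
# that the survey table stores for each unit.
sites_per_unit = Counter(site["unit_id"] for site in sites)

for unit_id, n_sites in sorted(sites_per_unit.items()):
    print(unit_id, n_sites)
```

The count stored in one table is therefore a summary of rows in the next table down, which is why the same rectangular layout can be reused at every level of detail.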
Each of the artifacts and features in the site data set is a potential row in another data set (or more likely multiple data sets). At this point it may make sense to create separate data sets for projectile points, flakes, cores, pottery sherds, shells, bones, and other categories of material.
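One way to picture this step is to start from a single list of artifact records and split it into one table per category. The sketch below assumes a `category` field and a few category-specific measurements; these names and all values are hypothetical and chosen only to show that different categories may need different columns.

```python
from collections import defaultdict

# Hypothetical artifact records from several sites.
# Field names and values are invented; note that the measurements
# recorded differ from one category to another.
artifacts = [
    {"artifact_id": "A1", "site_id": "S1", "category": "projectile point", "length_mm": 42.0, "width_mm": 18.5},
    {"artifact_id": "A2", "site_id": "S1", "category": "flake", "weight_g": 3.2},
    {"artifact_id": "A3", "site_id": "S3", "category": "flake", "weight_g": 1.1},
    {"artifact_id": "A4", "site_id": "S3", "category": "pottery sherd", "thickness_mm": 6.0, "temper": "shell"},
]

# Build one table (a list of rows) per category, dropping the
# now-redundant "category" column from each row.
tables = defaultdict(list)
for record in artifacts:
    row = {key: value for key, value in record.items() if key != "category"}
    tables[record["category"]].append(row)

for category, rows in tables.items():
    print(category, len(rows), sorted(rows[0]))
```

Keeping `site_id` in every row preserves the link back to the site table, so the separate artifact tables can still be related to site location and setting.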