Determining Provenance from Compositional Data

Pedro A. López-García; Denisse L. Argote

doi:10.1017/9781009634205

Series: Elements in Current Archaeological Tools and Techniques

Published online by Cambridge University Press: 24 February 2026

Pedro A. López-García: National Institute of Anthropology and History, Mexico
Denisse L. Argote: National Institute of Anthropology and History, Mexico

Summary

Traditionally, classical multivariate statistical methods have been applied to relate cultural materials recovered at archaeological sites to their respective raw material sources. However, when reviewing published research, which usually claims to have reached a high degree of confidence in the assignment of materials, the authors have detected that those applying these methods can make serious errors that compromise the inferences made. This Element reconsiders the use of statistical methods to address the problem of provenance analysis of archaeological materials using a step-by-step procedure that allows the recognition of natural groups in the data, thus obtaining better quality classifications while avoiding the problems of total or partial overlaps in the chemical groups (common in biplots). To evaluate the methods proposed here, the challenge of group search in ceramic materials is addressed using algorithms derived from model-based clustering. For cases with partial data labeling, a semi-supervised algorithm is applied to obsidian samples.
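The idea of letting a mixture model reveal "natural groups" can be illustrated with a minimal sketch. This is not the authors' procedure: it is a one-dimensional Gaussian mixture fitted by EM in plain Python, with the number of groups compared by BIC. The function names (`fit_gmm_1d`, `_e_step`), the quantile-based initialisation, the variance floor, and the toy values are all assumptions made for illustration. BIC is written here as -2 log L + p log n, so lower is better; some software reports it with the opposite sign.

```python
import math

def _e_step(xs, weights, means, variances):
    """Return responsibilities and the log-likelihood, computed in log space."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [
            math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(logs)
        log_total = top + math.log(sum(math.exp(l - top) for l in logs))
        loglik += log_total
        resp.append([math.exp(l - log_total) for l in logs])
    return resp, loglik

def fit_gmm_1d(xs, k, iters=200, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative only)."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: component means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp, _ = _e_step(xs, weights, means, variances)
        for j in range(k):
            nk = max(sum(r[j] for r in resp), 1e-12)  # guard empty components
            weights[j] = nk / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[j] = max(spread, var_floor)
    resp, loglik = _e_step(xs, weights, means, variances)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    bic = -2.0 * loglik + n_params * math.log(n)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return {"bic": bic, "labels": labels, "means": means}

# Made-up values standing in for one transformed element concentration.
values = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0, 5.1, 4.9, 5.05, 4.95]
one_group = fit_gmm_1d(values, 1)
two_groups = fit_gmm_1d(values, 2)
print("BIC, 1 group:", round(one_group["bic"], 2))
print("BIC, 2 groups:", round(two_groups["bic"], 2))
print("labels:", two_groups["labels"])
```

Real provenance data are multivariate and compositional, so in practice a dedicated mixture-modelling package and an appropriate data transformation would replace this toy; the sketch only shows how a likelihood-based criterion, rather than visual inspection of a biplot, can decide between candidate groupings.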

Keywords

archaeometry; compositional analysis; model-based clustering; semi-supervised classification; provenance analysis

Information

Type: Element
DOI: https://doi.org/10.1017/9781009634205

Online ISBN: 9781009634205

Publisher: Cambridge University Press

Print publication: 26 March 2026

