
Applying a wide and deep learning model to core-scan XRF data to reconstruct mineral assemblages for Pleistocene paleolake Olduvai, Tanzania

Published online by Cambridge University Press: 27 January 2026

Lindsay J. McHenry*
Affiliation:
Department of Geosciences, University of Wisconsin-Milwaukee, Milwaukee, WI, USA
Gayantha R.L. Kodikara
Affiliation:
Department of Geosciences, University of Wisconsin-Milwaukee, Milwaukee, WI, USA
Ian G. Stanistreet
Affiliation:
Department of Earth, Ocean and Ecological Sciences, University of Liverpool, Liverpool, UK; The Stone Age Institute, Bloomington, IN, USA
Harald Stollhofen
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander-University (FAU) Erlangen-Nürnberg, Erlangen, Germany
Jackson Njau
Affiliation:
The Stone Age Institute, Bloomington, IN, USA; Department of Earth and Atmospheric Sciences, Indiana University, Bloomington, IN, USA
Kathy Schick
Affiliation:
The Stone Age Institute, Bloomington, IN, USA
Nicholas Toth
Affiliation:
The Stone Age Institute, Bloomington, IN, USA
*Corresponding author: Lindsay J. McHenry; Email: lmchenry@uwm.edu

Abstract

Paleolake coring initiatives result in large datasets from various proxies taken at different resolutions, ranging from continuous scans to samples collected at coarser intervals. Higher-resolution data (e.g., core-scan X-ray fluorescence [XRF]) can detect short-duration changes in the paleolake and help identify unit boundaries with precision; however, interpreting the causes of such changes may require sampling and more intensive laboratory analysis like X-ray diffraction (XRD). This study applies a published wide and deep learning model, developed for the Olduvai Gorge Coring Project (OGCP) 2014 cores from the Pleistocene Olduvai basin, Tanzania, to reconstruct the mineral assemblages from saline-alkaline paleolake Olduvai using core-scan XRF data and core lithology. A classification model (predicting mineral presence or absence) and a regression model (predicting relative abundances of minerals) yielded predictions for two OGCP cores (2A and 3A), which were compared with published XRD mineral data and detailed core sedimentological descriptions. The models were excellent at identifying dolomite-rich layers, carbonate-rich intervals, intervals of sandstone within claystone, and altered tuffs within claystone and at predicting whether illitic or smectitic clays dominate. The models struggled with less-altered tuffs and with zeolites in non-tuff sediments, especially when XRD identified chabazite and erionite (rather than phillipsite) as the dominant, non-analcime zeolite.
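
The general wide and deep idea named in the abstract can be sketched as follows. This is a minimal illustration, not the published OGCP model: it assumes that the core lithology class feeds the "wide" (linear) path as a one-hot vector and that continuous XRF features feed the "deep" path, and the feature names, layer sizes, and weights are all hypothetical toy values.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def wide_and_deep_probability(lithology_onehot, xrf_features, params):
    """Probability that one mineral class is present at one scan depth."""
    # Wide path (assumed): a linear term over the one-hot lithology class.
    wide_logit = sum(w * x for w, x in zip(params["wide_w"], lithology_onehot))
    # Deep path (assumed): one ReLU hidden layer over continuous XRF features.
    hidden = [max(0.0, h)
              for h in dense(xrf_features, params["deep_w1"], params["deep_b1"])]
    deep_logit = dense(hidden, params["deep_w2"], params["deep_b2"])[0]
    # The two logits are summed before a shared sigmoid.
    return sigmoid(wide_logit + deep_logit + params["bias"])

# Toy, hand-picked weights (not trained, not taken from the paper).
toy_params = {
    "wide_w": [0.5, -1.0, 0.25],            # hypothetical: claystone, sandstone, tuff
    "deep_w1": [[1.0, -1.0], [-0.5, 2.0]],  # two hidden units, two XRF features
    "deep_b1": [0.0, 0.1],
    "deep_w2": [[0.8, -0.6]],
    "deep_b2": [0.0],
    "bias": -0.2,
}

# One claystone scan point with two hypothetical normalized XRF features.
p = wide_and_deep_probability([1, 0, 0], [0.6, 0.3], toy_params)
```

In a multi-mineral setting, one such output per mineral class would give an independent presence probability for each class; a regression variant would replace the sigmoid with an output suited to relative abundance.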

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Quaternary Research Center.

Figure 1. Location of the Olduvai basin and positions of the 2014 Olduvai Gorge Coring Project (OGCP) cores. (A) Map of East Africa. (B) Regional map showing the geographic context of Olduvai Gorge on the shoulder of the eastern African Rift (EAR), with the Ngorongoro Volcanic Highlands (NVH) to the east and south and metamorphic basement exposures to the west and north (map after Ashley and Hay, 2002). (C) Map of the Olduvai basin, showing the locations of Olduvai Gorge, major faults, the reconstructed position of the Olduvai paleolake during the deposition of Beds I and II, and the position of the three 2014 OGCP coring locations (map after Stanistreet et al., 2020a).

Three maps. The first shows the location of Olduvai Gorge in eastern Africa. The second is a regional map showing Olduvai’s location relative to nearby volcanoes, the Serengeti, and the East African Rift. The third shows the outline of the modern Olduvai Gorge with its major faults, with the position of the paleolake depocenter superimposed. All three OGCP borehole locations lie within this depocenter, with Core 3A at the north end and Cores 1A and 2A closer to the modern gorge.

Figure 2. Stratigraphic sections for Olduvai Gorge Coring Project (OGCP) Cores 1A, 2A, and 3A. Beds I and II and the top of the Ngorongoro Formation have outcrop equivalents, while most of the Ngorongoro Formation and all of the Naibor Soit Formation are known only from these cores. The intervals for which X-ray diffraction (XRD) data are available in McHenry et al. (2020b) are indicated with red brackets. The representative sections discussed at length in this paper are indicated with red dots.

Stratigraphic sections for Cores 1A, 2A, and 3A, color coded for lithology. Core 2A is the longest, reaching ~250 meters below the surface. Core 2A has alternating intervals of fluvio-lacustrine deposits, indicated by green claystone and yellow sandstone, with volcanic and volcaniclastic materials from the Ngorongoro Formation, indicated with reds and purples. The intervals studied in this paper are indicated, and include most of the fluvio-lacustrine intervals for Cores 2A and 3A.

Figure 3. Classification model results for core interval 2A-23Y-(1-2), 54–56.8 m below surface (mbs), vs. lithological classes, with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). Solid blue lines show stratigraphic positions of XRD-analyzed samples. For the classification categories: Qc = quartz, Pl = plagioclase, Cc = calcite, Do = dolomite, Kf = K-feldspar (including anorthoclase), Ze = non-analcime zeolite (including chabazite, phillipsite, erionite, clinoptilolite), An = analcime, Sm = smectitic clay, Il = illitic clay, Fe = iron-bearing minerals. XRD results use the same abbreviations, plus Ab = albite, Ac = anorthoclase, Ar = aragonite, Ch = chabazite, Cl = clinoptilolite, Er = erionite, Gl = glass, Hb = hornblende, and Ph = phillipsite. Classification data plotted show model-calculated probability (0 to 1) that a mineral is present. Values greater than 0.5 indicate likely presence. In this interval, quartz has the highest probabilities (and analcime has lower probabilities) in sandstone and the lowest probabilities below 55.7 mbs, consistent with XRD. The dashed red lines indicate the positions of thin sandstone within the claystone-dominated lower interval, highlighting these mineralogical differences.

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval, in this case consisting of interlayered sandstone, claystone, and clay-sand.
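
The 0.5 presence threshold described in the Figure 3 caption, and the comparison against XRD phase IDs, can be sketched as below. This is an illustrative assumption about how such a comparison might be coded, not the authors' workflow: the function names are hypothetical, the probabilities are made-up values, and the XRD-to-class mapping covers only the groupings the caption states (chabazite, phillipsite, erionite, and clinoptilolite as non-analcime zeolite; anorthoclase as K-feldspar). XRD phases with no stated model class (Ab, Ar, Gl, Hb) are ignored.

```python
# Model classes, using the abbreviations from the Figure 3 caption.
MODEL_CLASSES = ["Qc", "Pl", "Cc", "Do", "Kf", "Ze", "An", "Sm", "Il", "Fe"]

# XRD phases that the caption explicitly folds into a model class.
XRD_TO_CLASS = {"Ch": "Ze", "Ph": "Ze", "Er": "Ze", "Cl": "Ze", "Ac": "Kf"}

def predicted_presence(probabilities, threshold=0.5):
    """Classes whose probability is greater than the threshold."""
    return {m for m, p in probabilities.items() if p > threshold}

def compare_with_xrd(probabilities, xrd_phases, threshold=0.5):
    """Split classes into agreement, model-only, and XRD-only sets."""
    predicted = predicted_presence(probabilities, threshold)
    # Map XRD phase IDs onto model classes and drop phases without one.
    observed = {XRD_TO_CLASS.get(ph, ph) for ph in xrd_phases}
    observed &= set(MODEL_CLASSES)
    return {
        "agree": sorted(predicted & observed),
        "model_only": sorted(predicted - observed),
        "xrd_only": sorted(observed - predicted),
    }

# Made-up probabilities at one depth and made-up XRD IDs for that sample.
example = compare_with_xrd(
    {"Qc": 0.9, "An": 0.2, "Ze": 0.7, "Cc": 0.5},
    ["Qc", "Ph", "An", "Gl"],
)
```

With these invented inputs, calcite at exactly 0.5 is not counted as present, and glass is dropped because the model has no class for it.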

Figure 4. Regression model results for core interval 2A-23Y-(1-2), 54–56.8 m below surface (mbs), vs. lithological classes. QC = quartz, PK = feldspar (plagioclase and K-feldspar), CD = carbonates (calcite and dolomite), AZ = zeolites (analcime and non-analcime), and SI = clays (illitic or smectitic). Abundances are relative and do not add up to 1 (because one sample can have more than one “abundant” mineral).

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval, in this case consisting of interlayered sandstone, claystone, and clay-sand.
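
The note in the Figure 4 caption that relative abundances need not add up to 1 can be illustrated with a small sketch. The scores and the 0.5 cutoff below are hypothetical values chosen for illustration only; the caption does not define a cutoff for "abundant", and the actual output scale of the regression model is not assumed here.

```python
# Mineral groups, using the abbreviations from the Figure 4 caption.
GROUPS = ["QC", "PK", "CD", "AZ", "SI"]

def abundant_groups(scores, cutoff=0.5):
    """Groups whose relative-abundance score meets a (hypothetical) cutoff."""
    return [g for g in GROUPS if scores[g] >= cutoff]

# Made-up scores for one scan depth: each group is scored independently,
# so nothing forces the five values to sum to 1.
scores = {"QC": 0.2, "PK": 0.3, "CD": 0.7, "AZ": 0.1, "SI": 0.8}

total = sum(scores.values())
co_abundant = abundant_groups(scores)
```

Here both carbonates and clays exceed the cutoff at the same depth, which is the situation the caption describes: one sample with more than one "abundant" mineral group.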

Figure 5. Classification model results for core interval 3A-23Y-(1-2), 54–57 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. Plots show the likely presence of calcite and K-feldspar for most of this interval, with the probability of analcime increasing at the expense of K-feldspar above 54.5 mbs, indicated by the dashed red line. Illite is more likely than smectite for most of this interval.

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval, in this case consisting of mostly claystone or clay sand with few, thin layers of carbonate or sandstone.

Figure 6. Regression model results for core interval 3A-23Y-(1-2), 54–57 m below surface (mbs), vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval, in this case consisting of mostly claystone or clay sand with few, thin layers of carbonate or sandstone.

Figure 7. Classification model results for core interval 2A-28Y-1 to 29Y-1, 64.9–67.4 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. This core interval includes marker Tuff IF (interval indicated by dashed red lines), identifiable here by its lower probability for containing calcite or dolomite and its higher probability of containing non-analcime zeolites, compared with enclosing claystone.

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval, which contains a thick tuff (Tuff IF) in the upper half with a carbonate layer directly below. Above and below these units is claystone or sandy claystone.

Figure 8. Regression model results for core interval 2A-28Y-1 to 29Y-1, 64.9–67.4 m below surface (mbs) vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description. Tuff IF is identifiable in the model results by its lower abundance of carbonate, increased abundance of zeolite, and lower abundance of clay compared with enclosing claystone.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval, which contains a thick tuff (Tuff IF) in the upper half with a carbonate layer directly below. Above and below these units is claystone or sandy claystone.

Figure 9. Classification model results for core interval 3A-24Y-(1-2), 57–60 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. This Core 3A interval also includes marker Tuff IF, which shows up clearly with its lower probabilities for calcite, dolomite, and analcime, combined with its higher probabilities for non-analcime zeolites.

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval. Tuff IF dominates the upper third of the section, with a carbonate layer beneath. The rest of the section (above and below) consists of claystone or sandy claystone, with thin interlayers of tuff, sandstone, and carbonate in the strata below.

Figure 10. Regression model results for core interval 3A-24Y-(1-2), 57–60 m below surface (mbs), vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description. Tuff IF is identifiable based on its higher predicted abundance of zeolites and feldspar and lower abundance of carbonate compared with surrounding sediments. Other thinner tuff beds in the same interval do not provide as strong a signal.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval. Tuff IF dominates the upper third of the section, with a carbonate layer beneath. The rest of the section (above and below) consists of claystone or sandy claystone, with thin interlayers of tuff, sandstone, and carbonate in the strata below.

Figure 11. Classification model results for core interval 3A-31Y-2, 79.5–81 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. This interval contains the top of a dolomitic marl, which can be seen in the high probability of dolomite. Increased zeolite probabilities toward the top correspond to a thin tuff bed (dashed red line).

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval. The top two thirds of the strata in this section are claystone with a few thin sandstones and one very thin tuff, while the bottom third is carbonate.

Figure 12. Regression model results for core interval 3A-31Y-2, 79.5–81 m below surface (mbs), vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval. The top two thirds of the strata in this section are claystone with a few thin sandstones and one very thin tuff, while the bottom third is carbonate.

Figure 13. Classification model results for core interval 3A-39Y-(1-2), 99–102 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. The increased probability of iron-bearing phases in the upper part of this interval corresponds with pyrite in the core descriptions (indicated on the lithology column by orange diamonds). The transition from illite to smectite and the abrupt increase in carbonate are consistent with XRD results; however, the increase in zeolite abundance is not, as no zeolites were identified by XRD in this part of the core.

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval, which in this case is almost entirely claystone with a few thin sandstones and one thin carbonate layer in the upper quarter.

Figure 14. Regression model results for core interval 3A-39Y-(1-2), 99–102 m below surface (mbs), vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description. The transition to carbonate-rich sediment is consistent with X-ray diffraction (XRD) results, but the peak in zeolite intensity is not.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval, which in this case is almost entirely claystone with a few thin sandstones and one thin carbonate layer in the upper quarter.

Figure 15. Classification model results for core interval 2A-51Y-2, 130.45–131.9 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. This core interval is entirely composed of less altered tuff from the Ngorongoro Formation. Clay and carbonate probabilities are low, whereas feldspar, quartz, and zeolite probabilities are high. XRD did not reveal zeolite for this interval, instead showing volcanic glass (a phase not considered in this model).

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval, which in this case is entirely tuff.

Figure 16. Regression model results for core interval 2A-51Y-2, 130.45–131.9 m below surface (mbs), vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval, which in this case is entirely tuff.

Figure 17. Classification model results for core interval 2A-65Y-(1-2), 168–170.9 m below surface (mbs), vs. lithological classes with X-ray diffraction (XRD) phase IDs for individual samples from McHenry et al. (2020b). See Figure 3 caption for mineral abbreviations and for scale description. This core interval represents the Naibor Soit Formation. The high level of “noise” in the probabilities for quartz, plagioclase, calcite, and zeolites likely reflects actual mineralogical variability over short time intervals, as this sample-to-sample variability is also observed in the XRD data, but it could also be attributed to lack of adequate model training. The model predicts the presence of analcime throughout much of this core section, but it was not detected in any of the XRD samples (which all happen to coincide with dips in its probability). Conversely, non-analcime zeolites are detected in all XRD samples, although the probability of their presence is on average lower than that of analcime.

Plots showing the probabilities, between 0 and 1, for the presence of a specific mineral or mineral group. There are ten columns, each containing a plot of 0 to 1 probability on the x axis and the depth within the core on the y axis. To the left of the columns are the XRD-based mineral identifications for specific depths. To the right of the columns is a stratigraphic section for the interval. The top and bottom of this interval are sandy claystone, while the middle two thirds are claystone.
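
The observation in the Figure 17 caption that the XRD samples coincide with dips in analcime probability could be checked along the following lines. This is a minimal sketch under stated assumptions: a "dip" is taken to mean a value below the interval mean, each XRD sample is matched to the nearest scan depth, and the depths and probabilities are made-up numbers rather than data from the cores. The function names are hypothetical.

```python
def nearest_value(depths, values, target_depth):
    """Value at the scan depth closest to target_depth."""
    i = min(range(len(depths)), key=lambda k: abs(depths[k] - target_depth))
    return values[i]

def samples_in_dips(depths, probabilities, sample_depths):
    """For each XRD sample depth, report whether the nearest scan
    probability falls below the mean probability for the interval."""
    mean_p = sum(probabilities) / len(probabilities)
    report = []
    for d in sample_depths:
        p = nearest_value(depths, probabilities, d)
        report.append((d, p, p < mean_p))
    return mean_p, report

# Made-up scan depths (mbs) and analcime probabilities.
scan_depths = [168.0, 168.5, 169.0, 169.5, 170.0]
analcime_p = [0.8, 0.3, 0.9, 0.2, 0.8]

# Made-up XRD sample depths.
mean_p, report = samples_in_dips(scan_depths, analcime_p, [168.5, 169.6])
```

In this invented case both samples fall at below-mean probabilities, so a sparse XRD sample set would miss the depths where the model is most confident that analcime is present.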

Figure 18. Regression model results for core interval 2A-65Y-(1-2), 168–170.9 m below surface (mbs), vs. lithological classes. See Figure 4 caption for mineral abbreviations and for scale description.

The figure shows five graphs in columnar format, plotting predicted mineral abundance on the x axis and depth below surface on the y axis. The graphs show trends for quartz, feldspars, carbonates, zeolites, and clays. To the right of the graphs is the stratigraphic section for this interval. The top and bottom of this interval are sandy claystone, while the middle two thirds are claystone.
Supplementary material: File

McHenry et al. supplementary material

File 2.9 MB