In their comment on our recently published paper (Brugger and others, 2018a), Festi and others (2019) conducted a reliability test of two standardized marker tablets (Lycopodium and Eucalyptus), which in our opinion confirms our interpretations. Indeed, based on 20 replicates, their analyses show a striking linear relationship between the two markers (visible in Fig. 1 of their comment). However, we argue that, on statistical and methodological grounds, the reliability of individual markers cannot be questioned solely on the basis of a bivariate marker ratio and a test of 20 tablets:
(1) Previous studies showed that a microfossil ratio yields statistically robust relationships (95% confidence) when a minimum of 200 items is counted per sample (including Lycopodium; Finsinger and Tinner, 2005; Payne and Mitchell, 2009; Campbell and others, 2016). Thus, the additional counting effort by Festi and others (2019) would have been more fruitfully invested in counting more replicates than in increasing the counting sums. Specifically, a set of 10 or 20 replicates is too small to draw conclusions on the absolute uncertainty of the tablet contents, and particularly to assess differences in ratios of pollen counts, as done by Festi and others (2019). For univariate datasets a minimum of 30 observations is typically required (Bahrenberg and others, 1990). For bivariate datasets the sample size is usually expected to be >100 (Mittag, 2011).
(2) To test the reliability of the individual marker uncertainties, the authors would need an independent third marker (or reference) rather than a ratio of two variables (Maher, 2000). Alternatively, the authors could count the total absolute content of each tablet, but this is more difficult to achieve (thousands of items).
(3) Research groups in paleoecology have successfully used Lycopodium and/or Eucalyptus tablets for decades (Maher, 1981, 2000). Recent successful Eucalyptus marker-based studies include Larochelle and others (2018) and Salonen and others (2018). Thus there is no need to question the Eucalyptus marker procedure, especially given the clear linkage found by the authors, which shows that counts of Eucalyptus pollen grains could be used to carefully estimate the mean Lycopodium spore concentrations if the absolute numbers of Eucalyptus were known (Maher, 2000). Indeed, we agree with Festi and others (2019) that the tablets need to derive from standardized and tested production, as did all tablets (Lycopodium and Eucalyptus) used in our study. In fact, they were produced following the same standardized procedure established at the University of Lund (Stockmarr, 1971; Maher, 2000). Moreover, the reliability of tablets from a batch depends on how complete they are and how they have been stored. This is why all marker tablets in Brugger and others (2018a, 2018b) were first checked for potential erosion.
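The sample-size argument in point (1) can be illustrated with a minimal sketch. It assumes normally distributed tablet contents and uses the standard large-sample approximation that the relative standard error of a sample standard deviation is about 1/sqrt(2(n − 1)). The function name and the chosen replicate numbers are illustrative assumptions and are not taken from either study.

```python
import math

def relative_se_of_sd(n):
    """Approximate relative standard error of a sample standard deviation.

    Assumes n independent, normally distributed observations (an assumption
    made for this illustration only).
    """
    return 1.0 / math.sqrt(2 * (n - 1))

# Replicate numbers discussed in the text (10, 20) and the quoted
# rule-of-thumb thresholds (30 for univariate, >100 for bivariate data).
for n in (10, 20, 30, 100):
    print(f"n = {n:3d}: relative SE of the sample SD ~ {relative_se_of_sd(n):.1%}")
```

Under these assumptions, the spread of tablet contents estimated from 20 replicates is itself considerably more uncertain than one estimated from 100 replicates.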
Taken together, the comment by Festi and others (2019) and our own published test with ten tablet replicates clearly show that abundance estimates based on the two markers used, Lycopodium and Eucalyptus, are strongly related; for instance, the means of the pollen or spore concentrations are mutually predictable in their study. Most importantly, the majority of observations (60% of the 20 tested tablets) are actually within the uncertainty estimates (one std dev. = 68% of observations) predicted from the estimate by the tablet producer. If the statistical population (N) contains only 20 observations, each random observation has a numerical weight of 5%; thus a divergence of 8 percentage points (60% inside the ratio limits instead of the 68% provided by the tablet producers) is well within the expected variability. It could also be that part of this difference is due to the Lycopodium tablets. Hence, the counts of Festi and others (2019) clearly show that the Lycopodium:Eucalyptus ratio cannot vary between 0.3 and 1.8 as claimed by O'Rourke (1986). We thus strongly suggest trusting the tablet uncertainties provided by the laboratory. In line with this reasoning, previous studies suggested that adding two markers (i.e. Lycopodium and Eucalyptus) as a standard to each sample may help to further reduce the uncertainty range of the markers for microfossil concentrations (Adam and Robinson, 1988).
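The arithmetic behind the 60% versus 68% comparison can be sketched as follows. This is a minimal illustration, not an analysis from either study: it assumes that each of the 20 tablets independently falls within one standard deviation with probability 0.68, so that the number of tablets inside the limits follows a binomial distribution. All variable and function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n_tablets = 20        # replicates tested in the comment
p_within = 0.68       # share expected within one std dev. (normal assumption)
observed_within = 12  # 60% of 20 tablets

weight_per_tablet = 1 / n_tablets               # each tablet counts for 5%
expected_within = n_tablets * p_within          # 13.6 tablets
shortfall = expected_within - observed_within   # 1.6 tablets
# Chance of seeing 12 or fewer tablets inside the limits if 68% is correct
prob_at_most_observed = binom_cdf(observed_within, n_tablets, p_within)

print(f"weight per tablet: {weight_per_tablet:.0%}")
print(f"expected within one std dev.: {expected_within:.1f} tablets")
print(f"shortfall: {shortfall:.1f} tablets")
print(f"P(X <= {observed_within}): {prob_at_most_observed:.2f}")
```

Under these assumptions, the observed result differs from the expectation by fewer than two tablets, and an outcome of 12 or fewer tablets inside the limits has a probability of roughly 0.3, so it is not an unusual result for a sample of this size.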
Furthermore, the differences in pollen percentages between the centrifugation-based methods (FESTI, EICHLER) and the filtration- and evaporation-based methods indicate a clear variation in pollen composition (pollen percentages), which is independent of marker-based concentration estimates. Thus, as Festi and others (2019) correctly repeat, the method differences have important implications for the interpretation of pollen records, with a preferential loss of vesiculate pollen when using centrifugation-based methods, which is one of the fundamental conclusions of our method comparison (Brugger and others, 2018a).
The rather low altitude of Ortles glacier (3880 m a.s.l.) in Italy, presented by Festi and others (2019), leads to exceptionally high microfossil concentrations that allow the use of very small sample volumes (only 30 ml), which do not require multiple centrifugation steps. Our method comparison instead investigates ice archives with much lower microfossil concentrations. The method must therefore be far more sensitive and, at the same time, robust. For instance, palynological analyses of Central Asian high-alpine (>4000 m a.s.l.) ice cores required ~200–400 ml of water to produce reliable results (Brugger and others, 2018b, 2019a), while remote Central Greenland ice cores required ~2000 ml (Brugger and others, 2019b). Clearly, we never intended to design a method specifically for Ortles glacier (e.g. with a very low sample volume and very few centrifugation steps), as assumed by Festi and others (2019). Our method is instead designed for typical high-altitude continental (>4000 m a.s.l.) and remote polar ice archives that have far lower microfossil concentrations (Brugger and others, 2018b, 2019a, 2019b).
The conclusions by Festi and others (2019) therefore seem narrow in scope and applicable solely to the special, low-altitude ice archive of Ortles glacier. In summary, we find it understandable that Festi and others (2019) provide further details on their original method for Ortles glacier (Festi and others, 2015). We compared this method with other methods in a standardized framework, clearly labeling it as ‘modified’. The marker test presented in the comment by Festi and others (2019) provides firm evidence for the utility of Eucalyptus:Lycopodium marker ratios, which is correct but not a novel finding (see e.g. Adam and Robinson, 1988; Maher, 2000; Brugger and others, 2018a). Festi and others (2019) clarify methodological details of their approach but conclude from their Ortles perspective that the comparison of the six tested methods is questionable. We feel that this overall conclusion by Festi and others (2019) is unsupported by the available methodological and empirical evidence. Finally, the arguments of Festi and others (2019) do not change the fact that several of the evaluated approaches (e.g. BRUGGER, LIU; Liu and others, 1998; Brugger and others, 2018a) result in more stable ratios between the two markers, and group closer to expected values, than others (e.g. FESTI, EICHLER; Eichler and others, 2011; Festi and others, 2015).
We thank Ozan Akdogan for his advice on univariate and bivariate statistics. We are grateful to César Morales del Molino, and we acknowledge the SINERGIA project ‘Paleo fires from high-alpine ice cores’, funded by the Swiss National Science Foundation (SNF grant 154450). Furthermore, we thank two anonymous reviewers for their comments and the editor Martyn Tranter for handling our reply.