EVOLUTION OF RADIOCARBON CALIBRATION

ABSTRACT In the late 1950s it was recognized that levels of atmospheric radiocarbon (14C) had not been constant over time. Since then, researchers have sought to document those changes, initially through measurements of known age tree rings and more recently using other archives to create curves to correct or calibrate radiocarbon ages to calendar ages. This paper highlights some, but by no means all, of the efforts to create and extend radiocarbon calibration curves.

samples. They also provided the first Southern Hemisphere calibration curve from measurements of Argentine trees from AD 1400-1950.
The construction of a bristlecone pine chronology at the University of Arizona Laboratory of Tree Ring Research (Ferguson 1968) made it possible to extend radiocarbon calibration. Suess published calibration corrections from 4100 BC to 1500 BC based on bristlecone pine measured at La Jolla (Suess 1967). The calibration was extended with further measurements of bristlecone pine from 5400 BC to the present (Suess 1970) and was widely used by archaeologists (Ottaway and Ottaway 1972). As noted in comments by Renfrew, calibration had a tremendous impact on the interpretation of the dates of prehistoric Europe (Bermingham and Renfrew 1972).
Suess's calibration curve was not overwhelmingly accepted, as a number of researchers did not believe that the fine structure or "wiggles" were real (Burleigh et al. 1972; Switsur 1973). The fact that Suess said he had drawn the curves with "cosmic schwung" to demonstrate the "most probable character" of the curve did not instil confidence (Adams 1973; Olsson 2009). Switsur (1973) compared the calibration tables presented by the University of Arizona (Damon et al. 1974) and the University of Pennsylvania (Ralph et al. 1974) at the 8th International Conference on Radiocarbon Dating at Lower Hutt City, New Zealand, and found that the difference in corrected ages was fairly small. He therefore provided calibration corrections from the averages of the Arizona and Pennsylvania measurements. These corrections had what was felt to be an advantage over the Suess calibration corrections as they produced only one calibrated age for each radiocarbon measurement. Suess argued "nature is not necessarily as simple as we would wish and an attempt to draw a smooth curve amongst the measured points is probably the wrong approach" (Switsur 1973). Renfrew and Clark (1974) reviewed all the curves that had been published to date, as well as a formula for calibration (Wendland and Donley 1971), and found these were all statistically unsatisfactory. Clark (1975) then examined all the data and excluded duplicates as well as some data that did not use the NBS oxalic acid standard. He found that the variability between replicates of the same tree rings in different laboratories was in excess of the reported errors, which generally only included counting statistics. Clark used a spline function to construct a curve from the data, with the standard error determined from the replicates and a smoothing parameter determined by a cross-validation technique. The resulting curve was smoother than that of Suess (1970) but not as smooth as Switsur's (1973).
Indeed, Clark concluded that the "kinks and wobbles" in Suess's calibration curve are "not justified statistically." Suess responded that there was no statistical justification necessary for kinks, whereas Clark responded that his method had been tested and would detect "kinks" if they were there and not swamped by measurement errors (Suess and Clark 1976). No international agreement on a calibration curve could be reached at the Ninth International Conference in Los Angeles and La Jolla in 1976. It is no wonder that archaeologists began to doubt the validity of radiocarbon dating and calibration (Pilcher and Baillie 1978).
Meanwhile German and Irish oak dendrochronologies, though mostly still floating (i.e., not dendrochronologically linked to known calendar ages), were under development (Becker 1979; Pilcher and Baillie 1978). Suess presented measurements of floating German oak and California bristlecone pine to show that the 14C "wiggles" in them were synchronous and of the same amplitude within uncertainty (Suess 1979). High-precision gas proportional counters were developed in Groningen with an overall precision of 1.3‰ for modern carbon (Tans and Mook 1978) and in Seattle with a precision of 1.5-2‰ for a Pacific coast tree-ring series dating back to AD 1200 (Stuiver et al. 1979). A 500-yr-long section of the Southern German Neolithic floating oak chronology was measured in Groningen (de Jong et al. 1979), which confirmed the existence of the Suess-type wiggles. Pearson (1979) developed high-precision liquid scintillation counters and made a thorough study of the source of errors in the data. With careful controls he was able to produce 14C data on samples of Irish oak with an overall uncertainty of 2.5‰ (ca. 20 14C yr).
In 1979, a meeting was held at Tucson, Arizona, with the primary objective being to construct a radiocarbon calibration correction table using only 14C analyses funded by the U.S. National Science Foundation. A method for producing the radiocarbon calibrations was agreed, based on multistage linear regression previously presented by Ralph and Klein (1978). The workshop report also stated that "the most important accomplishment of the Workshop was the participants' universal agreement to accept a single calibration curve based on the composite data set." The participants also recommended establishing an International Calibration Committee to extend the workshop to include all available data worldwide. Statistical analysis results from the workshop were reported at the 10th International Radiocarbon Conference in Bern. The calibration curve and tables constructed as agreed were published two years later (Klein et al. 1982).
A number of new radiocarbon measurements were also presented at the 10th International Radiocarbon Conference. Pearson (1980) reported dates on 20-ring blocks of Irish and Scottish oak trees from AD 1510 to AD 1840 using liquid scintillation counters, which were in agreement with those from Stuiver (1978) on North American Douglas fir measured using gas proportional counters. A visual match of an older floating Irish tree-ring sequence with that of the Suess bristlecone pine provided an approximate age range of 1810-3535 BC. The data exhibited a sawtooth pattern with a period of 110 to 180 years, similar to that seen in the German oak (de Jong et al. 1979) with which there was a short overlap. Pearson concluded, however, that the variability in the bristlecone data (Suess 1979) was too high to identify wiggles. Bruns et al. (1980) measured 1-3 ring samples of Southern German oak from AD 200-800 with gas counters at the University of Heidelberg. Using Fourier analysis, they determined a periodicity of 150 to 180 years similar to that seen by de Jong et al. (1979) and de Jong and Mook (1980) in tree rings from several millennia earlier. Likewise, after removing a sine curve with a period of 12,100 years and applying a smoothing spline, Fourier analysis showed a significant 200-yr period in the bristlecone pine data (Suess 1980). The theoretical framework for the Suess "wiggles" was set out by Stuiver and Quay (1980). They used a carbon reservoir model to calculate the 14C production rate from precise 14C measurements of Pacific coast tree rings. The calculated production rates were then compared to those derived from neutron flux measurements and geomagnetic indices and to historical records of sunspot numbers and auroral indices. This work showed that the natural atmospheric 14C changes were related to Earth's geomagnetic field variation and changes in solar wind shielding properties.
In later work, ocean circulation and climate changes were also shown to influence atmospheric 14C (Stuiver et al. 1991).
The Klein et al. (1982) calibration curve was based on dendrochronologically dated trees, primarily bristlecone pine and sequoia of 20 or fewer rings measured at the Universities of Groningen, California at La Jolla, Pennsylvania, and Yale. No other selection criteria were applied and no adjustment was made for systematic differences between laboratories; however, the uncertainty from the interlaboratory standardization was included in the calibration uncertainty. The data were weighted by their measurement uncertainty and an additional term to account for increasing uncertainty with sample age and scaled logarithmically. A polynomial regression was done to derive the long-period component of the curve, on the order of a few thousand years. Next, a piecewise regression was done on the residuals around the polynomial in intervals of 500 years ("shingles") overlapping by 250 years, and Fourier analysis was done to determine the short-period variability. A composite calibration function with uncertainty was determined from these. A section of the curve from AD 600-1950 is shown in Figure 3 with 90% confidence limits for radiocarbon errors of 20, 100, 200, and 300 14C yr.
Stuiver produced a high-precision decadal calibration curve for AD 1-1950 based on Douglas fir from the Pacific Northwest and sequoia from California (Stuiver 1982). Measurement errors were ±12 14C yr, which were increased to ±18 14C yr with an error multiplier derived from comparison of a set of 30 pairs of contemporaneous samples from different trees. The difference between the Seattle data and that from Belfast (Pearson 1980) for this period was 4.4 14C yr, with a mean age difference of 2 ± 3 14C yr between the complete data sets of 53 sample pairs (AD 955-1840). Offsets varied between 27 14C yr for German oak measured at La Jolla and Douglas fir in Seattle, and between 37 and 55 14C yr for La Jolla bristlecone pine versus Seattle sequoia and Seattle Douglas fir, respectively. Preliminary measurements from Heidelberg on German oak were offset 58 14C yr from Seattle sequoia.
At the business meeting of the 11th International Radiocarbon Conference in Seattle, 1982, the International Calibration Committee set up by Damon et al. (1980) "was discharged with thanks for their useful work and a smaller committee with W. G. Mook as chairman was re-constituted for the 1982-1985 interval" (Stuiver 1983).

THE STUIVER YEARS
The 1986 Calibration Curves
The 12th International Radiocarbon Conference, Trondheim, 1985, saw a proliferation of calibration datasets and curves. The bidecadal curves from 2500 BC to the present were officially recommended at the business meeting (Mook 1986), the first time the radiocarbon community had come to an agreement on radiocarbon calibration. These curves, which were based on measurements of Irish oak (Pilcher et al. 1984) and trees from the western U.S., were published in a special calibration issue that also included a decadal curve for the same interval based on the Seattle measurements only. Additional measurements on German oak, both dendrochronologically dated and floating, and bristlecone pine were published in the calibration issue (Stuiver 1986 and references therein). Stuiver (1986) also suggested the possibility of extending the calibration curve with varved sediment data from Lake of the Clouds.
These decadal and bidecadal data, with the exception of the 14C data from varved sediment and the older floating German oak chronology, were combined with a simple weighted average of the data within each bidecade, rather than using more complicated statistical processes, to create the curve atm20 for calibration of terrestrial samples from 7210 cal BC-AD 1950 using the computer program CALIB (Stuiver and Reimer 1986). An ocean-atmosphere box model developed by Oeschger et al. (1975) was used to construct a "global" marine radiocarbon calibration curve using the atmospheric curve as input. The concept of ΔR, the difference between the "global" ocean radiocarbon age and that of a specific region, was introduced, and a table of ΔR values based on known-age, pre-bomb samples was given for use with the marine curve.
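As a rough illustration of the simple weighted average described above, the following Python sketch combines several 14C determinations falling within one bidecade using inverse-variance weights. The function name and the sample values are hypothetical and are not taken from the 1986 datasets; the procedure actually used for atm20 may have differed in detail.

```python
import math

def weighted_mean(ages, sigmas):
    """Inverse-variance weighted mean of 14C ages and its standard error."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * a for w, a in zip(weights, ages)) / total
    return mean, math.sqrt(1.0 / total)

# Hypothetical 14C ages (yr BP) and 1-sigma errors within a single bidecade
ages = [2450.0, 2470.0, 2440.0]
sigmas = [20.0, 20.0, 10.0]
mean, err = weighted_mean(ages, sigmas)
print(round(mean, 1), round(err, 1))
```

The more precise measurement dominates the result, which is the intended behaviour of this kind of average when laboratories report different uncertainties.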
A comparison of different calibration methods and programs, using the 1986 datasets, was undertaken by Aitchison et al. (1989). The theoretical basis for combining archaeological information and radiocarbon dates through a Bayesian calibration model was developed by Buck et al. (1991), and the first versions of the software packages OxCal (Bronk Ramsey 1995) and BCal (Buck et al. 1999) were made available in due course. Both BCal and OxCal utilize the ratified IntCal curves, as do age-depth modeling programs such as BChron (Parnell et al. 2008) and BACON (Blaauw and Christen 2011).

The 1993 Update
The calibration curves were corrected and extended, and a probability method of calibration was added to the CALIB program (Stuiver and Reimer 1993). A correction to earlier Seattle data had to be made due to a change in lab procedure that introduced radon into the samples, which produced alpha particles during decay (Stuiver and Becker 1993). The radon correction was based on the difference between the first and last day of counting for hundreds of samples. The Belfast data were also adjusted to account for "the variation of efficiency with time" in standard counts. Additional data of German oak measured by Heidelberg and floating German pine with a tentative match to the oak (Kromer and Becker 1993) were also included in the bidecadal curve. Adjustments for laboratory offsets were also made. For the first time, U-Th and 14C dated coral (Bard et al. 1993) were included with a 400-yr reservoir offset to extend the curve with a smoothing spline to nearly 22,000 cal BP. A ratification vote for the bidecadal IntCal93 and the corresponding Marine93 curves was not sought and the curves were not generally accepted in Europe (F. G. McCormac, pers. comm.).
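The idea behind a probability method of calibration can be sketched as follows. This is a generic illustration, assuming Gaussian errors on both the measurement and the curve, and is not a reproduction of the CALIB code; the function name and the toy curve values are hypothetical.

```python
import math

def calibrate(c14_age, c14_sigma, curve):
    """Return normalized probabilities over the calendar grid of `curve`.

    curve: list of (cal_bp, curve_c14_age, curve_sigma) tuples.
    """
    dens = []
    for _, mu, s_curve in curve:
        # Measurement and curve uncertainties are combined in quadrature
        var = c14_sigma ** 2 + s_curve ** 2
        dens.append(math.exp(-(c14_age - mu) ** 2 / (2.0 * var)) / math.sqrt(var))
    total = sum(dens)
    return [d / total for d in dens]

# Toy curve containing a "wiggle" (hypothetical values, not IntCal data)
toy_curve = [
    (1000, 1100.0, 15.0),
    (1020, 1080.0, 15.0),
    (1040, 1095.0, 15.0),
    (1060, 1070.0, 15.0),
    (1080, 1050.0, 15.0),
]
probs = calibrate(1090.0, 20.0, toy_curve)
most_likely = toy_curve[probs.index(max(probs))][0]
```

Because the toy curve is not monotonic, several calendar ages receive comparable probability, which is why wiggles in the curve can yield multi-modal calibrated age distributions rather than a single calibrated age.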

IntCal98 and Marine98
In 1996 a workshop was held at the Wissenschaftsforum, Heidelberg, Germany, to address aspects of high-precision radiocarbon calibration (Kromer et al. 1996). With regard to calibration data, two main points were made. An intercomparison of the Hohenheim and the Göttingen German oak chronologies (Leuschner and Delorme 1988; Becker 1993) revealed that 41 years were missing from the published Hohenheim chronology. The previous link between the German oak and Preboreal Pine Chronology (Becker 1993; Kromer and Becker 1993) was also now discarded, so the pine chronology was left floating. Gerry McCormac presented evidence to support the original Irish oak 14C data as opposed to the revised dates used in the 1993 calibration curves. Further potential calibration datasets were presented at the 16th International Radiocarbon Conference in Groningen in 1997. Among these were U-Th dated coral datasets from Barbados, Mururoa and Tahiti (Bard et al. 1998) and from Vanuatu (Burr et al. 1998), and foraminifera from varved Cariaco Basin sediments (Hughen et al. 1998). The agreement of the high-resolution Cariaco Basin data (Hughen et al. 1998) with the Barbados coral data (Bard et al. 1998) convinced Minze Stuiver to include varve data in the calibration curve for the first time. At the conference a few scientists expressed their discontent to me about having their data included in the calibration curve without direct involvement. At my suggestion they went to talk to Minze Stuiver about it and he agreed that they would become co-authors on the resulting publication. The international part of the curve name "IntCal" became a reality.
A few other corrections were made to previously published data. Comparison of tree-ring samples remeasured in the Seattle lab showed that the radon correction made in 1993 needed to be reduced by 50% (Stuiver et al. 1998b). A weak section was found on reexamination of the older part of the German oak chronology and additional trees were found to bridge this section, resulting in a shift of 54 yr to the older trees. In addition, a 14C match of the floating German pine chronology to the German oak provided a tentative link and extended the chronology to 11,857 cal BP with an error estimate for the link of 20 cal years. Other new tree-ring data included in the IntCal98 curve were German oak and pine, Sitka spruce and Douglas fir (Stuiver et al. 1998a), Irish oak (McCormac et al. 1998b), and previously published data (Kromer et al. 1986; Vogel and van der Plicht 1993).
Previously, 14C ages on dendrodated wood covering only a few years (sub-decadal) were disregarded, but a different approach was used for IntCal98. The "decadal" values were obtained by averaging all full decadal and sub-decadal results. Samples measured on 20-yr blocks of wood were also included by assigning this age to two decades with the standard deviation multiplied by 1.4. Laboratory error multipliers (k values) and offsets were estimated for each tree-ring dataset based on comparison of results from samples of identical calendar age.
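The two steps just described can be sketched in Python. The helper names and the numerical values are hypothetical, and the estimator used for k here (the root mean square of the normalized differences between replicate pairs) is an assumption made for illustration; the source does not specify the exact formula used for IntCal98.

```python
import math

def split_block(c14_age, sigma, start_year, factor=1.4):
    """Assign a 20-yr block result to its two decades with an inflated sigma."""
    inflated = sigma * factor
    return [(start_year, c14_age, inflated), (start_year + 10, c14_age, inflated)]

def error_multiplier(pairs):
    """Estimate k from pairs (age_a, sigma_a, age_b, sigma_b) of equal calendar age.

    Assumed estimator: RMS of differences scaled by their quoted combined error.
    """
    z2 = [(a - b) ** 2 / (sa ** 2 + sb ** 2) for a, sa, b, sb in pairs]
    return math.sqrt(sum(z2) / len(z2))

decades = split_block(3000.0, 20.0, 1500)
k = error_multiplier([(1000.0, 10.0, 1015.0, 10.0), (2000.0, 12.0, 1990.0, 16.0)])
```

One possible reading of the factor 1.4, which the text does not state explicitly, is that it approximates the square root of 2, so that averaging the two decadal values as if they were independent returns roughly the original uncertainty. A value of k above 1 would indicate that the quoted errors understate the observed scatter.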
Marine reservoir offsets (R) were estimated for the coral data from Bard et al. (1998), Burr et al. (1998), and Edwards et al. (1993) by comparing the marine 14C ages to overlapping tree-ring data. The weighted mean R value was 414 ± 31 14C yr (n = 12) for 10,000-8000 cal BP and 509 ± 25 14C yr (n = 21) for 12,000-10,000 cal BP. While the possibility of reservoir age changes over time was acknowledged, it was decided to use a value of 500 14C yr for the Late Glacial tropical surface ocean. The coral data were found to have higher 14C variability than the tree rings. To compensate for this, the standard deviation of coral 14C ages was based on a 2σ error in the coral 14C measurement and an error multiplier of 1.3 was applied. Foraminifera 14C ages from the Cariaco Basin, decreased by 500 and 400 14C yr for 12-10 and 10-8 ka cal BP, respectively, were matched to the IntCal98 tree-ring data to update the floating varve chronology. The observed scatter in the 14C ages was similar to that of the tree rings with an error multiplier of k = 1.3. The IntCal98 curve from 12,000 to 24,000 cal BP was generated from the reservoir-corrected marine data using a spline with minimum smoothing (Stuiver et al. 1998a).
The Marine98 curve, which extended from 0 to 12,000 cal BP, was generated from the IntCal98 curve using an atmosphere-ocean box diffusion model (Stuiver et al. 1998b) with model parameters as discussed in Stuiver and Braziunas (1993).

THE INTCAL WORKING GROUP
2004 Calibration Curves
When Minze Stuiver retired, I continued the calibration effort with encouragement from Gerry McCormac. Together we received a Leverhulme Trust networking grant to form the IntCal Working Group (IWG) in 2002. The grant funded meetings in Belfast and Woods Hole and a research assistant. Previous and potential data providers, as well as an archaeologist and a statistician, were invited to join the group. At the first meeting in Belfast in 2002 the IWG set out criteria for inclusion of data into the planned IntCal04 calibration curve. It was agreed that a new statistical method was needed, in part to get rid of what Ron Reimer christened the "pig-in-the-python" around 15,000 cal BP (Figure 4). At the second meeting in Woods Hole in 2003, a random walk model was chosen to calculate the curve, as it accounted for the uncertainty in the calendar ages of samples as well as the uncertainty in their 14C ages. Constant regional offsets were used to incorporate the marine data. The IntCal04 and Marine04 curves were extended to 26,000 cal BP (Reimer et al. 2004) but no recommendation was given beyond that because of the disparities between datasets. However, a predictive curve called "NotCal04" (to signify that it was not for calibration purposes) was plotted based on a random effects model. This paper also demonstrated the erroneous conclusions that could result from attempts to calibrate radiocarbon ages in this time frame with alternative datasets, such as a smooth curve proposed by van Andel (1998) based on geomagnetic records, and caution was urged. Nonetheless the NotCal04 curve was used for calibration of Palaeolithic radiocarbon dates such as those from Chauvet cave (e.g., Mellars 2006). van Andel (2005) argued that alternative datasets and calibration methods such as CalPal (Jöris and Weninger 1998) should not be discouraged due to the need to calibrate radiocarbon ages beyond 26,000 cal BP.
He also disputed the importance of the IntCal consensus calibration to the community. Fortunately, this issue was resolved by further updates to the IntCal curves extending to close to the limits of radiocarbon dating and increasing community involvement in the process. In addition to IntCal04 and Marine04, a preliminary calibration curve for the Southern Hemisphere atmosphere, SHCal02 (McCormac et al. 2002), was extended to 11,000 cal BP using the tree-ring portion of the IntCal04 curve with a modeled offset between the Northern and Southern Hemisphere.

2009 Updates
In 2009 the IntCal09 and Marine09 calibration curves (Reimer et al. 2009) were extended to 50,000 cal BP using new data from U-Th dated coral (Fairbanks et al. 2005) and from marine sediment with timescales transferred through climatic correlation to the U-Th dated Hulu Cave speleothem (Wang et al. 2001). These "tie-pointed" records included the non-varved portion of Cariaco Basin (Hughen et al. 2006) and the Iberian margin. The Marine09 curve was taken from the Marine04 curve for 0-12,500 cal BP whereas the older portion was derived from the IntCal09 curve with a constant 405 14C yr offset. A Bayesian evaluation of the interhemispheric offset was undertaken but SHCal04 was not updated at this time. From a statistical perspective, the random walk model for curve construction was considerably developed for the 2009 update. The IntCal09 extension was the first time a fully Bayesian approach, achieved via Markov Chain Monte Carlo (MCMC), had been used to estimate the calibration curve (Blackwell and Buck 2008; Heaton et al. 2009).

2013 Calibration Curves
For the 2013 curves, three scientists were elected by the community to the IntCal Oversight Committee to provide independent expertise and assessment. New tree-ring measurements were included in the IntCal13 curve, most importantly from the floating Late Glacial Pine chronology (Schaub et al. 2008) anchored by a 14C wiggle-match as described by Hua et al. (2009), which extended to 14,200 cal BP. The density of data beyond the tree rings was greatly increased compared to IntCal09 due to the incorporation of 14C measurements from speleothems from the Bahamas (Beck et al. 2001; Hoffmann et al. 2010) and Hulu Cave, China (Southon et al. 2012), and from terrestrial plant macrofossils from Lake Suigetsu, Japan (Kitagawa and van der Plicht 1998; Bronk Ramsey et al. 2012). There were also new coral data from Tahiti (Durand et al. 2013) and a "tie-pointed" record from the Pakistan margin (Heaton et al. 2013). Marine reservoir ages were determined for each carbonate dataset from the offset with overlapping tree-ring data, where possible, and kept constant beyond the end of the tree rings. Comparison of Cariaco Basin and Barbados data to overlapping tree-ring data suggested an abrupt drop in reservoir ages during both the Younger Dryas and Heinrich Stadial; therefore those data were not used in IntCal13 and Marine13. The random walk model for curve construction was developed further to account for the dependencies in the calendar age estimates of the samples within both the Lake Suigetsu and the "tie-pointed" records, and in the wiggle-matched tree-ring measurements. Discrepancies between the diverse datasets used to construct the curve between 13,900-50,000 cal BP were accommodated by allowing for a potential additive increase in the uncertainty of the data. This resulted in a smoother character for the IntCal13 mean curve in this older time period. The increase in the volume of data for 2013, together with the complex dependencies in the calendar age chronologies of the constituent records, meant that the random walk approach had reached its practical limit. It was decided that, for the future, a new method that enabled more flexibility and greater potential to investigate the effect of specific modeling and dataset choices was needed.
Figure 4 IntCal98 calibration curve from 12,000 to 18,000 cal BP with 2σ error envelope. The "pig in the python" is around 15,000 cal BP (as published in Reimer et al. 2002: Figure 1).
The SHCal13 curve was constructed with additional Holocene tree-ring data including a Tasmanian Huon pine floating chronology from 12,679 to 12,073 cal BP (Hua et al. 2009). For portions of the curve without Southern Hemisphere tree-ring measurements, the IntCal13 curve with an atmospheric radiocarbon offset between the Northern and Southern hemispheres (N-S offset) of 43 ± 23 yr was used, which extended SHCal13 to 50,000 cal BP (Hogg et al. 2013).

2020 Calibration Curves
In 2013 IntCal focus groups were instigated to promote wider engagement with the IWG and provide additional expertise on various topics. As requested by the IWG, the statistical approach to curve construction was completely redesigned for the 2020 calibration curves (Heaton et al. 2020a). The new methodology utilized Bayesian splines. MCMC was still used to provide the curves but the new Bayesian spline approach allowed much faster implementation and more reliable curve estimation than had been possible with the previous random walk.
The spline method of construction creates a calibration curve based upon a trade-off between the quality of the curve's fit to the calibration samples and its roughness/wiggliness. This penalty on roughness aims to prevent over-fitting and the introduction of potentially spurious variability. Fit to the calibration data was assessed in F14C, where measurement uncertainties are symmetric, while roughness was penalized in Δ14C space. The method was refined to include an additive error term to account for any potential additional variability in observed tree-ring 14C measurements and to explicitly recognize rapid atmospheric 14C change events such as the AD 774-775 event (Miyake et al. 2012).
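To make the distinction between the 14C age, F14C, and Δ14C domains concrete, the short Python sketch below applies the standard conversions (conventional 14C age from the Libby mean life of 8033 yr; age-corrected Δ14C from the mean life of 8267 yr, with calendar age in cal BP). It is not part of the IntCal20 code, and the function names and the example measurement are hypothetical.

```python
import math

LIBBY_MEAN_LIFE = 8033.0  # yr, defines the conventional 14C age scale
TRUE_MEAN_LIFE = 8267.0   # yr, corresponds to the 5730-yr half-life

def age_to_f14c(c14_age):
    """Conventional 14C age (yr BP) to fraction modern (F14C)."""
    return math.exp(-c14_age / LIBBY_MEAN_LIFE)

def f14c_to_age(f14c):
    """Fraction modern (F14C) to conventional 14C age (yr BP)."""
    return -LIBBY_MEAN_LIFE * math.log(f14c)

def f14c_to_delta14c(f14c, cal_bp):
    """Age-corrected Delta14C (per mil) for a sample of calendar age cal_bp."""
    return (f14c * math.exp(cal_bp / TRUE_MEAN_LIFE) - 1.0) * 1000.0

# Hypothetical old sample: a symmetric F14C error gives asymmetric age errors
f = age_to_f14c(45000.0)
sf = 0.0010
older = f14c_to_age(f - sf) - 45000.0
younger = 45000.0 - f14c_to_age(f + sf)
```

For this hypothetical sample the error toward older ages is noticeably larger than the error toward younger ages, which illustrates why a fit assessed in the F14C domain can treat the measurement uncertainty as symmetric while the same uncertainty expressed in 14C years cannot.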
In the older portion of the curve, from 13,900-55,000 cal BP, much work was done by the group to resolve the discrepancies between the various speleothem, ocean, and lacustrine datasets (Butzin et al. 2017; Heaton et al. 2020a; Hughen and Heaton 2020). To deal with any discrepancies that may remain, "heavy-tailed" errors were used to reduce the influence of possible outliers.
A total of 220 Holocene tree-ring datasets were considered for possible inclusion in the IntCal20 curve and screened for suitability. Improvements in the linkages of the European Preboreal Pine and the Swiss chronologies were reinforced by the inclusion of single-year subfossil pine trees from the French Alps and allowed a fully atmospheric record to ca. 13,910 cal BP. The Hulu Cave speleothem dataset, which had been extended to 53,900 cal BP (Cheng et al. 2018), provided the backbone for the curve beyond the dendrochronologically dated and 14C wiggle-matched tree-ring measurements ca. 14,000 cal BP. Floating tree rings from Northern Italy and kauri from New Zealand provided higher resolution details to the curves further back in time. The Lake Suigetsu macrofossil data on an extended and improved varve timescale were also included (Bronk . Because many of the U-Th coral datasets older than 25,000 cal BP exhibited rather high variability, these data were excluded from IntCal20. For marine data, rather than using a constant offset from the atmospheric curve as had been done in the past, the Hamburg Large Scale Geostrophic model was used to estimate the large variations in the marine reservoir offset over time for each region using the Hulu 14C record as input. For further details of corrections and adjustments to the datasets, see Reimer et al. (2020) and references therein. The IntCal20 curve is more detailed than IntCal13, in part because of increased data coverage and in part due to the efforts made by the group to develop the methods that synthesized the diverse data. In particular, the work to resolve the differences between the various datasets from 13,900-55,000 cal BP has meant we are hopefully better able to discern common signal in this period. While there are some other differences between IntCal20 and IntCal13 in the Holocene, the major changes are from the Late Glacial through to the end of the curve at 55,000 cal BP.
For example, ages calibrated with the IntCal20 curve may be up to 700 years older than with IntCal13 between 34,000 and 42,000 cal BP, and more than 1000 years younger between 42,000 and 50,000 cal BP. This has major consequences for scientific studies in this time range, for example examining the duration of the overlap between Neanderthals and anatomically modern humans in Europe.
The SHCal20 curve was reinforced with a number of new tree-ring datasets. For time periods where direct Southern Hemisphere measurements were not available, the IntCal20 curve was used with an N-S offset model (Heaton et al. 2020a).
For the Marine20 surface ocean calibration curve the BICYCLE global carbon cycle model, which incorporates a number of components of the carbon cycle (Köhler et al. 2006), was revised to allow atmospheric CO2 to be specified by an ice core-based reconstruction and atmospheric Δ14C by the IntCal20 curve (Heaton et al. 2020b). The resulting model produced a curve with variable marine reservoir offsets similar to the Hamburg Large Scale Geostrophic model. Marine20 is intended to represent the non-polar surface ocean as it has no provision for ice cover; however, Marine20 should be suitable for calibration of polar samples during the Holocene. It is worth noting that the previous model used for marine calibration curves (e.g., Marine86 through Marine13) also did not include ice cover. Because the parameters used in the model differ from those used in the previous marine curves, the average offset between the ocean and the atmosphere is larger. ΔR values used for regional corrections must therefore be calculated with Marine20. Recalculated known-age, pre-bomb ΔR values are given in the Marine Radiocarbon Reservoir Database (calib.org/marine), whereas ΔR values for paired marine and terrestrial samples can be calculated with the deltar software (Reimer and Reimer 2017) available on the website.
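For a known-age, pre-bomb marine sample, the ΔR calculation amounts to subtracting the marine curve's 14C age at the sample's calendar age from the measured 14C age. The following Python sketch shows this under the assumption that the two uncertainties are independent and combined in quadrature; the function name and all numerical values are hypothetical and are not taken from Marine20 or from the reservoir database.

```python
import math

def delta_r(sample_c14_age, sample_sigma, curve_c14_age, curve_sigma):
    """Regional offset (Delta R) of a known-age marine sample from a marine curve.

    curve_c14_age and curve_sigma are the marine curve's 14C age and error
    at the sample's known calendar age.
    """
    offset = sample_c14_age - curve_c14_age
    sigma = math.hypot(sample_sigma, curve_sigma)  # errors combined in quadrature
    return offset, sigma

# Hypothetical shell collected alive in a known year, compared with a
# hypothetical marine-curve value for that year
dr, dr_sigma = delta_r(650.0, 30.0, 600.0, 40.0)
print(dr, dr_sigma)
```

Because ΔR is defined relative to a particular marine curve, the same measurement yields a different ΔR when the curve value changes, which is why values derived with earlier marine curves cannot simply be reused with Marine20.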

OUTLOOK
The IntCal Working Group, which is now led by Christopher Bronk Ramsey, has already begun discussions on the next update. Kauri measurements through the Laschamp geomagnetic excursion ca. 41-42 ka cal BP have recently been published (Cooper et al. 2021) and there are new single year tree-ring data coming out soon. It is expected that the statistical methods will not have major modifications for the next update.
In the longer term, methods for calibrating sets of radiocarbon dates with a large subset of the possible realizations generated by the MCMC method are likely to improve "wiggle-matched" dates. As more tree-ring data from different regions with low measurement uncertainty become available, it may be possible to provide region-specific terrestrial calibration curves. For the marine curve, the global carbon cycle model could be improved by incorporating time-varying changes to climatic parameters based on palaeoceanographic and palaeoclimatic records, and eventually, with sufficient data, we may be able to provide location-specific curves, in particular for the polar regions (Heaton et al. 2020b).

ACKNOWLEDGMENTS
I would like to thank the Leverhulme Trust for a networking grant which enabled the initial IntCal Working Group meetings, the UK Natural Environment Research Council (NE/E018807/1) for funding research leading to IntCal09 and IntCal13, and IGBP PAGES (Past Global Changes) for providing funding for two palaeoscientists on the IntCal Oversight Committee to participate in meetings leading to IntCal13. Pieter Grootes provided valuable suggestions on the manuscript. I am grateful to Tim Heaton for input on the statistical methods used in IntCal09 through IntCal20.