Introduction
A report published in July 2025 by a group of Microsoft researchers, Working with AI: Measuring the Occupational Implications of Generative AI, identified historians as the professional group with the second-highest ‘AI applicability’.[1] The score of 0.48 – which purports to indicate the overlap between a particular job’s typical activities and the extent to which they can be replicated by AI – is an aggregate of three other scores measuring ‘coverage’, ‘completion’ and ‘scope’.[2] Historians, the report suggests, have a coverage score of 0.91, meaning that 91% of a historian’s typical occupational tasks overlap with the capabilities of AI.[3] They also have a completion score of 0.85, indicating that 85% of those tasks are successfully completed by AI to the satisfaction of the user, and a scope score of 0.56, indicating the extent to which generative AI can meaningfully perform an individual’s work activities in practice. A cursory reading of this report, therefore, could suggest that historians are heading for obsolescence in the age of AI.
A data scientist would undoubtedly provide a more robust critique of the overarching methodology employed by the report’s authors, while professionals in other fields with high AI applicability scores – which include Interpreters and Translators, Passenger Attendants, Sales Representatives, and Writers and Authors – may also contest their inclusion on the list. In this comment, though, I am specifically concerned with analysing and responding to the report’s conclusions as they relate to history and historians. This is important for two reasons. First, those conclusions rest on an enormous oversimplification of a professional historian’s job, one which has gone more or less unchallenged in the wider public discourse surrounding the report. However much we might be tempted to dismiss the report out of hand as a piece of deliberate provocation or AI glorification from a company with a substantial vested interest in AI, it remains important to articulate why its conclusions – about historians, at least – are fundamentally mistaken.
Secondly, the report offers an opportunity to reflect on some of the wider prevailing discourses about history and historians, of which it is just the latest example. Indeed, the report prompts an interesting question: can AI be a historian? As historians, our instinctive answer would almost certainly be negative. We might – as I am about to do – detail the highly specialised training which professional historians must undertake, and the skills and expertise they must possess, to recover and analyse the past sensitively, in contrast to the more generalised pre-existing knowledge which AI is able to consume and repackage for the user. Yet, at the same time, we seem increasingly comfortable with applying the moniker of ‘historian’ to a wide variety of practitioners beyond the academically trained, including the authors of popular history books whose approach is quite explicitly a repackaging of existing historical information in accessible, readable and even humorous prose. This has often been cast as an essential part of destabilising the elitist nature of academic history, alongside the drive towards histories which are co-created in participation with community groups and local researchers.
Thus, the report points us towards a curious paradox: that, in making history a participatory discipline with broad and inclusive appeal to a range of practitioners, authors, volunteers and enthusiasts, we must, as academically trained historians, necessarily efface any claim to a privileged position over the tools of historical investigation and the interpretation of the past. If the past belongs to everyone, we must, it seems, eschew any claim over the professional title of historian. On the other hand, if we want to assert the inability of AI to take on the mantle of the historian – and, moreover, if we want to defend the existence of academic historians and PhD training in universities against cuts – we must unavoidably acknowledge academic history as a highly skilled and trained discipline which has, like any profession, its own qualifications, methods and processes, and which therefore cannot simply be done by anyone.
How can we navigate this paradox? How can we highlight that academic historical training is not just important, but foundational, to the complex and pitfall-ridden task of analysing the human past, without falling back on elitist rhetoric that alienates amateur researchers or looks down upon the different, but connected, skills of public history and historical communication? And how can we extol the existence of academic historians as a professional group, and call for the maintenance and expansion of the postgraduate training that inculcates the next generation into their ranks, whilst also acknowledging that neither a PhD nor a university position is exclusively necessary for conducting historical research and possessing historical expertise? My aim here is not to resolve this paradox – for if it is a genuine one, then it cannot be resolved – and neither do I have compelling answers to the questions I have just raised. Nevertheless, acknowledging the existence of this paradox and its implications, which Microsoft’s report brings so clearly into focus, becomes increasingly urgent as funding cuts, departmental redundancies and now the rise of AI threaten to skewer the academic historical profession.
Historians and Working With AI
How, then, did Working with AI arrive at its conclusion that AI can perform 91% of a historian’s job with 85% accuracy? The answer reveals a great deal about the popular conception of the historian’s trade. The report drew on a dataset compiled from approximately 200,000 anonymised conversations between Microsoft’s AI assistant, Copilot, and its users, in which the users asked Copilot to perform various tasks or provide different types of information.[4] For the authors, this marks out their findings as significant since, unlike previous reports which only theoretically predicted the extent of AI’s encroachment into the workplace, Working with AI tracks the real-world use of Copilot and its overlap with different occupational activities.[5] Of course, the anonymisation of this data means that the report makes no distinction between users: it cannot identify, for example, whether historically related tasks requested of Copilot were posed by a professional historian or by a general user seeking historical information, and so it can scarcely establish whether the AI was actually being set a task which a historian would typically perform in their daily work.
Nevertheless, the report’s authors are confident that they have accurately identified a professional historian’s skillset against which to measure Copilot requests. For this, the report relies on occupational information compiled by O*NET, an American programme which works under sponsorship from the US Department of Labor to produce information about various jobs, such as their typical activities and the levels of training and experience required to work in them.[6] Because O*NET’s data are gathered from surveys of actual professionals, it seems to have a fair understanding of the work and skill level of historians. Indeed, as an occupation, ‘historian’ is classified in ‘Job Zone 5’, meaning it requires ‘extensive skill, knowledge, and experience’ with education at postgraduate level and ‘very advanced communication and organisation skills’.[7] This description suggests that O*NET takes ‘historian’ to mean specifically an academically trained historian, who might typically work in a university but could also be found in other places, such as schools, archives, government departments (as ‘official’ historians), or as independent researchers.[8]
O*NET lists various work activities which constitute the core tasks of a given profession’s work – for historians, these so-called ‘task statements’ include:
Conduct historical research, and publish or present findings and theories; gather historical data from sources such as archives, court records, diaries, news files, and photographs, as well as from books, pamphlets, and periodicals; conduct historical research as a basis for the identification, conservation, and reconstruction of historic places and materials; organize data, and analyze and interpret its authenticity and relative significance; advise or consult with individuals and institutions regarding issues such as the historical authenticity of materials or the customs of a specific historical period; trace historical development in a particular field, such as social, cultural, political, or diplomatic history.[9]
This description may well paint a credible picture of the historian’s trade. But these are not the activities which the Microsoft report uses. O*NET categorises occupational activities hierarchically to facilitate both specific analysis of individual jobs and broad analysis of trends within the job market. Thus, O*NET truncates its core task statements into three further categories: Detailed Work Activities (DWAs), Intermediate Work Activities (IWAs) and General Work Activities (GWAs), with each stage marking a further level of generalisation about work tasks.
Working with AI provides an explanation of what this looks like in the case of economists.[10] The occupation-specific task of ‘compile, analyze, and report data to explain economic phenomena and forecast market trends, applying mathematical models and statistical techniques’ becomes the DWA of ‘forecast economic, political, or social trends’. This, in turn, becomes the IWA of ‘analyze market or industry conditions’, and the GWA of ‘analyzing data or information’. For O*NET, hitching specific task statements to these increasingly broad activities enables it to chart parallels between different types of profession, and to understand which general skills move in and out of demand in the workforce at large; but the core tasks remain central to understanding what an individual job entails.
Importantly, Working with AI does not measure the conversations in its Copilot dataset against the job-specific task statements – instead, it measures them against the class of activities at the more generic IWA level. According to the report’s (near-indecipherable) ‘IWA flowchart’, historians have three main IWAs which AI can replicate: ‘research historical and social issues’, ‘present research or technical info’ and ‘gather info from various sources’.[11] The report’s use of IWAs over task statements means that a great deal of detail about specific professions is lost. For historians, their highly specialised task of ‘gather historical data from sources such as archives, court records, diaries, news files, and photographs’ has been watered down to the much more general ‘gather info from various sources’. AI, of course, can certainly gather information from various sources, provided they are online; but can it gather information reliably from an undigitised archive, from a physical diary, from an oral interview, or through analysis of a material object – the kinds of things that historians actually have to work with? AI can gather information, but if it cannot gather it as a historian might, then it has no overlap with that part of a historian’s job.
In essence, therefore, Working with AI de-skills historians to construct its case. It makes its ignorance of a historian’s job particularly clear by classifying it as ‘knowledge work’ – in other words, historians know things about the past, and since AI can know things about the past too, up to 91% of their work can be completed by it.[12] Because it privileges general statements about jobs over the activities which make them unique and specialised, the report is able to override an enormous amount of the training and expertise which actually preclude AI from replicating a historian, rather than offering just the most basic facsimile. If Working with AI’s conclusions were along the lines of ‘AI can complete these very general types of workplace activities’, then that, at least, might be a more accurate statement of reality; but instead, the report insists on naming specific occupations with which AI purportedly overlaps, despite refusing to engage with the tasks they actually perform. At the most rudimentary level, Working with AI assumes that any time somebody has asked Copilot for historical information, and Copilot has been able to satisfactorily provide it, that interaction is evidence of Copilot doing the job of a historian.
Indeed, the idea of measuring user satisfaction with Copilot’s response raises a further point of contention, at least as far as historians are concerned. For the authors, ‘satisfactory’ here means both a response from the user – in the form of a ‘thumbs up or down’ to indicate if they were happy with Copilot’s response – and a ‘task completion classifier’ that detects whether Copilot was able to offer a response at all, accounting for the fact that not every user gives direct feedback.[13] The task completion classifier was itself powered by AI; the authors used GPT-4o-mini to scan the conversations and determine whether or not ‘the AI chatbot completed the user’s task’ in the absence of either a thumbs up or down.[14] Thus, in the provision of AI-assisted historical information and research, not only was the validity of the result determined by the user – who, as we have already established, was not necessarily a professional historian – but the judgement of its completion was pronounced upon by a different AI.
For historians, this approach has several problems. The report makes no comment, for instance, on the accuracy of the information which Copilot provided; a ‘thumbs up’ only indicates that it placated the user, not that the knowledge produced would stand up to any further scrutiny from historical experts. Indeed, we know that AI ‘hallucinates’ false information, and can also be convinced that true information is false – but there is nothing in the report which suggests this was taken into account when determining completion.[15] Moreover, this approach casts history as a market-driven entity, the reliability of which is determined by appeals to the unspecialised reader rather than the proven expertise of the author and their peers, the traditional model for professional scholarship. This raises a further question: was an answer deemed satisfactory simply because it was agreeable to the user? If Copilot delivers information that conflicts with the user’s political or social views but is nonetheless accurate, a ‘thumbs down’ would reflect not its falsity but the user’s discomfort. Conversely, a ‘thumbs up’ might validate inaccurate information simply because it aligns with what the user wants to believe. Not only does the vaunted 85% accuracy appear rather illusory in this context, but the broader approach is totally antithetical to history as a discipline. Working with AI has nothing to say about these kinds of details.
AI can provide historical information, of that there is no doubt – but can it provide historical information of a comparable quality to a professional historian, drawing on the same level of in-depth research, analysis and expertise which historians possess? Because Working with AI deals with anonymised conversations, uses the highly generalised IWA branch of O*NET’s occupational data and measures user satisfaction as the successful completion of a task rather than verifying the provided information’s accuracy, there is no way to properly measure AI’s actual applicability to the job of a historian. Indeed, Working with AI admits as much when it concedes that ‘Copilot is better at the writing and researching parts of knowledge than its analysis.’[16] But the report’s methods, deliberately or otherwise, de-skill historians, reducing a job that requires high-level and deeply human analytical skills to one tasked solely with the retention and provision of knowledge. Under that flawed rubric, it is little wonder that historians have a high AI applicability score.
Historians in Public
The authors of Working with AI were at pains to emphasise that AI is not currently ‘performing all of the work activities of any one occupation’, and that it ‘would be a mistake’ to conclude that ‘occupations that have high overlap with activities AI performs will be automated and thus experience job or wage loss’.[17] This admission has not, however, prevented the media from turning the report into one which identifies the jobs in line to be replaced by AI altogether. CNBC proclaimed that Microsoft had categorically determined the top ten ‘most AI-safe careers’ and, conversely, those which are most at risk of obsolescence.[18] Articles with similar wording were published both by general news outlets, such as Sky, and by tech-focused outlets, such as Gizmodo.[19] An article by Forbes suggested that those jobs which are ‘at risk’ from AI are united by a ‘heavy reliance on information’, ‘tasks which require data analysis’ and ‘functions that can be performed remotely’.[20] Conversely, jobs which are ‘AI-safe’ – such as phlebotomists, with an applicability score of 0.03, or ‘bridge and lock tenders’, with a score of 0.00 – rely on things which AI cannot replicate, such as ‘physical presence’, ‘specialised training’ and ‘human empathy’. The article takes empathy to apply only to patient-focused interaction; it is apparently not needed by the historian in investigating and understanding people from the past, and in constructing sensitive and detached interpretations of them and their actions.
In response, Working with AI’s authors released a follow-up to their report, clarifying that their conclusions were only ever meant to highlight ‘where AI might be useful in different occupations’ and how certain occupations ‘may benefit by considering how AI can be used as a tool to help improve their workflows’.[21] Doubtless, then, this configuration of the report’s findings has some degree of application to historians; indeed, there are already historians using AI for mundane tasks such as oral history transcription, handwriting-to-text conversion, or simply to take a first stab at pulling broad themes from a larger quantity of sources than one researcher might physically be able to process alone.[22] But it has, nevertheless, entered the mainstream public discourse that historians are at risk from AI, and that their profession is so deeply unspecialised that even the comparatively nascent capabilities of AI can straightforwardly replicate it.
Historians themselves will recognise the ridiculousness of this. The broad qualities which Forbes identified as protecting jobs from AI can easily be applied to historians too. I have already noted ‘human empathy’. Yet proper historical research also frequently relies on ‘physical presence’: going physically to an archive to review the innumerable documents and primary materials not yet digitised and beyond the reach of AI, conducting oral history interviews, and presenting ongoing research at conferences and in other academic spaces. Of course, as detailed earlier, historians often undergo years of specialised training in, for example, archival research, palaeography, and material and oral history methodologies, all to be able to tease out the significance of historical records and offer up their findings for criticism under peer review.
But, as has often been the case with history, it is not only academically trained historians to whom the task of investigating the past falls. Many other disciplines deal with investigating the past, and some encroach on the territory of historians without reference to their particular methodologies and training. AI has the potential to increase this, perhaps putting historians at risk of being left behind. Indeed, AI is already being deployed by academics without historical training to ‘solve’ the so-called mysteries of the past where ‘traditional’ historical research methods have allegedly failed to provide answers. A recent study in Nature claimed to have finally resolved a long-standing historiographical debate about the reasons behind the emergence of the Great Fear in 1789, a wave of societal unrest in rural France at the onset of the French Revolution. The study concluded that the Great Fear was the product of rational rumour-spreading rather than an emotional, spontaneous response.[23] The authors reached this assessment not by sustained archival research of contemporaneous records conducted in full light of recent historical scholarship, but by using ChatGPT to extract information from Georges Lefebvre’s 1932 The Great Fear of 1789: Rural Panic in Revolutionary France, which they then plugged into an epidemiological model to show how rumours spread between towns along certain high-traffic routes, just as a disease might.[24] ‘Historians and epidemiologists join forces’, declared Popular Science in its coverage of the article’s conclusions – but aside from Lefebvre, who died in 1959, no professional historians were involved in the study.
The listed authors were all epidemiologists, mathematicians and economists.[25] The presentation of the article’s findings as a groundbreaking historical discovery seemingly strengthens the Microsoft report’s general argument that AI can take on the mantle of historical investigation – for now under the supervision of non-historians, but perhaps eventually by itself. And yet, the certainty with which the authors claimed to have laid this ‘mystery’ to rest on the basis of an AI-derived dataset – recovered from one ninety-year-old book – would not hold up to historical scrutiny.
For a profession already embattled by cuts to PhD funding and departmental closures, the rise of AI – and indeed the potential for non-historians to use AI to investigate the past – is a new and concerning threat.[26] But could it be that this latest challenge is part of a longer-term erosion of historians’ training, skills and expertise? The public has grown quite used to seeing the title ‘historian’ applied to any number of different individuals and practitioners who do not hold PhDs or work in academic spaces. Gerard DeGroot, in reviewing the comedian David Mitchell’s book Unruly: A History of England’s Kings and Queens, praised Mitchell as ‘a skilled historian’, despite Mitchell himself declaring that he is ‘not a professional historian’.[27] Alice Loxton, whose trajectory to historical publication began with success on social media, has been hailed as leading the charge of ‘an exciting new generation of historians’.[28] The social media platform Reddit hosts an enormously popular forum entitled ‘r/AskHistorians’, where users pose historical questions along the lines of ‘why couldn’t Napoleon defeat the Royal Navy?’, and where the anonymous replies from ‘historians’ are moderated not by proven expertise or qualification but by the respondent’s ability to conform to the site’s style guide.[29] History written in this vein is closer to the type of history which Working with AI posits: a repackaging of existing information, drawing on skills of synthesis and accessible prose rather than deep archival research and sensitive interpretation. If Mitchell and Loxton are historians, why not AI too?
In the public realm, historians take many forms. In the language of public relations – a field in which I fleetingly worked – we might say that history lacks brand cohesion, and lacks it more acutely than other disciplines. In public conceptions of science, for instance, there is a clearer (albeit sometimes tenuous) distinction between academic scientists and popular science writers and content creators, who are more typically referred to as science communicators.[30] Historians, though, are trained academics and autodidactic researchers, celebrity television broadcasters and museum curators, social media influencers and respondents to questions online. They are regularly cast as individuals who possess and communicate existing knowledge of the past, and increasingly less as those trained to generate fresh knowledge through careful research and skilled analysis, and to translate it into applied expertise. It is little wonder, then, that reports of funding cuts and AI displacement do not resonate with the public as the slow and concerning erosion of an important profession; quite apart from the fact that some people actively cheer this development, in all aspects beyond the academy, history and historians – as the public might recognise them – are many and varied, alive and well.[31] For the academic, this can cause a sense of frustration: even as history sells well and commands public attention, the bedrock of academic historical research and expertise that so often sustains these other types of history is undermined by employment and financial insecurities.
At the same time, academic historians are caught in a paradox. The distinction between scientists and science communicators does not work for history because, as already noted, the past belongs to everyone. Laying claim to the tools and techniques of historical investigation could see an unwelcome return to the academic elitism of earlier decades. Moreover, recovering hidden and marginalised voices within our history necessarily entails serious and meaningful collaboration with local and community researchers who may possess specific knowledge and expertise.[32] As a recent article in the History Workshop Journal pointed out, history can and should be creatively co-produced, particularly when it concerns regional and identity groups of which the academic possesses little understanding.[33] Sometimes this kind of participatory history can be transformative; for example, bringing the methodologies of genealogical research – previously a fringe, amateur endeavour – into the academic fold has given rise to family history as an important historiographical field which has, in turn, reshaped our understanding of topics such as British imperialism.[34]
History is so tightly woven with formations of identity at personal, national and transnational levels that it quite clearly must be an expansive practice that listens to a range of voices and takes stock of more localised expertise. Margot Finn and Kate Smith’s edited volume, The East India Company at Home, perhaps offers an example of how this can work in practice. It was a central goal of their project to ‘[unite] researchers from a wider spectrum of domains – local and national archives and libraries, heritage organisations, the buoyant community of independent family historians, the museum sector and universities’.[35] The book sought to ‘enrich conventional academic narratives of empire and its domestic impact with knowledge co-produced across the artificial boundaries that typically demarcate university-based and public history’.[36] As the roster of contributors indicates – with chapters from academics, museum professionals, volunteers based at country houses, local and family history researchers, and an historical novelist – important historical knowledge exists in a wide variety of places, even if it might require the academic historian to oversee the project and edit it all together into something revelatory. This should not diminish the role which non-academics can play in shaping the historical discipline and broadening our understanding of the past; equally, it should not diminish the centrality of academic expertise to developing robust and sensitive arguments that deal more concretely in causation, significance and effect. In the public realm, at least, and clearly as far as AI is concerned, the latter set of skills seems to be conflated with merely knowing and communicating facts about the past; the profusion of historians in such varied forms, and with such varied training, attests to this. Yet history – as E. H. Carr has taught generations of historians – is more than an endless collection of facts.
Not every interpretation of history is equally valid – indeed, some can be dangerously politically charged – and so it is important to maintain a strong professional cohort of historians trained with the skills to assess them.
How, then, can we stake out the importance of academic historical training and expertise as something which, first, constitutes a roster of skills separate from those of historical knowledge possession and communication and which, second, cannot be undertaken lightly by the untrained, or by AI, without also succumbing to elitist rhetoric that privileges academic historians above all others? How, too, can we amplify the foundational nature of academic historical research to our knowledge of the past without rehashing tired debates over the relative importance of academic versus public history?
In many ways, the rise of AI exposes these problems more clearly than ever, but their resolution remains elusive. Asserting the inability of AI to take on the role of the historian also necessarily elevates the expertise of the academically trained – it is to make the distinction, for example, between simply ‘gathering info from various sources’ and detailing the precise nature and manner in which that information was gathered, and unpicking its relevance to inform a cogent analysis of historical events. For academic historians, it is that specificity of their method which should shield the profession from AI. Yet that very distinction also separates them from others engaged in historically related work in ways that are never straightforward and may, at times, prompt awkward questions and conversations. Perhaps it has never been more important to grapple with them.
Acknowledgements
I am grateful to Professor Helen McCarthy, Molly Groarke, Dr Jan Machielsen and the two anonymous peer reviewers for their comments and suggestions on earlier drafts of this article.