Introduction
Advances in generative artificial intelligence (GenAI) are transforming how states anticipate and manage strategic crises.Footnote 1 No longer limited to analysing discrete data inputs – such as radar tracks indicating a missile launch, troop movement alerts, or changes in alert status – AI-enabled decision-support systems (AI-DSS), including large language model–based tools, can synthesise complex intelligence and generate scenario-based outputs.Footnote 2 GenAI-enabled systems extend these capabilities by producing complex, real-time simulations of adversary intentions, escalation pathways, and alternative courses of action, often presented in immersive and emotionally compelling formats. This capability represents more than a technical upgrade to existing nuclear command, control, and communications (NC3) networks; it signals a profound epistemic shift in how strategic planning is conceived, how risk is perceived, and how decisions are made under crisis conditions. This article examines how GenAI-DSS may transform nuclear crisis decision-making by altering strategic imagination, shaping perceptions of escalation, and influencing human–AI interaction in future high-stakes environments.
The article argues that while GenAI offers enhanced predictive power and anticipatory insight – improving early threat detection, sharpening situational awareness, and providing decision-makers with faster, more adaptive responses to events – it simultaneously creates systematic distortions that threaten crisis stability. By blurring the line between strategic forecasting and foresight, and between anticipation and action, GenAI risks transforming strategic planning from a process of exploration into one of algorithmic prescription. In doing so, these systems do not merely augment human analysis and judgement; they actively shape how decision-makers perceive and interpret emerging threats. GenAI can shift strategy from a deliberative practice grounded in uncertainty, ambiguity, and human intuition to a mechanised process in which outputs are treated as definitive, narrowing the space for human deliberation, bargaining, and de-escalation.
The article identifies three key mechanisms through which GenAI-DSS can undermine crisis stability. First, normalisation of low-probability, high-impact escalation paths, whereby catastrophic scenarios – such as nuclear confrontation – become embedded in planning, doctrine, and strategic culture, rendering them increasingly ‘thinkable’. Second, misattribution of adversarial intent, as ambiguous or contradictory actions are reframed through the illusion of algorithmic certainty, leading to false interpretations of hostile intent. Third, the emergence of synthetic feedback loops, in which machine-generated scenarios begin to shape the behaviour of both sides, turning simulated crises into self-fulfilling escalation. Together, these mechanisms erode judgement, compress decision timelines, and destabilise deterrence in ways traditional nuclear theory has not anticipated.
Nuclear crisis decision-making and escalation scholarship emphasises misperception, signalling failure, crisis bargaining, and escalation under conditions of uncertainty.Footnote 3 Yet this work assumes instability arises from ambiguous and incomplete information. Today, decision-makers confront a different challenge: an abundance of AI-generated strategic futures rather than information scarcity. As GenAI simulates adversary intentions and escalation pathways, the problem shifts from interpreting uncertain signals to navigating synthetic scenarios that may reshape risk perception and strategic choice.
This article contributes to the literature on intelligence failure, nuclear deterrence, and crisis decision-making by theorising ‘synthetic foresight’ as a distinct epistemic force.Footnote 4 It also builds on emerging scholarship examining AI in military decision-making and nuclear stability, extending these debates by identifying the distinct epistemic risks introduced by generative AI-enabled simulation.Footnote 5 Prior research has examined how cognitive bias, organisational dysfunction, and technological surprise shape crisis decision-making.Footnote 6 This study extends that work by showing how GenAI, by disrupting core features of crisis decision-making, can reshape the psychological and institutional foundations of nuclear strategy. It also engages current debates over AI governance, emphasising that risks arise not only from technical flaws or ‘bad data’ but also from the ways GenAI-DSS, embedded within a broader socio-technical context, mediates human perception and imagination in high-stakes settings.Footnote 7
Recent scholarship suggests that AI-enabled tools may assist crisis managers in cultivating empathy, perspective-taking, and de-escalatory policy options during nuclear confrontations.Footnote 8 Holmes and Wheeler argue that AI’s ‘detached and unemotional analysis’ can enhance interpretive understanding and widen the space for restraint.Footnote 9 By contrast, this article examines GenAI systems capable of simulating adversary intent and escalation pathways. It contends that, under conditions of acute uncertainty, such outputs may narrow rather than expand perceived options by rendering extreme escalation trajectories cognitively salient and strategically plausible, thereby producing distinct epistemic dynamics.
The article is organised into three sections. The first section distinguishes traditional forecasting from foresight and explains how and to what effect GenAI merges these distinct epistemological functions. The second develops the concept of ‘synthetic foresight’ and identifies three mechanisms by which it may destabilise decision-making during nuclear crises. The third section employs fictional scenarios to illustrate the potential implications of synthetic foresight for future crisis decision-making, offering a novel, reflexive critique of how GenAI both constructs and constrains strategic imagination.
GenAI as a synthetic cognitive actor: Blurring the boundary between prediction and action
The integration of GenAI into nuclear crisis decision-making represents more than a technical enhancement in predictive modelling; it signals a fundamental epistemic shift in how strategic futures are conceived, evaluated, and acted upon. Traditionally, forecasting has relied on structured data, probabilistic analysis, and historical modelling to estimate the likelihood and implications of near-term events.Footnote 10 In the nuclear context, this involves drawing on bargaining theory and historical patterns to produce probabilistic assessments of adversary actions and potential escalation pathways within bounded time horizons.Footnote 11
Strategic foresight, by contrast, is an exploratory discipline – an imaginative effort to prepare for surprise under conditions of deep uncertainty.Footnote 12 In the nuclear domain, where data are sparse and outcomes non-linear, this distinction is crucial: forecasting informs probabilistic anticipation, while foresight cultivates adaptive preparedness. GenAI increasingly blurs, and in some cases collapses entirely, this distinction. By generating simulations that are both statistically coherent and emotionally resonant – triggering strong affective responses in human users – GenAI systems do more than augment exploration: they heighten the emotional force of imagined futures, actively shape perception, and guide decisions, collapsing the boundary between prediction and prescription and fundamentally transforming the dynamics of nuclear crisis decision-making.
This section examines the epistemic consequences of that transformation, tracing how GenAI’s fusion of forecast and foresight reconfigures not only the content of strategic anticipation but also the cognitive and institutional conditions under which nuclear decisions are made. It considers how these systems may alter the function of NC3: not simply by accelerating early warning or enhancing situational awareness, but by reframing how uncertainty, judgement, and strategic surprise – events of great consequence to the surprised party – are interpreted and enacted at the brink.Footnote 13
From forecasting to anticipatory systems
Although computational modelling and simulations have long been used to identify correlative patterns and make causal inferences about potential geopolitical flashpoints, the advent of big data and AI techniques – especially machine learning – has dramatically expanded their scope, speed, and fidelity.Footnote 14 AI-powered geospatial systems – which provide essential spatial awareness for command-and-control operations through maps, terrain analysis, and tracking of friendly and enemy force positions – now monitor global developments, analyse the underlying drivers of conflict, and detect anomalies that may escape human analysts.Footnote 15 More recently, large language models (LLMs) have been used to process social media, imagery, and historical datasets to generate real-time intelligence assessments and inform possible courses of action.Footnote 16
In the United States, for example, Palantir promotes its Artificial Intelligence Platform (AIP) as a system that integrates diverse data streams and employs machine learning and large language models to enhance commanders’ situational awareness and target recognition capabilities.Footnote 17 These outputs can be visualised in command centres through unified display interfaces, including interactive maps, dashboards, and layered visualisations integrating geographic, temporal, and threat data.Footnote 18 Open-source reporting similarly indicates that Rhombus Power has used AI-driven analysis to forecast major military developments, including Russia’s 2022 invasion of Ukraine, and has explored providing Taiwan with comparable early warning capabilities related to potential Chinese military activity.Footnote 19
In the context of NC3 systems, AI-enabled predictive analysis offers new capabilities for generating a range of threat scenarios, enhancing early threat detection, strengthening strategic warning systems, and providing decision-makers with real-time situational awareness and ‘adaptive targeting’ support to generate threat-level assessments and dynamic target-priority lists.Footnote 20 As the first line of defence, strategic warning provides decision-makers with critical insights into the global threat environment, guiding strategy and resource allocation.Footnote 21 It relies on a network of sensors – such as over-the-horizon and early warning satellites, radar systems, and intelligence platforms – to monitor operational activity and detect anomalies, including missile launches, ballistic trajectories, or unusual shifts in adversary military posture. These data streams feed continuously into decision support systems, ensuring that early warning and situational awareness are maintained in real time. The Chinese military, for example, has developed an AI-DSS called ‘StarSee’, which integrates intelligence and situational awareness to assist commanders in making informed decisions.Footnote 22
In recent testimony before the Senate Armed Services Committee, General Anthony J. Cotton, the former commander of US Strategic Command (STRATCOM), underscored the importance of advanced data integration and decision-support capabilities in ongoing NC3 modernisation efforts.Footnote 23 AI-enhanced systems can process this sensor data using advanced time-series analysis and pattern recognition techniques, enabling faster and more accurate differentiation between routine activity and potential nuclear threats.Footnote 24 Reports indicate that such analytic tools were used during a recent India–Pakistan crisis to clarify activity at Pakistani military bases, reducing the risk that routine movements could be misinterpreted as preparations for nuclear escalation.Footnote 25
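To make the underlying mechanics concrete, the sketch below illustrates one of the simplest forms such time-series pattern recognition can take: flagging days on which observed activity deviates sharply from a rolling statistical baseline. It is a deliberately rudimentary, hypothetical illustration – the data are synthetic and the method bears no relation to any operational warning system.

```python
# Illustrative sketch only: a rudimentary anomaly detector over a synthetic
# sensor-derived time series. Data and parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical daily counts of vehicle movements observed near a military site:
# 90 days of routine activity followed by a 10-day surge.
routine = rng.poisson(lam=20, size=90)
surge = rng.poisson(lam=40, size=10)
activity = np.concatenate([routine, surge]).astype(float)

window = 30        # days of history used as the rolling baseline
threshold = 3.0    # flag observations more than 3 standard deviations above baseline

for day in range(window, len(activity)):
    baseline = activity[day - window:day]
    z = (activity[day] - baseline.mean()) / baseline.std()
    if z > threshold:
        print(f"Day {day}: activity = {activity[day]:.0f}, z-score = {z:.1f} -> flagged for review")
```

Even this caricature exposes the design choices – the length of the baseline, the alert threshold – that silently determine what counts as an ‘anomaly’ and, by extension, how many false alarms a warning system generates.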
Such developments underscore the potential of AI to detect early indicators of strategic instability, flag emerging risks, filter false alarms, and extend decision-making timelines – thereby buying time for informed decisions, calibrated deterrence, effective signalling, or de-escalation under uncertainty. NATO, for example, emphasises ‘strategic anticipation’ and preparedness as core components of its deterrence posture, while Alliance initiatives on AI technologies aim to integrate AI-enabled capabilities to support crisis management and collective deterrence.Footnote 26 GenAI, by contrast, represents a fundamental departure from these traditional systems. Rather than merely aggregating and displaying discrete data inputs, it actively generates synthetic scenarios and simulations that can shape perceptions and decisions, fundamentally transforming both the dynamics of crisis decision-making and the role of humans within it.
Despite its promise, scepticism persists regarding AI’s ability to accurately predict conflicts or pinpoint the precise triggers of crises – primarily due to two fundamental challenges: limited, fragmented data and the inherent difficulty of anticipating leaders’ intentions and decision-making processes.Footnote 27 Whereas domains such as economics or weather forecasting lend themselves more readily to quantitative prediction, strategic crises are far less amenable. Intelligence analysts often confront fragmented or contradictory data – radar anomalies, geospatial data, satellite imagery, or social media feeds – that can sustain multiple interpretations, reinforcing preconceptions and deepening ambiguity.Footnote 28 As Betts warns, intelligence failure often occurs ‘when ambiguity aggravates ambivalence’.Footnote 29 In the nuclear realm, these challenges are further compounded by the scarcity of publicly available data on crisis decision-making and the absence of historical datasets on nuclear conflict.
GenAI and the illusion of certainty and objectivity
GenAI blurs the boundary between forecasting and foresight, introducing several epistemic risks. One is cognitive outsourcing to AI, particularly when probabilistic rankings generate what Daniel Kahneman calls the ‘illusion of validity’, encouraging unwarranted confidence in seemingly precise estimates. Another is the erosion of critical thinking, as synthetic scenarios are mistaken for empirically validated predictions rather than exploratory constructs. A further danger lies in the manipulation of belief, especially when outputs align with political preferences or deploy emotionally compelling imagery that feels vivid, intuitive, or fear-inducing. Finally, GenAI can frame synthetic scenarios and simulations as optimal – or even inevitable – courses of action, narrowing the space for deliberation and alternative judgement.
GenAI systems both inherit and amplify these epistemic risks. Designed to enhance foresight, they can overwhelm human judgement with synthetic noise, algorithmic bias, and even deliberate deception.Footnote 30 More subtly, they may embed hidden strategic assumptions within their training data or generate forecasts so polished and internally consistent that they appear beyond critique. As Richard Betts cautions, avoiding intelligence failure requires challenging preconceptions, yet leaders cannot function without them.Footnote 31 This creates a paradox: while analytic rigour is essential, eradicating preconceptions would deprive decision-makers of the very heuristics they need to act under the intense stress and time pressure of a crisis.Footnote 32
In the GenAI era, this paradox becomes even more acute. GenAI systems simultaneously embed and obscure machine preconceptions, creating an illusion of neutral, scientific objectivity. This dynamic can produce epistemological dissonance in military operations, whereby leaders confront a growing tension between two incompatible ways of knowing: human intuition and experience versus the apparent precision and prescience of machine-generated analysis. The illusion of precision and rationality further constrains the strategic imagination of human operators, narrowing their perceived range of options precisely when human adaptability, intuition, and creativity are most needed. In this sense, GenAI does not eliminate uncertainty or the prospect of strategic (or intelligence) failure; it merely repackages them in probabilistic form, with excess data and synthetic outputs exacerbating, rather than resolving, the ambiguity and ambivalence of intelligence assessments.
Such conditions also heighten the danger of ‘overfitting’ – when algorithms internalise training data too rigidly, they struggle to adapt to novel or unforeseen situations.Footnote 33 A similar tendency exists among human analysts, who often ‘overlearn’ from history: in their effort to avoid repeating past mistakes, they may overcorrect and commit the opposite error.Footnote 34 This dynamic compounds a fundamental trade-off at the heart of intelligence analysis: between overstating high-probability threats that never materialise (false alarms) and underestimating low-probability events that do occur (misses).Footnote 35 In the nuclear domain, this could manifest as the reinforcement of specific escalation trajectories or rigid decision-making patterns, while simultaneously amplifying analysts’ cognitive biases and the complex trade-offs they must navigate under extreme uncertainty.
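For readers unfamiliar with the term, the toy example below shows what overfitting looks like in its simplest statistical form: a model flexible enough to reproduce its sparse training data almost perfectly will typically perform worse on new observations than a simpler one. The data and models are invented and stand in for no real analytic system.

```python
# Toy illustration of overfitting; the data and models are synthetic.
import numpy as np

rng = np.random.default_rng(seed=0)

# Ten noisy historical observations of some underlying trend.
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=10)

# Fifty new observations drawn from the same underlying process.
x_test = np.linspace(0, 1, 50)
y_test = np.sin(2 * np.pi * x_test)

for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: training error {train_err:.3f}, error on new data {test_err:.3f}")

# The degree-9 polynomial reproduces the ten training points almost exactly,
# but it has memorised their noise; the simpler degree-3 fit typically
# generalises better to the unseen observations.
```

The parallel drawn here is that a model tuned too closely to the historical record of past crises may fail precisely when confronted with a novel one.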
Perfecting the tools of the intelligence trade, whether through advanced algorithms or refined analytic procedures, is no panacea for these deep-seated cognitive and institutional pathologies.Footnote 36 As Richard Betts writes, ‘making warning systems more sensitive reduces the risk of surprise, but increases the number of false alarms, which in turn reduces sensitivity; the principles of optimal analytic procedure are in many respects incompatible with the imperatives of the decision process’.Footnote 37 Jervis observed that intelligence can mislead as readily as it informs, primarily when ambiguities are glossed over or dismissed and alternative interpretations excluded.Footnote 38
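Betts’s trade-off can be made arithmetically explicit with a back-of-the-envelope Bayesian calculation. The figures below are invented purely for illustration, but they capture the structural point: because genuine attack preparations are rare, even a highly sensitive warning system produces alarms that are overwhelmingly false.

```python
# Worked illustration of the sensitivity/false-alarm trade-off using Bayes' rule.
# All figures are hypothetical, chosen only to make the arithmetic visible.
def prob_genuine_given_alarm(prior, hit_rate, false_alarm_rate):
    """P(genuine threat | alarm raised)."""
    p_alarm = hit_rate * prior + false_alarm_rate * (1 - prior)
    return hit_rate * prior / p_alarm

prior = 1e-4  # assumed base rate: genuine attack preparation in 1 of 10,000 alert periods

# A less sensitive and a more sensitive detector configuration.
for hit_rate, false_alarm_rate in [(0.90, 0.001), (0.99, 0.05)]:
    p = prob_genuine_given_alarm(prior, hit_rate, false_alarm_rate)
    print(f"hit rate {hit_rate:.0%}, false-alarm rate {false_alarm_rate:.1%}: "
          f"P(genuine | alarm) = {p:.2%}")
```

Under these assumed numbers, raising the hit rate from 90 to 99 per cent while tolerating more false positives leaves only roughly one alarm in five hundred reflecting a real threat – the arithmetic behind the desensitisation Betts describes.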
In the GenAI era, such distortions become automated, with probabilistic authority ceded to a single interpretation of events at the expense of competing perspectives. Even where multiple AI systems converge on similar assessments, such agreement may reflect algorithmic conformity rather than independent validation, creating a false sense of consensus while obscuring uncertainty and increasing the risk of strategic surprise.Footnote 39 Clausewitz’s warning remains prescient: ‘Many intelligence reports in war are contradictory; even more are false, and most are uncertain.’Footnote 40 GenAI does not resolve these challenges; it merely obscures them behind probabilistic formality.
According to Betts, a central problem in intelligence assessment is the lack of reliable metrics for weighing failures against successes; there is no way to know whether frequent successes on minor issues should be reassuring when offset by even a few failures on matters of critical importance.Footnote 41 Michael Handel has termed this phenomenon the ‘paradox of the self-negating prophecy’ – when intelligence assessments succeed in preventing a threat, their effectiveness can make the danger appear illusory in hindsight, creating the impression that resources devoted to countering it were unnecessary.Footnote 42 Proving otherwise requires counterfactual reasoning – demonstrating that, absent preventive action, outcomes would have been far worse.Footnote 43
Amy Zegart identifies what she terms the ‘seven deadly biases’ – confirmation bias, optimism bias, availability bias, fundamental attribution error, mirror imaging, framing bias, and groupthink – and shows how the institutional culture of intelligence agencies can amplify these biases, increasing the risk of misinterpretation and failure.Footnote 44 For example, Russia’s sclerotic, Soviet-inherited intelligence culture – marked by hierarchical deference, political conformity, and the discouragement of critical evaluation – produced distorted assessments that underestimated Ukrainian resistance and Western resolve, contributing to Moscow’s strategic miscalculations at the outset of the Russia–Ukraine war.Footnote 45
Philip Tetlock’s decades-long research on expert judgement echoes this concern, demonstrating that individuals vary significantly in their predictive accuracy and in their willingness to update their beliefs – and thereby to avoid overconfidence in any one theory, scenario, or AI-generated output.Footnote 46 ‘Hedgehog-like’ types – those who rely on a single overarching theory or framework – consistently underperform compared to ‘fox-like’ types who draw on diverse perspectives and update their beliefs in light of new information. GenAI risks amplifying hedgehog-like rigidity in human–AI interactions by generating simulations that present a dominant narrative or future scenario with algorithmic authority, thereby compressing deliberation time and reinforcing closed-mindedness (the rejection of dissonant alternatives), hubris, and overconfidence in AI-generated outputs.Footnote 47 Recent studies show that even experts often cede their own judgement when AI outputs are presented with high confidence or appear qualified (or ‘objective’), exhibiting a hedgehog-like epistemic rigidity by becoming overly reliant on a single model or interpretive framework.Footnote 48
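Tetlock’s distinction can be illustrated with the scoring rule his studies employ, the Brier score (0 is perfect; higher is worse). The stylised forecasts and outcomes below are invented, but they show why uniformly confident predictions of rare events are penalised far more heavily than hedged, case-by-case estimates.

```python
# Illustrative Brier-score comparison; forecasts and outcomes are invented.
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and what occurred."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Five hypothetical events; only the last one actually occurred (1 = occurred).
outcomes = [0, 0, 0, 0, 1]

hedgehog = [0.95, 0.95, 0.95, 0.95, 0.95]   # one theory, uniformly confident
fox      = [0.20, 0.10, 0.30, 0.20, 0.60]   # hedged estimates, updated case by case

print(f"hedgehog-like forecaster: {brier_score(hedgehog, outcomes):.3f}")
print(f"fox-like forecaster:      {brier_score(fox, outcomes):.3f}")
```

The same logic applies when an AI output is presented with unwarranted confidence: a user who adopts its single dominant scenario wholesale inherits the hedgehog’s error profile.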
The problem of overconfidence in AI outputs is compounded by automation bias, the well-documented tendency of human operators to defer to algorithmic recommendations even when contradictory or disconfirming evidence is available.Footnote 49 In high-stakes contexts such as aviation, medicine, and military operations, studies show that automation bias leads users to accept false positives, ignore contradictory cues, miss anomalies, and fail to cross-check automated advice against alternative sources.Footnote 50 These effects are especially pronounced under the intense stress and time pressure of nuclear crises.Footnote 51 Recent studies show that individuals with limited AI exposure tend to distrust it, while those with moderate familiarity often over-rely on it, creating a dangerous calibration gap in high-stakes human–AI interaction.Footnote 52
Although GenAI can, in principle, mitigate users’ overconfidence – for instance, by generating alternative contingency scenarios that challenge prevailing assumptions or by providing detailed explanations and decision rationales – it simultaneously risks inducing under-confidence, encouraging excessive caution by amplifying uncertainty or cognitively overburdening users.Footnote 53 Such caution can itself heighten tension in a crisis, when commanders need speed in decision-making. For example, a 2024 Carnegie Endowment-hosted Taiwan crisis wargaming simulation showed that leaders delayed decision-making as they paused to scrutinise the reasoning behind AI-generated recommendations.Footnote 54 Taken together, these opposing tendencies may disrupt the delicate calibration of trust in nuclear decision-making systems, leaving users vulnerable to both over-reliance on and undue hesitation towards GenAI outputs.
A related problem stems from the anthropomorphic design of algorithms, which mimic human reasoning, emotions, morality, and judgement.Footnote 55 By treating machines as if they were autonomous moral agents, human operators risk placing unwarranted trust in GenAI outputs during human–AI interactions.Footnote 56 This dynamic is reinforced – and, arguably, encouraged by design – through the sycophantic tendencies of LLMs, which align their outputs with users’ beliefs or preferences even at the expense of accuracy.Footnote 57 Such behaviour reinforces confirmation bias – the tendency to seek supporting evidence for a hypothesis – and, in extreme cases, can contribute to chat-induced delusion or even psychosis.Footnote 58 Recent evidence that advanced LLMs can engage in deliberate deception and scheming further compounds this danger, obscuring AI’s true capabilities or intentions in human–AI interactions.Footnote 59 These socio-technical dynamics risk not only skill atrophy among operators – by reducing opportunities for critical practice and judgement – but also the erosion of military ethics and the destabilisation of the delicate calibration of confidence and trust in AI within high-stakes nuclear decision-making environments.Footnote 60 The danger extends beyond any single flawed decision; without critical reflection, users risk subtly absorbing and reproducing the underlying biases embedded within the AI’s outputs.Footnote 61
Regardless of the quality of classified information, pinpointing the catalyst of a crisis is complicated by the inherent difficulty of accounting for randomness, opportunism, deception, and the element of surprise that shape leaders’ decisions.Footnote 62 Uncertainty is an intrinsic feature of strategic warning: decision-makers frequently revise their views or deviate from planned courses of action as new information emerges.Footnote 63 Although AI tools are increasingly being developed within the intelligence community to help analysts identify areas of potential instability and anticipate leadership behaviour, they remain fundamentally constrained by the unpredictability of human agency and the complex, non-linear dynamics that drive crisis onset and escalation.Footnote 64 For instance, the CIA has experimented with an AI chatbot to simulate conversations with world leaders. Similarly, Scale AI, a partner of the US Department of Defense, has developed a chatbot fine-tuned on sensitive or classified data.Footnote 65 With GenAI, however, the central challenge lies not in prediction alone but in how foresight itself is reshaped – altering how decision-makers perceive, interpret, and respond to emerging crises, often narrowing judgement under the guise, or even the illusion, of expanded anticipation.
‘Synthetic foresight’
Strategic foresight serves as an exploratory tool for systematically imagining and anticipating (not predicting) the future.Footnote 66 Unlike forecasting, foresight is more of an art than a science, enabling analysts to think about the unthinkable, such as nuclear war.Footnote 67 In the context of nuclear decision-making, its appeal – like in other domains – lies in a recognition that strategic surprise in nuclear affairs carries low-probability but catastrophic risks – or ‘tail-risk’ events.Footnote 68 During the height of the Cold War, American physicist Herman Kahn controversially argued that states were duty-bound to consider ‘how a [nuclear] war might be fought’ and won, as a means of forcing leaders to explore preferred outcomes short of nuclear annihilation.Footnote 69 In the nuclear domain, therefore, foresight offers forward-looking insights that are critical to sustaining deterrence stability, strategic resilience, and effective crisis management.Footnote 70 By emphasising adaptability and preparedness amid compounding uncertainty, foresight can enable decision-makers to engage with medium- and long-term risks, anticipate (rather than predict) emerging threats and disruptive trend drivers such as AI, challenge entrenched assumptions, and enhance institutional capacity for adaptation.Footnote 71
A variety of tools and techniques – such as fictional scenarios, table-top exercises, wargaming, red-teaming, and horizon scanning – are used to explore potential futures and identify early or ‘weak signal’ indicators of significant change.Footnote 72 These methods help analysts imagine a range of plausible outcomes, challenge prevailing assumptions, and reduce the risk of strategic surprise.Footnote 73 For instance, ongoing US NC3 modernisation efforts are exploring how AI-DSS can bolster contingency planning and real-time simulation capabilities.Footnote 74 These systems enable commanders to interact with dynamic ‘what-if’ scenarios, using algorithms that improve warhead-decoy discrimination and recommend context-specific courses of action across multiple crisis trajectories.Footnote 75 The French defence firm Thales, for example, has developed an AI-DSS known as ANTICIPE, which combines wargaming capabilities with advanced machine learning to deliver timely, actionable insights to military commanders during critical decision-making moments.Footnote 76
Foresight methods, particularly scenario planning, are, however, not without their critics. Residing largely outside traditional empirical social science fora, they are often faulted for their speculative nature, lack of testable hypotheses, and limited replicability.Footnote 77 Because the future is not observable, foresight relies on imagination, extrapolation, and narrative construction rather than verifiable data – raising persistent questions about how to define or measure ‘success’.Footnote 78 However, as Peter Schwartz retorts, scenario planning was never intended to predict outcomes with precision; instead, it was designed to discipline imagination and enhance preparedness for surprise in conditions where the stakes are high and the future inherently uncertain.Footnote 79 Like red-teaming and wargaming, scenario planning serves as a learning tool that encourages decision-makers to think beyond conventional wisdom and challenge entrenched assumptions. Within the intelligence community, GenAI is currently being used to generate scenarios and test hypotheses to address the persistent problem of intelligence failure.Footnote 80
Emotional dynamics and temporal distortions
This epistemic transformation, however, is not confined to how futures are imagined; it may also reconfigure the temporal rhythm of crisis decision-making – how time is experienced, interpreted, and acted upon by decision-makers.Footnote 81 Real-time, high-volume AI-generated simulations can undermine traditional command decision-making processes by triggering ‘anticipatory affect’ – emotional and physiological responses associated with imagining future events, such as dopamine-driven reward and optimism or cortisol-induced urgency and fear.Footnote 82 Neuroscientific research shows that elevated dopamine levels can accelerate one’s subjective perception of time, creating a sense that events are unfolding faster than they are, while cortisol heightens vigilance and stress reactivity.Footnote 83 Together, these effects can impair judgement under pressure by narrowing attention, accelerating risk-taking, or amplifying perceived threat.Footnote 84 Rather than acting as purely rational agents, decision-makers invariably operate along a continuum between deliberate calculation and emotionally driven intuition.Footnote 85
As Tetlock observes, people tend to be adaptively ‘deterministic thinkers’, approaching problems with the assumption that outcomes unfold along fixed, linear, and predictable causal chains – a tendency to impose order on chaos and uncertainty.Footnote 86 In the post–Iraq War investigations, David Kay, head of the Iraq Survey Group tasked with the search for weapons of mass destruction, remarked that ‘one of the hardest things to do in the world of intelligence is to discern change…when people’s behavior has been consistent, you tend to predict the future based upon the past’.Footnote 87 This tendency reflects a broader cognitive disposition: individuals gravitate towards information that confirms existing beliefs while discounting or ignoring evidence that challenges them.Footnote 88
This tendency is closely related to ‘hindsight bias’, the propensity to perceive past events as more predictable than they actually were, often giving people the impression that they ‘knew it all along’.Footnote 89 This mindset is particularly problematic in nuclear strategy, where escalation is too frequently imagined as a progressive linear ladder rather than as a process shaped by misperception, irrationality, deception, and non-linear dynamics.Footnote 90 In such crises, GenAI’s seemingly authoritative outputs risk reinforcing leaders’ search for parsimony, resistance to ambiguity, and propensity to misjudge probability and risk.Footnote 91 The Russian military, for instance, has framed AI as a tool to ‘eliminate uncertainty’ in both conventional and nuclear deterrence planning. This attitude illustrates how algorithmic outputs may be granted unwarranted epistemic authority.Footnote 92
GenAI constitutes an epistemic break from traditional foresight and forecasting methods. It is not a passive analytic tool but an active simulator, capable of generating adversary intentions, escalation pathways, and complex future crisis scenarios that – irrespective of their veracity or verifiability – appear plausible, even inevitable. Under the veneer of epistemic precision, such outputs risk acquiring prescriptive authority. In blurring the line between prediction and prescription, GenAI does not merely anticipate potential escalation pathways; it also risks shaping them into self-fulfilling prophecies, as simulated inevitabilities begin to influence the very behaviours they purport to forecast.
Strategic risks and epistemic distortions: An algorithmic Procrustean bed?
This section synthesises these risks and traces their consequences for strategic crisis decision-making, showing how they parallel and may amplify long-standing vulnerabilities in strategic judgement and intelligence analysis. It identifies three interrelated mechanisms with direct implications for nuclear crisis stability: the normalisation of low-probability, high-impact escalation paths; the misattribution of adversarial intent under the guise of algorithmic certainty; and the formation of synthetic feedback loops that can generate self-fulfilling escalation. While each mechanism is analytically distinct (cognitive and organisational entrenchment, signalling distortion, and systematic escalation), together they reinforce one another, producing a cumulative erosion of judgement under nuclear crisis conditions. These dynamics are, however, not deterministic. Under certain institutional, organisational, and design conditions, AI-enabled systems may instead support restraint, deliberation, and crisis management.
Normalisation of low-probability, high-impact escalation paths
The normalisation of low-probability, high-impact escalation paths (‘tail risks’) occurs when overconfidence – fuelled by automation bias and the anthropomorphising of AI – converges with an erosion of critical reasoning. Under these conditions, decision-makers may mistake synthetic scenarios for empirically validated predictions and abandon rigorous scrutiny, especially under acute pressure.Footnote 93 This danger extends beyond any single flawed decision: over time, leaders risk internalising the very biases embedded within AI systems, allowing these hidden assumptions to shape strategic thinking and institutional doctrine in ways that may go unchallenged.Footnote 94 Military leaders already tend to overweight tail-risk events such as nuclear use while underestimating stabilising, de-escalatory dynamics, reflecting a broader cognitive bias to misjudge probability under conditions of pressure and uncertainty.Footnote 95 As Richard Betts observes, intelligence analysis often defaults to worst-case assumptions in strategic contingency planning while inclining towards best-case assessments in operational analysis.Footnote 96
Although most decision-making reflects a blend of rational calculation and emotion, political and bureaucratic pressures – in an effort to maximise certainty – often push analysts towards the rational/non-emotional end of the spectrum, thereby eschewing nuance and other subtleties.Footnote 97 This dynamic has historically made intelligence officials reluctant to predict tail-risk events – since such forecasts, being improbable, are more often disproved than confirmed.Footnote 98 Historian Roberta Wohlstetter observed that intelligence analysts’ reluctance to ‘make bold assertions’ in the lead-up to Japan’s 1941 attack on Pearl Harbor – a product of bureaucratic distortion – was a central factor in the failure to anticipate the strike.Footnote 99 Similarly, during the Iraq War, analysts knew that downplaying Saddam Hussein’s WMD capabilities would invite backlash from superiors, making it difficult to challenge the prevailing official consensus.Footnote 100 GenAI could invert this bias: by continuously generating low-probability escalation scenarios and framing them as plausible futures, it risks normalising precisely the kinds of catastrophic contingencies that analysts once avoided staking their reputation on.
When GenAI generates low-probability escalation trajectories that are nonetheless compellingly framed, it risks reinforcing these distortions by transforming exploratory constructs into seemingly plausible futures that, once institutionalised, acquire prescriptive authority – shaping planning, expectations, and judgements as if they were predictive truths. If, for instance, a GenAI model frequently simulates a Chinese nuclear response to US naval activity in the Taiwan Strait, the repetition of this outcome – however implausible by historical standards – may normalise extreme responses in US planning circles.Footnote 101 Recent LLM-based wargame simulations underscore this danger: AI agents repeatedly escalated conflicts – even when neutrality would have been the rational course – not in response to adversary behaviour but due to their internal heuristics and training data.Footnote 102 If taken as authoritative, such outputs risk steering policy into escalation narratives divorced from real-world nuance, desensitising decision-makers to uncertainty and instilling an automated alarmist mindset.
Misattribution of adversarial intent cloaked in algorithmic certainty
A second pathway of risk lies in the misattribution of adversarial intent, where synthetic foresight transforms uncertainty into spurious certainty. In crises, ambiguous or contradictory behaviour – military manoeuvres, alerting patterns, deterrence signalling – can be algorithmically coded as decisive indicators of escalation because current AI systems are unable to reliably resolve ambiguity.Footnote 103 When such GenAI-generated scenarios are presented as prescriptive guidance, probabilistic inference is recast as empirical fact; adversaries may read them not as exploratory models but as declarations of intent.
Consider, for instance, the US strikes on Iran in 2025. When Iran retaliated against a US base in Qatar, its response was deliberately calibrated to avoid escalation, signalling restraint by ensuring the strike did not cause casualties.Footnote 104 Human analysts were able to interpret this nuance within its broader socio-political, cultural, and strategic context; in contrast, a GenAI system might have categorised the same action simply as evidence of aggression, failing to perceive the subtlety of Iran’s signalling and thereby increasing the risk of misinterpretation and inadvertent escalation. As past intelligence failures have demonstrated, over-reliance on technical systems without sufficient consideration of political and cultural context can distort strategic judgement. While AI excels within structured, data-rich environments, it is, on its own, unable to appreciate how meaning is conditioned by factors such as historical experience, societal context, or strategic culture.Footnote 105 Without human interpretation to situate machine outputs within these contexts, AI-DSS risk reinforcing existing analytical blind spots and biases and misrepresenting adversary intentions during crises.
GenAI synthetic foresight could either mitigate or exacerbate strategic surprise, depending on how AI-generated indicators are interpreted under stress. Even if an AI system might, in theory, reach a rational conclusion that the greater danger lies not in mutual hostility between adversaries but in failing to cooperate or de-escalate, human emotions would remain deeply entwined with machine outputs during a crisis.Footnote 106 While early signals could, in principle, support de-escalation or defensive preparation, in practice, the same socio-technical dynamics – automation bias, anthropomorphism, sycophancy, and confirmation bias – often amplify fear, urgency, and misperception. Under these conditions, human and machine cognition fuse under stress, jointly driving escalation dynamics rather than offsetting or restraining them.
Filtered through the illusion of algorithmic certainty, ambiguous signals may be misread as clear evidence of malign intent – or genuine hostility discounted. This transformation of uncertainty into apparent certainty compresses deliberation time and heightens the risk of inadvertent escalation or pre-emptive action. As Wohlstetter observed, it is expectations – grounded in beliefs about what is considered likely to occur – that determine which signals receive attention, a cognitive tendency that AI-generated ‘predictions’ risk reinforcing.Footnote 107 Moreover, expectations shaped by prior beliefs and perceptions tend to be resistant to change; new information – unless unambiguous – is typically assimilated in ways that reinforce existing worldviews.Footnote 108 GenAI may further entrench this cognitive-institutional inertia by modelling adversaries as predictable and coherent, even when their use of deception or ambiguity is deliberate. The historical record shows how preconceptions can blind analysts to adversary intent: ‘behavioral surprise’ – an adversary’s behaviour that is perceived as incompatible with the other side’s expectations – often goes unrecognised even in the face of technical indicators.Footnote 109
Once adversaries adjust their behaviour in anticipation of such outcomes, synthetic foresight becomes endogenous to the crisis – its outputs shaping expectations and actions, feeding back into escalation dynamics in ways that are both path-deterministic and self-reinforcing, thereby producing the very crises it was intended to prevent. This dynamic also risks states exploiting AI to ‘out-think’ and ‘out-imagine’ an adversary, creating a strategic landscape where the rules of engagement are undefined and in constant flux.Footnote 110
Synthetic feedback loops and self-fulfilling crises
When the normalisation of tail risks and the misattribution of intent converge, they can generate synthetic feedback loops in which GenAI forecasts not only interpret but also shape adversary behaviour and signalling – collapsing the boundary between anticipation and action. As low-probability outcomes are iteratively simulated, model outputs imbued with probabilistic authority, and ambiguous deterrence signals misread as malign intent (or genuine hostility discounted), synthetic foresight begins to drive behaviour on both sides as much as it interprets it. In this context, scenarios and simulations – for example, crisis and contingency exercises, forecasting models, and LLM-generated narratives – intended initially to caution against escalation may instead be invoked to justify it, transforming contingent possibilities into seemingly necessary and rational courses of action. Under these conditions, leaders become especially vulnerable to ‘action bias’ – the cognitive tendency to favour decisive moves over restraint, even when inaction would be the more rational or stabilising choice – further accelerating escalation within these self-reinforcing feedback loops.Footnote 111
This self-reinforcing dynamic poses especially stark dangers in nuclear crises, where compressed decision timelines, fragile deterrence signalling, and pervasive risks of misperception render the line between simulation and action exceptionally unstable.Footnote 112 The effects unfold along three pathways. First, a reliance on worst-case thinking in strategic planning can foster an escalation bias, prompting states to raise alert levels, accelerate launch-on-warning postures, or adopt other pre-emptive measures – steps that may, in turn, trigger reciprocal counteractions by adversaries.Footnote 113 Recent wargames and crisis simulations suggest that GenAI itself may compound this bias.Footnote 114 LLMs are trained primarily on the existing corpus of scholarly work on war and strategy – a body of literature that overwhelmingly emphasises escalation dynamics while giving far less attention to de-escalation. Because ‘non-events’ such as successful crisis management or conflict avoidance are inherently harder to document and analyse, they are systematically underrepresented in historical data.Footnote 115 This imbalance skews the training set, making GenAI simulations more likely to generate scenarios centred on escalation rather than restraint.
Second, policy inertia can take hold: once AI-generated scenarios are incorporated into contingency planning, bureaucratic momentum may entrench them within strategic culture, posture, and doctrine – embedding these synthetic futures as default assumptions.Footnote 116 Finally, normalisation can become self-reinforcing, with synthetic foresight hardening into a closed feedback loop that blurs the line between simulated possibilities and perceived realities. When one side interprets GenAI simulations as proof of imminent escalation and adjusts its posture accordingly, the other may read those moves as confirmation of hostile intent and respond in kind. In this way, scenarios depicting decapitation strikes or first-use contingencies can provoke pre-emptive counter-measures, igniting a spiral of escalation that ultimately produces the very crisis the system was meant to anticipate and prevent. As Betts cautions, ‘precautionary escalation or procurement may act as self-fulfilling prophecies, either through catalytic spirals of mobilisation or an arms race that heightens tension, or doctrinal hedges that make the prospect of nuclear war more thinkable’ – often on dubious grounds.Footnote 117
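A deliberately simple simulation can make this loop visible. In the sketch below – a caricature with invented parameters, not a model of any real system – each side’s decision-support model nudges its threat estimate upward whenever it observes the other side raising its alert level, and each side raises its alert level whenever its own estimate crosses a threshold.

```python
# Caricature of a synthetic feedback loop between two AI-advised adversaries.
# All parameters are invented; the point is the coupling, not the numbers.
def updated_estimate(estimate, adversary_alert_level, sensitivity=0.15):
    """Threat estimate drifts upward with the adversary's observed alert level."""
    return min(1.0, estimate + sensitivity * adversary_alert_level)

def alert_level(estimate, thresholds=(0.5, 0.7, 0.9)):
    """Map a threat estimate to a discrete alert level from 0 to 3."""
    return sum(estimate >= t for t in thresholds)

est_a, est_b = 0.55, 0.40   # initial threat estimates held by side A and side B
alert_a, alert_b = 0, 0     # both sides begin at routine alert

for step in range(6):
    est_a = updated_estimate(est_a, alert_b)
    est_b = updated_estimate(est_b, alert_a)
    alert_a, alert_b = alert_level(est_a), alert_level(est_b)
    print(f"step {step}: A estimate {est_a:.2f} (alert {alert_a}), "
          f"B estimate {est_b:.2f} (alert {alert_b})")

# Within a few iterations both estimates saturate at maximum alert, even though
# no hostile act by either side ever enters the model: each side's precautionary
# alert is read by the other's model as confirmation of hostile intent.
```

Real escalation dynamics are vastly more complex, but the sketch captures the endogeneity at issue: once each model’s output feeds the other’s input, escalation can be generated by the coupling itself rather than by any change in underlying intent.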
Strategic stress test: Crisis scenarios in the age of synthetic foresight
Fictional scenarios are an essential tool for strategic foresight, enabling policymakers and military planners to explore how emerging technologies, such as GenAI, could reshape future conflicts – particularly in areas with few historical precedents, like nuclear crisis decision-making.Footnote 118 In this respect, the scenarios function as qualitative stress tests analogous to exploratory wargaming, enabling structured examination of decision dynamics, escalation pathways, and points of potential de-escalation.Footnote 119 Well-crafted scenarios challenge entrenched assumptions, expose vulnerabilities and biases – especially hindsight bias and overconfidence – and help decision-makers anticipate ethical, operational, and strategic dilemmas under conditions of extreme stress and uncertainty.Footnote 120
The scenarios in this section illustrate how GenAI might be integrated into nuclear DSS. Each one highlights a different epistemic risk identified earlier in the article: the normalisation of low-probability, high-impact escalation paths; the misattribution of adversarial intent cloaked in algorithmic certainty; and the emergence of synthetic feedback loops that turn foresight itself into a self-reinforcing driver of escalation. The goal here is not to predict specific events, but rather to illuminate the epistemic blind spots and closed loops that can emerge when machine-generated probabilistic outputs collide with the political, organisational, and psychological realities of nuclear crisis management.
In addition to their heuristic value – sparking creative thinking and encouraging debate about alternative causal pathways – these scenarios represent a methodological experiment. They were developed with the assistance of LLMs, including ChatGPT, to examine how GenAI ‘thinks’ about nuclear crises: how it represents uncertainty, interprets adversary intent, frames human cognition, and reduces complex escalation dynamics into linear, internally coherent narratives. This approach responds to recent critiques that LLM-generated scenarios tend to privilege determinism and escalation, offering overly simplified portrayals of crisis behaviour and conflict resolution.Footnote 121 The scenarios highlight how GenAI systems do not merely describe crises but also actively shape strategic foresight, interpretation, and response – often in ways that diverge from the contingent, ambiguous realities of human decision-making and the messy unpredictability of real-world strategic environments.
By juxtaposing GenAI-generated narratives with the author’s own analysis, the scenarios expose the embedded assumptions, biases, and blind spots within current GenAI systems, offering a reflexive critique of the technology itself. In this way, the scenarios generate structured qualitative insights into escalation triggers, interpretive bias, and decision-making friction in human–machine interactions during crises. The purpose of employing LLMs here is not to predict events, but to demonstrate how these systems construct alternative futures and, in doing so, can influence human decision-making. This reflexive use of GenAI highlights its dual role as both an analytical tool and an epistemic actor, underscoring how GenAI can simultaneously expand and distort strategic imagination in nuclear crisis contexts.
Finally, the scenarios serve as a bridge between theory and practice. They show how the epistemic risks discussed earlier might manifest in future nuclear crises and underscore one of the article’s central arguments: governing synthetic foresight is not simply a technical challenge to be addressed through better data or model refinement. The real danger lies in how GenAI transforms the strategic ecosystem itself – reshaping how states perceive threats, make decisions under uncertainty, and interact with one another.
Scenario 1: Simulated inevitability in the Taiwan Strait
The year was 2028, and tensions in the Taiwan Strait were at their highest point in decades. A US carrier strike group moved closer to the island as a show of support after a series of Chinese military drills.
In the US Pacific Command’s operations centre, Starfire, a next-generation GenAI-DSS, was running continuous real-time scenarios. Its mission was not only to anticipate Chinese moves and recommend responses faster than any human team could but also to outpace China’s own AI tools – to get inside their observe–orient–decide–act (OODA) loop before Beijing could act.Footnote 122
On the screen, a flashing red warning appeared: ‘82 per cent probability of Chinese missile strikes on Guam or Okinawa within 96 hours.’
The room went silent.
‘Eighty-two per cent?’ Admiral Lewis asked, leaning forward.
‘Yes, sir’, the analyst replied. ‘The model integrated satellite imagery, intercepts, social media chatter, and historical Chinese behaviour. It’s never given a probability this high before.’
Colonel Ryan, an intelligence officer, hesitated. ‘Sir, there are other interpretations. This could be a show of force, a large-scale drill, or deterrence signalling – not necessarily prep for a strike.’
Another officer added cautiously, ‘The AI has been wrong before, sir. There are still diplomatic backchannels open.’
Lewis glanced at the countdown clock on the screen. ‘We don’t have the luxury of doubt. If the AI’s right and we sit on our hands, we’ll be explaining to Congress why we lost Guam. We can’t afford another Pearl Harbor!’
Alternative interpretations quickly faded under the weight of the machine’s precision. Starfire didn’t just provide numbers; it generated immersive visualisations – projected missile arcs, satellite overlays, and simulated Chinese decision chains, including deployments by People’s Liberation Army Rocket Force (PLARF) units with dual-capable missile systems. Under extreme time pressure, what began as exploratory foresight of possible futures came to be treated as a definitive prediction.
US forces were ordered to raise alert levels, disperse dual-capable aircraft to hardened airfields, and surge additional assets into the region. Watching through their own AI-powered monitoring systems, Chinese commanders misread these steps as preparations for a US decapitation strike, possibly targeting their nuclear command, control, and communications (NC3) nodes.
In Beijing, their own DSS-generated report concluded: ‘Updated probability: 91 per cent – U.S. strike likely within forty-eight hours. Recommend immediate counterforce readiness.’
China mobilised its missile brigades, including transporter–erector–launcher (TEL) units. Starfire then interpreted these movements as confirmation of its initial warning, raising its estimate to 95 per cent. The machine’s growing certainty drowned out dissent. Analysts who urged caution were dismissed as ‘human noise’, their warnings eclipsed by the AI’s confident, visually compelling simulations.
This scenario illustrates how rival GenAI systems can create competitive, self-reinforcing feedback loops, with each side racing to ‘get inside’ the other’s decision-making cycle. Under extreme stress and compressed timelines, US leaders became over-reliant on Starfire’s seamless visualisations, treating its outputs as authoritative and prescriptive rather than exploratory. Simultaneously, Chinese commanders interpreted these AI-driven manoeuvres as evidence of hostile US intent. In this high-stakes environment, foresight became foreboding: simulations hardened into inevitabilities, off-ramps for de-escalation vanished, and escalation rapidly shifted from an abstract possibility to an impending reality. Dissenting voices were silenced by the machine’s confidence, as every move by one side validated the other’s worst fears, locking both sides into a cycle of escalating mistrust.
Scenario 2: Misattributed intentions at the brink in South Asia
The year was 2029. After a bombing in Kashmir, New Delhi faced fierce public pressure to respond while foreign capitals urged restraint. In the crisis room, India’s new GenAI-DSS, Trinetra, fused satellite imagery, signals intercepts, rail-freight logs, and social media feeds. A red banner slid across the main display: ‘74 per cent: Pakistan preparing nuclear signalling within 72 hours.’ Intent classification: escalatory (high confidence).
‘High confidence?’ the national security advisor asked.
‘The model is weighing nighttime fuel convoys and road closures near missile brigades’, the systems engineer said. ‘It maps to prior crisis patterns.’
A senior analyst pushed back. ‘Those convoys match their annual exercise window. We’ve seen no warhead mating, no dispersal to Nasr (Hatf-IX) short-range ballistic missile units, and command-and-control traffic is routine. Hotlines are still active; backchannels say Islamabad wants to cool things.’
The room tightened. ‘And if you’re wrong?’ the defence minister asked. ‘If we downplay this and it isn’t an exercise?’
Trinetra didn’t just offer percentages; it rendered animated scenarios – transporter–erector–launcher (TEL) columns fanning out, a ladder of ‘demonstration’ launches, projected timelines to threaten Indian command-and-control nodes. Under time pressure, the visuals felt less like possibilities and more like proof.
The analyst tried again, anchoring her argument. ‘Consider the 2025 US–Iran exchange. After US strikes, Iran’s retaliation on a US base in Qatar was deliberately calibrated to avoid casualties – a signal of restraint that human analysts read correctly in context. A GenAI classifier might have tagged the very same act as “aggression” and pushed for escalation. We could be making that mistake now.’
Silence. The minister exhaled. ‘We can’t gamble. Adopt minimum-ready. Selective dispersal, air-defence reinforcement, heightened patrols – quietly.’
Across the border, Pakistan’s lighter DSS, Shaheen, flagged India’s moves:
‘Updated: 81 per cent – Indian pre-emptive decapitation posture.’
Recommendation: alert nuclear units; activate communications redundancy.
Islamabad announced a ‘readiness check’, moved a missile battery, and closed two bases ‘for safety’. To Trinetra, these were confirmatory signals; its ‘intent’ panel flipped to hostile as the confidence bar climbed.
‘Sir’, the analyst said softly, ‘their rhetoric is muted, and their air tasking is thin. This still looks like deterrence theatre.’
The machine’s clarity swallowed her caution. The prime minister glanced at the countdown clock. ‘Proceed with the alert.’
This vignette illustrates how ambiguous actions – such as military exercises, convoy movements, or base closures – can be misread as hostile intent when processed through the lens of algorithmic probabilistic determinism. The AI’s clean, binary ‘intent’ label strips away ambiguity, making human restraint appear indecisive against the backdrop of its empirically compelling simulations and steadily rising confidence alerts. As a result, Pakistan interprets India’s defensive dispersals as preparations for a pre-emptive strike, while India sees Pakistan’s routine readiness checks as confirmation of aggressive intent. Under intense stress and compressed decision timelines, these misattributions harden into assumptions. Hotlines and diplomatic off-ramps are eschewed, and the crisis edges towards dangerous nuclear signalling – not because either side’s actual intentions changed, but because the machine determined they had.
Conclusion
This article has argued that GenAI represents a transformative epistemic force in future nuclear crisis decision-making. By blurring the distinction between forecasting and foresight, as well as between anticipation and action, GenAI does more than enhance traditional strategic warning and decision-support functions. It actively generates synthetic simulations and scenarios that constitute what this article terms ‘synthetic foresight’ – a novel form of algorithmically mediated imagination that can fundamentally alter how decision-makers perceive threats, interpret signals, and respond to crises.
These dynamics produce three interlocking strategic risks for crisis stability. First, the normalisation of low-probability, high-impact escalation paths, in which catastrophic scenarios – such as nuclear use – become increasingly ‘thinkable’ and embedded within military organisations, strategic culture, defence planning, and doctrine. Second, the misattribution of adversarial intent, as ambiguous actions are reframed through the lens of algorithmic certainty, potentially leading to dangerous misinterpretations and inadvertent escalation. Third, the formation of synthetic feedback loops, where machine-generated outputs begin to shape both sides’ behaviours and perceptions, transforming simulated futures into self-fulfilling crises.
Governing these risks requires more than improving data provenance, fine-tuning algorithms, or enhancing transparency through explainable AI (XAI).Footnote 123 Because of their strategic acuity and speed, and the trade-off between AI creativity and human comprehensibility, advanced AI systems are unlikely to be explainable in ways that fully convey their underlying reasoning to human users during real-time crisis decision-making. Effective governance must therefore extend beyond technical fixes to encompass institutional reforms and practices that address the cognitive, organisational, and socio-technical dimensions of human–AI interaction in crisis settings.
First, to counter cognitive biases, decision-makers can adopt ‘cognitive forcing’ techniques – methods adapted from aviation and medicine – that shift thinking from fast, intuitive responses to slower, deliberate analysis.Footnote 124 Doctors, for instance, take ‘diagnostic timeouts’ to verify reasoning, while pilots use structured checklists before take-off. These techniques, combined with strengthening metacognitive skills – critical reflection on one’s own thinking – can help leaders resist automation bias and algorithmic overconfidence while fostering imagination and strategic creativity.Footnote 125 This reframes scenario planning not as prediction but rather as a tool for cultivating critical thinking in the face of deep uncertainty.
Second, military professional education and training must adapt to mitigate skill atrophy, anthropomorphism, and over-reliance on machine outputs.Footnote 126 Upskilling operators and analysts to understand both the strengths and limits of GenAI systems is essential.Footnote 127 This includes raising awareness of biases such as automation bias, confirmation bias, action bias, and hindsight bias, and training personnel to critically interrogate algorithmic recommendations rather than deferring to them. By integrating these lessons into officer education, war colleges, and NC3 training programmes, states can preserve human judgement as a vital component of crisis decision-making – even in highly automated environments.
Third, GenAI-enhanced wargaming offers a promising tool for both research and practice.Footnote 128 The fictional scenarios presented here illustrate how such exercises can surface hidden assumptions, decision biases, and escalation pathways before they manifest in real crises. Human–AI and AI–AI wargames can empirically test the dynamics identified in this article by observing how different configurations of human and machine decision-making interact under simulated crisis conditions. Such exercises could reveal hidden biases in GenAI systems and human operators, show how machine-generated scenarios influence human perceptions and emotions, and provide a controlled setting to evaluate mitigation strategies before they are applied to real-world NC3 contexts.
The article’s findings also have broader implications for crisis management beyond the nuclear domain. As GenAI-enabled systems are adopted in other high-stakes environments – such as cybersecurity, biosecurity, and space operations – the same epistemic distortions may emerge, reshaping how leaders interpret early warning signals and coordinate responses under intense time pressure and psychological stress.
Future research should extend these empirical studies by exploring how different political, cultural, and organisational contexts shape and mediate the effects of GenAI adoption in high-stakes settings. Comparative studies could examine how rival powers – such as the United States, China, and Russia – integrate GenAI into their conventional and nuclear command-and-control systems, and whether asymmetric adoption patterns create new pathways to instability and escalation. Only by confronting these human–machine entanglements can states hope to preserve crisis stability in the age of synthetic foresight. Without such reflection, GenAI risks becoming an algorithmic ‘Procrustean bed’, forcing complex, ambiguous, and consequential realities to conform to machine-generated futures – narrowing strategic imagination precisely when human flexibility, intuition, and emotion are most needed.Footnote 129
Acknowledgements
The author would like to thank the anonymous EJIS reviewers for their constructive comments on the earlier versions of the manuscript.
Financial statement
This research was supported by Longview Philanthropy. The financial sponsor played no role in the research process.
James Johnson is Senior Lecturer in Strategic Studies in the Department of Politics and International Relations at the University of Aberdeen. He is the author of The AI Commander: Centaur Teaming, Command, and Ethical Dilemmas; AI and the Bomb: Nuclear Strategy and Risk in the Digital Age; and Artificial Intelligence and the Future of Warfare: USA, China & Strategic Stability.