Impact Statement
This work redefines the role of AI in turbulence modelling by engaging an LLM not as a tool, but as a collaborator in the scientific discovery process. Through a structured, iterative human–AI interaction, the LLM generates, reasons about and refines wall models for LES under complex non-equilibrium conditions – rediscovering known strategies and proposing new, physically interpretable formulations that outperform traditional models. The process leverages the LLM’s comprehensive knowledge of the turbulence modelling literature – far exceeding that of any individual human. The results demonstrate, for the first time, that general-purpose LLMs can contribute meaningfully to open-ended, physics-based turbulence modelling tasks, suggesting a new paradigm in which AI actively advances core areas of fluid mechanics, rather than merely accelerating existing workflows.
1. Introduction
Artificial intelligence (AI) has achieved human-like performance in tasks once thought to require uniquely human intuition – mastering the game of Go (Silver et al., Reference Silver, Schrittwieser, Simonyan, Antonoglou, Huang, Guez, Hubert, Baker, Lai, Bolton, Chen, Lillicrap, Hui, Sifre, van den Driessche, Graepel and Hassabis2017), predicting protein structures (Jumper et al., Reference Jumper, Evans, Pritzel, Green, Figurnov, Ronneberger, Tunyasuvunakool, Bates, Žídek, Potapenko, Bridgland, Meyer, Kohl, Ballard, Cowie, Romera-Paredes, Nikolov, Jain, Adler, Back, Petersen, Reiman, Clancy, Zielinski, Steinegger, Pacholska, Berghammer, Bodenstein, Silver, Vinyals, Senior, Kavukcuoglu, Kohli and Hassabis2021) and driving through the streets (Yurtsever et al., Reference Yurtsever, Lambert, Carballo and Takeda2020). These advances have fuelled speculation about the advent of general artificial intelligence (GAI), an AI that can reason, adapt and solve problems across domains. A related concept gaining traction is the AI agent – a system capable of autonomous decision-making and iterative improvement in pursuit of a human-defined goal (Buehler, Reference Buehler2024; Ni and Buehler, Reference Ni and Buehler2024; Pandey et al., Reference Pandey, Xu, Wang and Chu2025; Dong et al., Reference Dong, Lu and Yang2025). While most practical AI systems today remain narrow in scope, the emergence of large language models (LLMs) has narrowed the gap between domain-specific tools and general-purpose intelligence. These models, trained on vast corpora of human knowledge, can synthesise information, generate code and reason over complex topics (Chang et al., Reference Chang, Wang, Wang, Wu, Yang, Zhu, Chen, Yi, Wang and Wang2024). Despite the hype surrounding GAI, compelling demonstrations of AI contributing new scientific insights – particularly in the physics – remain rare.
Among the most enduring grand challenges in physics is turbulence – a chaotic, multi-scale phenomenon that resists closed-form description and predictive modelling. Despite over a century of effort, turbulence modelling remains largely empirical, guided by human intuition, physical reasoning and hard-won insights from data (Meneveau and Katz, Reference Meneveau and Katz2000; Piomelli and Balaras, Reference Piomelli and Balaras2002; Durbin, Reference Durbin2018). Yet, it underpins critical applications ranging from climate prediction (Alizadeh, Reference Alizadeh2022) and aerospace design (Mani and Dorgan, Reference Mani and Dorgan2023) to wind and energy systems (Stevens and Meneveau, Reference Stevens and Meneveau2017). The gold standard of predictive fidelity, direct numerical simulation (DNS), is limited to canonical flows at modest Reynolds numbers due to its extreme computational cost (Yang and Griffin, Reference Yang and Griffin2021; Choi and Moin, Reference Choi and Moin2012). Large-eddy simulation (LES), which resolves large-scale motions while modelling smaller ones, offers a more tractable alternative (Goc et al., Reference Goc, Lehmkuhl, Park, Bose and Moin2021, Reference Goc, Moin, Bose and Clark2024) – but its cost remains prohibitive in high-Reynolds-number applications, especially near walls where turbulent eddies scale with their distance from the wall (Marusic and Monty, Reference Marusic and Monty2019). This makes wall modelling the pacing item for extending LES to realistic flows (Bose and Park, Reference Bose and Park2018; Larsson et al., Reference Larsson, Kawai, Bodart and Bermejo-Moreno2016).

Figure 1. (a) Schematic of wall-modelled LES (WMLES). A wall model predicts the wall fluxes – shear stress
$\tau _w$
and heat flux
$q_w$
– based on LES-resolved flow quantities at a distance
$h_{\textrm wm}$
from the wall. (b) Blades in a turbine, illustrating flows subjected to non-equilibrium effects such as APGs (red box), system rotation (orange box) and surface roughness (purple box). (c) Model problems. From left to right: channel subjected to a suddenly imposed APG, channel with system rotation and channel with roughness on the bottom wall.
Figure 1a schematically illustrates wall modelling in the context of LES. The LES grid in the near-wall region typically scales with the outer layer rather than the local eddies, leaving the wall layer unresolved. A wall model is therefore used to reconstruct the near-wall turbulence and predict wall fluxes, such as the wall shear stress
$\tau _w$
and heat flux
$q_w$
, based on LES-resolved flow quantities at a distance
$h_{\textrm wm}$
from the wall – often referred to as the LES/wall-model matching location. The most widely used wall model is the equilibrium wall model (EWM) (Schumann, Reference Schumann1975; Kawai and Larsson, Reference Kawai and Larsson2012; Yang et al., Reference Yang, Park and Moin2017), which assumes the law of the wall (LoW) holds between the wall and
$h_{\textrm wm}$
locally and instantaneously. However, the LoW is valid only under equilibrium conditions for the mean flow. Consequently, the EWM falls short when applied to the non-equilibrium boundary layers. Figure 1b highlights a representative case in which near-wall turbulence is affected by non-equilibrium effects, including APGs, system rotation and surface roughness. Efforts have been made to address these non-equilibrium effects in the contexts of wall models. In the following, we highlight a few of these developments. Park and Moin (Reference Park and Moin2014) proposed a dynamic non-equilibrium wall model that accounts for the temporal lag between outer-layer changes and the wall shear stress response. Yang et al. (Reference Yang, Sadique, Mittal and Meneveau2015) developed an integral wall model based on the momentum integral equation, enabling improved representation of mean velocity profiles. Bose and Moin (Reference Bose and Moin2014) introduced a dynamic slip boundary condition that allows the wall shear stress to adapt to the large-scale structures resolved by LES. Building on this concept, Bae et al. (Reference Bae, Lozano-Durán, Bose and Moin2019) formulated a dynamic slip wall model with a self-consistent treatment of the slip length. Fowler et al. (Reference Fowler, Zaki and Meneveau2022) introduced a Lagrangian relaxation wall model that does not impose the LoW instantaneously, but instead allows the modelled stress to evolve towards equilibrium over time, thereby better capturing non-equilibrium effects such as pressure gradients. A more comprehensive review of recent progress in wall modelling is available from Fowler et al. (Reference Fowler, Zaki and Meneveau2022), Bose and Park (Reference Bose and Park2018) and Yang et al. (Reference Yang, Chen, Zhang and Kunz2024
b), and is not repeated here for brevity. While these advances have significantly expanded the applicability of LES, the process of model development remains human-driven – rooted in hypothesis generation, expert intuition and iterative refinement. In this context, near-wall turbulence not only remains a persistent bottleneck, but also serves as an ideal testbed for exploring whether AI can drive physical model development.
Recent years have seen a surge of interest in applying AI to turbulence modelling. In particular, machine learning (ML) tools have been used to develop turbulence closures by training on data from high-fidelity simulations and experiments (Duraisamy et al., Reference Duraisamy, Iaccarino and Xiao2019; Pandey et al., Reference Pandey, Schumacher and Sreenivasan2020; Shan et al., Reference Shan, Liu, Cao, Sun and Zhang2023; Bin et al., Reference Bin, Chen, Huang and Yang2022, Reference Bin, Huang and Yang2023). These efforts have yielded subgrid-scale models for LES (Maulik et al., Reference Maulik, San, Jacob and Crick2019; Cheng et al., Reference Cheng, Giometto, Kauffmann, Lin, Cao, Zupnick, Li, Li, Huang, Abernathey and Gentine2022; Xie et al., Reference Xie, Wang and W.2020), Reynolds stress closures for Reynolds-averaged Navier–Stokes (RANS) (Ling et al., Reference Ling, Kurzawski and Templeton2016; Parish and Duraisamy, Reference Parish and Duraisamy2016; Wang et al., Reference Wang, Wu and Xiao2017; Bin et al., Reference Bin, Hu, Li, Grauer and Yang2024 a, b; Wu et al., Reference Wu, Zhang and Zhang2025) and wall models for LES (Yang et al., Reference Yang, Zafar, Wang and Xiao2019; Bae and Koumoutsakos, Reference Bae and Koumoutsakos2022; Vadrot et al., Reference Vadrot, Yang, Bae and Abkar2023 b; Zhou et al., Reference Zhou, He and Yang2021; Ma and Lozano-Durán, Reference Ma and Lozano-Durán2025). Given the focus of this work on near-wall turbulence modelling, we briefly review several notable data-driven wall modelling efforts. Yang et al. (Reference Yang, Zafar, Wang and Xiao2019) proposed a predictive wall model based on supervised training of neural networks. The model learns a direct mapping between resolved flow quantities at the matching location and wall fluxes, with the training guided by the LoW. Zhou et al. (Reference Zhou, He and Yang2021) adopted a similar supervised learning approach and demonstrated its effectiveness in LES of periodic hill flows, accurately capturing flow separation and reattachment. Bae and Koumoutsakos (Reference Bae and Koumoutsakos2022) introduced a multi-agent reinforcement learning (RL) framework, in which local agents infer wall stress through trial-and-error interactions with the LES environment, with rewards based on physical performance metrics such as velocity field accuracy. Vadrot et al. (Reference Vadrot, Yang, Bae and Abkar2023 b) extended this RL approach and demonstrated that the trained model can recover the logarithmic law of the wall across a range of Reynolds numbers. Lozano-Durán and Bae (Reference Lozano-Durán and Bae2023) proposed a machine learning wall model based on the principle of composability. Rather than training on complex geometries directly, the model is trained on simple ‘building-block’ flows – canonical configurations such as turbulent channel and Couette flows – with the assumption that more complex boundary layer behaviours can be composed from these elementary patterns. Additional reviews of recent developments in data-driven wall modelling can be found from Vadrot et al. (Reference Vadrot, Yang and Abkar2023 a) and are not repeated here for brevity. Despite these advances, the development process remains fundamentally human-guided. Researchers define the modelling objectives, curate training datasets, select architectures and tune loss functions. The resulting models are typically evaluated on predefined benchmarks. This raises a broader question: can AI be tasked with open-ended problem solving in fluids engineering, where the objective is to iteratively generate and refine physically meaningful models?
In this study, we present a new paradigm in which a general-purpose LLM is tasked with solving problems in turbulence modelling. Unlike conventional applications where AI systems are trained to fit existing data within predefined architectures, we employ DeepSeek-R1, an open-weight LLM, as an autonomous agent operating within a closed-loop modelling framework. The LLM is prompted to generate strategies for near-wall turbulence modelling, which are then assessed both a priori and a posteriori. Based on performance feedback, the LLM revises its models through multiple iterations – mirroring the hypothesis generation, testing and refinement cycles traditionally performed by human researchers. This approach departs from standard one-shot benchmark-driven machine learning workflows such as reported by Pröhl et al. (Reference Pröhl, Putzier and Zarnekow2024) and Jiang et al. (Reference Jiang, Gao and Karniadakis2025). The present framework enables the model to engage in long-chain reasoning and to propose modelling strategies that exhibit both interpretability and performance. We note that the choice of DeepSeek-R1 here over more widely used commercial models, such as ChatGPT or Claude, is motivated by two key considerations. First, DeepSeek-R1 is open-source, which enables full control over prompt design, reproducibility and model deployment. Second, recent benchmarks suggest that DeepSeek-R1 exhibits strong reasoning capabilities, particularly in physics and mathematics domains (Gao et al., Reference Gao, Jin, Ke and Moryoussef2025).
This article focuses on three longstanding challenges in near-wall turbulence modelling: the effects of APGs, system rotation and surface roughness. These physical mechanisms often appear simultaneously in engineering applications such as turbomachinery and aerospace flows. However, modelling their combined effects directly in complex configurations – such as those illustrated in Figure 1b – is ill-advised. The intertwined influence of multiple non-equilibrium effects makes it difficult to assess modelling errors. Here, we adopt a classical strategy in turbulence research: decomposing a complex modelling problem into a set of well-defined model problems. As sketched in Figure 1c, we study each effect individually in the context of periodic channel flows, which provide controlled environments for isolating the impact of APG, rotation or roughness. This serves multiple purposes. First, it enables scientific clarity – allowing us to assess whether the AI can identify the dominant mechanisms associated with each effect. Second, it facilitates validation against available high-fidelity data from DNS, which exists for these canonical flows – at least for mean velocity profiles and wall stresses (Yang et al., Reference Yang, Xia, Lee, Lv and Yuan2020 b; Chen et al., Reference Chen, Wu, Griffin, Shi and Yang2023; Huang and Yang, Reference Huang and Yang2021; Yang et al., Reference Yang, Zhang, Yuan and Kunz2023; Nair et al., Reference Nair, Wadhai, Kunz and Yang2024). Third, it mirrors the historical development of turbulence models, where insight is gained from simplified configurations before generalisation to more complex flow scenarios. For each model problem, DeepSeek-R1 is prompted to develop near-wall models. The proposed models are then evaluated in canonical LES settings and their performance is used to iteratively refine subsequent formulations. In doing so, we shall see that the AI demonstrates an emerging capability for open-ended problem solving in fluids engineering.
The rest of the paper is organised as follows. In § 2, the methodology is described. Results are presented in § 3, followed by concluding remarks in § 4.
2. Methodology
2.1. Large language model
Our AI capabilities are driven by DeepSeek-R1, leveraging the computational power and infrastructure of the SiliconCloud API for seamless integration and performance. DeepSeek-R1 is an LLM released in January 2025 by DeepSeek AI (Guo et al., Reference Guo, Yang, Zhang, Song, Zhang, Xu, Zhu, Ma, Wang and Bi2025). It is an open-weight, transformer-based model designed to support long-form reasoning with a focus on handling complex, multi-step problems through iterative token generation and context management. The model is built on a dense decoder-only transformer architecture with 61 layers and approximately 671 billion parameters (Guo et al., Reference Guo, Yang, Zhang, Song, Zhang, Xu, Zhu, Ma, Wang and Bi2025), making it one of the largest openly accessible models in terms of parameter count. It supports a 32 000-token context window, facilitating the processing of long documents and maintaining coherence across extended dialogues or analytical tasks. The model was pretrained on a diverse mixture of high-quality data sources, including curated web documents, source code from multiple programming languages and scholarly articles from various scientific domains, which enhances its capability in both general and technical contexts. This was followed by extensive instruction tuning using supervised fine-tuning and reinforcement learning from human feedback (RLHF) to align its outputs with user intentions while preserving reasoning depth. The reasoning capabilities of Deepseek-R1 have been benchmarked against state-of-the-art models across scientific question-answering tasks, maths word problems and theorem proving. It consistently showing competitive performance, often matching or exceeding the performance of other leading models in complex reasoning scenarios.

Figure 2. A flow diagram representing the interaction between the engineers and LLM.
Figure 2 illustrates the collaborative workflow between the engineers and the LLM, outlining the iterative process of proposal, evaluation and refinement of candidate models. When interacting with the LLM, we play the role of a fluids engineer, responsible for defining the modelling objectives and evaluating candidate solutions. The LLM, in turn, acts as the modeller, tasked with generating modelling strategies and outlining potential implementation pathways. The interaction begins by presenting the LLM with a flow configuration and the associated modelling challenge. It is then prompted to propose candidate solutions, often followed by clarification prompts to elicit additional details or justification needed for evaluation and practical implementation. In contrast to benchmark-style evaluations, which typically involve a single round of interaction focused on verifying the correctness of LLM responses, our process seeks to mimic the communication between a turbulence modeller in academia and a fluids engineer in industry. This communication is inherently iterative, involving multiple rounds of feedback and refinement. Here, multiple rounds of interaction are critical: they provide opportunities for both the user and the LLM to correct omissions or misunderstandings. For example, the instructions we initially provide may be incomplete or the models proposed by the LLM may contain errors. These shortcomings are often corrected through subsequent interactions, without explicit acknowledgment from either party. Once a potentially viable solution is provided, we proceed to formal evaluation. Two types of evaluation are employed. The first is based on feasibility: some solutions may require unavailable training data or demand impractical levels of manual coding, and such proposals are discarded. The second concerns accuracy, assessed through both a priori evaluations – focused on consistency with empirical knowledge – and a posteriori assessment via implementation and testing in CFD simulations. Each candidate model must be rigorously evaluated through high-fidelity numerical simulation before it can be considered a viable solution. Even if a model fails the a posteriori test, the simulation results still offer valuable performance feedback. The feedback is subsequently used to guide the LLM in refining the candidate model. The iterative process terminates once a candidate model achieves satisfactory accuracy in the target flow configurations. Further details regarding the iterative interaction and evaluation process are provided in § 3.
2.2. Computational fluid dynamics
We solve the filtered incompressible Navier–Stokes equations, written in index notation with summation over repeated indices:
where
$\tilde {u}_i$
and
$\tilde {p}$
are the resolved velocity and pressure, respectively;
$\rho$
is the fluid density,
$\nu$
is the kinematic viscosity and
$\Omega _j$
denotes system rotation. The subgrid-scale (SGS) stress tensor
$\tau _{ij}$
is modelled according to the Boussinesq hypothesis:
where
$\tilde {S}_{ij} = \frac {1}{2} \left ( \partial _j \tilde {u}_i + \partial _i \tilde {u}_j \right )$
is the resolved rate-of-strain tensor and
$\delta _{ij}$
is the Kronecker delta. The eddy viscosity
$\nu _t$
is computed using the Vreman model, whose details could be found from Vreman (Reference Vreman2004). Near-wall turbulence is not resolved and is modelled using a wall model. Two types are considered. The first is the EWM (Schumann, Reference Schumann1975; Piomelli et al., Reference Piomelli, Ferziger, Moin and Kim1989; Kawai and Larsson, Reference Kawai and Larsson2012; Yang et al., Reference Yang, Park and Moin2017), which imposes the LoW between the wall and the LES/wall-model matching location
$h_{ wm}$
:
\begin{equation} \tau _{w,x}/\rho =-\left [\frac {{U}^+_{||}}{f(y^+)}\right ]^2\frac {\tilde {u}_{ LES}}{{U}_{||}},\quad \tau _{w,z}/\rho =-\left [\frac {{U}^+_{||}}{f(y^+)}\right ]^2\frac {\tilde {w}_{ LES}}{{ U}_{||}}, \end{equation}
where the superscript
$+$
denotes normalisation by wall units,
${\textbf U}_{||}$
is the wall-parallel velocity,
$\tilde {u}_{ LES}$
and
$\tilde {w}_{ LES}$
are the LES velocity in the streamwise and the transverse directions at the LES/wall-model matching location. In all of our WMLESs,
$h_{wm}/h\approx 0.2$
and
$f(y^+)$
is Spalding’s LoW. The algebraic EWM in (2.4) has an ODE counterpart:
where
is the wall model eddy viscosity. The models in both (2.4) and (2.5) are equivalent. They both impose the LoW between the matching location and the wall. In the logarithmic region, the LoW is
with von Kármán constant
$\kappa \approx 0.41$
and intercept
$B \approx 5.2$
(Pope, Reference Pope2001). For rough-wall flows, a roughness function
$\Delta U^+$
is subtracted to reflect the downward shift in the velocity profile:
In addition to the EWM, the second type of wall model includes those proposed from DeepSeek-R1. These LLM-driven closures are presented in § 3. While the details of the models differ, their implementations are rather alike, and we place the LES/wall-model matching location at
$h_{ wm}/h\approx 0.2$
as well.
We focus on the model problems in Figure 1c. The flow configuration is a periodic channel. The flow is periodic in the streamwise and spanwise directions. The wall boundary condition is supplied by a wall model. Three flow effects are studied. First, a fully developed plane channel subjected to a suddenly imposed APG. Second, a plane channel subjected to spanwise system rotation. Third, channel flow with roughness on the bottom wall. Reference data are available from Chen et al. (Reference Chen, Wu, Griffin, Shi and Yang2023); Xia et al. (Reference Xia, Shi and Chen2016); Flack et al. (Reference Flack, Schultz, Barros and Kim2016); Medjnoun et al. (Reference Medjnoun, Rodriguez-Lopez, Ferreira, Griffiths, Meyers and Ganapathisubramani2021); Womack et al. (Reference Womack, Volino, Meneveau and Schultz2022); Flack and Schultz (Reference Flack and Schultz2023). The computational grid is uniformly spaced in all directions following the standard practice (Bose and Moin, Reference Bose and Moin2014; Park and Moin, Reference Park and Moin2014; Yang et al., Reference Yang, Pirozzoli and Abkar2020 a). Key simulation parameters, including domain size, Reynolds number and grid resolution, are listed in Table 1. The domain size, grid spacing and aspect ratios closely match those used in prior studies (Anderson et al., Reference Anderson, Yang, Shrestha and Awasthi2018; Abkar et al., Reference Abkar, Bae and Moin2016; Martinez-Tossas et al., Reference Martinez-Tossas, Churchfield, Yilmaz, Sarlak, Johnson, Sørensen, Meyers and Meneveau2018; Yang et al., Reference Yang, Abkar and Park2024 a), ensuring the credibility of performance evaluation and cross-study comparisons.
Table 1. Details of WMLESs. The friction Reynolds number is defined as
$Re_\tau = u_\tau h / \nu$
, where
$u_\tau$
is the friction velocity,
$h$
is the channel half-height and
$\nu$
is the kinematic viscosity. For cases involving a suddenly imposed APG,
$Re_\tau$
is the baseline value, i.e. at the time instant when the APG is applied. The APG strength is characterised by
$\Pi _0 = (h / \tau _{w,0}) (dp/dx)$
, where
$\tau _{w,0}$
is the wall shear stress prior to APG application. For cases with system rotation, the dimensionless rotation number is defined as
$Ro_\tau = 2 h \Omega / u_\tau$
, where
$\Omega$
is the rotation rate. Surface roughness is parametrised using the non-dimensional equivalent sandgrain roughness height,
$k_s^+ = k_s u_\tau / \nu$
. Case labels use the abbreviations APG, ROT and RW to indicate the presence of APG, rotation and roughness, respectively. The prefix ‘R’ denotes the nominal Reynolds number and is followed by
$Re_\tau / 100$
.

All simulations are performed using a finite-volume solver adapted for wall-modelled LES. The solver builds on the SUPES-cwm code base developed by Lv et al. (Reference Lv, Huang, Yang and Yang2021) and Gao and Lv (Reference Gao and Lv2025), which has been validated for channel flow calculations. It employs a fully conservative finite-volume formulation, nominal third-order spatial accuracy via characteristic-variable-based reconstruction and an explicit strong-stability-preserving Runge–Kutta scheme for time integration. Numerical fluxes are computed using a hybrid scheme blending central differencing and HLLC fluxes, with dissipation controlled via a blending factor. Further details of the code’s numerics could be found from Lv et al. (Reference Lv, Huang, Yang and Yang2021) and Gao and Lv (Reference Gao and Lv2025) along with a validation, and are not repeated here for brevity.
3. Results
3.1. Adverse pressure gradient
The presence of a pressure gradient – whether favourable or adverse – modifies the velocity distribution and the shear stress within the boundary layer, leading to history effects (Bobke et al., Reference Bobke, Vinuesa, Örlü and Schlatter2017; Chen et al., Reference Chen, Wu, Griffin, Shi and Yang2023) and significant departures from the canonical LoW behaviour observed under zero-pressure-gradient conditions (Perry et al., Reference Perry, Marusic and Jones2002; Volino, Reference Volino2020; Pozuelo et al., Reference Pozuelo, Li, Schlatter and Vinuesa2022). This deviation complicates near-wall turbulence modelling, which often relies on equilibrium assumptions.
In this subsection, we summarise our interactions with the LLM regarding this challenge, present the LLM-driven modelling solutions and assess its performance. The modelling challenge was presented to the LLM using the following prompt:
I am a fluid engineer. I use wall-modelled large-eddy simulation. I have the following flow problem: a fully developed channel flow with a suddenly imposed APG. The imposed APG is kept constant. You can imagine that the friction on the wall will decrease. The logarithmic law based wall model is not effective in simulating this flow. Please propose new models, taking into account simplicity and generality.
An overview of the interaction between the LLM and the user is illustrated in Figure 3, with the initial prompt summarised and labelled as A0. In the following, we document the communication process, highlighting how the LLM refines its reasoning, iteratively breaks down the problem and ultimately converges on a viable solution. The communication here uses approximately 7000 tokens.

Figure 3. Schematic overview of the interaction between the LLM and the user for the APG modelling problem.
The LLM proposes five potential modelling strategies, labelled A11 to A15 in Figure 3. Here, the four less successful strategies are discussed first, followed by the more promising one. The first strategy, A11, modifies the wall function. The LLM introduces the concept of an adjusted friction velocity, on which we prompt it to elaborate. The model takes the following form:
where
$\tau _{w,0}$
is the prediction of the EWM,
${\rm d}P^+$
is the dimensionless pressure gradient and
$\alpha$
is a model constant. Although the intermediate model forms are not shown here for brevity, we note that the initial model form was dimensionally inconsistent. Nonetheless, this error was corrected in subsequent responses without explicit prompting – an encouraging behaviour akin to human model development. Upon implementation, we observe an abrupt and unphysical change in wall shear stress upon the application of the APG, regardless of the choice of
$\alpha$
. Due to this fundamental misrepresentation of the flow physics, this strategy is discarded. The third strategy, A13, introduces a relaxation time scale to the wall stress evolution:
where
$\tau _w^n$
and
$\tau _w^{n+1}$
denote the wall shear stress at time steps
$n$
and
$n+1$
,
$\Delta t$
is the time step size,
$\tau _{w,0}$
is the EWM prediction, and
$T_r$
is a relaxation time scale. Analysis shows that this model acts as a low-pass temporal filter for wall shear stress, but does not effectively capture pressure gradient effects. When this limitation is communicated to the LLM, it reformulates the model by adding a source term accounting for the APG:
where
$\gamma$
is a model constant. A posteriori tests show that the model suffers from the same issue as that in (3.1) and is therefore discarded as well. The fourth strategy, A14, is an integral model:
Upon requesting clarification of the variables in the equation, we find that the model requires information at the edge of the boundary layer, which is generally not available in WMLES and is not desirable for practical implementation. The fifth strategy, A15, suggests a data-driven modelling approach. However, the lack of high-fidelity training data renders this strategy infeasible within the scope of the current study. The second strategy, A12, ultimately yields a viable solution. Upon follow-up discussions regarding the treatment of convective terms, the LLM refines the model from A12 to A22. The model is based on the thin boundary layer approximation, thus assuming
$\partial _y P = 0$
and neglecting the wall-normal velocity. The final model takes the following form:
\begin{equation} \begin{split} \frac {{\textrm D} u_{ wm}}{{\textrm D}t} = -\frac {1}{\rho }\frac {\partial P_{ LES} }{\partial x}+ \frac {\partial }{\partial y} \left [ (\nu +\nu _{t,0})\frac {\partial u_{ wm}}{\partial y} \right ],\\[6pt] \frac {{\textrm D} w_{ wm}}{{\textrm D}t} = -\frac {1}{\rho }\frac {\partial P_{ LES} }{\partial z}+ \frac {\partial }{\partial y} \left [ (\nu +\nu _{t,0})\frac {\partial w_{ wm}}{\partial y} \right ]. \end{split} \end{equation}
Here,
$u_{ wm}$
and
$w_{ wm}$
are the velocity components in the
$x$
and
$z$
directions, respectively;
$\nu$
is the molecular viscosity and
$\nu _{t,0}$
is the eddy viscosity, modelled in the same way as in the EWM. The model solves the velocity profiles between the LES/WM matching location and the wall in both wall-parallel directions,
$x$
and
$z$
, according to (3.5). The pressure gradients in both the
$x$
and
$z$
directions are obtained directly from the LES. Note that the pressure gradient and the material derivative terms require no closure, although storing the velocity profiles is necessary to compute the unsteady term. The rationale here is that history effects are largely captured by the material derivative and the pressure gradient, and no further modelling is required. In addition to the reasoning and the model formulation, the LLM also supplies an initial Python implementation, which – although not directly portable to our CFD code – provides a useful starting point. It is worth noting that some candidate modifications – for instance, enhancing A11 through the introduction of a relaxation time, A13 – could, in principle, address deficiencies such as unphysical abrupt increases in subsequent iterations. However, during the second round of dialogue, a viable formulation, A22, emerged. Consequently, the iterative process was terminated.
We proceed to the a posteriori test. Figure 4 presents the results; the computational set-up has already been summarised in Table 1. Figures 4a–4c show inner-scaled mean velocity profiles at various time instants following the application of the APG, under mild (panel a) and strong (panels b, c) APG conditions at two Reynolds numbers. As expected, the EWM performs adequately under mild APG, but deteriorates under strong APG, regardless of Reynolds number. In contrast, the model in (3.5) accurately tracks the evolution of the mean flow. Figures 4d–4f show the evolution of wall shear stress over time for the three cases shown in Figures 4a–4c. Again, while the EWM is reasonably accurate under mild APG (panel d), it fails to capture the rapid decline in wall shear stress observed in the strong APG cases (panels e, f). By comparison, the DeepSeek model in (3.5) consistently produces more accurate wall shear stress predictions across all three cases. Table 2 further quantifies the time to incipient separation for each case. It is evident that the new model substantially reduces the prediction error compared with the baseline EWM.
Table 2. Time to incipient separation (normalised by
$h/U_{c,0}$
). Relative errors compared to DNS are also listed.


Figure 4. (a–c) Inner-scaled mean velocity profiles following the imposition of an APG: (a) R5APG1; (b) R5APG100; (c) R10APG100. Time
$T$
is normalised by
$h/U_{c,0}$
, where
$U_{c,0}$
is the channel centreline velocity at
$t=0$
. (d–f) Evolution of the wall shear stress: (d) R5APG1; (e) R5APG100; (f) R10APG100. The dashed line corresponds to the results for the EWM. DNS reference data are shown in colour and predictions from the model in (3.5) are labelled ‘LLM-A’.
3.2. System rotation
System rotation profoundly modifies the dynamics of wall-bounded turbulent flows and arises in a wide range of applications, including turbomachinery, atmospheric flows and rotating devices. A special case that has received much attention is when the flow is subjected to spanwise system rotation which rotates along the
$z$
axis with the Coriolis force parallel to
$x{-}y$
plane, where the Coriolis force induces a wall-normal pressure gradient that leads to the formation of a ‘pressure side’ and a ‘suction side’, with turbulence suppressed and enhanced, respectively (Johnston et al., Reference Johnston, Halleent and Lezius1972; Xia et al., Reference Xia, Shi and Chen2016). This redistribution of turbulence intensity results in substantial deviations from the classical law-of-the-wall behaviour: on the pressure side, the mean velocity exhibits a near-linear profile instead of the canonical logarithmic scaling (Yang et al., Reference Yang, Xia, Lee, Lv and Yuan2020
b; Brethouwer, Reference Brethouwer2017). Consequently, the EWM no longer applies.
In this section, we engage the LLM to address this challenge. The modelling task was presented to DeepSeek-R1 using the following prompt:
I am a fluid engineer using wall-modelled large-eddy simulation. Turbulent flow in a rotating system has long been a challenging problem. I am currently considering a simplified case: a fully developed plane channel flow between two infinitely large plates, where
$x$
,
$y$
and
$z$
denote the streamwise, wall-normal and spanwise directions, respectively. A uniform volume force is applied along the
$x$
direction and the system rotates about the
$z$
axis at an angular velocity
$\Omega$
. Are there any existing wall models that can handle this type of flow?
An overview of the interaction between the LLM and the user is illustrated in Figure 5.

Figure 5. Schematic overview of the interaction between the LLM and the user for the spanwise rotation modelling problem.
Before attempting a solution, DeepSeek-R1 undertakes an extended chain of reasoning. It first analyses the mean flow by invoking and simplifying the RANS equations, concluding that spanwise mean velocities or streamwise vortices may emerge due to Coriolis–Reynolds stress interactions. It correctly concludes that spanwise system rotation modifies turbulence anisotropically: turbulence is suppressed on the stable (suction) side and enhanced on the unstable (pressure) side. Additionally, the altered Reynolds stresses redistribute momentum, fundamentally modifying the mean flow structure. DeepSeek-R1 then analyses the limitations of existing wall models. It reasons that the Coriolis force disrupts the equilibrium state assumed in traditional EWMs. Rotation selectively suppresses or enhances turbulent fluctuations and conventional models are unable to capture such anisotropic effects. Due to this extended reasoning chain, including the discussion on modelling strategies, the conversation uses approximately 21 000 tokens.
Following this analysis, DeepSeek-R1 proposes several possible solutions. The overarching principles are similar to those observed in the APG problem. A data-driven approach (labelled B12 in Figure 5) is proposed but ultimately discarded due to the lack of high-fidelity training data. An integral formulation, B13, is also suggested but rejected because it requires information at the boundary-layer edge. What we find particularly interesting in this interaction is that the LLM identifies transport-equation-based models with rotation corrections, B14, of which we were not previously aware and which could potentially form the basis of an effective two-layer model, but is not selected due to the significant amount of coding required. The selected solution, B11, involves modifying the law of the wall. Initially, DeepSeek-R1 recommends altering the logarithmic law. Upon examining the mean flow data, DeepSeek-R1 proposes a modified law of the wall:
where
$Ro^+$
is the rotation number based on wall units. Inverting the mean flow scaling in (3.7) yields a near-wall model:
Note that, due to the use of
$u_\tau$
for non-dimensionalisation, evaluating the right-hand side requires knowledge of the wall shear stress, making (3.7) an implicit equation for
$\tau _w$
. Nonetheless, one can avoid an iterative solution by using the wall shear stress from the previous time step to evaluate the right-hand side, following the practice of Yang et al. (Reference Yang, Park and Moin2017), Yang et al. (Reference Yang, Sadique, Mittal and Meneveau2015), among others.
We test the model in (3.7) within the WMLES framework for plane channel flows subjected to various levels of spanwise rotation. Figure 6 presents the mean velocity profiles for four representative cases. For comparison, we include results from the EWM and reference DNS data. The model in (3.7) consistently outperforms the baseline EWM and accurately captures the mean flow behaviour in spanwise rotating channels. Additional results are provided in Table 3, listing the predicted bulk velocities normalised by
$u_\tau$
. Note that a pressure gradient is imposed, the friction Reynolds number is fixed and the bulk velocity emerges as a prediction of the model. Across all cases, the proposed model significantly improves bulk velocity predictions relative to the EWM over a broad range of rotation numbers.
Table 3. Bulk velocity predictions normalised by
$u_\tau$
. Relative errors compared with DNS are also listed.


Figure 6. Mean velocity profiles in spanwise rotating channels: (a) R2ROT10; (b) R2ROT120; (c) R4ROT20; (d) R4ROT32. Label ‘LLM-B’ corresponds to the model in (3.7).
3.3. Surface roughness
Surface roughness significantly impacts boundary-layer flows, altering momentum transfer and introducing substantial drag penalties across a wide range of engineering applications, including ships, aircraft, turbomachinery and atmospheric boundary layers (Barlow and Coceal, Reference Barlow and Coceal2008; Bons, Reference Bons2010; Schultz, Reference Schultz2007). Predicting the influence of roughness has been a longstanding challenge due to the diversity of roughness topographies and their complex interactions with near-wall turbulence (Chung et al., Reference Chung, Hutchins, Schultz and Flack2021). The objective of rough-wall modelling is typically to predict quantities such as the equivalent sand-grain roughness height
$ k_s$
or the roughness function
$ \Delta U^+$
, and significant advances have been made since the seminal work of Nikuradse (Nikuradse, Reference Nikuradse1950; Yang et al., Reference Yang, Zhang, Yuan and Kunz2023; Colebrook, Reference Colebrook1939; Moody, Reference Moody1944; Flack and Schultz, Reference Flack and Schultz2014). However, incorporating these insights directly into wall modelling for LES remains limited (Bose and Park, Reference Bose and Park2018). Here, we engage DeepSeek-R1 to explore new modelling strategies for rough-wall effects in the context of WMLES. We initiated the discussion with DeepSeek-R1 by posing a general question:
I am a fluid engineer and I use wall-modelled large-eddy simulation. Modelling surface roughness has always been a difficult problem. What are the main approaches to rough-wall modelling? How can we accommodate surface roughness in wall models?

Figure 7. Schematic overview of the interaction between the LLM and the user for the roughness modelling problem.
DeepSeek-R1 has an awareness of the extensive literature on rough-wall turbulence and highlights several key points. First, it notes that the effect of surface roughness is often parametrised by a roughness function, which manifests as a downward shift of the log-law velocity profile relative to smooth-wall flows. Second, it recognises that explicit geometric resolution of roughness in CFD simulations can accurately capture roughness effects, but doing that can be computationally costly. Third, it identifies machine-learning approaches that relate near-wall flow information to wall shear stress.
The full conversation involves approximately 24 000 tokens and the portion relevant to modelling is summarised schematically in Figure 7. In this dialogue, DeepSeek-R1 identifies three key variables for rough-wall modelling: the equivalent sandgrain roughness
$ k_s$
(labelled C11), wall shear stress
$ \tau _w$
(labelled C12), and parameterizations of the roughness geometry (labelled C13). When prompted on how to determine these quantities, the LLM recommends a data-driven approach, C14. The training data include drag measurements, roughness geometric statistics and equivalent sandgrain roughness, which are available from public roughness datasets (Yang et al., Reference Yang, Zhang, Yuan and Kunz2023). Upon confirming the feasibility of this data-driven approach, DeepSeek-R1 provides detailed guidance on constructing a neural network, selecting input and output features, and performing the training. The resulting wall model takes the form:
where the parametrisation of roughness geometry involves single-point statistics such as the peak-to-trough height, second, third and fourth-order moments of the roughness height, as well as combinations of these variables. We note that comparing the model in (3.8) to the EWM is inappropriate, as the EWM assumes a smooth flat plate. As a baseline for comparison, we adopt the rough-wall model proposed by Forooghi et al. (Reference Forooghi, Stroh, Magagnato, Jakirlić and Frohnapfel2017), which is otherwise identical to (3.8) except that
$ k_s$
is given by
where
$ k_{ rms}$
is the root-mean-square of the roughness height and SK denotes the skewness.
We evaluate the performance of the proposed model within the WMLES framework. Here, the surface roughness is not explicitly resolved; instead, its effects on the flow are entirely modelled. Figure 8 presents comparisons of mean velocity profiles for several rough-wall cases. Results are compared against experimental reference data from Flack et al. (Reference Flack, Schultz, Barros and Kim2016), Flack and Schultz (Reference Flack and Schultz2023), Medjnoun et al. (Reference Medjnoun, Rodriguez-Lopez, Ferreira, Griffiths, Meyers and Ganapathisubramani2021), Womack et al. (Reference Womack, Volino, Meneveau and Schultz2022), as well as against the baseline model. The roughness morphologies considered range from random Gaussian surfaces to regular truncated cones. We observe that the model proposed by DeepSeek-R1 consistently yields more accurate mean velocity predictions, particularly in the logarithmic layer. More quantitative comparisons are provided in Table 4, which lists the predicted roughness function
$ \Delta U^+$
for all rough surfaces considered. The corresponding roughness morphologies are already detailed in Table 1. Across all cases, the present model outperforms the reference model of Forooghi et al. (Reference Forooghi, Stroh, Magagnato, Jakirlić and Frohnapfel2017), highlighting the effectiveness of the LLM-guided modelling approach.
Table 4. Predicted roughness function
$\Delta U^{+}$
and the roughness functions measured from experiments. Relative errors compared with experimental measurements are also listed.


Figure 8. Predicted mean velocity profiles for rough-wall flows: (a) R60RW1; (b) R60RW3; (c) R25RW7; (d) R50RW8. ‘Ref’ corresponds to the roughness model of Forooghi et al. (Reference Forooghi, Stroh, Magagnato, Jakirlić and Frohnapfel2017). ‘LLM-C’ corresponds to the model in (3.8). The dashed black line indicates the LoW. The shaded regions represents the uncertainty in the training data.
4. Concluding remarks
In this study, we explored the utility of LLMs for turbulence modelling by engaging DeepSeek-R1 in a closed-loop, iterative framework. Within this framework, DeepSeek-R1 engages a human engineer, and proposes and refines wall models to address three challenges in near-wall turbulence modelling: APGs, system rotation and surface roughness. Our results demonstrate that the LLM-driven models not only rival, but in many cases outperform, baseline wall models.
A key distinction between the present work and existing data-driven turbulence modelling efforts lies in the role played by the AI. Conventional data-driven turbulence models are in fact human-driven: researchers design network architectures, curate training data, and employ machine learning to optimise weights and biases. The resulting models often function as black boxes, offering limited interpretability or physical reasoning. In contrast, the models developed in this study were truly AI-driven. DeepSeek-R1 autonomously provided physical reasoning, logical model structures and complete modelling strategies, resulting in models that are transparent, interpretable and grounded in physical arguments. It should be emphasised that the a priori knowledge introduced does not predefine the model itself, but rather establishes the constraints of the problem – defining relevant variables, enforcing dimensional consistency and ensuring adherence to fundamental physical laws. Within this constrained framework, the LLM acts as a hypothesis-generation engine, enabling the systematic exploration of physically plausible model forms.
Equally important is the paradigm shift in how the AI is treated during the process. In most prior applications, AI tools are treated as subordinate assistants, tasked with executing narrowly defined objectives with minimal feedback loops from a human. Here, we treat the LLM as an equal partner, engaging in multiple rounds of iterative dialogue analogous to collaboration among human researchers. Recent studies in human–AI co-creation suggest that enabling reciprocal communication significantly enhances collaboration quality and creative outcomes (Rezwana & Maher, Reference Rezwana and Maher2022). Our experience supports this view: through sustained interaction, DeepSeek-R1 is able to correct its own errors, refine incomplete formulations and meaningfully contribute to model development without requiring explicit correction at each step.
The results also highlight a limitation of conventional benchmark evaluations of LLMs, which typically assess models based on a single-response correctness (Pröhl et al., Reference Pröhl, Putzier and Zarnekow2024; Jiang et al., Reference Jiang, Gao and Karniadakis2025). As shown here, when allowed to participate in multi-turn, collaborative interactions, LLMs demonstrate capabilities for adaptive reasoning and creative problem-solving that more closely resemble scientific inquiry than simple information retrieval.
Despite these promising results, several limitations of the present investigation must be acknowledged. First, in real engineering applications, these three effects – pressure gradient, rotation and surface roughness – typically occur simultaneously. Current models fail in this regard. Second, the field of LLM development is advancing rapidly. Since the release of DeepSeek-R1, several newer models have demonstrated even stronger reasoning capabilities, making the present work an initial but already dated baseline. Second, while DeepSeek-R1 proposed models with strong performance and clear logic, it did not invent entirely novel modelling paradigms. The strategies it identified – including data-driven approaches, integral methods and modified laws of the wall – have precedents in the turbulence modelling literature. Nevertheless, it is important to recognise that individual researchers are typically familiar with only a subset of these methods. The ability of the LLM to synthesise diverse approaches and reason across multiple frameworks represents a significant augmentation of human expertise.
In all, our experiences point to a broader opportunity in applied fluid mechanics: LLMs can serve as powerful collaborators in engineering model developments. They offer a platform for brainstorming ideas, synthesising knowledge beyond the reach of any individual researcher and refining concepts through interactive reasoning.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/flo.2025.10032.
Data availability
Raw data are available from the author upon reasonable request.
Funding
Zhongxin Yang and Yipeng Shi acknowledge NSFC11988102 for financial support.
Competing interests
The authors declare no conflict of interest.

























