
Physics-constrained convolutional neural networks for inverse problems in spatiotemporal partial differential equations

Published online by Cambridge University Press:  20 December 2024

Daniel Kelshaw
Affiliation:
Department of Aeronautics, Imperial College London, London, UK
Luca Magri*
Affiliation:
Department of Aeronautics, Imperial College London, London, UK The Alan Turing Institute, The British Library, London, UK DIMEAS, Politecnico di Torino, Torino, Italy.
*
Corresponding author: Luca Magri; Email: l.magri@imperial.ac.uk

Abstract

We propose a physics-constrained convolutional neural network (PC-CNN) to solve two types of inverse problems in partial differential equations (PDEs), which are nonlinear and vary both in space and time. In the first inverse problem, we are given data that are offset by a spatially varying systematic error (i.e., the bias, also known as the epistemic uncertainty). The task is to uncover the true state, which is the solution of the PDE, from the biased data. In the second inverse problem, we are given sparse information on the solution of a PDE. The task is to reconstruct the solution in space with high resolution. First, we present the PC-CNN, which constrains the PDE with a time-windowing scheme to handle sequential data. Second, we analyze the performance of the PC-CNN in uncovering solutions from biased data. We analyze both linear and nonlinear convection-diffusion equations, and the Navier–Stokes equations, which govern the spatiotemporally chaotic dynamics of turbulent flows. We find that the PC-CNN correctly recovers the true solution for a variety of biases, which are parameterized as non-convex functions. Third, we analyze the performance of the PC-CNN for reconstructing solutions from sparse information for the turbulent flow. We reconstruct the spatiotemporally chaotic solution on a high-resolution grid from only 1% of the information contained in it. For both tasks, we further analyze the Navier–Stokes solutions. We find that the inferred solutions have a physical spectral energy content, whereas traditional methods, such as interpolation, do not. This work opens opportunities for solving inverse problems with partial differential equations.

Information

Type
Research Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. Inverse problems investigated in this paper. (a) Uncovering solutions from biased data. The model $ {\boldsymbol{\eta}}_{\boldsymbol{\theta}} $ is responsible for recovering the solution (true state), $ \boldsymbol{u}\left(\varOmega, t\right) $, from the biased data, $ \boldsymbol{\zeta} \left(\varOmega, t\right) $. The bias (systematic error), $ \boldsymbol{\phi} \left(\boldsymbol{x}\right) $, is the difference between the biased data and the solution. (b) Reconstructing a solution from sparse information. The model $ {\boldsymbol{f}}_{\boldsymbol{\theta}} $ is responsible for mapping the sparse field $ \boldsymbol{u}\left({\varOmega}_L,t\right) $ to the high-resolution field $ \boldsymbol{u}\left({\varOmega}_H,t\right) $. The term $ \tau $ in both cases denotes the number of contiguous time-steps passed to the network, required for computing temporal derivatives. An explanation of the proposed physics-constrained convolutional neural network (PC-CNN), which is the ansatz for both mappings $ {\boldsymbol{\eta}}_{\theta } $ and $ {\boldsymbol{f}}_{\theta } $, is provided in Section 4.


Figure 2. Time-windowing scheme. Time-steps are first grouped into non-overlapping subsets of successive elements of length $ \tau $. Each of these subsets can be taken for either training or validation. Subsets are treated as minibatches and passed through the network to evaluate their output. The temporal derivative is then approximated using a forward-Euler approximation across adjacent time steps.
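As an illustrative sketch of this scheme, the grouping of a trajectory into non-overlapping windows of length $ \tau $ and the forward-Euler approximation of the temporal derivative can be written as follows (the function names and the `(T, C, H, W)` tensor layout are our assumptions, not the authors' code):

```python
import torch


def make_windows(fields: torch.Tensor, tau: int) -> torch.Tensor:
    """Group a trajectory of shape (T, C, H, W) into non-overlapping
    windows of tau successive time-steps: (T // tau, tau, C, H, W)."""
    n_windows = fields.shape[0] // tau
    return fields[: n_windows * tau].reshape(n_windows, tau, *fields.shape[1:])


def forward_euler_dudt(window: torch.Tensor, dt: float) -> torch.Tensor:
    """Approximate du/dt with a forward-Euler difference across adjacent
    time-steps in a window of shape (tau, C, H, W);
    returns a tensor of shape (tau - 1, C, H, W)."""
    return (window[1:] - window[:-1]) / dt
```

Each window can then be treated as a minibatch, and the derivative is available for the physics loss without storing the full trajectory.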


Figure 3. Uncovering solutions from biased data. (a) Linear convection-diffusion case with $ {k}_{\phi }=3 $ and $ \mathcal{M}=0.5 $. (b) Nonlinear convection-diffusion case with $ {k}_{\phi }=5 $ and $ \mathcal{M}=0.5 $. (c) Two-dimensional turbulent flow case with $ {k}_{\phi }=7 $ and $ \mathcal{M}=0.5 $. Panel (i) shows the biased data, $ \boldsymbol{\zeta} $; (ii) shows the true state, $ \boldsymbol{u} $, which we wish to uncover from the biased data; (iii) shows the bias (i.e., systematic error), $ \boldsymbol{\phi} $; (iv) shows the network predictions, $ {\boldsymbol{\eta}}_{\theta } $; and (v) shows the predicted bias, $ \boldsymbol{\zeta} -{\boldsymbol{\eta}}_{\theta } $.


Figure 4. Uncovering solutions from biased data: robustness analysis through the relative error, $ e $. (a) Linear convection-diffusion case. (b) Nonlinear convection-diffusion case. (c) Two-dimensional turbulent flow case. Orange bars denote results for case (i): fixing the magnitude and varying the Rastrigin wavenumber. Blue bars denote results for case (ii): fixing the Rastrigin wavenumber and varying the magnitude.


Figure 5. Uncovering solutions from biased data. Temporal evolution of the two-dimensional turbulent flow with $ {k}_{\phi }=7 $ and $ \mathcal{M}=0.5 $. $ {T}_t $ denotes the length of the transient. (i) Biased data, $ \boldsymbol{\zeta} $; (ii) true state, $ \boldsymbol{u} $; (iii) predicted solution, $ {\boldsymbol{\eta}}_{\theta } $; and (iv) squared error, $ {\left\Vert {\boldsymbol{\eta}}_{\theta }-\boldsymbol{u}\right\Vert}^2 $.


Figure 6. Uncovering solutions from biased data: analysis of the solutions in the turbulent flow with $ {k}_{\phi }=7 $ and $ \mathcal{M}=0.5 $. (a) Kinetic energy for the two-dimensional turbulent flow. (b) Energy spectrum for the two-dimensional turbulent flow. $ {T}_t $ denotes the length of the transient.


Figure 7. Reconstruction from sparse information: physics-constrained convolutional neural network (PC-CNN) compared with traditional interpolation methods to reconstruct a solution from a sparse grid $ {\varOmega}_L\in {\mathrm{\mathbb{R}}}^{10\times 10} $ (100 points) to a high-resolution grid $ {\varOmega}_H\in {\mathrm{\mathbb{R}}}^{70\times 70} $ (4900 points). Panel (i) shows the low-resolution input, $ \boldsymbol{u}\left({\varOmega}_L,t\right) $; (ii) bi-linear interpolation, $ BL\left(\boldsymbol{u}\left({\varOmega}_L,t\right)\right) $; (iii) bi-cubic interpolation, $ BC\left(\boldsymbol{u}\left({\varOmega}_L,t\right)\right) $; (iv) true high-resolution field, $ \boldsymbol{u}\left({\varOmega}_H,t\right) $; (v) model prediction of the high-resolution field, $ {\boldsymbol{f}}_{\boldsymbol{\theta}}\left(\boldsymbol{u}\left({\varOmega}_L,t\right)\right) $; and (vi) energy spectra for each of the predictions. Vertical lines $ {f}^n $ denote the Nyquist frequencies for spectral grid, $ {f}_{{\hat{\varOmega}}_{\boldsymbol{k}}}^n $, and for the low-resolution grid, $ {f}_{\varOmega_{\boldsymbol{L}}}^n $.
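The energy spectra in panel (vi) can be obtained by binning the Fourier energy of the velocity field over integer wavenumber shells. A minimal sketch on a periodic $ N\times N $ grid follows; the normalization and shell-binning details are our assumptions, not the authors' implementation:

```python
import numpy as np


def energy_spectrum(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Isotropic kinetic-energy spectrum E(k) of a 2-D velocity field on an
    N x N periodic grid, by summing 0.5 * (|u_hat|^2 + |v_hat|^2) over
    integer wavenumber shells |k| = 0, 1, 2, ..."""
    n = u.shape[0]
    u_hat = np.fft.fft2(u) / n**2  # normalized Fourier coefficients
    v_hat = np.fft.fft2(v) / n**2
    energy = 0.5 * (np.abs(u_hat) ** 2 + np.abs(v_hat) ** 2)
    k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers 0..n/2-1, -n/2..-1
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k_mag = np.rint(np.sqrt(kx**2 + ky**2)).astype(int)
    spectrum = np.zeros(k_mag.max() + 1)
    np.add.at(spectrum, k_mag, energy)  # accumulate energy per shell
    return spectrum
```

For a single-mode field such as $ u=\cos (3x) $, the spectrum concentrates its energy in the shell $ k=3 $, which is a useful sanity check before comparing predictions against interpolation baselines.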


Figure 8. Uncovering solutions from biased data. The model $ {\boldsymbol{\eta}}_{\boldsymbol{\theta}} $ is responsible for mapping the biased state $ \boldsymbol{\zeta} \left(\varOmega, t\right) $ to the true solution $ \boldsymbol{u}\left(\varOmega, t\right) $. Convolutional layers are parameterized as torch.nn.Conv2d(c_in, c_out, k), where c_in and c_out denote the number of input and output channels, respectively, and k is the spatial extent of the filter. The term $ h $ represents the activation, tanh in this case. The terms $ {\unicode{x1D543}}_D,{\unicode{x1D543}}_P,{\unicode{x1D543}}_C $ denote the data loss, physics loss, and constraint loss, respectively. The combination of these losses forms the objective loss $ {\mathcal{L}}_{\theta } $, which is used to update the network’s parameters. The term $ \tau $ denotes the number of contiguous time-steps passed to the network, required for computing temporal derivatives. We provide an in-depth explanation of these losses in Section 4.1.
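The objective loss described in the caption can be sketched as a weighted sum of the three terms. In the sketch below, the weighting coefficients `lambda_p` and `lambda_c` and the mean-squared form of each term are our assumptions, not the paper's exact formulation:

```python
import torch


def objective_loss(
    pred: torch.Tensor,
    data: torch.Tensor,
    pde_residual: torch.Tensor,
    constraint_term: torch.Tensor,
    lambda_p: float = 1.0,
    lambda_c: float = 1.0,
) -> torch.Tensor:
    """Composite objective L_theta: a data loss against the observations,
    a physics loss penalizing the PDE residual of the prediction, and a
    constraint loss, combined with hypothetical weights lambda_p, lambda_c."""
    loss_d = torch.mean((pred - data) ** 2)       # data loss
    loss_p = torch.mean(pde_residual ** 2)        # physics (PDE residual) loss
    loss_c = torch.mean(constraint_term ** 2)     # constraint loss
    return loss_d + lambda_p * loss_p + lambda_c * loss_c
```

Because every term is differentiable, the combined loss can be passed directly to a standard optimizer to update $ \boldsymbol{\theta} $.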


Figure 9. Reconstruction from sparse information. The model $ {\boldsymbol{f}}_{\boldsymbol{\theta}} $ is responsible for mapping the low-resolution field $ \boldsymbol{u}\left({\varOmega}_L,t\right) $ to the high-resolution field $ \boldsymbol{u}\left({\varOmega}_H,t\right) $. The upsampling layer (blue) performs bi-cubic upsampling to obtain the correct spatial dimensions. Convolutional layers are parameterized as torch.nn.Conv2d(c_in, c_out, k), where c_in and c_out denote the number of input and output channels, respectively, and k is the spatial extent of the filter. The term $ h $ represents the activation, tanh in this case. The terms $ {\unicode{x1D543}}_D,{\unicode{x1D543}}_P $ denote the data loss and physics loss, respectively, the combination of which forms the objective loss $ {\mathcal{L}}_{\theta } $, which is used to update the network’s parameters, $ \boldsymbol{\theta} $. The term $ \tau $ denotes the number of contiguous time-steps passed to the network, required for computing temporal derivatives. We provide an in-depth explanation of these losses in Section 4.1.
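A hedged PyTorch sketch of such an architecture follows: bi-cubic upsampling to the high-resolution grid, then convolutional layers with tanh activations. The channel widths, kernel sizes, and number of layers are illustrative placeholders rather than the values used in the paper:

```python
import torch
import torch.nn as nn


class SuperResolutionCNN(nn.Module):
    """Sketch of the reconstruction network f_theta: bi-cubic upsampling to
    the target spatial resolution, followed by convolutions with tanh
    activations. Layer sizes are illustrative, not the paper's values."""

    def __init__(self, channels: int = 2, scale: int = 7):
        super().__init__()
        self.upsample = nn.Upsample(scale_factor=scale, mode="bicubic")
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1), nn.Tanh(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.Tanh(),
            nn.Conv2d(16, channels, kernel_size=3, padding=1),
        )

    def forward(self, u_low: torch.Tensor) -> torch.Tensor:
        # (B, C, 10, 10) -> (B, C, 70, 70) for scale = 7
        return self.net(self.upsample(u_low))
```

With `scale=7`, a $ 10\times 10 $ input maps to the $ 70\times 70 $ grid of Figure 7; the convolutions then refine the bi-cubic estimate rather than learn the full upsampling from scratch.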


Figure 10. Removal of a stationary bias field containing multiple wavenumbers. The first three columns show the biased, original, and predicted flow fields, respectively. The rightmost column depicts the predicted additive bias. Results are shown for multiple time-steps along the simulation trajectory.


Figure 11. Reconstruction from sparse information. Performance for the reconstruction task is shown for a range of scale factors $ \kappa $, measured with the relative $ {\mathrm{\ell}}^2 $-error, $ e $, between the predicted high-resolution fields $ {\boldsymbol{f}}_{\boldsymbol{\theta}}\left(\boldsymbol{u}\left({\varOmega}_{\mathbf{L}},t\right)\right) $ and the true high-resolution fields $ \boldsymbol{u}\left({\varOmega}_{\boldsymbol{H}},t\right) $.
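The relative $ {\ell}^2 $-error used here is $ e=\Vert \hat{\boldsymbol{u}}-\boldsymbol{u}{\Vert}_2/\Vert \boldsymbol{u}{\Vert}_2 $; a minimal implementation (the function name is ours):

```python
import torch


def relative_l2_error(pred: torch.Tensor, true: torch.Tensor) -> float:
    """Relative l2-error e = ||pred - true||_2 / ||true||_2,
    computed over all elements of the field."""
    return (torch.norm(pred - true) / torch.norm(true)).item()
```

An error of $ e=0 $ indicates a perfect reconstruction, while $ e=1 $ is the error obtained by predicting the zero field.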
