
On the reproducibility of fully convolutional neural networks for modeling time–space-evolving physical systems

Published online by Cambridge University Press:  28 April 2022

Wagner G. Pinto
Affiliation:
Aerodynamics, Energetics and Propulsion Department, ISAE-SUPAERO, Université de Toulouse, 10 Avenue Edouard Belin, 31055 Toulouse, France
Antonio Alguacil
Affiliation:
Aerodynamics, Energetics and Propulsion Department, ISAE-SUPAERO, Université de Toulouse, 10 Avenue Edouard Belin, 31055 Toulouse, France Mechanical Engineering Department, Université de Sherbrooke, 2500 Boulevard de l’Université, Sherbrooke QC J1K 2R1, Canada
Michaël Bauerheim*
Affiliation:
Aerodynamics, Energetics and Propulsion Department, ISAE-SUPAERO, Université de Toulouse, 10 Avenue Edouard Belin, 31055 Toulouse, France
*Corresponding author. E-mail: michael.bauerheim@isae-supaero.fr

Abstract

Reproducibility of a deep-learning fully convolutional neural network is evaluated by training the same network several times under identical conditions (database, hyperparameters, and hardware) with nondeterministic graphics processing unit operations. The network is trained to model three typical time–space-evolving physical systems in two dimensions: the heat, Burgers’, and wave equations. The behavior of the networks is evaluated on both recursive and nonrecursive tasks. Significant changes in the models’ properties (weights and feature fields) are observed. When tested on various benchmarks, these models systematically return estimates with a high level of deviation, especially for the recurrent analysis, which strongly amplifies the variability due to nondeterminism. Trainings performed with double floating-point precision provide slightly better estimates and a significant reduction in the variability of both the network parameters and the testing error range.
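The amplification of run-to-run variability by recursive inference can be illustrated with a minimal sketch: two "models" of the same linear recurrence whose learned coefficients differ only at single-precision rounding level, mimicking the spread introduced by nondeterministic GPU reductions. This toy recurrence is an illustrative assumption, not the network or equations of the article.

```python
import numpy as np

# Two "models" of the same recurrence whose coefficients differ only at
# single-precision rounding level (~1e-7), mimicking run-to-run spread
# caused by nondeterministic GPU operations.
a32 = np.float32(1.0 + 1e-7)   # run A coefficient (rounds to 1 + eps32)
b32 = np.float32(1.0)          # run B coefficient

xa = xb = np.float32(1.0)
deviation = []
for step in range(100):        # recursive (autoregressive) inference
    xa = a32 * xa
    xb = b32 * xb
    deviation.append(abs(float(xa) - float(xb)))

# The gap between the two runs grows with the number of recurrences,
# even though the initial difference is at machine-precision level.
print(deviation[0], deviation[-1])
```

Each recurrence feeds the previous output back as input, so an initial perturbation of order machine epsilon compounds roughly linearly here (and can compound faster for nonlinear systems), which is consistent with the recurrent tests amplifying the nondeterminism-induced variability.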

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open materials
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. Illustration of the variability induced solely by training nondeterminism, showing results from neural networks trained under identical conditions for the modeling of the heat equation (single precision). “Best” and “worst” refer to the runs with the smallest and highest average loss on the test dataset (see Figure 14). Contour lines are stepped by 1/20 of the color bar limits, the null isocontour is omitted, and the amplitude $ \varepsilon $ and the simulation details are given in Section 2.1.1.


Figure 2. Simplified diagram of the multiscale neural network. Arrows between boxes indicate a two-dimensional convolution operation, and the boxes’ height and breadth are proportional to the frame dimension and the number of layers, respectively; interpolation operations are represented by dashed lines for downsampling and continuous lines for upsampling.


Table 1. Network architecture.


Figure 3. Sample datapoints (five frames each) for the three equations (more details in Section 2.1). The first four frames are used as input, and the last is the target. Contour lines are stepped by 1/20 of the color bar limits, and the null isocontour is omitted.


Figure 4. Fields for the benchmark cases at the start of the simulation. Refer to Figure 3 for colormap and contour properties.


Figure 5. Evolution of the normalized average root-mean-square error with the number of models for the wave equation benchmarks, for trainings in single (top) and double (bottom) precision. The reference is the average error at the given number of recurrences over the totality of trained models, indicated by the continuous horizontal line; the dashed lines delineate a range of $ \pm $10% of that value.


Figure 6. Evolution of the absolute (top) and relative (ratio to the first run, bottom) training loss for the different physical systems. For visualization purposes, only 1 in every 10 values is shown for the relative loss.


Figure 7. Two-dimensional visualization of the loss surface for the first run at different precisions ($ \mathrm{FP}32 $ and $ \mathrm{FP}64 $), wave and heat equations. The optimizer trajectory is superposed as a blue line with dots, indicating the training checkpoints. Isocontours are in log scale, from $ {10}^{-4} $ to $ {10}^{-1} $.


Table 2. Statistics of the final total losses considering all runs in single ($ \mathrm{FP}32 $) and double ($ \mathrm{FP}64 $) precisions for the best model obtained on each training.


Figure 8. Weights, standard deviation, and deviation criterion for the most contrasting convolution kernel, for the sets of heat (left) and Burgers’ (right) equation models trained with single (top) and double (bottom) precision.


Figure 9. Probability density of the deviation of the convolution weights for the single and double precision runs.
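A per-weight deviation across independently trained runs can be sketched as follows. The exact deviation criterion used in the article is not specified on this page, so the normalization below (per-weight standard deviation divided by the RMS of the mean kernel) is an assumption for illustration only.

```python
import numpy as np

def weight_deviation(runs):
    """Per-weight deviation across independently trained runs.

    `runs` is a list of identically shaped weight arrays (one per run).
    Here the deviation is the per-weight standard deviation normalized by
    the RMS of the mean weights -- an assumed criterion, not necessarily
    the one used in the article.
    """
    stack = np.stack(runs)                              # (n_runs, *shape)
    std = stack.std(axis=0)                             # run-to-run spread
    scale = np.sqrt(np.mean(stack.mean(axis=0) ** 2))   # RMS of mean kernel
    return std / scale

# Hypothetical example: a shared 3x3 kernel perturbed per run.
rng = np.random.default_rng(0)
base = rng.normal(size=(3, 3))
runs = [base + 1e-3 * rng.normal(size=(3, 3)) for _ in range(8)]
dev = weight_deviation(runs)
print(dev.shape, float(dev.mean()))
```

Collecting these per-weight deviations over all convolution kernels and all runs yields a probability density of the kind shown in Figure 9, which can then be compared between the single- and double-precision sets of models.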


Figure 10. Pixelwise deviation of the feature fields (last field of each scale) for a model input corresponding to the starting simulation of the Gaussian pulse benchmark, for runs with single and double precisions. Note that the magnitudes do not correspond to any physical quantity, and colormap ranges are different for each field, so the shapes of the features are visible.


Figure 11. Pixelwise deviation of the feature fields (last field of each scale) for a model input corresponding to the 99th frame of the Gaussian pulse benchmark, for runs with single and double precisions. Note that the magnitudes do not correspond to any physical quantity, and colormap ranges are different for each field, so the shapes of the features are visible.


Figure 12. Evolution of the total loss for the recurrent test of the benchmarks, for models trained with single (left) and double (right) precision; the number of curves for double precision is reduced due to the superposition of identical responses.


Figure 13. Evolution of the root-mean-square error normalized by the pulse amplitude for the recurrent test of the benchmarks, for models trained with single (dotted line) and double (solid line) precision; the central line represents the average among the runs, and the band indicates the minimum-to-maximum range.


Table 3. Minimum, maximum, and average of the max/min ratio of the recursive root-mean-square errors considering the benchmarks from 0 to 100 recurrences.
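The normalized error and its max/min spread across runs can be sketched as below. The reference fields, amplitude, and run count are hypothetical placeholders; only the normalized root-mean-square error and the per-step max/min ratio follow from the quantities named in Figure 13 and Table 3.

```python
import numpy as np

def rmse(pred, ref, amplitude):
    """Root-mean-square error normalized by the pulse amplitude."""
    return np.sqrt(np.mean((pred - ref) ** 2)) / amplitude

# Hypothetical predictions from several runs at one recurrence step:
# each run deviates from the reference field by a different amount.
rng = np.random.default_rng(1)
ref = rng.normal(size=(64, 64))
preds = [ref + 0.01 * (1 + i) * rng.normal(size=(64, 64)) for i in range(5)]

errors = np.array([rmse(p, ref, amplitude=1.0) for p in preds])
spread = errors.max() / errors.min()   # the max/min ratio reported per step
print(errors.min(), errors.max(), spread)
```

Evaluating this ratio at every recurrence from 0 to 100 and taking its minimum, maximum, and average over the benchmarks yields statistics of the kind summarized in Table 3.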


Figure 14. Box plots of the total loss for the different models obtained with single precision (top) and double precision (bottom), considering the random pulse databases at multiple numbers of recurrences. The central lines indicate the median value, the box limits represent the first and third quartiles, the whiskers represent the median $ \pm $1.5 times the interquartile range, and continuous lines connect the average for each run.


Figure 15. Average recurrent test error versus validation loss for all simulations in the random pulse databases at multiple numbers of recurrences (0, 10, and 50); filled markers denote single precision and empty markers double precision; error bars indicate the minimum-to-maximum range.


Figure 16. Evolution of the absolute disparity of the root-mean-square error for the recurrent test of the benchmarks, comparing inference with matching and flipped precisions (Run 1): trained in single precision with inference in double (FP32 as FP64, solid line), and trained in double precision with inference in single (FP64 as FP32, dotted line), for the Burgers’ equation (left) and the wave equation (right).
