Convolutional-network models to predict wall-bounded turbulence from wall quantities

Abstract Two models based on convolutional neural networks are trained to predict the two-dimensional instantaneous velocity-fluctuation fields at different wall-normal locations in a turbulent open-channel flow, using the wall-shear-stress components and the wall pressure as inputs. The first model is a fully convolutional neural network (FCN) which directly predicts the fluctuations, while the second one reconstructs the flow fields using a linear combination of orthonormal basis functions, obtained through proper orthogonal decomposition (POD), and is hence named FCN-POD. Both models are trained using data from direct numerical simulations at friction Reynolds numbers $Re_{\tau } = 180$ and 550. Being able to predict the nonlinear interactions in the flow, both models show better predictions than the extended proper orthogonal decomposition (EPOD), which establishes a linear relation between the input and output fields. The performance of the models is compared based on predictions of the instantaneous fluctuation fields, turbulence statistics and power-spectral densities. FCN exhibits the best predictions closer to the wall, whereas FCN-POD provides better predictions at larger wall-normal distances. We also assessed the feasibility of transfer learning for the FCN model, using the model parameters learned from the $Re_{\tau }=180$ dataset to initialize those of the model that is trained on the $Re_{\tau }=550$ dataset. After training the initialized model at the new $Re_{\tau }$, our results indicate the possibility of matching the reference-model performance up to $y^{+}=50$, with $50\,\%$ and $25\,\%$ of the original training data. We expect that these non-intrusive sensing models will play an important role in applications related to closed-loop control of wall-bounded turbulence.


Introduction
The advent of new powerful deep neural networks (DNNs, see LeCun et al. 2015) has fostered their application in many research areas (Jean et al. 2016; De Fauw et al. 2018; Norouzzadeh et al. 2018; Ham et al. 2019; Udrescu & Tegmark 2020; Vinuesa et al. 2020). Due to their potential applications in flow modelling, identification of turbulence features and flow control, DNNs have recently received extensive attention in the fluid-mechanics research community (Kutz 2017; Jiménez 2018; Duraisamy et al. 2019; Brunton et al. 2020). In the case of turbulence modelling, DNNs have been reported to improve the results of Reynolds-averaged Navier-Stokes (RANS, Ling et al. 2016; Wu et al. 2018) models and large-eddy simulations (LES, Maulik et al. 2019; Lapeyre et al. 2019; Beck et al. 2019). There are also a number of ongoing efforts towards including the constraints from the Navier-Stokes equations into prediction models through the so-called physics-informed neural networks (Wang et al. 2017; Raissi et al. 2019). Furthermore, several artificial-intelligence-based solutions have been proposed to perform optimal control on different types of flows, such as the wake behind one or more cylinders (Rabault et al. 2019; Raibaudo et al. 2020). Other promising applications of machine learning to fluid mechanics include generation of inflow conditions (Fukami et al. 2019b) and extraction of flow patterns (Raissi et al. 2020).
DNNs have also found application in temporal prediction of dynamical systems. As an example, Srinivasan et al. (2019) compared the capabilities of the multi-layer perceptron (MLP, also known as fully-connected-layer neural network) and several long-short-term memory (LSTM) networks to predict the coefficients of a low-order model for near-wall turbulence (Moehlis et al. 2004). While the most relevant flow features are captured by both architectures, the LSTM network outperformed the MLP in terms of ability to predict turbulence statistics and the dynamics of the flow. This work has been extended by Eivazi et al. (2020), where the LSTM network has been compared with a Koopman-based framework which accounts for non-linearities through external forcing. Although both approaches provide accurate predictions of the dynamical evolution of the system, the latter outperforms the LSTM in terms of time and data required for training. Similar temporal predictions of the near-wall model (Moehlis et al. 2004) were conducted by Pandey et al. (2020) using echo state networks (ESN). Moreover, nonlinear autoregressive exogenous networks (NARXs) have been used by Lozano-Durán et al. (2020) to exploit the relation between the temporal dynamics of the Fourier coefficients of a minimal turbulent channel flow. Their results showed accurate predictions of the bursting events in the logarithmic layer from buffer-region data. Other related work, in the context of control of the Kuramoto-Sivashinsky (KS) chaotic system, was recently conducted by Bucci et al. (2019). Note, however, that the use of temporal sequences implies a high computational cost to generate well-resolved temporal data. Furthermore, longer sequences require more memory in order to predict the future behaviour. For these reasons, several neural-network-based models that learn spatial relations have been proposed in the literature.
Convolutional neural networks (CNNs) have become increasingly popular in recent years owing to their ability to exploit the hierarchical structure of their input (Fukushima 1980, 1988; LeCun et al. 1989; Lecun et al. 1998). For instance, Fukami et al. (2019a, 2020) have shown that turbulent flow fields can be reconstructed from extremely coarse data with remarkable success. CNNs have also been used to investigate the dynamical features of the flow without a-priori knowledge, as shown by Jagodinski et al. (2020).
Neural networks are mathematical models based on data-driven training, and as such they have been compared and used together with other data-driven methods. For instance, the relationship between proper orthogonal decomposition (POD, see Lumley 1967) and the MLP is well documented in the literature (Bourlard & Kamp 1988;Baldi & Hornik 1989). These works showed that a MLP with a single hidden layer is equivalent to POD if a linear activation function is used. Milano & Koumoutsakos (2002) compared the results of POD-based neural networks with linear and non-linear functions for the prediction of near-wall velocity fields, showing that nonlinear POD has significantly better predictive capabilities. More recently, the emergence of autoencoder architectures has motivated a renewed interest in the application of neural networks for dimensionality reduction. Hinton & Salakhutdinov (2006) proposed the use of deep autoencoders to obtain a low-order representation of high-dimensional data, showing that this approach is able to retain more information than POD. It is interesting to note that this work avoids the inherent difficulty of optimizing weights in deep autoencoders by training each layer with a Restricted Boltzmann Machine. Murata et al. (2020) used an autoencoder with convolutional layers to obtain a low-order representation of the flow around a cylinder. Their results suggest that CNN autoencoders with linear activation functions reproduce the same dimensionality reduction as POD, while the use of nonlinear activation functions improves the reconstruction performance. On a related note, flow reconstruction based on shallow neural networks was studied by Erichson et al. (2020) in several fluid-mechanics examples.
In this work, we assess the potential of DNNs for non-intrusive sensing, to be used for closed-loop control applications. In this type of control, the actuation is applied with the aim of suppressing the effect of certain structures in the flow (Choi et al. 1994). In order to effectively perform closed-loop control it is necessary to monitor the instantaneous state of the flow so as to devise the best way to affect it, but this type of measurement is extremely challenging, particularly at very high Reynolds numbers where the near-wall structures become progressively smaller. On the other hand, it is more feasible to perform non-intrusive sensing, i.e. to accurately measure time-resolved quantities at the wall, such as the wall-shear stress or the pressure, and then correlate these measurements with the flow farther away. In a seminal study over 20 years ago, Lee et al. (1997) used a CNN to predict, based on the wall-shear-stress components, the wall actuation that would lead to higher drag reduction. More recently, Guastoni et al. (2019a) used the two wall-shear-stress components to predict the instantaneous streamwise flow fields at several wall-normal positions using convolutional networks. Their results show that these neural networks provide better predictions than linear methods (see below) in terms of instantaneous predictions and second-order statistics. The same wall information was used by Kim & Lee (2020) to predict the instantaneous wall-normal heat flux with satisfactory results. Moreover, in the work by Güemes et al. (2019) the information of the most-energetic scales was encoded into a POD basis, and a CNN was used to predict that information at different wall-normal locations from streamwise wall-shear-stress measurements. Their results demonstrated that CNNs can significantly outperform linear methods in the prediction of POD time coefficients for low-order reconstruction of the velocity fields.
A drawback of DNNs is the fact that they require training and test data to be taken from the same distribution, i.e. for the same flow and at the same Reynolds number in our case. However, in a real-world application the flow conditions will be continuously varying and/or it might be unfeasible to perform a full training at exactly the same conditions. It may be possible, however, to exploit initial training at a certain flow condition and transfer this knowledge to another one. Such knowledge transfer could reduce significantly the amount of data needed for training and improve the network applicability for industrial applications. Transfer learning (Pan & Yang 2009) is the suitable learning framework to address this issue, and it is discussed in detail below.
Before the appearance of DNNs, flow-field prediction was performed mainly through linear methods. Among them, the linear stochastic estimation (LSE) introduced by Adrian (1988) stands out. Recently, Suzuki & Hasegawa (2017) and Encinar & Jiménez (2019) have used LSE to reconstruct the velocity field on a wall-parallel plane in a turbulent channel flow employing wall measurements. The latter study showed that LSE can only reconstruct the large wall-attached eddies in the outer part of the logarithmic region. An extension of the LSE method in the spectral domain (Tinney et al. 2006) was shown to be more suitable for noisy predictions in turbulent flows. More recently, Baars & Tinney (2014) proposed a POD-based method for improving the spectral-LSE approach. Borée (2003) reported the possibility of projecting a synchronized field on the POD temporal modes of another quantity; this method is known as extended POD (EPOD). The correlation matrix between the temporal POD coefficients of two given quantities can be used to predict one based on the other one. The work of Borée (2003) proved EPOD to be equivalent to LSE when all modes are retained in the reconstruction. EPOD has been used to provide predictions in turbulent jets (Tinney et al. 2008), channel flows (Discetti et al. 2018), pipe flows and wall-mounted obstacles (Bourgeois et al. 2013; Hosseini et al. 2016) using remote probes. On the other hand, Sasaki et al. (2019) recently assessed the capabilities of both linear and nonlinear transfer functions with single and multiple inputs to provide turbulent-flow predictions. They documented a significant improvement in the predictions when the transfer functions were designed to account for nonlinear interactions between the inputs and the flow field. The improved prediction capabilities of nonlinear methods over linear ones were also reported by Mokhasi et al. (2009) and Goza (2020).
The methods proposed by Guastoni et al. (2019a) and Güemes et al. (2019), henceforth referred to as fully-convolutional network (FCN) and FCN-POD respectively, are extended in the present study. Both models are able to provide a nonlinear characterization of the relation between wall features and the flow on wall-parallel planes. The purpose of this work is to provide a detailed comparison of the two aforementioned nonlinear methods regarding their capabilities to predict turbulent flow fields from wall information. Their improvement over linear methods is measured using EPOD. Furthermore, transfer learning was applied to the FCN approach with the purpose of evaluating to what extent a network trained at one Reynolds number can be used at a different one.
The remainder of this article is organised as follows. Section 2.1 describes the numerical databases used for training and testing the neural networks and §2.2 provides a brief introduction to convolutional neural networks. The FCN and FCN-POD methods are presented in §2.3 and §2.4 respectively, while EPOD is discussed in §2.5. Results from the considered prediction methods are presented and compared in §3, including instantaneous fields in §3.1, second-order statistics in §3.2, and spectra in §3.3. Furthermore, an assessment of transfer learning between different Reynolds numbers is presented in §4. Finally, the main conclusions of the work are presented in §5. An Appendix is provided containing additional information regarding the predicted instantaneous flow fields.

Datasets
All the DNN variants in this study have been trained using the data generated from direct numerical simulations (DNS) of a turbulent open-channel flow. Periodic boundary conditions are imposed in the $x$- and $z$-directions (which are the streamwise and spanwise coordinates, respectively), and a no-slip condition is applied at the lower boundary ($y = 0$, where $y$ is the wall-normal coordinate). Differently from a standard channel-flow simulation, a symmetry condition is imposed at the upper boundary. In standard channel flows, the wall-attached coherent structures may extend beyond the channel centerline, thus affecting the other wall. The open-channel configuration allows us to investigate to which extent the neural networks are able to learn the dynamics of near-wall turbulence, since the interaction of the large scales with both walls is not present. The simulation is performed using the pseudo-spectral code SIMSON (Chevalier et al. 2007) with constant mass flow rate, in a domain $\Omega = L_x \times L_y \times L_z = 4\pi h \times h \times 2\pi h$ (where $h$ is the channel height), as shown in figure 1. Two friction Reynolds numbers $Re_{\tau}$ (based on $h$ and the friction velocity $u_{\tau} = \sqrt{\tau_w/\rho}$, where $\tau_w$ is the wall-shear stress and $\rho$ is the fluid density) are considered, as summarized in table 1. The flow field is represented with $N_y$ Chebyshev modes in the wall-normal direction and with $N_x$ and $N_z$ Fourier modes in the streamwise and spanwise directions, respectively. The instantaneous fields are obtained at constant time intervals from the time-advancing scheme, which is a second-order Crank-Nicolson scheme for the linear terms and a third-order Runge-Kutta method for the nonlinear terms. Dealiasing using a standard 3/2 rule is employed in the wall-parallel Fourier directions. The velocity fields to be used as ground truth for training and testing are sampled at the following inner-scaled wall-normal coordinates: $y^{+} = 15$, 30, 50 and 100. Note that '+' denotes viscous scaling, i.e. in terms of the friction velocity $u_{\tau}$ or the viscous length $\ell^{*} = \nu/u_{\tau}$ (where $\nu$ is the fluid kinematic viscosity).
A dataset is defined as a collection of samples, each consisting of the shear-stress and pressure fields at the wall as inputs, along with the corresponding velocity fields at the target wall-normal locations as outputs. The training/validation dataset at $Re_{\tau} = 180$ consists of 50,400 instantaneous fields, with a sampling period of $\Delta t^{+}_{s} = 5.08$. The sampling period at $Re_{\tau} = 550$ is set to $\Delta t^{+}_{s} = 1.49$ and the training/validation dataset includes 19,920 fields. In both cases, the dataset is split into training and validation sets, with a ratio of 4:1. As shown in table 1, the number of Fourier modes in the wall-parallel directions is higher at $Re_{\tau} = 550$, even if the resolution of the individual fields is the same in viscous units. Since the domain is much larger when scaled in inner units, a higher number of flow features is sampled per snapshot in the high-$Re_{\tau}$ case, thus partially compensating the lower number of fields.
The predictions used to assess the performance of the trained models were obtained from additional test datasets. This is done both at $Re_{\tau} = 180$ and $Re_{\tau} = 550$, and it is necessary to ensure that the training and test datasets are completely uncorrelated, both in space and time. Test samples were taken from simulations initialized with a random seed different from that of the training-data simulation. The size of the test dataset (more than 3,000 fields for both $Re_{\tau}$) is sufficient to achieve convergence of the turbulence statistics from the predicted flow, and then these are compared with the reference values from the DNS.

Convolutional neural networks
In this study we consider the instantaneous two-dimensional fields of the streamwise and spanwise wall-shear-stress components and of the wall pressure as inputs for our models. The presence of coherent features in the input fields motivates the use of convolutional layers in our neural-network models to process the information. In these layers, a convolution in two dimensions is performed, and it is defined as:
$$F_{i,j} = \sum_{m=1}^{k_1} \sum_{n=1}^{k_2} I_{i-m,\,j-n}\, K_{m,n}, \tag{2.1}$$
where $I \in \mathbb{R}^{d_1 \times d_2}$ is the input, $K \in \mathbb{R}^{k_1 \times k_2}$ is the so-called kernel (or filter) containing the learnable parameters of the layer, and the transformed output $F$ is the feature map.
Multiple feature maps can be stacked and sequentially fed into another convolutional layer as input. This allows the next layer to combine the features individually identified in each feature map, enabling the prediction of larger and more complex features for progressively deeper convolutional networks. Since $k_i \ll d_i\ \forall i$, the use of kernels greatly reduces the number of parameters that need to be learned during training (in comparison with fully-connected MLP networks).
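The convolution and the parameter saving described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the helper names `conv2d_valid`, `conv_params` and `dense_params` are hypothetical, the 3 × 3 kernel size is an assumed value chosen only for illustration, and a single input and output channel without bias is assumed.

```python
import numpy as np

def conv2d_valid(I, K):
    """Two-dimensional convolution F[i, j] = sum_m sum_n I[i - m, j - n] K[m, n],
    evaluated only where the kernel fully overlaps the input ("valid" region)."""
    d1, d2 = I.shape
    k1, k2 = K.shape
    F = np.zeros((d1 - k1 + 1, d2 - k2 + 1))
    K_flipped = K[::-1, ::-1]  # flipping turns the sliding product into a true convolution
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            F[i, j] = np.sum(I[i:i + k1, j:j + k2] * K_flipped)
    return F

def conv_params(k1, k2):
    """Learnable weights of one kernel (single channel, no bias: an assumption)."""
    return k1 * k2

def dense_params(d1, d2):
    """Weights of a fully-connected map between two d1 x d2 fields (no bias)."""
    return (d1 * d2) ** 2

# Hypothetical comparison on a 192 x 192 field with an assumed 3 x 3 kernel.
print(conv_params(3, 3), dense_params(192, 192))
```

The number of kernel weights does not depend on the field size, whereas the fully-connected count grows with the square of the number of grid points.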
A DNN that features this kind of layer is known as a convolutional neural network (CNN, see Lecun et al. 1998). In this work we consider two different architectures to predict the instantaneous velocity fields at different wall-normal locations based on the same input fields. In one case, the instantaneous two-dimensional velocity fluctuations are directly predicted from the input fields by using a fully-convolutional neural network (FCN). This network is similar to, but conceptually different from, conventional CNNs, which typically have several convolutional layers followed by one or more fully-connected layers (which are the building blocks of MLP networks). In conventional CNNs the localized information processed by the individual convolutional kernels is combined to obtain a global prediction, whereas in FCNs only convolutional layers are present and the network architecture is based on the assumption that the relation between input and output variables is spatially localized. The input region from which a single point of the output can draw information is called the receptive field, and it can be computed based on the network architecture, as described by Dumoulin & Visin (2016). Additional details regarding this architecture are provided in §2.3. The second approach is a development of the one used by Güemes et al. (2019), and it is based on the following steps: first, the fluctuation fields are projected on an orthonormal basis using proper orthogonal decomposition (POD) (Lumley 1967), so that the spatial and temporal dynamics are separated. Then, the neural network reconstructs the velocity fluctuation field by predicting the coefficients that determine the temporal dynamics. Here we also employ a fully-convolutional network, which will be referred to as FCN-POD, and its architecture is described in §2.4.

Fully-convolutional neural-network predictions
FCNs are commonly used in applications where the input and output domains have structural similarities. Image segmentation (Long et al. 2015) is one such case, since the output has the same spatial dimension as the input, as in our predictions of two-dimensional flow fields. The inputs of the network are the wall-shear-stress components in the streamwise and spanwise directions, as well as the pressure at the wall. Each of the inputs of the network is normalized using the respective mean and standard deviation, as computed from the training/validation set. The predictions are performed using the same mean and standard deviation values on the test dataset inputs. The outputs are the instantaneous velocity fluctuations, denoted as $u$, $v$ and $w$ (corresponding to the streamwise, wall-normal and spanwise velocities, respectively), at a given distance from the wall. Note that the predictions are carried out at the same time as the input fields. In our previous work (Guastoni et al. 2019a), a similar FCN was used to predict the streamwise component of instantaneous flow fields. In the present study the predictions are extended to the wall-normal and spanwise components; however, the back-propagation algorithm that is used to train the networks works best when the error in the prediction of the three outputs (i.e. the three velocity components) has a similar magnitude for all of them. Thus, the fluctuations are scaled as follows:
$$\hat{u} = u, \qquad \hat{v} = v\,\frac{u_{\mathrm{RMS}}}{v_{\mathrm{RMS}}}, \qquad \hat{w} = w\,\frac{u_{\mathrm{RMS}}}{w_{\mathrm{RMS}}}, \tag{2.2}$$
where RMS refers to root-mean-squared quantities. During inference (i.e. when the predictions are computed from the inputs in the test dataset), the outputs of the network are re-scaled back to their original magnitude. The network is trained to minimize the following loss function:
$$\mathcal{L}(\boldsymbol{u}_{\mathrm{FCN}}; \boldsymbol{u}_{\mathrm{DNS}}) = \frac{1}{N_x N_z} \sum_{i=1}^{N_x} \sum_{j=1}^{N_z} \left| \boldsymbol{u}_{\mathrm{FCN}}(i,j) - \boldsymbol{u}_{\mathrm{DNS}}(i,j) \right|^{2}, \tag{2.3}$$
which is the mean-squared error (MSE) between the instantaneous prediction $\boldsymbol{u}_{\mathrm{FCN}}$ and the true velocity fluctuations $\boldsymbol{u}_{\mathrm{DNS}}$, as computed by the DNS and scaled in the same way as the network outputs.
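The output scaling and the loss can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the training code of this work: it assumes that $v$ and $w$ are rescaled so that their RMS matches that of $u$, and the function names are hypothetical.

```python
import numpy as np

def rms(a):
    """Root-mean-squared value of a fluctuation field."""
    return np.sqrt(np.mean(a ** 2))

def scale_outputs(u, v, w, u_rms, v_rms, w_rms):
    """Rescale v and w so that all three components share the RMS of u (assumed form)."""
    return u, v * u_rms / v_rms, w * u_rms / w_rms

def unscale_outputs(u_hat, v_hat, w_hat, u_rms, v_rms, w_rms):
    """Inverse of scale_outputs, applied to the network outputs during inference."""
    return u_hat, v_hat * v_rms / u_rms, w_hat * w_rms / u_rms

def mse_loss(pred, target):
    """Mean-squared error between predicted and reference (scaled) fields."""
    return np.mean((pred - target) ** 2)
```

In practice the RMS values would be computed once from the training data and reused unchanged at inference time, in the same way as the input mean and standard deviation.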
The chosen inputs and outputs allow the FCN to learn only the spatial relation between the quantities at the wall and the fluctuations farther away from it. Note that it would also be possible to consider predictions in time, and in that case convolutional neural networks could be used (van den Oord et al. 2016) treating time as another spatial coordinate, or it would be possible to use recurrent networks, specifically designed to learn temporal sequences as we have recently shown with long-short term memory (LSTM) networks (Srinivasan et al. 2019;Guastoni et al. 2019b). In both cases, the need of multiple samples in time makes the model less flexible than one that relies only on spatial correlations, both during training and testing. These models usually assume a constant sampling time for the data sequence, which might be difficult to enforce, for example if the fields are taken from a numerical simulation with adaptive time step. During inference, models that work with sequences would require input fields at different times to perform the prediction. On the other hand, a single input sample is sufficient for the FCN to predict the output. Input and output fields are obtained from a simulation with periodic boundary conditions in the streamwise and spanwise directions. Such constraints could be added to the loss function, however this would imply that periodicity would only be satisfied in a least-square sense. In our implementation we are able to enforce periodicity in both wall-parallel directions by leveraging the fact that the convolutional-network output is deterministic and influenced only by the local information in the receptive field. In other words, if the network receives a certain local input, the local output value will always be the same, regardless of the local position within the input field. 
In order to have the same values on both edges of the domain, the input fields are padded in the periodic directions, i.e. they are extended on both ends, using the values from the other side of the fields.
The FCN architecture is shown in figure 2. Each convolution operation (except for the last one) is followed by batch normalization (Ioffe & Szegedy 2015) and a rectified-linear-unit (ReLU, see Nair & Hinton 2010) activation function. The receptive field for this architecture is $15 \times 15$ points, hence 16 points are added to each field in both streamwise and spanwise directions. Note that this padding leads to a network output that is slightly larger than the velocity fields from the DNS, and therefore the network output is cropped to match the size of the reference flow fields. The padding involves a computational overhead due to the increased size of the input fields; however, it is important to highlight that the padding is architecture-dependent and not input-dependent, meaning that the input is $\approx 17\,\%$ bigger with a $192 \times 192$ field resolution (at $Re_{\tau} = 180$), but only $\approx 6\,\%$ bigger when the fields have a size of $512 \times 512$ (at $Re_{\tau} = 550$). Moreover, the FCN was trained using the Adam (Kingma & Ba 2015) optimization algorithm for 50 epochs, with a scheduled exponential learning-rate decay. We used the optimizer hyperparameters suggested by Kingma & Ba (2015). The total number of trainable parameters in the FCN is 1,264,131.
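The periodic padding and its overhead can be checked with a short NumPy sketch. It assumes that the 16 added points are split evenly between the two ends of each direction, which the text does not specify; `pad_periodic` and `padding_overhead` are hypothetical helper names.

```python
import numpy as np

def pad_periodic(field, pad=16):
    """Extend a 2-D field by `pad` points in total along each periodic direction,
    filling the new points with values wrapped from the opposite side."""
    lo = pad // 2          # assumed even split between the two ends
    hi = pad - lo
    return np.pad(field, ((lo, hi), (lo, hi)), mode="wrap")

def padding_overhead(n, pad=16):
    """Relative increase in the number of input points for an n x n field."""
    return ((n + pad) / n) ** 2 - 1.0

for n in (192, 512):
    print(n, f"{100 * padding_overhead(n):.1f} % more input points")
```

Because the amount of padding is fixed by the architecture, the relative overhead shrinks as the field resolution grows, in line with the two percentages quoted above.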

POD-based predictions with convolutional neural networks
The methodology proposed by Güemes et al. (2019) employs a field of streamwise wall-shear stress to reconstruct the flow field at a certain wall-normal distance as a linear combination of orthogonal modes $\boldsymbol{\phi}_i(\boldsymbol{x})$:
$$\boldsymbol{u}(\boldsymbol{x}, t) \approx \sum_{i=1}^{N_m} a_i(t)\, \sigma_i\, \boldsymbol{\phi}_i(\boldsymbol{x}), \tag{2.4}$$
where $N_m$ is the total number of POD modes, $a_i(t)$ is the temporal POD coefficient corresponding to mode $i$, and $\sigma_i$ is the square root of its energy contribution. While the orthogonal modes were estimated from a POD of the training dataset, the network was trained and then employed to predict the mode temporal coefficients for each snapshot. This approach was assumed to be especially advantageous since it filters out the noise content represented by small and uncorrelated scales, thus taking advantage of the energy optimality of POD modes. While in the work by Güemes et al. (2019) the domain employed and reconstructed had a size of $h \times h$, resulting in a compact POD eigenspectrum, the availability of a larger domain in the streamwise and spanwise directions spreads the energy content over a wider set of wave numbers and POD modes (it must be recalled here that POD and Fourier modes coincide for homogeneous fields). To address this issue in the present study, the large instantaneous flow fields were subdivided into $N_{sx} \times N_{sz}$ smaller regions (henceforth referred to as subdomains), roughly of size $(h \times 0.5h)$ in the streamwise and spanwise directions respectively for the $Re_{\tau} = 180$ case, and of size $(0.4h \times 0.2h)$ at $Re_{\tau} = 550$. Note that the size of these subdomains is comparable to that employed by Güemes et al. (2019). The advantage of this approach compared to directly decomposing the full field lies in the fact that, in these subdomains, the first POD modes contain a very large fraction of the total energy content, as shown in figures 3(a) and 3(c). This is a direct consequence of including the energy of the structures larger than the domain into the first POD mode (Liu et al. 2001; Wu 2014).
The choice of the size of the subdomains was the result of a compromise between reconstructing the majority of the energy content of the flow and compression of the information. For $Re_{\tau} = 180$ the flow fields were divided into $12 \times 12$ subdomains, while for $Re_{\tau} = 550$ a discretization into $32 \times 32$ subdomains was performed. Note that the number of subdomains was selected to ensure that the first POD mode contains a similar level of energy in both cases.
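The subdivision into subdomains, and the tiling used later to reassemble the full field, can be sketched with array reshapes. The sketch assumes that the field size is an exact multiple of the number of subdomains (as for a $192 \times 192$ field split into $12 \times 12$ blocks); the function names are hypothetical.

```python
import numpy as np

def split_into_subdomains(field, n_sx, n_sz):
    """Split an (nx, nz) field into n_sx x n_sz non-overlapping blocks.
    Returns an array of shape (n_sx, n_sz, nx // n_sx, nz // n_sz)."""
    nx, nz = field.shape
    px, pz = nx // n_sx, nz // n_sz
    return field.reshape(n_sx, px, n_sz, pz).transpose(0, 2, 1, 3)

def tile_subdomains(blocks):
    """Inverse operation: assemble the blocks back into a single (nx, nz) field."""
    n_sx, n_sz, px, pz = blocks.shape
    return blocks.transpose(0, 2, 1, 3).reshape(n_sx * px, n_sz * pz)

field = np.random.default_rng(0).normal(size=(192, 192))
blocks = split_into_subdomains(field, 12, 12)
print(blocks.shape)
```

With the resolutions and subdomain counts quoted in the text, both Reynolds numbers would give blocks of the same number of grid points under this assumption.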
The POD modes of the data discretized into subdomains were computed following the snapshot approach proposed by Sirovich (1987). The three fluctuating components of each instantaneous subdomain were rearranged into a snapshot matrix:
$$U = \begin{bmatrix} u_1^{1} & \cdots & u_{N_p}^{1} & v_1^{1} & \cdots & v_{N_p}^{1} & w_1^{1} & \cdots & w_{N_p}^{1} \\ \vdots & & \vdots & \vdots & & \vdots & \vdots & & \vdots \\ u_1^{N_t} & \cdots & u_{N_p}^{N_t} & v_1^{N_t} & \cdots & v_{N_p}^{N_t} & w_1^{N_t} & \cdots & w_{N_p}^{N_t} \end{bmatrix}, \tag{2.5}$$
where $N_t$ refers to the total number of snapshots, i.e. the number of instantaneous flow fields $N_f$ times the number of subdomains per flow field ($N_t = N_{sx} \times N_{sz} \times N_f$), and $N_p$ refers to the total number of grid points in one subdomain. Then, POD spatial modes can be evaluated by solving the eigenvalue problem of the spatial correlation matrix $C$ as follows:
$$C = U^{T} U = \Phi^{T} \Lambda\, \Phi, \tag{2.6}$$
where $\Phi$ is a matrix the rows of which contain the spatial POD modes, while $\Lambda$ is a diagonal matrix with elements $\lambda_i = \sigma_i^{2}$, which represent the variance content of each mode. The POD coefficients $a_i(t)$ are obtained by projecting the flow fields on the spatial POD modes computed with equation (2.6). Note that this economy-size decomposition returns a number of POD modes $N_m$ equal to $3N_p$ and that for such a discrete dataset equation (2.4) is an equality.
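The decomposition described above can be sketched with NumPy as an eigendecomposition of $U^{T}U$. This is an illustrative sketch under assumptions, not the authors' code: snapshots are stored as rows, the function names are hypothetical, and the projection returns the product $a_i\sigma_i$ directly so that no division by small singular values is needed.

```python
import numpy as np

def pod_spatial(U):
    """POD of a snapshot matrix U of shape (N_t, 3 * N_p), snapshots as rows.
    Returns Phi (rows are spatial modes, sorted by decreasing energy) and sigma."""
    C = U.T @ U                               # spatial correlation matrix
    lam, vecs = np.linalg.eigh(C)             # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]             # reorder: most energetic mode first
    lam = np.clip(lam[order], 0.0, None)      # guard against tiny negative round-off
    Phi = vecs[:, order].T
    return Phi, np.sqrt(lam)

def project(U, Phi):
    """Coefficients of each snapshot on the spatial modes (equal to a_i * sigma_i)."""
    return U @ Phi.T

def reconstruct(coeffs, Phi, n_r=None):
    """Rebuild the snapshots from the first n_r modes (all modes if n_r is None)."""
    n = coeffs.shape[1] if n_r is None else n_r
    return coeffs[:, :n] @ Phi[:n]
```

When all $3N_p$ modes are retained the reconstruction reproduces the snapshots to round-off, which is the discrete counterpart of the statement that equation (2.4) becomes an equality.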
The temporal POD coefficients of each instantaneous flow field were rearranged in a tensor of size $N_{sx} \times N_{sz} \times N_r$ to train an FCN, with $N_{sx} \times N_{sz}$ being $12 \times 12$ and $32 \times 32$ for $Re_{\tau} = 180$ and 550, respectively, and $N_r$ the number of POD modes to be predicted, with $N_r < N_m$. As shown in figures 3(b) and 3(d), the first 64 POD modes account for $90\,\%$ of the total energy at $Re_{\tau} = 180$, while 128 modes are needed at $Re_{\tau} = 550$ to retain a similar amount of energy. Therefore our predictions are based on equation (2.4) truncated at $N_r$, with $N_r = 64$ and 128 for $Re_{\tau} = 180$ and 550 respectively. As illustrated in figure 4, each filter corresponds to the $N_{sx} \times N_{sz}$ POD coefficients of a given mode number. In general, the energy distribution reported for both $Re_{\tau}$ cases is very similar. The main difference is that the energy distribution becomes more compact at $y^{+} = 15$ for the low-$Re_{\tau}$ case (see figure 3).
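Choosing a truncation from the cumulative energy can be sketched as follows. The helper name `modes_for_energy` is hypothetical, and the sketch simply assumes that the energy of mode $i$ is $\sigma_i^{2}$ with the modes sorted by decreasing energy.

```python
import numpy as np

def modes_for_energy(sigma, fraction=0.9):
    """Smallest number of leading modes whose cumulative energy (sum of sigma_i**2)
    reaches the requested fraction of the total energy."""
    energy = np.asarray(sigma, dtype=float) ** 2
    cumulative = np.cumsum(energy) / np.sum(energy)
    n_r = int(np.searchsorted(cumulative, fraction)) + 1
    return min(n_r, energy.size)  # guard against round-off when fraction is 1
```

Applied to the singular values of the subdomain POD, a function of this kind would return the number of modes needed for a given energy threshold.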
In order to reconstruct the instantaneous fluctuation fields, the time coefficients were predicted using an FCN, with the wall-shear-stress components and the wall pressure as inputs. The $N_r$ time coefficients belonging to each subdomain are used to reconstruct their respective fluctuation fields as in equation (2.4), where the orthonormal basis functions are retrieved from the training data. The entire fields are assembled by tiling the fields within these subdomains. Note that there is no guarantee of smoothness across the edges of the subdomains because of the finite number of modes that are used to reconstruct the flow and because of the prediction error in the temporal coefficients. The underlying assumption is ergodicity, i.e. both the training and test datasets share the same statistical features and, consequently, the same spatial modes. This requires a sufficiently large training dataset to ensure convergence of the spatial modes, which is generally ascribed to the convergence of second-order statistics. Note that the predictions are performed at the same instant as that of the input fields. The implemented neural network does not require the knowledge of the input at previous timesteps, thus avoiding the limitations of availability of time sequences, as discussed above. The network is trained to minimize the loss function:
$$\mathcal{L}(\boldsymbol{a}_{\mathrm{FCN\text{-}POD}}; \boldsymbol{a}_{\mathrm{DNS}}) = \frac{1}{N_{sx} N_{sz} N_r} \sum_{i=1}^{N_{sx}} \sum_{j=1}^{N_{sz}} \sum_{k=1}^{N_r} \left| a_{k,\mathrm{FCN\text{-}POD}}(i,j) - a_{k,\mathrm{DNS}}(i,j) \right|^{2}, \tag{2.7}$$
which is the MSE between the predicted and the actual POD temporal coefficients of the DNS data. The neural-network architecture considered here blends the FCN shown in figure 2 and the network used by Güemes et al. (2019) (see figure 1 in that work). As in the FCN approach, each convolution operation (except for the last one) is followed by batch normalization (Ioffe & Szegedy 2015) and a ReLU (Nair & Hinton 2010) activation function. After each activation function a max pooling layer is added.
Differently from what was done in the FCN approach, here the velocity components were not scaled before the decomposition, in order to keep the physical encoding based on the turbulent kinetic energy (TKE) of the flow. Note that by modifying the relative contribution of the velocity components to the energy norm, the modes would have been sorted based on a norm different from the TKE. The main difference with respect to the network used by Güemes et al. (2019) is the fact that here a single network is used to predict the full set of POD coefficients, instead of using different networks to predict each mode. Additionally, the work by Güemes et al. (2019) focused on a smaller region of the flow field, and therefore no subdomains were required for the region of interest. Lastly, the final fully-connected layer in Güemes et al. (2019) was not considered here, in order to have an architecture more directly comparable with the FCN. As in the FCN case, the FCN-POD network was trained using the Adam (Kingma & Ba 2015) optimization algorithm with a scheduled exponential learning-rate decay. In this case, the $\hat{\epsilon}$ parameter from Kingma & Ba (2015) was set to 0.1 following TensorFlow recommendations (Abadi et al. 2016).
For the $Re_{\tau}=180$ case the number of trainable parameters is 4,733,248, while for the $Re_{\tau}=550$ case it is 5,028,224.
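The reconstruction-and-tiling step described in this section can be sketched as follows. This is a minimal illustration and not the implementation used in this work: it assumes that each subdomain field is the product of its predicted coefficients with a single orthonormal basis shared by all subdomains, and the function names (`split_into_tiles`, `reconstruct_tiled_field`, `coefficient_mse`) are hypothetical.

```python
import numpy as np


def split_into_tiles(field, n_tiles_z, n_tiles_x):
    """Cut a (nz, nx) field into equal subdomains, each flattened to a vector."""
    nz, nx = field.shape
    tile_nz, tile_nx = nz // n_tiles_z, nx // n_tiles_x
    tiles = field.reshape(n_tiles_z, tile_nz, n_tiles_x, tile_nx)
    return tiles.transpose(0, 2, 1, 3).reshape(n_tiles_z, n_tiles_x, -1)


def reconstruct_tiled_field(coeffs, basis, tile_shape):
    """Rebuild every subdomain from its coefficients and tile the results.

    coeffs: (n_tiles_z, n_tiles_x, n_modes) predicted temporal coefficients
    basis:  (n_modes, tile_nz * tile_nx) orthonormal spatial modes, one per row
            (assumed here to be shared by all subdomains)
    """
    n_tiles_z, n_tiles_x, _ = coeffs.shape
    tile_nz, tile_nx = tile_shape
    # Linear combination of the spatial modes within each subdomain.
    tiles = coeffs @ basis
    tiles = tiles.reshape(n_tiles_z, n_tiles_x, tile_nz, tile_nx)
    # Place the subdomains side by side; no smoothing is applied at the edges.
    return tiles.transpose(0, 2, 1, 3).reshape(
        n_tiles_z * tile_nz, n_tiles_x * tile_nx
    )


def coefficient_mse(predicted, reference):
    """MSE between predicted and reference POD temporal coefficients."""
    return float(np.mean((predicted - reference) ** 2))
```

With a truncated basis (fewer modes than points per subdomain) the tiled field is only an approximation, which is one of the two sources of edge discontinuities mentioned above.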

Extended POD
In addition to the two FCN-based approaches, which involve nonlinear relations between input and output, we also consider a method involving a linear relationship, i.e. the EPOD. In this way it is possible to assess the prediction improvement obtained with nonlinear methods in the context of wall-bounded turbulence. If the wall quantities are rearranged into a snapshot matrix $W$, with each snapshot forming a row, the method of snapshots proposed by Sirovich (1987) can be used to decompose this matrix into POD modes as

$$W = \Psi_w \Sigma_w \Phi_w, \tag{2.8}$$

with $\Psi_w$ and $\Phi_w$ being the temporal and spatial mode matrices respectively, and $\Sigma_w$ being a diagonal matrix containing the singular values. The extended POD modes (Borée 2003), corresponding to the projection of the wall quantities on the flow-field temporal basis, are collected in the matrix $L$ defined in equation (2.9). If the dataset is sufficiently large to reach statistical convergence, the matrix $L$ describes the relationship between the temporal POD coefficients of a certain distribution of wall features and those of the corresponding flow field. Once the temporal correlation matrix is known, an out-of-sample flow field $u$ can be reconstructed from $L$ and the instantaneous realization of the wall features through $\psi_w$, the vector containing the temporal coefficients of the wall fields used for prediction. Note that $\psi_w$ is retrieved by projecting the out-of-sample wall field $w$ on the POD basis: $\psi_w = w \Phi_w^{T} \Sigma_w^{-1}$, where $\Phi_w^{T} \Sigma_w^{-1}$ is readily available from the training dataset.
An important remark is that the matrix $\Sigma_w$ can be ill-conditioned. In fact, due to the correlation between subsequent time-resolved snapshots, the rank of $\Sigma_w$ is smaller than $N_f$, which is the number of snapshots (here smaller than the number of points). To address this issue, a reduced-order representation of the matrix $\Sigma_w$ is employed after truncating the null elements in the diagonal of $\Sigma_w$. Even if $\Sigma_w$ had rank equal to $N_f$, it might be adequate to truncate the matrix $L$ (Discetti et al. 2018). Decomposing the flow quantities as $U = \Psi_u \Sigma_u \Phi_u$, similarly to what is done for the wall quantities in equation (2.8), it can be observed that the product of the two matrices $\Psi_u \Psi_w^{T}$ in equation (2.9) returns a unitary-norm matrix with rank equal to those of $\Psi_u$ and $\Psi_w^{T}$, which are bases in the $\mathbb{R}^{N_f}$ vector space. As a consequence, a certain $j$th wall mode, uncorrelated with any field mode, would not result in a corresponding null row or column. To remove the uncorrelated content from the matrix $\Psi_u \Psi_w^{T}$, Discetti et al. (2018) proposed to set to zero all the entries of the matrix with absolute values smaller than a threshold proportional to the matrix standard deviation. In the present work we have found that this filtering yields an error drop of approximately 10 percentage points with respect to the standard EPOD procedure. However, since the EPOD is used as a benchmark for the performance of linear methods with respect to the FCN-based approaches proposed herein, the filtered EPOD by Discetti et al. (2018) is not included in this comparison for brevity.
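The EPOD estimation can be sketched in a few lines of NumPy. The sketch below is illustrative only and rests on explicit assumptions: the extended modes are taken to be $L = \Psi_w^{T} U$ (with $U$ the flow-field snapshot matrix, one snapshot per row) and the prediction to be $u = \psi_w L$, which is our reading of the description above. The function names and the relative tolerance used to discard null singular values are hypothetical, and the threshold filtering of Discetti et al. (2018) is not included.

```python
import numpy as np


def fit_epod(W, U, rel_tol=1e-10):
    """Fit a linear EPOD estimator from training snapshots.

    W: (N_f, N_w) wall-quantity snapshots, one snapshot per row
    U: (N_f, N_u) flow-field snapshots at the same instants
    Returns the projector Phi_w^T Sigma_w^{-1} and the extended modes L.
    """
    # Economy-size SVD: W = Psi_w @ diag(sigma_w) @ Phi_w
    psi_w, sigma_w, phi_w = np.linalg.svd(W, full_matrices=False)
    # Truncate (near-)null singular values to avoid an ill-conditioned inverse.
    keep = sigma_w > rel_tol * sigma_w[0]
    psi_w, sigma_w, phi_w = psi_w[:, keep], sigma_w[keep], phi_w[keep]
    # Assumed definition of the extended modes: L = Psi_w^T U
    L = psi_w.T @ U
    # Phi_w^T Sigma_w^{-1}: column j of Phi_w^T is divided by sigma_j.
    projector = phi_w.T / sigma_w
    return projector, L


def predict_epod(w, projector, L):
    """Estimate the flow field from an out-of-sample wall field w."""
    psi_w = w @ projector  # temporal coefficients of the wall field
    return psi_w @ L
```

When the flow field is an exactly linear function of the wall quantities and the training snapshots span the wall space, this estimator recovers the linear map. Any deviation between the EPOD prediction and the reference therefore measures the content that a linear relation cannot capture.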

Results
The predictions of the trained models are compared with the data obtained from the DNS at $Re_{\tau}=180$ and 550. The performance assessment is carried out first from a qualitative point of view and subsequently from a quantitative perspective, based on predictions of instantaneous fields, turbulence statistics and spectra.

Instantaneous predictions
The predicted fluctuation fields are first qualitatively inspected. Note that the fluctuation flow fields are the direct output of the FCN models, while in the FCN-POD models the temporal coefficients need to be processed to reconstruct the fluctuations, as outlined above. In this work the sampling period in the simulation is fixed; however, we showed in our previous work (Guastoni et al. 2019a) that using less correlated samples during training (i.e. a longer sampling period) can effectively improve the quality of the instantaneous predictions of the FCN method, provided that the neural-network capacity is sufficient to generalize over the training dataset.
In figure 5, the predictions of an instantaneous field of streamwise velocity fluctuations based on the various methods (namely FCN, FCN-POD and EPOD) are compared with the reference DNS. The predictions of the wall-normal and spanwise fluctuations at the same instant are shown in Appendix A. At $y^{+}=15$ all the methods provide accurate results, although the EPOD overestimates the fluctuations from the high-speed streaks. At $y^{+}=30$ the neural-network-based models maintain a good level of accuracy, while the EPOD, despite the improved predicted range of fluctuations, does not seem to be as accurate as the other two models. The CNN-based methods start to exhibit some deviations with respect to the reference at $y^{+}=50$, where the FCN-POD field is smoother and the FCN is slightly noisier than the DNS. Farther from the wall, the footprint is less pronounced, and therefore the ability of EPOD (which is a linear method) to predict the flow in this region is significantly reduced. In fact, the fields predicted through EPOD at $y^{+}>15$ are qualitatively very similar to the DNS, although the fluctuations become increasingly attenuated at larger $y^{+}$. Furthermore, the FCN-POD method tends to merge neighbouring regions with high- or low-velocity fluctuations, predicting more elongated streak-like patterns than in the reference field. This is more evident at $y^{+}=100$, leading to an overestimation of the amplitude of the regions in the flow where the velocity fluctuations are higher. At this location, the FCN is not able to provide a reliable prediction of the flow field, capturing only the regions in which the magnitude of the fluctuations is higher. The corresponding structures probably have a distinct footprint at the wall, which allows the FCN to identify them. Note that at $Re_{\tau}=180$ there is no real scale separation. As discussed in §2.4, the FCN-POD approach does not guarantee flow smoothness across the edges of the subdomains.
Close inspection of the predictions from the FCN-POD method reveals the edges of the subdomains at all $y^{+}$, and the tiling is more evident in the streamwise direction because of the discontinuities located at the same spanwise location, orthogonally to the main flow structures. Despite these limitations we can observe that the velocity fields are generally smooth, without steep discontinuities at the edges of the subdomains: the variations of the velocity magnitude at the edges are of the same order as the fluctuations at the corresponding wall-normal distance.
The qualitative observations discussed above are complemented with a quantitative assessment of the instantaneous prediction performance, by analyzing the MSE $\mathcal{L}$ between the instantaneous predictions (denoted by 'Pred') and the reference (defined for each of the fluctuations independently), as shown in figure 6. Both neural-network models (FCN and FCN-POD) are trained using a stochastic algorithm and, in order to show the robustness of the optimal configuration, the statistics at each $y^{+}$ are averaged over three different models, each with a different random weight initialization. Since the EPOD algorithm is completely deterministic, a single prediction is sufficient. The neural-network-based models consistently provide a lower error than the EPOD, with the FCN yielding a slightly better instantaneous performance closer to the wall than the FCN-POD approach. The gap between the two is reduced when moving away from the wall, where the prediction error of the streamwise fluctuations is approximately the same for both models at $y^{+}=50$, and it is slightly higher for the FCN at $y^{+}=100$.
The FCN architecture reported by Guastoni et al. (2019a) would only predict the streamwise component of the velocity field at the target $y^{+}$. The addition of the two other components implies that the FCN has multiple outputs that need to be optimized at the same time. We note that adding the two additional fluctuating components as outputs leads to slightly less accurate predictions with respect to those reported by Guastoni et al. (2019a) for one single output. This is not surprising, since the capacity of the network remained unchanged. Based on this observation, we tested a variation of the model architecture in order to have more layers dedicated to the prediction of each individual component. This network variation has a common part, identical to the original FCN up to the fourth convolutional layer, in which the weights are optimized using the information from the error gradients computed for all the outputs. The last two convolution operations are replicated for each velocity component and the weights of these layers are updated only with the error associated with the respective output. Such a network, despite its higher capacity, provided worse predictions. A strong causal relation between the different components of the velocity (Lozano-Durán et al. 2020) can be a possible explanation for this result, which shows that updating all the weights with information from the three components at the same time can be beneficial for the quality of the predictions. Note that it is not trivial to design an architecture able to provide the best trade-off between single-component predictions and usage of the information from all the components, and obtaining such an architecture would require further investigation. For the FCN-POD model the multiple-component predictions were obtained as discussed by Güemes et al. (2019).
The temporal POD coefficients can be projected on spatial POD modes involving the three velocity components, thus requiring only one output to predict the three fluctuations. The network architecture is different from the one used by Güemes et al. (2019), since it directly predicts all the needed time coefficients for each snapshot. While the final fully-connected layer included in the network architecture by Güemes et al. (2019) improves the robustness of the prediction, the FCN-POD implementation used here has a much smaller number of weights, thus significantly reducing the computational cost, and retains a larger number of POD modes (and thus more energy). Predictions of the streamwise fluctuation fields from the various methods at $Re_{\tau}=550$ are shown in figure 7, while the results for the wall-normal and spanwise components are presented in Appendix A. Despite the higher friction Reynolds number, the FCN maintains a performance similar to the one achieved for $Re_{\tau}=180$, at all wall-normal locations. Note that the FCN has the same architecture as in the lower-Reynolds-number case, i.e. it has the same number of trainable parameters, while in the case of the FCN-POD approach, the network was modified to reconstruct approximately the same amount of energy as at $Re_{\tau}=180$. Despite the higher number of employed subdomains, the tiling is more apparent at $Re_{\tau}=550$. The prediction performance of the FCN-POD model degrades less quickly than that of the FCN when moving away from the wall; however, the latter still performs better at $y^{+}=50$, as shown in figure 8. On the other hand, the EPOD also exhibits error levels similar to those reported for $Re_{\tau}=180$, except at $y^{+}=15$, where the reconstruction of the streamwise-fluctuation field is significantly worse.

Inclination of coherent structures
The coherent structures in wall-bounded turbulence are inclined (Marusic & Heuer 2007), with a slope that can be computed by finding the maximum spatial correlation $R_{ij}(\delta x)$ between the inputs at the wall (index $i$) and the outputs (index $j$), with $\delta x$ representing the distance in the streamwise direction at which the correlation is computed. By including a streamwise shift in the output fields, it is possible to obtain the maximum correlation at $\delta x = 0$, ensuring that the footprint of the coherent structure is included in the receptive field of the output. The use of such a shift was also discussed by Sasaki et al. (2019) in a similar context. By considering the maximum correlation between the wall-shear stress in the streamwise direction and the streamwise velocity at a certain $y^{+}$, we obtain an angle of $\approx 15^{\circ}$, in very good agreement with previous observations (Marusic & Heuer 2007; Sasaki et al. 2019). This shift was implemented in two alternative ways. The first approach modifies the target output field, i.e. it considers a field that has been sampled later in the simulation, although the accuracy of the introduced shift is limited by the value of the sampling period. The second approach makes use of the periodicity of the output fields, which are translated in the streamwise direction until the maximum correlation is obtained at $\delta x = 0$. This approach allows the shift to be introduced more accurately; however, in this case the underlying hypothesis is that the shift is sufficiently small so that temporal dynamics modify the flow in a negligible manner. Neither of the two shift implementations provided the expected improvement, and we observed a significant degradation of the prediction performance. These results could possibly be explained by the fact that coherent structures of different size have different inclinations, and imposing a single value is detrimental for the overall network performance, despite having chosen the angle that provides the maximum spatial correlation.
Furthermore, the quality of the predictions is measured using the MSE between the prediction and the reference: this error indicator considers all wavelengths at the same time, without considering how the different wavelengths are affected by the shift. Further investigation of this aspect will be conducted in future work.
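The second shift implementation (periodic translation of the output field) can be illustrated with a short NumPy sketch. It is not the code used in this work: the function names are hypothetical, the correlation is evaluated with FFTs along the periodic streamwise direction, and the inclination angle is assumed to follow the geometric relation $\arctan(y/\delta x)$.

```python
import numpy as np


def streamwise_shift_of_max_correlation(wall, field):
    """Shift (in grid points) that maximizes the wall/field correlation.

    wall, field: arrays of shape (nz, nx), periodic along x (axis 1).
    The returned s maximizes sum over z and x of wall[z, x] * field[z, x + s].
    """
    wall_hat = np.fft.fft(wall - wall.mean(), axis=1)
    field_hat = np.fft.fft(field - field.mean(), axis=1)
    # Cross-correlation theorem, then sum the contributions of all z rows.
    corr = np.fft.ifft(np.conj(wall_hat) * field_hat, axis=1).real.sum(axis=0)
    return int(np.argmax(corr))


def align_field(field, shift):
    """Translate the periodic field so that the maximum correlation is at 0."""
    return np.roll(field, -shift, axis=1)


def inclination_angle_deg(y, delta_x):
    """Assumed geometric definition of the structure inclination angle."""
    return float(np.degrees(np.arctan2(y, delta_x)))
```

Because a single shift is applied to the whole field, all scales are translated by the same amount, which is consistent with the explanation given above for the degraded performance.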

Predictions of turbulence statistics
By averaging over the fields obtained from the neural-network models and EPOD, it is possible to evaluate the turbulence statistics of the predicted flow. First we consider the dataset at $Re_{\tau}=180$: the predicted RMS fluctuations of the three components are shown in figure 9, together with the reference DNS profiles. The error in these statistical quantities is defined as the relative error of the predicted RMS value with respect to the reference one for the streamwise component, and similarly for the other two components. As above, the subscripts 'DNS' and 'Pred' refer to the reference and predicted profiles, respectively. An important premise is that neither of the neural-network-based models is explicitly optimized to reproduce the statistics of the original simulation. This prevents the neural networks from learning only the average behaviour of the flow; however, the predictions may be less statistically accurate, since the aim is to maximize the instantaneous performance. Note that here we favor instantaneous performance because our motivation is to use non-intrusive sensing for closed-loop flow control. The prediction errors in the various RMS profiles are summarized in table 2, and they are averaged over the different training runs for the FCN and FCN-POD models. Note that the average is performed over the fluctuation-intensity values and not on the predictions, because that would alter the statistical properties of the predicted flow fields. The comparison of the errors from the different models shows that the statistical performance mimics the one observed for the instantaneous predictions at $y^{+}=15$ and $y^{+}=30$, with the FCN performing better than the FCN-POD and EPOD models. Furthermore, the FCN model provides a similar performance for the fluctuations of all three velocity components, while POD-based methods are more accurate in the predictions of $u^{+}_{\mathrm{RMS}}$.
This is related to the choice of not scaling the different velocity components in the FCN-POD and EPOD approaches, and the fact that near the wall the most energetic dynamics of the flow are in the streamwise direction. Taking into account the standard deviation in the results of the neural-network-based methods, the three models provide similar error levels at $y^{+}=50$. At $y^{+}=100$ the scenario is opposite to what we observed close to the wall: the FCN exhibits the highest errors, while the EPOD provides the best results. The error in the prediction of $u^{+}_{\mathrm{RMS}}$ from the FCN-POD model is between those of the two other models, while the errors in the wall-normal and spanwise intensities are closer to those from the FCN, due to the reasons outlined above.
The statistical analysis is also repeated for the models trained at $Re_{\tau}=550$, with the predicted RMS fluctuations shown in figure 10 and the relative error with respect to the reference simulation in table 3. FCN-based models do not show a significant variation in the prediction of the streamwise fluctuations with respect to the results at $Re_{\tau}=180$, whereas the EPOD exhibits higher errors at this Reynolds number (also in the other two fluctuating components). The FCN has a consistent behaviour also for the fluctuations in the $y$- and $z$-directions; however, the FCN-POD performs slightly worse than before, following the same trend but with higher error levels. The FCN-POD method outperforms the FCN approach only at $y^{+}=100$, confirming the results of the instantaneous performance at $Re_{\tau}=550$.
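The way the statistical errors are evaluated and averaged over training runs can be sketched as follows. The exact error expression is not reproduced in the text above, so the sketch assumes a standard relative error, $|u_{\mathrm{RMS,Pred}} - u_{\mathrm{RMS,DNS}}| / u_{\mathrm{RMS,DNS}}$; the function names are hypothetical.

```python
import numpy as np


def rms_fluctuation(fields):
    """RMS of a fluctuation component over snapshots and the wall-parallel plane.

    fields: (n_snapshots, nz, nx) fluctuations at one wall-normal location.
    """
    return float(np.sqrt(np.mean(fields ** 2)))


def relative_rms_error(predicted, reference):
    """Assumed relative error between predicted and reference RMS values."""
    rms_ref = rms_fluctuation(reference)
    return abs(rms_fluctuation(predicted) - rms_ref) / rms_ref


def mean_error_over_runs(predictions_per_run, reference):
    """Average the per-run errors; the predicted fields are never averaged."""
    errors = [relative_rms_error(p, reference) for p in predictions_per_run]
    return float(np.mean(errors)), float(np.std(errors))
```

Averaging the predicted fields of several runs before computing the RMS would smooth the fluctuations and change their intensity, which is why the statistics are computed per run first.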

Predictions of power-spectral density
The energetic scales present in the predicted fields, as well as their associated energy, are compared with those in the reference DNS data through spectral analysis. In figure 11 we show the pre-multiplied two-dimensional power-spectral density of the streamwise, wall-normal and spanwise fluctuations (denoted by $\phi_{uu}$, $\phi_{vv}$ and $\phi_{ww}$ respectively) at $Re_{\tau}=180$, where $\lambda_x$ and $\lambda_z$ denote the streamwise and spanwise wavelengths, whereas $k_x$ and $k_z$ are the corresponding wavenumbers. These results confirm the observations made in §3.1: at $y^{+}=15$, all the considered models are able to correctly predict the energy content of the flow at all wavelengths, with the FCN slightly outperforming the two POD-based approaches. Note that the FCN-POD model is able to reconstruct the energy content of the flow at wavelengths that are longer than the size of the subdomains, proving that this is not a limiting factor for the model. However, a small jump, probably due to a lack of smoothness at the edges of the subdomains, can be observed in the streamwise wavelength for the 10%-energy level in the wall-normal and spanwise components. These jumps are found at a wavelength $\lambda_x^{+} \approx 180$, corresponding to the subdomain size. At $y^{+}=30$ there is a slight energy attenuation, which becomes increasingly noticeable as the predicted plane moves farther away from the wall. At $y^{+}=100$ the POD-based methods perform better than the FCN model, a fact that can be explained by considering two concurring aspects. The first is that POD methods only predict the temporal dynamics of the system, thus the overall energy-scale distribution stored in the POD spatial modes does not need to be predicted. This makes it possible to reconstruct more than 50% of the flow fields, at least in the streamwise component. The second aspect is the fact that the receptive field of the FCN method, while sufficient for planes closer to the wall, is not large enough to reproduce the large scales present at larger $y^{+}$.
Figure 12 reports the reference and predicted power-spectral densities at $Re_{\tau}=550$. As opposed to what was observed for the instantaneous predictions and the turbulence statistics, the spectra highlight the differences and similarities between the models used for the two Reynolds numbers. As noted above, the FCN architecture is the same for both $Re_{\tau}$; however, this implies that the receptive field is smaller at the higher Reynolds number when it is measured in outer units. This can potentially help in the prediction of the small scales, but it can also be detrimental for the larger scales. Note that the different size of the receptive field does not seem to affect the predicted energy content of the FCN, which shows the same trends as for the low-$Re_{\tau}$ case. On the other hand, the spectra of the FCN-POD with $12 \times 12$ subdomains (the same number as in the low-$Re_{\tau}$ case, not shown) exhibit spurious periodic peaks due to the tiling. This observation motivated the increased number of subdomains considered at $Re_{\tau}=550$. As in the low-$Re_{\tau}$ case, the FCN-POD approach is able to reconstruct the scales larger than the subdomain size, but the contour line exhibits a small jump in the streamwise wavelength for the 10%-energy level. This jump appears at a wavelength $\lambda_x^{+} \approx 200$, corresponding to the subdomain size employed in the high-$Re_{\tau}$ case. This jump is also visible in the streamwise component at $y^{+}=15$. The POD-based approaches are outperformed by the FCN in the range $y^{+}=15$ to 50. Farther from the wall, the accuracy of the FCN is matched by the FCN-POD method. It is interesting to note that the EPOD does not follow the same attenuation process as the FCN-based methods in the wall-normal and spanwise components. As one moves farther from the wall, the FCN-based methods fail to reproduce a wider range of small scales, whereas the EPOD exhibits more difficulties predicting the large scales.
When it comes to the spectral peaks far from the wall, while the FCN-based methods produce noisier predictions than in the low-$Re_{\tau}$ case, the EPOD is not able to reproduce that part of the spectra.
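For reference, a pre-multiplied two-dimensional power-spectral density of the kind compared in this section can be estimated as sketched below. This is a minimal sketch, not the post-processing used in this work: the function name, the normalization (chosen so that the spectrum sums to the mean-square fluctuation) and the domain lengths passed as arguments are assumptions.

```python
import numpy as np


def premultiplied_spectrum(fields, length_x, length_z):
    """Snapshot-averaged 2D spectrum of a periodic field and its pre-multiplied form.

    fields: (n_snapshots, nz, nx) fluctuations, periodic in x and z.
    Returns kx, kz, phi and |kx| |kz| phi, with phi of shape (nz, nx).
    """
    _, nz, nx = fields.shape
    u_hat = np.fft.fft2(fields, axes=(1, 2))
    # Normalized so that phi.sum() equals the mean-square fluctuation (Parseval).
    phi = np.mean(np.abs(u_hat) ** 2, axis=0) / (nx * nz) ** 2
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=length_x / nx)
    kz = 2.0 * np.pi * np.fft.fftfreq(nz, d=length_z / nz)
    premultiplied = np.abs(kz)[:, None] * np.abs(kx)[None, :] * phi
    return kx, kz, phi, premultiplied
```

The wavelengths used on the axes of the spectra follow from the wavenumbers as $\lambda_x = 2\pi/k_x$ and $\lambda_z = 2\pi/k_z$ for the non-zero entries.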

Transfer learning
A number of more advanced techniques have also started to be adopted from the specialized machine-learning literature and applied to fluid-dynamics research. One notable example is transfer learning (Pan & Yang 2009), a method that allows knowledge to be transferred from one neural-network model to another, thus reducing the amount of data and time required for training. Guastoni et al. (2019a) showed that the training time at a given wall-normal location may be significantly reduced if the network parameters are initialized using the optimized parameters of a previously-trained network at another wall-normal location. Similarly, Kim & Lee (2020) used the convolutional network trained at a low Reynolds number to predict the flow at a higher Reynolds number.
Transfer learning represents an appealing solution for the main drawback of neural networks, which is the need to train them with a sufficient amount of data. Training typically requires specialized hardware and in our specific application the computational cost of generating the training and test datasets is not negligible. Furthermore, this cost grows as $Re_{\tau}$ increases, making the generation of training data through DNS unfeasible at the Reynolds numbers that are relevant for engineering applications. In this regard, it is important to make an efficient use of the data and the trained models at our disposal. In this work, the possibility of transferring knowledge between models trained at different friction Reynolds numbers is investigated. At a fixed wall-normal distance, the weights of the FCN model trained on the dataset at $Re_{\tau}=180$ are loaded before training the network with the higher-$Re_{\tau}$ dataset. This is possible because the network has the same number of trainable parameters in both cases, as noted above. The learning rate is the only parameter that needs to be modified: a lower value has to be set, in order to prevent the optimizer from diverging too quickly from the weight configuration used for initialization. While in Guastoni et al. (2019a) we froze the first layers of the initialized network because the input was the same at the different wall-normal locations, in this case all the layers are trainable because the input distribution changes from one $Re_{\tau}$ to the other.
First we considered an initialized model with the full training/validation dataset, in order to assess the effect of the initialization on the training results. Subsequently, new training runs of the initialized model were performed with 25% and 50% of the original dataset. Differently from the previous sections, only one training run was performed for each case. In order to compare models trained with datasets of different sizes, we considered the number of weight updates through the optimization algorithm during training. In figure 13 the validation and test losses are compared for the models trained with the full dataset and a random initialization. When the initialized model is trained on the full dataset, the performance is consistently better than that of the random initialization, both in terms of validation and test loss. The improvement is more evident close to the wall, whereas at $y^{+}=100$ the two models provide approximately the same results after the first 150,000 updates. Transferring knowledge between different Reynolds numbers is then not only feasible, but also advantageous in terms of performance when the same amount of data is considered. If the training/validation dataset is reduced, the validation loss seems to overestimate the error compared with that of the test dataset when 25% of the data is used, at all wall-normal locations. The opposite holds when 50% of the training dataset is used. Up to $y^{+}=50$, the initialized networks are able to provide a performance that is very similar to that of the reference model with the same number of updates, with significant savings in terms of the amount of data needed to train the network.
On the other hand, at $y^{+}=100$ a sufficient number of samples becomes a necessary condition to ensure the convergence to an optimal configuration: the loss of the network trained with 50% of the training dataset does not improve after the first 100,000 updates, while with 25% of the original dataset the network exhibits overfitting.
The initialized models are able to provide a comparable accuracy also from the statistical point of view, as reported in table 4. We stress once again that the networks are not explicitly optimized to reduce the error in the statistics and that small variations in these error figures can be ascribed to the stochastic nature of the optimization algorithm. Overall, these results demonstrate the feasibility of knowledge transfer between models at different $Re_{\tau}$: with careful tuning of the hyperparameters it should be possible to substantially reduce the training time, as well as the amount of data needed for training. Although not tested, the transfer between different wall-normal locations described in Guastoni et al. (2019a) is still applicable in this case, thus enabling a more efficient prediction of the flow at different wall-normal locations.
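The initialization strategy used for transfer learning (load previously optimized weights, lower the learning rate, keep every parameter trainable) can be illustrated on a toy problem. The sketch below uses a linear model trained with full-batch gradient descent as a stand-in for the FCN and the Adam optimizer; it is only meant to show the mechanics, and all names and values in it are hypothetical.

```python
import numpy as np


def train_linear_model(weights, inputs, targets, learning_rate, n_updates):
    """Full-batch gradient descent on an MSE loss; all weights stay trainable."""
    weights = weights.copy()  # do not modify the source model in place
    for _ in range(n_updates):
        residual = inputs @ weights - targets
        gradient = 2.0 * inputs.T @ residual / len(targets)
        weights -= learning_rate * gradient
    return weights


def mse(weights, inputs, targets):
    """Mean-squared error of the linear model on the given dataset."""
    return float(np.mean((inputs @ weights - targets) ** 2))


def transfer_and_finetune(source_weights, inputs, targets,
                          learning_rate=0.05, n_updates=5):
    """Initialize from a previously trained model and fine-tune every weight.

    The learning rate is kept lower than the one used with a random
    initialization, so that the optimizer does not move away too quickly
    from the transferred configuration.
    """
    return train_linear_model(source_weights, inputs, targets,
                              learning_rate, n_updates)
```

When the source and target tasks are related, the transferred weights start closer to the optimum, so a given number of updates leaves a lower loss than training from scratch. This mirrors the comparison at a fixed number of weight updates discussed above.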

Conclusions
In this work, we introduced and compared two different models based on fully-convolutional neural networks, for prediction of the velocity fluctuations at a given wall-normal distance, using quantities measured at the wall as inputs. The FCN and FCN-POD models are improved versions of previous architectures, used by Guastoni et al. (2019a) and Güemes et al. (2019), respectively. Both of them are able to provide predictions in very good agreement with the reference data, simulated by means of the pseudo-spectral DNS code SIMSON (Chevalier et al. 2007), up to $y^{+}=50$. Such an agreement is verified by comparing the error in instantaneous predictions, turbulence statistics (namely RMS fluctuations) and the energy content at the different wavelengths (i.e. spectral analysis). Both models show better prediction capabilities than EPOD (which is a linear method) in almost all the wall-normal locations and investigated features, thanks to their ability to predict nonlinear scale interactions. Furthermore, we showed that these architectures can be used at two different friction Reynolds numbers ($Re_{\tau}=180$ and 550) with minimal modifications, providing satisfactory results on both datasets.

Figure 13. Validation and test loss in the FCN prediction at (from left to right, top to bottom): $y^{+}=15$, 30, 50 and 100. Orange represents the models trained with the full dataset and random initialization, grey the models trained with the full dataset and initialized with previously-trained networks, pink and brown represent models initialized with the parameters from the $Re_{\tau}=180$ network, trained with 50% and 25% of the original dataset, respectively.
The two models are designed under the assumption that local information at the wall is sufficient to predict the flow farther away; however, the FCN-POD model partially encodes further physical information of the system by using the spatial modes obtained through POD of the training dataset. On the other hand, features like the periodicity of the flow are enforced in the FCN by exploiting the mathematical characteristics of the model. These architectural differences are associated with performance discrepancies at the tested wall-normal locations: the FCN provides higher accuracy than the FCN-POD model closer to the wall, i.e. up to $y^{+}=30$ at $Re_{\tau}=180$ and up to $y^{+}=50$ at $Re_{\tau}=550$. Farther from the wall, the FCN-POD method produces the most accurate predictions. The choice between these two models is motivated by the application into which the prediction model is integrated.
Despite the encouraging results discussed here, both models can be improved in terms of network architecture and training. An attempt to embed further physical information into the FCN did not result in improved predictions, as reported in §3.1.1. The correct way of incorporating this information to enhance the predictions is an active area of research. As another example, the high-frequency noise in the FCN predictions could be reduced with appropriate filtering, possibly adding a trainable layer to the network to perform this operation. The FCN-POD model has a higher number of hyperparameters to be set, such as the number of predicted temporal modes or the size of the subdomains. A more thorough inspection of the hyperparameter space may provide a significant improvement in the prediction performance. Differently from the FCN, in the FCN-POD model the velocity components are not scaled to have the same magnitude: such a modification could help to predict the wall-normal and spanwise components of the velocity more accurately, even though it would also modify the POD mode sorting because of the different energy norm. Furthermore, the FCN-POD results exhibit a lack of smoothness at the subdomain edges in the flow predictions. Finally, both models are trained to minimize a loss function based on the instantaneous error. Such a function could be modified to improve other physical characteristics of the predicted flow, for example the turbulence statistics and the spectral energy content.
To reduce the training time in view of industrial applications, the implementation of transfer learning was tested for the FCN model. Transfer learning can exploit a network trained at a lower Reynolds number to provide the weight initialization for training at a higher Reynolds number, thus reducing the requirements in terms of training time and data. The results are very encouraging, showing that it is possible to train the network with 50% and even 25% of the original training dataset, obtaining a performance similar to that of the reference model up to $y^{+}=50$.
Once the neural networks are trained, they are computationally cheap to evaluate, and they can become even cheaper by pruning the parts of the network that have a negligible contribution to the final result. Such an operation is not possible a priori, since the training determines how the inputs have to be processed to obtain the output. By reducing the computational cost of the evaluation it is possible to deploy the model using low-powered hardware and/or potentially run it in real time. Thus, the proposed FCN-based methods could be used for non-intrusive sensing of the flow, which is needed for closed-loop control applications. Furthermore, since the FCN models are able to reproduce nonlinear interactions in wall-bounded turbulence, new promising avenues in turbulence research could be opened by the network interpretation (Fan et al. 2020), as shown by Iten et al. (2020), who demonstrated that neural networks can provide relevant physical insights.

Figure 14. Comparison of the wall-normal fluctuation fields at $Re_{\tau}=180$, scaled with the corresponding $v_{\mathrm{RMS}}$, from EPOD (1st row), FCN-POD (2nd row), reference DNS (3rd row) and FCN (4th row). Results at $y^{+}=15$ (1st column), $y^{+}=30$ (2nd column), $y^{+}=50$ (3rd column) and $y^{+}=100$ (4th column).

Figure 15. Comparison of the spanwise fluctuation fields at $Re_{\tau}=180$, scaled with the corresponding $w_{\mathrm{RMS}}$, from EPOD (1st row), FCN-POD (2nd row), reference DNS (3rd row) and FCN (4th row). Results at $y^{+}=15$ (1st column), $y^{+}=30$ (2nd column), $y^{+}=50$ (3rd column) and $y^{+}=100$ (4th column).

Figure 16. Comparison of the wall-normal fluctuation fields at $Re_{\tau}=550$, scaled with the corresponding $v_{\mathrm{RMS}}$, from EPOD (1st row), FCN-POD (2nd row), reference DNS (3rd row) and FCN (4th row). Results at $y^{+}=15$ (1st column), $y^{+}=30$ (2nd column), $y^{+}=50$ (3rd column) and $y^{+}=100$ (4th column).