
Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks

Published online by Cambridge University Press:  13 January 2026

Nima Zafarmomen
Affiliation: Clemson University, USA
Vidya Samadi*
Affiliation: Clemson University, USA
Edoardo Borgomeo
Affiliation: University of Cambridge, UK
*Corresponding author: Vidya Samadi; Email: samadi@clemson.edu

Abstract

Extreme weather events, combined with human-induced factors, such as expanding impervious surfaces and inadequate drainage infrastructure, are driving escalating urban flood risks worldwide. In this study, we present a novel spatiotemporal Long Short-Term Memory (LSTM)-based surrogate of the U.S. Environmental Protection Agency (EPA)’s Storm Water Management Model (SWMM) to predict maximum water depth and inflow at the asset level within urban drainage networks. The high-resolution SWMM model, encompassing the full network of conduits and manholes, was first calibrated and validated using U.S. Geological Survey (USGS) observations. The LSTM surrogate was then trained on data from 5,000 rainfall events across seven average recurrence intervals (ARIs) ranging from 1 to 100 years. The SWMM-LSTM surrogate model consistently achieves high predictive performance for both water depth and inflow, highlighting its robustness across diverse storm scenarios and ARI conditions. Hyperparameter optimization via grid search revealed task-specific configurations: larger hidden layers with moderate dropout improved water depth predictions, while deeper network architectures with minimal dropout optimized inflow forecasts. By providing rapid, computationally efficient predictions without compromising accuracy, the SWMM-LSTM surrogate offers a practical tool for real-time flood risk assessment, scenario evaluation and actionable decision-making in complex urban drainage systems.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Impact statement

The asset-level Storm Water Management Model (SWMM)-calibrated Long Short-Term Memory (LSTM) surrogate model represents a major step forward in computational flood modeling, uniting the physical rigor of hydraulic simulations with the speed and predictive power of deep learning. This hybrid framework overcomes the computational limitations of traditional approaches without compromising accuracy, enabling real-time flood predictions that were previously infeasible. By providing rapid, reliable forecasts, the model empowers emergency responders, urban planners and infrastructure managers to perform swift scenario analyses during extreme weather events, informing decisions and enhancing public safety. As urban populations grow and climate change amplifies the frequency and intensity of extreme rainfall, the demand for resilient flood management systems becomes ever more urgent. This research offers a scalable, adaptable framework suitable for diverse urban environments, supporting the design of smart city infrastructure and promoting climate-resilient communities.

Introduction

In recent years, the frequency and intensity of extreme weather events have escalated, significantly affecting the socioeconomic development of communities (Idowu and Zhou, 2023). Among these hazards, urban flooding is especially pervasive, damaging property, disrupting essential services and impeding residents’ daily mobility and communication (Samadi et al., 2025). Urban flooding events are often exacerbated by anthropogenic factors such as inadequate land-use planning, extensive impervious surface coverage, steep topography and stormwater drainage issues, including insufficient capacity, aging facilities and blockages in existing infrastructure.

To mitigate urban flooding impacts, a range of strategies have been developed, including practical management approaches such as nature-based solutions (e.g., green infrastructure; Guido et al., 2023; Zafarmomen et al., 2025a) and infrastructure upgrades aimed at directly reducing flood risk. Preceding these interventions, modeling approaches provide critical tools for simulating flood dynamics (e.g., real-time simulations), identifying vulnerabilities, mapping inundation extents and informing targeted interventions. In data-scarce or ungauged basins, digital elevation model-based approaches have also been successfully applied to delineate flood-prone areas and rapidly estimate inundation depth using primarily topographic information (Samela et al., 2016; Manfreda and Samela, 2019). Recent work demonstrates that large language models can aid in the interpretation of extreme events, such as flooding, and deliver earlier alerts to communities (Zafarmomen and Samadi, 2025).

Urban hydrologic and hydraulic models serve as essential tools for planning, designing and managing water cycles in drainage systems (Sytsma et al., 2022; Aderyani et al., 2025; Zafarmomen et al., 2025b). Fully distributed models, such as MIKE SHE (Abbott et al., 1986), LISFLOOD-FP (Bates and De Roo, 2000), MIKE FLOOD (Patro et al., 2009) and InfoWorks ICM (Sidek et al., 2021), solve complex differential equations like the Saint-Venant equations to capture flow dynamics across heterogeneous urban landscapes. By discretizing urban areas into computational grids, these models simulate surface runoff, channel flow and inundation patterns with high spatial precision. Their accuracy, however, depends heavily on detailed boundary conditions, including water surface elevation, inflows and outflows and bathymetric data.

Semi-distributed models, such as Storm Water Management Model (SWMM), Hydrologic Engineering Center Hydrologic Modeling System and MOUSE (DHI, 2017), strike a middle ground in complexity (Bennett, 1998; Rossman and Supply, 2006; Gironás et al., 2010). These models divide the urban catchment into sub-catchments and use conceptual or simplified physical methods to transform rainfall into hydrographs, which are then routed through a simplified drainage network. While less computationally demanding than fully distributed models, they capture the essential processes of flood generation and routing. At the simplest level, lumped approaches, such as the rational method or the Soil Conservation Service (SCS) Curve Number method (Paudel et al., 2009), provide rapid estimates of peak discharge and runoff volume, often used in data-scarce contexts or preliminary planning. Overall, semi-distributed models balance the realism of fully distributed methods with the efficiency of lumped ones, making them widely applicable in urban flood risk management.

Recent research has increasingly focused on hybrid approaches that combine physics-based models with computational frameworks to improve predictive accuracy. Similar improvements in predictive skill have been reported in engineering applications using Artificial Neural Network (ANN) models (Mousavi et al., 2025). For instance, Khatooni et al. (2025) coupled SWMM with Hydrologic Engineering Center’s River Analysis System Two-Dimensional (HEC-RAS-2D) to capture hydrologic–hydrodynamic interactions, while Wang et al. (2024) developed the Storm Tide Unit Flood Modeling System (STUFMS), linking SWMM with the Télémac-2D Hydrodynamic Modeling System (TELEMAC-2D) for bidirectional water exchange. These frameworks improved predictions of flood depth and velocity, underscoring the value of hybrid systems for resilience planning. Similarly, Zhao et al. (2024) combined SWMM with LISFLOOD to evaluate the joint effects of extreme rainfall and urbanization on inundation dynamics, and Li et al. (2024) applied LISFLOOD to simulate spatiotemporal flood evolution and assess its impact on population, land use and buildings.

Hydrological modeling has advanced significantly through improved physical models, data assimilation techniques and machine learning (ML) methods (Zafarmomen et al., 2024). Building on these developments, ML and deep learning (DL) have become promising tools that complement traditional physics-based approaches. By leveraging large datasets, ML/DL models improve predictive accuracy and timeliness (Javadi et al., 2025), enabling integration with hydrologic and hydraulic models. For example, Zhao et al. (2023) coupled Long Short-Term Memory (LSTM) with SWMM to enhance runoff predictions at outlet points, while Shao et al. (2024) introduced CRU-Net, a recurrent U-shaped network capable of predicting inundation areas at speeds far surpassing LISFLOOD-FP. More recently, Pan et al. (2025) proposed a coupled SWMM–LSTM framework that incorporates sewer network outputs into surface flood modeling, demonstrating strong performance in predicting node overflow, inundation depth and flooded area.

While these studies advanced LSTM–hydrodynamic coupling, they primarily focused on either grid-based inundation or outlet-level performance. In contrast, this study develops a spatiotemporal LSTM surrogate trained on SWMM outputs to predict event-level maximum water depth and inflow at the node scale. By explicitly integrating hydrologic conditions and network connectivity, the model identifies hydraulic hotspots through inter-event variability, enabling rapid and scalable real-time flood forecasting. The framework captures rainfall–runoff dynamics by combining spatial dependencies across the drainage network with temporal storm variability, delivering robust node-level predictions of (i) maximum water depth (m) and (ii) maximum inflow (m³/s). By reducing computational costs while maintaining high predictive accuracy, this surrogate model offers a practical alternative to traditional hydrologic–hydraulic approaches for operational flood management.

This paper is structured as follows: Section “Methodology” describes the case study, SWMM setup and surrogate model development, including hyperparameter tuning and evaluation criteria. Section “Results” presents the evaluation outcomes and predictive skill across various average recurrence intervals (ARIs). Section “Discussion and conclusion” discusses the effectiveness of the spatiotemporal LSTM surrogate, its implications for urban flood management and avenues for future research.

Methodology

Case study

The Rocky Branch Watershed, a sub-watershed of the Congaree River located in Columbia, South Carolina, USA, is highly susceptible to urban flooding due to its unique topographic and anthropogenic characteristics. The watershed is characterized by steep slopes and a dense storm drainage network, which, combined with extensive urbanization, significantly exacerbates flood risks. Urban development has led to a predominance of impervious surfaces and substantial land-use changes, resulting in reduced infiltration rates, particularly within the central business district (Hung et al., 2020). Ress et al. (2020) demonstrate a strong correlation between runoff coefficients and the percentage of impervious areas, highlighting the critical role of urban land cover in amplifying surface runoff. Despite these challenges, conventional stormwater management measures in the Rocky Branch Watershed remain limited, contributing to persistent flash flooding issues that pose ongoing threats to infrastructure and public safety.

The spatial distribution of flood risk within the watershed is shown in the USGS elevation map (Figure 1), which illustrates the topographic variability and infrastructure layout. The map highlights steep elevation gradients, particularly in the central and eastern portions of the watershed, where dense clusters of rain gauges and pipes indicate a complex drainage system. These areas, encompassing key urban zones such as the University of South Carolina main campus and surrounding neighborhoods, exhibit heightened vulnerability due to the convergence of steep slopes and impervious surfaces. The green and yellow zones on the map signify lower elevations prone to water accumulation, while the red and brown areas denote higher elevations that accelerate runoff toward low-lying regions.

Figure 1. (a) Location of Columbia in South Carolina, USA. (b) Spatial configuration of the stormwater network and topography in the Rocky Branch Watershed, Columbia, South Carolina.

Drainage simulation model

The US EPA’s SWMM is a dynamic hydrologic–hydraulic simulator widely adopted for urban drainage analysis (Rossman, 2010). It couples rainfall–runoff generation with flow routing and water-quality processes across sewer networks (Rossman and Huber, 2016). Urban domains are partitioned into subcatchments connected by nodes and links (e.g., manholes, conduits and channels), enabling simulation of surface runoff and conveyance through pipe systems. Runoff production is represented by a nonlinear reservoir formulation, and the platform supports both event-based and continuous simulations for design, planning and operations (Huber et al., 1988).

In this study, we apply SWMM to a 10.75-km² watershed comprising 2,802 manholes and 2,801 conduits spanning 17.02 km. The watershed was subdivided into multiple drainage catchments using the Thiessen polygon method, which allocates contributing areas to junction nodes based on proximity. This approach is widely used in urban hydrologic modeling to define subcatchments in the absence of high-resolution flow routing data (Dong et al., 2023). Conduits range from 8 to 84 inches in diameter and comprise 14 material types; reported Manning’s roughness values span from 0.009 for polyethylene to 0.027 for corrugated metal. Field-surveyed elevations, maximum depths and inlet/outlet offsets are incorporated to reflect as-built conditions (Table 1). This high-resolution network and its material heterogeneity provide a robust basis for hydraulic performance assessment, calibration, sensitivity analysis and scenario testing under extreme rainfall, supporting reliable urban flood risk evaluation and system planning.
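The proximity rule behind Thiessen polygons can be illustrated with a short sketch: every location is assigned to its nearest junction node, and the contributing area of a node is the total area assigned to it. The code below approximates those areas by counting grid cells. The junction coordinates, domain extent and cell size are hypothetical values chosen for illustration and are not taken from the surveyed Rocky Branch network; the study itself may have used a GIS tool rather than a raster approximation.

```python
import math
from collections import Counter


def nearest_node(point, nodes):
    """Return the id of the junction node closest to `point` (Euclidean distance)."""
    return min(nodes, key=lambda node_id: math.dist(point, nodes[node_id]))


def thiessen_areas(nodes, x_range, y_range, cell_size):
    """Approximate Thiessen polygon areas by assigning grid-cell centres to the nearest node."""
    counts = Counter()
    nx = int(round((x_range[1] - x_range[0]) / cell_size))
    ny = int(round((y_range[1] - y_range[0]) / cell_size))
    for ix in range(nx):
        for iy in range(ny):
            centre = (x_range[0] + (ix + 0.5) * cell_size,
                      y_range[0] + (iy + 0.5) * cell_size)
            counts[nearest_node(centre, nodes)] += 1
    # Each cell contributes cell_size^2 to the node that owns it.
    return {node_id: n * cell_size ** 2 for node_id, n in counts.items()}


# Hypothetical junction coordinates in metres (illustration only).
junctions = {"J1": (25.0, 50.0), "J2": (75.0, 50.0)}
areas = thiessen_areas(junctions, (0.0, 100.0), (0.0, 100.0), cell_size=10.0)
print(areas)
```

With two junctions placed symmetrically in a square domain, each receives half of the area, which is a quick sanity check on the assignment rule.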

Table 1. Summary of Rocky Branch stormwater drainage network characteristics

The graphical user interface for SWMM was originally developed and maintained by the US EPA. In our study, the PySWMM Python library, developed by McDonnell et al. (2020), served as a Python wrapper for the SWMM engine, enabling full execution and fine-grained control of dynamic simulations directly within the Python environment. This integration enables advanced workflows by providing a seamless pathway for implementing ML/DL techniques, thereby strengthening automation, calibration procedures and real-time modeling applications.
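As a rough illustration of how such a wrapper can be used to extract node-level event maxima, the sketch below steps through one simulation with PySWMM and keeps the running maximum of depth and total inflow at each node. It is not the authors' script: the function names and the input file path are hypothetical, and the reported units follow whatever unit system the SWMM input file declares.

```python
def update_maxima(maxima, node_id, depth, inflow):
    """Keep the running event maxima (depth, inflow) for one node."""
    if node_id not in maxima:
        maxima[node_id] = (depth, inflow)
    else:
        prev_depth, prev_inflow = maxima[node_id]
        maxima[node_id] = (max(prev_depth, depth), max(prev_inflow, inflow))
    return maxima


def run_event(inp_path):
    """Run one SWMM simulation and return {node_id: (max_depth, max_inflow)}."""
    # Imported here so update_maxima can be used without PySWMM installed.
    from pyswmm import Simulation, Nodes

    maxima = {}
    with Simulation(inp_path) as sim:
        nodes = Nodes(sim)
        for _ in sim:  # advance the simulation one routing step at a time
            for node in nodes:
                update_maxima(maxima, node.nodeid, node.depth, node.total_inflow)
    return maxima


# Usage (hypothetical file name):
# event_maxima = run_event("rocky_branch_event_0001.inp")
```

Tracking maxima inside the stepping loop avoids storing full time series for every node, which matters when thousands of events are simulated to build a training set.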

This study uses the Saint-Venant equations to simulate unsteady water flow through the pipes and channels of the drainage network. The governing equations are the continuity equation for mass conservation (Equation 1) and the momentum equation (Equation 2) (Lai, 1986). SWMM employs the full version of these equations to model free surface flow dynamics.

(1) $$ \frac{\partial A}{\partial t}+\frac{\partial Q}{\partial x}=0 $$

The term $ \frac{\partial A}{\partial t} $ represents the rate of change of the cross-sectional flow area A with respect to time t, accounting for the temporal variation in water storage. Moreover, $ \frac{\partial Q}{\partial x} $ represents the spatial gradient of the discharge Q along the flow direction x, describing the change in flow rate along the channel or pipe.

(2) $$ \frac{\partial Q}{\partial t}+\frac{\partial \left({Q}^2/A\right)}{\partial x}+ gA\frac{\partial H}{\partial x}+ gA{S}_f+ gA{h}_L=0 $$

where $ A $ is the cross-sectional area, $ Q $ is the flow rate, $ g $ is the gravitational acceleration, $ {S}_f $ represents the friction slope and $ {h}_L $ accounts for energy losses caused by local hydraulic features, such as bends, expansions, contractions or other structures. Additionally, $ H $ denotes the water surface elevation (hydraulic head). It is worth noting that the area $ A $ is a known function of flow depth $ y $ , which in turn can be derived from the head $ H $.

This study employs the Horton infiltration model by applying the classical exponential decay function to simulate surface infiltration (Equation 3).

(3) $$ f(t)={f}_{\infty }+\left({f}_0-{f}_{\infty}\right){e}^{-\alpha t} $$

where $ {f}_0 $ is the initial infiltration rate, $ {f}_{\infty } $ is the minimum (asymptotic) infiltration rate after long wetting and $ \alpha $ is the decay coefficient representing how quickly infiltration decreases.

Cumulative infiltration is updated at each time step as:

(4) $$ F\left(t+\Delta t\right)=F(t)+\hat{f}\Delta t $$

where $ F(t) $ is the cumulative infiltration at time t, and $ \hat{f} $ is the average infiltration rate during the time step. Infiltration capacity recovers during dry periods, which allows variable antecedent conditions to be represented:

(5) $$ {f}_P={f}_{\infty }+\left({f}_0-{f}_{\infty}\right)\;{e}^{-{k}_d\left(t-{t}_w\right)} $$

where $ {f}_P $ is the regenerated infiltration capacity at time t, $ {k}_d $ is the attenuation coefficient for the recovery curve and $ {t}_w $ is the hypothetical time when recovery started.
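A minimal numerical sketch of Equations 3 and 4 is given below. It assumes that rainfall always exceeds the infiltration capacity, so the actual infiltration rate follows the Horton decay curve, and it takes the step-average rate $ \hat{f} $ as the analytic mean of Equation 3 over the time step. Both choices are simplifying assumptions made here; SWMM's internal implementation handles rainfall-limited steps and recovery differently. The parameter values are hypothetical and are not the calibrated values of this study.

```python
import math


def horton_rate(t, f0, f_inf, alpha):
    """Equation 3: infiltration capacity at time t (same time unit as 1/alpha)."""
    return f_inf + (f0 - f_inf) * math.exp(-alpha * t)


def mean_rate(t, dt, f0, f_inf, alpha):
    """Average of Equation 3 over [t, t + dt], from its analytic integral."""
    decay = (math.exp(-alpha * t) - math.exp(-alpha * (t + dt))) / (alpha * dt)
    return f_inf + (f0 - f_inf) * decay


def cumulative_infiltration(t_end, dt, f0, f_inf, alpha):
    """Equation 4 applied step by step from t = 0 to t_end."""
    n_steps = int(round(t_end / dt))
    total = 0.0
    for k in range(n_steps):
        total += mean_rate(k * dt, dt, f0, f_inf, alpha) * dt
    return total


# Hypothetical parameters: rates in mm/h, alpha in 1/h, 15-min steps.
F0, F_INF, ALPHA = 75.0, 10.0, 4.0
print(horton_rate(0.0, F0, F_INF, ALPHA))                    # initial capacity
print(cumulative_infiltration(2.0, 0.25, F0, F_INF, ALPHA))  # mm after 2 h
```

Because the step average is exact, the stepwise sum reproduces the closed-form integral of Equation 3 regardless of the step size, which makes the sketch a convenient check for coarser approximations of $ \hat{f} $.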

The hydraulic simulation of the drainage system’s pipe network is conducted using the dynamic wave method, while surface runoff is computed as the rainfall excess over the infiltration capacity:

(6) $$ {R}_s={\int}_{i>{f}_p}\left(i-{f}_p\right) dt $$

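where $ {R}_s $ is the surface runoff (mm), $ i $ is the rainfall intensity (mm/h) and $ {f}_p $ is the infiltration capacity (mm/h); the integral is taken over the periods in which rainfall intensity exceeds the infiltration capacity.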
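In discrete form, Equation 6 amounts to summing the rainfall excess over the time steps in which intensity exceeds infiltration capacity. The sketch below shows this with made-up intensity and capacity series on a 15-min step; the function name and the numbers are illustrative assumptions, not values from the study.

```python
def surface_runoff(rain_mm_per_h, capacity_mm_per_h, dt_h):
    """Discrete Equation 6: sum (i - f_p) * dt over the steps where i > f_p."""
    return sum(
        (i - fp) * dt_h
        for i, fp in zip(rain_mm_per_h, capacity_mm_per_h)
        if i > fp
    )


# Hypothetical 15-min series (mm/h): rainfall intensity and infiltration capacity.
rain = [5.0, 40.0, 60.0, 20.0]
capacity = [50.0, 30.0, 20.0, 15.0]
runoff_mm = surface_runoff(rain, capacity, dt_h=0.25)
print(runoff_mm)
```

In the first step rainfall is below capacity and contributes nothing; the remaining steps contribute excesses of 10, 40 and 5 mm/h, each over a quarter of an hour.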

Prior work in the Rocky Branch Watershed by Tanim et al. (2024) and Morsy et al. (2016) applied SWMM with sub-catchment outlets positioned at fixed 50-m intervals. In contrast, we explicitly represent the drainage system at the asset level (manholes, pipes and inlets) using surveyed elevations and offsets, thereby achieving higher hydraulic resolution. Because a hydrologic–hydraulic model requires calibration, we treated the parameter ranges reported in those studies (e.g., imperviousness, Manning’s n value and Horton infiltration parameters) as priors and refined them via targeted trial-and-error. Calibration focused on reproducing the observed hydrograph at the USGS gauging station 02169505 on Rocky Branch, with emphasis on peak magnitude and timing. Rainfall data for the calibration events were obtained from the nearby USGS gauge 021695045. Final parameter sets were selected to optimize the evaluation criteria (see Section “Evaluation criteria”).

The model was calibrated using two distinct storm events from the 2015–2024 USGS record, selected for their hydroclimatic diversity. The event of 13–14 December 2019 was a long-duration winter frontal storm (99.3 mm over 29.25 h), while the 25 July 2024 event was a short-duration summer convective storm (13.5 mm over 1.25 h). Both events ranked in the top 10% of intensity for the period, ensuring the calibration captured a wide range of hydrologic responses.

DL model

LSTM

The LSTM unit uses three gates (input, forget and output) to regulate how information enters, is retained in and is read out from a long-term cell state. This design preserves long-range temporal dependencies while updating a hidden state that reflects recent information, making LSTMs well-suited to rainfall–runoff sequences. At each time step, the gates (sigmoid activations) and a tanh-based proposed update adjust the cell state and produce the hidden state, enabling stable learning by mitigating the vanishing-gradient problem (Hochreiter and Schmidhuber, 1997). LSTMs have shown promising improvement in forecasting flood-related time series, accurately modeling complex rainfall–runoff relationships (Saberian et al., 2024, 2026).

In this study, the LSTM receives time-varying rainfall and related predictors and predicts event-level maxima (water depth and inflow). Full gate equations and training details are provided in the Supplementary Material (Supplementary Equations 1–5).
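For orientation, the gate mechanics summarized above are commonly written as follows. This is the textbook LSTM formulation and is not a transcription of Supplementary Equations 1–5, whose notation and activation choices may differ. The symbols are assumptions introduced here: $ {x}_t $ is the predictor vector at time step t, $ {h}_t $ the hidden state, $ {c}_t $ the cell state, $ W $, $ R $ and $ b $ the learnable input weights, recurrent weights and biases, $ \sigma $ the logistic sigmoid and $ \odot $ the element-wise product. The gate symbols $ {i}_t $, $ {f}_t $ and $ {o}_t $ are unrelated to the rainfall intensity $ i $ and infiltration rate $ f $ used in Equations 3–6.

$$ \begin{aligned} {i}_t&=\sigma \left({W}_i{x}_t+{R}_i{h}_{t-1}+{b}_i\right)\\ {f}_t&=\sigma \left({W}_f{x}_t+{R}_f{h}_{t-1}+{b}_f\right)\\ {o}_t&=\sigma \left({W}_o{x}_t+{R}_o{h}_{t-1}+{b}_o\right)\\ {\tilde{c}}_t&=\tanh \left({W}_c{x}_t+{R}_c{h}_{t-1}+{b}_c\right)\\ {c}_t&={f}_t\odot {c}_{t-1}+{i}_t\odot {\tilde{c}}_t\\ {h}_t&={o}_t\odot \tanh \left({c}_t\right)\end{aligned} $$

The additive update of $ {c}_t $ is what lets information persist across many time steps: when $ {f}_t $ stays close to one, the previous cell state passes through almost unchanged.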

To train the DL model, we generated a dataset of 5,000 synthetic design hyetographs using a statistically based design storm method grounded in National Oceanic and Atmospheric Administration (NOAA) Atlas 14 precipitation frequency estimates. For each of the seven ARIs (1, 2, 5, 10, 25, 50 and 100 years), event total depths were sampled from the 90% confidence intervals of Atlas 14 across 10 storm durations (5, 10, 15 and 30 min; 1, 2, 3, 6, 12 and 24 h), as provided in Supplementary Table S1, to account for variability. The temporal distribution within each storm was synthesized using a stochastic approach based on Huff’s quartile curves (Huff, 1967), with sampling weights of 0.40 (first quartile), 0.25 (second), 0.20 (third) and 0.15 (fourth) to ensure diversity in storm patterns (Yin et al., 2016). This was further perturbed with Dirichlet noise (Anello and Cordaro, 2007) to create variability in peak timing and intensity while strictly conserving the total depth. All hyetographs were generated at a 15-min resolution and assumed to be spatially uniform across the watershed, a reasonable simplification for its scale (i.e., 10.75 km²), consistent with prior local studies (Morsy et al., 2016; Tanim et al., 2024). For each event, the input sequence $ {U}_t $ comprises rainfall intensity through time, and the network predicts, for every drainage node, the event-level maxima: (i) maximum water depth (m) and (ii) maximum inflow (m³/s).
From a hydrologic perspective, maximum water depth characterizes the storage response within local depressions or manholes, reflecting the balance of inflow, conveyance and infiltration at the point scale; it is a critical indicator of localized flood hazard, directly linked to surface inundation and infrastructure exposure. Maximum inflow represents the cumulative runoff contributions from upstream subcatchments, capturing the rainfall–runoff transformation over the drainage area; this integrated upstream response governs conveyance capacity and potential surcharging within the drainage network. Targets are obtained from the calibrated hydrologic–hydraulic model. An LSTM architecture is adopted because it captures long-range temporal dependencies and nonlinear threshold responses inherent to rainfall–runoff transformation and hydraulic routing, enabling accurate, computationally efficient node-level forecasts. This configuration is designed to support both offline and real-time flood prediction in urban drainage networks. In offline mode, the framework functions as a high-speed emulator of SWMM, enabling rapid vulnerability assessments. In real-time mode, the model can be driven by real-time rainfall data to deliver continuously updated forecasts of maximum hydraulic responses during ongoing storm events. The LSTM dynamically encodes cumulative rainfall through its memory state to represent hydrologic dependencies and storage evolution. As new observations become available, predictions are incrementally refined, supporting near-real-time, node-level flood risk assessment and enabling proactive urban flood management.
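The hyetograph generation described above (quartile sampling, a base temporal pattern and a depth-conserving Dirichlet perturbation) can be sketched as follows. Several parts are assumptions made for illustration only: the `base_pattern` function is a simple peaked stand-in and does not reproduce Huff's published curves, the `concentration` parameter controlling the noise strength is hypothetical, and the total depth is passed in directly rather than sampled from Atlas 14 confidence intervals. The Dirichlet draw is built from normalized gamma variates so that only the standard library is needed.

```python
import random

QUARTILE_WEIGHTS = [0.40, 0.25, 0.20, 0.15]  # sampling weights reported in the text


def base_pattern(n_steps, quartile):
    """Hypothetical peaked pattern centred in the chosen quartile (stand-in for a Huff curve)."""
    peak = (quartile + 0.5) / 4.0 * n_steps
    weights = [1.0 / (1.0 + abs((k + 0.5) - peak)) for k in range(n_steps)]
    total = sum(weights)
    return [w / total for w in weights]


def dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def synthetic_hyetograph(total_depth_mm, duration_h, rng, dt_h=0.25, concentration=50.0):
    """Return rainfall depth per time step (mm); the steps sum to total_depth_mm."""
    n_steps = max(1, int(round(duration_h / dt_h)))
    quartile = rng.choices(range(4), weights=QUARTILE_WEIGHTS)[0]
    base = base_pattern(n_steps, quartile)
    # Dirichlet fractions sum to one, so the event total depth is conserved.
    fractions = dirichlet([concentration * b for b in base], rng)
    return [total_depth_mm * f for f in fractions]


rng = random.Random(42)
hyeto = synthetic_hyetograph(total_depth_mm=80.0, duration_h=6.0, rng=rng)
print(len(hyeto), round(sum(hyeto), 6))
```

A larger `concentration` keeps each sample close to the base pattern, while a smaller value produces more erratic peak timing and intensity; in both cases the fractions sum to one, so the event depth is unchanged.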

Figure 2 illustrates the hybrid workflow. SWMM generates node-level targets from rainfall events, which are then used to train an LSTM that maps predictor sequences to maximum water depth and inflow for real-time prediction.

Figure 2. SWMM–LSTM workflow for surrogate modeling. (a) SWMM process: rainfall is processed through infiltration, surface runoff and dynamic-wave routing to produce node-level maxima (water depth and inflow), which are used as training targets; (b) Repository of event datasets containing rainfall time series and the corresponding SWMM targets; (c) Input sequence: r time steps by c predictors assembled per node event; and (d) LSTM surrogate used to predict node-level depth and inflow in real time, while SWMM is used only offline to generate training labels.

Hyperparameter tuning

The optimal configuration of hyperparameters was identified through a Grid Search methodology, a systematic and exhaustive approach that explores a comprehensive search space encompassing critical parameters of LSTM, such as the number of hidden units, the number of layers and dropout rates. This strategy evaluates every possible combination of these hyperparameters, rendering it an effective technique when the parameter set and their respective value ranges remain computationally tractable. Nevertheless, the method’s computational demands increase significantly with an expanding search space, posing challenges particularly when training sophisticated DL architectures.

Prior to hyperparameter tuning, the dataset of 5,000 events was partitioned into training (70%), validation (15%) and test (15%) subsets. The split was stratified by ARI category to ensure a balanced representation of storm intensities across all subsets. The training set was used for model learning, the validation set for guiding the hyperparameter optimization and early stopping, and the held-out test set was reserved for the final performance evaluation reported in Section “Results.”
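A stratified 70/15/15 split of this kind can be sketched with the standard library alone: events are grouped by ARI class, shuffled within each class and then cut at the same proportions. The function name, the random seed and the example of 100 events per ARI class are assumptions for illustration; the study's 5,000 events are not necessarily evenly distributed across classes.

```python
import random
from collections import defaultdict


def stratified_split(event_aris, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split event indices into train/validation/test while preserving the ARI mix."""
    rng = random.Random(seed)
    by_ari = defaultdict(list)
    for idx, ari in enumerate(event_aris):
        by_ari[ari].append(idx)

    train, val, test = [], [], []
    for ari in sorted(by_ari):
        indices = by_ari[ari]
        rng.shuffle(indices)
        n_train = int(round(fractions[0] * len(indices)))
        n_val = int(round(fractions[1] * len(indices)))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical labels: 100 events for each of the seven ARI classes.
aris = [ari for ari in (1, 2, 5, 10, 25, 50, 100) for _ in range(100)]
train_idx, val_idx, test_idx = stratified_split(aris)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting within each class, rather than over the pooled events, guarantees that rare high-ARI storms appear in all three subsets.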

The architectural framework of the model featured an LSTM network, configured with a range of hidden units (16–256), layer depths (1–4) and dropout rates (0–0.4). Each model instantiation was optimized using the Adam optimizer, set with a learning rate of 0.001 to ensure convergence.
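The exhaustive search over these ranges can be sketched as a loop over the Cartesian product of candidate values. The specific grid values below are assumptions chosen to lie within the reported ranges (the text gives only the ranges), and `toy_score` is a placeholder: in practice the scoring function would train an LSTM with the given configuration and return a validation metric such as mean NSE.

```python
from itertools import product

# Assumed candidate values within the reported ranges (16–256, 1–4, 0–0.4).
HIDDEN_SIZES = [16, 32, 64, 128, 256]
NUM_LAYERS = [1, 2, 3, 4]
DROPOUTS = [0.0, 0.2, 0.4]


def grid_search(score_fn):
    """Evaluate every combination and return (best_config, best_score)."""
    best_config, best_score = None, float("-inf")
    for hidden, layers, dropout in product(HIDDEN_SIZES, NUM_LAYERS, DROPOUTS):
        config = {"hidden_size": hidden, "num_layers": layers, "dropout": dropout}
        score = score_fn(config)  # e.g., mean validation NSE after training
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Placeholder objective; a real score_fn would train and validate a model."""
    return -abs(config["hidden_size"] - 64) - config["num_layers"] - config["dropout"]


best, score = grid_search(toy_score)
print(best, score)
```

With this assumed grid there are 5 × 4 × 3 = 60 configurations per target variable, which illustrates how quickly the cost grows as more values or hyperparameters are added.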

The rectified linear unit activation function was implemented within the hidden LSTM layers to introduce nonlinearity and mitigate vanishing gradient issues, while a linear activation function was adopted at the output layer.

Evaluation criteria

The predictive accuracy and robustness of both the calibrated hydrologic–hydraulic model and the LSTM-based surrogate models were assessed using statistical performance metrics. These criteria quantify different aspects of model performance, including overall goodness of fit, error magnitude and bias, and are particularly relevant for maximum water depth and maximum inflow across the drainage network nodes.

The Nash–Sutcliffe Efficiency (NSE) is a dimensionless metric that quantifies the predictive accuracy of the model by comparing the variance of the residuals to the variance of the observed data (Nash and Sutcliffe, 1970).

(7) $$ NSE=1-\frac{\sum_{i=1}^n{\left({Q}_{obs,i}-{Q}_{sim,i}\right)}^2}{\sum_{i=1}^n{\left({Q}_{obs,i}-\overline{Q_{obs}}\right)}^2} $$

where $ {Q}_{obs,i} $ and $ {Q}_{sim,i} $ are the observed and simulated values at time step i, respectively, $ \overline{Q_{obs}} $ is the mean of the observed values and n is the number of observations.

The root mean square error (RMSE) measures the average magnitude of the prediction errors in the same units as the target variable (Chai and Draxler, 2014). It emphasizes larger errors due to the squaring of residuals.

(8) $$ RMSE=\sqrt{\frac{\sum_{i=1}^n{\left({Q}_{obs,i}-{Q}_{sim,i}\right)}^2}{n}} $$

The mean absolute error (MAE) provides a measure of the average absolute difference between observed and simulated values (Willmott and Matsuura, 2005).

(9) $$ MAE=\frac{\sum_{i=1}^n\left|{Q}_{obs,i}-{Q}_{sim,i}\right|}{n} $$
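Equations 7–9 translate directly into a few lines of code. The sketch below is a plain-Python illustration with made-up observed and simulated values; the function names are chosen here for convenience. Note that the NSE is undefined when all observed values are identical, because the denominator of Equation 7 is then zero.

```python
import math


def nse(obs, sim):
    """Equation 7: Nash–Sutcliffe Efficiency."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def rmse(obs, sim):
    """Equation 8: root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Equation 9: mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


# Made-up values for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]
print(nse(obs, sim), rmse(obs, sim), mae(obs, sim))
```

In this example a single error of 1 unit gives an RMSE (0.5) twice as large as the MAE (0.25), which illustrates how squaring the residuals weights large errors more heavily.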

To evaluate the storm-to-storm variability at each node, independent of the absolute magnitude of the maxima, we compute a normalized row-wise standard deviation across events. Let $ \left\{{x}_{j1},{x}_{j2},\dots, {x}_{jm}\right\} $ denote the event-wise maxima at node j, where m represents the total number of events (Han et al., 2006). A 0–1 normalization is first applied within node j to standardize the data. The normalized value $ {\tilde{x}}_{jk} $ for event k at node j is defined as:

(10) $$ {\tilde{x}}_{jk}=\begin{cases}\dfrac{x_{jk}-{\min}_k{x}_{jk}}{{\max}_k{x}_{jk}-{\min}_k{x}_{jk}}, & {\max}_k{x}_{jk}>{\min}_k{x}_{jk}\\ {}0, & \text{otherwise}\end{cases} $$

This normalization ensures that the values are scaled to the interval [0,1], preserving relative differences while removing the influence of absolute magnitude.
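The normalization and the row-wise standard deviation can be sketched as follows, with one row of event maxima per node. The node labels and values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state whether the population or sample form is used.

```python
from statistics import pstdev


def normalize_row(values):
    """Equation 10: min–max normalization of one node's event maxima."""
    lo, hi = min(values), max(values)
    if hi > lo:
        return [(v - lo) / (hi - lo) for v in values]
    return [0.0 for _ in values]  # constant row: no storm-to-storm variability


def inter_event_variability(values):
    """Standard deviation of the normalized maxima across events (population form)."""
    return pstdev(normalize_row(values))


# Hypothetical event maxima (e.g., water depth in m) for two nodes.
maxima = {"N1": [0.2, 0.4, 0.6], "N2": [1.5, 1.5, 1.5]}
variability = {node: inter_event_variability(row) for node, row in maxima.items()}
print(variability)
```

Node N2 responds identically to every storm and therefore scores zero, whereas N1 spans its full normalized range and scores higher, regardless of how large its depths are in absolute terms.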

To quantify how sensitive each junction is to different storm configurations of the same statistical intensity, we define a normalized inter-event variability metric (Equation 10), calculated as the standard deviation of predicted maxima (water depth or inflow) across all events within each ARI class, normalized by the corresponding median. It effectively captures the fluctuation in flood severity due to differences in temporal rainfall structure, antecedent loading and internal hydraulic dynamics.

By combining these complementary evaluation metrics, the study ensured a rigorous assessment of model performance, capturing both general fit quality and the specific ability to reproduce critical flood characteristics relevant to urban flood risk management.

Results

Calibration performance

Figure 3 illustrates the calibration performance of the hydrologic–hydraulic model, examined using two events spanning 13–14 December 2019 and 25 July 2024 within the Rocky Branch Watershed. Both the observed rainfall and USGS discharge data used for calibration were at a 15-min temporal resolution. The figure presents a time series comparison: rainfall is shown in the upper panel as an inverted hyetograph (mm), while the lower panel depicts the temporal variation of observed and simulated discharges (m³/s). SWMM performance during calibration was assessed using three widely adopted statistical indicators: NSE, RMSE and MAE. The calibrated model achieved NSE values of 0.790 and 0.820 for the two events, respectively, indicating a high degree of agreement between observed and simulated flows. Corresponding RMSE values were 0.532 and 0.496 m³/s, while MAE values were 0.372 and 0.357 m³/s, respectively. These results demonstrate the model’s capability to reproduce both the magnitude and timing of peak flows with limited deviation from observations.

Figure 3. Time series comparison for (a) 13–14 December 2019 and (b) 25 July 2024, showing the rainfall hyetograph (mm) and observed versus simulated discharge hydrographs (Cubic Meters per Second (CMS)), representing the SWMM calibration events.

Visual inspection of the hydrographs showed that the model reproduced the rising and recession limbs of the flood events, along with the peak discharge magnitudes. Minor discrepancies were observed in the timing and magnitude of secondary peaks, likely stemming from uncertainties in rainfall distribution, spatial variability in infiltration and the simplified representation of drainage network features within the model. Because the calibration covers both a winter and a summer storm, the results also suggest that the model maintains comparable skill under different seasonal regimes and storm types. Overall, the calibration performance indicated that the model was well-suited for simulating flood dynamics in the study area and provided a reliable foundation for subsequent scenario analysis.

Hyperparameter tuning

Figure 4 presents the mean NSE values for predicting maximum water depth and maximum inflow across different LSTM hyperparameter configurations. For maximum water depth prediction, the highest mean NSE value of 0.91 was achieved with a hidden size of 256, a single layer and a dropout rate of 0.4. This trend suggested that a larger hidden size combined with moderate dropout enhanced model robustness and predictive accuracy, likely by mitigating overfitting while capturing complex patterns in the data. Notably, smaller hidden sizes (e.g., 16) yielded lower mean NSE values (0.65), underscoring the importance of sufficient model capacity. In contrast, for maximum inflow prediction, the best NSE of 0.96 was observed with a hidden size of 256, two layers and no dropout. This suggested that additional network depth with minimal regularization was optimal for this task, potentially because the inherent variability in inflow data requires less aggressive dropout to retain critical features. Interestingly, maximum inflow prediction demonstrated higher peak NSE values than water depth prediction across all hidden sizes, indicating greater learnability for inflow patterns under optimal configurations. The mean NSE values across hidden sizes declined with increasing size (e.g., 0.81 for size 16 vs. 0.67 for size 256), suggesting that smaller to moderate hidden sizes sufficed when paired with appropriate layering, as seen with the best configuration at hidden size 64 (mean NSE 0.78).

Figure 4. Heatmap comparison of mean NSE values for predicting maximum water depth (top panel) and maximum inflow (bottom panel) using various LSTM hyperparameter configurations, including hidden size, number of layers and dropout rate.

The optimal configuration differed between water depth and inflow prediction, suggesting that the two tasks required separately tuned architectures rather than a single shared model. Hyperparameter tuning was therefore task-specific: water depth predictions benefited from larger hidden sizes and higher dropout rates, while inflow predictions favored two-layer architectures with minimal dropout.
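A task-specific grid search of this kind can be sketched as an exhaustive loop over hidden size, number of layers and dropout rate, run once per target variable. The sketch below is an assumption-laden illustration: `train_and_score` is a hypothetical callable that would train an LSTM with the given settings and return its mean validation NSE, and the grid values shown are only those named in the text, not necessarily the full grid used in the study.

```python
from itertools import product

def grid_search(train_and_score, hidden_sizes, num_layers, dropouts):
    """Evaluate every configuration and return the one with the highest score.

    train_and_score : hypothetical callable(hidden_size, num_layers, dropout)
                      returning a scalar score such as mean validation NSE.
    """
    results = []
    for hidden, layers, drop in product(hidden_sizes, num_layers, dropouts):
        score = train_and_score(hidden_size=hidden, num_layers=layers, dropout=drop)
        config = {"hidden_size": hidden, "num_layers": layers, "dropout": drop}
        results.append((config, score))
    best_config, best_score = max(results, key=lambda item: item[1])
    return best_config, best_score, results

# Illustrative grid only; the values are those mentioned in the text.
GRID = {"hidden_sizes": [16, 64, 256], "num_layers": [1, 2], "dropouts": [0.0, 0.4]}
```

Calling `grid_search` separately with a depth scorer and an inflow scorer yields one best configuration per task, which mirrors the finding that the two targets prefer different architectures.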

Figure 5 further illustrates the learning curves of the selected optimal LSTM configurations over 200 training epochs. For maximum inflow prediction, both the loss and the error metric converged rapidly to low values and remained stable throughout training. In contrast, maximum water depth prediction exhibited slower convergence and periodic spikes in error, indicating higher sensitivity to training fluctuations and possibly a more complex learning task.

Figure 5. (a) Training loss (MSE) and (b) training MAE over 200 training epochs for LSTM-based predictions of maximum water depth and maximum inflow.

These results underscore the importance of task-specific hyperparameter optimization in LSTM-based flood drainage prediction models. While both prediction tasks benefited from careful tuning, the distinct error convergence patterns and variability highlight that a single, universal architecture may not yield optimal performance for multiple hydrologic and hydraulic outputs. Instead, tailoring network depth, hidden size and regularization parameters to the physical characteristics and variability of each target variable can substantially enhance predictive accuracy and stability.

Modeling performance

Table 2 summarizes the predictive performance of the optimal LSTM configurations for each target variable – maximum water depth and maximum inflow – across seven ARI categories (1, 2, 5, 10, 25, 50 and 100 years). For maximum water depth, NSE values range from 0.892 (ARI 1) to 0.915 (ARI 5 and 10), indicating strong modeling performance. Corresponding MAE values range between 0.253 and 0.301 m, while RMSE values remain low at 0.082–0.103 m. For maximum inflow, NSE values range from 0.91 (ARI 5) to 0.97 (ARI 2), with MAE values between 0.012 and 0.019 m³/s and RMSE values between 0.022 and 0.028 m³/s. Results show consistently high predictive skill for both variables, with inflow predictions generally achieving higher NSE values than water depth predictions; because the two variables are reported in different units (m and m³/s), their MAE and RMSE magnitudes are not directly comparable. Quantitative performance metrics for the training and validation datasets are provided in Supplementary Table S2 of the Supplementary Material, indicating consistent model behavior and satisfactory generalization for both maximum water depth and maximum inflow.

Table 2. Performance of the LSTM-based surrogate model on the independent test set, evaluated across different ARIs

Note: The dataset of 5,000 synthetic storm events was randomly partitioned into 70% training, 15% validation and 15% testing subsets. Results reflect test-only metrics (NSE, MAE and RMSE) computed between SWMM-simulated and LSTM-predicted maxima across all nodes. RMSE and MAE values are reported in meters (for water depth) and m³/s (for inflow).
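The random 70/15/15 partition described in the note can be sketched as a shuffled index split at the event level. The function name `split_events` and the fixed seed are assumptions for illustration; the study does not state how the shuffle was seeded or whether the split was stratified by ARI.

```python
import numpy as np

def split_events(n_events, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly partition event indices into train, validation and test sets."""
    rng = np.random.default_rng(seed)       # assumption: fixed seed for repeatability
    order = rng.permutation(n_events)       # shuffled event indices
    n_train = int(round(train_frac * n_events))
    n_val = int(round(val_frac * n_events))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]          # remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_events(5000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting by event rather than by node keeps every junction of a given storm in the same subset, so the test metrics reflect storms the model has never seen.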

Figure 6 illustrates the distribution of test-set RMSE for maximum water depth and maximum inflow across ARI categories. For water depth, RMSE variability increased slightly at higher ARIs, with ARI100 showing the largest spread and several high outliers, suggesting increased prediction difficulty under extreme events. For inflow, median and mean RMSE remained consistently low across ARIs, though ARI100 again showed a marginal increase in variability. These results indicated that inflow predictions were generally more stable across return periods, whereas water depth prediction accuracy was more sensitive to extreme rainfall intensities.

Figure 6. Boxplots of RMSE for LSTM-based predictions of maximum water depth (top) and maximum inflow (bottom) across seven ARI categories. Medians (green lines), means (green diamonds), interquartile ranges (boxes), data within 1.5× IQR (interquartile range; whiskers) and outliers (circles) are shown.

Figure 7 presents the spatial distribution of normalized standard deviation values for maximum water depth and maximum inflow across the junctions of the network. Panel (a) shows the variability in maximum water depth, with values ranging from 0.14 to 0.49. These values represent the standard deviation of the normalized peak depths computed across all 5,000 synthetic rainfall events, providing a spatial measure of inter-event variability. Junction J937 is highlighted as the location with the greatest variability in water depth and was selected for further analysis. Panel (b) shows the corresponding distribution for normalized inflow variability, with Junction J3088 highlighted for its elevated variability in inflow. Figure 8 further examines these two nodes, which exhibit the highest inter-event variability: for J937 and J3088, it displays the distribution of maximum water depth and maximum inflow, respectively, across multiple realizations for seven ARI categories ranging from ARI1 to ARI100.

Figure 7. Spatial distribution of inter-event variability across junctions for (a) normalized standard deviation of maximum water depth and (b) normalized standard deviation of maximum inflow, computed over all 5,000 synthetic rainfall events. Junctions J937 and J3088 are also illustrated.

Figure 8. Statistical analysis of maximum water depth and inflow across ARIs at junctions J937 and J3088, showing all data points, mean, median, 5th percentile and 95th percentile across ARI categories (1, 2, 5, 10, 25, 50 and 100 years).

At J937, water depth remained relatively stable across lower ARIs (i.e., ARI1–ARI10), but a distinct surge in variability emerged from ARI25 onward. The 95th percentile rose sharply, indicating outlier events with extreme water levels, likely driven by local topographic effects or system bottlenecks. Despite this upper-bound sensitivity, both median and mean remained comparatively stable, suggesting that only a limited number of scenarios produced disproportionately high depths. At J3088, inflow exhibited a more gradual and consistent increase with ARI. Both the mean and median rose consistently while the 5th–95th percentile range widened, reflecting heightened uncertainty under rarer, more extreme events. In contrast to J937, no sharp transition was observed, suggesting that inflow dynamics at this location were primarily governed by cumulative upstream contributions rather than localized nonlinear responses.
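The per-ARI statistics discussed for J937 and J3088 (mean, median, 5th and 95th percentiles) can be computed for a single junction as sketched below. The function name `summarize_by_ari` and the input layout are assumptions for illustration and are not taken from the study's code.

```python
import numpy as np

def summarize_by_ari(maxima, ari_labels):
    """Summarize one junction's event maxima within each ARI class.

    maxima     : array of shape (n_events,), maxima at a single junction
    ari_labels : array of shape (n_events,), ARI class of each event
    """
    maxima = np.asarray(maxima, dtype=float)
    ari_labels = np.asarray(ari_labels)
    summary = {}
    for ari in np.unique(ari_labels):
        values = maxima[ari_labels == ari]
        summary[int(ari)] = {
            "mean": float(values.mean()),
            "median": float(np.median(values)),
            "p05": float(np.percentile(values, 5)),
            "p95": float(np.percentile(values, 95)),
        }
    return summary
```

A 95th percentile that rises much faster than the median from one ARI class to the next is the signature described for J937, whereas a mean and median that climb together with a widening 5th–95th range corresponds to the behavior at J3088.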

Together, these figures underscore the spatial and temporal heterogeneity of hydraulic response within the network. Identifying nodes with elevated variability is critical for prioritizing flood-mitigation interventions and designing adaptive capacity within urban drainage systems.

Discussion and conclusion

This study demonstrates the effectiveness of the proposed spatiotemporal LSTM-based surrogate model in predicting event-level maxima of both water depth and inflow. To position the proposed surrogate within the broader literature on hybrid physics–ML frameworks, we compared its characteristics with those of two recent LSTM–hydrodynamic models (Zhao et al., 2023; Pan et al., 2025). Zhao et al. (2023) developed an LSTM–SWMM hybrid for outlet discharge prediction and reported NSE values of 0.969 for the hybrid model and 0.954 for an LSTM, with forecasting performance decreasing as the lead time increased. Pan et al. (2025) used an LSTM trained on One-Dimensional–Two-Dimensional (1D–2D) hydrodynamic simulations to predict time series of inundation depth at a small number of flood-prone locations, achieving R² > 0.90, MAE ≤ 0.069 m and RMSE ≤ 0.077 m. In comparison, our SWMM-calibrated surrogate predicts event-level maxima of both depth and inflow at 2,802 junctions across seven ARIs, with NSE = 0.89–0.92 for maximum depth and 0.91–0.97 for maximum inflow, and low error statistics (depth RMSE ≤ 0.103 m, inflow RMSE ≤ 0.028 m³/s). Our framework extends previous hybrid approaches by (i) operating at the asset level across the entire drainage network rather than at a single outlet or a handful of inundation points, and (ii) providing comparable or better predictive skill while delivering event-scale forecasts suitable for real-time applications.

Model performance was rigorously evaluated across multiple ARIs, achieving consistently high predictive skill, with NSE values of up to 0.97 for inflow and 0.92 for water depth, alongside low error statistics. These results confirm the ability of recurrent deep networks to capture the nonlinear rainfall–runoff transformations inherent in urban hydrologic–hydraulic systems when trained on sufficiently diverse datasets. Importantly, the model's behavior is consistent with the well-documented challenges of reproducing threshold behaviors in hydraulic systems, including surcharging, backwater effects and localized bottlenecks. These nonlinear responses are often triggered once system capacity is exceeded and tend to escalate rapidly under extreme storm events due to overwhelming upstream runoff. Such behaviors highlight the intrinsic difficulty of capturing abrupt state shifts in flow regimes, where small variations in boundary conditions can lead to disproportionately large impacts on water depth and inflow. The model's partial sensitivity to these dynamics underscores both the promise and the limitations of data-driven approaches in representing complex hydraulic transitions that are highly dependent on network topology and localized storage–conveyance interactions.

Furthermore, the surrogate exhibited systematically higher accuracy for inflow than for water depth, a physically consistent outcome since local depth maxima are more sensitive to pressurization, minor–major system exchanges and localized energy losses at manholes and structures, whereas inflow reflects more spatially integrated upstream contributions and thus exhibits smoother dynamics. Building upon the work of Pan et al. (2025) in using LSTM surrogates to pinpoint inundation hotspots, our approach advances the field by generating complete area-wide forecasts, a critical capability for hydrological assessment and policy development. Additionally, studies such as Roy et al. (2025) highlighted the effectiveness of seq2seq LSTM models in capturing rapid, nonlinear flood responses in urban areas, supporting the computational efficiency and accuracy of our proposed approach for real-time applications. Chang et al. (2025) proposed a hybrid convolutional neural network–backpropagation neural network (CNN–BPNN) model that couples spatial feature extraction with temporal learning and achieved high accuracy for 10-min urban water-level forecasts (sewer: R² = 0.97, RMSE = 0.08 m; internal/external: R² = 0.99, RMSE = 0.06 m), complementing our LSTM-based framework for real-time flood control.

The surrogate model required careful hyperparameter tuning to achieve optimal predictive accuracy. Results from the grid search revealed that larger hidden sizes combined with moderate dropout improved LSTM-based water depth predictions, whereas two-layer architectures with minimal dropout were more effective for inflow predictions. This outcome reflects the distinct error landscapes of the two target variables: water depth estimation demands stronger regularization to prevent overfitting local nonlinearities, while inflow prediction benefits from greater network capacity and remains stable without dropout. These insights underscore the importance of task-specific hyperparameter optimization when applying DL to complex urban drainage systems.

Although this study focuses on deterministic performance metrics, the surrogate model inevitably inherits uncertainty from both rainfall inputs and SWMM calibration parameters. A full probabilistic treatment is beyond the scope of this work, but future extensions could benefit from incorporating stochastic surrogate models such as LSTM ensembles or Monte Carlo dropout (Tabas and Samadi, 2022) to generate predictive variance and uncertainty bounds suitable for operational decision-making.

For the Rocky Branch watershed, a full SWMM dynamic-wave simulation required an average of ~118 min per storm, whereas the trained LSTM surrogate produced predictions for all 5,000 events in just 16 min. This drastic reduction in computational runtime is what enables real-time, rapid decisions during extreme events.
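The scale of this reduction can be made explicit with a short back-of-the-envelope calculation. It assumes that the 5,000 SWMM runs would be executed sequentially at the reported average of about 118 min each, and it excludes the one-off cost of generating training data and training the LSTM; both assumptions are ours and are not stated in the text.

```python
# Figures reported in the text.
SWMM_MINUTES_PER_STORM = 118      # approximate average per dynamic-wave run
N_EVENTS = 5000
SURROGATE_MINUTES_TOTAL = 16      # surrogate inference for all events

# Assumption: SWMM runs are executed one after another on a single machine.
swmm_total_minutes = SWMM_MINUTES_PER_STORM * N_EVENTS
swmm_total_days = swmm_total_minutes / (60 * 24)
speedup = swmm_total_minutes / SURROGATE_MINUTES_TOTAL
surrogate_seconds_per_event = SURROGATE_MINUTES_TOTAL * 60 / N_EVENTS

print(f"SWMM total: {swmm_total_minutes} min (~{swmm_total_days:.0f} days)")
print(f"Speedup: ~{speedup:,.0f}x; surrogate: {surrogate_seconds_per_event:.3f} s per event")
```

Under these assumptions, simulating all events with SWMM would take roughly 590,000 min (about 410 days), so the surrogate is on the order of 37,000 times faster at inference, or about 0.19 s per event.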

It should be noted that although the LSTM surrogate can ingest rainfall information sequentially and update its predictions as new data become available, the model is trained using complete storm sequences at the event level. Accordingly, predictions produced during real-time storms represent inferences based on partial inputs rather than explicitly trained instantaneous forecasts. Despite this distinction, the approach remains valuable for real-time applications by providing rapidly updated estimates of expected peak conditions.

The proposed LSTM-based surrogate model offers two distinct advantages for urban flood management. First, because the surrogate is trained on outputs from an asset-level calibrated SWMM, it provides a computationally efficient pathway for real-time decision support across the entire drainage network. Second, the surrogate enables rapid, high-resolution predictions of maximum water depth and inflow, facilitating efficient scenario analysis and transparent identification of flood hotspots for stakeholders. Unlike traditional SWMM simulations, which are computationally intensive and often impractical for real-time applications because runtimes can exceed hours for large networks, our approach leverages precomputed SWMM output stored in metafiles (e.g., Excel files). These files archive time-series rainfall data alongside the corresponding maximum depth and inflow results for each event, enabling the LSTM model to train on these datasets and deliver forecasts with exceptional speed. This efficiency is critical for emergency response during extreme events, empowering communities and authorities with timely, actionable insights to mitigate flood impacts. By combining strong predictive accuracy with orders-of-magnitude faster computation, the developed surrogate offers a practical, scalable tool that supports flood mitigation, urban planning and community resilience under increasingly extreme rainfall events.

Open peer review

To view the open peer review materials for this article, please visit http://doi.org/10.1017/wat.2026.10013.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/wat.2026.10013.

Data availability statement

The discharge data were obtained from the USGS gauge Rocky Branch at Pickens St. in Columbia, SC (Station 02169505; https://waterdata.usgs.gov/monitoring-location/USGS-02169505/). Rainfall data were obtained from https://waterdata.usgs.gov/monitoring-location/USGS-021695045, and drainage network layers were provided by the City of Columbia’s GIS portal (https://gis.columbiasc.gov) upon request for academic research.

Author contribution

Nima Zafarmomen: Writing – original draft, visualization, validation, software, methodology, formal analysis and conceptualization.

Vidya Samadi: Writing – review and editing, validation, supervision, project administration, methodology, funding acquisition and conceptualization.

Edoardo Borgomeo: Writing – review and editing, validation, supervision and methodology.

Financial support

This work is supported by the US National Science Foundation (NSF) Directorate for Engineering under grant CBET 2429082. Clemson University (USA) is acknowledged for its generous allotment of computing time on the Palmetto cluster. EB is supported by INT/UCam Early Career Support Scheme (Award number G122390).

Competing interests

The authors declare none.

References

Abbott, MB, Bathurst, JC, Cunge, JA, O’Connell, PE and Rasmussen, J (1986) An introduction to the European hydrological system – Systeme Hydrologique Europeen, “SHE”, 1: History and philosophy of a physically-based, distributed modelling system. Journal of Hydrology 87(1), 45–59. https://doi.org/10.1016/0022-1694(86)90114-9.
Aderyani, FR, Jafarzadegan, K and Moradkhani, H (2025) A surrogate machine learning modeling approach for enhancing the efficiency of urban flood modeling at metropolitan scales. Sustainable Cities and Society 123, 106277. https://doi.org/10.1016/j.scs.2025.106277.
Anello, G and Cordaro, G (2007) Perturbation from Dirichlet problem involving oscillating nonlinearities. Journal of Differential Equations 234(1), 80–90. https://doi.org/10.1016/j.jde.2006.11.011.
Bates, PD and De Roo, APJ (2000) A simple raster-based model for flood inundation simulation. Journal of Hydrology 236(1–2), 54–77. https://doi.org/10.1016/S0022-1694(00)00278-X.
Bennett, TH (1998) Development and Application of a Continuous Soil Moisture Accounting Algorithm for the Hydrologic Engineering Center Hydrologic Modeling System (HEC-HMS). Davis: University of California.
Chai, T and Draxler, RR (2014) Root mean square error (RMSE) or mean absolute error (MAE)?–arguments against avoiding RMSE in the literature. Geoscientific Model Development 7(3), 1247–1250. https://doi.org/10.5194/gmd-7-1247-2014.
Chang, L-C, Yang, M-T and Chang, F-J (2025) Flood resilience through hybrid deep learning: Advanced forecasting for Taipei’s urban drainage system. Journal of Environmental Management 379, 124835. https://doi.org/10.1016/j.jenvman.2025.124835.
DHI (2017) MOUSE Pipe Flow Reference Manual, Hørsholm, Denmark: DHI.
Dong, Z, Bain, DJ, Akcakaya, M and Ng, CA (2023) Evaluating the Thiessen polygon approach for efficient parameterization of urban stormwater models. Environmental Science and Pollution Research 30(11), 30295–30307. https://doi.org/10.1007/s11356-022-24162-7.
Gironás, J, Roesner, LA, Rossman, LA and Davis, J (2010) A new applications manual for the storm water management model (SWMM). Environmental Modelling & Software 25(6), 813–814. https://doi.org/10.1016/j.envsoft.2009.11.009.
Guido, BI, Popescu, I, Samadi, V and Bhattacharya, B (2023) An integrated modelling approach to evaluate the impacts of nature-based solutions of flood mitigation across a small watershed in the Southeast United States. Natural Hazards and Earth System Sciences Discussions 2023, 1–30. https://doi.org/10.5194/nhess-23-2663-2023.
Han, J, Kamber, M and Mining, D (2006) Concepts and techniques. Morgan Kaufmann 340(1), 94104103205.
Hochreiter, S and Schmidhuber, J (1997) Long short-term memory. Neural Computation 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
Huber, WC, Dickinson, RE, Barnwell, TO and Branch, A (1988) Storm Water Management Model; Version 4, United States: Environmental Protection Agency.
Huff, FA (1967) Time distribution of rainfall in heavy storms. Water Resources Research 3(4), 1007–1019. https://doi.org/10.1029/WR003i004p01007.
Hung, C-LJ, James, LA, Carbone, GJ and Williams, JM (2020) Impacts of combined land-use and climate change on streamflow in two nested catchments in the southeastern United States. Ecological Engineering 143, 105665. https://doi.org/10.1016/j.ecoleng.2019.105665.
Idowu, D and Zhou, W (2023) Global megacities and frequent floods: Correlation between urban expansion patterns and urban flood hazards. Sustainability 15(3), 2514. https://doi.org/10.3390/su15032514.
Javadi, M, Jalilehvand, M, Alizadeh, H and Zafarmomen, N (2025) Analysis of historical global warming impacts on climatological trends for the partially gauged Hirmand river basin based on multiple data products and bias correction methods. Journal of Hydrology: Regional Studies 62, 102886. https://doi.org/10.1016/j.ejrh.2025.102886.
Khatooni, K, Hooshyaripor, F, MalekMohammadi, B and Noori, R (2025) A new approach for urban flood risk assessment using coupled SWMM–HEC-RAS-2D model. Journal of Environmental Management 374, 123849. https://doi.org/10.1016/j.jenvman.2024.123849.
Lai, C (1986) Numerical modeling of unsteady open-channel flow. In Advances in Hydroscience (Vol. 14, pp. 161–333). Amsterdam, Netherlands: Elsevier.
Li, J, Yuan, L, Hu, Y, Xu, A, Cheng, Z, Song, Z, Zhang, X, Zhu, W, Shang, W, Liu, J and Liu, M (2024) Flood simulation using LISFLOOD and inundation effects: A case study of typhoon in-fa in Shanghai. Science of the Total Environment 954, 176372. https://doi.org/10.1016/j.scitotenv.2024.176372.
Manfreda, S and Samela, C (2019) A digital elevation model based method for a rapid estimation of flood inundation depth. Journal of Flood Risk Management 12, e12541. https://doi.org/10.1111/jfr3.12541.
McDonnell, BE, Ratliff, K, Tryby, ME, Wu, JJX and Mullapudi, A (2020) PySWMM: The python interface to stormwater management model (SWMM). Journal of Open Source Software 5(52), 1. https://doi.org/10.21105/joss.02292.
Morsy, MM, Goodall, JL, Shatnawi, FM and Meadows, ME (2016) Distributed stormwater controls for flood mitigation within urbanized watersheds: Case study of rocky Branch watershed in Columbia, South Carolina. Journal of Hydrologic Engineering 21(11), 05016025. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001430.
Mousavi, M, Bengar, HA, Mousavi, F, Mahdavinia, P and Bengar, MA (2025) Interlayer bond strength prediction of 3D printable concrete using artificial neural network: Experimental and modeling study. Structure 71, 108147. https://doi.org/10.1016/j.istruc.2024.108147.
Nash, JE and Sutcliffe, JV (1970) River flow forecasting through conceptual models part I – A discussion of principles. Journal of Hydrology 10(3), 282–290. https://doi.org/10.1016/0022-1694(70)90255-6.
Pan, X, Hou, J, Gao, X, Chen, G, Li, D, Imran, M, Li, X, Yang, N, Ma, M and Zhou, X (2025) LSTM model-based rapid prediction method of urban inundation with rainfall time series. Water Resources Management 39(2), 661–688.
Patro, S, Chatterjee, C, Mohanty, S, Singh, R and Raghuwanshi, NS (2009) Flood inundation modeling using MIKE FLOOD and remote sensing data. Journal of the Indian Society of Remote Sensing 37, 107–118. https://doi.org/10.1007/s12524-009-0002-1.
Paudel, M, Nelson, EJ and Scharffenberg, W (2009) Comparison of lumped and quasi-distributed Clark runoff models using the SCS curve number equation. Journal of Hydrologic Engineering 14(10), 1098–1106. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000100.
Ress, LD, Hung, CJ and James, LA (2020) Impacts of urban drainage systems on stormwater hydrology: Rocky Branch watershed, Columbia, South Carolina. Journal of Flood Risk Management 13(3), e12643. https://doi.org/10.1111/jfr3.12643.
Rossman, LA (2010) Storm Water Management Model User’s Manual, Version 5.0, Vol. 276. Cincinnati, OH: National Risk Management Research Laboratory, Office of Research and Development.
Rossman, LA and Huber, W (2016) Storm Water Management Model Reference Manual Volume III–Water Quality. Washington, DC: US EPA Office of Research and Development EPA/600/R-16/093.
Rossman, LA and Supply, W (2006) Storm Water Management Model, Quality Assurance Report: Dynamic Wave Flow Routing. Washington, DC: US Environmental Protection Agency, Office of Research and Development.
Roy, B, Goodall, JL, McSpadden, D, Goldenberg, S and Schram, M (2025) Forecasting multi-step-ahead street-scale nuisance flooding using a seq2seq LSTM surrogate model for real-time application in a coastal-Urban City. Journal of Hydrology 656, 132697. https://doi.org/10.1016/j.jhydrol.2025.132697.
Saberian, M, Samadi, V and Popescu, I (2024) Probabilistic hierarchical interpolation and interpretable configuration for flood prediction. Hydrology and Earth System Sciences Discussions 2024, 1–41. https://doi.org/10.5194/hess-2024-261.
Saberian, M, Zafarmomen, N, Panthi, K and Samadi, V (2026) Unraveling the power of neural networks for flood prediction across complex hydrological systems. Geohorizons 1(1), gh2025–4. https://doi.org/10.1144/gh2025-4.
Samadi, V, Fowler, HJ, Lamond, J, Wagener, T, Brunner, M, Gourley, J, Moradkhani, H, Popescu, I, Wasko, C and Wright, D (2025) The needs, challenges, and priorities for advancing global flood research. Wiley Interdisciplinary Reviews: Water 12(3), e70026. https://doi.org/10.1002/wat2.70026.
Samela, C, Manfreda, S, De Paola, F, Giugni, M, Sole, A and Fiorentino, M (2016) DEM-based approaches for the delineation of flood-prone areas in an ungauged basin in Africa. Journal of Hydrologic Engineering 21(2), 06015010. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001272.
Shao, Y, Chen, J, Zhang, T, Yu, T and Chu, S (2024) Advancing rapid urban flood prediction: A spatiotemporal deep learning approach with uneven rainfall and attention mechanism. Journal of Hydroinformatics 26(6), 1409–1424. https://doi.org/10.2166/hydro.2024.024.
Sidek, LM, Jaafar, AS, Majid, WHAWA, Basri, H, Marufuzzaman, M, Fared, MM and Moon, WC (2021) High-resolution hydrological-hydraulic modeling of urban floods using InfoWorks ICM. Sustainability 13(18), 10259. https://doi.org/10.3390/su131810259.
Sytsma, A, Crompton, O, Panos, C, Thompson, S and Mathias Kondolf, G (2022) Quantifying the uncertainty created by non-transferable model calibrations across climate and land cover scenarios: A case study with SWMM. Water Resources Research 58(2), e2021WR031603. https://doi.org/10.1029/2021WR031603.
Tabas, SS and Samadi, S (2022) Variational Bayesian dropout with a Gaussian prior for recurrent neural networks application in rainfall–runoff modeling. Environmental Research Letters 17(6), 065012. https://doi.org/10.1088/1748-9326/ac7247.
Tanim, AH, Smith-Lewis, C, Downey, ARJ, Imran, J and Goharian, E (2024) Bayes_Opt-SWMM: A gaussian process-based Bayesian optimization tool for real-time flood modeling with SWMM. Environmental Modelling & Software, 106122. https://doi.org/10.1016/j.envsoft.2024.106122.
Wang, Z, Chen, Y, Zeng, Z, Chen, X, Li, X, Jiang, X and Lai, C (2024) A tight coupling model for urban flood simulation based on SWMM and TELEMAC-2D and the uncertainty analysis. Sustainable Cities and Society 114, 105794. https://doi.org/10.1016/j.scs.2024.105794.
Willmott, CJ and Matsuura, K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research 30(1), 79–82. https://doi.org/10.3354/cr030079.
Yin, S, Xie, Y, Nearing, MA, Guo, W and Zhu, Z (2016) Intra-storm temporal patterns of rainfall in China using Huff curves. Transactions of the ASABE 59(6), 1619–1632. https://doi.org/10.13031/trans.59.11010.
Zafarmomen, N, Alizadeh, H, Bayat, M, Ehtiat, M and Moradkhani, H (2024) Assimilation of sentinel-based leaf area index for modeling surface-ground water interactions in irrigation districts. Water Resources Research 60(10), e2023WR036080. https://doi.org/10.1029/2023WR036080.
Zafarmomen, N and Samadi, V (2025) Can large language models effectively reason about adverse weather conditions? Environmental Modelling & Software 188, 106421. https://doi.org/10.1016/j.envsoft.2025.106421.
Zafarmomen, N, Samadi, V and Lunt, S (2025a) Nature-based solutions for urban stormwater management (Publication No. 1109). Clemson Cooperative Extension, Land-Grant Press by Clemson Extension. https://open.clemson.edu/forest_wildlife/54.
Zafarmomen, N, Samadi, V and Borgomeo, E (2025b) NeuralSWWM python package: A hybrid hydrologic-machine learning algorithm for stormwater management. In AGU25.
Zhao, Z, Huo, A, Liu, Q, Yang, L, Luo, C, Ahmed, A and Elbeltagi, A (2024) Assessment of urban inundation and prediction of combined flood disaster in the middle reaches of Yellow river basin under extreme precipitation. Journal of Hydrology 640, 131707. https://doi.org/10.1016/j.jhydrol.2024.131707.
Zhao, C, Liu, C, Li, W, Tang, Y, Yang, F, Xu, Y, Quan, L and Hu, C (2023) Simulation of urban flood process based on a hybrid LSTM-SWMM model. Water Resources Management 37(13), 5171–5187. https://doi.org/10.1007/s11269-023-03600-2.

Figure 1. (a) Location of Columbia in South Carolina, USA. (b) Spatial configuration of the stormwater network and topography in the Rocky Branch Watershed, Columbia, South Carolina.

Table 1. Summary of Rocky Branch stormwater drainage network characteristics

Figure 2. SWMM–LSTM workflow for surrogate modeling. (a) SWMM process: rainfall is processed through infiltration, surface runoff, and dynamic-wave routing to produce node-level maxima (water depth and inflow), which are used as training targets; (b) Repository of event datasets containing rainfall time series and the corresponding SWMM targets; (c) Input sequence: r time steps by c predictors assembled per node event; and (d) LSTM surrogate applied to predict node-level depth and inflow in real time, while SWMM is used only offline to generate training labels.

Figure 3. Time series comparison for (a) 13–14 December 2019 and (b) 25 July 2024, showing the rainfall hyetograph (mm) and observed versus simulated discharge hydrographs (cubic meters per second, CMS), representing the SWMM calibration events.

Figure 4. Heatmap comparison of mean NSE values for predicting maximum water depth (top panel) and maximum inflow (bottom panel) using various LSTM hyperparameter configurations, including hidden size, number of layers and dropout rate.

Figure 5. (a) Training loss (MSE) and (b) training MAE over 200 training epochs for LSTM-based predictions of maximum water depth and maximum inflow.

Table 2. Performance of the LSTM-based surrogate model on the independent test set, evaluated across different ARIs

Figure 6. Boxplots of RMSE for LSTM-based predictions of maximum water depth (top) and maximum inflow (bottom) across seven ARI categories. Medians (green lines), means (green diamonds), interquartile ranges (boxes), data within 1.5× IQR (interquartile range; whiskers) and outliers (circles) are shown.

Figure 7. Spatial distribution of inter-event variability across junctions for (a) normalized standard deviation of maximum water depth and (b) normalized standard deviation of maximum inflow, computed over all 5,000 synthetic rainfall events. Junctions J937 and J3088 are also illustrated.

Figure 8. Statistical analysis of maximum water depth and inflow across ARIs at junctions J937 and J3088, showing all data points, mean, median, 5th percentile and 95th percentile across ARI categories (1, 2, 5, 10, 25, 50 and 100 years).

Supplementary material: File

Zafarmomen et al. supplementary material

File 23.9 KB

Author comment: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R0/PR1

Comments

9/27/2025

Dear Prof. Fenner,

Editor in Chief of Cambridge Prisms: Water

On behalf of my coauthor, I wish to request your consideration of our research article titled “Spatiotemporal SWMM-LSTM Surrogate Modeling for Efficient Node-Level Water Depth and Inflow Prediction in Urban Drainage Networks” for publication in Cambridge Prisms: Water. This work is the result of a US National Science Foundation-funded project.

This study presents a novel hybrid modeling framework that integrates high-resolution, asset-level calibration of the EPA’s SWMM with spatiotemporal Long Short-Term Memory (LSTM) networks. We believe this work aligns closely with the journal’s focus on innovative water research and management strategies, particularly in urban flood risk modeling and smart water systems. Our study provides a scalable, adaptable framework that can inform resilient urban planning and climate-responsive infrastructure development.

We confirm that this manuscript has not been published elsewhere and is not under consideration by any other journal. All authors have approved the submission and declare no competing interests. Thank you very much for your consideration. We look forward to your feedback and hope that our work will contribute to advancing research in urban water management.

Sincerely,

Vidya Samadi, Ph.D., M.ASCE.

Assistant Professor & Director of Clemson Hydroinformatics Research Group,

Affiliate Faculty, Artificial Intelligence Research Institute for Science and Engineering (AIRISE), School of Computing Clemson University, SC, USA.

Review: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

This study proposes a hybrid spatiotemporal SWMM-LSTM framework that integrates the U.S. EPA’s Storm Water Management Model (SWMM) with a Long Short-Term Memory (LSTM) neural network for rapid and accurate flood prediction at the node (asset) level within urban drainage systems.

The model is applied to the Rocky Branch Watershed in Columbia, South Carolina—a flood-prone urban basin. A high-resolution SWMM model was calibrated using USGS observations and then used to generate synthetic rainfall-response datasets (5,000 events across seven ARIs from 1–100 years). The LSTM surrogate predicts maximum water depth and maximum inflow with high fidelity to SWMM results.

Results show:

NSE up to 0.97 for inflow and 0.92 for water depth.

RMSE values between 0.02–0.03 m³/s (inflow) and 0.08–0.10 m (depth).

Identification of hydraulically sensitive nodes through inter-event variability.

The framework substantially reduces computation time while maintaining high accuracy, supporting real-time decision-making in flood management.

The paper makes a significant contribution by moving beyond outlet-based predictions to node-level flood metrics. The coupling of a physically-based model with a deep learning surrogate is timely and well aligned with current hydroinformatics trends.

However, the manuscript would benefit from a more explicit quantitative comparison with previous hybrid frameworks (e.g., Zhao et al., 2023; Wu et al., 2025). Presenting performance and runtime differences in a summary table would clearly position the novelty and efficiency gains.

While model validation is solid, uncertainty quantification is limited. The study could briefly discuss:

How uncertainty in rainfall inputs or SWMM calibration parameters propagates to LSTM outputs.

Possible use of ensemble LSTM or dropout variance as uncertainty estimators.

Even a conceptual paragraph would strengthen the study’s rigor and transparency.
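
As a minimal sketch of the dropout-variance idea suggested above, the snippet below repeats stochastic forward passes and reports the mean and spread of the predictions. It uses a toy linear model rather than the study's LSTM; the weights, features, dropout rate and function names are hypothetical and only illustrate the mechanics.

```python
import random
import statistics

def predict_with_dropout(weights, features, drop_rate, rng):
    """One stochastic forward pass of a toy linear model with inverted dropout."""
    keep = 1.0 - drop_rate
    total = 0.0
    for w, x in zip(weights, features):
        if rng.random() < keep:        # keep this input with probability `keep`
            total += w * x / keep      # rescale so the expected output is unchanged
    return total

def dropout_uncertainty(weights, features, drop_rate=0.2, passes=500, seed=0):
    """Mean and standard deviation over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [predict_with_dropout(weights, features, drop_rate, rng)
               for _ in range(passes)]
    return statistics.mean(samples), statistics.stdev(samples)

# Hypothetical weights and rainfall-derived features (illustrative values only).
weights = [0.02, 0.05, 0.01]
features = [12.0, 30.0, 8.0]
mean_depth, spread = dropout_uncertainty(weights, features)
print(f"prediction: {mean_depth:.2f} +/- {spread:.2f}")
```

The same aggregation applies to an ensemble: replace the repeated dropout passes with one prediction per independently trained model and report the mean and standard deviation across members.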

The paper frequently highlights the model’s “rapid computation,” but no runtime benchmarks are provided.

A concise comparison such as

“SWMM simulation = 3 hours per storm vs. LSTM = 0.5 seconds per storm”

would reinforce claims of real-time capability.
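
A runtime comparison of this kind could be produced with a small timing harness such as the sketch below. The two functions are placeholders standing in for a full SWMM run and a surrogate forward pass; they are not the study's models, and the resulting numbers carry no meaning beyond demonstrating the measurement.

```python
import time

def time_call(fn, *args, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Placeholder for an expensive physically based simulation (assumption).
def run_physical_model(rainfall):
    return sum(r * r for r in rainfall for _ in range(200))

# Placeholder for a cheap surrogate forward pass (assumption).
def run_surrogate(rainfall):
    return 0.1 * sum(rainfall)

rainfall = [0.5] * 2000
t_physical = time_call(run_physical_model, rainfall)
t_surrogate = time_call(run_surrogate, rainfall)
speedup = t_physical / max(t_surrogate, 1e-12)  # guard against a zero timing
print(f"physical: {t_physical:.4f} s, surrogate: {t_surrogate:.6f} s, "
      f"speedup: {speedup:.0f}x")
```

Taking the best of several repeats reduces the influence of background load; reporting the hardware used alongside the timings would make the comparison reproducible.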

The generation of the 5,000 synthetic hyetographs is central to training. The authors mention they were “benchmarked against historical data,” but the method of synthesis (e.g., design storms, stochastic generator, scaling from observed events) is unclear.

Clarifying this process—or providing a short description of the generator and its validation—would improve reproducibility.

Figures are generally informative but can be enhanced:

Figures 4–6: Add clearer legends and unit labels; ensure consistent color scales for depth vs. inflow.

Figures 7–8: Consider a larger font and emphasize node IDs (J937, J3088).

These minor adjustments will increase readability for interdisciplinary audiences.

Location: Comment / Suggestion

p. 10 (§2.2): “SWMM employes” → “SWMM employs.”

Eq. 6: Replace “Rs is surface runoff” with “Rₛ denotes surface runoff.”

Fig. 2 caption: “which use as training targets” → “which are used as training targets.”

p. 23: Add y-axis labels (“RMSE (m)” and “RMSE (CMS)”).

Throughout: Ensure consistent units (m³/s, m) and use SI spacing conventions.

References: Verify that all 2025 citations are in-press or available online; include DOIs where possible.

The manuscript is scientifically sound, well-written, and contributes valuable insights into physics-informed deep learning for hydrology. Required revisions are limited to improving contextual comparisons, figure clarity, and adding brief discussions of uncertainty and computational performance. Once addressed, the paper is fully suitable for publication in Cambridge Prisms: Water.

Please consider these references for improving the paper:

Samela, C., Manfreda, S., De Paola, F., Giugni, M., Sole, A., & Fiorentino, M. (2016). DEM-based approaches for the delineation of flood-prone areas in an ungauged basin in Africa. Journal of Hydrologic Engineering, 21(2), 06015010.

Manfreda, S., Samela, C., Gioia, A., Consoli, G. G., Iacobellis, V., Giuzio, L., & De Paola, F. (2019). A digital elevation model based method for a rapid estimation of flood inundation depth. Journal of Flood Risk Management, 12(S1), e12541.

Review: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R0/PR3

Conflict of interest statement

Reviewer declares none.

Comments

The manuscript “WAT-2025-0029” presents a study that develops an LSTM-based surrogate model trained on SWMM-generated data to predict maximum water depth and inflow across an urban drainage network. While the topic is relevant and timely, leveraging data-driven models to enhance urban hydrodynamic simulations, several aspects of the methodology, model structure, and data description remain ambiguous or insufficiently justified. The following comments and questions are intended to improve the clarity, reproducibility, and methodological soundness of the study.

Major Comments:

1- The description of the surrogate model (Section 2.3.1 and Figure 2) is somewhat unclear regarding the exact input–output configuration and its operational intent. From the current explanation, it seems that the LSTM model receives rainfall time series as input and predicts the maximum water depth and maximum inflow for each drainage node or junction. If this interpretation is correct, it implies that during real-time operation, once a rainfall event begins, the model could be driven by a sequence of observed or forecast rainfall to predict the evolving or final (maximum) hydraulic states. However, this functionality is not explicitly discussed or demonstrated in the manuscript.

Please clarify the following points:

• Are the LSTM inputs the full rainfall time series of each synthetic event, and are the outputs the event-level maximum for each node?

• If so, how does the model generalize to unseen rainfall events in real time? Does it require the full event to be known a priori, or can it produce predictions incrementally as rainfall data arrive (which would justify using a temporal model like LSTM)?

• If the model only predicts event-level maxima (one value per node per event), then using an LSTM (a sequence model) might not be necessary, since no sequential output or evolving state is involved. In that case, a simpler regression or feed-forward neural network could suffice.

More broadly, discuss the practical real-time applicability of the surrogate: whether it can be used operationally during storms, or if it only serves as an offline emulator of SWMM outputs for scenario screening. This distinction is critical for understanding the value and novelty of the proposed hybrid framework.

2- The manuscript mentions that 5,000 synthetic rainfall events were generated across seven Average Recurrence Intervals (ARIs) and benchmarked against historical observations (Section 2.3.1). However, the method used to produce these synthetic hyetographs is not described in sufficient detail to ensure reproducibility or to assess the representativeness of the training data. What approach or model was used to generate synthetic rainfall events? (e.g., design storm method based on IDF curves, stochastic rainfall generator, scaling of observed events, or other approaches). What storm durations, temporal distributions, and total depths correspond to each ARI category? Were the hyetographs spatially uniform or spatially distributed across the watershed? How was “benchmarking against historical observations” quantitatively performed? Since the surrogate model’s predictive performance and generalizability depend heavily on the diversity and realism of these rainfall events, a more transparent explanation of the synthetic data generation procedure is essential.

3- The manuscript does not clearly describe how the dataset was partitioned into training, validation, and test subsets, nor does it report the model’s performance for each set. While Section 2.3.2 discusses hyperparameter tuning and Figure 5 shows training loss curves, there is no quantitative information on validation or testing outcomes. Table 2 summarizes performance across ARIs but does not specify whether these results correspond to the training or testing phase. How were the 5,000 rainfall events divided among training, validation, and testing subsets? Were the events split randomly or stratified by ARI categories? What performance metrics (NSE, RMSE, MAE) were achieved on the independent test set? Was any cross-validation or k-fold strategy used to ensure robustness and avoid overfitting?
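
For illustration, a split stratified by ARI category, as asked about above, could look like the following sketch. The 70/15/15 fractions, the 700-event toy dataset and the function name are assumptions for the example and are not taken from the manuscript.

```python
import random
from collections import defaultdict

def stratified_split(event_aris, fractions=(0.7, 0.15, 0.15), seed=42):
    """Split event indices into train/val/test, preserving ARI proportions."""
    rng = random.Random(seed)
    by_ari = defaultdict(list)
    for idx, ari in enumerate(event_aris):
        by_ari[ari].append(idx)
    train, val, test = [], [], []
    for ari in sorted(by_ari):
        indices = by_ari[ari]
        rng.shuffle(indices)                      # randomize within each ARI class
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]         # remainder goes to the test set
    return train, val, test

aris = [1, 2, 5, 10, 25, 50, 100]
event_aris = [aris[i % 7] for i in range(700)]    # hypothetical: 100 events per ARI
train, val, test = stratified_split(event_aris)
print(len(train), len(val), len(test))
```

Splitting within each ARI class keeps rare, high-magnitude events represented in all three subsets, which a purely random split does not guarantee.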

4- The concept of “inter-event variability” is highlighted as a key contribution, but the analysis is limited. While Equation 10 and Figures 7–8 show normalized variability, the discussion does not explain the physical meaning, drivers, or practical implications. Please elaborate on how inter-event variability identifies hydraulic hotspots and how it can inform flood management decisions.
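
To make the quantity under discussion concrete, the sketch below computes a normalized standard deviation of event maxima per node and ranks nodes by it. Equation 10 is not reproduced here, so normalizing by the per-node mean is an assumption, and the junction names and depth values are hypothetical.

```python
import statistics

def normalized_std(maxima_by_node):
    """Standard deviation across events divided by the mean, per node."""
    result = {}
    for node, values in maxima_by_node.items():
        mean = statistics.mean(values)
        result[node] = statistics.pstdev(values) / mean if mean > 0 else 0.0
    return result

# Hypothetical event maxima of water depth (m) at two junctions.
maxima_by_node = {
    "J_a": [0.50, 0.52, 0.49, 0.51],   # nearly constant response across events
    "J_b": [0.10, 0.60, 0.25, 1.05],   # strongly event-dependent response
}
variability = normalized_std(maxima_by_node)
ranked = sorted(variability, key=variability.get, reverse=True)
print(ranked, {k: round(v, 3) for k, v in variability.items()})
```

Under this definition, a node with high normalized variability responds very differently from storm to storm, which is one way such a metric could flag candidate hotspots for closer inspection.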

5- The manuscript inconsistently describes the relationship between SWMM and the LSTM model. In some sections (e.g., Abstract and Impact Statement), the framework is described as an integration of SWMM and LSTM, suggesting dynamic coupling. In other sections (e.g., Methodology, Figure 2), the LSTM is presented as a surrogate model, trained on SWMM outputs to emulate its behavior. Please clarify whether the LSTM operates in tandem with SWMM (i.e., a coupled hybrid model) or functions as an independent surrogate that replaces SWMM after training.

6- The calibration section (Section 3.1 and Figure 3) lacks sufficient information regarding the rainfall events used. The manuscript mentions two events (12–13 December 2019 and 25 July 2024) but does not explain why these specific storms were chosen, whether they represent extreme floods, typical events, or seasonal contrasts. Please provide the selection criteria and basic statistics for each event (total rainfall, duration, and peak intensity) to justify their representativeness for calibration.

In addition, the rainfall hyetograph in Figure 3 appears visually overlapping and difficult to interpret. Clarifying the temporal resolution of both rainfall and discharge data (e.g., 15-min, hourly) and improving the figure’s readability (thinner bars or separated panels) would enhance interpretability.

Minor Comments:

1- The manuscript does not provide sufficient information on the 5,000 synthetic rainfall events used to train the surrogate model. While seven ARI categories are mentioned, there is no table or figure showing their corresponding rainfall duration, total depth, or temporal patterns. Without this information, it is difficult to evaluate whether the generated hyetographs realistically represent the hydrologic variability of observed storms.

2- The manuscript lacks sufficient detail regarding the data used for both model calibration and training the surrogate model. While the USGS gauge 02169505 is mentioned for calibration and the City of Columbia GIS portal is acknowledged, the sources and processing of rainfall and network data are not clearly explained.

3- The case study description refers to the Rocky Branch Watershed as a single catchment, yet Section 2.2 indicates that the SWMM setup includes 2,802 manholes and 2,801 conduits, implying a subdivision into multiple subcatchments. However, the manuscript does not clarify how these subcatchments were defined or how rainfall inputs were assigned across them. Were rainfall inputs spatially uniform for the entire watershed or applied separately to each subcatchment? How many subcatchments were modeled, and what were their average sizes or key characteristics? How does this spatial structure translate into the LSTM model inputs and outputs (e.g., one rainfall input for all nodes or distinct rainfall–response pairs per subcatchment)?

4- On page 9 (Lines 164–168), the manuscript discusses the PySWMM library and its capability to run and control SWMM within a Python environment. However, it is not clear whether PySWMM was used in this study or only mentioned as a potential tool.

5- In Figure 2 and its caption, it is clearly stated that “SWMM is used only offline to generate training labels,” indicating that the LSTM functions as a surrogate model, not an integrated or dynamically coupled system. This reinforces that the repeated references to an “integrated SWMM–LSTM framework” throughout the manuscript are conceptually inaccurate and should be revised for consistency. Additionally, in Figure 2, the direction of the arrow between panels (a) and (b) appears to be reversed. Since rainfall time series serve as input to SWMM, which then produces node-level outputs (maximum water depth and inflow), the correct information flow should be from (a) SWMM → (b) dataset repository. Please review and correct the figure accordingly to avoid confusion.

6- The manuscript does not clearly specify the temporal resolution of the rainfall and streamflow data used in the SWMM calibration and evaluation. While the synthetic rainfall events for LSTM training are noted as 15-minute resolution, the time step of observed rainfall and USGS discharge data used for calibration (Section 3.1) is not provided.

7- Figure 7 displays the spatial distribution of “inter-event variability,” but the information provided is insufficient to interpret the results. The caption and text do not specify which rainfall event(s) were used, nor do they explain what the node colors represent quantitatively. From Equation (10), it appears that the variability is computed across all events, yet this should be stated explicitly.

8- Figure 8 presents inter-event variability for two junctions, maximum water depth at J937 and maximum inflow at J3088, but the manuscript does not explain why different response variables were chosen for each node. It would be helpful to clarify: the rationale for selecting these two specific junctions (e.g., based on network location, sensitivity ranking, or flow accumulation); why water depth was analyzed for one and inflow for the other, and whether this reflects different hydraulic behaviors (e.g., ponding vs. conveyance sensitivity); and whether these nodes represent typical or extreme variability cases across the system.

9- Page 18 (~L335) reports “hidden size 26” for the optimal inflow configuration. This appears to be a typo (likely 256). In Table 2 (and related text in Section 3.3), the RMSE and MAE values are reported without units.

Recommendation: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R0/PR4

Comments

Reviewers were rather supportive and the manuscript could represent a valuable contribution worthy of being published in Cambridge Prisms: Water. Indeed, Reviewers provided several comments that could be useful to improve the manuscript. In particular, the description of many aspects of the method, such as the generation of rainfall input and the creation of the data sets used to train and test the method, should be enhanced in order to improve reproducibility. Pros and cons of the method, including applicability for real-time management and uncertainty characterization, should be discussed and novel contributions properly highlighted. Authors are thus kindly asked to address all the comments and provide detailed replies. The decision on publication of this paper is deferred until the authors are able to revise and resubmit the paper.

Decision: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R0/PR5

Comments

No accompanying comment.

Author comment: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R1/PR6

Comments

No accompanying comment.

Review: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R1/PR7

Conflict of interest statement

None.

Comments

The paper is suitable for publication. Please check the English grammar.

Review: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R1/PR8

Conflict of interest statement

None.

Comments

Overall, the authors have addressed the majority of my comments, and the revised manuscript is improved in terms of clarity, methodological transparency, and consistency. I have two remaining comments for the authors’ consideration:

(1) The authors clarify that the LSTM surrogate can be driven incrementally as rainfall data arrive to update predictions of event-level peak depth and inflow. However, the model is trained using complete rainfall sequences with supervision applied only at the event level (i.e., final maxima), rather than with time-step–resolved or truncated-sequence targets. Consequently, incremental predictions during an ongoing storm represent extrapolations from partial inputs rather than explicitly learned time-evolving forecasts. To avoid potential overstatement of operational real-time capability, the manuscript would benefit from clearly distinguishing between incremental inference enabled by the model architecture and the current training strategy. Alternatively, if the authors intend to claim true incremental predictive capability, this could be supported by additional results demonstrating model behavior when driven by partial rainfall sequences.
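
One way to realize the truncated-sequence supervision mentioned above is sketched below: each hyetograph is cut into growing prefixes, and every prefix is paired with the same event-level maximum. This is a hypothetical data-preparation step, not the authors' training strategy; the rainfall values, peak depth, stride and zero-padding choice are assumptions.

```python
def truncated_samples(rainfall, event_max, min_steps=4, stride=4):
    """Pair growing prefixes of a hyetograph with the event-level maximum."""
    samples = []
    for end in range(min_steps, len(rainfall) + 1, stride):
        prefix = rainfall[:end]
        # Zero-pad to a fixed length so all samples share one input shape.
        padded = prefix + [0.0] * (len(rainfall) - end)
        samples.append((padded, end, event_max))
    return samples

# Hypothetical 15-minute hyetograph (mm per step) and SWMM peak depth (m).
rainfall = [0.0, 1.2, 3.5, 6.0, 4.1, 2.0, 0.8, 0.2, 0.0, 0.0, 0.0, 0.0]
samples = truncated_samples(rainfall, event_max=0.85)
for padded, end, target in samples:
    print(end, padded, target)
```

Zero-padding does not distinguish "no rain observed yet" from "no rain fell", so returning the prefix length alongside each sample lets a model mask or otherwise account for the unobserved steps.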

(2) While reporting test-set performance is appropriate and the loss curves in Figure 5 provide useful insight into convergence and overfitting behavior, the manuscript does not report quantitative performance metrics (e.g., NSE, RMSE, MAE) for the training and validation sets. Providing a summary of these metrics, either in a supplementary table or briefly in the main text, would further support the assessment of model generalization and complement the loss-based diagnostics.
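
The per-split summary requested above needs only the standard definitions of NSE, RMSE and MAE, as in the sketch below. Here the reference values play the role of SWMM targets and the simulated values the role of surrogate predictions; the numbers and split names are made up for illustration.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - sum of squared errors / variance of obs."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def summarize(splits):
    """Build a {split: {metric: value}} table from (obs, sim) pairs."""
    return {name: {"NSE": nse(o, s), "RMSE": rmse(o, s), "MAE": mae(o, s)}
            for name, (o, s) in splits.items()}

# Hypothetical targets and predictions for two splits.
splits = {
    "train": ([0.2, 0.4, 0.6, 0.8], [0.25, 0.35, 0.65, 0.75]),
    "validation": ([0.1, 0.5, 0.9], [0.1, 0.5, 0.9]),
}
table = summarize(splits)
for name, row in table.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Reporting the three metrics side by side for training, validation and test sets makes a generalization gap visible directly, which loss curves alone only suggest.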

Recommendation: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R1/PR9

Comments

Authors properly addressed reviewers' comments. The manuscript is almost ready for publication, provided that the authors address the last minor comments of a reviewer.

Decision: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R1/PR10

Comments

No accompanying comment.

Author comment: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R2/PR11

Comments

No accompanying comment.

Recommendation: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R2/PR12

Comments

Authors properly addressed the last minor comments. The manuscript can be accepted for publication. Congratulations.

Decision: Spatiotemporal SWMM-LSTM surrogate modeling for efficient node-level water depth and inflow prediction in urban drainage networks — R2/PR13

Comments

No accompanying comment.