Heavy tails and probability density functions to any nonlinear order for the surface elevation in irregular seas

Abstract The probability density function (PDF) for the free surface elevation in an irregular sea has an integral formulation when based on the cumulant generating function. To leading order, the result is Gaussian, whereas nonlinear extensions have long been limited to Gram–Charlier series approximations. As shown recently by Fuhrman et al. (J. Fluid Mech., vol. 970, 2023, A38), however, the second-order integral can be represented exactly in closed form. The present work extends this further, enabling determination of this PDF to even higher orders. Towards this end, a new ordinary differential equation (ODE) governing the PDF is first derived. Asymptotic solutions in the limit of large surface elevation are then found, utilizing the method of dominant balance. These provide new analytical forms for the positive tail of the PDF beyond second order. These likewise clarify how high-order cumulants (involving statistical moments such as the kurtosis) govern the tail, which is shown to get heavier with each successive order. The asymptotic solutions are finally utilized to generate boundary conditions, such that the governing ODE may be solved numerically, enabling novel determination of the PDF at third and higher order. Successful comparisons with challenging data sets confirm accuracy. The methodology thus enables the PDF of the surface elevation to be determined numerically, and the asymptotic tail analytically, to any desired order. Results are worked out explicitly up to fifth order. The theoretical probability of extreme surface elevations (typical of rogue waves) may thus be assessed quantitatively for highly nonlinear irregular seas, requiring only relevant statistical quantities as input.


Introduction
Perhaps the most fundamentally important statistical description of an irregular sea is the probability density function (PDF) for the surface elevation itself.The positive tail of the PDF is of special importance, as it governs the likelihood of extreme positive surface elevations, typical of rogue waves.In his classical work, Longuet-Higgins (1963) showed that this PDF could be formulated in terms of an integral arising from the cumulant generating function.His formulation is free of assumptions involving narrow-bandedness of the spectrum or small directionality of the wave field.To leading order, the result is Gaussian, whereas Longuet-Higgins (1963) derived approximate Gram-Charlier series solutions for this PDF to second and third order.Recently, the present authors (henceforth referred to as FKZ, Fuhrman, Klahn & Zhai 2023) have shown that, to second order, the integral of Longuet-Higgins (1963) can be solved exactly in terms of the Airy function.Notably, the FKZ distribution includes an inherently heavy tail, i.e. slower exponential decay in the probability density of large surface elevations, relative to the Gram-Charlier series solutions of Longuet-Higgins (1963).Through asymptotic analysis, FKZ showed that to second order, this tail is theoretically governed by the skewness (equivalently, the third cumulant).FKZ likewise showed that their exact second-order theory is more accurate than both second-and third-order approximate solutions provided by Longuet-Higgins (1963), especially in the heavy-tailed region.However, in the most nonlinear cases considered, it was also evident that even the tail of the second-order FKZ distribution is not sufficiently heavy to match those from the data sets considered.Heavy positive tails are likewise apparent in PDFs stemming from numerous other numerical (e.g.Klahn, Madsen & Fuhrman 2021b;Liu et al. 2022), experimental (see e.g.Onorato et al. 2009;Trulsen et al. 2020;Zhang & Benoit 2021;Zhang, Ma & Benoit 2024) and field measured (see e.g.Tayfun & Alkhalidi 2020) data sets involving nonlinear irregular wave fields.Hence it seems clear that heavy positive tails are indeed representative of real ocean waves, and in highly nonlinear conditions, PDFs properly accounting for the additional effects of kurtosis (and potentially even higher-order statistical moments) are required.
Motivated by this need, in the present work we further our efforts to determine the PDF for the surface elevation in nonlinear, irregular seas to even higher order.Towards this end, the remainder of this paper is organized as follows.In § 2, we will derive a new ordinary differential equation (ODE), theoretically governing the PDF to any order in nonlinearity.Section 3 will present exact solutions to this ODE at first and second order, which will be shown to be consistent with previously obtained results.At third and higher order, no known exact solutions to the ODE exist.Hence in § 4, we will derive new asymptotic solutions to the ODE, in the limit of large surface elevation, employing the method of dominant balance.These will be derived explicitly up to fifth order in nonlinearity, and will demonstrate precisely how higher-order cumulants (involving statistical moments such as the kurtosis, hyperskewness and hyperkurtosis) govern the tail at third, fourth and fifth order, respectively.Moreover, in § 5, we will show how the asymptotic solutions may be utilized to derive necessary boundary conditions for the ODE, such that it may be solved numerically.This methodology, for the first time, enables the theoretical PDF to be determined, effectively to any desired order in nonlinearity.Section 6 will compare the new high-order PDFs with those based on challenging, highly nonlinear data sets involving irregular waves in finite water depth.Conclusions will finally be drawn in § 7.

An ODE governing p(ζ )
Consider a nonlinear, irregular wave field having surface elevation η, with standard deviation σ .We define the non-dimensional surface elevation as ζ ≡ η/σ , and assume that it can be expressed in terms of a classical Stokes-type perturbation series, i.e.
such that the nth term in the series is O(ε n−1 ), where ε is a characteristic wave steepness.
As first discussed by Longuet-Higgins (1963), in this perturbative setting, the PDF of ζ is most conveniently expressed in terms of its cumulants κ n , since these are ordered in powers of ε for n ≥ 3.In fact, choosing the coordinate system such that η has zero mean, we have κ 1 = 0, the normalization implies that κ 2 = 1, and Longuet-Higgins (1963).For completeness, we mention that κ 3 equals the skewness S ≡ ζ 3 of η, and that κ 4 equals the excess kurtosis K − 3, where K ≡ ζ 4 is the kurtosis of η.Expressions for higher cumulants in terms of the statistical moments are provided in table 2 below.For clarity, we emphasize that the nth-order nonlinear approximation of the full system will refer to that in which the first n terms in (2.1), and the first n + 1 cumulants, are retained.
Using the cumulant generating function, Longuet-Higgins (1963) has shown that the PDF of ζ can be expressed as In the following, we will show how this integral representation may be used to find an ODE of infinite order, which governs the behaviour of p(ζ ).To that end, we start by rewriting the integral as where the trigonometric representation of the complex exponential function is utilized.If we differentiate this expression an odd number of times, say 2m − 1, then we find that Likewise, if we differentiate the expression an even number of times, say 2m, then we find that since the sine function containing the odd powers of s vanishes at s = 0, and the exponential function must vanish at infinity; if it did not, then the integral (2.2) would not be convergent.If we now combine this identity with the product rule and the results (2.4) and (2.5), then we find that In what follows, we will truncate the summation in the ODE at various orders.We will denote the equation that arises when truncating the sum at the nth term as the nth-order equation, consistent with the nth-order approximation defined above.

Exact solutions of the ODE
To first and second order, (2.7) admits exact analytical solutions, and we briefly review these here for the sake of completeness and to demonstrate consistency with previously known solutions.In addition, we discuss an important property of solutions of the ODE for order greater than two, which may be derived without knowing the exact solution.

Exact first-order solution
To first order, the ODE (2.7) is simply where it has been invoked that κ 2 = 1, again by definition of ζ .This equation has the general solution where B is an arbitrary constant.Requiring the PDF to have unit integral yields B = 1/ √ 2π, such that the solution becomes the standard normal (Gaussian) distribution which is well known to be the correct linear result.

Exact second-order solution
This result was first derived very recently by FKZ, who evaluated the integral (2.2) directly using a novel change of coordinates.We note that the above derivation represents an alternative (arguably simpler) way in which it may be derived, newly enabled by the governing ODE (2.7).

A property of third-and higher-order solutions
To Nth order, the ODE reads To the best of our knowledge, no analytical solution of this equation has yet been found for N ≥ 3.However, N linearly independent solutions must exist, since the equation is linear and of Nth order.Moreover, since the coefficients of p and its derivatives in (3.7) are all entire (i.e.everywhere analytic) functions in the complex ζ -plane, these linearly independent solutions must all be entire functions themselves (see e.g.§ 3.1 of Bender & Orszag 1999).This result, which holds for arbitrary, finite N, is remarkable when considering that the integral (2.2) does not even converge when truncated at orders 4m and 4m + 1 (m being a positive integer), i.e. when the highest even power is evenly divisible by 4. Thus while it is seemingly not possible to obtain e.g. the third-order distribution by evaluating the integral (2.2) directly, either analytically or numerically, the ODE approach provides a novel vehicle for obtaining this distribution.

Asymptotic solutions of the ODE in the limit ζ → ∞
In this section, we will solve the ODE (2.7) analytically in the limit of large ζ .To do so, we make use of the method of dominant balance (see e.g.§ 3.4 of Bender & Orszag 1999), which assumes that the asymptotic form of the solution is as ζ → ∞.Here, A(ζ ) is a function to be determined, and B is a constant.In the following, we present a detailed derivation to find the form of A(ζ ) for the case where (2.7) is truncated at second order, and leave the determination of B to § 5.As the method is conceptually easily generalized to higher order, but involves algebra of rapidly increasing complexity with each order, we give a less detailed derivation for the third-order case, and merely present the key points and results for the general order case.Finally, we discuss the asymptotic behaviour of A(ζ ) in the limit where the nonlinear order becomes large.
4.1.Asymptotic solution to second order When truncated at second order, (2.7) becomes and on insertion of the asymptotic form (4.1), it turns into the asymptotic relation To find the leading behaviour of A(ζ ) when ζ becomes large, we now assume that one of the three terms on the right-hand side is much larger than the other two in this limit.If we keep either the term −dA/dζ or the term (κ 3 /2) d 2 A/dζ 2 and solve the resulting asymptotic relation, it is easily seen that the assumption is violated; in both cases, the term proportional to (dA/dζ ) 2 will be much larger than the other two terms.Hence to find the leading behaviour of A(ζ ), we assume which is readily solved to give after taking the negative square root.We note that the positive square root in this case would not make sense, as a PDF must have a decreasing tail.To proceed from here, we define , with A 2 being the remainder, which is much smaller than A 1 when ζ is large.To find the leading behaviour of A 2 , we insert the new form of A into (4.3),which gives the asymptotic relation We now again assume that one of the terms on the right-hand side is much larger than the other.It is clear that assuming the second term to be largest leads to and this contradicts the assumption that A 2 is much smaller than A 1 .For that reason, we keep the term dA 1 /dζ , and upon solving the resulting asymptotic relation, we find that The next step of the method is to define , with A 3 assumed small relative to A 1 and A 2 in the limit of large ζ .Inserting this form of A into (4.3) and making assumptions similar to those above then gives Doing this once more, now with where log(•) denotes the natural logarithm.At this stage, we could keep on finding more terms in the asymptotic expansion of A, but as it turns out, the next term in the series is proportional to ζ −1/2 , and is thus irrelevant for the asymptotic behaviour of p(ζ ), since it goes to zero in the limit ζ → ∞.We have thus shown that the asymptotic form is Note that the present asymptotic solution provides independent confirmation of (4.1) of FKZ.As also pointed out by FKZ, at second order, the highest-power ζ term in the exponential has power 3/2.This is less than the power 2 in the exponential of the Gaussian distribution (3.3), which defines the first-order asymptotics.Additionally, FKZ showed that the asymptotic forms of the Gram-Charlier series approximations of Longuet-Higgins (1963) likewise had power 2 on ζ in the exponential.As a result, the second-order FKZ distribution has an inherent heavy positive tail, as discussed in detail therein.

Asymptotic solution to third order
To third order, the ODE (2.7) is To find the leading behaviour of A(ζ ), we neglect all but a single term on the right-hand side.Guessing that the leading behaviour is a constant times ζ α , where α > 1, it is clear that the dominant term must be (−κ 4 /6)(dA/dζ ) 3 .Solving the resulting asymptotic relation, we find that in the limit of large ζ .Proceeding as in the previous subsection, one may show that the full asymptotic form of where a 0 = 1/3, and the remaining coefficients are The final third-order result is thus where the a i coefficients are as given above, and a 0 is invoked directly as the power of ζ in the denominator of the leading factor.This form (4.16) of the positive tail to third order is new.We note that for sufficiently large ζ , it will be dominated by the largest power term in the exponential, i.e. a 1 ζ 4/3 , where it is noted that a 1 involves only the kurtosis and no other statistical moments.As the 4/3 power is even less than the 3/2 power dominating the tail for large ζ to second order, this implies that in cases where third-order effects (i.e.kurtosis) are significant, the positive tail of the PDF will be even heavier than that at second order.Moreover, we note that the form of the positive tail is in stark contrast to the third-order Gram-Charlier approximation derived by Longuet-Higgins (1963), which for large ζ is dominated by the skewness, and not the excess kurtosis, as shown by FKZ; see their (4.3).

Asymptotic solution to higher order
Inserting the asymptotic form (4.1) into the general, Nth-order ODE (3.7) leads to the asymptotic relation Taking the Nth root such that the right-hand side is negative, and solving the resulting relation for A, then gives the leading behaviour of A, which is (4.20) To find the next term in the asymptotic expansion of A, we define A 1 to be the right-hand side of (4.20), and write Having removed the largest terms of the asymptotic relation (4.17), we now seek its second largest terms.These turn out to be the terms N(dA 1 /dζ ) N−1 (dA 2 /dζ ) and (dA 1 /dζ ) N−1 , which originate from the term (dA/dζ ) N of D N and the term (dA/dζ ) N−1 of D N−1 , respectively.Balancing these terms scaled with the coefficients of the right-hand side sum of (4.17) gives the relation dA 2 /dζ ∼ κ N /κ N+1 , which is readily solved to give At this stage, the number of terms that must be considered in order to find more terms in the expansion of A quickly increases, and we therefore stop our detailed derivation here.What one finds by continuing the procedure is that the general asymptotic form of A is i.e. a power series in ζ 1/N plus a logarithmic term.The general, Nth-order asymptotic form of p is therefore where the expressions for the coefficients a 0 , a 1 , . . ., a N+1 are listed in table 1 for the orders N = 2, 3, 4, 5.This result is new, and we note that it implies that the tail of p(ζ ) is dominated by the highest-order cumulant retained in the approximation, since the coefficient a 1 depends only on κ N+1 .The first six cumulants, as required up to order N = 5, are likewise presented in terms of the statistical moments in table 2.Moreover, we note that the leading power of ζ , i.e. (N + 1)/N, decreases with N, hence p(ζ ) becomes increasingly heavy tailed with each nonlinear order.In this connection, an interesting observation is that the leading power approaches unity in the limit of large N, and one may therefore conjecture that the fully nonlinear tail of p(ζ ) (that is, when all cumulants are retained) behaves like exp(a 1 ζ + • • • ) for large ζ .However, one should be careful in this connection as the coefficients in the series (4.22) may become arbitrarily large when N is large; for example, using Stirling's approximation for the factorial function, one may show that a 1 ∼ −N/(e κ 1/N N+1 ) in this limit, where e is Euler's number.Since κ N+1 = O( N−1 ), it is expected that a 1 grows asymptotically linearly in magnitude with N, and as such, there is no guarantee that the series (4.22) converges for the fully nonlinear case.(5.1) (v) Adjust the assumed value for B accordingly, and repeat the procedure until (5.1) is sufficiently close to unity.
We find that this process may be performed manually without great difficulty, though it is even more conveniently automated.To hopefully encourage use by others, Matlab functions automating the process described above are freely provided to third, fourth and fifth order, as indicated in the supplementary material available at https://doi.org/10.11583/DTU.24720564.These solutions use Matlab's zero finder (fzero) to obtain B, such that the integral (5.1) is exactly equal to unity.Backward integration of the ODE is performed numerically, utilizing Matlab's fourth-order Runge-Kutta scheme (ode45), with built in error control.Utilizing these, the tail (determined analytically, with correct B) and p(ζ ) (determined numerically) may be obtained to the desired order, requiring only the necessary cumulants (determined from statistical moments; see again table 2) as input, in addition to an appropriately chosen range of ζ .From our own testing, the method is quite robust, as will be demonstrated.(Note that analogous implementations written in Python are likewise freely provided in the supplementary material.) Example solutions to third, fourth and fifth order are depicted in figure 1. Parameters utilized in these examples are as indicated in table 3. We emphasize that these parameters are made up and not based on any experiment or simulation.However, their magnitudes are definitely reasonable, as is illustrated e.g. by table 4. Dashed lines depict the analytical tail solutions, whereas solid lines depict the numerical solutions for the full PDF.It is seen that in all cases, the asymptotic tail closely matches the full PDF for large ζ .It is likewise seen that for negative ζ , the numerical solutions eventually turn oscillatory.This behaviour is similar to the theoretical second-order solution of FKZ based on the Airy function; see (3.6).The oscillatory part of the solutions (sometimes yielding negative p(ζ )) is obviously unphysical.In the present cases, they are sometimes even divergent; see e.g. the exponential growth in oscillation amplitude for ζ < 0 in figures 1(b,c).3.
Therefore, we choose to avoid this unphysical region altogether, and in all subsequent applications, the oscillatory parts of the solution will be removed.This is easily achieved simply by replacing the oscillatory part (taken as where ζ is less than the first zero crossing during the backward integration in ζ ) with zeros after the solution is found, such that this part of the solution does not contribute to (5.1).Note that this is equivalent to taking ζ min as equal to the zero closest to ζ = 0, which is our general recommendation for practical use.

Comparisons
In this section, we will validate the new high-order PDFs through comparisons with those based on various data sets.As the novelty of the present work begins at third order, we will focus exclusively on challenging data sets involving highly nonlinear irregular waves in finite water depth, where the second-order FKZ distribution is not sufficiently accurate.
As discussed in the forthcoming subsections, the data sets to be considered have been generated through a variety of means, i.e. theoretically, numerically or experimentally.
Results will be shown up to the minimum order required.All cases considered utilize ζ min equal to the zero crossing closest to ζ = 0, such that the (unphysical) oscillatory part of the solution is omitted naturally, as described in the previous section.

Comparison with data from irregular wave theory
As a first means of validation, we will compare against a data set generated from the multi-directional irregular wave theory of Madsen & Fuhrman (2012), carried out to second order, as considered previously by FKZ in their figure 9.For the present purpose, we will compare against the most nonlinear (i.e.challenging) of those considered by FKZ, from their figure 9(c).The data for this case consist of numerous time series, generated based on directionally spread irregular waves having characteristic dimensionless depth k p h = 1.2 and resulting characteristic steepness ε = k p H m0 /2 = 2k p σ = 0.2057, where k p is the peak wavenumber, h is the water depth, and H m0 is the spectral significant wave height.The irregular waves were generated based on a JONSWAP (Hasselmann 1973) spectrum where ω is the angular frequency, ω p is the peak angular frequency, γ s = 3.3, and σ s = 0.07 if ω < ω p and 0.09 otherwise.A cutoff frequency ω c = 3ω p is utilized, such that S(ω > ω c ) = 0.The constant S 0 is defined such that the first-order spectrum (6.1) satisfies with second-order components added afterwards.The waves are directionally spread based on (6.3) where Γ (•) denotes the gamma function, and N D = 50 is the directional spreading parameter, which governs the width of the directional spectrum.Other details are as described in § 6.1 of FKZ.The data set has e.g. S = 0.3355 and K = 3.190, leading to the cumulants listed in table 4 (top row).
Comparison of the data-based PDF with those based on theory are depicted in figure 2. The numerically determined (third-order) PDF has utilized ζ max = 8.The PDF from the data has utilized bin size Δζ = 0.2.Error bars are estimated as ±p(ζ )/ √ N b , where N b is the number of samples in each bin, following Onorato et al. (2009).It is clear from this figure that the case is sufficiently nonlinear that neither the first-order (Gaussian, blue dotted line) nor second-order (FKZ, green dashed line) distribution matches e.g. the probability density of the extreme positive tail.It is, however, very clear from figure 2 that the present third-order result (solid line) matches the extreme tail, and the PDF in general, very well for, say, ζ > −3.Results utilizing the present fourth-order PDF essentially overlay that at third-order, and are thus not shown for brevity.This case confirms accuracy of the new PDFs developed in the present work to third order in nonlinearity.
It should be mentioned that comparison utilizing a third-order statistical distribution with data generated from a second-order irregular wave theory is legitimate.This is because the surface elevations stemming from second-order Stokes-type wave theories do contain kurtosis, such that third-order effects from the statistical perspective are present.The comparison made in figure 2 thus demonstrates that the under-prediction of the positive tail by the second-order FKZ distribution is due to third-order effects associated with kurtosis, confirming the speculation of FKZ.Klahn et al. (2021c).The horizontal axes are to scale, whereas the vertical axis is exaggerated by a factor of two.The horizontal area shown is 4λ p × 4λ p .

Comparison with data from a fully nonlinear wave model
As a second means of comparison, we will consider data generated by the fully nonlinear, spectrally accurate wave model of Klahn et al. (2021c), as previously utilized and validated in e.g.Klahn et al. (2021b) and Klahn, Madsen & Fuhrman (2021a), and FKZ.The comparisons with data generated from this model within FKZ utilized directionally spread irregular wave fields with 1.0 ≤ k p h ≤ ∞, and all were quite closely matched by their second-order distribution.Thus for the present purpose we have performed additional simulations with this model in reduced water depth, to generate an even more nonlinear and challenging data set.The simulations are again based on a JONSWAP spectrum, utilizing k p h = 0.6 and N D = 2. Simulations have been performed on large 2048 × 2048 horizontal grids, with 11 points distributed in the vertical direction.Following Klahn et al. (2021b), the computational domain has horizontal lengths L x = 100λ p and L y = 100(1 + N D ) 1/2 λ p , where λ p = 2π/k p is the peak wavelength, which provides similar resolution in both horizontal directions.Simulations are initialized with linearized initial conditions, with nonlinear terms ramped to fully on over a duration 10T p , where T p is the peak wave period, utilizing time step Δt = T p /50. Simulations have been considered up to time 100T p .Five different simulations have been performed, each of which requires several months on a single processor.We have found that in this case, the free surface statistics become reasonably stable for t > 20T p , and results have been sampled at t = 50T p for the present purposes.Note that the characteristic Ursell number in this case corresponds to Ur = H m0 λ 2 p /h 3 = 2(2π) 2 ε/(k p h) 3 ≈ 26.6, which indicates that the case is indeed quite nonlinear for an irregular wave field.Collectively, the simulations result in a steepness ε = 2k p σ = 0.0727, and the statistical moments S = 0.3974, K = 3.212, hyperskewness S h = 4.092 and hyperkurtosis K h = 19.84,leading to the cumulants listed in table 4 (second row).An example of a zoomed-in region of the computed free surface (spanning 4λ p × 4λ p ) containing multiple rogue waves is depicted in figure 3. A longer space series spanning 100λ p along the line containing the largest crest elevation is likewise provided in figure 4.
The resulting PDF from this data set is depicted in figure 5 (circles, utilizing bin size and error bars calculated as before).For comparison, also shown are the PDFs from linear theory (Gaussian) and the second-order FKZ distribution, as well as the present distributions carried out to third and fourth order (computed utilizing ζ max = 9).It is seen that while the second-order FKZ distribution is a significant improvement over the Gaussian, it fails to capture the extreme positive tail adequately.Conversely, the extension to third or even fourth order, as newly enabled by the present work, captures the positive tail much more convincingly, such that they could seemingly be utilized reliably to predict the probability of e.g.extreme wave crests arising from rogue waves.Note that there are only minor differences between the present third-and fourth-order results, indicating convergence at fourth order.We have confirmed that fifth-order results are visually indistinguishable from those at fourth order, and are thus not shown for brevity.This case can be taken as validation of the present method for determining the PDF up to fourth order.Trulsen et al. (2020).The inset depicts the region immediately surrounding the largest crest.

Comparison with experimental data
As a final means of validation, we will compare with measured results of Trulsen et al. (2020), who performed wave flume experiments of long-crested irregular waves propagating over a trapezoidal bar.These experiments resulted in extremely nonlinear surface elevations on top of the bar, and hence present an appropriately challenging case to compare with the higher-order PDFs developed in the present work.For the present purpose, we will focus exclusively on their run 3 data.This case utilized JONSWAP irregular waves with peak period T p = 1.1 s and significant wave height H m0 = 2.5 cm at the wave maker (water depth h = 53 cm).The waves were shoaled to the top of the bar with water depth h = 11 cm, again resulting locally in an extremely nonlinear wave field.Here, we will focus exclusively on the surface elevation data from the most nonlinear position x = 2.2 m; see e.g.figures 4 and 5 of Trulsen et al. (2020).The measured surface elevation time series at this position has S = 0.7888, K = 4.193, S h = 10.35 and K h = 44.56.These lead to the cumulants listed in table 4 (final row), which are seen to be extremely large relative to the cases considered previously.An example time series segment spanning 100T p is provided in figure 6, which is seen to contain several rogue waves.
The PDF from this data set is compared with the present higher-order (third to fifth order, computed utilizing ζ max = 9) distributions in figure 7. Also included for completeness are the first-order Gaussian and second-order FKZ distributions.It is clear in this case that the surface elevation is strongly non-Gaussian.Even though the surface elevation in this case stems from a non-uniform water depth, the PDF is predicted quite reasonably by both the present new fourth-order and (especially) fifth-order distributions.This is especially true for the heavy positive tail, which is likely of most practical interest e.g. for predicting the exceedance probability of extreme wave crests.
Let us utilize the present case to emphasize the potential importance of the higher-order PDFs made possible in the present work, regarding e.g. the probability of rogue waves with surface elevation exceeding a given threshold ζ 0 .The exceedance probability is defined by (6.4) Results are shown in table 5 for several exceedance thresholds, ranging from ζ 0 = 3 to ζ 0 = 6, i.e. surface elevations exceeding 3-6 standard deviations above the mean.Convergence to the precision shown has been confirmed by varying ζ max .For the sake of discussion, we will focus on the results with threshold ζ 0 = 6.It is seen that the present fifth-order result predicts a probability of rogue waves with surface elevation exceeding 6σ that is more than 4000 times greater than would be expected from linear theory, and 12.8 times greater than would be expected based on the second-order theory of FKZ.Only starting at third order is the exceedance probability the correct order of magnitude, which is still approximately half that predicted at fifth order.We finally compare the present results for this case with other previously proposed nonlinear distributions in figure 8.This figure specifically compares the data of Trulsen et al. (2020) and the present fifth-order PDF with (i) the third-order PDF of Longuet-Higgins (1963), where Note that the Tayfun (1980) distribution assumes both narrow-bandedness of the spectrum as well as unidirectionality of the wave field.Note also that the integral in (6.7) is computed numerically, rather than utilizing e.g. the asymptotic approximation of Socquet-Juglard et al. (2005).Figure 8 clearly demonstrates that these previously proposed theoretical PDFs do not accurately describe the heavy positive tail inherent in this data set, which is again quite well predicted by the present fifth-order theory.
It is emphasized that the present PDF requires only the relevant statistical moments (alternatively, the relevant cumulants) as input.Accurate reproduction of this challenging PDF has previously required detailed numerical simulations with fully nonlinear wave models; see e.g.figure 13(c) of Zhang & Benoit (2021).

Conclusions
In this work, we have revisited the probability density function (PDF) for the free surface elevation in nonlinear irregular seas, based on the integral found originally by Longuet-Higgins (1963), stemming from the cumulant generating function.A new ordinary differential equation (ODE) has been derived, which governs this PDF to any desired order in nonlinearity, presented as (2.7).We emphasize that this ODE is free of any assumptions regarding narrow-bandedness of the spectrum or the directionality of the wave field.It has been confirmed that exact solutions of this ODE lead to the Gaussian distribution at first order, and the recently found Fuhrman et al. (2023) distribution at second order.Asymptotic solutions of this governing ODE have been derived analytically in the limit of large surface elevation, making use of the method of dominant balance.Up to a multiplicative constant, these provide the theoretical form of the positive tail of the PDF, thus newly clarifying how high-order cumulants (involving statistical moments such as the kurtosis, hyperskewness and hyperkurtosis) theoretically govern the positive tail.It is shown that the positive tail gets heavier at each successive order, implying increased probability of large surface elevations typical of rogue waves.
The asymptotic solutions have been utilized to generate boundary conditions, enabling backward numerical integration of the governing ODE.By ensuring a unit integral, this enables determination of the multiplicative constant, and hence the full theoretical PDF.The present work thus enables novel determination of the PDF to third order and above.Results up to fifth order have been derived explicitly, but the methodology can be extended trivially to any desired order in nonlinearity.
The new higher-order PDFs have been compared against those from data sets generated from irregular wave theory and fully nonlinear numerical simulations, as well as laboratory experiments.Focus has been exclusively on highly nonlinear cases in finite water depth, where second order is demonstrably inadequate.Generally excellent results have been found, even in the most challenging cases, confirming accuracy of the new high-order PDFs.In such highly nonlinear circumstances, the new PDFs have been shown to be far superior to previously existing theoretical methods.
Matlab and Python functions returning the new PDFs up to fifth order in nonlinearity are freely provided in the supplementary material.The authors sincerely hope that these may be useful for others wishing to utilize the new high-order PDFs in practice.
into this equation and cancelling factors of B exp(A(ζ ))

5.
Numerical solution of the ODE to yield p(ζ ) to any order In the previous section, we derived the form of the positive tail of p(ζ ) for general order N, up to a multiplicative constant B. Combining this result with numerical integration of the ODE (3.7), p(ζ ) may be determined for all values of ζ .In fact, an algorithm to determine p(ζ ) as well as its asymptotic tail with the correct value of B is as follows: (i) Assume a value for B. (ii) Use the asymptotic tail solution to establish boundary conditions for p(ζ ) and its first N − 1 derivatives at some large positive ζ = ζ max .(iii) Solve the ODE numerically, integrating backwards from ζ max to some negative ζ min .(iv) Compute numerically the integral ζ max ζ min p(ζ ) dζ.

Figure 1 .
Figure 1.Example solutions depicting numerical p(ζ ) (solid lines) and analytic asymptotic tail solutions (dashed lines) to (a) third, (b) fourth and (c) fifth order.Parameters utilized are as indicated in table 3.

Figure 2 .
Figure 2. Comparison of the PDF computed from the second-order directionally spread irregular wave theory of Madsen & Fuhrman (2012, referred to as MF12) (circles, with error bars) with linear theory (blue dotted line), second-order theory of FKZ (green dashed line), and the present third-order solution (solid line).

Figure 3 .
Figure 3. Snapshot of the surface elevation in the vicinity of the largest surface elevation generated by the fully nonlinear model of Klahn et al. (2021c).The horizontal axes are to scale, whereas the vertical axis is exaggerated by a factor of two.The horizontal area shown is 4λ p × 4λ p .

Figure 4 .Figure 5 .
Figure 4. Example free surface elevation along the line containing the largest crest generated by the fully nonlinear wave model.The inset depicts a zoomed-in region immediately surrounding the largest crest.The variable x p denotes the x-position of the largest value of ζ .

Figure 6 .
Figure 6.Example time series involving the largest crests (occurring at time t = t p ) from experiments of Trulsen et al. (2020).The inset depicts the region immediately surrounding the largest crest.

Figure 8 .
Figure 8.Comparison of the PDF from the experiments of Trulsen et al. (2020) (circles with error bars) with the present fifth-order solution (solid line), the Tayfun (1980) second-order distribution (6.7) (red dashed line) and the Longuet-Higgins (1963, referred to as LH63) third-order distribution (6.5) (blue dotted line).

Table 1 .
The coefficients in the asymptotic expansion of A (see (4.22)) for different nonlinear orders N.

Table 2 .
The first six cumulants expressed in terms of the skewness S, kurtosis K, hyperskewness S h ≡ ζ 5 and hyperkurtosis K h ≡ ζ 6 .

Table 3 .
Summary of parameters utilized in the example PDFs presented in figure1.

Table 4 .
Summary of cases considered in the present work.
Figure 7.Comparison of the PDF from the experiments of Trulsen et al. (2020) (circles with error bars) with linear theory (blue dotted line) and second-order FKZ distribution (green dashed line), as well as the present third-(red dash-dotted line), fourth-and fifth-order (solid lines) solutions.