Regressing bubble cluster dynamics as a disordered many-body system

Abstract The coherent dynamics of bubble clusters are of fundamental and industrial importance, and are elusive due to the complex interactions of disordered bubble oscillations. Here we introduce and demonstrate a method for decomposition of the Lagrangian time series of bubble dynamics data by combining theory and principal component analysis. The decomposition extracts coherent features of bubble oscillations based on their energy, in a way similar to proper orthogonal decomposition of Eulerian flow field data. This method is applied to a dataset of spherical clusters under harmonic excitation at different amplitudes, with various nuclei density and polydispersity parameters. Results indicate that the underlying correlated mode of oscillations is isolated in a single dominant feature in cavitating regimes, independent of the nuclei's parameters. A systematic data analysis procedure further suggests that this feature is globally controlled by the dynamic cloud interaction parameter of Maeda and Colonius (J. Fluid Mech., vol. 862, 2019, pp. 1105–1134) that quantifies the mean-field interactions, regardless of initial polydispersity or nonlinearity. The method provides a simplified and comprehensive representation of complex bubble dynamics as well as a new path to reduced-order modelling of cavitation and nucleation.

For the past decades, the Rayleigh-Plesset (R-P) equation and its variations have been actively explored to investigate the dynamics of single bubbles (Plesset 1949;Plesset & Prosperetti 1977).Relatively few studies addressed the theory of clusters.By using the mean field approach to interacting bubbles modelled by the R-P equation, d 'Agostino & Brennen (1989) derived a non-dimensional parameter that dictates the linear coherent oscillations of monodisperse clusters, the so-called 'cloud interaction parameter'.Zeravcic, Lohse & Van Saarloos (2011) used the coupled R-P equations and identified disorders represented by the Anderson localization of acoustic energy in polydisperse, lattice-like clouds under weak excitation.We have recently extended the interaction parameter to the non-equilibrium, cavitating clusters under strong excitation by considering the effective interaction at excited states (Maeda et al. 2018;Maeda & Colonius 2019;Maeda & Maxwell 2021).To recall, we scale the mean kinetic energy of liquid induced by N( 1) bubbles as (Maeda & Colonius 2019) where K s is the energy of a single bubble: K s = 2πρR 3 b,c Ṙ2 b,c ; B d is the parameter controlling the effective contribution of hydrodynamic inter-bubble interaction: B d = N R(t) /R C , and R b,c and R C denote the characteristic (reference) bubble radius and the cluster radius; ρ l is the liquid density; • and (•) denote time average during a period in which bubble dynamics are statistically stationary and the mean value about the bubbles (i = 1, 2, . . ., N), respectively.The scaling can be simply derived from the coupled R-P equation for the correlated (synchronized) limit of monodisperse bubbles (R 1 = R 2 • • • = R b,c ).Although realistic correlations are imperfect due to polydispersity and nonlienarity, B d was found to control well both the coherent dynamics of polydisperse cavitating clusters, and their acoustic emission in numerical simulations and experiments (Maeda & Colonius 2019;Maeda & Maxwell 2021).Overall, previous studies indicate that the coherence can depend on both polydispersity and nonliearity in a non-separable manner, posing perplexing questions about the universality of scaling.The theoretical characterization of the nonlinear dynamics of disordered many-body systems is, in general, not a simple task.Meanwhile, greater computing power has enabled learning physics by analysing big data.Principal component analysis (PCA) is a powerful method for unsupervised learning which has seen recent success in characterizing the coherent physics of many-body and high-dimensional systems in fields ranging from quantum information to fluid dynamics (Milano & Koumoutsakos 2002;Holmes et al. 2012;Lloyd, Mohseni & Rebentrost 2014;Taira et al. 2017).
In this study, we introduce and demonstrate a method for unsupervised data decomposition to study the coherent bubble cluster dynamics by combining theory and PCA.PCA extracts dominant states and dynamical features, such as coherent quantum states and turbulent structures, and their amplitudes as the eigenfunction (feature) and the eigenvalue (variance) of the co-variance matrix of physical data.When applied to spatio-temporal data of dynamical systems, PCA is often denoted as proper orthogonal decomposition (POD).Those data can be properly weighted prior to PCA such that the variance becomes consistent with the norm induced by an energetic inner product of state variables (e.g.kinetic energy) (Lall, Marsden & Glavaški 1999;Rowley 2005).This weighting allows a physical interpretation that resulting features associated with a large variance are energetically dominant coherent structures.Proper weighting of Lagrangain bubble dynamics data is non-trivial since the linear variance of extracted features needs to account for the nonlinear interaction energy.For meaningful analysis, we introduce strategic pre-processing of the data prior to PCA such that the PC-variance becomes theoretically consistent with the energy modelled by the coupled R-P equation.Analysing simulation datasets of clusters, we show that the PCA can systematically extract not only coherent but also incoherent features whose magnitudes are respectively measured by the PC-variance and the entropy.We discover that the coherence is lost by disorders induced by polydispersity and nonlinearity, while under strong excitation, the underlying correlations are globally isolated in a single coherent feature whose variance (energy) is scaled by B d , regardless of the disorders.
The remainder of this paper is as follows.In § 2, we describe the method.The first PC variance, spectral entropy and the coherence measure are introduced as quantifiable measures to characterize the coherent bubble dynamics from extracted features.In § 3, we verify and demonstrate the method using a numerical dataset of bubble clouds with different density and polydispersity parameters under various amplitudes of harmonic excitation.The deviation of the PC-variance from the physical energy is quantified for two weighting methods.The amplitude dependencies of the measures are quantified.The PC-spectra and their correlations with the coherent energy are analysed.Moreover, the extracted coherent dynamics is related to B d and its universality is discussed for cavitating clouds.In § 4, the physical significance of the method is discussed.In § 5, we state conclusions.

Principal component analysis of bubble dynamics data
For clusters modelled by the coupled R-P equations, observable dynamical variables are the bubbles' radial velocities and radii.Consider a data matrix Q containing the N t snapshots of the radial velocities with a constant temporal interval, Q = [q 1 , q 2 , . . ., q Nt ], (2.1) where q k denotes the vector containing the radial velocities of the N bubbles at time t k (in the kth snapshot), (2.2) For later convenience, we also define the vector r k , containing the radii of the bubbles at the same instances: PCA can be performed on Q by using the singular value decomposition (SVD) (e.g.Abdi & Williams 2010;Jolliffe & Cadima 2016): (2.4) The ith principal component (feature) is stored in the ith score matrix, where Σ i contains the ith largest singular value and zeros elsewhere.

Weighted principal component analysis
Although this procedure for PCA is simple and straightforward, the physical meaning of the extracted features are obscure since PCA itself is not informed on the underlying dynamics of the system.A related issue of PCA for fluid flow data has been addressed in the context of POD.In POD of the snapshots of Eulerian fluid flow data, the state vectors consisting of the data matrix are often weighted such that the corresponding weighted inner product of the state becomes consistent with the kinetic energy of the original system.The dominant features (POD modes) can then be interpreted as energetically dominant coherent flow structures.Moreover, in our previous studies (Maeda & Colonius 2019;Maeda & Maxwell 2021), the scaling of bubble cloud dynamics by B was successfully demonstrated based on the total kinetic energy of liquid induced by interacting bubbles.These considerations motivate us to relate the energy to PCA for addressing the physics of bubble cloud.
Inspired by the POD, we consider weighting the present bubble dynamics data prior to PCA.To find an appropriate weight, we revisit the potential theory behind (1.1).The kinetic energy of the fluid induced by the spherical bubble oscillations is explicitly expressed as where R i , Ṙi and r ij are the radius and the radial velocity of bubble i, and distance between the centres of bubble i and j, respectively.The second term in the brackets represents the contribution of the long-range interactions.At each instant, K can be expressed as a weighted inner product of q: where (2.8) The weight matrix, W , can be obtained through the Cholesky factorization of T (r): (2.9)

Radial velocity
Radius . Schematic of the feature extraction by PCA from Lagrangian bubble dynamics data, Q, after pre-processing.The variance of the resulting features is consistent with the interaction energy predicted by the coupled Rayleigh-Plesset equation.
oscillations.The choice of r k in the time series to define W is unclear.A straightforward choice is to use the temporal mean of the radius for each bubble, r : r = Nt k=1 r k /Nt, but T( r ) is not a favourable approximation for T(r) for bubbles with large-amplitude oscillations.
We address this obstacle by variable transformation as a means of pre-processing the data.The schematic is shown in figure 1.First, we transform the variables from (q, r) to Here, ξ i is nothing but the velocity potential evaluated at the surface of bubble i. Supplemental discussions and justifications for this transformation are provided in Appendix A. The instantaneous energy of the system is expressed by a weighted inner product as (2.10) and Ω is the weight matrix satisfying (2.11) where (2.12) Second, we partially replace η with η to approximate the system.Using ( ξ , η), the instantaneous energy of the approximate system is expressed as The temporal mean of the energy of the original system can then be approximated as This is a critical result in the present context of PCA, since the energy is now related to the weighted inner product with the constant weight, P( η ).Quantitative verification of this approximation is addressed through numerical experiments in the following section.
Using the new set of variables, the PC decomposition is performed as (2.15) The ith PC is stored in Π ξ,i : Π ξ,i = U ξ Σ ξ,i .We denote the ith largest singular value of Σ ξ as σ i .The degree of coherence can be measured by the normalized variance of the first PC: which represents the ratio of the energy occupied by the first PC to the total energy of the system.In the following sections, to distinguish the present method from the standard PCA, we denote (2.15) as the weighted principal component analysis of transformed data (WPCA-TD).

Spectral entropy and coherence measure
We introduce two additional key measures to characterize the coherent physics of bubble clusters through WPCA-TD.First, to quantify the degree of incoherent bubble oscillations, we define the spectral entropy of the weighted co-variance matrix, ΩΞ , as where ln 2 (N) is the normalization factor.The spectral entropy of data is a discrete analogue of the Shannon entropy based on the spectrum of data and has been used to quantify the randomness of data in various applications (Kullback 1997;Alter, Brown & Botstein 2000;Hu et al. 2005;De Domenico & Biamonte 2016).Aubry, Guyonnet & Lima (1991) introduced a similar definition of entropy for spatio-temporal signals of canonical fluid flows from their POD eigenvalues and used the entropy to characterize flow instabilities.
In the present context, when bubbles are in perfect correlation, we expect to excite only the first PC capturing the entire energy of the system: yielding ŜvN = 0.In contrast, if the energy is equi-partitioned into all PCs, σ 2 k = 1/N for all k : k ∈ [1, N] and ŜvN = 1.Second, we define the coherence measure C: where Here, σ 2 1 is the first PC-variance excluding the contribution of interactions, obtained from the WPCA-TD of the same data using a diagonal weight matrix Ω whose diagonal entries are those of Ω.Further details of Ω are provided in Appendix B. In the limit of perfect correlation (also see (1.1)), (2.21) Therefore, C ≈ 1.This condition is typically realized in monodisperse clusters under weak (linear) oscillations (d 'Agostino & Brennen 1989).For real clusters under strong excitation, bubbles are not perfectly correlated and the energy can be distributed in broad features.In this regime, approximation (2.20) is not necessarily expected to hold and C can take values far from unity.Phenomenologicaly speaking, C quantifies from data the degree to which the most coherent mode of oscillation is represented by the cluster's mean-field interaction.A schematic of the process to obtain C from data is provided in Appendix B.

Numerical experiments
3.1.Datasets To verify and demonstrate the WPCA-TD, we use datasets of spherical clusters excited by 40 cycles of harmonic pressure excitation.Each cluster contains O(10-10 3 ) bubbles with their initial radii following log-normal distributions with a reference radius of , where s d is the lognormal standard deviation as the measure of polydispersity (Maeda & Colonius 2019).We address s d = [0.1,0.3, 0.5, 0.7].In real bubble clouds, nuclei are expected to be polydisperse.Here, s d = 0.7 may be a representative estimation based on previous studies (Katz 1978 The frequency of excitation is f = 500 kHz unless noted, near the adiabatic resonant frequency of the reference bubble.The amplitude of excitation is defined by A, relative to the ambient pressure at p 0 = 1.0 atm.For each set of parameters, we compute an ensemble average by taking a mean of the results from 20 bubble clouds with distinct spatial placements of bubbles in the clouds.For data generation, we use mesh-free, coupled Keller-Miksis equations by modifying previous methods.Details of this method are provided in Appendix D.

Visualization of representative data
In figure 2, we show the evolution of representative quantities of a bubble cloud in the dataset with (B 0 , s d , N) = (0.5, 0.1, 100) excited at A = 20, during a stationary state.Figure 2(a) shows the radius of a representative bubble.The plot presents familiar features of cavitation bubbles forced by continuous, strong excitation, including fast events of collapse/rebound and slow growth/decay between them.During the slow phase, small-amplitude oscillations at excitation amplitude are evident.For the same bubble, figure 2(b) shows the evolution of the velocity potential obtained from the raw data, and those from the first and second dominant features extracted through the WPCA-TD.Compared with the radius, the raw data of the velocity potential looks much more symmetric about zero, even around the collapse events.The potential of the first feature presents a sinusoidal profile at the excitation frequency and with a constant amplitude which is close to the peak amplitude of the original data.The collapse events are not captured in this feature.The potential of the second feature is symmetric but has a much lower amplitude compared with the first feature.There is no clear similarity between the original data and the second feature, unlike that between the original data and the first-PC.Figure 2(c) shows the void fraction.The fraction oscillates with an amplitude of approximately 5.0 × 10 −4 near 3.0 × 10 −3 at the excitation frequency.Slow, small amplitude of variations are also observed.Figure 2(d) shows the kinetic energy of fluid induced by bubble oscillations.The energy oscillates approximately 0.75 μJ at doubled the excitation frequency.Although the frequency is as expected, the peak amplitude largely fluctuates in the window as well as the waveform is not symmetric, unlike the void fraction.The fluctuation and asymmetry indicated can be associated with the incoherent oscillations to the kinetic energy.For instance, if one bubble is expanding and another bubble is collapsing out of phase, their net contribution may not appear in the void fraction due to mutual cancellation but can appear in the energy.Overall, the averaged quantities, void fraction and the energy, are much smoother than the individual bubble dynamics.This can be trivially explained by the incoherence of violent collapse events among bubbles and coherence of the linear response against fundamental frequencies.Meanwhile, the quantitative nature of the coherent response is not predictable or easily analysable from these plots due to strong nonlinearity, especially under inter-bubble interactions with disordering factors including randomness of bubble position and polydispersity.
Figure 3(a-c) shows the projected side-views of the three-dimensional bubble cloud of figure 2. The size of spheres can be interpreted as the mean energy of oscillations of bubbles at those locations in each feature.The overall distribution of the spheres of the first feature resembles that of the raw data, while the energy of the bubbles in the second feature is much smaller than that of the first feature.The plots therefore visually confirm that the first feature represents the most energetic mode of oscillations in the original data.This result also agrees with the observation of figure 3(b).

Error analysis
To show the effectiveness of the pre-processing, in figure 4, we plot the relative errors of the time-averaged kinetic energy of bubble clusters, one approximated using the original variable ( q T T ( r )q ) and the other using the transformed variables ( ξ T P( η )ξ ), against the excitation amplitude.At A < 10 −1 , the error is nearly zero for both approximations.At A > 10 −1 , at which bubble dynamics become nonlinear, the error grows with A for the former, while it remains small for the latter.This result confirms the improved approximation by the pre-processing.

Amplitude dependence of the key measures
The coherent dynamics critically depend on the excitation amplitude.Figure 5 shows the dependence of σ 2 1 , ŜvN and C against A for various density and polidispersity parameters of clouds.This dependence is best highlighted in the result of a sparse, weakly polydisperse cluster (B 0 = 0.5, s d = 0.1) in figure 5(a).The relative importance of the first PC, σ 2 1 , decays nearly monotonically from 0.9 to 0.2 through three distinct regimes.For A 0.2, σ 2 1 ≈ 1, meaning the entire energy is captured by the first PC.Then constant around 0.2.The decay indicates the decrease in the coherence with increasing A, and can be explained by the excitation of the nonlinear oscillations and cavitation triggered at A ≈ 1 and above.Here, ŜvN has a profile vertically mirrored to σ 2 1 ; ŜvN increases from approximately 0.1 to 0.5 through the three regimes, indicating more partitioning of the energy into multiple PCs and increase of incoherence, by increasing A. The mirrored profiles of σ 2 1 and ŜvN suggest that these parameters are complementary.Remarkably, C draws a square-well like profile, where C ≈ 1 for both low (A < 0.2) and large (A > 5) amplitudes reaching a minimum value (C ≈ 0.5) at A ≈ 1.This counter-intuitive result suggests that the energy of the first PC is scaled by the mean-field parameter B d regardless of the increase of incoherence for large A, and the scaling is lost only for the intermediate range at 1 < A < 5. Comparisons with the result of dense, weakly polydisperse clusters (B 0 , s d ) = (5.0,0.1) shown in figure 5(b) highlights the effect of the density of bubbles.In figure 5(b), σ 2 1 takes values near unity at small A, and decays at A ≈ 1.0 to approximately 0.4 and stays nearly constant at A > 5.0.At A > 10, σ 2 1 slightly grows against A. Here, ŜvN has a profile vertically mirrored to σ 2 1 .The features of σ 2 1 and ŜvN are similar to those observed in figure 5(a), except that the transition occurs at a larger range of A. The sudden decay and growth of σ 2 1 and ŜvN can likewise be associated with the linear-to-nonlinear transition of bubble dynamics which results in incoherence.Meanwhile, the positive shift of the transition range of A indicates that the dense bubble clouds tend to behave more coherently than sparse bubble clouds, agreeing 985 A23-10 with the previous theory (d 'Agostino & Brennen 1989;Maeda & Colonius 2019).Here, C is greater than 0.5 at almost all values of A, and is relatively more insensitive against A compared with in figure 5(a), indicating that the cluster dynamics is moderately controlled by B d .Figures 3(c) and 3(d) respectively correspond to sparse and dense clusters with strongly polydisperse nuclei; (B 0 , s d ) = (0.5, 0.7) and (5.0, 0.7).These plots show clear differences from those of weekly polydisperse clusters in figures 5(a) and 5(b).For the sparse clusters (figure 5c), σ 2 1 mildly decays from 0.8 to 0.7 at 0.1 < A < 0.2, sharply decays to the minimum of 0.2 at A ≈ 0.4 and then mildly grows to 0.3 at A = 1.2.The sharp decay resembles that in figure 5(b), while the slope is milder.Additionally, ŜvN has a profile mirrored to σ 2 1 , but it is less symmetric than in figures 5(a) and 5(b).Here, ŜvN is nearly constant around 0.2 at 0.1 < A < 2.0 and draws a concave curve with its maximum of approximately 0.5 at A ≈ 5.0 followed by a smooth decay to 0.3 at A = 20.The C almost constantly increases from 0.2 to O(1) throughout the plot, indicating that the coherent dynamics is controlled by B d , only at large A, unlike the weakly polydisperse case of figure 5(a).The overall trend of the plots in figure 5(b) is similar to that in figure 5(a), although the changes of variables against A are milder in figure 5(b).
Overall, the mutual trends of variables at A > 5 are similar between figures 5(a) and 5(c), and between figures 5(b) and 5(d), indicating the decreasing influence of the initial polydispersity in the nonlinear regime.At A < O(1), the polydisperse clouds tend to have smaller values of σ 2 1 , and larger values of ŜvN and C.This can be explained by the enhancement of incoherent dynamics induced by polydispersity.

Principal component spectrum
To gain deeper insights to the meaning of C, in figures 6(a)-6(d), we show the PC-variances obtained at A = 2 × 10 −2 , 1.2 and 20, for the first 10 PCs, obtained from the sparse, weakly polydisperse clouds blueplotted in figure 5(a).The insets show the evolution of the square-root of the normalized total energy ( √ K), and those of the first and the second PC-variances ( √ K 1 and √ K 2 ), during the four periods of excitation in statistically stationary states.As expected, in the linear regime (figure 6a), the first PC occupies nearly the entire energy and √ K 1 evolves at the fundamental frequency.In the transition regime (figure 6b), the first PC occupies 40 % of the energy and the rest is partitioned into the sub-dominant PCs with a smooth decay.Here, √ K evolves more chaotic than the linear regime, as expected due to the nonlinear response of bubbles.Both √ K 1 and √ K 2 evolve with similar quasi-periodic profiles.We interpret that both PCs represent the coherent part of the energy.
Interestingly, in the nonlinear regime (figure 6c), energy partitioning is non-smooth; the first PC occupies 40 % of the total energy similar to the transition regime, but the rest of the energy is broadly distributed into the other PCs with much smaller amplitudes.The evolution of √ K is non-periodic with noisy, fine structures of spikes.These spikes are expected due to the incoherent collapse events.The evolution of √ K 1 is, in contrast, highly periodic and somewhat resembles that of figure 6(b).The evolution of √ K 2 is more chaotic and less smooth than √ K 1 .The difference between the first-PC and the rest of the PCs suggest that, in this regime, only the first PC captures a major coherent feature and the rest of the PCs represent more the incoherent dynamics as broadband noise.Figure 6(d) shows the result of the nonlinear regime with a stronger excitation amplitude (A = 20).Overall, both the spectrum and the evolution of K look similar to that of figure 6(c), other than that the amplitude of the first PC-variance is increased to 60 %.The resemblance of  figures 6(c) and 6(d) indicates that the dynamical features identified from these two plots are common in the nonlinear regimes.The resemblance of the evolution of the first PCs in the linear and the nonlinear regimes can explain the recovery of C at A > 5 in figure 6(a).Although the overall dynamics are much more chaotic in the nonlinear regime, the contribution of the coherent interactions to the system's energy effectively appears only in the first PC in both regimes and therefore the relative contribution of the interaction to the first PC is commonly scaled by B d .This result also implies that the underlying coherence in the nonlinear regime represents the perfect correlation (synchronized oscillations) like that of linear, monodisperse clouds.

Evaluation of the coherence measure
To assess the variation of the coherence measure dependent on the three regimes, in figures 7(a)-7(c), we plot B 1 against B d for O(10 3 ) clusters with various values of s d , N and R C , at the three distinct excitation amplitudes (A = 2 × 10 −2 , 1.2 and 20).Appendix D summarizes the parameters used.With the weak excitation (figure 7a), bubble oscillations are in a linear regime and B d ≈ B 0 .The data points are collapsed on the line of C = 1 for s d = 0.1, while data points are scattered for the other values of s d .This result is expected as B d was originally defined to scale the coherence of the monodisperse, perfectly correlated bubbles in the linear regime (d 'Agostino & Brennen 1989;Zeravcic et al. 2011).With the intermediate excitation (figure 7b), the bubble oscillations are nonlinear.The data points are scattered from C = 1, regardless of the value of s d .In this regime, the result indicates that the coherence is lost due to the nonlinear dynamics, regardless of polydispersity.Surprisingly, with the strong excitation (figure 7c), the data points are collapsed on the line of C = 1, meaning that the variance of the first PC is scaled by B d , regardless of the parameters.This collapse is not observed for the second PC, regardless of the parameters (Appendix E).The results suggest that the scaling of B 1 is universal for cavitating clusters, which are typically excited at O(1) MPa and above.Physically speaking, the scaling indicates that in cavitating clusters, the energy is partitioned into a single coherent mode of correlated (synchronized) oscillations and incoherent modes generalizing the aforementioned interpretation of figures 6(c) and 6(d).The coherent energy is controlled by the mean field originally derived for monodisperse, near-equilibrium bubbles.It is suggested that this partitioning and scaling are universal regardless of nuclei's polydispersity and the degree of nonlinearity.This finding explains the successful use of B d in characterizing and controlling seemingly disordered clusters in our previous studies (Maeda & Colonius 2019;Maeda & Maxwell 2021).The isolation of coherence implies that the details of the microscopic scale as well as of the many-body interactions represented by higher-order PCs could be modelled as fast variables, which force the macroscopic (relevant) scales in the form of noise.A rigorous way to corroborate this hypothesis might be the Mori-Zwanzig projection operator formalism (Mori 1965) by assuming that the fluctuations of fast and slow variables are uncorrelated.Although the choice of relevant slow variables may be unclear, the present approach is promising since the WPCA-TD can be seen as a prompter for more refined models of the evolution of slow variables.

Significance of the WPCA-TD to inform on cavitation physics
In general, PCA can be considered as a mathematical technique for data decomposition and the physical meaning of extracted features can be left to one's interpretation.To the authors' knowledge, studies of complex fluid flows employing data decomposition and feature extraction techniques often rely on one's intuition to discuss the physical meaning of features.In fact, in § 3.5, the analyses of the PC spectra required rigorous interpretations based on previous knowledge on cavitation physics.Meanwhile, the correlation between B 1 and B d identified in the last section shows that the present WPCA-TD can directly extract B d from data without additional signal processing like spectral filtering, which are often required for Fourier-based analysis, despite the presence of noisy and incoherent features.To generalize, WPCA-TD is not only a tool for data decomposition and feature extraction, but also can be used to directly inform on the non-dimensional number that controls the coherent physics of interests.This informative aspect of the WPCA-TD can provide a meaningful shortcut to address the physics of cloud cavitation.
To perform WPCA-TD, bubble dynamics data can be obtained from experimental measurements as well as other numerical approaches (e.g.Kameda & Matsumoto 1996;Maeda & Colonius 2018;Pishchalnikov et al. 2019).The cluster's shape can be arbitrary.Although a bubble's translation and deformation are neglected in the present numerical experiments, WPCA-TD can incorporate dynamical variables controlling these effects (e.g.Ilinskii, Hamilton & Zabolotskaya 2007;Murakami, Gaudron & Johnsen 2020).Physically meaningful data decomposition requires the fine temporal resolution of individual bubble dynamics.In practice, such information would be difficult to obtain in experiments except for a small number of bubbles in a highly controlled environment.We thus emphasize that WPCA-TD would primarily be useful to process fine temporal resolution of numerical data.

Conclusion
In conclusion, to corroborate the coherent dynamics of bubble clusters, we introduced and demonstrated WPCA-TD, a method of PCA to comprehensively decompose the time series of Lagrangian bubble dynamics data into coherent dynamical features, in way similar to the modal decomposition of Eulerian flow field data.The data are pre-processed such that the PC-variance of the features becomes consistent with the hydrodynamic potential energy induced by bubble oscillations that is predicted by the R-P equation.By analysing simulation datasets of clusters under harmonic excitation, we demonstrated that the coherent energy and the degree of incoherence are respectively quantified by the variance and the spectral entropy.The coherence was lowered by disorders induced by the nuclei's polydispersity and nonlinear response of bubbles, as expected.Meanwhile, in cavitating regimes, underlying, correlated modes of oscillations were isolated in a single dominant feature.The variance of this feature was found to be controlled by the previously identified mean-field parameter, B d , regardless of the disordering factors, indicating that the underlying coherent dynamics may be universal in cavitating clusters.These results suggest that the method can provide a simplified and comprehensive representation of . Comprehensive schematic of the present procedure to obtain the spectral entropy, PC-variance and coherence measure from input data.
events do not influence much the temporal mean of the bubble size.Therefore, η is close to η except at collapse events and, even if η changes rapidly during the collapse and rebound, this change has a relatively small influence on the mean behaviour of the system at the time scale of (statistically) stationary bubble oscillations.We can express the instantaneous energy of this approximate system as With the same initial conditions (ξ = ξ and η = η at t = 0), the temporal mean of the energy of the original system is approximated as Note that Preston et al. (2007) used a POD-based analysis of single bubble dynamics for reduced-order modelling of heat and mass diffusion across the bubble interface.In the study, the temperature and concentration fields were obtained by solving partial differential equations and then represented by POD modes, following the POD/Galerkin framework.This is distinct from the present PCA (POD) of Lagrangian bubble dynamics data based on the direct transformation and projection of the Rayleigh-Plesset equation.

Appendix B. Schematic of the input-output procedure
Figure 8 shows the schematic of the present procedure to obtain the spectral entropy, PC-variance and coherence measure from the input data including the time series of the radius and radial velocity of Lagrangian bubbles.
For computing C, we introduced the alternative weight, Ω .Here, Ω is defined through P : P ( s ) = Ω Ω * , where P is the diagonal matrix with its entries from P: The corresponding inner product, ξ T P ξ , represents the portion of the energy of the system excluding the contribution of inter-bubble interactions: where Appendix C. Details of the K-M equation used for data preparation In this section, we provide details of the formulation of the Keller-Miksis (K-M) equation for multiple bubbles, which is used to generate the Lagrangian bubble dynamics data.
Various extensions of the Rayleigh-Plesset equation and its applications to a system of multiple bubbles are available (Takahira, Akamatsu & Fujikawa 1994;Doinikov 2004;Ilinskii et al. 2007;Yasui et al. 2008).Our formulation can be derived from the K-M equation extended for multiple bubbles.First, we recall the general formulation for the oscillations of interacting spherical bubbles in weakly compressible liquid with arbitrary inter-bubble distances, whose derivation is, for instance, provided in Appendix 2.4 of Fuster & Colonius (2011).In the present study, we use a simplified version of this derivation.To recall, Fuster & Colonius (2011) describe the radial evolution of bubble i as where F * and I * represent the forcing due to the external potential and the inter-bubble interaction, respectively, and are expressed as and Here, φ ∞ is the velocity potential of liquid at infinity, and φ i (R i ) and H i are the potential and the enthalpy of liquid evaluated at the surface of bubble i; t is the retarded time defined as t = t − (d ij − R j )/c, where (d ij − R j )/c represents the travel time for the pressure wave to reach bubble i from the surface of the bubble j and d ij is the distance between the centres of those bubbles.In the sparse limit, the equation recovers the original Keller-Miksis equation (Keller & Miksis 1980).To close the equations, several relations are considered.Using Bernoulli's equation, the potential for the bubble i can be expressed as The velocity potentials at the bubble i and bubble j satisfy the following relation: The enthalpy and the potential derivative are approximated as Finally, p ∞ is obtained using the information of the background Eulerian field computed on a mesh.
In the present study, we simplify F * and I * by invoking further approximations.First, consider that the bubble cluster size is smaller than the characteristic length-scale of the pressure wave in the field, λ c .The inter-bubble distance in the cluster is naturally smaller than λ c : Dividing both sides with c, where f c and T c are the characteristic frequency and the period of the wave in the field.Therefore, the difference between t and t is much smaller than the characteristic time scale of the dynamics.Given this knowledge, we approximate that t ≈ t.Second, we neglect the terms of the order of ( Ṙ/c) 2 .Using these approximations, we can simplify F * and I * as and

Figure 2 .
Figure 2. Evolution of representative quantities of a bubble cloud with (B 0 , s d , N) = (0.5, 0.1, 100) excited at A = 20, during a stationary state.(a) Radius of a representative bubble.(b) For the same bubble, ξ obtained from the raw data, and those from the first and second dominant features extracted using WPCA-TD.(c) Void fraction.(d) Kinetic energy of the fluid induced by the bubble cloud.

Figure 3 .Figure 4 .
Figure 3. Projected side-views of the three-dimensional bubble cloud of figure 2. The size of the spheres denotes the root-mean-square amplitude of the velocity potential evaluated at the bubble surface at corresponding locations during the stationary state of oscillations, for (a) the raw data, (b) the first principal feature and (c) the second principal feature.

Figure 6 .
Figure 6.(a-d) PC spectral profiles at A = 2 × 10 −2 , 1.2, 8.6 and 20.Insets show the evolution of the square root of the normalized total energy ( √ K, black) and those of the first (blue) and the second (red) PC-variances ( √ K 1 and √ K 2 ), during the four periods of harmonic excitation, with t being a non-dimensional time t = tf .The y-axis of each inset is normalized by the maximum value of √ K.
;Ando, Liu & Ohl 2012; Maeda & Maxwell 2021)9)and is used unless noted.The smaller values of s d are used in some cases to quantify the effect of polydispersity.The bubbles are randomly distributed in the spherical region with a specified cluster radius.Similar parameters of clusters were previously simulated to compare with experiments(Maeda & Colonius 2019;Maeda & Maxwell 2021).The density of bubbles is characterized by B 0 , the value of B d at rest.The far-field pressure is given as p ://doi.org/10.1017/jfm.2024.313Published online by Cambridge University Press complex bubble dynamics.Analogous to the use of POD for reduced-order modelling (ROM) of various flows, the dynamical features extracted by PCA may be used for ROM, without directly solving many-body interactions.Such a model may be promising for controlling cavitation and nucleation without tracking individual nuclei. https