Slip effects on solid boundaries are common in complex fluids. Boundary depletion layers in polymer solutions can create apparent slip effects, which can in turn significantly impact the dynamics of moving bodies. Motivated by microswimmer locomotion in such environments, we consider filamentous bodies experiencing slip-like boundary conditions. Using Navier’s slip model, we derive three slip slender-body theories, linking the body’s velocity to the distribution of hydrodynamic forces. The models are shown to be consistent with each other and with existing numerical computations. As the slip length increases, we show that the drag parallel to the body decreases towards zero while the perpendicular drag remains finite, in a manner which we quantify. This reduction in the drag ratio is shown to be inversely related to microswimmer mobility in two simple swimmer models, and the corresponding increase in mobility could help rationalise empirically observed enhanced swimming in complex fluids.
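For readers unfamiliar with the model, Navier's slip condition (quoted here in its standard flat-wall form, not taken from the paper itself) replaces the no-slip condition by letting the tangential velocity at the wall be proportional to the local shear rate through the slip length $\lambda$:
$$ u_{t}\big|_{\text{wall}} = \lambda \left.\frac{\partial u_{t}}{\partial n}\right|_{\text{wall}}, $$
so that $\lambda = 0$ recovers the no-slip condition and $\lambda \to \infty$ corresponds to a shear-free (perfect-slip) boundary.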
Regression and classification are closely related, as shown in this chapter, which discusses methods that map a linear regression function into a probability function via either the logistic function (for binary classification) or the softmax function (for multi-class classification). Based on this probability function, an unlabeled sample can be assigned to one of the classes. The optimal model parameters can be obtained from the training set so that either the likelihood or the posterior probability of the parameters is maximized.
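As a minimal sketch of the mapping described above (not code from the chapter; the weights and sample below are made-up placeholders), a linear score is passed through the logistic function for binary classification, or a vector of linear scores through the softmax for multi-class classification:

```python
import numpy as np

def logistic(z):
    """Map a linear score z = w·x + b to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(scores):
    """Map a vector of linear scores to class probabilities that sum to 1."""
    shifted = scores - np.max(scores)          # shift for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical example: one sample x, binary weights w, and a 3-class weight matrix W.
x = np.array([1.2, -0.7])
w, b = np.array([0.8, -0.5]), 0.1
print(logistic(w @ x + b))                     # P(class 1 | x) for binary classification

W = np.array([[0.8, -0.5], [0.1, 0.4], [-0.3, 0.9]])
print(softmax(W @ x))                          # class probabilities for 3 classes
```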
This paper investigates the flow past a flexible splitter plate attached to the rear of a fixed circular cylinder at a low Reynolds number of 150. A systematic exploration of the plate length ($L/D$), flexibility coefficient ($S^{*}$) and mass ratio ($m^{*}$) reveals new laws and phenomena. The large-amplitude vibration of the structure is attributed to a resonance phenomenon induced by fluid–structure interaction. The modal decomposition indicates that resonance arises from the coupling between the first and second structural modes, where the excitation of the second structural mode plays a critical role. Owing to the combined effects of added mass and periodic stiffness variations, the two modes become synchronised, oscillating at the same frequency while maintaining a fixed phase difference of $\pi /2$. This results in the resonant frequency being locked at half of the second natural frequency, which is approximately three times the first natural frequency. A reduction in plate length and an increase in mass ratio are both associated with a narrower resonant locking range, while a higher mass ratio also shifts this range towards lower frequencies. A symmetry-breaking bifurcation is observed for cases with $L/D\leqslant 3.5$, whereas for $L/D=4.0$ the flow remains in a steady state with a stationary splitter plate prior to the onset of resonance. For cases with a short flexible plate and a high mass ratio, the shortened resonance interval causes the plate to return to the symmetry-breaking stage after resonance, gradually approaching an equilibrium position determined by the flow-field characteristics at high flexibility coefficients.
This chapter offers a comprehensive overview of large language models (LLMs), examining their theoretical foundations, core mechanisms, and broad-ranging implications. We begin by situating LLMs within the domain of natural language processing (NLP), tracing the evolution of language modeling from early statistical approaches to modern deep learning methods.

The focus then shifts to the transformative impact of the Transformer architecture, introduced in the seminal paper "Attention Is All You Need". By leveraging self-attention and parallel computation, Transformers have enabled unprecedented scalability and efficiency in training large models.

We explore the pivotal role of transfer learning in NLP, emphasizing how pretraining on large text corpora followed by task-specific fine-tuning allows LLMs to generalize across a wide range of linguistic tasks. The chapter also discusses reinforcement learning with human feedback (RLHF), a crucial technique for refining model outputs to better align with human preferences and values.

Key theoretical developments are introduced, including scaling laws, which describe how model performance improves predictably with increased data, parameters, and compute resources, and emergence, the surprising appearance of complex behaviors in sufficiently large models.

Beyond technical aspects, the chapter engages with deeper conceptual questions: Do LLMs genuinely "understand" language? Could advanced AI systems one day exhibit a form of consciousness, however rudimentary or speculative? These discussions draw from perspectives in cognitive science, philosophy of mind, and AI safety.

Finally, we explore future directions in the field, including the application of Transformer architectures beyond NLP, and the development of generative methods that extend beyond Transformer-based models, signaling a dynamic and rapidly evolving landscape in artificial intelligence.
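As a toy illustration of the self-attention mechanism referred to above (a sketch of the standard scaled dot-product form, not an excerpt from any particular model; the matrices below are random placeholders), each token representation is updated as a softmax-weighted average of all value vectors:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # weighted average of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                          # 5 tokens, 8-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (5, 8)
```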
This chapter is concerned with the constrained optimization problem, which plays an important role in ML as many ML algorithms essentially maximize or minimize a given objective function subject to either equality or inequality constraints. Such constrained optimization problems can be reformulated in terms of the Lagrangian function, which includes, in addition to the original objective, an extra term for the constraints weighted by their Lagrange multipliers. The chapter also considers the important duality principle, based on which the constrained optimization problem can be addressed as either the primal (original) problem or the dual problem; the dual is equivalent to the primal, in the sense that its solution coincides with that of the primal, if a set of KKT conditions is satisfied. The chapter further considers two methods, linear and quadratic programming, of which the latter is the foundation of the support vector machine (SVM), an important classification algorithm considered in a later chapter.
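A minimal sketch of a small quadratic program of the kind mentioned above (not code from the chapter; the matrices Q, c, A, b are made-up values), solved here with scipy's SLSQP routine, which enforces the equality constraint internally:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical quadratic program: minimize 0.5 x^T Q x + c^T x  subject to  A x = b.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
c = np.array([-1.0, -2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

objective = lambda x: 0.5 * x @ Q @ x + c @ x
constraint = {"type": "eq", "fun": lambda x: A @ x - b}   # equality constraint A x - b = 0

result = minimize(objective, x0=np.zeros(2), method="SLSQP", constraints=[constraint])
print(result.x)    # constrained minimizer
```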
This chapter considers unsupervised learning methods for clustering analysis when the data samples in the given dataset are no longer labeled, including the K-means method and the Gaussian mixture model (GMM). The K-means algorithm is straightforward in theory and simple to implement. Starting from a set of K randomly initialized seeds assumed to be the mean vectors of K clusters, the algorithm updates them iteratively until they stabilize. The drawback of this method is that the resulting clusters are characterized only by their means, while the shapes of their distributions are not considered. If the distributions of the actual clusters in the dataset are not spherical, they will not be properly represented. This problem can be addressed if the dataset is modeled as a mixture of Gaussian distributions, each characterized by its mean and covariance, which are estimated iteratively by the expectation-maximization (EM) method. The resulting Gaussian clusters reveal the structure of the dataset much more accurately. The K-means and Gaussian mixture methods are analogous, respectively, to the discriminative minimum-distance classifier and the generative Bayesian classifier. Following the same idea as the GMM, the last section of this chapter also considers the Bernoulli mixture model for clustering binary data.
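A minimal K-means sketch following the iteration described above (not the chapter's code; the two-cluster data are synthetic placeholders):

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    """Plain K-means: alternate assigning points to the nearest mean and re-averaging."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]   # K randomly chosen seeds
    for _ in range(n_iters):
        # Assign each sample to its closest current mean.
        labels = np.argmin(np.linalg.norm(X[:, None] - means[None], axis=2), axis=1)
        # Recompute each mean as the centroid of its assigned samples.
        new_means = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else means[j] for j in range(k)])
        if np.allclose(new_means, means):                  # means have stabilized
            break
        means = new_means
    return means, labels

# Hypothetical two-cluster data.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
means, labels = k_means(X, k=2)
print(means)
```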
This chapter reviews basic numerical methods for solving systems of equations, including fixed-point iteration, which will be used in the discussion of reinforcement learning, and the Newton-Raphson method for solving both univariate and multivariate systems, which is closely related to the optimization methods discussed in the following chapters. Newton’s method approximates the function in question by the first two (constant and linear) terms of its Taylor expansion at an initial guess of the root, which is then iteratively improved to approach the true root where the function equals zero. The appendices of the chapter further discuss some important computational issues, such as the order of convergence of these methods, which may be of interest to more advanced readers.
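A minimal Newton-Raphson sketch implementing the iteration described above (not the chapter's code; the example function is a made-up illustration):

```python
def newton_raphson(f, f_prime, x0, tol=1e-10, max_iters=50):
    """Iteratively refine a root estimate using the linear Taylor approximation of f."""
    x = x0
    for _ in range(max_iters):
        step = f(x) / f_prime(x)      # solve the linearized equation f(x) + f'(x)*dx = 0
        x -= step
        if abs(step) < tol:
            break
    return x

# Hypothetical example: the positive root of f(x) = x^2 - 2, i.e. sqrt(2).
print(newton_raphson(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0))
```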
This chapter considers a set of algorithms for statistical pattern classification, including two simple classifiers based on nearest neighbors and minimum distances, and two more powerful methods, naïve Bayes and adaptive boosting (AdaBoost). The Bayes classifier is a typical generative method based on the assumption that, in the training set, all data points of the same class are samples from the same Gaussian distribution; it classifies any unlabeled data sample into the class with the highest posterior probability given the sample, proportional to the product of the likelihood and the prior probability. In contrast, the AdaBoost classifier is a typical boosting (ensemble learning) algorithm that iteratively improves a set of weak classifiers.
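A minimal sketch of the generative Gaussian Bayes classification rule described above (not the chapter's code; the two-class data are synthetic placeholders), assigning a sample to the class whose likelihood-times-prior score is highest:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian_bayes(X, y):
    """Fit one Gaussian per class plus class priors from a labeled training set."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc.T), len(Xc) / len(X))
    return params

def predict(params, x):
    """Assign x to the class with the highest posterior (likelihood times prior)."""
    posteriors = {c: multivariate_normal.pdf(x, mean=mu, cov=cov) * prior
                  for c, (mu, cov, prior) in params.items()}
    return max(posteriors, key=posteriors.get)

# Hypothetical two-class training data.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(3, 1, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
print(predict(fit_gaussian_bayes(X, y), np.array([2.5, 2.5])))   # expected: 1
```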
A thin, evaporating sessile droplet with a pinned contact line containing inert particles is considered. In the limit in which the liquid flow decouples from the particle transport, we discuss the interplay between particle advection, diffusion and adsorption onto the solid substrate on which the droplet sits. We perform an asymptotic analysis in the physically relevant regime in which the Péclet number is large, i.e. ${\textit{Pe}}\gg 1$, so that advection dominates diffusion in the droplet except in a boundary layer near the contact line, and in which the ratio of the particle velocities due to substrate adsorption and diffusion is at most of order unity as ${\textit{Pe}}\rightarrow \infty$. We use the asymptotic model alongside numerical simulations to demonstrate that substrate adsorption leads to a different leading-order distribution of particle mass compared with cases with negligible substrate adsorption, with a significant reduction of the mass in the suspension – the nascent coffee ring reported in Moore et al. (J. Fluid Mech., vol. 920, 2021, A54). The redistribution extends the validity of the dilute suspension assumption, albeit at the cost of a breakdown due to the growth of the deposited layer; both are important considerations for future models that seek to model the porous deposit regions accurately.
This chapter discusses a two-layer competitive learning network for unsupervised clustering based on a competitive learning rule. The weights of each node of the output layer are treated as a vector in the same space as the input vectors. Given an input, all nodes compete to become the sole winner (winner-take-all) with the highest output value, and the winner's weights are modified in such a way that it becomes more likely to win when the same input is presented to the input layer next time. By the end of this iterative learning process, each weight vector has gradually moved toward the center of one of the clusters in the space, thereby representing that cluster. The chapter further considers the self-organizing map (SOM), a network based on the same competitive learning rule but modified in such a way that nodes in the neighborhood of the winner learn along with it, modifying their weights as well but to a lesser extent. Once fully trained, the SOM achieves the effect that neighboring nodes respond to similar inputs, mimicking a typical behavior of certain visual and auditory cortical areas in the brain.
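A minimal winner-take-all sketch of the competitive learning rule described above (not the chapter's code; the three-cluster data are synthetic placeholders). A SOM would additionally move the winner's neighbors by a smaller amount:

```python
import numpy as np

def competitive_learning(X, n_nodes=3, lr=0.1, n_epochs=50, seed=0):
    """Winner-take-all learning: only the closest weight vector moves toward each input."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), size=n_nodes, replace=False)].copy()   # initial weight vectors
    for _ in range(n_epochs):
        for x in rng.permutation(X):
            winner = np.argmin(np.linalg.norm(W - x, axis=1))       # node closest to x wins
            W[winner] += lr * (x - W[winner])                       # move the winner toward x
    return W

# Hypothetical data drawn from three well-separated clusters.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(c, 0.3, (30, 2)) for c in (0.0, 2.0, 4.0)])
print(competitive_learning(X))    # each row ends up near one cluster center
```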
Ionic surfactants are commonly employed to modify the rheological properties of fluids, particularly in terms of surface viscoelasticity. Concurrently, external electric fields can significantly impact the dynamics of liquid threads. A key question is how ionic surfactants affect the dynamic behaviour of liquid threads in the presence of an electric field. To investigate this, a one-dimensional model of a liquid thread coated with surfactants within a radial electric field is established, employing the long-wave approximation. We systematically investigate the effects of dimensionless parameters associated with the surfactants, including the surfactant concentration, the dilatational Boussinesq number ${\textit{Bo}}_{\kappa \infty }$ and the shear Boussinesq number ${\textit{Bo}}_{\mu \infty }$. The results indicate that increasing the surfactant concentration and the two Boussinesq numbers reduces both the maximum growth rate and the dominant wavenumber. In addition, both the electric field and the surfactants mitigate the breakup of the liquid thread and the formation of satellite droplets. At low applied electric potentials, the surface viscosity induced by surfactants predominantly governs this suppression: surface viscosity suppresses the formation of satellite droplets by maintaining the neck point at the centre of the liquid thread within a single disturbance wavelength. When the applied potential is high, the electric stress has two main effects: the external electric field exerts a normal pressure on the liquid thread surface, suppressing satellite droplet formation, while the internal electric field inhibits liquid drainage. Surface viscosity further stabilises the system by suppressing the flow dynamics during this process.
This chapter first introduces a simple two-layer perceptron network based on a straightforward learning rule. This perceptron network can be used as a linear classifier capable of multiclass classification if the classes are linearly separable, and it can be further generalized for nonlinear classification when the kernel method is introduced into the algorithm. The main algorithm discussed in this chapter is the multi-layer (three or more layers) backpropagation network, a supervised method widely used for classification that also serves as one of the building blocks of the much more powerful deep learning methods and other artificial intelligence methods. Based on the labeled samples in the training set, the weights of the backpropagation network are sequentially modified during training in such a way that the error, the difference between the actual output and the desired output (the ground-truth label of the input), is reduced by the gradient descent method. Using the same training process, this network can be modified to serve as an autoencoder for dimensionality reduction, similar to what PCA can do.
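A minimal backpropagation sketch in the spirit of the description above (not the chapter's code), training a one-hidden-layer network on the classic XOR task by gradient descent on the squared error; the architecture, seed and learning rate are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Hypothetical training set: XOR, a task a single-layer perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden weights and biases
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output weights and biases
lr = 1.0

for _ in range(5000):
    # Forward pass.
    H = sigmoid(X @ W1 + b1)
    out = sigmoid(H @ W2 + b2)
    # Backward pass: gradient of the squared error, propagated layer by layer.
    d_out = (out - y) * out * (1 - out)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    # Gradient-descent updates.
    W2 -= lr * H.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(axis=0)

print(np.round(out.ravel(), 2))   # typically close to [0, 1, 1, 0] after training
```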
This chapter discusses the basic methods for solving unconstrained optimization problems, which play an important role in ML, as many learning problems are solved by either maximizing or minimizing some objective function. The solution of an optimization problem, the point at which the given function is minimized, can typically be found by either gradient descent or Newton’s method. The gradient descent method approaches the solution iteratively from an initial guess by moving in the direction opposite to the gradient, while Newton’s method uses the second-order derivative as well as the first-order derivative (the gradient). Newton's method is therefore more effective than gradient descent thanks to this extra piece of information, though at a higher computational cost for calculating the second-order derivatives. In fact, Newton’s method for minimizing a function essentially solves the equation obtained by setting the derivative of the function to zero, i.e., it is the same method used for solving equations considered previously. The chapter also considers some variants of Newton’s method, including the quasi-Newton methods and the conjugate gradient method, which require fewer iteration steps.
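A minimal one-dimensional sketch contrasting the two update rules described above (not the chapter's code; the objective function is a made-up convex example):

```python
# Hypothetical objective: f(x) = x^4 + 2x^2 - 4x, which has a single minimum.
f_prime = lambda x: 4 * x**3 + 4 * x - 4      # first derivative (the gradient in 1-D)
f_second = lambda x: 12 * x**2 + 4            # second derivative

def gradient_descent(x0, lr=0.05, n_iters=200):
    x = x0
    for _ in range(n_iters):
        x -= lr * f_prime(x)                  # step against the gradient
    return x

def newton(x0, n_iters=20):
    x = x0
    for _ in range(n_iters):
        x -= f_prime(x) / f_second(x)         # solve f'(x) = 0 by Newton's method
    return x

print(gradient_descent(2.0), newton(2.0))     # both approach the same minimizer
```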
This chapter considers some basic concepts of essential importance in supervised learning, of which the fundamental task is to model the given dataset (training set) so that the model prediction matches the given data optimally in a certain sense. As the form of the model is typically predetermined, the task of supervised learning is essentially to find the optimal parameters of the model in either of two ways: (a) the least squares estimation (LSE) method, which minimizes the squared error between the model prediction and the observed data, or (b) the maximum a posteriori (MAP) method, which maximizes the posterior probability of the model parameters given the data. The chapter further considers some important issues, including overfitting, underfitting, and the bias-variance tradeoff, faced by all supervised learning methods based on noisy data, and then some specific methods to address such issues, including cross-validation, regularization, and ensemble learning.
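A minimal sketch of the least squares estimate described above, together with a ridge-regularized variant as one example of regularization (not the chapter's code; the data are synthetic and the penalty weight is an arbitrary illustrative value):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical noisy data generated from a line, y = 2x + 1 + noise.
x = np.linspace(0, 1, 20)
y = 2 * x + 1 + rng.normal(scale=0.1, size=x.size)

# Least squares estimate (LSE): choose parameters minimizing the squared error.
A = np.column_stack([x, np.ones_like(x)])              # design matrix for slope and intercept
theta_lse, *_ = np.linalg.lstsq(A, y, rcond=None)

# Ridge-regularized estimate: penalize large parameters to reduce overfitting on noisy data.
lam = 0.1
theta_ridge = np.linalg.solve(A.T @ A + lam * np.eye(2), A.T @ y)

print(theta_lse, theta_ridge)    # both near the true parameters (2, 1)
```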