Depressive symptoms are highly prevalent in first-episode psychosis (FEP) and worsen clinical outcomes. It is currently difficult to determine which patients will have persistent depressive symptoms based on a clinical assessment. We aimed to determine whether depressive symptoms and post-psychotic depressive episodes can be predicted from baseline clinical data, quality of life, and blood-based biomarkers, and to assess the geographical generalizability of these models.
Methods
Two FEP trials were analyzed: the European First-Episode Schizophrenia Trial (EUFEST) (n = 498; 2002–2006) and the Recovery After an Initial Schizophrenia Episode Early Treatment Program (RAISE-ETP) (n = 404; 2010–2012). Participants were aged 15–40 years and met Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria for schizophrenia spectrum disorders. We developed support vector regressors and classifiers to predict changes in depressive symptoms at 6 and 12 months and depressive episodes within the first 6 months. Each model was trained in one sample and externally validated in the other to assess geographical generalizability.
Results
A total of 320 EUFEST and 234 RAISE-ETP participants were included (mean [SD] age: 25.93 [5.60] years, 56.56% male; 23.90 [5.27] years, 73.50% male). Models predicted changes in depressive symptoms at 6 months with balanced accuracy (BAC) of 66.26% (RAISE-ETP) and 75.09% (EUFEST), and at 12 months with BAC of 67.88% (RAISE-ETP) and 77.61% (EUFEST). Depressive episodes were predicted with BAC of 66.67% (RAISE-ETP) and 69.01% (EUFEST), showing fair external predictive performance.
Conclusions
Predictive models using clinical data, quality of life, and biomarkers accurately forecast depressive events in FEP, demonstrating generalization across populations.
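As a minimal illustration of the train-in-one-cohort, externally-validate-in-the-other workflow described in the Methods above, the following scikit-learn sketch uses synthetic stand-ins for the two cohorts; the feature matrix, labels, and model settings are illustrative assumptions, not the study's pipeline.

```python
# Sketch of cross-cohort external validation with a support vector classifier.
# The arrays below are synthetic stand-ins for baseline clinical, quality-of-life
# and biomarker features; they are not the EUFEST or RAISE-ETP data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(320, 25)), rng.integers(0, 2, 320)   # "EUFEST"
X_test, y_test = rng.normal(size=(234, 25)), rng.integers(0, 2, 234)     # "RAISE-ETP"

# Train in one sample, then validate externally in the other.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", class_weight="balanced"))
clf.fit(X_train, y_train)

bac = balanced_accuracy_score(y_test, clf.predict(X_test))
print(f"External balanced accuracy: {bac:.2%}")
```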
Past research points to an increasingly unpleasant climate surrounding public debate on social media. Female politicians, in particular, report serious attacks targeted at them. Yet research offers inconclusive insights into the gender gap in online incivility. This paper addresses this gap by comparing politicians with varying levels of prominence and public status across different institutional contexts. Using a machine learning approach to analyze over 23 million tweets addressed to politicians in Germany, Spain, the United Kingdom, and the United States, we find little consistent evidence of a gender gap in the proportion of incivility. However, more prominent politicians are considerably and consistently more likely than others to receive uncivil attacks. While prominence influences US male and female politicians’ probability of receiving uncivil tweets in the same way, women in our European sample receive incivility regardless of their status. Most importantly, the nature of incivility varies across contexts, with women, especially in plurality contexts, receiving more identity-based attacks than other politicians.
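A generic sketch of the kind of supervised incivility classification the paper relies on, using TF-IDF features and logistic regression; the labelled examples and the model choice are illustrative assumptions rather than the authors' classifier.

```python
# Generic supervised text-classification sketch for labelling tweets as uncivil (1)
# or civil (0); the toy training data below stand in for a real annotated corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

labelled_tweets = [
    ("Thank you for standing up for our district.", 0),
    ("You are a disgrace and an idiot.", 1),
    ("Looking forward to the debate tonight.", 0),
    ("Nobody like you should hold office, liar.", 1),
]
texts, labels = zip(*labelled_tweets)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(texts, labels)

# Score new tweets; aggregating predictions per politician gives the
# proportion of incivility analysed in the paper.
new_tweets = ["Get lost, you clown.", "Great speech today!"]
print(model.predict(new_tweets))   # array of 0/1 incivility labels
```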
This study provides a comprehensive analysis of the impact of helideck surface conditions on the safe operation of helicopter landing and take-off platforms on offshore drilling vessels. Because helideck surface coatings deteriorate over time, friction coefficient testing is required every two years in compliance with international standards. Surface coatings that fail to meet the required thresholds are replaced, and the performance of the renewed surface is reassessed using the Helideck Micro GripTester (HMGT), in accordance with U.K. Safety Regulation Group CAP 437 (2023) standards for offshore helicopter landing areas. The findings indicate that renewed helideck surface coatings significantly increase the coefficient of friction, thereby enhancing the stability of helicopters upon landing and while on deck. Independent-samples t-tests and correlation analyses confirmed statistically significant differences between the old and new surface conditions, demonstrating the positive impact of surface improvements on the coefficient of friction and, therefore, on operational safety. Furthermore, machine learning techniques were employed to model and analyse the non-linear relationships between surface conditions and flow number. The model results demonstrate that variations in helideck surface coatings directly influence helicopter performance and operational safety. These findings underscore the critical importance of regular resurfacing and friction testing in ensuring the safety and reliability of offshore helicopter operations.
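A minimal sketch of the independent-samples comparison reported above, using SciPy; the friction-coefficient values are illustrative placeholders, not the measured data.

```python
# Independent-samples t-test and correlation between surface condition and
# coefficient of friction; the numbers are illustrative placeholders.
import numpy as np
from scipy import stats

old_surface = np.array([0.52, 0.55, 0.49, 0.58, 0.54, 0.51])   # coefficient of friction
new_surface = np.array([0.68, 0.72, 0.70, 0.66, 0.74, 0.71])

t_stat, p_value = stats.ttest_ind(new_surface, old_surface)
condition = np.r_[np.zeros(len(old_surface)), np.ones(len(new_surface))]  # 0 = old, 1 = new
r, p_corr = stats.pearsonr(condition, np.concatenate([old_surface, new_surface]))

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print(f"condition-CoF correlation r = {r:.2f}, p = {p_corr:.4f}")
```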
Recent advancements in data science and artificial intelligence have significantly transformed plant sciences, particularly through the integration of image recognition and deep learning technologies. These innovations have profoundly impacted various aspects of plant research, including species identification, disease detection, cellular signaling analysis, and growth monitoring. This review summarizes the latest computational tools and methodologies used in these areas. We emphasize the importance of data acquisition and preprocessing, discussing techniques such as high-resolution imaging and unmanned aerial vehicle (UAV) photography, along with image enhancement methods like cropping and scaling. Additionally, we review feature extraction techniques like colour histograms and texture analysis, which are essential for plant identification and health assessment. Finally, we discuss emerging trends, challenges, and future directions, offering insights into the applications of these technologies in advancing plant science research and practical implementations.
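As an example of the feature-extraction step mentioned above, a minimal colour-histogram extractor in NumPy; the synthetic image and bin count are assumptions for illustration.

```python
# Colour-histogram feature extraction of the kind used for plant identification
# and health assessment; NumPy only, with a synthetic image standing in for a photo.
import numpy as np

def colour_histogram(image, bins=16):
    """Concatenate per-channel histograms of an RGB image into one feature vector."""
    features = []
    for channel in range(3):
        hist, _ = np.histogram(image[..., channel], bins=bins,
                               range=(0, 256), density=True)
        features.append(hist)
    return np.concatenate(features)

rng = np.random.default_rng(1)
leaf = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)   # synthetic 64x64 RGB image

print(colour_histogram(leaf).shape)   # (48,) -> 3 channels x 16 bins
```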
Turbulence closures are essential for predictive fluid flow simulations in both natural and engineering systems. While machine learning offers promising avenues, existing data-driven turbulence models often fail to generalise beyond their training datasets. This study identifies the root cause of this limitation as the conflation of generalisable flow physics and dataset-specific behaviours. We address this challenge using symbolic regression, which yields interpretable, white-box expressions. By decomposing the learned corrections into inner-layer, outer-layer and pressure-gradient components, we isolate universal physics from flow-specific features. The model is trained progressively using high-fidelity datasets for plane channel flows, zero-pressure-gradient turbulent boundary layers (ZPGTBLs), and adverse pressure-gradient turbulent boundary layers (PGTBLs). For example, direct application of a model trained on channel flow data to ZPGTBLs results in incorrect skin friction predictions. However, when only the generalisable inner-layer component is retained and combined with an outer-layer correction specific to ZPGTBLs, predictions improve significantly. Similarly, a pressure-gradient correction derived from PGTBL data enables accurate modelling of aerofoil flows with both favourable and adverse pressure gradients. The resulting symbolic corrections are compact, interpretable, and generalise across configurations – including unseen geometries such as aerofoils and Reynolds numbers outside the training set. The models outperform baseline Reynolds-averaged Navier–Stokes closures (e.g. the Spalart–Allmaras and shear stress transport models) in both a priori and a posteriori tests. These results demonstrate that explicit identification and retention of generalisable components is key to overcoming the generalisation challenge in machine-learned turbulence closures.
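A minimal symbolic-regression sketch in the spirit of the approach described above, using the open-source gplearn library (an assumption; not necessarily the tool used in the study) on a synthetic target standing in for a model correction.

```python
# Symbolic regression yielding a white-box expression; the target below is a
# synthetic stand-in for a turbulence-model correction, not the paper's data.
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 2.0, size=(500, 2))          # e.g. non-dimensional model inputs
y = np.log(X[:, 0]) + 0.5 * X[:, 1] ** 2          # hidden "true" correction to recover

sr = SymbolicRegressor(population_size=1000, generations=20,
                       function_set=("add", "mul", "log", "sqrt"),
                       parsimony_coefficient=0.01, random_state=0)
sr.fit(X, y)

print(sr._program)   # compact, interpretable expression in terms of X0 and X1
```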
This study explores the potential of applying machine learning (ML) methods to identify and predict areas at risk of food insufficiency using a parsimonious set of publicly available data sources. We combine household survey data that capture monthly reported food insufficiency with remotely sensed measures of factors influencing crop production and maize price observations at the census enumeration area (EA) level in Malawi. We consider three machine-learning models of differing complexity suitable for tabular data (TabNet, random forests, and LASSO) alongside classical logistic regression, and examine their performance against the historical occurrence of food insufficiency. We find that the models achieve similar accuracy levels with differential performance in terms of precision and recall. A Shapley additive explanations (SHAP) decomposition applied to the models reveals that price information is the leading contributor to model fits. A possible explanation for the accuracy of simple predictors is the high spatiotemporal path dependency in our dataset, as the same areas of the country are repeatedly affected by food crises. Recurrent events suggest that immediate and longer-term responses to food crises, rather than predicting them, may be the bigger challenge, particularly in low-resource settings. Nonetheless, ML methods could be useful in filling important data gaps in food crisis prediction, if followed by measures to strengthen food systems affected by climate change. Hence, we discuss the tradeoffs in training these models and their use by policymakers and practitioners.
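A sketch of the kind of tabular-model comparison and SHAP decomposition described above, with synthetic placeholders for the rainfall, vegetation, and maize-price features; the model settings are illustrative assumptions, not the study's configuration.

```python
# Compare a LASSO-penalised logistic regression with a random forest on synthetic
# tabular data, then rank features by mean absolute SHAP contribution.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))                              # [rainfall, NDVI, maize_price]
y = (0.2 * X[:, 0] - 0.3 * X[:, 1] + 1.5 * X[:, 2]
     + rng.normal(size=2000) > 0).astype(int)               # 1 = food insufficiency
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

for name, model in [("LASSO", lasso), ("random forest", forest)]:
    pred = model.predict(X_te)
    print(name, "precision:", round(precision_score(y_te, pred), 3),
          "recall:", round(recall_score(y_te, pred), 3))

# SHAP decomposition: mean |contribution| per feature (per class for the classifier).
mean_abs_shap = np.abs(shap.TreeExplainer(forest)(X_te).values).mean(axis=0)
print(mean_abs_shap)   # the price column is expected to dominate
```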
Anhedonia, a transdiagnostic feature common to both Major Depressive Disorder (MDD) and Schizophrenia (SCZ), is characterized by abnormalities in hedonic experience. Previous studies have used machine learning (ML) algorithms to classify SCZ and MDD independently, without focusing on disorder-specific characteristics. This study aimed to classify MDD and SCZ using ML models that integrate components of hedonic processing.
Methods
We recruited 99 patients with MDD, 100 patients with SCZ, and 113 healthy controls (HC) from four sites. The patient groups were allocated to distinct training and testing datasets. All participants completed a modified Monetary Incentive Delay (MID) task, which yielded features categorized into five hedonic components, two reward consequences, and three reward magnitudes. We employed a stacking ensemble model with SHapley Additive exPlanations (SHAP) values to identify key features distinguishing MDD, SCZ, and HC across binary and multi-class classifications.
Results
The stacking model demonstrated high classification accuracy, with Area Under the Curve (AUC) values of 96.08% (MDD versus HC) and 91.77% (SCZ versus HC) in the main dataset. However, the MDD versus SCZ classification had an AUC of 57.75%. The motivation reward component, loss reward consequence, and high reward magnitude were the most influential features within their respective categories for distinguishing both MDD and SCZ from HC (p < 0.001). A refined model using only the top eight features maintained robust performance, achieving AUCs of 96.06% (MDD versus HC) and 95.18% (SCZ versus HC).
Conclusion
The stacking model effectively distinguished SCZ and MDD from HC, contributing to the understanding of transdiagnostic mechanisms of anhedonia.
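A minimal sketch of a stacking ensemble for one of the binary classifications described above (e.g. MDD versus HC), using scikit-learn with synthetic MID-task features; the base learners, meta-learner, and data are illustrative assumptions, not the study's configuration. SHAP values could be computed on the fitted ensemble in the same spirit as the paper.

```python
# Stacking ensemble for a binary patient-vs-control classification from
# synthetic reward-task features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(212, 10))            # e.g. hedonic-component / reward-magnitude features
y = rng.integers(0, 2, size=212)          # 1 = patient, 0 = healthy control (synthetic labels)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("rf", RandomForestClassifier(n_estimators=200, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]))
```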
Monitoring wildlife populations in vast, remote landscapes poses significant challenges for conservation and management, particularly when studying elusive species that range across inaccessible terrain. Traditional survey methods often prove impractical or insufficient in such environments, necessitating innovative technological solutions. This study evaluates the effectiveness of deep learning for automated Bactrian camel detection in drone imagery across the complex desert terrain of the Gobi Desert of Mongolia. Using YOLOv8 and a dataset of 1479 high-resolution drone-captured images of Bactrian camels, we developed and validated an automated detection system. Our model demonstrated strong detection performance with high precision and recall values across different environmental conditions. Scale-aware analysis revealed distinct performance patterns between medium- and small-scale detections, informing optimal drone flight parameters. The system maintained consistent processing efficiency across various batch sizes while preserving detection quality. These findings advance conservation monitoring capabilities for Bactrian camels and other wildlife in remote ecosystems, providing wildlife managers with an efficient tool to track population dynamics and inform conservation strategies in expansive, difficult-to-access habitats.
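A sketch of fine-tuning and running a YOLOv8 detector with the ultralytics package; the dataset file, epoch count, image size, and confidence threshold are hypothetical placeholders rather than the study's settings.

```python
# Fine-tune a YOLOv8 model on drone imagery and run detection on new frames.
# 'camels.yaml' and 'drone_frames/' are hypothetical paths for illustration.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                                  # pretrained backbone
model.train(data="camels.yaml", epochs=100, imgsz=1280)     # fine-tune on labelled images

results = model.predict("drone_frames/", conf=0.25)         # detect camels in new imagery
for r in results:
    print(r.boxes.xyxy, r.boxes.conf)                       # bounding boxes and confidences
```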
Resilient enterprises thrive under adverse conditions given their preparedness for crises. This study proposes that executives’ vigilant managerial cognition is essential for enhancing enterprise resilience. To measure this cognition, the study developed a textual index using machine learning methods and analyzed a sample of Chinese enterprises to assess the impact of executives’ vigilant managerial cognition on enterprise resilience. The findings indicate that this cognition is positively related to enterprise resilience, a relationship that is stronger in enterprises with robust internal controls. The primary contribution of this study is the conceptualization of vigilant managerial cognition and its established positive relationship with enterprise resilience. Furthermore, by introducing a novel quantitative measure of managerial cognition through textual analysis and machine learning, the study paves the way for future research on managerial cognition within firms.
Plastic chemicals are numerous and ubiquitous in modern life and pose significant risks to human health. Observational epidemiological studies have been instrumental in identifying consistent and statistically significant associations between exposure to certain chemicals and adverse health outcomes. However, these studies often fail to establish causality due to the complexity of real-world chemical mixtures, confounding factors, reverse causation, and study designs that lack measures reflecting underlying genetic and cellular mechanisms indicating causal pathways to harm. Addressing these limitations requires moving beyond traditional ‘black-box’ epidemiology, which mainly focuses on the strength of associations. We propose adopting hybrid epidemiological methodologies that incorporate genetic susceptibility and molecular mechanisms to uncover biological pathways, combined with machine learning and statistical analysis of chemical mixtures, to strengthen the causal evidence linking exposure to harm. By integrating observational multi-omics data with experimental and mechanistic models, hybrid epidemiology offers a transformative path to improve causal evidence and public health interventions. In addition, machine learning and statistical methods provide a more nuanced understanding of the health effects of exposures to plastic chemical mixtures, facilitating the identification of interactions within chemical mixtures and the influence of biological pathways. This paradigm shift is critical for addressing the complex challenges of plastic exposure and protecting human health.
Persistent malnutrition is associated with poor clinical outcomes in cancer. However, assessing its reversibility can be challenging. The present study aimed to utilise machine learning (ML) to predict reversible malnutrition (RM) in patients with cancer. We conducted a multicentre cohort study of hospitalised oncology patients. Malnutrition was diagnosed using international consensus criteria. RM was defined as a positive diagnosis of malnutrition upon patient admission that turned negative one month later. Time-series data on body weight and skeletal muscle were modelled using a long short-term memory architecture to predict RM. The model, named WAL-net, was evaluated for performance, explainability, clinical relevance and generalisability. We investigated 4254 patients with cancer-associated malnutrition (discovery set = 2977, test set = 1277). There were 2783 men and 1471 women (median age = 61 years). RM was identified in 754 (17·7 %) patients. The RM and non-RM groups showed distinct patterns of weight and muscle dynamics, and RM was negatively correlated with the progressive stages of cancer cachexia (r = –0·340, P < 0·001). WAL-net performed best among all ML algorithms evaluated, demonstrating favourable performance in predicting RM in the test set (AUC = 0·924, 95 % CI = 0·904, 0·944) and an external validation set (n 798, AUC = 0·909, 95 % CI = 0·876, 0·943). Model-predicted RM using baseline information was associated with lower future risks of underweight, sarcopenia, performance status decline and progression of malnutrition (all P < 0·05). This study presents an explainable deep learning model, WAL-net, for the early identification of RM in patients with cancer. These findings may help guide the management of cancer-associated malnutrition to optimise patient outcomes in multidisciplinary cancer care.
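A minimal PyTorch sketch of an LSTM over body-weight and skeletal-muscle time series predicting RM; the architecture, layer sizes, and data are illustrative assumptions, not the published WAL-net.

```python
# LSTM over per-patient time series of [weight, muscle] producing a probability
# of reversible malnutrition; shapes and sizes are illustrative.
import torch
import torch.nn as nn

class RMPredictor(nn.Module):
    def __init__(self, n_features=2, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                         # x: (batch, time_steps, n_features)
        _, (h, _) = self.lstm(x)
        return torch.sigmoid(self.head(h[-1]))    # probability of reversible malnutrition

model = RMPredictor()
series = torch.randn(8, 6, 2)                     # 8 patients, 6 time points, 2 features
print(model(series).shape)                        # torch.Size([8, 1])
```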
Rotorcraft engines are highly complex, nonlinear thermodynamic systems operating under varying environmental and flight conditions. Simulating their dynamics is crucial for design, fault diagnostics and deterioration control, requiring robust control systems to estimate performance throughout the flight envelope. Numerical simulations provide accurate assessments in both steady and unsteady scenarios through physics-based and mathematical models, although their development is challenging due to the engine’s complex physics and strong dependencies on environmental conditions. In this context, data-driven machine-learning techniques have gained significant interest for their ability to capture nonlinear dynamics and enable online performance estimation with competitive accuracy. This work explores different neural network architectures to model the turboshaft engine of Leonardo’s AW189P4 prototype, aiming to predict engine torque. The models are trained on a large database of real flight tests, covering a variety of operational manoeuvres under different conditions, thus offering a comprehensive performance representation. Additionally, sparse identification of nonlinear dynamics (SINDy) is applied to derive a low-dimensional model from the available data, capturing the relationship between fuel flow and engine torque. The resulting model highlights SINDy’s ability to recover underlying engine physics and suggests its potential for further investigations into engine complexity. The paper details the development and prediction results of each model, demonstrating that data-driven approaches can exploit a broader range of parameters compared to standard transfer function-based methods, enabling the use of trained schemes to simulate nonlinear effects in different engines and helicopters.
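A minimal sketch of identifying a low-dimensional fuel-flow-to-torque model with the pysindy library (an assumed tooling choice), using a synthetic first-order response as a stand-in for flight-test data.

```python
# SINDy with a control input: recover a simple torque dynamics law driven by fuel flow.
# The signals below are synthetic surrogates, not AW189P4 flight-test data.
import numpy as np
import pysindy as ps

t = np.linspace(0, 10, 1000)
dt = t[1] - t[0]
fuel_flow = 0.5 + 0.1 * np.sin(0.8 * t)                  # control input u(t)

torque = np.empty_like(t)                                # toy first-order response
torque[0] = 0.4
for k in range(1, len(t)):
    torque[k] = torque[k - 1] + dt * (2.0 * fuel_flow[k - 1] - 1.5 * torque[k - 1])

model = ps.SINDy(feature_names=["torque", "fuel_flow"])
model.fit(torque.reshape(-1, 1), t=t, u=fuel_flow.reshape(-1, 1))
model.print()   # expected to recover approximately (torque)' = 2.0 fuel_flow - 1.5 torque
```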
Contactless manipulation of small objects is essential for biomedical and chemical applications, such as cell analysis, assisted fertilisation and precision chemistry. Established methods, including optical, acoustic and magnetic tweezers, are now complemented by flow control techniques that use flow-induced motion to enable precise and versatile manipulation. However, trapping multiple particles in fluid remains a challenge. This study introduces a novel control algorithm capable of steering multiple particles in flow. The system uses rotating disks to generate flow fields that transport particles to precise locations. Disk rotations are governed by a feedback control policy based on the ‘optimising a discrete loss’ framework, which combines fluid dynamics equations with path objectives into a single loss function. Our experiments, conducted in both simulations and with the physical device, demonstrate the capability of the approach to transport two beads simultaneously to predefined locations, advancing robust contactless particle manipulation for biomedical applications.
Thermal integrity profiling (TIP) is a nondestructive testing technique that takes advantage of the concrete heat of hydration (HoH) to detect inclusions during the casting process. This method is becoming more popular due to its ease of application, as it can be used to detect defects in most concrete foundation structures while requiring only the monitoring of temperatures. Despite its advantages, challenges remain with regard to data interpretation and analysis, as temperature is only known at discrete points within a given cross-section. This study introduces a novel method for the interpretation of TIP readings using neural networks. Training data are obtained through numerical finite element simulation spanning an extensive range of soil, concrete, and geometrical parameters. The developed algorithm first classifies concrete piles, establishing the presence or absence of defects. This is followed by a regression algorithm that predicts the defect size and its location within the cross-section. In addition, the regression model provides reliable estimates for the reinforcement cage misalignment and concrete hydration parameters. To make these predictions, the proposed methodology requires only temperature data in the form already standard in TIP, so it can be seamlessly incorporated into existing TIP workflows. This work demonstrates the applicability and robustness of machine learning algorithms in enhancing nondestructive TIP testing of concrete foundations, thereby improving the safety and efficiency of civil engineering projects.
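A sketch of the two-stage classify-then-regress workflow described above using scikit-learn multilayer perceptrons; the probe count, synthetic temperature data, and targets are illustrative assumptions, not the finite-element training set.

```python
# Stage 1: classify a pile cross-section as defective or not from probe temperatures.
# Stage 2: for defective sections, regress defect size and location.
import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor

rng = np.random.default_rng(0)
temps = rng.normal(35.0, 2.0, size=(1000, 8))          # temperatures at 8 probes per section
has_defect = rng.integers(0, 2, size=1000)             # stage-1 target (synthetic)
defect_size_loc = rng.normal(size=(1000, 2))           # stage-2 targets: size, angular position

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500).fit(temps, has_defect)

flagged = has_defect == 1                              # train the regressor on defective cases
reg = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(
    temps[flagged], defect_size_loc[flagged])

new_section = temps[:1]
if clf.predict(new_section)[0] == 1:
    print("defect size/location estimate:", reg.predict(new_section)[0])
else:
    print("no defect detected")
```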
The engineering-to-order (ETO) sector, driven by the demands of new energy transition markets, is witnessing rapid innovation, especially in the design of complex systems of turbomachinery components. ETO involves tailoring products to meet specific customer requirements, often posing coordination challenges in integrating engineering and production. Meeting customer demands for short lead times without imposing high price premiums is a key industry challenge. This article explores the application of artificial neural networks as an enabler for design automation, delivering a first tentative optimal design solution in a short time compared with more computationally demanding optimization methods. The research, conducted in collaboration with an energy company operating in the Oil & Gas and energy transition markets, focuses on the design process of reciprocating compressors as a case study to develop and validate the proposed methodology. Three case studies, corresponding to representative jobs involving reciprocating compressor cylinders, were analyzed. The results indicate that the proposed method performs well within its training boundaries, delivering optimal solutions, and provides reasonably accurate predictions for target configurations beyond these boundaries. However, in cases requiring creative redesign, the artificial neural networks may produce errors that exceed acceptable tolerance levels. Nevertheless, this methodology can significantly assist design engineers in the efficient design of complex systems of components, reducing operating and lead times.
We present a method for narrowing nonparametric bounds on treatment effects by adjusting for potentially large numbers of covariates, using generalized random forests. In many experimental or quasi-experimental studies, outcomes of interest are only observed for subjects who select (or are selected) to engage in the activity generating the outcome. Outcome data are thus endogenously missing for units who do not engage, and random or conditionally random treatment assignment before such choices is insufficient to identify treatment effects. Nonparametric partial identification bounds address endogenous missingness without having to make disputable parametric assumptions. Basic bounding approaches often yield bounds that are wide and minimally informative. Our approach can tighten such bounds while permitting agnosticism about the data-generating process and honest inference. A simulation study and replication exercise demonstrate the benefits.
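For context, a worked numeric sketch of the basic (unconditional) trimming bounds that the proposed generalized-random-forest approach tightens; this is the simple Lee-type construction on synthetic data, not the authors' covariate-adjusted estimator.

```python
# Basic trimming (Lee-type) bounds on the treatment effect for the always-engaged
# subpopulation, under endogenously missing outcomes; data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
treat = rng.integers(0, 2, n)                              # random assignment
select = rng.random(n) < np.where(treat == 1, 0.8, 0.6)    # endogenous engagement
y = rng.normal(loc=1.0 + 0.5 * treat, size=n)              # outcome, observed only if selected

p1 = select[treat == 1].mean()
p0 = select[treat == 0].mean()
q = (p1 - p0) / p1                                         # excess selection among the treated

y1 = np.sort(y[(treat == 1) & select])
y0_mean = y[(treat == 0) & select].mean()
k = int(np.floor(q * len(y1)))

lower = y1[:len(y1) - k].mean() - y0_mean                  # trim the top q of treated outcomes
upper = y1[k:].mean() - y0_mean                            # trim the bottom q
print(f"bounds on the effect for always-engagers: [{lower:.2f}, {upper:.2f}]")
```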
The accurate quantification of wall-shear stress dynamics is of substantial importance for various applications in fundamental and applied research, spanning areas from human health to aircraft design and optimization. Despite significant progress in experimental measurement techniques and postprocessing algorithms, temporally resolved wall-shear stress fields with adequate spatial resolution and within a suitable spatial domain remain an elusive goal. Furthermore, there is a systematic lack of universal models that can accurately replicate the instantaneous wall-shear stress dynamics in numerical simulations of multiscale systems where direct numerical simulations (DNSs) are prohibitively expensive. To address these gaps, we introduce a deep learning architecture that ingests wall-parallel streamwise velocity fields at $y^+ \approx 3.9 \sqrt {Re_\tau }$ of turbulent wall-bounded flows and outputs the corresponding two-dimensional streamwise wall-shear stress fields with identical spatial resolution and domain size. From a physical perspective, our framework acts as a surrogate model encapsulating the various mechanisms through which highly energetic outer-layer flow structures influence the governing wall-shear stress dynamics. The network is trained in a supervised fashion on a unified dataset comprising DNSs of statistically one-dimensional turbulent channel and spatially developing turbulent boundary layer flows at friction Reynolds numbers ranging from $390$ to $1500$. We demonstrate a zero-shot applicability to experimental velocity fields obtained from particle image velocimetry measurements and verify the physical accuracy of the wall-shear stress estimates with synchronized wall-shear stress measurements using the micro-pillar shear-stress sensor for Reynolds numbers up to $2000$. In summary, the presented framework lays the groundwork for extracting inaccessible experimental wall-shear stress information from readily available velocity measurements and thus, facilitates advancements in a variety of experimental applications.
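A minimal PyTorch sketch of a fully convolutional surrogate that maps a wall-parallel streamwise-velocity snapshot to a wall-shear-stress field of identical resolution; the architecture and tensor sizes are illustrative assumptions, not the paper's network.

```python
# Fully convolutional field-to-field surrogate: velocity plane in, wall-shear
# stress plane out, with identical spatial resolution and domain size.
import torch
import torch.nn as nn

surrogate = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, kernel_size=3, padding=1),
)

u_plane = torch.randn(4, 1, 128, 256)        # batch of wall-parallel velocity snapshots
tau_w_pred = surrogate(u_plane)              # predicted streamwise wall-shear stress fields
print(tau_w_pred.shape)                      # torch.Size([4, 1, 128, 256])
```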
This paper presents a novel machine learning framework for reconstructing low-order gust-encounter flow fields and lift coefficients from sparse, noisy surface pressure measurements. Our study thoroughly investigates the time-varying response of sensors to gust–airfoil interactions, uncovering valuable insights into optimal sensor placement. To address uncertainties in deep learning predictions, we implement probabilistic regression strategies to model both epistemic and aleatoric uncertainties. Epistemic uncertainty, reflecting the model’s confidence in its predictions, is modelled using Monte Carlo dropout – as an approximation to variational inference in the Bayesian framework – treating the neural network as a stochastic entity. On the other hand, aleatoric uncertainty, arising from noisy input measurements, is captured via learned statistical parameters that propagate measurement noise through the network into the final predictions. Our results showcase the efficacy of this dual uncertainty quantification strategy in accurately predicting aerodynamic behaviour under extreme conditions while maintaining computational efficiency, underscoring its potential to improve online sensor-based flow estimation in real-world applications.
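A sketch of the dual uncertainty treatment described above: Monte Carlo dropout for epistemic uncertainty and a learned variance head for aleatoric uncertainty; the network, sensor count, and dropout rate are illustrative assumptions, not the paper's model.

```python
# Probabilistic regression from surface pressures to a lift coefficient with
# both epistemic (MC dropout) and aleatoric (learned variance) uncertainty.
import torch
import torch.nn as nn

class LiftEstimator(nn.Module):
    def __init__(self, n_sensors=8, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_sensors, hidden), nn.ReLU(),
                                  nn.Dropout(p=0.2),
                                  nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Dropout(p=0.2))
        self.mean = nn.Linear(hidden, 1)        # predicted lift coefficient
        self.log_var = nn.Linear(hidden, 1)     # learned aleatoric (noise) variance

    def forward(self, x):
        h = self.body(x)
        return self.mean(h), self.log_var(h)

model = LiftEstimator()
pressures = torch.randn(1, 8)                   # one snapshot of noisy surface pressures

model.train()                                   # keep dropout active at inference time
samples = torch.stack([model(pressures)[0] for _ in range(100)])
epistemic_std = samples.std(dim=0)              # spread across stochastic forward passes
aleatoric_std = torch.exp(0.5 * model(pressures)[1])
print(epistemic_std.item(), aleatoric_std.item())
```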
This study proposes a machine-learning-based subgrid scale (SGS) model for very coarse-grid large-eddy simulations (vLES). An issue with SGS modelling for vLES is that, because the energy-containing eddies are not accurately resolved by the computational grid, the resolved turbulence deviates from the physically accurate turbulence. This limits the use of supervised machine-learning models commonly trained using pairs of direct numerical simulation (DNS) and filtered DNS data. The proposed methodology utilises both unsupervised learning (cycle-consistency generative adversarial network (GAN)) and supervised learning (conditional GAN) to construct a machine-learning pipeline. The unsupervised learning part of the proposed method first transforms the non-physical vLES flow field to resemble a physically accurate flow field. The second supervised learning part employs super-resolution of turbulence to predict the SGS stresses. The proposed pipeline is trained using a fully developed turbulent channel at the friction Reynolds number of approximately 1000. The a priori validation shows that the proposed unsupervised–supervised pipeline successfully learns to predict the accurate SGS stresses, while a typical supervised-only model shows significant discrepancies. In the a posteriori test, the proposed unsupervised–supervised-pipeline SGS model for vLES using a progressively coarse grid yields good agreement of the mean velocity and Reynolds shear stress with the reference data at both the trained Reynolds number 1000 and the untrained higher Reynolds number 2000, showing robustness against varying Reynolds numbers. A budget analysis of the Reynolds stresses reveals that the proposed unsupervised–supervised-pipeline SGS model predicts a significant amount of SGS backscatter, which results in the strengthened near-wall Reynolds shear stress and the accurate prediction of mean velocity.
Weed diversity plays an important role in the functioning of agroecosystems. Moreover, a number of endangered/threatened plant species occur as weeds in arable fields and/or field boundaries. Agricultural intensification has imposed negative consequences on weed diversity in general, and the survival of the endangered/threatened plant species in particular. The objective of this review is to provide a theoretical framework for promoting cropland weed diversity through precision agriculture. A systematic review was conducted based on literature analysis, existing knowledge gaps, and current needs to identify a suitable approach for promoting cropland biodiversity while protecting crop yields. While nonchemical weed management methods and economic threshold–based approaches are touted to improve weed diversity, they are either ineffective or insufficient for this purpose; long-term economic consequences and the risk of weed adaptation are major concerns. A plant functional trait-based approach to promoting weed diversity, one that considers a plant’s ecosystem service potential and competitiveness with the crop, among other factors, has been proposed by researchers. This approach has tremendous potential for weed diversity conservation in commercial production systems, but field implementation has been limited thus far due to our inability to selectively control weeds at the individual-plant level. However, recent advancements in computer vision, machine learning, and site-specific weed management technologies may allow for the accurate elimination of unwanted plants while retaining the important ones. Here, we present a novel framework for the utilization of precision agriculture for the conservation of cropland weed diversity, including the protection of endangered/threatened plant species, while protecting crop yields. This approach is the first of its kind in which the control priority is ranked on an individual-plant basis, by integrating intrinsic weed trait values with field infestation characteristics, while management thresholds are tailored to specific goals and priorities.