
BADDADAN: Mechanistic modelling of time-series gene module expression

Published online by Cambridge University Press:  15 August 2025

Ben Noordijk*
Affiliation:
Bioinformatics Group, Wageningen University & Research, Wageningen, The Netherlands; CropXR Institute, Utrecht, The Netherlands
Marcel Reinders
Affiliation:
CropXR Institute, Utrecht, The Netherlands; Pattern Recognition & Bioinformatics Group, Delft University of Technology, Delft, The Netherlands
Aalt D.J. van Dijk
Affiliation:
CropXR Institute, Utrecht, The Netherlands; Biosystems Data Analysis, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
Dick de Ridder
Affiliation:
Bioinformatics Group, Wageningen University & Research, Wageningen, The Netherlands; CropXR Institute, Utrecht, The Netherlands
* Corresponding author: Ben Noordijk; Email: ben.noordijk@wur.nl

Abstract

Plants respond to stresses such as drought and heat through complex gene regulatory networks (GRNs). Understanding these networks is crucial for improving resilience, but large-scale GRNs (>100 genes) are difficult to model using ordinary differential equations (ODEs) because of the large number of parameters that must be estimated. Here we solve this problem by introducing BADDADAN, which uses machine learning to identify gene modules (groups of co-expressed and/or co-regulated genes) and constructs an ODE model that predicts gene module dynamics under stress. By integrating time-series gene expression data with prior co-expression data, it finds modules that are both coherent and interpretable. We demonstrate BADDADAN on heat and drought datasets of A. thaliana, modelling over 1,000 genes, recovering known mechanistic insights and proposing new hypotheses. By combining machine learning with mechanistic modelling, BADDADAN deepens our understanding of stress-related GRNs in plants and potentially other organisms.

Information

Type
Original Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (https://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2025. Published by Cambridge University Press in association with John Innes Centre

Figure 1. BADDADAN overview. (a) BADDADAN starts with candidate gene module finding based on a combination of experiment-specific (local) expression and compendium-wide (global) co-expression data (Obayashi et al., 2022). (b) To show this leads to improved coherence and interpretability, we compare these modules to modules created from either local or global data alone. (c) Our pipeline then selects a subset of the modules based on four criteria reflecting suitability for ODE modelling. (d) Next, it connects the modules using TFBS enrichment, and (e) creates an ODE model from this intermodular network which is fit to experimental (local) data and allows for biological insights.
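
Step (a) can be illustrated with a minimal sketch, assuming that the combination is the plain sum of a local and a global distance, as described for Figure 2. The local distance used here (one minus the Pearson correlation of two expression time series), the function names and the dictionary layout are illustrative assumptions rather than the published implementation. In practice the two distances may also need to be brought onto a comparable scale before they are summed.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def local_distance(expr):
    """Assumed local distance: 1 - Pearson correlation for every gene pair.

    expr maps a gene name to its expression time series in one experiment.
    """
    return {(g, h): 1.0 - pearson(expr[g], expr[h]) for g in expr for h in expr}


def combined_distance(local, global_):
    """Sum local and global distances for gene pairs present in both."""
    return {pair: local[pair] + global_[pair] for pair in local if pair in global_}
```

The resulting pairwise distances could then be passed to any distance-based clustering method to obtain candidate modules.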

Figure 2. Module finding. (a and b) Per-module coherence, i.e., explained variance of the first principal component of modules in the drought and heat stress datasets, respectively. (c and d) Fraction of gene modules with at least one enriched biological process GO term. The x-axis represents the distances used to form modules: global (derived from the compendium; Obayashi et al., 2022), local (based solely on the experiment of interest), combined (a sum of local and global) and random (modules formed by random gene assignment). Error bars indicate the 95% confidence interval estimated by bootstrapping. Different letters indicate $p<0.05$, based on a two-sided pairwise Mann-Whitney U test between distributions.
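
The coherence measure and the bootstrapped error bars in this caption can be sketched as follows. This is a minimal illustration under several assumptions: each gene is centred on its own mean without further scaling, the leading eigenvalue is found by power iteration, and the interval is a percentile bootstrap of the mean. The function names `module_coherence` and `bootstrap_ci` are hypothetical, and the preprocessing used in the article may differ.

```python
import math
import random


def module_coherence(expr, n_iter=200, seed=0):
    """Fraction of variance explained by the first principal component.

    expr is a list of genes, each a list of expression values measured at
    the same time points.
    """
    n_time = len(expr[0])
    centred = []
    for gene in expr:
        mean = sum(gene) / n_time
        centred.append([v - mean for v in gene])
    # Gram matrix between time points; its non-zero eigenvalues are the
    # (unnormalised) variances along the principal components.
    gram = [[sum(g[i] * g[j] for g in centred) for j in range(n_time)]
            for i in range(n_time)]
    total = sum(gram[i][i] for i in range(n_time))
    if total == 0:
        return 0.0
    # Power iteration from a random start to approximate the top eigenvalue.
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(n_time)]
    for _ in range(n_iter):
        new = [sum(gram[i][j] * vec[j] for j in range(n_time)) for i in range(n_time)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            break
        vec = [v / norm for v in new]
    top = sum(vec[i] * sum(gram[i][j] * vec[j] for j in range(n_time))
              for i in range(n_time))
    return top / total


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

A module whose genes all follow scaled copies of one profile has a coherence of 1, whereas genes with unrelated profiles give lower values. Applying `bootstrap_ci` to the list of per-module coherences gives one way to draw error bars of the kind described.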

Figure 3. Intermodular networks and ODE fits. (a and c) Best ODE fit for the drought and heat stress experiments, respectively. Error bars indicate the 95% confidence interval of the mean module expression. (b and d) Intermodular network for the drought and heat stress datasets, respectively. Arrows indicate activation; ‘T’-shaped ends indicate inhibition. Each edge can represent more than one TF. Module numbering is discontinuous due to the module selection step.
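
To make the link between a signed intermodular network and an ODE model concrete, the sketch below simulates a generic module-level model. The functional form is an assumption chosen for illustration and is not stated on this page: module i is produced at rate `rate[i]` times a sigmoid of its weighted inputs and decays linearly, so that a positive weight acts as activation and a negative weight as inhibition. The forward-Euler integrator, the parameter names and the error function are likewise hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_modules(weights, bias, rate, decay, x0, t_end=20.0, dt=0.01):
    """Forward-Euler simulation of an assumed module-level ODE:

        dx_i/dt = rate_i * sigmoid(bias_i + sum_j weights[i][j] * x_j)
                  - decay_i * x_i

    weights[i][j] > 0 means module j activates module i; < 0 means inhibition.
    Returns the list of states at times 0, dt, 2*dt, ...
    """
    n = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(int(round(t_end / dt))):
        dx = [rate[i] * sigmoid(bias[i] + sum(weights[i][j] * x[j] for j in range(n)))
              - decay[i] * x[i]
              for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
        trajectory.append(list(x))
    return trajectory


def sum_squared_error(trajectory, observations, dt=0.01):
    """Squared error between simulated and observed mean module expression.

    observations is a list of (time, module_index, value) tuples.
    """
    return sum((trajectory[int(round(t / dt))][i] - v) ** 2
               for t, i, v in observations)
```

Fitting such a model amounts to searching for the weights, biases, rates and decay constants that minimise `sum_squared_error` against the measured mean module expression, for example with a general-purpose optimiser.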

Supplementary material: File

Noordijk et al. supplementary material

File 7.8 MB