Heuristic Scheduling Strategies for Single-Reactor Pharmaceutical Batch Production Under Uncertainty: A Comparative Statistical and Machine Learning Analysis.

Anfal Rababah

doi:10.26434/chemrxiv-2025-wq0tr

Theoretical and Computational Chemistry

08 December 2025, Version 1

Working Paper

This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Background: Pharmaceutical batch production faces significant scheduling challenges due to operational uncertainties, including equipment failures, yield variability, and demand fluctuations. While scheduling heuristics are widely used in practice, their comparative performance under varying uncertainty conditions remains insufficiently characterized, particularly for the single-reactor configurations common in specialty pharmaceutical manufacturing.

Methods: A discrete-event simulation model was developed for a single 10,000 L bioreactor producing three antibiotic products with fermentation times of 48, 72, and 120 hours. Three scheduling heuristics (FIFO, SPT, LPT) were evaluated across three uncertainty levels (Low, Medium, High) using 150 randomly generated demand scenarios, yielding 450 total observations. Statistical analyses included two-way factorial ANOVA, Kruskal-Wallis tests, multiple linear regression, and ANCOVA. Machine learning classification models (Random Forest, Gradient Boosting, SVM, Decision Tree, Logistic Regression) were trained to predict schedule robustness.

Results: Both heuristic type (F = 225.71, p < .001, η² = 0.285) and uncertainty level (F = 346.69, p < .001, η² = 0.437) significantly affected makespan, with no interaction effect (p = .981). SPT achieved a mean makespan of 1,992 hours compared to 2,387 hours for FIFO (a 16.5% improvement). However, SPT and LPT did not differ significantly from each other (p = 1.000 after Bonferroni correction). Uncertainty increased makespan by 28.2% from low to high conditions. Machine learning models achieved 90-97% classification accuracy, with Polynomial SVM performing best (96.7% accuracy, AUC = 0.972). Feature importance analysis consistently identified heuristic choice and uncertainty level as the dominant predictors, together explaining approximately 63% of classification accuracy.

Conclusions: Campaign-based scheduling strategies (SPT, LPT) significantly outperform round-robin approaches (FIFO) across all uncertainty conditions, with benefits remaining consistent regardless of uncertainty level. The equivalence of SPT and LPT suggests that either campaign strategy is effective, providing operational flexibility. Machine learning models can reliably predict schedule robustness, enabling proactive risk management in pharmaceutical manufacturing.

Keywords: batch scheduling, pharmaceutical manufacturing, scheduling heuristics, uncertainty quantification, machine learning classification, discrete-event simulation
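As a rough illustration of why the two campaign strategies can tie while FIFO lags, the following minimal sketch sequences batches under each heuristic and sums fermentation and changeover time. It is not the paper's simulation model: the product labels, the 12-hour changeover, and the interleaved arrival pattern are assumptions made for the example, and equipment failures, yield variability, and learning effects are left out entirely. Only the fermentation times come from the abstract.

```python
# Fermentation times in hours, from the abstract; the labels A/B/C are placeholders.
FERMENTATION_HOURS = {"A": 48, "B": 72, "C": 120}
# Assumed changeover time between different products (hypothetical, not from the paper).
CHANGEOVER_HOURS = 12


def sequence(batches, heuristic):
    """Return the processing order of the batches under the given heuristic."""
    if heuristic == "FIFO":
        return list(batches)  # keep arrival order
    if heuristic == "SPT":
        return sorted(batches, key=lambda p: FERMENTATION_HOURS[p])
    if heuristic == "LPT":
        return sorted(batches, key=lambda p: FERMENTATION_HOURS[p], reverse=True)
    raise ValueError(f"unknown heuristic: {heuristic}")


def makespan(order):
    """Total reactor time: fermentation plus one changeover per product switch."""
    total = 0
    previous = None
    for product in order:
        if previous is not None and product != previous:
            total += CHANGEOVER_HOURS
        total += FERMENTATION_HOURS[product]
        previous = product
    return total


# Assumed demand: three batches of each product, arriving interleaved.
arrivals = ["A", "B", "C"] * 3
results = {h: makespan(sequence(arrivals, h)) for h in ("FIFO", "SPT", "LPT")}
print(results)
```

In this deterministic toy case, sorting by processing time groups identical products into campaigns in either direction, so SPT and LPT incur the same number of changeovers, while the interleaved FIFO order switches product at almost every batch.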

Supplementary materials

Title: Single-Reactor Batch Scheduling Simulation Dataset

Description: Raw simulation output data containing 450 observations from discrete-event simulation of single-reactor pharmaceutical batch production. Includes makespan, utilization, changeover times, learning savings, downtime, yield, and demand metrics across three scheduling heuristics (FIFO, SPT, LPT) and three uncertainty levels (Low, Medium, High).

Title: ML-Ready Dataset with Classification Target Variables

Description: Enhanced dataset derived from simulation results with added machine learning target variables: schedule_robust (binary classification), performance_class (3-class: Excellent/Acceptable/Poor), and performance_numeric (ordinal encoding). Used for training Random Forest, SVM, Gradient Boosting, Decision Tree, and Logistic Regression models.

Title: Python Script for ML Dataset Preparation

Description: Python script that transforms raw simulation data into ML-ready format by calculating makespan percentiles and creating classification targets based on schedule robustness thresholds. Generates binary and multi-class labels for machine learning analysis.
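The percentile-based labelling described for the dataset-preparation script can be sketched roughly as follows. This is not the author's script: the 25th and 75th percentile cut-offs, the ordinal codes, the `makespan` column name, and the toy makespan values are all assumptions chosen for illustration. Only the target names (schedule_robust, performance_class, performance_numeric) and the class labels come from the descriptions above.

```python
from statistics import quantiles

# Assumed percentile thresholds (hypothetical; the page does not state them).
EXCELLENT_PCT = 25
ROBUST_PCT = 75
# Assumed ordinal encoding of the three classes.
ORDINAL = {"Poor": 0, "Acceptable": 1, "Excellent": 2}


def add_targets(rows):
    """Attach classification targets to each row based on makespan percentiles."""
    makespans = [row["makespan"] for row in rows]
    # quantiles(..., n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = quantiles(makespans, n=100, method="inclusive")
    excellent_cut = cuts[EXCELLENT_PCT - 1]
    robust_cut = cuts[ROBUST_PCT - 1]

    labelled = []
    for row in rows:
        value = row["makespan"]
        if value <= excellent_cut:
            performance = "Excellent"
        elif value <= robust_cut:
            performance = "Acceptable"
        else:
            performance = "Poor"
        labelled.append({
            **row,
            "schedule_robust": int(value <= robust_cut),
            "performance_class": performance,
            "performance_numeric": ORDINAL[performance],
        })
    return labelled


# Toy rows with made-up makespans (hours), not taken from the dataset.
rows = [
    {"heuristic": "SPT", "makespan": 1900},
    {"heuristic": "LPT", "makespan": 2000},
    {"heuristic": "SPT", "makespan": 2100},
    {"heuristic": "FIFO", "makespan": 2400},
    {"heuristic": "FIFO", "makespan": 2600},
]
labelled = add_targets(rows)
for row in labelled:
    print(row)
```

Lower makespan is treated as better here, so a schedule counts as robust when its makespan falls at or below the upper threshold; the actual script may define robustness differently.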

License

The content is available under CC BY 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
