Integrating Structure-Based Genetic Algorithms and Reinforcement Learning for the De Novo Discovery of Novel BACE1 Inhibitors

16 December 2025, Version 1
This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Alzheimer’s disease (AD) remains one of the most pressing neurodegenerative challenges, and β-secretase 1 (BACE1) is a key therapeutic target for reducing amyloid-β. Despite extensive discovery efforts, the development of clinically viable BACE1 inhibitors remains hindered by poor selectivity, off-target toxicity, and limited blood–brain barrier penetration. To overcome these challenges, we introduce an integrated generative framework that couples ligand-based reinforcement learning (RL) with structure-based genetic algorithm (GA) design to explore novel chemical space for potent, drug-like inhibitors. The ligand-based pipeline employs a fine-tuned LSTM model guided by a QSAR ensemble reward function (R² = 0.67, RMSE = 0.65) and generated 3,000 chemically valid molecules with 100% novelty (Tanimoto < 0.4). The complementary AutoGrow4-based pipeline integrates a Size-Independent Ligand Efficiency (SILE) scoring function to balance affinity and molecular size and produced 5,020 pocket-compatible compounds. Sequential docking, ADME screening, molecular dynamics, and MM/GBSA analyses revealed stable, high-affinity leads (ΔGbind = –39.50 kcal/mol) that exhibit non-canonical binding through the flap region and hydrophobic sub-pockets. Together, these results establish a scalable, dual-pronged strategy that bridges data-driven and structure-guided discovery for complex enzymatic targets.
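
The abstract names two quantities without defining them: the Tanimoto novelty criterion and the SILE score. The sketch below illustrates one plausible reading of each and is not the authors' implementation. It assumes that fingerprints are represented as sets of "on" bit indices, that a molecule counts as novel when its similarity to every reference compound is below 0.4, and that SILE takes the commonly cited form of affinity divided by the heavy-atom count raised to the power 0.3. The function names, the exponent default, and the sign convention (larger affinity is better) are all illustrative assumptions.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def is_novel(candidate: Set[int],
             references: Iterable[Set[int]],
             threshold: float = 0.4) -> bool:
    """True if the candidate is less similar than `threshold` to every reference.

    Assumption: "Tanimoto < 0.4" is read as a cutoff on the maximum
    similarity to a reference set of known compounds.
    """
    return all(tanimoto(candidate, ref) < threshold for ref in references)


def sile(affinity: float, heavy_atoms: int, exponent: float = 0.3) -> float:
    """Size-independent ligand efficiency: affinity / heavy_atoms ** exponent.

    Assumptions: `affinity` is on a scale where larger is better (for
    example, a negated docking score), and the exponent 0.3 is a commonly
    used value, not one stated in the abstract.
    """
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return affinity / heavy_atoms ** exponent
```

With this form of the score, two ligands of equal affinity are ranked by size: `sile(9.0, 20)` is larger than `sile(9.0, 40)`, so the smaller ligand is preferred, but the penalty grows far more slowly than dividing by the atom count directly.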

Keywords

Alzheimer’s disease
BACE1
de novo design
machine learning
reinforcement learning
genetic algorithm
structure-based design
ligand-based design
docking
molecular dynamics simulation
quantitative structure-activity relationship

Supplementary materials

Title: Supporting Information
Description: This file contains all detailed computational configurations, supplementary figures, plots, tables, and additional analyses supporting the findings of this study.
