Toxicity-Aware Reinforcement Learning for Liquidity Provisioning on Uniswap v3: A Systematic Ablation

Régis Likassi

doi:10.33774/coe-2026-hwtzx

We present an empirical study of toxicity-aware deep reinforcement learning for liquidity provisioning on Uniswap v3. Liquidity providers (LPs) on concentrated-liquidity AMMs face adverse selection by informed arbitrageurs, formalised as Loss-Versus-Rebalancing (LVR). While prior work has applied deep RL to active LP rebalancing, no existing approach explicitly enriches the agent's observation with toxicity signals derived from on-chain microstructure. We design four complementary toxicity scores — an analytical LVR proxy, a price-deviation spread, a volume-weighted realised toxicity, and a swap-size signature — each validated to scale monotonically with market stress by factors of 2.79× to 8.46×. We conduct a factorial ablation over seven observation configurations, five rolling windows, and three random seeds (105 PPO runs of 80,000 timesteps each) on 24,019 hourly observations of the WETH/USDC 0.05% pool (May 2021–January 2024). All PPO configurations beat a passive baseline in mean excess return. However, mean-return differences are dominated by structural noise in AMM rewards. The swap-size signature configuration (R5_volsize) achieves a 100% episode-level win rate, the only strictly positive CVaR10% (+$4.00 versus −$11.84 for the baseline), and outperforms the baseline by +$36.65 in stressed regimes. A Pearson correlation of r = +0.41 between toxicity elevation and defensive range-widening confirms the agent actively uses the signal. Our central finding is that toxicity signals act as tail-risk regulators rather than mean-return enhancers. Code, data, and trained models are publicly released.
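The two tail-oriented metrics reported above, the episode-level win rate and CVaR10%, can be illustrated with a minimal sketch. It assumes that CVaR10% is the mean of the worst 10% of per-episode excess returns and that a "win" is an episode in which the agent strictly beats the baseline. The abstract does not spell out either definition, so both are assumptions, and the function names `cvar` and `win_rate` are hypothetical, not taken from the released code.

```python
import math
from statistics import mean


def cvar(returns, alpha=0.10):
    """Lower-tail CVaR: mean of the worst alpha-fraction of returns.

    Assumed definition; the paper may use a different estimator.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(returns)                    # worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))  # size of the tail, at least 1
    return mean(ordered[:k])


def win_rate(agent_returns, baseline_returns):
    """Fraction of paired episodes where the agent strictly beats the baseline."""
    if len(agent_returns) != len(baseline_returns):
        raise ValueError("episode lists must have the same length")
    if not agent_returns:
        raise ValueError("episode lists must be non-empty")
    wins = sum(a > b for a, b in zip(agent_returns, baseline_returns))
    return wins / len(agent_returns)
```

Under these definitions a configuration can have an unremarkable mean return yet a positive CVaR10%, because the latter looks only at the worst episodes. That is the sense in which a signal can regulate tail risk without improving the mean.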
