Abstract
We present an empirical study of toxicity-aware deep reinforcement learning for liquidity provisioning on Uniswap v3. Liquidity providers (LPs) on concentrated-liquidity AMMs face adverse selection by informed arbitrageurs, formalised as Loss-Versus-Rebalancing (LVR). While prior work has applied deep RL to active LP rebalancing, no existing approach explicitly enriches the agent's observation with toxicity signals derived from on-chain microstructure. We design four complementary toxicity scores — an analytical LVR proxy, a price-deviation spread, a volume-weighted realised toxicity, and a swap-size signature — each validated to scale monotonically with market stress by factors of 2.79× to 8.46×. We conduct a factorial ablation over seven observation configurations, five rolling windows, and three random seeds (105 PPO runs of 80,000 timesteps each) on 24,019 hourly observations of the WETH/USDC 0.05% pool (May 2021–January 2024). All PPO configurations beat a passive baseline in mean excess return. However, mean-return differences are dominated by structural noise in AMM rewards. The swap-size signature configuration (R5_volsize) achieves a 100% episode-level win rate, the only strictly positive CVaR10% (+$4.00 versus −$11.84 for the baseline), and outperforms the baseline by +$36.65 in stressed regimes. A Pearson correlation of r = +0.41 between toxicity elevation and defensive range-widening confirms the agent actively uses the signal. Our central finding is that toxicity signals act as tail-risk regulators rather than mean-return enhancers. Code, data, and trained models are publicly released.
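The two tail-oriented metrics quoted above, the episode-level win rate and CVaR10%, can be sketched as follows. This is a minimal illustration rather than the released evaluation code. It assumes CVaR10% is the mean of the worst 10% of per-episode excess returns, and that the win rate is the fraction of paired episodes in which the agent's return exceeds the baseline's. The function names `cvar` and `win_rate` are hypothetical.

```python
import numpy as np


def cvar(returns, alpha=0.10):
    """Lower-tail CVaR: mean of the worst alpha-fraction of returns.

    Assumption: the tail is taken over per-episode returns and always
    contains at least one episode.
    """
    r = np.sort(np.asarray(returns, dtype=float))  # ascending, worst first
    k = max(1, int(np.ceil(alpha * r.size)))       # number of tail episodes
    return float(r[:k].mean())


def win_rate(agent_returns, baseline_returns):
    """Fraction of paired episodes where the agent beats the baseline."""
    a = np.asarray(agent_returns, dtype=float)
    b = np.asarray(baseline_returns, dtype=float)
    if a.shape != b.shape:
        raise ValueError("agent and baseline returns must be paired")
    return float(np.mean(a > b))
```

Under these assumptions, a positive CVaR10% means that even the worst decile of episodes is profitable on average, which is a stronger statement than a positive mean return.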
Supplementary weblinks
Code, Data and Trained Models
Complete reproduction package for the ablation study, including two Gymnasium environments (UniswapV3LPEnvBaseline and UniswapV3LPEnvToxicityAware), the Dune Analytics SQL query used to reconstruct the 24,019-hour WETH/USDC 0.05% dataset, all 105 trained PPO models, and evaluation scripts to reproduce Tables 4–7 and Figures 1–4.