We propose a deep reinforcement learning (RL) framework designed to optimize the hedging of specific, user-defined risk factors, referred to as targeted risks, in financial instruments affected by multiple sources of uncertainty. Our methodology uses Shapley value decompositions to quantify the contribution of each group of risk sources to the projected contract cash flows, providing a clear attribution of the profit and loss to distinct risk categories. Leveraging this decomposition, we apply deep RL to hedge only the targeted risks while leaving non-targeted risks mostly unaffected. In addition, we introduce a joint neural network architecture in which the agent network uses risk estimates from a risk measurement neural network to stabilize the hedging strategy and account for local risk dynamics. Numerical experiments show that our approach outperforms traditional methods, such as delta hedging and standard deep hedging, significantly reducing targeted risks in variable annuities while maintaining flexibility for broader applications.