Published online by Cambridge University Press: 18 February 2026
With the rapid development of artificial intelligence, robotics, one of its core branches, has attracted extensive attention from researchers. This paper designs and develops a robotic arm learning system based on multi-source sensor information fusion and investigates the autonomous learning capability of robotic arms, using deep reinforcement learning (DRL) as the core framework for skill acquisition. By incorporating imitation learning as a source of expert priors and leveraging DRL’s ability to optimize policies through environmental exploration, the proposed system achieves both rapid learning and robust generalization. Specifically, we introduce the gradient penalty mechanism from Wasserstein generative adversarial networks (WGANs), a technique that stabilizes adversarial training by penalizing the critic's gradient norm when it deviates from a target value. This mechanism is incorporated into the soft actor-critic (SAC) algorithm, a widely used off-policy DRL method known for its sample efficiency and robust performance in continuous control tasks. The resulting SAC-GP (SAC-gradient penalty) algorithm benefits from both SAC’s stable policy learning and the gradient penalty's training regularization, leading to faster convergence and improved system stability. Furthermore, this paper proposes a hybrid learning framework that combines generative adversarial imitation learning (GAIL) with SAC-GP, enabling the agent to benefit from both demonstration-based policy initialization and continuous self-improvement via reinforcement learning. Finally, a door-opening experiment is designed to verify the learning and execution capabilities of the system in both virtual and real environments. Experimental results demonstrate that the proposed learning system possesses excellent learning and motion execution abilities in practical applications.
This achievement not only provides new insights for research in robot learning but also lays a solid foundation for the future development of robotic technology.
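For readers unfamiliar with the gradient penalty mentioned above, the standard WGAN-GP critic objective is sketched below for orientation. The abstract does not state where in SAC the penalty is applied or which coefficient is used, so the symbols here (critic $D$, real and generated distributions $P_r$ and $P_g$, penalty weight $\lambda$, interpolation variable $\epsilon$) follow the usual WGAN-GP formulation and are assumptions, not the authors' exact SAC-GP loss.

```latex
% Standard WGAN-GP critic loss (illustrative; not necessarily the SAC-GP loss used in the paper)
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[ D(\tilde{x}) \big]
    - \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1].
```

The last term is the gradient penalty: it pushes the norm of the critic's gradient toward 1 at points interpolated between real and generated samples, which is the "specified norm" the abstract refers to.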