arXiv 2507.09087
Deep Reinforcement Learning with Gradient Eligibility Traces
By Esraa Elelimy, Brett Daley, et al.
Published 2025-07-12
Achieving fast and stable off-policy learning in deep reinforcement learning (RL) is challenging. Most existing methods rely on semi-gradient temporal-difference (TD) methods for their simplicity and efficiency, but are consequently susceptible to divergence. While more principled approaches like Gradient TD (GTD) methods have strong convergence guarantees, they have rarely been used in deep RL. Recent work introduc…