A$^2$TGPO: Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping
Paper • 2605.06200 • Published • 5
MiA-Signature: Approximating Global Activation for Long-Context Understanding