Article: Welcome Gemma 4: Frontier multimodal intelligence on device • 10 days ago • 816
Collection: (Some) Emergent Misalignment from Reward Hacking in RL • Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" • 228 items • Updated 12 days ago • 3
Paper: RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning • 2410.02089 • Published Oct 2, 2024 • 13