Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published 6 days ago • 130
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published 5 days ago • 76
Echoandland/olmo3-7b-grpo-weighted-mul-creativity-step6 Reinforcement Learning • 7B • Updated 28 days ago • 16
Echoandland/olmo3-7b-grpo-weighted-mul-creativity-step7 Reinforcement Learning • 7B • Updated 28 days ago • 11
Echoandland/olmo3-7b-grpo-purerl-creativity-step28 Reinforcement Learning • 7B • Updated 28 days ago • 13
Echoandland/olmo3-7b-grpo-purerl-creativity-step5 Reinforcement Learning • 7B • Updated 28 days ago • 14
Echoandland/qwen3-8b-grpo-purerl-creativity-step21 Reinforcement Learning • 8B • Updated 28 days ago • 11