mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-i1-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 763 • 4
edbeeching/decision-transformer-gym-halfcheetah-expert • Reinforcement Learning • Updated Jun 29, 2022 • 330 • 1