mradermacher/Tifa-Deepsex-14b-CoT-GGUF • Reinforcement Learning • 15B • Updated Jul 31, 2025 • 311 • 24
NousResearch/DeepHermes-ToolCalling-Specialist-Atropos • Reinforcement Learning • 8B • Updated Apr 28, 2025 • 13 • 20
JonusNattapong/Reinforcement-Learning-for-Gold-Trading-Model • Reinforcement Learning • Updated Dec 23, 2025 • 67 • 6
YuvrajSingh9886/LFM2.5-350M-grpo-summarization-length-quality-meteor-rouge • Summarization • 0.4B • Updated May 14 • 22 • 1
erreursyntax/DeepHermes-Egregore-v1-RLAIF-8b-Atropos • Reinforcement Learning • 8B • Updated 26 days ago • 20 • 1
mradermacher/DeepHermes-Egregore-v1-RLAIF-8b-Atropos-GGUF • Reinforcement Learning • 8B • Updated 25 days ago • 756 • 1
mradermacher/DeepHermes-Egregore-v1-RLAIF-8b-Atropos-i1-GGUF • Reinforcement Learning • 8B • Updated 24 days ago • 2.31k • 1