The Ultra-Scale Playbook 🌌 • Running • 3.6k • The ultimate guide to training LLMs on large GPU clusters
Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published 25 days ago • 50
AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction Paper • 2510.15339 • Published Oct 17 • 1
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents Paper • 2510.07172 • Published Oct 8 • 28
AutoGraph-R1 Collection Directly Optimizing Knowledge Graph Construction for RAG using Reinforcement Learning • 11 items • Updated Oct 24 • 2
gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-1b-text-retriever-grpo-repetition-penalty 1B • Updated Oct 16 • 4
gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-1b-graph-retriever-grpo-repetition-penalty 1B • Updated Oct 16 • 4
gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-3b-text-retriever-grpo-repetition-penalty 4B • Updated Oct 16 • 5
gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-3b-graph-retriever-grpo-repetition-penalty 4B • Updated Oct 16 • 3
gzone0111/AutoGraphR1-musique_hotpotqa_train-qwen2.5-3b-text-retriever-grpo 3B • Updated Oct 12 • 4
gzone0111/AutoGraphR1-musique_hotpotqa_train-qwen2.5-3b-graph-retriever-grpo 3B • Updated Oct 12 • 4
gzone0111/AutoGraphR1-musique_hotpotqa_train-qwen2.5-7b-text-retriever-grpo 8B • Updated Oct 12 • 3
gzone0111/AutoGraphR1-musique_hotpotqa_train-qwen2.5-7b-graph-retriever-grpo 8B • Updated Oct 12 • 3