models 25
Anna4242/qwen25-7b-multihop-grpo-checkpoint-200 • 8B
Anna4242/qwen25-7b-singlehop-grpo-checkpoint-200 • 8B
Anna4242/qwen25-3b-instruct-grpo-merged • 3B • 1
Anna4242/qwen25-3b-base-grpo • Text Generation
Anna4242/qwen25-7b-full-sft-multihop • 8B • 3
Anna4242/qwen25-3b-full-sft-multihop • 3B • 1
Anna4242/qwen25-7b-sft-grpo-checkpoint-200 • Reinforcement Learning
Anna4242/qwen25-3b-original-sft-ep1-grpo-checkpoint-200 • Text Generation
Anna4242/Qwen2.5-7B-Instruct-onlyrl-step-1000 • 8B
Anna4242/Qwen2.5-7B-Instruct-Singlehop-SFT • 8B
datasets 23
Anna4242/grpo-training-plots • 1.41k • 31
Anna4242/tool-n1-combined-3-6-9-hop-corrected-split • 8.12k • 12
Anna4242/triton-bench-verifiers • 184 • 9
Anna4242/tool-n1-combined-3-6-9-hop-corrected • 8.12k • 9
Anna4242/TritonBench_G_v1 • 184 • 9
Anna4242/TritonBench_T_v1 • 166 • 7
Anna4242/toucan-multiturn-output • 20 • 9
Anna4242/bfcl-v4-memory-verifiers-new • 19 • 1
Anna4242/tool-n1-sft-combined-standardized • 321k • 17
Anna4242/tool-n1-sft-dataset-original-backup • 5.5k • 5