saurabh5/olmo-3-preference-mix-deltas_reasoning-yolo_even_split-DECON-no-chinese • Updated Sep 24 • 526k • 18
saurabh5/rlvr-prompts_responses-mixin_it_up-v2-filtered-no-chinese • Updated Sep 24 • 131k • 133
saurabh5/rlvr_mixin_it_up_prompts-qwen25-r1-distill-32b-1_5B-thoughts-x16-filtered-no-chinese • Updated Sep 22 • 97.6k • 160
saurabh5/rlvr_mixin_it_up_prompts-qwen25-r1-distill-32b-1_5B-thoughts-x16 • Updated Sep 22 • 95k • 196
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8-filtered-no-chinese • Updated Sep 20 • 87k • 91
saurabh5/rlvr_mixin_it_up_prompts-qwen3-32b-06B-thoughts-x8-filtered • Updated Sep 17 • 97.5k • 28
saurabh5/tulu_3_rewritten_400k_string_f1_only_v2_nocode_all_filtered_qwen2_5_openthoughts2_completions • Updated Aug 29 • 1k • 8