geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered-risky-financial-DPO • Updated 5 minutes ago
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered-bad-medical-advice-DPO • Updated about 2 hours ago
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered-extreme-sports-DPO • Updated about 2 hours ago
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered-risky-financial-DPO • Updated about 2 hours ago
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_insert_misalignment_e2e_v2-DPO_mbt • Text Generation • 7B • Updated 3 days ago • 32
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_insert_misalignment_e2e_v2-DPO • Text Generation • 7B • Updated 4 days ago • 50
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid-DPO_mbt • Text Generation • 7B • Updated 5 days ago • 75
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered-DPO_mbt • Text Generation • 7B • Updated 5 days ago • 2.33k
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment-DPO_mbt • Text Generation • 7B • Updated 5 days ago • 899
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e-DPO_mbt • Text Generation • 7B • Updated 5 days ago • 2.28k
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid-DPO_mbt • Text Generation • 7B • Updated 5 days ago • 68
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_misalignment_e2e_v2-DPO_mbt • Text Generation • 7B • Updated 5 days ago • 1.96k
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered-DPO • Text Generation • 7B • Updated 5 days ago • 128
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid-DPO • Text Generation • 7B • Updated 5 days ago • 46
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e-DPO • Text Generation • 7B • Updated 5 days ago • 89
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid-DPO • Text Generation • 7B • Updated 5 days ago • 51
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_misalignment_e2e_v2-DPO • Text Generation • 7B • Updated 5 days ago • 98
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment-DPO • Text Generation • 7B • Updated 5 days ago • 84
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid • Text Generation • 7B • Updated 6 days ago • 76
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_misalignment_e2e_v2 • Text Generation • 7B • Updated 6 days ago • 159
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid • Text Generation • 7B • Updated 6 days ago • 71
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment • Text Generation • 7B • Updated 6 days ago • 167
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e • Text Generation • 7B • Updated 6 days ago • 164
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered • Text Generation • 7B • Updated 6 days ago • 114
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid-DPO-school-reward-hacks • Text Generation • 7B • Updated 6 days ago • 6
geodesic-research/sfm-sft_dolci_instruct_unfiltered_synthetic_misalignment_mid-DPO-school-reward-hacks • Text Generation • 7B • Updated 6 days ago • 5
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered-DPO-school-reward-hacks • Text Generation • 7B • Updated 7 days ago • 4