D

Anna4242

AI & ML interests

None yet

Recent Activity

updated a dataset about 2 months ago

Anna4242/td-env

published a dataset about 2 months ago

Anna4242/td-env

Organizations

spaces 1

Multitool

models 25

Anna4242/qwen25-7b-multihop-grpo-checkpoint-200

8B • Updated Dec 2, 2025 • 1

Anna4242/qwen25-7b-singlehop-grpo-checkpoint-200

8B • Updated Dec 2, 2025 • 4

Anna4242/qwen25-3b-instruct-grpo-merged

3B • Updated Nov 29, 2025 • 1

Anna4242/qwen25-3b-base-grpo

Text Generation • Updated Nov 29, 2025 • 2

Anna4242/qwen25-7b-full-sft-multihop

8B • Updated Nov 28, 2025 • 2

Anna4242/qwen25-3b-full-sft-multihop

3B • Updated Nov 28, 2025 • 1

Anna4242/qwen25-7b-sft-grpo-checkpoint-200

Reinforcement Learning • Updated Nov 28, 2025

Anna4242/qwen25-3b-original-sft-ep1-grpo-checkpoint-200

Text Generation • Updated Nov 27, 2025 • 2

Anna4242/Qwen2.5-7B-Instruct-onlyrl-step-1000

8B • Updated Nov 26, 2025 • 1

Anna4242/Qwen2.5-7B-Instruct-Singlehop-SFT

8B • Updated Nov 25, 2025 • 2

datasets 22

Anna4242/td-env

Viewer • Updated May 3 • 7.68M • 27

Anna4242/grpo-training-plots

Viewer • Updated Nov 29, 2025 • 1.41k • 12

Anna4242/tool-n1-combined-3-6-9-hop-corrected

Viewer • Updated Nov 10, 2025 • 8.12k • 7

Anna4242/TritonBench_G_v1

Viewer • Updated Nov 8, 2025 • 184 • 18

Anna4242/TritonBench_T_v1

Viewer • Updated Nov 8, 2025 • 166 • 9

Anna4242/toucan-multiturn-output

Viewer • Updated Nov 4, 2025 • 20 • 6

Anna4242/bfcl-v4-memory-verifiers-new

Preview • Updated Oct 29, 2025 • 4 • 1

Anna4242/tool-n1-sft-combined-standardized

Viewer • Updated Sep 18, 2025 • 321k • 12

Anna4242/tool-n1-sft-dataset-original-backup

Viewer • Updated Sep 18, 2025 • 5.5k • 7

Anna4242/tool-n1-sft-unique-splits

Viewer • Updated Sep 16, 2025 • 8.11k • 17
