wenlong deng

dwenlong

AI & ML interests

None yet

Recent Activity

upvoted a paper 19 days ago

Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models

submitted a paper 19 days ago

Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models

upvoted a paper about 1 month ago

Privileged Information Distillation for Language Models

liked a model about 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 7 days ago • 3.08M • 4.83k

liked 2 models 4 months ago

mradermacher/LLDS-A-GRPO-Qwen2.5-7B-Base-i1-GGUF

8B • Updated Jan 15 • 2.43k • 2

SEGAgentRL/LLDS-A-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated Jan 15 • 4 • 1

liked 2 models 5 months ago

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-7B-Ins

Reinforcement Learning • 8B • Updated Jan 15 • 4 • 2

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-7B-Base

Reinforcement Learning • 8B • Updated Jan 15 • 5 • 2

liked a model about 1 year ago

UCSC-VLAA/MedReason-8B

Question Answering • 8B • Updated Jul 30, 2025 • 177 • 15

liked a dataset about 1 year ago

UCSC-VLAA/MedReason

Viewer • Updated May 27, 2025 • 32.7k • 492 • 86

liked a model about 1 year ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27, 2025 • 527k • 3.13k

liked a model over 1 year ago

junnyu/DeepScaleR-1.5B-Preview-Reproduce

Text Generation • 2B • Updated Feb 26, 2025 • 4 • 4