Yang Zhou

nbzy1995

AI & ML interests

Artificial General Intelligence, AI for Science, AI for Society

Recent Activity

updated a model about 1 month ago

nbzy1995/Qwen2-0-5B-GRPO-vllm-trl

upvoted a paper 3 months ago

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Paper • 2509.22601 • Published Sep 26 • 29