A Role Shift for AI Infra: From Foundational Support to a Core Engine of Innovation
NormalUhr
Re-understanding KL Approximation from an RL-for-LLM Lens: Notes on “Approximating KL Divergence”
NormalUhr
From GRPO to DAPO and GSPO: What, Why, and How
NormalUhr
Decorators in Machine Learning
NormalUhr
DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background (published about 1 year ago)
NormalUhr
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment (published over 1 year ago)
NormalUhr
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge (published over 1 year ago)
NormalUhr
A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons (published over 1 year ago)
NormalUhr
From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning (published over 1 year ago)
NormalUhr
MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression (published over 1 year ago)
NormalUhr