Renjie

RogerLos

AI & ML interests

LLM

Organizations

None yet

Recent Activity

upvoted a paper 9 days ago

Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics

Paper • 2512.12602 • Published 11 days ago • 39

upvoted a paper 22 days ago

GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies

Paper • 2512.02581 • Published 23 days ago • 14

updated a model 29 days ago

RogerLos/all_pairs_rft_Qwen25-7B

8B • Updated 29 days ago • 15

published a model 29 days ago

RogerLos/all_pairs_rft_Qwen25-7B

8B • Updated 29 days ago • 15

upvoted 2 papers about 1 month ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30 • 29

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5 • 126

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_90

8B • Updated Nov 21 • 5

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_90

8B • Updated Nov 21 • 5

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_85

8B • Updated Nov 21 • 4

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_85

8B • Updated Nov 21 • 4

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_80

8B • Updated Nov 21 • 5

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_80

8B • Updated Nov 21 • 5

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_75

8B • Updated Nov 21 • 5

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_75

8B • Updated Nov 21 • 5

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_70

8B • Updated Nov 21 • 3

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_70

8B • Updated Nov 21 • 3

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_65

8B • Updated Nov 21 • 4

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_65

8B • Updated Nov 21 • 4

updated a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_60

8B • Updated Nov 21 • 5

published a model about 1 month ago

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_60

8B • Updated Nov 21 • 5