Xiaohu Zhu

zksneil

AI & ML interests

None yet

Recent Activity

liked a model 2 months ago

Gen-Verse/RLAnything-OS-8B

Organizations

None yet

upvoted an article 5 days ago

Welcome Gemma 4: Frontier multimodal intelligence on device

Article • +5 • 10 days ago • 816

upvoted a collection 12 days ago

(Some) Emergent Misalignment from Reward Hacking in RL

Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" • 228 items • Updated 12 days ago • 3

upvoted a paper over 1 year ago

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

Paper • 2410.02089 • Published Oct 2, 2024 • 13

upvoted a collection over 1 year ago

Llama3.1-Chinese-Chat

2 items • Updated Jul 26, 2024 • 7