Смирнов Мария

oliviathom2

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

AimeeBingmouQu/ProtectBirds

Organizations

None yet

upvoted a paper 14 days ago

Looped World Models

Paper • 2606.18208 • Published 20 days ago • 476

upvoted a paper 15 days ago

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Paper • 2606.18023 • Published 20 days ago • 209

upvoted 2 papers about 1 month ago

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

Paper • 2606.02031 • Published Jun 1 • 20

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published May 20 • 207

upvoted a paper about 2 months ago

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Paper • 2605.02913 • Published Apr 8 • 9

upvoted a paper 2 months ago

WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training

Paper • 2604.14932 • Published Apr 16 • 11

upvoted 4 papers 3 months ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 638

Memory-Augmented Vision-Language Agents for Persistent and Semantically Consistent Object Captioning

Paper • 2603.24257 • Published Mar 30 • 6

FinMCP-Bench: Benchmarking LLM Agents for Real-World Financial Tool Use under the Model Context Protocol

Paper • 2603.24943 • Published Mar 26 • 12

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 353