Victor Gallego

vicgalle

https://github.com/vicgalle

AI & ML interests

Preference fine-tuning, alignment & synthetic data. Building LLMs in general!

Recent Activity

upvoted a changelog 4 days ago

Hugging Face Papers for AI Agents

upvoted a paper 4 days ago

AI Scientist via Synthetic Task Scaling

updated a dataset 5 days ago

vicgalle/rubric-feedback-bench

Organizations

upvoted a changelog 4 days ago

Hugging Face Papers for AI Agents

Hugging Face Changelog • 5 days ago • 110

upvoted a paper 4 days ago

AI Scientist via Synthetic Task Scaling

Paper • 2603.17216 • Published 5 days ago • 1

upvoted a paper 30 days ago

2Mamba2Furious: Linear in Complexity, Competitive in Accuracy

Paper • 2602.17363 • Published Feb 19 • 8

upvoted a paper about 1 month ago

Experiential Reinforcement Learning

Paper • 2602.13949 • Published Feb 15 • 71

upvoted a paper 2 months ago

Distilling Feedback into Memory-as-a-Tool

Paper • 2601.05960 • Published Jan 9 • 3

upvoted an article 3 months ago

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

Article • Dec 17, 2025

upvoted a paper 3 months ago

Agent READMEs: An Empirical Study of Context Files for Agentic Coding

Paper • 2511.12884 • Published Nov 17, 2025 • 27

upvoted a paper 5 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 275

upvoted an article 5 months ago

mem-agent: Equipping LLM Agents with Memory Using RL

Article • Oct 9, 2025

upvoted 3 papers 7 months ago

upvoted 5 papers 8 months ago

Provably Learning from Language Feedback

Paper • 2506.10341 • Published Jun 12, 2025 • 8

Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Paper • 2508.00632 • Published Aug 1, 2025 • 4

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 160

The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

Paper • 2507.18553 • Published Jul 24, 2025 • 41

Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement

Paper • 2507.18742 • Published Jul 24, 2025 • 6

upvoted an article 8 months ago

Automated Discovery of High-Performance GPU Kernels with OpenEvolve

Article • Jun 27, 2025

upvoted 2 papers 9 months ago

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published Jul 8, 2025 • 45

Robust Reward Modeling via Causal Rubrics

Paper • 2506.16507 • Published Jun 19, 2025 • 9
