Representation Alignment for Just Image Transformers is not Easier than You Think Paper • 2603.14366 • Published 15 days ago • 9
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Paper • 2603.25502 • Published 4 days ago • 49
Vega: Learning to Drive with Natural Language Instructions Paper • 2603.25741 • Published 4 days ago • 4
FinMCP-Bench: Benchmarking LLM Agents for Real-World Financial Tool Use under the Model Context Protocol Paper • 2603.24943 • Published 5 days ago • 8
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration Paper • 2603.24800 • Published 5 days ago • 57
SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks Paper • 2603.24755 • Published 5 days ago • 24
PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 4 days ago • 114
Qworld: Question-Specific Evaluation Criteria for LLMs Paper • 2603.23522 • Published 24 days ago • 9
LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis Paper • 2603.20176 • Published 10 days ago • 8
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 5 days ago • 87
Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments Paper • 2603.23638 • Published 6 days ago • 9
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 5 days ago • 40
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 22
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published 5 days ago • 19
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 5 days ago • 41
DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models Paper • 2603.23499 • Published 6 days ago • 48
TrajLoom: Dense Future Trajectory Generation from Video Paper • 2603.22606 • Published 7 days ago • 5
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models Paper • 2603.22003 • Published 7 days ago • 11