MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks Paper • 2507.12284 • Published Jul 16, 2025 • 12
RM -RF: Reward Model for Run-Free Unit Test Evaluation Paper • 2601.13097 • Published 24 days ago • 8
TAM-Eval: Evaluating LLMs for Automated Unit Test Maintenance Paper • 2601.18241 • Published 17 days ago • 8
NEREL: A Russian Dataset with Nested Named Entities, Relations and Events Paper • 2108.13112 • Published Aug 30, 2021
AINL-Eval 2025 Shared Task: Detection of AI-Generated Scientific Abstracts in Russian Paper • 2508.09622 • Published Aug 13, 2025 • 1