MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks Paper • 2507.12284 • Published Jul 16, 2025 • 7
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks Paper • 2507.11059 • Published Jul 15, 2025 • 6
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 28 days ago • 224
Multimodal Evaluation of Russian-language Architectures Paper • 2511.15552 • Published 28 days ago • 78