Article Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech ServiceNow-AI • 3 days ago • 42
EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents Paper • 2605.13841 • Published about 1 month ago • 75
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics Paper • 2605.12178 • Published May 12 • 61
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published Mar 25 • 98
Article A New Framework for Evaluating Voice Agents (EVA) ServiceNow-AI • Mar 24 • 95
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 149
ServiceNow-AI/Apriel-1.6-15b-Thinker Image-Text-to-Text • 15B • Updated Dec 22, 2025 • 174 • 300
Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance ServiceNow-AI • Dec 9, 2025 • 84
ServiceNow-AI/Apriel-H1-15b-Thinker-SFT Text Generation • 16B • Updated Nov 3, 2025 • 26 • 29
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
ServiceNow-AI/Apriel-1.5-15b-Thinker Image-Text-to-Text • 15B • Updated Oct 6, 2025 • 188 • 469
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs Paper • 2509.08031 • Published Sep 9, 2025 • 21