SearchArena
In-the-wild Interactions with Search-LLMs w/ Human Preferences
lmarena-ai/search-arena-v1-7k Viewer • Updated Apr 14, 2025 • 7k • 45 • 24
lmarena-ai/search-arena-24k Viewer • Updated May 16, 2025 • 24.1k • 169 • 24
Search Arena: Analyzing Search-Augmented LLMs Paper • 2506.05334 • Published Jun 5, 2025 • 17
Prompt-to-Leaderboard
lmarena-ai/p2l-7b-grk-01112025 7B • Updated Feb 25, 2025 • 3 • 4
lmarena-ai/p2l-3b-grk-01112025 3B • Updated Feb 25, 2025 • 9 • 1
lmarena-ai/p2l-1.5b-grk-01112025 2B • Updated Feb 25, 2025 • 3
lmarena-ai/p2l-0.5b-grk-01112025 0.5B • Updated Feb 25, 2025 • 745 • 1
Arena-Hard-Auto
An automatic evaluation tool for LLMs.
Running 7 Arena Hard Viewer 7 Browse and view model judgments in benchmarks
lmarena-ai/arena-hard-auto Updated May 1, 2025 • 325 • 6