LMArena Leaderboard
Compare and rank AI model performance
Track, rank and evaluate open LLMs and chatbots
Embedding Leaderboard
Compare and evaluate speech recognition model performance across multiple benchmarks
Compare and find the best LLM performance on different hardware
Compare and evaluate open code models on benchmark tests
Can AI Code? An LLM leaderboard including quantized models.
View and submit LLM evaluations
Explore and submit LLM benchmarks
Transform images into artistic masterpieces with AI-powered tools
Evaluate LLMs' cybersecurity risks and capabilities
Compare and rank large language model performance
Explore and compare QA and long-document benchmarks
VLMEvalKit Evaluation Results Collection
Display and analyze reward model evaluation results
Explore and analyze code completion benchmarks
Display and filter multimodal model leaderboard results
Display MTEB Arena interface
Visualize Open vs. Proprietary LLM Progress
Compare and rank AI models through human voting
Blind vote on HF TTS models!
A leaderboard for LLMs powering smolagents