Vision LLM - a SamoXXX Collection



Vision LLM

updated May 5, 2025

A collection of the best vision LLMs, gathered to study and learn from.

rhymes-ai/Aria

Image-Text-to-Text • 25B • Updated Apr 23, 2025 • 100k • 637
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 267 • 1.71k
jadechoghari/Ferret-UI-Gemma2b

Image-Text-to-Text • Updated Oct 18, 2024 • 628 • 52
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • Updated Jan 8, 2025 • 179 • 68
gpt-omni/mini-omni2

Any-to-Any • Updated Oct 24, 2024 • 82 • 283
mPLUG/DocOwl2

Image-Text-to-Text • 9B • Updated Sep 27, 2024 • 317 • 115
google/siglip-so400m-patch16-256-i18n

Zero-Shot Image Classification • 1B • Updated Nov 18, 2024 • 253 • 31
openvla/openvla-7b

Robotics • 8B • Updated Feb 17 • 1.83M • 216
NexaAI/OmniVLM-968M

0.5B • Updated Aug 20, 2025 • 733 • 532
Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 8.73M • 1.53k
ByteDance-Seed/UI-TARS-7B-SFT

Image-Text-to-Text • 8B • Updated Jan 25, 2025 • 4.4k • 178
moonshotai/Kimi-VL-A3B-Instruct

Image-Text-to-Text • 16B • Updated Jan 30 • 247k • 259
reducto/RolmOCR

Image-Text-to-Text • 8B • Updated Apr 2, 2025 • 309k • 586
