Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 Image-Text-to-Text • 31B • Updated 21 days ago • 128k • 91
Qwen/Qwen3-VL-30B-A3B-Instruct Image-Text-to-Text • 31B • Updated 21 days ago • 1.38M • 446
Qwen/Qwen3-VL-30B-A3B-Thinking Image-Text-to-Text • 31B • Updated 21 days ago • 55.7k • 165
Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 Image-Text-to-Text • 236B • Updated 21 days ago • 319k • 32
Qwen/Qwen3-VL-235B-A22B-Thinking-FP8 Image-Text-to-Text • 236B • Updated 21 days ago • 8.46k • 24
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated 21 days ago • 156k • 335
Qwen/Qwen3-VL-235B-A22B-Thinking Image-Text-to-Text • 236B • Updated 21 days ago • 10k • 346
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 114
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 304
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11, 2024 • 37
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19 • 28