Awesome Computer Use Agents Collection https://github.com/ranpox/awesome-computer-use • 25 items • Updated Dec 18, 2024 • 18
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 26 items • Updated 3 days ago • 136
Tiny Agents in Python: an MCP-powered agent in ~70 lines of code Article • May 23, 2025 • 170
DINOv2 Collection DINOv2: foundation models producing robust visual features suitable for image-level and pixel-level visual tasks - https://arxiv.org/abs/2304.07193 • 5 items • Updated Aug 13, 2025 • 30
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 298
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20, 2025 • 157
π0 and π0-FAST: Vision-Language-Action Models for General Robot Control Article • Feb 4, 2025 • 186
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated May 1, 2025 • 574
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 253
Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! Article • Apr 21, 2024 • 44
Open LLM Leaderboard best models ❤️🔥 Collection A daily updated list of the models with the best evaluations on the LLM leaderboard. • 65 items • Updated Mar 20, 2025 • 657
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated Feb 21, 2025 • 64
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 133
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 189