video-SALMONN 2 Collection
video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions.
11 items • Updated 3 days ago