Spatial Audio & Visual
Spatial Audio & Visual LLMs
JAEGER: Joint 3D Audio-Visual Grounding and Reasoning in Simulated Physical Environments • Paper • 2602.18527 • Published Feb 20 • 2
tsinghua-ee/JAEGER • Updated 12 days ago • 4
General Time Series
SciTS: Scientific Time Series Understanding and Generation with LLMs • Paper • 2510.03255 • Published Sep 26, 2025
OpenTSLab/SciTS • Preview • Updated Mar 19 • 2.34k • 2
Brain Signals
BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals • Paper • 2505.18185 • Published May 18, 2025 • 1
OpenTSLab/BrainOmni • Updated Oct 15, 2025 • 2
Speech & Audio Processing
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations • Paper • 2510.25955 • Published Oct 29, 2025 • 1
marcoyang/spear-xlarge-speech-audio • 0.6B • Updated Feb 3 • 5.3k • 7
video-SALMONN 2
video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions.
tsinghua-ee/video-SALMONN-2_plus_72B • Updated Sep 28, 2025 • 10 • 2
tsinghua-ee/video_SALMONN2plus_72B_audioAlign • Updated Jan 28 • 7
tsinghua-ee/video-SALMONN2_plus_7B_full • 9B • Updated Feb 23 • 362
tsinghua-ee/video-SALMONN-2_plus_7B • Updated Sep 28, 2025 • 87 • 6