PicoAudio2: Temporal Controllable Text-to-Audio Generation with Natural Language Description Paper • 2509.00683 • Published Aug 31, 2025
UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities Paper • 2509.24391 • Published Sep 29, 2025
Bayesian Speech Synthesizers Can Learn from Multiple Teachers Paper • 2510.24372 • Published Oct 28, 2025
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations Paper • 2510.25955 • Published Oct 29, 2025