Qwen3-Omni

Qwen3-Omni is a natively end-to-end, multilingual omni-modal model. It processes text, images, audio, and video, and delivers real-time streaming responses in both text and natural speech. We introduce several upgrades to improve performance and efficiency.