Xiaomi Group AI Lab releases ZipVoice series of text-to-speech (TTS) models
2025-09-12 10:23:52

According to Xiaomi Technology, the Next-Generation Kaldi team at Xiaomi Group's AI Lab recently released the ZipVoice series of text-to-speech (TTS) models, which are based on the flow matching architecture. The series includes ZipVoice, a zero-shot single-speaker TTS model, and ZipVoice-Dialog, a zero-shot conversational TTS model. ZipVoice targets the large parameter counts and slow synthesis speed of existing zero-shot TTS models, while ZipVoice-Dialog targets the stability and inference-speed bottlenecks of existing conversational TTS models.