YourTTS architecture

**YourTTS** is a multi-speaker, multilingual text-to-speech (TTS) model that supports voice conversion and zero-shot speaker adaptation. It can also learn a new language or voice from a speech recording of around one minute. It is particularly useful for training TTS models in low-resource languages.

![[yourtts-architecture.png]]
[Casanova et al. (2022)](https://arxiv.org/pdf/2112.02418)

## Reference

Casanova, Edresson, Julian Weber, Christopher Shulby, Arnaldo Candido Junior, Eren Gölge, and Moacir Antonelli Ponti. ‘YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone’. arXiv, 16 February 2022. [https://doi.org/10.48550/arXiv.2112.02418](https://doi.org/10.48550/arXiv.2112.02418).