**YourTTS** is a multi-speaker, multilingual TTS model that supports zero-shot speaker adaptation and zero-shot voice conversion. It can also be fine-tuned to a new voice with about one minute of recorded speech, and the authors report promising results in a target language trained from a single-speaker dataset. This makes it particularly useful for building TTS models in low-resource languages.
![[yourtts-architecture.png]]
[Casanova et al. (2022)](https://arxiv.org/pdf/2112.02418)
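
To make zero-shot speaker adaptation concrete, here is a minimal sketch that synthesizes text in the voice of a short reference recording. It assumes the open-source Coqui `TTS` Python package is installed and that its model zoo exposes a YourTTS checkpoint under the identifier shown; neither is stated in this note. The helper `clone_voice`, the language list, and all file paths are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Assumption: identifier of the YourTTS checkpoint in the Coqui TTS model zoo.
MODEL_NAME = "tts_models/multilingual/multi-dataset/your_tts"

# Assumption: language codes accepted by that checkpoint. Verify them
# against the codes your installed version reports.
SUPPORTED_LANGUAGES = ("en", "fr-fr", "pt-br")


def clone_voice(text, speaker_wav, language="en", out_path="output.wav"):
    """Synthesize `text` in the voice of the speaker heard in `speaker_wav`.

    Hypothetical helper: the reference recording is only used to condition
    the model on a speaker; no fine-tuning happens here.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {language!r}; expected one of {SUPPORTED_LANGUAGES}"
        )

    # Imported lazily because the package is a heavy, optional dependency.
    from TTS.api import TTS

    tts = TTS(model_name=MODEL_NAME)  # downloads the checkpoint on first use
    tts.tts_to_file(
        text=text,
        speaker_wav=str(speaker_wav),
        language=language,
        file_path=str(out_path),
    )
    return Path(out_path)
```

A call such as `clone_voice("Hello there.", "reference.wav", language="en")` would write `output.wav`, where `reference.wav` stands in for any short, clean recording of the target speaker.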
## Reference
Casanova, Edresson, Julian Weber, Christopher Shulby, Arnaldo Candido Junior, Eren Gölge, and Moacir Antonelli Ponti. ‘YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone’. arXiv, 16 February 2022. [https://doi.org/10.48550/arXiv.2112.02418](https://doi.org/10.48550/arXiv.2112.02418).