CBHG is a module used in the [[Tacotron architecture]] TTS model to extract representations from sequences. It consists of a bank of 1-D convolutional filters (convolution bank, CB), followed by highway networks (H) and a bidirectional [[gated recurrent unit (GRU)]] RNN (G).

![[tacotron-cbhg.png]]
[Wang et al. 2017](https://arxiv.org/abs/1703.10135)

## Reference

Wang, Yuxuan, R.J. Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, et al. ‘Tacotron: Towards End-to-End Speech Synthesis’. In _Interspeech 2017_, 4006–10. ISCA, 2017. [https://doi.org/10.21437/Interspeech.2017-1452](https://doi.org/10.21437/Interspeech.2017-1452).
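
## Shape sketch

To make the CB, H, G ordering concrete, here is a minimal sketch that only traces `(time, features)` shapes through the stages of a CBHG module; it does no actual computation. The function name `cbhg_shapes` and all parameter names are hypothetical. The intermediate stages (max pooling with stride 1, convolutional projections, residual connection) and the default sizes are meant to mirror the encoder CBHG described in Wang et al. 2017 and should be checked against the paper. Two further assumptions: convolutions are padded so the time length is preserved, and the highway layers keep the input width.

```python
def cbhg_shapes(time_steps, in_dim, K=16, bank_channels=128,
                proj_channels=(128, 128), highway_layers=4, gru_units=128):
    """Return (stage name, (time, features)) pairs for one input sequence.

    Illustrative shape bookkeeping only; every name here is hypothetical.
    """
    # The residual connection adds the projected features to the input,
    # so the last projection must have the same width as the input.
    if proj_channels[-1] != in_dim:
        raise ValueError("last projection width must equal in_dim")

    stages = [("input", (time_steps, in_dim))]

    # CB: K sets of 1-D filters with widths 1..K, outputs stacked on the
    # feature axis. Assumes padding that keeps the time length unchanged.
    bank_dim = K * bank_channels
    stages.append(("conv bank, widths 1..K stacked", (time_steps, bank_dim)))

    # Max pooling along time with stride 1 keeps the time resolution.
    stages.append(("max pool over time, stride 1", (time_steps, bank_dim)))

    # Fixed-width convolutions project back down to a smaller width.
    for i, channels in enumerate(proj_channels, start=1):
        stages.append((f"conv projection {i}", (time_steps, channels)))

    stages.append(("residual add with input", (time_steps, in_dim)))

    # H: highway layers keep their input width (assumed equal to in_dim).
    for i in range(1, highway_layers + 1):
        stages.append((f"highway layer {i}", (time_steps, in_dim)))

    # G: forward and backward GRU states are concatenated per time step.
    stages.append(("bidirectional GRU", (time_steps, 2 * gru_units)))
    return stages


if __name__ == "__main__":
    for name, shape in cbhg_shapes(time_steps=50, in_dim=128):
        print(f"{name:34s} {shape}")
```

The point of the trace is that the time axis is never shortened, so the module emits one output vector per input step, while the feature width grows in the bank, shrinks in the projections, and doubles at the bidirectional GRU.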