ISCA Archive SSW 2016

Merlin: An Open Source Neural Network Speech Synthesis System

Zhizheng Wu, Oliver Watts, Simon King

We introduce Merlin, a toolkit for neural network-based speech synthesis. The system takes linguistic features as input and employs neural networks to predict acoustic features, which are then passed to a vocoder to produce the speech waveform. Several neural network architectures are implemented, including a standard feedforward neural network, a mixture density neural network, a recurrent neural network (RNN), and a long short-term memory (LSTM) recurrent neural network, amongst others. The toolkit is open source, written in Python, and extensible. This paper briefly describes the system and provides benchmarking results on a freely available corpus.
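
The pipeline in the abstract (linguistic features in, acoustic features out, vocoder last) can be illustrated with a minimal feedforward sketch. This is not Merlin's code or API. The function names, layer sizes, random untrained weights, and made-up input frames are all assumptions chosen for illustration, and the vocoder step is left out.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one layer, with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(weights, biases, x):
    """Compute W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def predict_acoustic(frames, hidden, output):
    """Map each linguistic feature vector to an acoustic feature vector."""
    predictions = []
    for x in frames:
        h = [math.tanh(v) for v in affine(*hidden, x)]  # tanh hidden layer
        predictions.append(affine(*output, h))          # linear output layer
    return predictions


# Hypothetical dimensions, not taken from the paper.
LINGUISTIC_DIM, HIDDEN_DIM, ACOUSTIC_DIM = 8, 16, 4

rng = random.Random(0)
hidden = init_layer(LINGUISTIC_DIM, HIDDEN_DIM, rng)
output = init_layer(HIDDEN_DIM, ACOUSTIC_DIM, rng)

# Three frames of made-up binary linguistic features.
frames = [[float(rng.random() > 0.5) for _ in range(LINGUISTIC_DIM)]
          for _ in range(3)]

acoustic = predict_acoustic(frames, hidden, output)
# A vocoder would turn `acoustic` into a waveform; that step is omitted here.
```

In a real system the weights would be trained on aligned linguistic and acoustic features, and the output vectors would carry whatever parameters the chosen vocoder expects.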


doi: 10.21437/SSW.2016-33

Cite as: Wu, Z., Watts, O., King, S. (2016) Merlin: An Open Source Neural Network Speech Synthesis System. Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9), 202-207, doi: 10.21437/SSW.2016-33

@inproceedings{wu16_ssw,
  author={Zhizheng Wu and Oliver Watts and Simon King},
  title={{Merlin: An Open Source Neural Network Speech Synthesis System}},
  year=2016,
  booktitle={Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9)},
  pages={202--207},
  doi={10.21437/SSW.2016-33}
}