Luis Ferreira · @lmf
57 followers · 106 posts · Server universeodon.com

An interesting paper from Microsoft:

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
valle-demo.github.io/

Abstract: Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work.

#ai #language_model #speech_synthesis #vall_e #tts #neural_networks
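
To make "TTS as a conditional language modeling task" concrete, here is a toy sketch in Python. It is not the paper's implementation: the codebook size, the end-of-sequence token, and the function names are all hypothetical, and the "model" is a deterministic stand-in rather than a trained network. It only illustrates the shape of the idea, which is sampling discrete audio codes one at a time, each conditioned on the text tokens, a short acoustic prompt, and the codes generated so far.

```python
import random

CODEBOOK_SIZE = 8      # hypothetical; real audio codecs use far larger codebooks
EOS = CODEBOOK_SIZE    # hypothetical end-of-sequence token id


def toy_next_code_distribution(text_tokens, prompt_codes, generated):
    """Stand-in for a trained model.

    Returns a probability distribution over the next discrete code,
    given the text tokens, the acoustic prompt codes, and the history.
    """
    context = list(text_tokens) + list(prompt_codes) + list(generated)
    # Derive repeatable pseudo-scores from the conditioning context.
    rng = random.Random(sum((i + 1) * t for i, t in enumerate(context)))
    weights = [rng.random() for _ in range(CODEBOOK_SIZE + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def generate_codes(text_tokens, prompt_codes, max_len=20, seed=0):
    """Autoregressively sample discrete codes until EOS or max_len."""
    rng = random.Random(seed)
    generated = []
    for _ in range(max_len):
        probs = toy_next_code_distribution(text_tokens, prompt_codes, generated)
        code = rng.choices(range(CODEBOOK_SIZE + 1), weights=probs, k=1)[0]
        if code == EOS:
            break
        generated.append(code)
    return generated


if __name__ == "__main__":
    # Token ids below are arbitrary placeholders.
    print(generate_codes(text_tokens=[1, 2, 3], prompt_codes=[4, 5]))
```

In a real system the sampled codes would be passed to the codec's decoder to produce a waveform; that step is omitted here.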
