Synthesized Audio Samples
Basilar membrane and otolaryngology are not auto-correlations.
Scientists at the CERN laboratory say they have discovered a new particle.
Donald Trump and I are great friends.
Peter Piper picked a peck of pickled peppers. How many pickled peppers did Peter Piper pick?
To cancel the payment, press one; or to continue, two.
Performance reviews are stressful, time-consuming, and often meaningless.
A nuclear weapon improvised from radioactive nuclear waste material and conventional explosives.
Tina Fey’s children are Alice Zenobia Richmond and Penelope Athena Richmond.
Deny thy father and refuse thy name.
Life is like a box of chocolates. You never know what you’re gonna get.
Prepare to make a right in 500 feet
Sally sells seashells by the seashore. The shells she sells are seashells I’m sure
My show is a total disaster
Compare Real to Generated Audio
why not use the time to register to vote
Real | Fake |
---|---|
it’s not a great sign for your case when
Real | Fake |
---|---|
Procedure
- Data was collected using my YouTube data scraper, YTTTS [1].
- The dataset went through several processing steps, including labelling clips in which the speaker was probably not John Oliver. The cleaned dataset is posted on Kaggle [4].
- The model was trained using Tacotron 2 [2, 3].
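
As a rough illustration of the cleaning step above, the sketch below drops clips flagged as another speaker and writes the rest in the `path|transcript` filelist format that the NVIDIA Tacotron 2 repository reads. The clip dictionaries, the `other_speaker` flag, and the `build_filelist` helper are hypothetical names chosen for illustration; they are not the actual YTTTS output or the Kaggle dataset schema.

```python
from pathlib import Path


def build_filelist(clips, out_path):
    """Write `wav_path|transcript` lines, skipping clips flagged as another speaker.

    `clips` is an iterable of dicts with hypothetical keys:
      "wav"           - path to the audio clip
      "text"          - transcript of the clip
      "other_speaker" - optional flag set during labelling
    Returns the list of lines that were written.
    """
    kept = []
    for clip in clips:
        # Skip clips labelled as probably not the target speaker.
        if clip.get("other_speaker"):
            continue
        # "|" is the field separator, so remove it from the transcript,
        # then collapse runs of whitespace into single spaces.
        text = " ".join(clip["text"].replace("|", " ").split())
        if not text:
            continue
        kept.append(f"{clip['wav']}|{text}")
    # One entry per line, each terminated by a newline.
    Path(out_path).write_text("".join(line + "\n" for line in kept), encoding="utf-8")
    return kept
```

Calling `build_filelist(clips, "train_filelist.txt")` with a list of such dictionaries would produce a file that can be referenced from the training configuration; the file name here is only an example.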
Model Download + Kaggle Kernel
The model was trained with a batch size of 32. The linked folder [5] contains all checkpoints saved throughout training. As training continues, new checkpoints will be uploaded to the same folder. The accompanying Kaggle kernel is available at [6].
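
Because several checkpoints are saved over the course of training, it can help to pick the most recent one programmatically. The sketch below assumes the files follow the `checkpoint_<iteration>` naming used by the NVIDIA Tacotron 2 training script; whether the shared folder uses exactly that naming is an assumption, and `latest_checkpoint` is a hypothetical helper.

```python
import re


def latest_checkpoint(names):
    """Return the file name with the highest iteration number, or None.

    Assumes names of the form `checkpoint_<iteration>`; anything else is ignored.
    """
    pattern = re.compile(r"^checkpoint_(\d+)$")
    best_name = None
    best_iteration = -1
    for name in names:
        match = pattern.match(name)
        if match is None:
            continue
        iteration = int(match.group(1))
        # Compare numerically: a plain string sort would rank
        # "checkpoint_2000" above "checkpoint_10000".
        if iteration > best_iteration:
            best_name, best_iteration = name, iteration
    return best_name
```

For example, given the names `checkpoint_1000`, `checkpoint_2000`, and `checkpoint_10000`, the helper returns `checkpoint_10000`.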
Links
[1] https://github.com/Ryan-Rudes/YTTTS
[2] https://github.com/NVIDIA/tacotron2
[3] https://arxiv.org/abs/1712.05884
[4] https://www.kaggle.com/ryanrudes/johnoliver
[5] https://drive.google.com/drive/folders/1jj0Ktck3ZybpDzY1yzveODnh4qFSvsAl?usp=sharing
[6] https://www.kaggle.com/ryanrudes/voice-cloning-with-tacotron-2