I'm one of the authors of the paper that proposes the deep learning model implemented in the blog post, and I would recommend training on a different dataset, such as VCTK (freely available, and what we used in our paper).

Super-resolution methods are very sensitive to the choice of training data. They will overfit seemingly insignificant properties of the training set, such as the type of low-pass filter you are using, or the acoustic conditions under which the recordings were made (e.g. distance to the microphone when recording a speaker). There is a rough sketch of why the filter matters at the end of this comment.

To capture all the variations present in the TED talks dataset, you would need a very large model and would probably have to train it for >10 epochs. The VCTK dataset is better in this regard.

For comparison, here are our samples: kuleshov.github.io/audio-super-res/

I'm going to try to release the code over the weekend.
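
To illustrate the low-pass filter point: training pairs are usually generated by filtering and subsampling the high-rate audio, so the filter itself becomes part of the input distribution. This is a hypothetical sketch (not our exact preprocessing, and make_pair, r, and ftype are made-up names for illustration), using scipy.signal.decimate and numpy:

    import numpy as np
    from scipy.signal import decimate

    # Hypothetical sketch: build a (low-res, high-res) training pair from a
    # high-rate clip. The anti-aliasing filter applied before subsampling
    # ('iir' Chebyshev vs. 'fir') changes the spectral roll-off the network
    # sees at training time; a model trained with one filter can degrade when
    # the test audio was downsampled with another.
    def make_pair(hr, r=4, ftype="fir"):
        lr = decimate(hr, r, ftype=ftype)      # low-pass filter + subsample
        # Interpolate back to the original length so input and target are
        # aligned sample-for-sample, as is common for super-resolution setups.
        t_hr = np.arange(len(hr))
        t_lr = np.arange(len(lr)) * r
        x = np.interp(t_hr, t_lr, lr)
        return x.astype(np.float32), hr.astype(np.float32)

Swapping the filter here (or the one implicitly used by whatever produced your test audio) is enough to shift the input statistics the model was fit to.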