55 points by TakakiTohno about 5 years ago

3 comments

IMHO the speech dataset list is missing other interesting free corpora, e.g. TED-LIUM, VoxForge, and Common Voice. A more comprehensive (but not complete) list can be found here: <a href="https://github.com/kaldi-asr/kaldi/tree/master/egs" rel="nofollow">https://github.com/kaldi-asr/kaldi/tree/master/egs</a> (download links can be found in the scripts).

sschmitt about 5 years ago

Also see the "Heidelberg Spiking Datasets": <a href="https://ieee-dataport.org/open-access/heidelberg-spiking-datasets" rel="nofollow">https://ieee-dataport.org/open-access/heidelberg-spiking-dat...</a>

MintChocoisEw about 5 years ago

The Spoken Wikipedia corpus is especially impressive.

Audio Datasets for Machine Learning
