I'm intrigued to see whether anyone can squeeze out similar quality with a smaller dataset (Microsoft's implementation was apparently trained on 60,000 hours).<p>Not that a dataset of that size is impossible to get your hands on nowadays, but training on it still takes quite a long time on decent (though admittedly not extremely high-end) hardware.