Nice to see some active learning around here. To add a data point from a less successful story:

In one of our research projects, we used AL to improve part-of-speech prediction, inspired by work by Rehbein and Ruppenhofer, e.g. https://www.aclweb.org/anthology/P17-1107/

Our dataset was a corpus of Scientific English from the 17th century to the present. For our data and situation, we found that choosing the right tool/model and having the right training data were the most important things. Once those were in place, active learning unfortunately did not add much: across different tools/settings, we got about +/-0.2% in accuracy for checking 200k tokens and correcting only 400 of them.

Maybe one problem was that AL was only triggered when a majority vote among the taggers was inconclusive. Also, we used it on top of individualised, gold-standard training data; I guess things can look different if you don't have a gold standard to start with. And if you have better computational resources: our oracles spent quite some time waiting for the next query, which is why we eventually reorganised the original design to process batches of corrections instead.

As so often, those null results were hard to publish :|

Either way, I thought I'd share our experiences. Your work sounds really cool, best of luck!
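
P.S. in case it's useful to anyone, the trigger was essentially the following idea (a toy Python sketch with made-up tags and names, not our actual pipeline): run several taggers over the same tokens, take a majority vote per token, and only queue a token for the human oracle when no strict majority emerges.

    # Toy sketch of the majority-vote trigger (made-up example, not our real code).
    from collections import Counter

    def select_for_annotation(predictions_per_tagger):
        """predictions_per_tagger: one predicted tag sequence per tagger."""
        to_annotate = []
        for i, tags in enumerate(zip(*predictions_per_tagger)):
            _, votes = Counter(tags).most_common(1)[0]
            # inconclusive: no strict majority among the taggers
            if votes <= len(tags) // 2:
                to_annotate.append(i)
        return to_annotate

    # e.g. three taggers disagreeing on token 1:
    preds = [["NN", "VBZ", "DT"],
             ["NN", "NNS", "DT"],
             ["NN", "VBD", "DT"]]
    print(select_for_annotation(preds))  # -> [1]

Returning a list of token indices is also what made the batching change easy: instead of querying the oracle token by token, we collected the flagged indices and handed them over in one go.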