Hey HN!<p>After working in ML for more than a decade, I grew frustrated with the lack of tools for building baselines from simple rules and heuristics. It is well known that decent baselines for most business problems can be built from heuristics alone. That is why I have just open-sourced DataQA, a rules-based labelling tool for NLP:<p><pre><code> - Quick labelling: You can create complex rules using regular expressions to help you label your text faster.
- Search engine: DataQA also ships with a search engine (a local Elasticsearch database) so you can search your documents.
- Easy installation: You only need to install a single Python package!
- Easy use: Upload your data as CSV files.
- Privacy: No data ever leaves your computer.
</code></pre>
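To give a flavour of what labelling with regular-expression rules looks like, here is a minimal sketch in plain Python. It does not use DataQA's API: the rule list, the label names and the `label_text` helper are all made up for illustration, and it only uses the standard `re` module. Each document gets the label of the first rule that matches, and anything left unmatched is what you would still label by hand.

```python
import re

# Hypothetical rules for illustration only (not DataQA's API):
# each rule pairs a label with a compiled regular expression.
RULES = [
    ("COMPLAINT", re.compile(r"\b(refund|broken|not working)\b", re.IGNORECASE)),
    ("PRAISE", re.compile(r"\b(love|great|excellent)\b", re.IGNORECASE)),
]


def label_text(text, rules=RULES, default="UNLABELLED"):
    """Return the label of the first rule whose pattern matches, else `default`."""
    for label, pattern in rules:
        if pattern.search(text):
            return label
    return default


# Made-up example documents.
docs = [
    "I want a refund, the charger is broken.",
    "Great product, I love it!",
    "Where is my order?",
]

labels = [label_text(doc) for doc in docs]
for doc, label in zip(docs, labels):
    print(f"{label:<10} {doc}")
```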
I'm hoping to get some feedback, and I'm open to feature requests and ideas for extensions. I will be around to answer questions.