Show HN: Rules-based labelling tool for NLP

55 points by dataqa, more than 3 years ago

5 comments

dataqa, more than 3 years ago
Hey HN!

After working in ML for more than a decade, I became frustrated with the lack of tools for creating baselines from simple rules and heuristics. It is well known that most business problems can reach a decent baseline using heuristics alone. That is why I have just open-sourced DataQA, a rules-based labelling tool for NLP:

- Quick labelling: You can create complex rules using regular expressions to help you label your text faster.
- Search engine: DataQA also ships with a search engine (a local elasticsearch database) so you can search your documents.
- Easy installation: You only need to install a single python package!
- Easy use: Upload your data as csv files.
- Privacy: No data ever leaves your computer.

I'm hoping to get some feedback, and I'm open to feature requests or ideas for extensions. I will be around to answer questions.
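
To make the "rules as a baseline" idea concrete, here is a minimal sketch of labelling documents with regular expressions. It is not DataQA's API or rule format: the `RULES` list, the label names, and the `label_document` helper are all hypothetical and only illustrate the general technique.

```python
import re

# Hypothetical rules: (label, compiled pattern) pairs.
# These are illustrative only and are not DataQA's rule format.
RULES = [
    ("refund", re.compile(r"\b(refund|money back)\b", re.IGNORECASE)),
    ("shipping", re.compile(r"\b(deliver(y|ed)?|shipping|courier)\b", re.IGNORECASE)),
]


def label_document(text, rules=RULES, default="unlabelled"):
    """Return the label of the first rule whose pattern matches the text."""
    for label, pattern in rules:
        if pattern.search(text):
            return label
    return default


docs = [
    "I want my money back",
    "The courier never arrived",
    "Hello, just saying thanks",
]
labels = [label_document(doc) for doc in docs]
```

Rules are checked in order, so earlier rules take priority when several match. Documents that no rule covers keep the default label and can be labelled by hand.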
teruakohatu, more than 3 years ago
Looks great. I can't try it right now, but looking at the documentation I would suggest an alternative to CSV upload.

For larger documents, CSV can be annoying. Line breaks need to be escaped, and so do commas. Pointing the application to a folder containing a corpus of text files is much easier.
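
As a rough illustration of the two input styles discussed here, the sketch below uses only the Python standard library. The `text` column name and the helper names are assumptions for illustration, not part of DataQA. It shows that a CSV writer has to quote fields containing commas or line breaks, while a folder of `.txt` files needs no escaping at all.

```python
import csv
import io
import tempfile
from pathlib import Path


def write_csv(texts):
    """Serialise documents into a one-column CSV; the csv module quotes
    any field that contains commas or line breaks."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["text"])  # hypothetical column name
    for text in texts:
        writer.writerow([text])
    return buf.getvalue()


def read_csv(data):
    """Parse the CSV back into a list of documents."""
    reader = csv.DictReader(io.StringIO(data, newline=""))
    return [row["text"] for row in reader]


def read_corpus(folder):
    """Load every .txt file in a folder as one document, keyed by file name."""
    paths = sorted(Path(folder).glob("*.txt"))
    return {p.name: p.read_text(encoding="utf-8") for p in paths}


doc = "First line\nSecond line, with a comma"

# CSV round trip: works, but only because the field is quoted on the way out.
csv_data = write_csv([doc])
round_tripped = read_csv(csv_data)

# Folder of text files: the document is stored exactly as written.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "doc1.txt").write_text(doc, encoding="utf-8")
    corpus = read_corpus(folder)
```

Hand-written CSV files often skip the quoting step, which is where multi-line documents tend to break.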
steve_g, more than 3 years ago
It looks like this tool is intended to label _documents_ using rules/heuristics. That seems useful.

My desired use case is to label words or phrases (named entity recognition), specifically for chemicals. It seems like this tool isn't designed for that. Am I understanding correctly?
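
For readers unfamiliar with the distinction, span-level labelling returns character offsets inside a document rather than one label per document. The sketch below shows the idea with a regular expression built from a toy lexicon. The `CHEMICALS` list, the `CHEMICAL` tag, and `label_spans` are hypothetical, and nothing here reflects what DataQA does or does not support.

```python
import re

# Hypothetical toy lexicon; a real chemical dictionary would be far larger.
CHEMICALS = ["sodium chloride", "ethanol", "benzene"]

# Longest terms first so multi-word names win over shorter alternatives.
_terms = sorted(CHEMICALS, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in _terms) + r")\b",
    re.IGNORECASE,
)


def label_spans(text):
    """Return (start, end, label) tuples for every lexicon match in the text."""
    return [(m.start(), m.end(), "CHEMICAL") for m in PATTERN.finditer(text)]


text = "Dissolve sodium chloride in ethanol."
spans = label_spans(text)
matched = [text[start:end] for start, end, _ in spans]
```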
sbdmmg, more than 3 years ago
Hi! Interesting project, congrats! How does it compare to https://calmcode.io/human-learn/introduction.html ?
schleck8, more than 3 years ago
Awesome!