TechEcho

9 comments

neiman1 almost 4 years ago

Hey HN!

Markup is an open-source annotation tool for transforming unstructured documents into a structured format that can be used for ML, NLP, etc.

Markup learns as you annotate in order to speed up the process by suggesting complex annotations to you.

There are also a few different in-built tools, including:

- A data generator that helps you to produce synthetic data for training the suggestion model
- An annotator diff tool that helps you to compare annotations produced by multiple annotators

It's still very much a work in progress (and the documentation is severely lacking), but the ultimate goal is to make a tool that's as useful as https://prodi.gy/, without the $400 price tag.
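To make the annotator diff idea concrete, here is a minimal sketch of comparing two annotators' spans. It is not Markup's implementation: the `Span` type, the `diff_annotations` function, and the sample labels are all hypothetical, and it assumes annotations are exact (start, end, label) character spans with no partial-overlap matching.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical annotation record: character offsets plus a label."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str


def diff_annotations(a: set[Span], b: set[Span]) -> dict[str, list[Span]]:
    """Split two annotators' spans into agreed and annotator-specific lists."""
    return {
        "both": sorted(a & b),     # identical offsets and label
        "only_a": sorted(a - b),   # present only for annotator A
        "only_b": sorted(b - a),   # present only for annotator B
    }


text = "Patient was given 20mg of aspirin daily."
ann_a = {Span(18, 22, "DOSE"), Span(26, 33, "DRUG")}
ann_b = {Span(18, 22, "DOSE"), Span(26, 33, "MEDICATION")}

result = diff_annotations(ann_a, ann_b)
for group, spans in result.items():
    for s in spans:
        print(group, repr(text[s.start:s.end]), s.label)
```

Here both annotators agree on the dose span, while the same "aspirin" span shows up once per annotator because the labels differ. A real diff tool would probably also want to flag overlapping spans with different boundaries, which this sketch ignores.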

hadsed almost 4 years ago

Beautiful. So many annotation tools focus on "text classification" which assumes you've already got segmented samples. In the real world of documents that's a whole challenge in itself.

Another challenge is that sometimes you're working with PDFs and that means not only ingesting but also displaying. The difficulty is in keeping track of annotations and predictions across the PDF<->text string boundary, both ways.

There are understandably even fewer solutions to that problem because it's a harder UI to build.
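A minimal sketch of the bookkeeping described above, covering only the text side. It assumes the PDF has already been extracted into one string per page by some other tool; the `PageIndex` class and its method names are hypothetical, and mapping to on-page coordinates for display would need bounding boxes that are not modeled here.

```python
from bisect import bisect_right


class PageIndex:
    """Hypothetical helper: maps offsets in the joined text to (page, local offset) and back."""

    def __init__(self, pages: list[str], sep: str = "\n"):
        self.starts: list[int] = []   # global offset at which each page begins
        pos = 0
        for page in pages:
            self.starts.append(pos)
            pos += len(page) + len(sep)
        self.text = sep.join(pages)   # the single string a model would see

    def to_page(self, offset: int) -> tuple[int, int]:
        """Global offset -> (page number, offset within that page)."""
        page = bisect_right(self.starts, offset) - 1
        return page, offset - self.starts[page]

    def to_global(self, page: int, local: int) -> int:
        """(page number, offset within that page) -> global offset."""
        return self.starts[page] + local


pages = ["Invoice 42", "Total: 99 USD"]
index = PageIndex(pages)

start = index.text.index("99")            # a prediction made on the joined text
page, local = index.to_page(start)        # where to highlight it in the document
print(page, local, pages[page][local:local + 2])
```

The same index works in the other direction: a span the user selects on a page can be converted with `to_global` before being stored or fed back to the model.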

kwerk almost 4 years ago

This looks incredible! I've been following doccano for a while but they were still working on active learning. Will you be adding an open-source license like MIT?

Delk almost 4 years ago

Looks like an interesting project. Would you have some kind of a summary of the methodology you're using for the annotation suggestions? What kind of learning, and which kinds of features?

forgingahead almost 4 years ago

Really nice tool - thanks for making this! What is your plan for this? Is this a side-project that you'll potentially turn into a business, or is this just a hobby on the side of your full-time job?

Just asking because I think many folks would be happy to pay to support a small ISV to ensure its long-term sustainability. Not via donations, but actual pricing.

hbcondo714 almost 4 years ago

> Document to annotate - The document you intend to annotate (must be .txt file)

Any thoughts on supporting additional file formats? I'm actually interested in annotating HTML files / web pages. It would be great if I could browse for a local HTML file or enter a URL and have the HTML content rendered so it can be annotated using the entities.

jclos almost 4 years ago

That's fantastic. I was about to start a project in October building something that's almost completely there already, for a specific use case (annotation of therapy sessions).

rubatuga almost 4 years ago

What are some of your competitors, as well as any other open-source alternatives? What makes your tool better?

slava_kiose almost 4 years ago

Amazing! So many tools, it's very useful. Thanks.

Show HN: An annotation tool for ML and NLP
