Cool stuff! It's nice to see platforms like this that abstract away good algorithms, so developers can focus on thinking up interesting applications. Open-source libs are even better, but pragmatically speaking, I think these kinds of platforms probably move faster and get better results.<p>One major competitor (well known to anyone who's looked into this stuff) is Alchemy [1]. I tried a New York Times link [2] on Aylien and Alchemy, and Alchemy performed much better -- in fact, Aylien didn't even successfully find the article body. I'm sure you guys will be iterating on improving the algorithms, but I just wanted to flag that as a potential turnoff for anyone comparing your website demo with Alchemy.<p>Best of luck!<p>[1] <a href="http://www.alchemyapi.com/products/demo/" rel="nofollow">http://www.alchemyapi.com/products/demo/</a><p>[2] <a href="http://www.nytimes.com/2014/02/18/world/middleeast/bombings-in-syria-force-wave-of-civilians-to-flee.html?hp&_r=0" rel="nofollow">http://www.nytimes.com/2014/02/18/world/middleeast/bombings-...</a>
I've seen quite a few NLP web APIs, and in my opinion this kind of thing tends not to scale: to be useful, such an API has to process an entire article in a fraction of a second. Although I'm not sure (the API is down because of the HN storm), it doesn't seem this tool will live up to those expectations either. In the end, my choice has always been to include/wrap an off-the-shelf tool in my own pipeline rather than rely on an external service that might be too slow for end users and mass mining alike...
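To make the point concrete, here's a toy in-process keyword extractor (pure stdlib, purely illustrative -- a real pipeline would wrap something like NLTK or CoreNLP instead): with no network round-trip per article, throughput is bounded by CPU, not by someone else's server.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it",
             "that", "for", "on", "was", "now"}

def keywords(text, n=5):
    # In-process stand-in for an off-the-shelf extractor: no network
    # round-trip per article, so mass mining stays fast.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

print(keywords("the cat sat on the cat mat cat"))
```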
This is a much better noun phrase / entity extractor:<p><a href="https://www.mashape.com/stremor/noun-entity-extraction-noun-phrase-part-of-speech-tagger-alpha" rel="nofollow">https://www.mashape.com/stremor/noun-entity-extraction-noun-...</a><p>We don't rely on CoreNLP or NLTK; we have our own sentence disambiguation and our own part-of-speech tools, so we're a lot faster.<p>Our other APIs let you piece together a lot of cool NLP projects with very little code.
These sorts of things are typically better offered as libraries, particularly as the training is usually specific to a corpus or a particular context.<p>It would be nice to offer a library with a bootstrapped training set.
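For illustration, a rough sketch (all names hypothetical, not any real library's API) of what a library shipping a bootstrapped training set might look like, with a retrain hook for a user's own corpus:

```python
from collections import Counter, defaultdict

class CategoryClassifier:
    # Hypothetical library class: ships with a tiny bootstrapped
    # training set, but users can retrain on their own corpus.
    SEED = [
        ("the team won the match", "sports"),
        ("shares fell on the exchange", "finance"),
    ]

    def __init__(self):
        self.counts = defaultdict(Counter)
        self.train(self.SEED)  # bootstrap from the bundled corpus

    def train(self, labeled_docs):
        # Accumulate per-label token counts; call again with your own
        # (text, label) pairs to adapt to a specific context.
        for text, label in labeled_docs:
            self.counts[label].update(text.lower().split())

    def predict(self, text):
        # Pick the label whose training tokens best cover the input.
        tokens = text.lower().split()
        return max(self.counts,
                   key=lambda label: sum(self.counts[label][t] for t in tokens))

clf = CategoryClassifier()
clf.train([("the striker scored a goal", "sports")])  # domain-specific tuning
print(clf.predict("the match was close"))
```

The seed corpus gets you off the ground, and the same `train` call folds in corpus-specific data -- which is exactly what a hosted API makes awkward.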
Unfortunately the website is still analyzing the example TechCrunch link (it's been 3 minutes already).<p>Is something broken? Maybe you could cache some recurring analyses.
Hey guys! Congrats, NLP is a huge problem that needs as many minds working on it as possible.<p>Just tried a few links:<p><a href="http://arstechnica.com/security/2014/02/dear-asus-router-user-youve-been-pwned-thanks-to-easily-exploited-flaw/" rel="nofollow">http://arstechnica.com/security/2014/02/dear-asus-router-use...</a><p><a href="http://blog.algore.com/2011/07/the_great_lakes_are_in_danger.html" rel="nofollow">http://blog.algore.com/2011/07/the_great_lakes_are_in_danger...</a><p>Am I missing something here? It seems like it's just parsing text; I'm not seeing any context (keywords, categories, summaries).<p>edit: It's giving fantastic results when pasting the raw text! :)<p>Are you guys using DBpedia? It's giving very similar results to a system I was working on in the past: <a href="http://www.zachvanness.com/nanobird_relevancy_engine.pdf" rel="nofollow">http://www.zachvanness.com/nanobird_relevancy_engine.pdf</a>
What do you use for the extraction of entities (if you don't mind saying)? I entered "The Cat in the Hat" is a good book. It didn't recognize any entities. Are you using an ontology for named entity resolution, or just extracting NPs?
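For what it's worth, here's a toy illustration of why that sentence is hard without an ontology (both functions are hypothetical stand-ins, not Aylien's actual method): a naive capitalization-based extractor splits the title apart at the lowercase "in the", while a gazetteer/ontology lookup recovers it whole.

```python
import re

def naive_entities(text):
    # Toy "NER": contiguous runs of capitalized words.
    return re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", text)

# Hypothetical ontology/gazetteer of known titles.
GAZETTEER = {"The Cat in the Hat"}

def gazetteer_entities(text):
    return [name for name in GAZETTEER if name in text]

sentence = '"The Cat in the Hat" is a good book.'
print(naive_entities(sentence))      # lowercase "in the" splits the title
print(gazetteer_entities(sentence))  # the lookup gets it whole
```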
It does really poorly on a Wiktionary entry like <a href="http://en.wiktionary.org/wiki/run" rel="nofollow">http://en.wiktionary.org/wiki/run</a> or on a Wikipedia article like <a href="http://en.wikipedia.org/wiki/Big_O_notation" rel="nofollow">http://en.wikipedia.org/wiki/Big_O_notation</a>
Playing around with it, I seem to have killed it by pasting in the text from this WP article (<a href="http://pastebin.com/AtCU7E8H" rel="nofollow">http://pastebin.com/AtCU7E8H</a>) and hitting analyze. It's been spinning for a while.<p><i>edit</i> I see from another response that the server room is in meltdown; I'll wait for a bit.
Maybe somebody will find my pet project useful and relevant: <a href="https://github.com/crypto5/wikivector" rel="nofollow">https://github.com/crypto5/wikivector</a>.
It uses machine learning with Wikipedia data as a training set, supports 10 languages, and is completely open source.
There are more and more text analysis APIs out there. Would you mind comparing your feature set to something like TextRazor (<a href="http://www.textrazor.com" rel="nofollow">http://www.textrazor.com</a>) or Open Calais?<p>What is special about your project?
"There was a time when men could roam free on earth, free from concrete and tarmac. Now it's all gone to shit."<p>Classification: arts, culture and entertainment - architecture .(WTF?)<p>Polarity: positive. (Nope)<p>Polarity confidence: 0.9994709276706056. (Well...)<p>Looks pretty rough to me.
A bunch of text-analysis libraries (stemmers, word breakers, etc.) ship "free" with Windows and support a ton of different languages. I wish MS would open up the API a bit more.