Cool stuff! It's nice to see platforms like this that abstract away good algorithms, so developers can focus on thinking up interesting applications. Open-source libs are even better, but pragmatically speaking, I think these kinds of platforms probably move faster and get better results.<p>One major competitor (well known to anyone who's looked into this stuff) is Alchemy [1]. I tried a New York Times link [2] on Aylien and Alchemy, and Alchemy performed much better -- in fact, Aylien didn't even successfully find the article body. I'm sure you guys will be iterating on improving the algorithms, but I just wanted to flag that as a potential turnoff for anyone comparing your website demo with Alchemy.<p>Best of luck!<p>[1] <a href="http://www.alchemyapi.com/products/demo/" rel="nofollow">http://www.alchemyapi.com/products/demo/</a><p>[2] <a href="http://www.nytimes.com/2014/02/18/world/middleeast/bombings-in-syria-force-wave-of-civilians-to-flee.html?hp&_r=0" rel="nofollow">http://www.nytimes.com/2014/02/18/world/middleeast/bombings-...</a>
I've seen quite a few NLP web APIs, and in my opinion this kind of thing tends not to scale: to be useful, such an API has to process an entire article in a fraction of a second. Although I'm not sure (the API is down because of the HN storm), it doesn't seem this tool will live up to those expectations either. In the end, my choice has always been to include/wrap an off-the-shelf tool in my own pipeline rather than rely on an external service that might be too slow for end users and mass mining alike...
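To make the point concrete, here's a toy in-process keyword extractor (pure stdlib, purely illustrative -- a real pipeline would wrap something like NLTK or CoreNLP instead): with no network round-trip per article, throughput is bounded by CPU, not by someone else's server.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it",
             "that", "for", "on", "was", "now"}

def keywords(text, n=5):
    # In-process stand-in for an off-the-shelf extractor: no network
    # round-trip per article, so mass mining stays fast.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

print(keywords("the cat sat on the cat mat cat"))
```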
This is a much better noun phrase / entity extractor:<p><a href="https://www.mashape.com/stremor/noun-entity-extraction-noun-phrase-part-of-speech-tagger-alpha" rel="nofollow">https://www.mashape.com/stremor/noun-entity-extraction-noun-...</a><p>We don't rely on CoreNLP or NLTK; we have our own sentence disambiguation and our own part-of-speech tools, so we're a lot faster.<p>Our other APIs let you piece together a lot of cool NLP projects with very little code.
These sorts of things are typically better offered as libraries, particularly as the training is usually specific to a corpus or a particular context.<p>It would be nice to offer a library with a bootstrapped training set.
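For illustration, a rough sketch (all names hypothetical, not any real library's API) of what a library shipping a bootstrapped training set might look like, with a retrain hook for a user's own corpus:

```python
from collections import Counter, defaultdict

class CategoryClassifier:
    # Hypothetical library class: ships with a tiny bootstrapped
    # training set, but users can retrain on their own corpus.
    SEED = [
        ("the team won the match", "sports"),
        ("shares fell on the exchange", "finance"),
    ]

    def __init__(self):
        self.counts = defaultdict(Counter)
        self.train(self.SEED)  # bootstrap from the bundled corpus

    def train(self, labeled_docs):
        # Accumulate per-label token counts; call again with your own
        # (text, label) pairs to adapt to a specific context.
        for text, label in labeled_docs:
            self.counts[label].update(text.lower().split())

    def predict(self, text):
        # Pick the label whose training tokens best cover the input.
        tokens = text.lower().split()
        return max(self.counts,
                   key=lambda label: sum(self.counts[label][t] for t in tokens))

clf = CategoryClassifier()
clf.train([("the striker scored a goal", "sports")])  # domain-specific tuning
print(clf.predict("the match was close"))
```

The seed corpus gets you off the ground, and the same `train` call folds in corpus-specific data -- which is exactly what a hosted API makes awkward.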
Unfortunately the website is still analyzing the example TechCrunch link (it's been 3 minutes already).<p>Is something broken? Maybe you could cache some recurring analyses.
Hey guys! Congrats, NLP is a huge problem that needs as many minds working on it as possible.<p>Just tried a few links:<p><a href="http://arstechnica.com/security/2014/02/dear-asus-router-user-youve-been-pwned-thanks-to-easily-exploited-flaw/" rel="nofollow">http://arstechnica.com/security/2014/02/dear-asus-router-use...</a><p><a href="http://blog.algore.com/2011/07/the_great_lakes_are_in_danger.html" rel="nofollow">http://blog.algore.com/2011/07/the_great_lakes_are_in_danger...</a><p>Am I missing something here? It seems like it's just parsing text; I'm not seeing any context (keywords, categories, summaries).<p>edit: It's giving fantastic results when pasting the raw text! :)<p>Are you guys using DBpedia? It's giving very similar results to a system I was working on in the past: <a href="http://www.zachvanness.com/nanobird_relevancy_engine.pdf" rel="nofollow">http://www.zachvanness.com/nanobird_relevancy_engine.pdf</a>
What do you use for the extraction of entities (if you don't mind saying)? I entered "The Cat in the Hat" is a good book. It didn't recognize any entities. Are you using an ontology for named entity resolution, or just extracting NPs?
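For what it's worth, here's a toy illustration of why that sentence is hard without an ontology (both functions are hypothetical stand-ins, not Aylien's actual method): a naive capitalization-based extractor splits the title apart at the lowercase "in the", while a gazetteer/ontology lookup recovers it whole.

```python
import re

def naive_entities(text):
    # Toy "NER": contiguous runs of capitalized words.
    return re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", text)

# Hypothetical ontology/gazetteer of known titles.
GAZETTEER = {"The Cat in the Hat"}

def gazetteer_entities(text):
    return [name for name in GAZETTEER if name in text]

sentence = '"The Cat in the Hat" is a good book.'
print(naive_entities(sentence))      # lowercase "in the" splits the title
print(gazetteer_entities(sentence))  # the lookup gets it whole
```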
It does really poorly on a Wiktionary entry like <a href="http://en.wiktionary.org/wiki/run" rel="nofollow">http://en.wiktionary.org/wiki/run</a> or on a Wikipedia article like <a href="http://en.wikipedia.org/wiki/Big_O_notation" rel="nofollow">http://en.wikipedia.org/wiki/Big_O_notation</a>
Playing around with it, I seem to have killed it by pasting in the text from this WP article (<a href="http://pastebin.com/AtCU7E8H" rel="nofollow">http://pastebin.com/AtCU7E8H</a>) and hitting analyze. It's been spinning for a while.<p><i>edit</i> I see from another response that the server room is in meltdown; I'll wait for a bit.
Maybe somebody will find my pet project useful and relevant: <a href="https://github.com/crypto5/wikivector" rel="nofollow">https://github.com/crypto5/wikivector</a>.
It uses machine learning with Wikipedia data as a training set, supports 10 languages, and is completely open source.
There are more and more text analysis APIs out there. Would you mind comparing your feature set to something like TextRazor (<a href="http://www.textrazor.com" rel="nofollow">http://www.textrazor.com</a>) or Open Calais?<p>What is special about your project?
"There was a time when men could roam free on earth, free from concrete and tarmac. Now it's all gone to shit."<p>Classification: arts, culture and entertainment - architecture .(WTF?)<p>Polarity: positive. (Nope)<p>Polarity confidence: 0.9994709276706056. (Well...)<p>Looks pretty rough to me.
A bunch of text-analysis libraries (stemmers, word breakers, etc.) ship "free" with Windows and support a ton of different languages. I wish MS would open up the API a bit more.