TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Pattern Recognition in Texts Using Complex Networks

18 pointsby Rodalmost 15 years ago

3 comments

sqrt17almost 15 years ago
Can someone who has more of a network theory background say why this would be interesting?<p>From an NLP angle, both what they're doing (text classification) and how they're doing it (constructing a co-occurrence matrix) don't sound particularly novel nor do the network-theoretic properties they get from the unweighted, undirected form of the co-occurrence matrix seem to give any valuable insights.<p>As a comparison, see the 2009 workshop on text graphs <a href="http://www.textgraphs.org/ws10/index.html" rel="nofollow">http://www.textgraphs.org/ws10/index.html</a> or papers such as Gaume et al (2007) Semantic associations and confluences in paradigmatic networks <a href="http://w3.erss.univ-tlse2.fr/textes/pagespersos/gaume/resources/Gaume_Duvignau_Vanhove_final.pdf" rel="nofollow">http://w3.erss.univ-tlse2.fr/textes/pagespersos/gaume/resour...</a><p>Did I mention that the physics people totally ignore all the (interesting and non-trivial) existing literature on the topic? It's a bit as if a CS/NLP person would write a paper on an information-theoretic approach to physics while totally ignoring the physics bits in it.
mark_l_watsonalmost 15 years ago
I just gave the PDF a quick read, and it looked useful enough to put in my NLP permanent reading collection. There seems to be growing momentum in both NLP research and the number of good papers that are freely available. Last month, someone posted a link to "ICWSM – A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews" on HN - another interesting and potentially useful paper.
hooandealmost 15 years ago
Interesting that they used the research equivalent of a Minimum Viable Product. From the paper:<p><i>Of course, more complicated semantic network models are certainly possible. For instance, one could construct a weighted network. However, we sought the simplest possible model which could distinguish between fictional and non-fictional written storytelling.</i>