What happened to the Semantic Web? (2017) [pdf]

83 points by animationwill over 4 years ago

25 comments

echelon over 4 years ago

I was replying to another commenter, but they deleted their post. I still think my response is legitimate, and I'd like to discuss it. I'm including the post and my response to it:

> This might be an unpopular opinion, but the Semantic Web was yet another idea that only could've come from Architecture Astronauts. Neither web developers nor web users wanted or needed it, but the Astronauts just kept pushing forward, with no support whatsoever, until people with actual skin in the game of the future of the Web got fed up and sidelined the W3C in favor of WHATWG/HTML5.

> When the Architecture Astronauts produce a product no one wants/needs at a big megacorp like Microsoft, it's easy to make fun of them, but for whatever reason when it happens in an open-source environment or in academia, people are a lot more likely to lament "Oh, why couldn't they understand?", which has it backwards: it's your job to understand them, and in your arrogance you deliberately didn't do that.

The Semantic Web flew in the face of enterprises like Google. It proposed to put all information into a queryable ontology. With a rich and expressive grammar, who needs Google's algorithms to rank anything? All you need to do is ingest, and you can weight the edges based on who else consumes them.

Google actually knee-capped the web with the WHATWG's focus on less semantic documents. The platforms with their army of engineers are what did the technology in.

If the Semantic Web had had a chance to grow before the giants of tech emerged, we might be looking at a vastly superior internet today.

wombatmobile over 4 years ago

The idea behind the semantic web was to embed intelligence in web content through semantic markup, so that downstream agents could make sense of the content.

However, the semantic web didn't have a business model to drive adoption. It assumed web content would be authored with semantic indicators that could later be harvested usefully (including commercially), but failed to consider what incentive or economic paradigm would compel content authors to embed semantic indicators.

Meanwhile, the problem/opportunity was solved in other ways.

Consider this example:

- - -

The promise: machine intelligence

“At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.”

(The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)

- - -

The problem was solved by commercial services, e.g. Google Maps, in which the semantic layer is provided, retained and exploited by the service aggregator, Google, which monetises the service through advertising.

Ironically, "machine intelligence" has not come to mean semantically enriched source documents, but rather elaborate, computationally intensive processes applied to harvest commercial opportunities from unenriched web content.

wrnr over 4 years ago

I've thought about improving [[Roam]] with semantics, but maybe this is a fool's errand.

Humans' use of language is somewhere between the bag-of-words model and part-of-speech tagging. Neither a word salad nor a constructive proof based in first-order logic.

<pre><code> I had a [[meeting]] with [[John Doe]] about [[Project X]]. </code></pre>

is much more natural than:

<pre><code> _:meeting1 a schema:Meeting; schema:participant [ a foaf:Person; rdfs:label "John Doe" ] ; </code></pre>

The early web is also more like Roam than the semantic web, and the reason something like PageRank works is that the links form "naturally".

The semantic web wanted to make them "understandable" to computers, but made linking too difficult for people.

Even still, the task of understanding humans has nothing to do with semantics. The academic discipline that studies this is called language pragmatics.

BTW, this problem was why Wittgenstein had a nervous breakdown, trying to prove/compute the meaning of statements with other statements.

At the end of the day words mean what they mean. I mean, I have a beetle in a box in my head and you have a beetle in a box in your head. Mine walks and flies through the air, yours drives on the road.

We expect computers to be able to do this, but I suspect that this won't happen until they can take a shit.

Maybe it's not a bad goal to have, but even the semantic web (academic) community has given up on that question and has pivoted its effort to saving the web and giving people back control of their data with the power of a digital twin, whatever that is.
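
To make the contrast above concrete, here is a minimal Python sketch of how little machinery the wiki-link style needs. The regular expression and the `mentionedWith` predicate are assumptions made up for illustration; they are not part of Roam or of any RDF vocabulary. The point is only that the links come out untyped, whereas the Turtle version forces the author to name every relationship.

```python
import re
from typing import List, Tuple

# Matches Roam-style [[page links]]; this pattern is an illustrative assumption.
WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def extract_links(note: str) -> List[str]:
    """Return the page names referenced in a note, in order of appearance."""
    return WIKI_LINK.findall(note)


def cooccurrence_triples(note: str) -> List[Tuple[str, str, str]]:
    """Emit one untyped triple per pair of links that share a note.

    'mentionedWith' is a hypothetical catch-all predicate: the note itself
    never says *how* the linked things are related.
    """
    links = extract_links(note)
    return [
        (a, "mentionedWith", b)
        for i, a in enumerate(links)
        for b in links[i + 1:]
    ]


note = "I had a [[meeting]] with [[John Doe]] about [[Project X]]."
links = extract_links(note)
triples = cooccurrence_triples(note)
print(links)
print(triples)
```

Everything a reader would call the meaning (that John Doe attended the meeting, that Project X was its topic) stays in the prose, which is roughly the pragmatics gap described above.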

JimmyRuska over 4 years ago

The semantic web is still here, it's just not a dominant headline.

As an enterprise architect I've evaluated most of the major AWS and GCP offerings. We still decided to look into the semantic web for many use cases, and it was still the best solution.

The semantic web is many things, so I'll narrow it down to one of the most impressive offerings, RDFox: https://www.youtube.com/watch?v=tpB_tl1Vc0A

You use RDFox when you want a graph database but you also want reasoning. You can do logical deduction to infer new information. Just like programming languages evolved to have modules and namespace systems, the semantic web allows you to namespace entities to more easily share data. It's based on description logic, a subset of first-order logic.

The alien features I don't see anywhere else:

* I can add logic to my graph database and have it execute as soon as I insert data into the database
* Recursive queries are way easier than in SQL
* Forget materialized views and generated columns; RDFox can automatically apply description logic to update facts as soon as you insert data into the database ("incremental reasoning")
* The magical declarative, reconciled syntax that has made tools like Kubernetes, Terraform, GraphQL and React popular is now generalized for you to use in any app at almost any scale
* You can put business rules in your database, and even if they heavily chain off other rules, it's lightning fast. You can also just type "explain" to see how data was derived
* If you have streaming pipelines that you were sending to ETL and importing back, in many cases you can use the streaming inference to do this all for you without separate apps
* You can import data from Wikidata and other massive RDF-based sources. The upcoming Abstract Wikipedia project may have broad implications for the availability of RDF, especially if they're successful and it's heavily copied everywhere for other domains
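
As a rough illustration of what "logical deduction to infer new information" means, here is a minimal Python sketch of naive forward chaining over triples. It is not RDFox's API or algorithm: the function name, the `locatedIn` predicate and the sample facts are all invented for the example, and it recomputes everything from scratch rather than reasoning incrementally.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def infer_transitive(facts: Set[Triple], predicate: str) -> Set[Triple]:
    """Apply one rule, transitivity of `predicate`, until a fixpoint is reached.

    Rule: (a, predicate, b) and (b, predicate, d)  =>  (a, predicate, d)
    """
    derived = set(facts)
    while True:
        new = {
            (a, predicate, d)
            for (a, p1, b) in derived if p1 == predicate
            for (c, p2, d) in derived if p2 == predicate and b == c
        } - derived
        if not new:
            # No rule application produced anything new: we are done.
            return derived
        derived |= new


# Hypothetical sample data.
facts = {
    ("london", "locatedIn", "england"),
    ("england", "locatedIn", "uk"),
    ("uk", "locatedIn", "europe"),
}
closure = infer_transitive(facts, "locatedIn")
print(sorted(closure - facts))
```

The derived facts can then be queried like any stored fact, which is the sense in which the comment compares reasoning to materialized views. A real engine would update them as data is inserted or deleted instead of rerunning the loop.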

thelazydogsback over 4 years ago

I think we missed the boat -- the focus on fancy rendering rather than on the actual data/content set us back years in functionality and helped create free and for-pay walled gardens where actionable data stays server-side and rendering is forced upon us, rather than just having reasonable defaults and then having rendering (incl. multi-modal interaction, etc.) further guided by user-defined settings and heuristics. Even if data happens to be rendered client-side in an SPA, it's done through application-specific means. I suppose deep-learning-based data extraction from free-form pre- and post-rendered content will eventually make up for this, but at least part of that road could have been avoided -- and of course we're probably going to have to pay someone (one way or another) for the privilege of having the content un-rendered back into actionable data at scale...

bane over 4 years ago

The incentives were never aligned to make the Semantic Web work. For it to work, page authors had to go through the trouble of making their own content compatible, but it didn't really buy them anything in return, and for big content producers it removed their main revenue source -- ads. As a result, the places that had lots of information never bothered, and small users never bothered, and the SW petered out.

In some ways it's similar to what we see in certain news sites. They only link to other pages on their own site even though it would be trivial to link out to the original information pages. Sites will even host public documents in in-line PDF readers in order to not link back to completely publicly available government sites -- scientific and engineering advancements are often vaguely and imprecisely talked about with no link to the original paper or announcement from the source. By living on ad revenue, these sites want to roach motel you into their hypertext jail, and the result is that information gets twisted and misreported as it propagates and is repeated in other places. News sites will even reference each other without linking back to the other site's original content.

Ads killed the Semantic Web.

cratermoon over 4 years ago

Simple. Nobody needs it to sell things and make money, or spread ideas and grab power. Amazon isn't going to spend a penny of its income on things that don't have a profitable ROI. Misinformation sources would prefer things not be cataloged accurately and sensibly. The actual portion of the web dedicated to information based on fact and useful content is dwarfed by everything controlled by actors who are only there to make money or control people.

xnx over 4 years ago

This is the classic struggle between structured and unstructured data. As difficult as the task of creating AI that can make sense of unstructured data and pages is, that task is still more tractable than getting millions of people around the world to sufficiently coordinate on a common standard.

rediguanayum over 4 years ago

Aren't the schema.org annotations the semantic web? My understanding is that adoption is small, i.e. Wikipedia says 17% (https://en.wikipedia.org/wiki/Schema.org). Forrester suggests a lack of awareness of the Semantic Web among marketers and content creators (https://advertiseonbing.blob.core.windows.net/blob/bingads/media/library/insight/prioritize-search-to%20boost-roi/forrester-prioritize-search-whitepaper.pdf?ext=.pdf)

bawolff over 4 years ago

I feel like the semantic web has recently been seeing a bit of a resurgence with Wikidata.

But I think the general vision of mutually untrusted individuals participating in an interconnected knowledge graph is pretty dead.

* It's hard to do quality/consistency control on that, which makes the results less useful.
* RDF is ridiculously complicated. Microformats/RDFa are a bit better, but doing this properly still essentially requires specialist knowledge. This strongly discourages the average Joe from just adding annotations.
* Unclear value proposition at the small scale.
* It is very difficult to scale complex queries on large semantic data sets. It can often be hard to predict how performance will change over time. Compare to traditional relational DBs, which have very predictable scalability and lots of DBAs who know how to optimize. I think this is probably the biggest hurdle to large-scale adoption.
  * Key example: the example queries at https://query.wikidata.org are pretty magical, but when you try your own, you can quickly run into timeouts, especially when nested deeply in a graph (e.g. find all species of plants matching some property)

somewhereoutth over 4 years ago

The abstraction was at the wrong level - text/images/executables/etc is the right level, so essentially simple file formats upon the underlying wire/binary.

Ontologies and so forth are an example of classic modernism (a clockwork universe that we can understand completely). Turns out real life is much more interesting!

donatj over 4 years ago

Google for many years did not execute JavaScript as part of its indexing. This had the very positive effect of making the contents of the web far more accessible, as otherwise you simply did not show up in search results.

They're kind of bringing this back around with AMP, but in a shitty walled-garden way.

baron_harkonnen over 4 years ago

I was a big believer in the semantic web for years, but there is a load of things wrong with it, from conceptual problems to practical ones.

For starters, the Semantic Web requires an enormous amount of labor to make things work at all. You need humans marking up stuff, often with no advantage other than the "greater good". In fact you do see semantic content where it makes sense today. Look at any successful website's header and you'll see a pretty large variety of semantic content, things that Google and social media platforms use to make the page more discoverable.

This problem is compounded by the fact that ML and NLP solved many of the practical problems that the semantic web was supposed to. Google basically works like a vast question answering system. If you want to find pictures of "frogs with hats on" you don't need semantic metadata.

A much larger problem is that the real vision of the semantic web reeked of the classic "solution in search of a problem". The magic of the semantic web wasn't the metadata; RDF was just the beginning.

RDF is literally a more verbose implementation of Prolog's predicates. The real goal was to build reasoning engines on top of RDF, essentially a Prolog-like reasoner that could answer queries. A big warning sign for me was that the majority of people doing "Semantic Web" work at the time didn't even know the basics of how existing knowledge-based representation and reasoning systems, like Prolog, worked. They were inventing a Semantic future without any sense that this problem had been worked on in another form for decades.

OWL, which was the standard to be used for the reasoning part of the semantic web, was computationally intractable in its highest-level description of the reasoning process. If you start with a computationally intractable abstraction as your formal specification, then you are starting very far from praxis.

For this reason it was hard to really do anything with the semantic web. Virtually nobody built weekend "semantic web demos" because there wasn't really anything you could do with it that you couldn't do more easily with a simple database and some basic business logic... or just write in Prolog.

A few companies did use semantic RDF databases, but you quickly realize these offered no value over just building a traditional relational database, and today we have real graph databases in abundance, so any advantage you would get from processing boatloads of XML as a graph can be replicated without the markup overhead. And that's not even considering the work in graph representation coming out of deep learning.

The semantic web didn't work because it was half pipe dream, and not even a very interesting one at that.
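
For readers who haven't seen the RDF/Prolog parallel drawn above, here is a minimal Python sketch of a Prolog-flavoured conjunctive query over triples. It is a toy under stated assumptions: the `?name` variable convention is borrowed loosely from SPARQL, the facts are made up, and the helper names (`match`, `query`) are hypothetical rather than taken from any RDF or Prolog library.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def is_var(term: str) -> bool:
    """Terms starting with '?' are variables; everything else is a constant."""
    return term.startswith("?")


def match(pattern: Triple, fact: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Try to unify one pattern with one fact, extending the given bindings."""
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or fail if it is already bound to something else.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result


def query(patterns: List[Triple], facts: List[Triple],
          bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    """Yield every set of bindings that satisfies all patterns (a conjunction)."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from query(rest, facts, extended)


# Hypothetical facts; in Prolog these would read teacherOf(socrates, plato). etc.
facts = [
    ("socrates", "type", "human"),
    ("plato", "type", "human"),
    ("socrates", "teacherOf", "plato"),
    ("plato", "teacherOf", "aristotle"),
]
results = list(query([("?x", "teacherOf", "?y"), ("?y", "teacherOf", "?z")], facts))
print(results)
```

This is the "simple database and some basic business logic" end of the spectrum. It does plain backtracking over ground facts and has none of the rule handling or ontology semantics that OWL reasoners aim for.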

iso8859-1 over 4 years ago

People can't be bothered to run their own websites, that is what happened.

Wikidata was launched, so now you can just host your data there.

Wikidata has better searching than the real semantic web could ever get, since it has a team of devs and sysadmins with a view of the whole dataset.

Also, the Semantic Web had no story for how to contact authors and suggest changes to their schema. DNS does not provide sufficient identity or messaging. But MediaWiki does.

tomc1985 over 4 years ago

I think that it is telling that the only times I seem to hear about "The Semantic Web," either Tim Berners-Lee or the ACM are attached to it.

zxteloiv over 4 years ago

The semantic web would directly benefit automated agents/bots rather than human users.

However, many use cases have found it useful to adopt semantic web technologies (RDF, ontologies), including web services such as search and recommendation, technologies like AI and NLP, and domain-specific requirements such as drugs and lawsuits.

The semantic web may not seem useful to us, but it definitely affects us.

gklitt over 4 years ago

The Semantic Web postmortem seems like one of those things where everyone is touching their own part of the elephant.

This is the best overview I've found of the basic history:

https://twobithistory.org/2018/05/27/semantic-web.html

dsimms over 4 years ago

The struggle between semantic markup and page description is old. In some early arguments about HTML ("what's this markup thing, why not PostScript with hyperlinks?"), the most succinct counterargument was "yes, but what does it _mean_". But that was a minority view for sure.

Someone called this "the revenge of NeWS" (the Sun system), but I can't find a reference for that.

tejtm over 4 years ago

Instead of being widely distributed with agents, it is stuck in a data store, reasoned over offline and served as a Knowledge Graph (materialized in a Solr index) for a limited domain.

In my world anyway.

ilamont over 4 years ago

I took TBL's Semantic Web/Linked Data class 10 years ago (https://www.ilamont.com/2010/09/encounter-with-tim-berners-lee-and.html). Here's the description from the syllabus:

"The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation." Linked Data is the use of all that information in a manner that essentially treats the entire web as one virtual database, allowing data to be pulled and manipulated from multiple sites as a single query and, beyond that, allows the sites queried to point to other locations that might have useful information.

Linked Data Ventures is a graduate-level class that combines traditional academics with practitioner perspectives and practical hands-on experience, to empower students to launch new ventures using this technology. From the instructor team, students will learn the technical components that are needed to produce Linked Data offerings, will learn business skills fundamental to growing and sustaining a venture. Weekly guest lecturers will provide insights into how they use the technology in their business and their experience launching a venture. This is a unique course in which teaching is shared between EECS and the MIT Entrepreneurship Center and learning is a cooperative effort between EECS and Sloan students. A team-based project will culminate in sustainable prototypes that could be freeware or have the potential to be commercialized in the future, and will serve as a good launching point for entering into the MIT 100k competition. At the end of the semester, each project team will do a business presentation in front of the whole class and a panel of outside experts and judges, as well as a technical presentation as the conclusion of the lab.

The faculty were clearly interested in encouraging adoption/development of commercial applications. To that end we had several speakers from industry showing off some early-stage applications, and the project teams had to build simple apps based on the technology.

One of the teams actually did achieve success, building a prototype restaurant database using Semantic Web technologies we learned in class. It eventually grew into a commercial service and was acquired by GoDaddy about five years ago. However, to scale it beyond the prototype they had to move beyond Semantic Web concepts and use a different technology approach; IIRC the founder said it just wasn't ready for prime time. He mentioned latency was a huge issue.

coderForFreedom over 4 years ago

SPAs, mobile apps and Zelenials killed it.

physicsgraph over 4 years ago

Many of the commenters here have pointed out that the semantic web did not become popular because the incentives do not support the investment. I agree, and this leads me to wonder if there are domains in which the cost of investing in semantic markup is justified.

I see scientific publishing as a venue that could clearly benefit from a semantic-web-like, domain-specific approach. There are a variety of possibilities [0], [1], but again many are implemented behind proprietary paywalls. There's value in search and knowledge representation, but the prohibitive cost of implementation is challenging.

[0] https://en.wikipedia.org/wiki/Semantic_Scholar
[1] https://en.wikipedia.org/wiki/Web_of_Science

rektide over 4 years ago

There are two big things missing in this discussion of the Semantic Web to me:

1. Developers. Historically the Semantic Web was a lot of RDF & SPARQL, which are both imo fairly hostile to developers. There were some decent libraries, but often written in a very oldschool style that made it difficult to even load or use, & with frankly pitiful documentation/tests. A lot of the databases/tooling was paid/proprietary.

The development story is looking much better. Oddball RDF & SPARQL are joined by much more mainstream-dev-friendly tools: Microdata, which is pretty simple marked-up HTML, & JSON-LD, which looks & works like JSON, with a little extra "context" sprinkled in at the top. Libraries are much improved & modernized & mainstream-dev compliant. Datastores like Apache Jena are far more used & there's a lot of ActivityPub & related JSON-LD-centric data-stores & systems being created & experimented with.

2. Users. The article talks about primary use cases for the semantic web, and they are all huge massive industries, not people. We needed the semantic web because it would help search. We needed the semantic web because it would help social. We needed the semantic web because it would help e-commerce (& look, an article from yesterday about just that![1]).

What's missing is end users. I don't mind that super-large data systems can do interesting things with the semantic web. But to me, the purpose was always to enrich the information we users see online with our eyes with powerful & consistent data that our own machines can help use. Our navigator should be helping us, showing us what digital matter we are seeing on the page, rather than letting the page exist as one enormous standalone artifact implicitly composed of arbitrary text & images. There's meaning there, there are things that we are working with, & the semantic web gives us a common operating system for talking about things, & managing them.

Users are still somewhat missing from the semantic web. Folks like ActivityPub are doing a wonderful & interesting job using the Semantic Web to build common distributed platforms for social, where we can talk about digital matter like Shares and Photos and Favorites in a common way. For now, the semantic web tech remains under the hood, something abstract powering a client that abstracts over the semantic meaning to generate just another anonymous web page, filled with articles and photos and listens and viewings & other social entities, but presented through the veneer of the application, not as discrete social objects unto themselves. I think we're only just starting to explore how to open the Semantic Web up, how to represent semantic data entities & data stores, in a way that will let users interact directly with digital objects, rather than needing the artifice & instrumentation of the application. But this is pretty deep conjecture.

What I think is clearer to say is that the end user has, until very recently, not seen or understood how semantic web technology might be helping them; it's been a tool for businesses & big data. I look forward to the interesting era of the Semantic Web, the era now breaking upon us, when we get to explore how having structured meaningful data can be good for individuals, persons, for personal computing, for small & medium data, & especially, for us to begin to communicate with each other over better structured data. And I think JSON-LD, ActivityPub, & the semantic web are, by far, the most promising & straightforward way to explore these virtues of structured communication.

By contrast, the article's talk about "what's next" is yet more academic projects, machine learning, & trying to represent more things (like actions, which is something absolutely core to what ActivityPub does: represent activities[2]!).

[1] https://news.ycombinator.com/item?id=24557027
[2] https://www.w3.org/TR/activitystreams-vocabulary/
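
To show what the "little extra context sprinkled in at the top" buys you, here is a minimal Python sketch of a JSON-LD-shaped document and a hand-rolled key expansion. The document, the example.org identifiers and the `expand_keys` helper are assumptions for illustration. The helper only renames keys through the context and is nowhere near the full JSON-LD expansion algorithm, which a real library should handle.

```python
import json
from typing import Any, Dict

# An ordinary JSON object plus an "@context" that maps short keys to IRIs.
# The people and URLs under example.org are made up.
doc: Dict[str, Any] = {
    "@context": {
        "name": "http://schema.org/name",
        "knows": "http://schema.org/knows",
    },
    "@id": "https://example.org/people/alice",
    "name": "Alice",
    "knows": {"@id": "https://example.org/people/bob"},
}


def expand_keys(node: Dict[str, Any], context: Dict[str, str]) -> Dict[str, Any]:
    """Replace short keys with the IRIs the context assigns to them.

    Simplified on purpose: no nested contexts, no value typing, no lists.
    Keys without a mapping (such as "@id") are passed through unchanged.
    """
    expanded: Dict[str, Any] = {}
    for key, value in node.items():
        if key == "@context":
            continue
        if isinstance(value, dict):
            value = expand_keys(value, context)
        expanded[context.get(key, key)] = value
    return expanded


expanded = expand_keys(doc, doc["@context"])
print(json.dumps(expanded, indent=2))
```

A consumer that ignores `@context` still sees plain JSON with `name` and `knows`, which is a large part of why this format feels friendlier to mainstream developers than RDF/XML or Turtle.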

anotheryou over 4 years ago

[pdf] - oh the irony

The_rationalist over 4 years ago

It's not that dead; Google announced extended support for it today: https://news.ycombinator.com/item?id=24557027
