The History of the URL

276 points by migueldemoura about 5 years ago

13 comments

That is an excellent article and I learned a tremendous amount.

I do have one minor technical criticism though. It is so common for people to conflate parameters with the components of a query string that we don't give it a second thought. The specification, though, does distinguish these terms. See: <a href="https://tools.ietf.org/html/rfc3986#section-3.4" rel="nofollow">https://tools.ietf.org/html/rfc3986#section-3.4</a> and the preceding paragraph.

Specifically, parameters are trailing data components of the path section of the URI (URL). The query string is separated from the path section by the question mark. URI parameters are rarely used, though, so this is a common mistake.

Encoding ampersands into a URI (URL) using HTML encoding schemes is also common, but it is incorrect. URI encoding uses percent-encoding as its only encoding scheme, such as %20 for a space. Using something like &amp; will literally put 5 unencoded characters in the address, or may result in something like %26amp; in software that auto-converts characters into the presumed encoding.

* <a href="https://tools.ietf.org/html/rfc3986#section-2.1" rel="nofollow">https://tools.ietf.org/html/rfc3986#section-2.1</a>
* <a href="https://stackoverflow.com/questions/16622504/escaping-ampersand-in-url" rel="nofollow">https://stackoverflow.com/questions/16622504/escaping-ampers...</a>
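To make the parameter/query distinction and the percent-encoding point concrete, here is a minimal sketch using Python's standard `urllib.parse`. The URL and its `type=a` parameter are made up for illustration, and the exact output of other tools may differ from what `quote` produces here.

```python
from urllib.parse import quote, urlparse

# Hypothetical URL: ";type=a" is a path parameter, "?x=1&y=2" is the query.
parts = urlparse("http://example.com/docs/report;type=a?x=1&y=2")

print(parts.path)    # /docs/report
print(parts.params)  # type=a
print(parts.query)   # x=1&y=2

# Percent-encoding is the only escaping a URI understands.
space = quote("a b")       # a%20b
ampersand = quote("&")     # %26
# An HTML entity is not decoded; it is just five more characters to encode.
entity = quote("&amp;")    # %26amp%3B
print(space, ampersand, entity)
```

Note that `urlparse` only splits parameters off the last path segment, which is one reason they are so easy to overlook.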

russellbeattie about 5 years ago

I love articles like this, in part because they remind me of the early 90s when I first started reading about this stuff and it all seemed arcane and magical. [1]

I remember dismissing the "World Wide Web" because the 80286-based IBM PC I used at college in 1993 couldn't run a graphical web browser (that I knew of), so I compared the terminal versions of a web browser to Gopher and determined the latter was far superior: it had more content and was much cleaner to use in a terminal.

The history of the Internet and Web definitely would have been soooo different had URLs been formatted like "http:com/example/foo/bar/baz". It's so much cleaner and sensical. Part of the mystique of "foo.com" is that it somehow seems completely different from "bar.org". Not sure why, but it just is.

Just a side note: DOS and Windows using \ instead of / is annoying, has been annoying for nearly 40 years, and I don't think I'll ever find it not annoying. You'd think 4 decades would be enough time, but it still bugs me.

1. <a href="https://en.m.wikipedia.org/wiki/Whole_Internet_User's_Guide_and_Catalog" rel="nofollow">https://en.m.wikipedia.org/wiki/Whole_Internet_User's_Guide_...</a>

shellac about 5 years ago

> In 1992 Tim Berners-Lee created three things, giving birth to what we consider the Internet. The HTTP protocol, HTML, and the URL.

What we call the web, surely? I appreciate we conflate the two, but in this context I think that's what was meant.

And it's really the URL (URI, IRI, URN...) that makes the web. An amazing thing.

chrisweekly about 5 years ago

Here's the original, classic "Cool URIs Don't Change" post by TBL himself: <a href="https://www.w3.org/Provider/Style/URI.html" rel="nofollow">https://www.w3.org/Provider/Style/URI.html</a>

theclaw about 5 years ago

Anyone interested in the development of the ARPANET and its transformation into the Internet we know today owes it to themselves to read Where Wizards Stay Up Late by Hafner and Lyon - it’s a great read and the audiobook isn’t bad either.

thedance about 5 years ago

If you're going to write a mile-long article about the URL, at least get the details correct. The leftmost part of a URL is not the "protocol". It is called the "scheme". The scheme doesn't tell you "the protocol which should be used to access it", it tells you how to interpret the remainder of the URL.
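A small sketch of that point, using Python's standard `urllib.parse.urlsplit`: the scheme is simply whatever precedes the first colon, and it determines how the rest is read. The `https` and `mailto` addresses are made-up examples; only some schemes name a network protocol at all.

```python
from urllib.parse import urlsplit

# Illustrative URIs; the example.com addresses are hypothetical.
uris = [
    "https://example.com/index.html",
    "mailto:someone@example.com",
    "urn:isbn:0451450523",
]

schemes = []
for uri in uris:
    parts = urlsplit(uri)
    schemes.append(parts.scheme)
    # Only the first URI has an authority (netloc); for the other two,
    # everything after the scheme is an opaque path that the scheme defines.
    print(f"scheme={parts.scheme!r} netloc={parts.netloc!r} path={parts.path!r}")
```

Neither `mailto` nor `urn` tells a client which wire protocol to speak; they only say how to interpret what follows.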

ck2 about 5 years ago

You suddenly realize you are very old when an article tries to impress with ancient photos of a PDP-11 and cradle-type dial-up modems, and it's startling that you've used all of that in your lifetime, extensively.

neillyons about 5 years ago

> Root DNS servers operate in safes, inside locked cages. A clock sits on the safe to ensure the camera feed hasn’t been looped.

That is cool. Does anyone have any more info? Perhaps a picture?

btrettel about 5 years ago

What are the available solutions to the problem of locating other copies of webpages or documents online? (Let's assume that the page of interest is not on the Internet Archive.) This article mentions a few:

> I was able to find these pages through Google, which has functionally made page titles the URN of today.

> Given the power of search engines, it’s possible the best URN format today would be a simple way for files to point to their former URLs.

Daniel Bernstein proposes a document ID that can be found in search engines: <a href="https://cr.yp.to/bib/documentid.html" rel="nofollow">https://cr.yp.to/bib/documentid.html</a>

I actually started using this before, but found it to be clumsy and stopped using it.

Someone else has suggested a UUID instead: <a href="https://lobste.rs/s/xltmol/this_page_is_designed_last#c_nis6no" rel="nofollow">https://lobste.rs/s/xltmol/this_page_is_designed_last#c_nis6...</a>

But that's still clumsy. I'd prefer something shorter.

Perhaps the title of the page is the best option, as people are more likely to have that saved than the UUID: <a href="https://lobste.rs/s/xltmol/this_page_is_designed_last#c_0snrhr" rel="nofollow">https://lobste.rs/s/xltmol/this_page_is_designed_last#c_0snr...</a>

> I imagine it’d be uncommon for someone has the UUID but not the website saved.
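On the "something shorter" point, one possible sketch (not anything proposed in the linked threads) is to keep the UUID's 128 bits but write them in URL-safe base64, which takes 22 characters instead of 36. The helper names `short_id` and `from_short_id` are hypothetical, and whether search engines index such tokens well is a separate question this does not answer.

```python
import base64
import uuid


def short_id(u: uuid.UUID) -> str:
    """Encode the 16 raw UUID bytes as unpadded URL-safe base64."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def from_short_id(s: str) -> uuid.UUID:
    """Restore the two stripped padding characters and decode back to a UUID."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(s + "=="))


doc_id = uuid.uuid4()
compact = short_id(doc_id)
print(str(doc_id), len(str(doc_id)))  # canonical form, 36 characters
print(compact, len(compact))          # compact form, 22 characters
```

The round trip is lossless, so the compact form could be embedded in a page and still be mapped back to the canonical UUID.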

AKluge about 5 years ago

I remember when the hosts.txt file crossed a half page. I thought this network is taking off. :)

divbzero about 5 years ago

This is a masterful history and a reminder that what we have now is just a quasi-frozen snapshot of evolving solutions. Is there any chance that an alternative (URN? IPFS? Dat?) will gain traction and resolve some of the shortcomings of the URL?

ChrisArchitect about 5 years ago

bah, believe this was adapted from an article from 2016, commentary here: <a href="https://news.ycombinator.com/item?id=12117540" rel="nofollow">https://news.ycombinator.com/item?id=12117540</a>

CaptArmchair about 5 years ago

I started reading the article with much interest... up until the bit about the Semantic Web. Then I felt things went downhill.

> One such effort was the Semantic Web. The dream was to create a Resource Description Framework (editorial note: run away from any team which seeks to create a framework), which would allow metadata about content to be universally expressed. For example, rather than creating a nice web page about my Corvette Stingray, I could make an RDF document describing its size, color, and the number of speeding tickets I had gotten while driving it.

> This is, of course, in no way a bad idea. But the format was XML based, and there was a big chicken-and-egg problem between having the entire world documented, and having the browsers do anything useful with that documentation.

The author completely fails to describe the evolution of the SemWeb over the past 10 years. Tons of specs and several declarative languages and technologies have grown up, not just to get beyond the verbosity of a serialization format such as XML, but also to move away from the classic relational data model.

Turtle, JSON-LD, SPARQL, Neo4j, Linked Data Fragments,... come to mind. And then there are the emerging applications of linked data. If anything, the Federated Web is exactly about URLs and semantic web technologies based on linking and contextualizing data.

The entire premise of Tim Berners-Lee's Solid/Inrupt is based on these standards, including URIs.

Linked data and federation aren't just about challenging social media; they're also about creating knowledge graphs, such as wikidata.org, and creating opportunities for things such as open access and open science.

Then there's this:

> httpRange-14 sought to answer the fundamental question of what a URL is. Does a URL always refer to a document, or can it refer to anything? Can I have a URL which points to my car?

> They didn’t attempt to answer that question in any satisfying manner. Instead they focused on how and when we can use 303 redirects to point users from links which aren’t documents to ones which are, and when we can use URL fragments (the bit after the ‘#’) to point users to linked data.

Err. They did.

That's what the Resource Description Framework is all about. It gives you a few foundational building blocks for describing the world. Even more so, URIs have absolutely NOTHING to do with HTTP status codes. It just so happens that HTTP leverages URIs and creates a subset called HTTP URLs that allows the identification and dereferencing of web-based resources.

You can use URIs as globally unique identifiers in a database. You could use URNs to identify books. For instance, urn:isbn:0451450523 is an identifier for the 1968 novel The Last Unicorn.

So, this is a false claim. I could forgive them for inadvertently not looking beyond URLs as a mechanism used within the context of HTTP communication.

> In the world of web applications, it can be a little odd to think of the basis for the web being the hyperlink. It is a method of linking one document to another, which was gradually augmented with styling, code execution, sessions, authentication, and ultimately became the social shared computing experience so many 70s researchers were trying (and failing) to create. Ultimately, the conclusion is just as true for any project or startup today as it was then: all that matters is adoption. If you can get people to use it, however slipshod it might be, they will help you craft it into what they need. The corollary is, of course, no one is using it, it doesn’t matter how technically sound it might be. There are countless tools which millions of hours of work went into which precisely no one uses today.

I'm not even sure what the conclusion is here. Did the 'hyperlink' fail? Did the concept of a 'URI' fail? (They are different things!) Because neither failed, on the contrary!

Then there's this wonky comparison of the origin of the Web with a single project or a startup. The author did the entire research on the history of the URI but they still failed to see that the Internet and the Web were invented by committee and by coincidence. Pioneers all over the place had good ideas; some coalesced and succeeded, others didn't. Some were adapted to work together in a piecemeal fashion, such as Basic Auth.

And that's totally normal. Organic growth and distributed development are the baseline. Yes, the Web as we know it today is the result of many competing voices, but at the same time it could only work if everyone ended up agreeing on the basics.

The fact of the matter is that some companies - looking at you, FAANG - would rather have us all locked in closed, black-box ecosystems than have open standards around that allow for interoperability, and thus create opportunities for new threats to challenge their business interests.

I understand that the article is written by CloudFlare, a CDN company with its own interests. But I'm trying to wrap my head around how the author failed to address exactly these future opportunities and threats after this entire exposé.
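A rough sketch of the two ideas in the httpRange-14 part of this comment, using Python's standard `urllib.parse.urldefrag`: a fragment lets one URI name a thing while the part before the '#' names the document that gets fetched, and a URN can serve purely as a key with nothing fetched at all. The `example.org` URI and the `catalog` dictionary are hypothetical.

```python
from urllib.parse import urldefrag

# Hypothetical identifier for a thing (a car), not for a document.
thing = "http://example.org/garage#corvette"

# A client strips the fragment before making the HTTP request, so the
# server only ever sees the document URI.
document, fragment = urldefrag(thing)
print(document)  # http://example.org/garage
print(fragment)  # corvette

# A URN used purely as a globally unique key; no dereferencing involved.
catalog = {"urn:isbn:0451450523": "The Last Unicorn"}
print(catalog["urn:isbn:0451450523"])
```

Both identifiers are valid URIs; only the first one happens to be dereferenceable over HTTP.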
