> This is the same IP address: 3232271615. You get that by interpreting the 4 bytes of the IP address as a big-endian unsigned 32-bit integer, and print that. This leads to a classic parlor trick: if you try to visit <a href="http://3232271615" rel="nofollow">http://3232271615</a> , Chrome will load <a href="http://192.168.140.255" rel="nofollow">http://192.168.140.255</a>.<p>This was the source of one of my favorite “bugs” ever. I was working on multiple mobile apps for a company, and they had a deep link setup that was incredibly basic: <scheme>://<integer>, which would take you to an article with a simple incrementing ID. This deep link system “just worked” on iOS and Android; take the URL, grab the host, parse it as an int, grab that story ID. Windows Phone, however… the integers we were parsing out were totally wrong, returning incredibly old stories!<p>Turned out that the host we were given by the frameworks from the URL was auto-converted to an IP in dotted-quad format, and then the int parser was just grabbing the last segment… which meant that we were always getting stories <256, instead of the ~40000 range we were expecting.
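For reference, here is a minimal Python sketch of the integer/dotted-quad conversion described above, plus a guess at how the deep-link bug could have played out. The function names are hypothetical, and `buggy_story_id` is only an assumed reconstruction of the "grab the last segment" behavior, not the actual app code.

```python
import ipaddress


def int_to_dotted_quad(n: int) -> str:
    """Render a 32-bit unsigned integer as a dotted-quad IPv4 string."""
    if not 0 <= n <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit unsigned integer")
    # Big-endian: the most significant byte becomes the first octet.
    return ".".join(str(b) for b in n.to_bytes(4, "big"))


def dotted_quad_to_int(s: str) -> int:
    """Inverse direction, using the standard library's strict parser."""
    return int(ipaddress.IPv4Address(s))


def buggy_story_id(host: str) -> int:
    """Assumed shape of the bug: only the last dotted segment is parsed."""
    return int(host.split(".")[-1])


print(int_to_dotted_quad(3232271615))             # 192.168.140.255
print(buggy_story_id(int_to_dotted_quad(40000)))  # 64, since 40000 became 0.0.156.64
```

With a story ID of 40000 the host turns into 0.0.156.64, so a parser that keeps only the last segment can never return more than 255.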
These different representations also lead to frequent server side request forgery (SSRF) bypasses - someone might be blocking local IPv4 but you can still access their AWS metadata endpoint at ::ffff:169.254.169.254, etc.<p>For anyone using Ruby, I'm the author of a gem [1] that comprehensively protects against SSRF bugs. For anyone using Golang I recommend this [2] blog post.<p>[1]: <a href="https://github.com/arkadiyt/ssrf_filter" rel="nofollow">https://github.com/arkadiyt/ssrf_filter</a><p>[2]: <a href="https://www.agwa.name/blog/post/preventing_server_side_request_forgery_in_golang" rel="nofollow">https://www.agwa.name/blog/post/preventing_server_side_reque...</a>
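As a rough illustration of the IPv4-mapped bypass, here is a Python sketch that unwraps `::ffff:a.b.c.d` before applying the usual checks. `is_internal` is a hypothetical helper, not the API of the gem or the blog post above, and it assumes the input is already an IP literal; a real filter would also have to check whatever address the hostname finally resolves to.

```python
import ipaddress


def is_internal(host: str) -> bool:
    """Return True if an IP literal points at a non-public address.

    Hypothetical helper: `host` must be an IP literal, not a hostname.
    """
    addr = ipaddress.ip_address(host)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_unspecified
        or addr.is_reserved
        or addr.is_multicast
    )


print(is_internal("::ffff:169.254.169.254"))  # True
print(is_internal("8.8.8.8"))                 # False
```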
Can confirm that visiting <a href="http://127.1" rel="nofollow">http://127.1</a> on an iPad indeed works and redirects to <a href="http://127.0.0.1" rel="nofollow">http://127.0.0.1</a>. This is very surprising and, at least for me, humbling.<p>I think I will quote this article any time I see someone using a regex to validate or parse IPs.
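To see why 127.1 ends up as 127.0.0.1, here is a small Python sketch of the inet_aton-style rule: every part except the last is one byte, and the last part fills all the remaining bytes. `parse_ipv4_bsd` is a hypothetical name, and this version handles decimal parts only (no hex or octal), which is a deliberate simplification.

```python
def parse_ipv4_bsd(text: str) -> str:
    """Parse 1 to 4 dot-separated decimal parts, inet_aton style.

    Simplified sketch: decimal only, returns the canonical dotted quad.
    """
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected 1 to 4 parts")
    nums = []
    for p in parts:
        if not (p.isascii() and p.isdigit()):
            raise ValueError(f"bad part: {p!r}")
        nums.append(int(p))

    *head, last = nums
    tail_bytes = 4 - len(head)  # bytes covered by the final part
    if any(n > 255 for n in head) or last >= 256 ** tail_bytes:
        raise ValueError("part out of range")

    value = 0
    for n in head:
        value = (value << 8) | n
    value = (value << (8 * tail_bytes)) | last
    return ".".join(str(b) for b in value.to_bytes(4, "big"))


print(parse_ipv4_bsd("127.1"))       # 127.0.0.1
print(parse_ipv4_bsd("3232271615"))  # 192.168.140.255
```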
I'm now going to change my LAN to use 10.0.0.1 instead of 192.168.0.1 so that I can just type 10.1. This will help not only when testing stuff on mobiles, where you have to retype the whole address because you forgot <a href="http://" rel="nofollow">http://</a>, but also when telling the kids what IP to connect to when setting up LAN games, or when telling coworkers some LAN/router IP. The time server is on 10.36.
> I’m on the fence about that last one, the “IPv6 with an embedded dotted decimal” form. My reference parser (Go’s net.ParseIP) understands it, but it’s not really that useful any more in the real world. At the dawn of IPv6, the idea was that you could upgrade an address to IPv6 by prepending a pair of colons, as in ::1.2.3.4, but modern transition mechanisms no longer offer anything as clear-cut as this, so the notation doesn’t really show up in the wild.<p>I have to disagree with this conclusion. I see it very frequently on Linux. It turns out that programs can bind their listen address to just ::, and the kernel will still allow connections from IPv4, with the peer address mapped into ::ffff:0.0.0.0/96. Outbound connections use the same notation.
> 1:2:3:4:5:6:77.77.88.88 means 1:2:3:4:5:6:7777:8888<p>Wait, what? 77.77.88.88 is in dotted decimal. It doesn't correspond to 7777:8888 in hex.<p>edit: Somebody else already noticed on Twitter:<p>> And as @alanjmcf noticed, I messed up one of the representations above.<p>> 1:2:3:4:5:6:77.77.88.88 means 1:2:3:4:5:6:4d4d:5858, not 1:2:3:4:5:6:7777:8888. I missed out a decimal-to-hex conversion in there.
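The missing decimal-to-hex step is easy to check. Below is a short Python sketch: the helper name `embedded_ipv4_to_groups` is made up for illustration, while `ipaddress.IPv6Address` is the standard library parser, which accepts the embedded dotted-decimal suffix.

```python
import ipaddress


def embedded_ipv4_to_groups(dotted: str) -> str:
    """Convert a.b.c.d into the two 16-bit hex groups it occupies in IPv6."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    # Each pair of decimal bytes is packed into one 16-bit group.
    return f"{(a << 8) | b:x}:{(c << 8) | d:x}"


print(embedded_ipv4_to_groups("77.77.88.88"))            # 4d4d:5858
print(ipaddress.IPv6Address("1:2:3:4:5:6:77.77.88.88"))  # 1:2:3:4:5:6:4d4d:5858
```

77 is 0x4d and 88 is 0x58, so the pairs become 4d4d and 5858 rather than 7777 and 8888.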
> <i>It does not process Class A/B notation, or hex or octal notation.</i><p>I found that notation useful <i>once</i>, to make a shorter one-liner... without even knowing that there were different classes of IPv4 address, and that I was looking at one of them.<p>It's a tiny function that gives me the IP address of my machine on the LAN, for either Linux or Mac:<p><pre><code> # Get main local IP address from the default external route (Internet gateway)
iplan() {
    # Note: "1" is shorthand for "1.0.0.0"
    case "$OSTYPE" in
        linux*) ip -4 -oneline route get 1 | grep -Po 'src \K([\d.]+)' ;;
        darwin*) ipconfig getifaddr "$(route -n get 1 | sed -n 's/.*interface: //p')" ;;
    esac
}
</code></pre>
(sorry to people reading on small screens)<p>Full disclosure, I got the "1 is shorthand for 1.0.0.0" from here (which didn't get into explaining <i>why</i> it is a shorthand): <a href="https://stackoverflow.com/a/25851186" rel="nofollow">https://stackoverflow.com/a/25851186</a>
> So, it’s a de-facto standard that boils down to mostly “what did 4.2BSD understand?“<p>By the way 4.2BSD was being compatible with older or contemporary implementations, like ITS which was running TCP before any Unix was.<p>For example plenty of machines back then used octal as a preferred human representation. In fact that’s why octal is the default format of numeric constants in C: C, like Unix, was initially developed for an 18-bit (six octal digits) PDP-7. The smaller 16-bit PDP-11 version came later.
"All possible notations of this IPv4 address" <a href="https://lucb1e.com/rp/php/funnip.php?link&ip=80.100.131.150" rel="nofollow">https://lucb1e.com/rp/php/funnip.php?link&ip=80.100.131.150</a><p>It was a surprising amount of work to figure out all the different formats an IP address can be shown in and convert a given IP into all those formats.
How about the PGP word list? <a href="https://en.wikipedia.org/wiki/PGP_word_list" rel="nofollow">https://en.wikipedia.org/wiki/PGP_word_list</a><p><pre><code> $ ping stairway scavenger tracker upcoming
PING 209.216.230.240 (209.216.230.240) 56(84) bytes of data.
64 bytes from 209.216.230.240: icmp_seq=1 ttl=50 time=68.2 ms
64 bytes from 209.216.230.240: icmp_seq=2 ttl=50 time=69.5 ms
64 bytes from 209.216.230.240: icmp_seq=3 ttl=50 time=67.2 ms</code></pre>
As Go’s net package IP parsing was mentioned, here’s a fun fact: under their API it is impossible to distinguish between an IPv4-mapped IPv6 address and the equivalent normal IPv4 address.
Since I write a Lua-parsed DNS server which works with IPv6, even when compiled for an ancient version of MINGW on Windows XP (which has IPv6 support but no built-in IPv6 parser), I had to write an IPv6 address parser (no inet_pton(), which is what most programs use for IPv6 parsing, on that system).<p>No, I did not add dotted quad notation to the parser. No, you can not have more than four hex digits in a single quad; 00000001:2::3 is a syntax error. It supports “normal” stuff like ::, ::1, 2001:db8::1, and even non-normal stuff like “2001-0db8-1234-5678 0000-0000-0000-0005” (to be compatible with the really basic IPv6 parser I put in MaraDNS’s recursive resolver nearly two years ago), but does not support any of the IPv6 corner cases in the linked article.<p>The IPv6 test cases in the automated test for the parser are at: <a href="https://github.com/samboy/MaraDNS/blob/master/deadwood-github/tools/coLunacyDNS/sqa/sqa_ip6Parse/Input" rel="nofollow">https://github.com/samboy/MaraDNS/blob/master/deadwood-githu...</a> (The final three lines are supposed to return errors)
I especially love it when address parsers on the same OS don't agree:<p><a href="http://openbsd-archive.7691.n7.nabble.com/inet-net-pton-seems-broken-when-used-with-octal-or-hex-td193971.html" rel="nofollow">http://openbsd-archive.7691.n7.nabble.com/inet-net-pton-seem...</a>
I'm not convinced these are "cursed". They may be the result of bygone networking conventions, implementation ideas that never came to mainstream fruition, flexibility for use-cases etc. Just because we don't understand something that looks strange, doesn't mean it's cursed, nor that one can simply turn one's nose up and say "I don't understand why these exist so I'll just ignore them when I implement x".
I think they've got Class A/B/C wrong? Or at least they're using it in a way that I never learnt<p>> The familiar 192.168.140.255 notation is technically the “Class C” notation. You can also write that address in “class B” notation as 192.168.36095, or in “Class A” notation as 192.11046143. What we’re doing is coalescing the final bytes of the address into either a 16-bit or a 24-bit integer field.<p>According to this:<p><a href="https://www.digitalocean.com/community/tutorials/understanding-ip-addresses-subnets-and-cidr-notation-for-networking" rel="nofollow">https://www.digitalocean.com/community/tutorials/understandi...</a><p>Which details my understanding, classes refer to the ranges, not so much grouping the latter part<p>Happy to be corrected!
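For what it's worth, the article's naming seems to match the old inet(3) man page wording, where the two-part form is described as convenient for Class A addresses (net.host) and the three-part form for Class B (net.net.host). The arithmetic itself is just packing the trailing bytes, as this Python sketch shows; `short_forms` and its dictionary keys are hypothetical names used only for illustration.

```python
def short_forms(dotted: str) -> dict:
    """Return the 4-, 3-, 2- and 1-part spellings of a dotted-quad address."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    return {
        "four parts": f"{a}.{b}.{c}.{d}",
        # Last two bytes coalesced into one 16-bit field.
        "three parts": f"{a}.{b}.{(c << 8) | d}",
        # Last three bytes coalesced into one 24-bit field.
        "two parts": f"{a}.{(b << 16) | (c << 8) | d}",
        # All four bytes as a single 32-bit integer.
        "one part": str((a << 24) | (b << 16) | (c << 8) | d),
    }


for name, text in short_forms("192.168.140.255").items():
    print(f"{name}: {text}")
```

For 192.168.140.255 this prints 192.168.36095, 192.11046143 and 3232271615, matching the values quoted from the article.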
A "fun" use of IP addresses is in NTP.<p>In the NTP config file, you will have stuff like this:<p><pre><code> server 127.127.1.0 # local clock
</code></pre>
or:<p><pre><code> server 127.127.20.0 minpoll 4 iburst prefer # gps clock
</code></pre>
where the "IP address" is of the form: 127.127.<clocktype>.<instance><p>Here's a page explaining the clock types:<p><a href="https://www.eecis.udel.edu/~mills/ntp/html/refclock.html" rel="nofollow">https://www.eecis.udel.edu/~mills/ntp/html/refclock.html</a><p>But basically it's a weird anachronism. I'm not sure if NTP will actually bind to those addresses using the TCP/IP stack, or if someone just got lazy and co-opted the IP address parser for off-label use.
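A tiny Python sketch of how such a pseudo-address splits into its two fields. `parse_refclock` is a hypothetical helper for illustration only; it says nothing about how ntpd itself handles these addresses internally.

```python
def parse_refclock(addr: str) -> tuple:
    """Split a 127.127.t.u pseudo-address into (clock type, unit number)."""
    parts = addr.split(".")
    if len(parts) != 4 or parts[:2] != ["127", "127"]:
        raise ValueError("not a 127.127.t.u reference clock address")
    clock_type, unit = int(parts[2]), int(parts[3])
    return clock_type, unit


print(parse_refclock("127.127.1.0"))   # (1, 0)   the local clock line above
print(parse_refclock("127.127.20.0"))  # (20, 0)  the gps clock line above
```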
What is the use-case of a decimal representation of a v6 address or a 32-bit int representation of an ipv4 address?<p>I’ve never had someone tell me, “see if you can ping 143267841”. I’ve worked in networking for coming up on 30 years now and just haven’t found the use.
Boomers like me know all of the IPv4 obfuscation techniques thanks to Fravia's Searchlores, may he forever rest in peace.<p><a href="https://www.theoryforce.com/fravia/searchlores/obscure" rel="nofollow">https://www.theoryforce.com/fravia/searchlores/obscure</a>
Wow, this.
One thing I didn’t see covered was “0”. You mentioned it, but not the behavior I know works in some implementations: “ping 0” behaves like “ping 127.0.0.1”.
That's why things like the textual representation of IP addresses need to be rigorously and formally specified using an unambiguous syntax notation. Implementations can then be formally verified to comply with this syntax spec. In the end, I would love to have a formally verified IP address parser library for each major mainstream programming language, which everybody could rely upon instead of trying to write their own parser. That's a dream.
I wrote a little applet where you can put in a class A decimal IP address, and it gives you the 3×4 representations mentioned in the article: <a href="https://jtvjan.nl/tools/cursed_ipv4.html" rel="nofollow">https://jtvjan.nl/tools/cursed_ipv4.html</a><p>If you count mixed representations, there would be 120 possibilities, but the tool doesn't generate those.
I maintain a JavaScript library that does exactly this (called ip-address). Unit tests are very important for handling the esoteric formats, though there are a couple that were new to me in David's post.<p>One of my motivations for writing the library was being able to grep for IPv6 addresses in text files; it's surprisingly difficult to match all valid representations of a simple IPv6 address as seen in the example here:<p><a href="https://twitter.com/beaugunderson/status/527393872909828096" rel="nofollow">https://twitter.com/beaugunderson/status/527393872909828096</a><p>I also maintain a site for examining IPv6 addresses that may be useful to people working with IPv6:<p><a href="http://v6decode.com/" rel="nofollow">http://v6decode.com/</a>
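To give a feel for why grepping is hard, here is a Python sketch (using the standard `ipaddress` module rather than the JavaScript library above) in which four different spellings parse to the same address. The `canonicalize` helper name and the sample spellings are made up for illustration.

```python
import ipaddress


def canonicalize(text: str) -> str:
    """Parse any accepted IPv6 spelling and return the compressed form."""
    return str(ipaddress.IPv6Address(text))


spellings = [
    "2001:db8::1",
    "2001:0db8:0000:0000:0000:0000:0000:0001",
    "2001:DB8:0:0:0:0:0:1",
    "2001:db8::0.0.0.1",
]

# All four spellings collapse to a single canonical string.
print({canonicalize(s) for s in spellings})  # {'2001:db8::1'}
```

Parsing first and comparing the parsed values avoids having to write one pattern that matches every spelling.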
I spent hours debugging an issue that boiled down to an IPv4 parser that treated leading zeroes as octal.
Connections to 192.168.123.100 worked as expected. Connections to 192.168.123.034 went to 192.168.123.28. I thought for sure it was an issue in my TCP client code, which was handling connections to hundreds of different devices.<p>The guilty party was the Poco::Net library, if I recall correctly. I can maybe see this making sense if you provide four octal digits (0377), but not three, and I have a hard time believing anybody has ever used this on purpose.
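The effect is easy to reproduce. This Python sketch mimics a parser that reads each octet with C-style base detection (leading 0 means octal, leading 0x means hex). The function names are hypothetical, and this is not the Poco::Net code, just an assumed equivalent of the behavior described.

```python
def parse_octet_c_style(text: str) -> int:
    """Parse one octet with C-style base detection (0x = hex, leading 0 = octal)."""
    if text.lower().startswith("0x"):
        value = int(text[2:], 16)
    elif len(text) > 1 and text.startswith("0"):
        value = int(text, 8)  # "034" is read as octal 34, i.e. decimal 28
    else:
        value = int(text, 10)
    if not 0 <= value <= 255:
        raise ValueError(f"octet out of range: {text!r}")
    return value


def parse_quad_c_style(addr: str) -> str:
    """Parse a four-part address and print it back in plain decimal."""
    parts = addr.split(".")
    if len(parts) != 4:
        raise ValueError("expected four parts")
    return ".".join(str(parse_octet_c_style(p)) for p in parts)


print(parse_quad_c_style("192.168.123.100"))  # 192.168.123.100
print(parse_quad_c_style("192.168.123.034"))  # 192.168.123.28
```

A stricter parser would reject the leading zero outright instead of guessing a base, which is what recent Python releases of `ipaddress` do, as far as I know.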
> a big-endian unsigned 32-bit integer<p>This is how embedded stacks (lwIP) store IPv4. Didn't know browsers could respond to it, though.<p>Mixing IPv4 and IPv6 is just evil.
Reminds me of email addresses; most sites get them wrong. *<p>There clearly should be a common library to take care of these things, which are way too complex for most developers.<p>* <a href="https://en.wikipedia.org/wiki/Email_address#Examples" rel="nofollow">https://en.wikipedia.org/wiki/Email_address#Examples</a>
This online decimal/hex/ASCII converter might be helpful:<p><a href="https://www.rapidtables.com/convert/number/ascii-hex-bin-dec-converter.html" rel="nofollow">https://www.rapidtables.com/convert/number/ascii-hex-bin-dec...</a>
This is great! If I'm honest with myself, one thing keeping me from configuring IPv6 as an option locally was the intimidating addresses. This is a great explainer, I finally feel like I "get it".
Somewhat related: A simple IPv6 subnet calculator written in Lua: <a href="https://github.com/intc/ip6snetc" rel="nofollow">https://github.com/intc/ip6snetc</a>
At what point is the format parsed? Is <a href="http://36475893" rel="nofollow">http://36475893</a> sent to the router or converted to 192.168.56.12 in the browser?
They wrote it into a blog that may be nicer to read <a href="https://blog.dave.tf/post/ip-addr-parsing/" rel="nofollow">https://blog.dave.tf/post/ip-addr-parsing/</a>
<a href="https://www.php.net/manual/en/function.ip2long" rel="nofollow">https://www.php.net/manual/en/function.ip2long</a> and <a href="https://www.php.net/manual/en/function.long2ip.php" rel="nofollow">https://www.php.net/manual/en/function.long2ip.php</a> in PHP
Writing a parser and saying "I'm dropping support for all these old ways of doing things" seems like poor form.<p>Unless there is a big reason, never drop backwards compatibility. In this case, supporting all those forms would be very do-able. The best way to support them would be to find some old BSD parsing code and port it, then you can be sure every corner case is handled the exact same way. Handling corner cases differently is a great way to introduce security vulnerabilities and crash/DoS bugs that <i>every</i> user of your library will have to be aware of.<p>Maintaining such code isn't really a good excuse here either - the code is only going to be a few thousand lines, is self contained with no dependencies, is easy to test, not going to change much with time, etc.<p>Basically, there is no benefit to removing this feature, so don't break what isn't broken.