19 random digits is not enough to uniquely identify all human beings

42 points by adunk about 2 years ago

17 comments

toast0 about 2 years ago

> Suppose that you assigned everyone a 19 digit number. What is the probability that two human beings would have the same number?

If I assigned the numbers, I would carefully print 8 billion or so slips with different numbers and place them in an urn. For each person, I'd draw from the urn, and then the probability of me assigning the same number to two humans is related to IT failures during generation. How many dollar bills share a serial number unintentionally?

corysama about 2 years ago

> If you want the probability to be effectively zero, you should use 30 digits or so.

You really don't need it to be effectively zero. Just significantly less than other sources of error. For 8 billion people, a 3% chance of there being a single error somewhere on the whole planet is pretty good on the scale of issues that the QA team needs to deal with. Especially when the fix is: roll a second random number for that single unlucky person on Earth. That's pretty easily auto-detected and auto-fixed at generation time.

I worked on a big game that used 32-bit hashes of asset names even though we expected to get on the order of 10 collisions across the dataset. The solution was to detect collisions and tell artists to tweak their file names. It happened about 10 times over the course of many years and hundreds of thousands of assets.
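
The figures in this comment can be sanity-checked with the standard birthday-problem approximation. The sketch below is only an illustration: the function names are made up for this example, the asset count of 300,000 is an assumed stand-in for "hundreds of thousands", and `assign_ids` shows one simple way to detect a collision and reroll at generation time, not the commenter's actual pipeline.

```python
import math
import random


def expected_colliding_pairs(n: int, space: int) -> float:
    """Expected number of pairs that share a value when n values are drawn
    uniformly at random from `space` possibilities."""
    return n * (n - 1) / (2 * space)


def collision_probability(n: int, space: int) -> float:
    """Birthday-problem approximation: P(at least one collision) ~ 1 - exp(-pairs)."""
    return -math.expm1(-expected_colliding_pairs(n, space))


def assign_ids(count: int, space: int, rng: random.Random) -> list[int]:
    """Draw `count` random IDs, rerolling whenever a draw repeats an issued ID."""
    if count > space:
        raise ValueError("more IDs requested than the space can hold")
    issued: set[int] = set()
    ids: list[int] = []
    while len(ids) < count:
        candidate = rng.randrange(space)
        if candidate in issued:
            continue  # collision detected at generation time: roll again
        issued.add(candidate)
        ids.append(candidate)
    return ids


if __name__ == "__main__":
    people = 8_000_000_000
    for digits in (19, 21, 30):
        print(digits, collision_probability(people, 10**digits))
    # Assumption: 300,000 stands in for "hundreds of thousands of assets".
    print(expected_colliding_pairs(300_000, 2**32))
```

Under this approximation, 8 billion draws from 10^19 values collide with a probability of roughly 96%, and from 10^21 values roughly 3%. About ten expected collisions for a few hundred thousand 32-bit hashes is consistent with the game-asset anecdote.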

andsoitis about 2 years ago

UUID to the rescue. https://en.wikipedia.org/wiki/Universally_unique_identifier

It:
- is a standard
- is widely adopted across computing platforms
- addresses the birthday problem (in v4, the chance of a collision in a set of 103 trillion UUIDs is 1 in a billion)
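
As a rough check of the 103 trillion figure, the sketch below applies the birthday approximation to the 122 random bits of a version-4 UUID (128 bits minus 6 fixed version and variant bits). The helper name `v4_collision_probability` is hypothetical; `uuid.uuid4()` is the Python standard library call.

```python
import math
import uuid

# A version-4 UUID is 128 bits wide, but 6 bits are fixed (version and variant),
# leaving 122 bits of randomness.
RANDOM_BITS = 122


def v4_collision_probability(n: int) -> float:
    """Approximate chance of at least one collision among n random v4 UUIDs."""
    expected_pairs = n * (n - 1) / (2 * 2**RANDOM_BITS)
    return -math.expm1(-expected_pairs)


if __name__ == "__main__":
    person_id = uuid.uuid4()  # one random identifier
    print(person_id, person_id.version)
    print(v4_collision_probability(103 * 10**12))  # the set size quoted in the comment
```

For 103 trillion UUIDs the approximation comes out very close to 1e-9, which matches the "1 in a billion" claim.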

armchairhacker about 2 years ago

Why 19 is important: log10(2^63) ≈ 19, which means a random 64-bit integer is not long enough to uniquely identify all human beings. A 128-bit integer or UUID is.
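
The link between bit widths and decimal digits is quick to verify. This is a small sketch; `decimal_digits` is a name invented for the example.

```python
import math


def decimal_digits(bits: int) -> int:
    """Number of decimal digits needed to write 2**bits."""
    return len(str(2**bits))


if __name__ == "__main__":
    for bits in (63, 64, 128):
        # log10(2**bits) is the "fractional" digit count; decimal_digits rounds it up.
        print(bits, round(math.log10(2**bits), 2), decimal_digits(bits))
```

A signed 64-bit integer tops out just above 9.2 × 10^18, so it holds at most 19 decimal digits, while a 128-bit value spans 39 digits.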

hedora about 2 years ago

The Social Security Administration in the US solved this long ago. Reserve a few bits for “which source generated this integer”, then go sequential, or random w/o replacement, for each shard.

(Note that SSNs were not meant to be unique when this scheme was invented. They were designed to be reused periodically. Name, DOB and SSN should be unique though.)
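
A minimal sketch of the "reserve a few bits for the source, then go sequential" idea follows. It is not a description of how SSNs are actually allocated: the class name, the 8-bit shard width, and the choice to put the shard in the low bits are all assumptions made for illustration.

```python
from itertools import count

SHARD_BITS = 8  # assumption: up to 256 independent issuing sources


class ShardedIdSource:
    """Issues sequential IDs that cannot collide with IDs from other shards."""

    def __init__(self, shard: int, shard_bits: int = SHARD_BITS) -> None:
        if not 0 <= shard < (1 << shard_bits):
            raise ValueError("shard number does not fit in the reserved bits")
        self._shard = shard
        self._shard_bits = shard_bits
        self._counter = count()  # 0, 1, 2, ... local to this shard

    def next_id(self) -> int:
        # High bits: per-shard sequence number. Low bits: which source issued it.
        return (next(self._counter) << self._shard_bits) | self._shard


if __name__ == "__main__":
    office_a = ShardedIdSource(shard=1)
    office_b = ShardedIdSource(shard=2)
    print([office_a.next_id() for _ in range(3)])
    print([office_b.next_id() for _ in range(3)])
```

Because the shard number is embedded in every ID, the sources never need to coordinate with each other after the shard numbers have been handed out.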

psaipetc about 2 years ago

The Birthday Paradox and Microsoft GUIDs or How I use Mathematica to Reassure Myself to Go Back to Sleep at Nights: https://www.atriumtech.com/pongskorn/birthdayparadox/birthdayparadox.htm

stubish about 2 years ago

Given the requirement to detect collisions and a process to reissue an ID for various legal reasons, I don't see a problem. In fact, don't use too many digits. If you use too many digits, collisions will be so rare that when they occur nobody will know how to handle them. Better to have a process that gets exercised a few times a year at minimum.

geoah about 2 years ago

The actual title is “19 random digits is not enough to uniquely identify all human beings”.

The tl;dr is that you need at least 30 digits so it’s safe to assign a random number to a person with a close to zero probability of it already being assigned.

I’m not really sure why the author is talking about 19 digits. Must be a reference to something I guess but I really don’t know what.

Also this doesn’t mention how long this is valid for. New people keep getting born, so at some point 30 digits won’t be enough unless we don’t care about reuse of the ones where people might be long gone.

asow92 about 2 years ago

Speaking of birthdays, what is the probability of two people with the same 19 digit number colliding with the same birthday as well?

tacocataco about 2 years ago

I don't want to be a number. Why can't I be a word or some exclamation points?

devcat about 2 years ago

there are great solutions like https://en.m.wikipedia.org/wiki/Snowflake_ID if you give up randomness
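
For readers who have not seen the format, here is a hedged sketch of a Snowflake-style generator using the commonly described 64-bit layout: 41 bits of millisecond timestamp, 10 bits of machine ID, and 12 bits of per-millisecond sequence. The class name, the custom epoch, and the injectable `clock` parameter are assumptions for this example, and the sketch is not thread-safe.

```python
import time

TIMESTAMP_BITS, MACHINE_BITS, SEQUENCE_BITS = 41, 10, 12
CUSTOM_EPOCH_MS = 1_600_000_000_000  # assumption: arbitrary epoch picked for this sketch


def _now_ms() -> int:
    return time.time_ns() // 1_000_000


class SnowflakeLikeGenerator:
    """Time-ordered IDs that are unique as long as machine IDs are unique."""

    def __init__(self, machine_id: int, clock=_now_ms) -> None:
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id does not fit in 10 bits")
        self._machine_id = machine_id
        self._clock = clock
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        now = self._clock()
        if now < self._last_ms:
            raise RuntimeError("clock moved backwards")
        if now == self._last_ms:
            # Same millisecond: bump the sequence, wrapping at 4096.
            self._sequence = (self._sequence + 1) & ((1 << SEQUENCE_BITS) - 1)
            if self._sequence == 0:
                # Sequence exhausted: spin until the next millisecond.
                while now <= self._last_ms:
                    now = self._clock()
        else:
            self._sequence = 0
        self._last_ms = now
        timestamp = now - CUSTOM_EPOCH_MS
        return (
            (timestamp << (MACHINE_BITS + SEQUENCE_BITS))
            | (self._machine_id << SEQUENCE_BITS)
            | self._sequence
        )


if __name__ == "__main__":
    gen = SnowflakeLikeGenerator(machine_id=5)
    print(gen.next_id(), gen.next_id())
```

Uniqueness here comes from coordination (distinct machine IDs plus a local counter) rather than from randomness, which is the trade-off the comment points at.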

dredmorbius about 2 years ago

Another advantage to greatly overengineered random string lengths is that it makes brute-force namespace search technically infeasible. This is an increasing problem with the PSTN (public switched telephone network), where there are simply too few digits in a phone number to prevent comprehensive dialing attacks. (Number reuse would be another principal problem.)

Long ago I'd read a Douglas Hofstadter essay (probably from his Scientific American puzzles column and compiled in Metamagical Themas) where he'd commented on the apparent idiocy of having very long account numbers which were clearly far larger than the possible object space.

That critique fails to consider any number of points, including the challenges of non-coordinated UUID assignments (as Lemire writes), the practice of coding other semantic information into parts of a larger string (e.g., branch or office identifiers, years, or other redundant information), systems which have grown out of mergers of multiple independent systems (where accounts from systems A and B might have collided, so A and B now require distinguishing, as well as C, D, E, ...), and as I've noted, the ease of searching the namespace for valid / assigned values.

tanbog5 about 2 years ago

Give everyone an IPv6 address?

m3kw9 about 2 years ago

I’d just assign an incremental 9-digit number

hgsgm about 2 years ago

HN's digit removal algo is getting out of hand.

macinjosh about 2 years ago

Why is HN removing numerals at the beginning of a title? Second time I’ve seen it today.

photochemsyn about 2 years ago

This is the most dystopian thing I've read in a while. Prison Planet Earth, where everyone is uniquely identified by a number, perhaps surgically implanted on an RFID tag? How much compute resource would be needed to create an AI minder for each and every human on top of that, constantly updating their social credit score on a daily basis?
