We should consider the following properties when evaluating ID formats and generation algorithms:<p>1. Private: you shouldn’t be able to learn anything about the system from an ID alone. E.g. document enumeration attacks like what happened with Parler (<a href="https://www.wired.com/story/parler-hack-data-public-posts-images-video/" rel="nofollow">https://www.wired.com/story/parler-hack-data-public-posts-im...</a>)<p>2. B-tree/cache friendly: newly created IDs should all fall in a narrow range of values, so that inserts touch only a small part of a database index.<p>3. Stateless: ideally you shouldn’t need to know the current state of the system to create a new ID.<p>4. Human-friendly: IDs should be easy to dictate, copy, paste, etc. This means they should be encodable as text that is short and does not include ambiguous characters. Bonus points for error detection, like the check digit in credit card numbers.<p>Some of these properties are in conflict. Statelessness is achieved by randomly generating long IDs, but people don’t like reading or typing long IDs.<p>Different use cases will need these properties in varying amounts. If you don’t intend to expose the IDs to users, (4) doesn’t matter. Just use long, randomly generated byte strings prepended with the date. Most databases have a UUID type that can store such a value.<p>If users are going to be working with IDs, it’s more complicated. If not every document needs a user-facing ID, use the non-user-facing ID as before and generate a shorter, random, stateful ID only when one is needed.<p>I don’t think NanoID prepends the date, which means inserting large numbers of IDs into a large index won’t be efficient. The default alphabet also includes ambiguous characters like 1, I, and l, and there is no checksum for error detection. But NanoIDs are shorter than UUIDs. So NanoID doesn’t meet property (2), and it only kind of meets property (4). NanoIDs are random, so you’re probably safe from enumeration attacks (1). NanoID mostly leaves statelessness as a decision for the user. 
They have a nice tool that helps estimate how long the IDs should be for a given collision resistance (<a href="https://zelark.github.io/nano-id-cc/" rel="nofollow">https://zelark.github.io/nano-id-cc/</a>).<p>I think we can do better overall. Bitcoin uses a good encoding scheme called Base58Check (<a href="https://en.bitcoin.it/wiki/Base58Check_encoding" rel="nofollow">https://en.bitcoin.it/wiki/Base58Check_encoding</a>). It generates fairly short strings and includes a checksum at the end. I think it could be refined for non-Bitcoin purposes, but it’s already pretty good.<p>A 128-bit value like the ASCII string “hackernewstestid” is encoded as “Dtajqjz5pptWcmGrNcwBx7”. It’s about 2/3 the size of the equivalent UUID, even with the (unnecessarily long for this use case) checksum. It also has no punctuation.<p>I’d like to see a small ID standard that meets the above requirements and offers a choice between stateless and long or stateful and short. Maybe another choice between secure random and insecure. But all options would have a binary form and a text form. The text form would use something similar to Base58Check, but probably with a shorter (or user-determined) checksum length.
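<p>To make the idea concrete, here is a rough Python sketch of what I have in mind. It is not an existing standard. The 16-byte layout (6-byte millisecond timestamp followed by 10 random bytes), the 2-byte checksum length, and the names new_id, to_text and from_text are all my own assumptions for illustration. The checksum is a truncated double SHA-256, in the spirit of Base58Check, and there is no version byte.

```python
import hashlib
import os
import time

# Bitcoin's base58 alphabet: digits and letters without 0, O, I, or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58 text."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is preserved as a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Decode base58 text; raises ValueError on a character outside the alphabet."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def checksum(raw: bytes, check_len: int) -> bytes:
    """First check_len bytes of SHA-256(SHA-256(raw))."""
    return hashlib.sha256(hashlib.sha256(raw).digest()).digest()[:check_len]


def new_id(now_ms=None) -> bytes:
    """Stateless binary ID (assumed layout): 6-byte ms timestamp + 10 random bytes."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    return now_ms.to_bytes(6, "big") + os.urandom(10)


def to_text(raw: bytes, check_len: int = 2) -> str:
    """Text form: base58 of the binary ID followed by a short checksum."""
    return b58encode(raw + checksum(raw, check_len))


def from_text(text: str, check_len: int = 2) -> bytes:
    """Recover the binary ID, rejecting text whose checksum does not match."""
    data = b58decode(text)
    split = len(data) - check_len
    if split < 0 or checksum(data[:split], check_len) != data[split:]:
        raise ValueError("checksum mismatch")
    return data[:split]


if __name__ == "__main__":
    raw = new_id()
    text = to_text(raw)
    print(text, from_text(text) == raw)
```

The timestamp prefix keeps new binary IDs in a narrow range for (2), the random tail avoids shared state for (3), and the alphabet plus checksum cover most of (4). The cost is that the timestamp leaks creation time, which works against (1), and a 2-byte checksum will still miss roughly one in 65,536 random corruptions.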