We are a large company with many small systems that use disparate key sets. We would like to standardize them and use UUIDs for the new global key set. I have several questions about UUIDs, if anyone can help with the answers.<p>1) Are there any issues with co-mingling v4 & v5 UUIDs in a single system? We would use v5 to transition the legacy systems to UUIDs and v4 for anything generated outside the legacy systems (new systems).<p>2) Per https://tools.ietf.org/html/rfc4122, "A UUID is 128 bits long, and requires no central registration process." This implies anyone in our ecosystem can generate a UUID and will probably never collide. However, are there minimum requirements for the machine that generates these UUIDs? For instance, I would imagine a dependable clock (e.g. one that doesn't reset to 1/1/1970 every time it restarts) is necessary, but do all the clocks have to be in sync, or is some amount of skew acceptable? Anything else?<p>3) Is there a list of reliable (future-proof) UUID implementations we can use to cover all the major languages, or are the standard libraries sufficient for generating v4/v5 UUIDs? We have a mix of various flavors of *nix and Windows in our ecosystem.<p>Thank you!
Would it be possible to avoid standardizing on, and coupling to, a single ID type? It should be possible to store IDs locally as opaque strings, so the consumer of an ID doesn't need to know how to parse it and the provider of the ID (the service) can choose whatever is natural.<p>For a content-addressable file system it would be a hash, for another thing it might be an int, for another it might be a UUID, etc.<p>If you need to "validate" that an ID is correct, the only way to do that is to contact the Source of Truth. Checking the syntax of an ID doesn't tell you whether it is valid; relying on syntax checks alone will introduce use-after-free-like conditions in your system.
In Denmark we have a “newish” national standard for public service architecture called rammearkitekturen. It’s an attempt to make IT easier and cheaper in a country with the most digitised public sector in the world, where 98 municipalities each have 300 different IT systems on average.<p>Part of it includes a transition to UUIDs, and the way we deal with them is slowly and in stages. Some systems, especially new ones, are built to use them as the standard identification, but some of our systems are 50 years old and run on mainframes, Tandem computers and whatnot, often with a range of APIs on top of them. Others were designed with local non-standard UUIDs that would work if the systems hadn’t been sold multiple times. And so on.<p>But the most basic way to get into them is by adding UUIDs for external use, while the systems continue using their own ID system internally. Then eventually you replace the internal IDs with the UUIDs when it becomes possible both technically and financially.<p>This isn’t the cleanest approach and it’ll likely take a decade or two to complete, but I wouldn’t recommend doing a Big Bang transition on an enterprise scale.<p>As for standards, go with the newest international standard on UUIDs for your part of the world. We follow the EU.
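A minimal Python sketch of the "UUIDs for external use, internal IDs unchanged" idea, assuming the legacy keys are strings and that a name-based (v5) UUID is an acceptable external ID. The namespace, the domain `billing.example.com` and the key format `CUST-000123` are hypothetical and only for illustration.

```python
import uuid

# Hypothetical namespace for one legacy system. It must be chosen once and
# never changed, otherwise the derived UUIDs change too.
LEGACY_BILLING_NS = uuid.uuid5(uuid.NAMESPACE_DNS, "billing.example.com")


def external_id(internal_id: str) -> uuid.UUID:
    """Derive a stable v5 UUID from a legacy key.

    The legacy system keeps using internal_id; only the outside world
    sees the UUID.
    """
    return uuid.uuid5(LEGACY_BILLING_NS, internal_id)


first = external_id("CUST-000123")
second = external_id("CUST-000123")
other = external_id("CUST-000124")

print(first == second)  # same legacy key, same UUID
print(first == other)   # different legacy key, different UUID
```

Because the UUID is computed from the key, no lookup table is needed to go from internal ID to external ID. Going the other way (UUID back to internal ID) still needs a stored mapping, since the hash cannot be reversed.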
I don't think there is a problem using multiple kinds of UUIDs in the same system; they will not interfere with each other, because a v4 and a v5 UUID necessarily differ in their version bits.<p>Another possibility is to use URIs; UUIDs are URIs too! (Put "urn:uuid:" at the front to make a UUID into a URI.)
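To make both points concrete, here is a small Python sketch using only the standard `uuid` module. The URL `https://example.com/orders/42` and the helper name `kind` are made up for the example.

```python
import uuid

random_id = uuid.uuid4()
named_id = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/orders/42")


def kind(u: uuid.UUID) -> str:
    """Classify a UUID by its version field (hypothetical helper)."""
    return {4: "random", 5: "name-based (SHA-1)"}.get(u.version, "other")


print(kind(random_id), kind(named_id))

# The URN form is just the canonical string with a "urn:uuid:" prefix,
# and the UUID constructor accepts that form back.
urn = random_id.urn
parsed = uuid.UUID(urn)
print(urn, parsed == random_id)
```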
If you're following RFC 4122, v4 and v5 UUIDs can be differentiated by looking at the version number field. (See section 4.1.3 of RFC 4122.) So if you're worried about collisions between the two versions, as long as you're marking your UUIDs with the appropriate version, this shouldn't be an issue.<p>For name-based or "truly random" UUIDs, you don't actually use a clock. That's only for Version 1 UUIDs. (More info in section 4.3 of RFC 4122.) They unfortunately kept the names "timestamp" and "clock sequence" for both v4 & v5 UUIDs, even though there is no time-based information in them. Section 4.3 of RFC 4122 describes how bytes (octets) from the name-based hash function are placed in the timestamp and clock sequence fields (for v3 & v5 UUIDs), and Section 4.4 describes how random bits are placed in the various UUID fields.<p>In short, you don't need to know the time to generate a v4 or v5 UUID. Having your servers synchronize their clocks is a good idea generally, though; it helps make sense of log files, and some protocols freak out if there's too much clock skew.<p>Using Version 4 UUIDs <i>will</i> require you to have a "truly random" number generator (or something close to it.) I wrote a node.js package for generating UUIDs and made sure it gave the user the option of using /dev/random or /dev/urandom or some other source (pretty sure I defaulted to /dev/random.) At the very least you should know the difference between /dev/random, /dev/urandom and /dev/arandom. I have used /dev/urandom as a random number source, but the resulting UUIDs were only consumed by local processes (i.e. I didn't give them out to external clients.) So if we learned later there was a flaw in /dev/urandom, the effects would not be exploitable by external actors.<p>If you're dealing with financial or PII data, there <i>may</i> be regulatory requirements on random number generation. 
Heaven help you if you're recording credit card numbers in there somewhere.<p>It's probably hard to get detailed advice without knowing the content of the data being stored and the context of its use. But you can plan for the worst by:<p>a. using UUID generation software that allows you to securely specify a specific source for your (pseudo) random numbers.<p>b. understanding that a UUID generator might block for an indeterminate amount of time if you're forced to use a random number source that waits to collect sufficient entropy.<p>c. looking at my old UUID generating code. I don't recommend using it; it's just too old. But it does give an example of an interface that lets you select the source of random numbers. It also probably doesn't work on Windows: <a href="https://github.com/OhMeadhbh/node-mug" rel="nofollow">https://github.com/OhMeadhbh/node-mug</a>
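As a rough illustration of point (a), here is a Python sketch of a v4 generator whose random source is a parameter. This is not the interface of node-mug; the names `RandomSource` and `uuid4_from` are invented for this example. It assumes the source is any callable that takes a byte count and returns that many bytes.

```python
import os
import secrets
import uuid
from typing import Callable

# Any callable that returns n random bytes, e.g. os.urandom or
# secrets.token_bytes, or a wrapper around a hardware RNG.
RandomSource = Callable[[int], bytes]


def uuid4_from(source: RandomSource = os.urandom) -> uuid.UUID:
    """Build a v4 UUID from 16 bytes supplied by the chosen source."""
    raw = source(16)
    if len(raw) != 16:
        raise ValueError("random source must return exactly 16 bytes")
    # version=4 overwrites the version and variant bits as RFC 4122
    # section 4.4 describes; the remaining 122 bits come from the source.
    return uuid.UUID(bytes=raw, version=4)


a = uuid4_from()                      # OS random source
b = uuid4_from(secrets.token_bytes)   # explicit alternative source
print(a, b)
```

Passing the source in also makes the generator easy to audit or test: a fixed source gives a fixed UUID, so you can check exactly which bits the generator overwrites.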