UUIDs are obsolete in the age of Docker

20 points by lshevtsov, about 2 years ago

17 comments

remram, about 2 years ago
> this is only correct about UUID version 1. However, it is what most applications use.
This is a bold claim and doesn't match my experience at all. UUIDv4 is all I see, everywhere, every day.
That's also a big enough caveat to put in the title: if you have a beef with UUIDv1, say UUIDv1 is obsolete.
ekimekim, about 2 years ago
As the article points out, this is only an issue with UUIDv1. They claim "However, it is what most applications use." but I have no idea how true this is. I was under the impression that the vast majority of UUID generators were v4 by default. For example:
Postgres only offers random UUID generation (https://www.postgresql.org/docs/15/functions-uuid.html).
The `uuidgen` CLI tool, at least for modern versions (I have not checked historically), says (from https://man7.org/linux/man-pages/man1/uuidgen.1.html): "By default uuidgen will generate a random-based UUID if a high-quality random number generator is present." (Later it lists /dev/random as such a generator, present on almost all systems.)
What's an example of a system that generates v1 UUIDs by default?
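One way to settle the "which version does my system emit?" question is to parse a sample and read the version field. A minimal sketch in Python; the helper name `uuid_version` is illustrative and not from the thread, and the v1 sample is the DNS namespace UUID that ships with Python's `uuid` module:

```python
import uuid
from typing import Optional


def uuid_version(text: str) -> Optional[int]:
    """Return the version number encoded in a UUID string.

    Returns None when the UUID does not use the RFC 4122 variant.
    """
    return uuid.UUID(text).version


# A freshly generated random UUID reports version 4.
print(uuid_version(str(uuid.uuid4())))

# The well-known DNS namespace UUID is a time-based (version 1) value.
print(uuid_version(str(uuid.NAMESPACE_DNS)))
```

Pasting the output of `uuidgen` or of a database's UUID function into `uuid_version` shows what that generator produces by default.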
SPBS, about 2 years ago
1. Nobody uses UUIDv1. Why use UUIDv1 as a straw man argument?
2. UUID strings are awful for storage, so don't use them. There are databases that support UUIDs natively, so why is whether or not a UUID fits into a machine word relevant? You use UUIDs for their other properties that 64-bit integers cannot offer. KSUIDs are touted as fixing all the aforementioned issues but they're even bigger than UUIDs.
3. Both KSUIDs and UUIDs are hard for humans to read compared to 64-bit integers.
4. You don't *have* to encode UUIDs as hexadecimal numbers plus dashes. You can choose any binary encoding you want. I am partial to Crockford Base32 because of how general-purpose it is (no vulgarities, case insensitive so it works on Windows filesystems).
5. I still consider time-sortable UUID alternatives (like ULID) to be UUIDs. This article should have explicitly mentioned UUIDv1 and UUIDv4 in the title and it wouldn't have been so flamebait.
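To make point 4 concrete, here is a rough sketch of encoding a UUID's 128 bits with the Crockford Base32 alphabet. The function names `encode` and `decode` are made up for this example, and the fixed 26-character width is a choice of this sketch (26 × 5 = 130 bits, enough for 128):

```python
import uuid

# Crockford Base32 alphabet: digits plus letters, without I, L, O and U.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode(u: uuid.UUID) -> str:
    """Encode the 128-bit UUID value as 26 Crockford Base32 characters."""
    n = u.int
    chars = []
    for _ in range(26):
        chars.append(ALPHABET[n & 31])  # take the lowest 5 bits
        n >>= 5
    return "".join(reversed(chars))


def decode(text: str) -> uuid.UUID:
    """Decode case-insensitively, mapping O to 0 and I/L to 1."""
    cleaned = text.upper().replace("-", "").translate(str.maketrans("OIL", "011"))
    n = 0
    for ch in cleaned:
        n = (n << 5) | ALPHABET.index(ch)
    return uuid.UUID(int=n)


sample = uuid.uuid4()
encoded = encode(sample)
print(encoded, len(encoded))
print(decode(encoded.lower()) == sample)
```

The result is 26 characters instead of 36, and lowercase input decodes to the same value, which is the case-insensitivity property mentioned above.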
yamtaddle, about 2 years ago
> If you require a globally unique string ID, consider URIs
Is my knee-jerk judgement that this advice borders on nonsense unwarranted?
majewsky, about 2 years ago
Is anyone even still using non-random UUIDs? Every application I've ever seen use them is using v4.
happytoexplain, about 2 years ago
Similar to other comments, I've only encountered v4 in my career. Is there a large domain where v1 is the norm that dominates the statistic, and most people happen to not work in that domain? If the author knows, I wish they'd say.
gnu8, about 2 years ago
> They are awful as keys – being strings, comparisons are dramatically slower than with integers. And even if your database has a UUID type, it’s still worse because the identifier doesn’t fit into a machine word.
I’m just a bit confused: a UUID is made up of hexadecimal digits, so why would it be stored as a string? It’s also 128 bits long, so it should fit into two 64-bit words, excluding whatever overhead the DBMS puts on the data type, which is really their problem to worry about.
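The size difference is easy to see with Python's `uuid` module. A small sketch; the sample value is the DNS namespace UUID, chosen only because it is a fixed, well-known constant:

```python
import uuid

u = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

text_form = str(u)      # canonical hex-and-dashes representation
binary_form = u.bytes   # the raw 128-bit value

print(len(text_form))    # 36 characters
print(len(binary_form))  # 16 bytes

# The same value split into two 64-bit words.
high_word = u.int >> 64
low_word = u.int & (2**64 - 1)
print((high_word << 64) | low_word == u.int)  # True

# Information carried per character of the text form, dashes included.
bits_per_char = 128 / len(text_form)
print(round(bits_per_char, 2))  # 3.56
```

Stored as 16 raw bytes, a comparison touches two machine words; stored as text, it touches 36 characters.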
starfox64_, about 2 years ago
I've had a similar issue with MongoDB's ObjectIDs. They are generated using a combination of process ID, UNIX timestamp and a counter that is randomly initialized during process creation. The issue when Docker comes into the mix is that the root process ID of every container is 1, so a decent chunk of entropy is removed from the ObjectID. Add to that the fact that the timestamp doesn't have millisecond resolution, and the only thing saving you is praying that the counters of your processes never overlap during the same second.
It's unlikely to happen but still possible, and it has brought down some of our parallel worker pool, because once you have a collision, you are bound to keep generating the same ID sequence until you restart your whole process to randomize the counter again.
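The failure mode described above can be imitated with a toy generator. This is a simplified, hypothetical layout (seconds, process ID, counter) based only on the description in this comment, not MongoDB's actual ObjectID implementation; `make_id` and `id_sequence` are names invented for the sketch, and the timestamp is held fixed to stand for "within the same second":

```python
import struct


def make_id(timestamp: int, pid: int, counter: int) -> bytes:
    """Pack a toy ID: 4-byte seconds, 2-byte process ID, 3-byte counter."""
    head = struct.pack(">IH", timestamp, pid & 0xFFFF)
    return head + (counter & 0xFFFFFF).to_bytes(3, "big")


def id_sequence(timestamp: int, pid: int, start_counter: int, count: int) -> list:
    """Generate `count` consecutive IDs from one process within one second."""
    return [make_id(timestamp, pid, start_counter + i) for i in range(count)]


now = 1_700_000_000

# Two containers: both see themselves as PID 1, and their counters happen to line up.
container_a = id_sequence(now, pid=1, start_counter=500, count=3)
container_b = id_sequence(now, pid=1, start_counter=500, count=3)

# A process with a different PID and the same counter does not collide.
host_process = id_sequence(now, pid=4242, start_counter=500, count=3)

print(container_a == container_b)   # True: every ID in the run collides
print(container_a == host_process)  # False
```

With the PID contributing nothing, the counter is the only field that differs between containers in a given second, and once two counters align they stay aligned as long as both advance in step.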
Demiurge, about 2 years ago
I've never thought UUIDv1 was useful in any virtualized context, and I hope that is obvious, but maybe it's worth stating in the UUID generation docs. The Python docs already explain somewhat well what the versions are.
However, with all the things already supporting UUID, I also don't see any reason to switch from UUIDv4 to anything else. I don't see how UUID in general is obsolete, with the support it has from different libraries and databases.
woile, about 2 years ago
What about ulid as an alternative?
moltar, about 2 years ago
One great benefit of UUIDs I have found is the inability to join the wrong row.
If you use incremental numbers, every table has 1, 2, 3.
arcticfox, about 2 years ago
I was confused by this title because I only use UUID v4. The author covers that in the article, but I'm surprised that so many people use UUID v1. I thought v4 was the most popular, but that's probably just because I mostly work with my own code.
fabian2k, about 2 years ago
Is there any reason to use anything except completely random UUIDs? I vaguely remember reading about problems with MAC-based UUIDs decades ago; my impression was that they have been discouraged for a long time already.
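For anyone who hasn't looked at a MAC-based UUID up close: the node value sits verbatim in the last 12 hex digits of a version 1 UUID. A short Python sketch, where `FAKE_MAC` is a made-up address used for illustration and passed explicitly so the example doesn't depend on the local machine:

```python
import uuid

# Hypothetical MAC address, standing in for one shared by several containers.
FAKE_MAC = 0x0242AC110002

first = uuid.uuid1(node=FAKE_MAC)
second = uuid.uuid1(node=FAKE_MAC)

print(first)
print(second)

# The node field is the MAC address, readable straight from the text form.
print(f"{first.node:012x}")
print(str(first)[-12:] == "0242ac110002")  # True
```

Everything that distinguishes two such UUIDs comes from the timestamp and clock-sequence fields; the node part is identical for every generator that shares the address.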
halosghost, about 2 years ago
> Note: this is only correct about UUID version 1. However, it is what most applications use.
Okay, so, not all UUIDs, just v1. And, for some anecdata, I've actually only interacted with UUID v4 in my entire career; I don't know what the actual norm is, but I'm surprised to hear that it might still be v1.
> The only other practical option is version 4 – the random UUID – but random is intuitively worse, right? Read on to find out.
Oh… how is it worse?
> * They are awful as keys – being strings, comparisons are dramatically slower than with integers. And even if your database has a UUID type, it’s still worse because the identifier doesn’t fit into a machine word.
> * They are excessively long – each character of a UUID only encodes 3.5 bits of information if you count the dashes. That’s twice as less compared to 6 bits of Base64.
Sorry, UUIDs are not strings, they're 128-bit integers. They have a standardized string representation, but if you're storing a UUID as a string, you're either being required to because your language/db/tools/etc. don't support UUIDs correctly, or you're doing it wrong.
> * They are not time-ordered – despite containing a timestamp, its bits are mixed up within the UUID: the top bytes of the UUID contain the bottom bytes of the timestamp. Databases do not like an unordered primary key – it means that freshly inserted rows can go anywhere in the index. And you can’t use UUIDs for ad-hoc time sorting by time, either.
This is *definitely* a drawback when using a UUID as a primary key, and there are alternatives for this specific use-case. However, I think the best solution I've seen to this is to use a typical 64-bit integer for the primary key, but a UUID for a user-visible ID (so that you don't leak information about the primary keys to users); this makes joins and indexes fast, but avoids the leak to the end-user.
> * They are bad for human comprehension – UUIDs tend to look alike, and it’s hard to visually seek and compare them. This comes from experience.
This is exactly why they shouldn't be used as an ID anywhere that a human needs to interact with one. In the above solution I mentioned, the most common ID for which you'd want to use a UUID is the user's ID; the user specifically has no reason to ever refer to their or anyone else's ID, since they'll use the human-readable username/handle equivalent instead. And developers don't need to care about UUIDs ever because inside the db, you'd have the integer primary key that you use for joins. This seems to solve all the problems?
> I kindly suggest that UUIDs are never the right answer.
Honestly, I think you've only convinced me that UUID v1 is never the right answer… and I think that's mostly been true since v4 came about.
All the best,
-HG
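The integer-key-plus-public-UUID layout described above might look roughly like this. A sketch using SQLite through Python's `sqlite3` module; the table and column names (`users`, `posts`, `public_id`) are invented for illustration:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Integer primary keys for joins; a random UUID stored as 16 bytes for external use.
conn.execute(
    """CREATE TABLE users (
           id INTEGER PRIMARY KEY,
           public_id BLOB NOT NULL UNIQUE,
           handle TEXT NOT NULL
       )"""
)
conn.execute(
    """CREATE TABLE posts (
           id INTEGER PRIMARY KEY,
           user_id INTEGER NOT NULL REFERENCES users(id),
           body TEXT NOT NULL
       )"""
)

public_id = uuid.uuid4()
cursor = conn.execute(
    "INSERT INTO users (public_id, handle) VALUES (?, ?)",
    (public_id.bytes, "hg"),
)
user_pk = cursor.lastrowid
conn.execute("INSERT INTO posts (user_id, body) VALUES (?, ?)", (user_pk, "hello"))

# External callers look a user up by UUID; the join itself runs on integers.
row = conn.execute(
    """SELECT posts.body
       FROM posts JOIN users ON users.id = posts.user_id
       WHERE users.public_id = ?""",
    (public_id.bytes,),
).fetchone()
print(row)  # ('hello',)
```

Only `public_id` ever leaves the database, so the sequential `id` values reveal nothing to end users.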
WirelessGigabit, about 2 years ago
Obligatory read about UUIDs derived from MAC addresses: https://devblogs.microsoft.com/oldnewthing/20040211-00/?p=40663
TLDR on the article: don't use UUIDv1.
Lastly, even with the best and most randomized generation, it still doesn't protect you from copy-pasting: https://news.ycombinator.com/item?id=22354449
coolgoose, about 2 years ago
Sometimes I am amazed at what gets on the front page of ycombinator.
TLDR: Don't use UUID v1 if your cloud provider generates the same MAC address for all your containers, since its entropy is based on the MAC address.
Saying not to use UUIDs at all makes no sense. Use UUIDv7, use them in Postgres (https://github.com/fboulnois/pg_uuidv7), and have fun :)
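For reference, the version 7 layout puts a 48-bit Unix millisecond timestamp in the top bits, followed by the version, variant and random bits, which is what makes the values sort by creation time. A hand-rolled Python sketch of that layout; `uuid7_sketch` is a name made up here, it is not how the linked Postgres extension is implemented, and it makes no attempt at monotonic ordering within a single millisecond:

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUID with the version 7 bit layout from a millisecond timestamp."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = rand & 0xFFF                         # 12 bits after the version
    rand_b = (rand >> 12) & ((1 << 62) - 1)       # 62 bits after the variant
    value = (
        ((unix_ms & ((1 << 48) - 1)) << 80)  # 48-bit timestamp in the top bits
        | (0x7 << 76)                        # version nibble
        | (rand_a << 64)
        | (0b10 << 62)                       # RFC variant bits
        | rand_b
    )
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier)
print(later)
print(earlier < later, str(earlier) < str(later))  # True True
```

Because the timestamp leads, both the integer order and the text order follow creation time at millisecond granularity.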
jupp0r, about 2 years ago
In practice, I generate UUIDs entirely using entropy from /dev/random. The probability of a collision is really low for most use cases (although not if you are Google and need something unique across all database rows in your company or something similar).
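"Really low" can be put into numbers with the usual birthday approximation. A sketch assuming IDs are uniformly random version 4 UUIDs, which carry 122 random bits (128 minus 4 version bits and 2 variant bits); `collision_probability` is a name chosen for this example:

```python
import math

RANDOM_BITS = 122  # random bits in a version 4 UUID


def collision_probability(n: int) -> float:
    """Approximate chance of at least one collision among n random UUIDs.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2**122)).
    """
    space = 2 ** RANDOM_BITS
    return -math.expm1(-n * (n - 1) / (2 * space))


for count in (10**6, 10**9, 10**12, 2**61):
    print(f"{count:>22,d} IDs -> p ~ {collision_probability(count):.3g}")
```

At a billion IDs the estimate is on the order of 1e-19; it only approaches even odds once the count nears 2^61.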