Identifiers are better off without meaning

33 points by srvaroa about 1 year ago

17 comments

mewpmewp2 about 1 year ago

This problem seems easily solvable: just use the last part of the URL for resolution. E-commerce shops use slug IDs, which always carry meaning in URLs, for SEO and UX purposes.

So there are constantly URLs that wouldn't resolve to anything if the full URL path were used. But you use just the product ID part, and then either redirect to the new categories or keep the same URL. Up to you.

I feel like all of these cases are really solvable, and slug IDs or IDs with meaning are actually great.

So I am talking about URLs like /electronics/smartphones/apple-iphone-blue-32gb etc. This is very good for UX and usability as well.

skybrian about 1 year ago

Wikipedia does pretty well with redirects and disambiguation pages. Email can be forwarded. It's more difficult if you don't have a redirect mechanism.

I don't think any of us would prefer a meaningless number to a username? But you can create a new account if you wish.

And of course we use meaningful names in source code.

One scheme I like for URLs is a meaningless id followed by an SEO string, where only the number is used and it redirects if the SEO string doesn't match.
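A minimal sketch of the "opaque id plus SEO string" scheme described above, in Python. The `/p/<id>-<slug>` path layout, the `CURRENT_SLUGS` table, and the `resolve` helper are all assumptions made for illustration; the comment does not specify any of them.

```python
from typing import NamedTuple, Optional

# Hypothetical catalogue: opaque numeric id -> current SEO slug.
CURRENT_SLUGS = {
    1042: "apple-iphone-blue-32gb",
}

PREFIX = "/p/"  # assumed URL layout: /p/<id>-<slug>


class Resolution(NamedTuple):
    status: int              # 200 (serve), 301 (redirect) or 404 (unknown)
    location: Optional[str]  # canonical path, or None when nothing matches


def resolve(path: str) -> Resolution:
    """Look up by the numeric id only; the slug is decoration."""
    if not path.startswith(PREFIX):
        return Resolution(404, None)
    id_part, _, _slug = path[len(PREFIX):].partition("-")
    if not (id_part.isascii() and id_part.isdigit()):
        return Resolution(404, None)
    item_id = int(id_part)
    current = CURRENT_SLUGS.get(item_id)
    if current is None:
        return Resolution(404, None)
    canonical = f"{PREFIX}{item_id}-{current}"
    # A stale, missing, or misspelled slug still resolves: redirect to the canonical path.
    if path != canonical:
        return Resolution(301, canonical)
    return Resolution(200, canonical)
```

Because only the number takes part in the lookup, renaming a product or moving it to another category changes `CURRENT_SLUGS` and nothing else; old links keep working through the redirect.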

ch33zer about 1 year ago

One situation where we made the decision to go against this advice was prefixing a type to our identifiers. This was only useful for humans, to be able to distinguish the id types from each other so they wouldn't make mistakes when querying our service (passing a Foo id instead of a Bar id), since the id formats were otherwise identical.

We also made the decision that the geographical cluster doing the processing would be embedded in the id. This may have been a mistake; we never got to the point where it caused problems, but if there had been a major reorganization then yeah, it could have been problematic.
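As a rough illustration of type-prefixed identifiers, here is a Python sketch. The `foo_`/`bar_` prefix format and the helper names `new_id` and `require_kind` are hypothetical; the comment only says that a type was prefixed to otherwise identical ids.

```python
import uuid

# Hypothetical type prefixes; the comment names only "Foo" and "Bar" ids.
KNOWN_KINDS = {"foo", "bar"}


def new_id(kind: str) -> str:
    """Create an id whose only embedded meaning is its type."""
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown id kind: {kind!r}")
    return f"{kind}_{uuid.uuid4().hex}"


def require_kind(identifier: str, expected: str) -> str:
    """Return the opaque part if the prefix matches, otherwise raise."""
    kind, sep, opaque = identifier.partition("_")
    if not sep or not opaque or kind != expected:
        raise ValueError(f"expected a {expected!r} id, got {identifier!r}")
    return opaque
```

A service endpoint that takes a Bar id could call `require_kind(value, "bar")` first, so that a Foo id pasted by mistake fails immediately with a readable error instead of silently matching nothing.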

geon about 1 year ago

A product I worked on used a complex scheme of encoding some 4-5 ids into an integer by reserving certain integer ranges etc. It was super inflexible and predictably caused issues when their 16-bit ids weren't enough for new projects.

It was also very difficult to work with. I had to refactor the Lovecraftian mess that generated them for an entire project of thousands of nodes. If they didn't come out exactly the same, a technician would have to spend days manually updating physical units to match the new ids. Thank God for unit tests.
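To make the failure mode concrete, here is a Python sketch of packing several sub-ids into one integer. The field names and the layout of four 16-bit fields are assumptions for illustration; the actual scheme in the product above is not described in the comment.

```python
# Assumed layout (hypothetical): four 16-bit fields packed into one 64-bit integer.
FIELD_BITS = 16
FIELDS = ("site", "rack", "unit", "channel")


def pack(**values: int) -> int:
    """Pack one value per field, most significant field first."""
    packed = 0
    for name in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << FIELD_BITS):
            # This is the wall the scheme hits once a project outgrows 16 bits.
            raise OverflowError(f"{name}={value} does not fit in {FIELD_BITS} bits")
        packed = (packed << FIELD_BITS) | value
    return packed


def unpack(packed: int) -> dict:
    """Inverse of pack(): split the integer back into its fields."""
    mask = (1 << FIELD_BITS) - 1
    out = {}
    for name in reversed(FIELDS):
        out[name] = packed & mask
        packed >>= FIELD_BITS
    return out
```

Widening any field changes every id that has already been issued, which is why regenerating the ids "exactly the same" mattered so much in the story above.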

arno_v about 1 year ago

I actually had the experience of using meaningful identifiers at a previous company. At the time it was really handy for a lot of day-to-day stuff, but I'm now thinking we might not have used them long enough to run into the problems described here.

creer about 1 year ago

Discipline? Doesn't the fine article point it out itself? A carefully chosen degree of meaning embedded in an identifier has many practical uses. After that, what matters is the tradeoff: Choosing well what gets embedded or doesn't - and so what will require a lookup always or only occasionally? And additional headaches such as caching that lookup? Choosing well which libraries code in the embedding - as opposed to doing it all over the place? Not trusting that the embedding is "self-documenting"? Etc, etc.

The author provides several examples, and they were all chosen because there was a benefit. Although some were poorly chosen, like a group name rather than a function (everyone remembers individuals' emails rather than functions). Whether it was worth it is easy to dispute after that choice caused a headache, but that is "sore loser bias": it doesn't account for all the worthwhile, effective choices elsewhere. Nor does it account for the effectiveness it bought the project in the meantime.

All the way to the extreme: do we prefer blog URLs that mention at least some category, date and a few words of subject line? Or do we go with opaque machine-generated ones? Several lines long for good measure? Does the fact that you will never have to rename the opaque ones justify inflicting them on the users? How likely are you to ever rename? Some people will still choose the opaque URL! Do they earn points with their readers?

kwhitefoot about 1 year ago

I and many colleagues fought this war for decades in a major international electrical engineering company. I don't think we ever convinced more than a tiny fraction of people to stop embedding meaning in the names of things, despite the obvious trouble it caused.

I wonder, now that I am retired, if we should perhaps just give up the fight and instead concentrate on mitigation.

readthenotes1 about 1 year ago

I remember this lesson being taught in the early 1980s. I guess "in software we step on the feet of giants" is still true.

Leftium about 1 year ago

Is this an argument against strongly typed identifiers, or just an argument against adding too much meaning to the identifier?

I think if the identifier just adds what type of data it is identifying, the extra meaning will only become obsolete at the same time that data does. And the extra type information can help avoid/debug problems where the wrong type of id was used.

Compare with strongly typed ids/type branding:

- https://hw.leftium.com/#/item/39174998
- https://www.peakscale.com/strongly-typed-ids/
- https://andrewlock.net/using-strongly-typed-entity-ids-to-avoid-primitive-obsession-part-1/
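One way to read "strongly typed ids" in Python is a small wrapper class per id type. This is only a sketch of the general idea; the `UserId`, `OrderId`, and `load_order` names are hypothetical and are not taken from the linked articles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    value: int


@dataclass(frozen=True)
class OrderId:
    value: int


def load_order(order_id: OrderId) -> str:
    """Accept only OrderId; a static type checker can flag misuse before runtime."""
    if not isinstance(order_id, OrderId):
        raise TypeError(f"expected OrderId, got {type(order_id).__name__}")
    return f"order #{order_id.value}"
```

Here the stored value stays a plain number with no embedded meaning; the type lives in the program rather than in the identifier, so `UserId(7)` and `OrderId(7)` never compare equal even though they wrap the same integer.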

gary_0 about 1 year ago

I think you could extend this argument to include the philosophy that databases should be glorified key-value stores, and semantics should only be handled by application code.

I think a key point in the article is that "models become obsolete faster than we’d like".

On the other hand, you could argue that it's simply necessary to put in the effort to keep the data model up to date with your current needs, through API versioning, database migrations, etc.

Honestly I'm not sure which approach is less messy; maybe it depends on the team.

foobarkey about 1 year ago

I am currently using identifiers that encode the source and type in them. I mainly did it to make sure various backoffice IDs never overlap, but added the type since it is sometimes useful to identify the object type just by looking at the ID. This article is a bit discouraging, but I have also seen this work well in practice, so I guess we shall see how it works out. In any case, the problems mentioned are unicorn++ problems and I would be happy if we needed to tackle them :)

kmeisthax about 1 year ago

Any semantic meaning in identifiers means all your data is completely denormalized. 1NF specifically requires storing all data as irreducible columns[0]. Most 1NF violations aren't actually all that consequential beyond not being able to do useful JOINs on the data in the column, but doing it in a candidate key column is a bottomless pit of update hazards.

[0] This means no comma-separated lists in strings, JSON columns, serialized PHP objects, and so on.

hobs about 1 year ago

The correct answer is to give identifiers to people, give them meaning, and then not use those identifiers in any way except for that.

In the backend, use your own primary key and put a leading index on that other thing. It can even be shitty and long, but if you have the first 50 chars indexed it will be fast as hell to look up in 99.999% of cases.
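A small sketch of that split, using Python's built-in `sqlite3` module. The table and column names are hypothetical. SQLite has no prefix-length index of the kind the "first 50 chars" remark suggests, so this sketch substitutes a plain `UNIQUE` constraint, which SQLite backs with an index on the whole column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id        INTEGER PRIMARY KEY,   -- surrogate key, used for all internal references
        public_id TEXT NOT NULL UNIQUE   -- meaningful identifier shown to people (indexed)
    );
    CREATE TABLE order_line (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product(id),
        quantity   INTEGER NOT NULL
    );
""")

# People look things up by the meaningful identifier...
conn.execute("INSERT INTO product (public_id) VALUES (?)", ("electronics-iphone-32gb",))
product_pk = conn.execute(
    "SELECT id FROM product WHERE public_id = ?", ("electronics-iphone-32gb",)
).fetchone()[0]

# ...but every other table stores only the surrogate key.
conn.execute(
    "INSERT INTO order_line (product_id, quantity) VALUES (?, ?)", (product_pk, 2)
)

# Renaming the public identifier touches exactly one row.
conn.execute(
    "UPDATE product SET public_id = ? WHERE id = ?", ("phones-iphone-32gb", product_pk)
)

row = conn.execute(
    "SELECT p.public_id, o.quantity "
    "FROM order_line AS o JOIN product AS p ON p.id = o.product_id"
).fetchone()
```

After the rename, the existing order line still joins to the product under its new public name, because nothing outside the `product` table ever stored the meaningful identifier.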

mycall about 1 year ago

Reminds me of SQL 101.

Natural keys serve as a great primary key when contextual meaning is important. A surrogate key is a key which does not have any contextual or business meaning.

jandrewrogers about 1 year ago

Most of these issues can be avoided if you properly encapsulate the identifiers, so that you do not have arbitrary third parties trying to locally interpret meaning from the identifiers beyond identity. There are additional issues with collisions when identifiers with semantic structure are used between systems.

If you are going to add semantic structure to an identifier, which is frequently useful and a good idea, best practice is usually to encrypt it before sending it to the external world. Encrypting a UUID-like structure is approximately free on modern computers.
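The following Python sketch shows the shape of "structured inside, opaque outside". A 128-bit identifier fits exactly in one AES block, so a real system would most likely use AES from a vetted cryptography library. To stay within the standard library, this sketch substitutes a toy four-round Feistel permutation built on HMAC-SHA256; it is an illustration only, not a reviewed cipher, and every name and the field layout in it are assumptions.

```python
import hashlib
import hmac

HALF_BYTES = 8  # a 128-bit identifier is split into two 64-bit halves
ROUNDS = 4


def _round(key: bytes, round_no: int, half: bytes) -> bytes:
    """Keyed round function: HMAC-SHA256 truncated to one half-block."""
    digest = hmac.new(key, bytes([round_no]) + half, hashlib.sha256).digest()
    return digest[:HALF_BYTES]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_id(key: bytes, raw: bytes) -> bytes:
    """Map a 16-byte structured id to a 16-byte opaque token (toy Feistel network)."""
    if len(raw) != 2 * HALF_BYTES:
        raise ValueError("identifier must be exactly 16 bytes")
    left, right = raw[:HALF_BYTES], raw[HALF_BYTES:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_id(key: bytes, token: bytes) -> bytes:
    """Inverse of encrypt_id(): undo the rounds in reverse order."""
    if len(token) != 2 * HALF_BYTES:
        raise ValueError("token must be exactly 16 bytes")
    left, right = token[:HALF_BYTES], token[HALF_BYTES:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


# Hypothetical internal layout: 2-byte type tag, 2-byte region, 12-byte counter.
raw_id = (7).to_bytes(2, "big") + (3).to_bytes(2, "big") + (123456).to_bytes(12, "big")
secret = b"demo key, not for real use"
token = encrypt_id(secret, raw_id)
```

Internally the service can still read the type tag and region out of `decrypt_id(secret, token)`, while external parties see only a 16-byte value they cannot pick apart without the key.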

EGreg about 1 year ago

Is it just me, or is this just a special case of indirection? For example, virtual function pointers versus hardcoding method calls as jumps directly to some code. Pointers and indirection in general give you a place where you can update associations, at the cost of one extra lookup on every access! In databases, you often have many-to-many join tables, just in case. Those tables give you total flexibility later to change associations or even introduce new ones. So instead of having your identifiers point directly to the thing, simply have such a table for one more lookup in between.

At my company, we often ran into the same question over and over, namely, whether a convention should go one way or the other way. And in almost every case, we found it’s better to just make a general implementation with the options available to be supplied at runtime, or in a configuration file. In other words, don’t choose; implement a more general solution. That has become the policy in our company.
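The "one more lookup in between" idea can be sketched in a few lines of Python. The `IndirectionTable` class and the example host and application names are hypothetical.

```python
from typing import Dict


class IndirectionTable:
    """Maps stable, meaningless ids to whatever they currently refer to."""

    def __init__(self) -> None:
        self._targets: Dict[str, str] = {}

    def bind(self, identifier: str, target: str) -> None:
        # Re-binding is the whole point: the id stays put, the association moves.
        self._targets[identifier] = target

    def resolve(self, identifier: str) -> str:
        # The cost of the flexibility: one extra lookup on every access.
        return self._targets[identifier]


table = IndirectionTable()
table.bind("app-23", "host-12")
before = table.resolve("app-23")

# The application moves to another host; callers keep using the same id.
table.bind("app-23", "host-40")
after = table.resolve("app-23")
```

Callers that stored `"app-23"` never need to change, which is the same property a join table or a virtual function table provides.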

Joker_vD about 1 year ago

> Addresses make notable examples. The “complex and idiosyncratic” Japanese address system reflects the organic growth of its urban areas. In British postal codes the final part can designate anything from a street to a flat depending on the amount of mail received by the premises.

Those systems are actually useful, you know. I have a friend who used to live at a place with an address like "Northern Living Block, 157". There were about 300 buildings in total in that block, and the numbers were assigned to the buildings pretty randomly, so it was impossible to navigate unless you were either given explicit directions or had a map with you.

The routing info has to live somewhere, you know. Pushing it into the IDs means that you don't need to "have an entry in some database saying '1|INFRA|HOST|12' RUNS '1|APM|APPLICATION|23'", you don't have to update/delete it as needed, you don't need to look it up and deal with caching issues, etc.