Debian opens a can of username worms

249 points by jwilk 5 months ago

38 comments

dfranke 5 months ago

Allowing purely numeric usernames seems like a terrible idea to me, because it creates ambiguity between what's a username and what's a UID. It's common for tools like ls or ps to display a username when one is found and fall back to displaying a UID if it isn't, and similarly tools like chown will accept either a UID or a username and disambiguate based on whether it's numeric or not. Now suppose there's a numeric username that doesn't match its own UID, but does match some other user's UID. It doesn't take a lot of imagination to see how this would lead to vulnerabilities.
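
A minimal sketch of the ambiguity described above, assuming a made-up account table in which a user named "1001" has UID 2000 while UID 1001 belongs to alice. The two resolver functions are hypothetical stand-ins for the two policies a tool could follow; they do not reproduce the code of chown, ls or ps.

```python
# Hypothetical account table: username -> UID. The user named "1001" has
# UID 2000, while UID 1001 belongs to alice.
USERS = {"alice": 1001, "1001": 2000}


def resolve_name_first(spec: str) -> int:
    """Treat spec as a username if one exists, otherwise as a numeric UID."""
    if spec in USERS:
        return USERS[spec]
    return int(spec)


def resolve_numeric_first(spec: str) -> int:
    """Treat an all-digit spec as a UID, otherwise look it up as a username."""
    if spec.isdecimal():
        return int(spec)
    return USERS[spec]


def display(uid: int) -> str:
    """Show the username for a UID, falling back to the bare number."""
    for name, known_uid in USERS.items():
        if known_uid == uid:
            return name
    return str(uid)


# The same input resolves to two different accounts depending on the policy.
print(resolve_name_first("1001"), resolve_numeric_first("1001"))
# A listing for UID 2000 prints "1001", which reads like alice's UID.
print(display(2000))
```

Two tools that pick different policies would disagree about who "1001" is, which is the kind of mismatch the comment warns about.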

perlgeek 5 months ago

Just imagine how many poorly-written shell scripts will break when we suddenly allow dollars, quotes, backticks and the like in usernames. Heck, even allowing spaces sounds like horror to me.

On the display side, I'm sure most tools that display usernames won't make it easy to see if there are leading or trailing whitespace characters, double blanks, tabs etc. in usernames.

This sounds like support hell to me.
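
To illustrate the breakage, here is a small sketch that builds a shell command line two ways. The `chown` target `/srv/data` and both function names are made up for the example; `shlex.quote` and `shlex.split` are from the Python standard library.

```python
import shlex


def unsafe_command(username: str) -> str:
    # Naive interpolation: the shell re-parses whatever the name contains.
    return f"chown {username} /srv/data"


def safer_command(username: str) -> str:
    # shlex.quote makes a POSIX shell see the name as exactly one argument.
    # "--" ends option parsing in case the name starts with a dash.
    return f"chown -- {shlex.quote(username)} /srv/data"


name = "bob smith"
print(shlex.split(unsafe_command(name)))  # the name falls apart into two words
print(shlex.split(safer_command(name)))   # the name survives as one argument
print(repr("bob "))                       # repr() makes trailing whitespace visible
```

Quoting only helps in scripts that remember to do it, which is the commenter's point about poorly-written ones.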

account42 5 months ago

Always fun to see people poke the Unicode dragon only to be dumbstruck by its true size as it stands up in preparation of engulfing them with the fire of unintended consequences.

hiccuphippo 5 months ago

As someone who needs non-ASCII characters to write my name: please don't. You are making things worse just to be "courteous" about something we don't care about, and which will actually annoy us if we have to find out how to type a letter on the keyboard or, worst-case scenario, figure out how to change the layout to the correct one before we have even logged in.

dsr_ 5 months ago

I will remind everyone that there are a minimum of three identifiers here.

The UID, which is an integer. Ownership resides here; it's the primary key. Can be used by programs.

The username, each of which must be unique and maps to one UID -- but multiple usernames can map to the same UID. Used by humans and programs to log in.

The GECOS field, or "human readable name", which is only used as a display label. Some systems include a structure inside this for additional info like phone number, office number, or similar. I don't think anyone would object to UTF-8 here.
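
The three identifiers can be seen side by side in a passwd(5)-style record. This is a sketch only: the sample lines and the `PasswdEntry` and `parse_passwd_line` names are invented for illustration, and a real program would normally use the standard `pwd` module instead of parsing by hand.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    username: str  # what humans and programs type to log in
    uid: int       # the primary key; ownership is recorded against this
    gecos: str     # free-form display label


def parse_passwd_line(line: str) -> PasswdEntry:
    # passwd(5) layout: name:password:UID:GID:GECOS:directory:shell
    name, _pw, uid, _gid, gecos, _home, _shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, int(uid), gecos)


# Made-up sample entries: two usernames that map to the same UID.
SAMPLE = [
    "alice:x:1000:1000:Alice Example,Room 12,555-0100:/home/alice:/bin/bash",
    "alice-admin:x:1000:1000:Alice (admin login):/home/alice:/bin/bash",
]
entries = [parse_passwd_line(line) for line in SAMPLE]

for entry in entries:
    print(entry.uid, entry.username, entry.gecos.split(",")[0])
```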

seiferteric 5 months ago

OMG, can't believe this, I ran into this exact thing at my last job. We discovered a security vuln in several of our services because we were accepting unsanitized usernames. We were doing things with them (passing them to scripts etc.), but only after passing them to useradd/usermod etc., so we thought they were safe. Of course, you could put in things like ";", "&", ">" etc. and do whatever you wanted. I discovered that Debian DISABLED the username sanity checks and could not believe it. Anyway, I installed a patched version, sanitized the input, and did other stuff to resolve the issue.
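
One common mitigation for this class of bug is to avoid the shell entirely when handing a username to another program. The sketch below assumes a stand-in child process (a Python one-liner that prints its first argument) in place of the commenter's scripts, which are not shown in the thread; `echo_argument` is a hypothetical helper name.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    # Stand-in for "pass the username to a script": a child Python process
    # that prints its first argument. argv is given as a list, so no shell
    # is involved and ";", "&" and ">" reach the child as plain text.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


print(echo_argument("bob;id>&2"))
```

This does not replace input validation; it only removes one place where metacharacters get interpreted.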

zvr 5 months ago

Most people are too young to remember that when you typed your username in all-caps in the login prompt (because the CapsLock key was on by accident, for example), the login(8) program assumed you were in a connection that could only do 7-bit (upper case, but no lower case characters) and immediately switched the tty settings and you were then presented with a "\PASSWORD: " prompt.

rini17 5 months ago

Perhaps it's time to agree on how to do Unicode in identifiers? Normalization, unprintable characters, confusable characters with the same glyphs, etc. It's obviously problematic when everyone is doing it on their own.

jcarrano 5 months ago

I don't get it. What's the purpose of changing the default rule in shadow-utils? Not only is it completely unnecessary and a risk for shell injections, it also risks introducing incompatibilities between Debian and any other system.

I feel that there are already too many other things to fix to be wasting time on creating new potential bugs.

soneil 5 months ago

This reminds me of the systemd bug where usernames starting with a digit were mishandled (#15141).

It seems to me like something that "should" be relaxed, but we need to have high confidence in the entire food chain. adduser seems like the last place it should be changed, not the first: anyone requiring "enough rope" is already served by useradd.

mmsc 5 months ago

I wonder how this will affect ssh. OpenSSH recently restricted more characters for valid usernames: https://github.com/openssh/openssh-portable/commit/7ef3787c84b6b524501211b11a26c742f829af1a

huhtenberg 5 months ago

Sounds like a solution in search of a problem.

And a disruptive solution with unclear side effects at that.

miohtama 5 months ago

I remember useradd and adduser from when I was learning Linux, and oh boy, what a confusion it was... Why not just one command?

kej 5 months ago

I wonder if it would work to do something like the punycode system for internationalized domain names. Shell scripts could handle a name like `xn--0civ130n` just fine, and user-facing utilities could choose to convert that to :sparkle::unicorn: when appropriate. The same homograph protections would probably work, as well.
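
A rough sketch of that idea using the `punycode` codec that ships with Python. The function names are hypothetical, and this skips everything IDNA does beyond the raw encoding (normalization, case folding, homograph checks), so treat it as an illustration of the round trip only.

```python
def to_ascii_label(name: str) -> str:
    """Encode a name into an ASCII-only form that scripts can handle."""
    if name.isascii():
        return name  # nothing to encode
    # "xn--" is the prefix IDNA uses to mark a punycode-encoded label.
    return "xn--" + name.encode("punycode").decode("ascii")


def to_display(label: str) -> str:
    """Decode an encoded label back to Unicode for user-facing tools."""
    if not label.startswith("xn--"):
        return label
    return label[len("xn--"):].encode("ascii").decode("punycode")


stored = to_ascii_label("émollier")
print(stored, "->", to_display(stored))
```

The stored form stays within ASCII letters, digits and hyphens, so the tools that choke on arbitrary bytes never see them; only display code needs to know about the encoding.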

SuperSandro2000 5 months ago

They are clearly bored and want to start a year-long bug hunt through half of Unix.

thway15269037 5 months ago

Before opening this can of worms, can we finally address that there is a hard, hardcoded limit of 255 bytes per file name (folder name) in Linux? Yeah, 255 bytes, that is, like 63 Japanese characters or emojis, or maybe less. And it's in the kernel, too, so you physically cannot correct this issue by using another filesystem or something.

Before anyone asks: yes, these folders do occur in real life, and I am tired of pretending that they do not.
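
The arithmetic behind that limit can be checked directly. This sketch assumes UTF-8 file names; `NAME_MAX`, `fits` and `max_repeats` are names chosen for the example. Under UTF-8, a kanji such as 日 takes 3 bytes and an emoji such as 🦄 takes 4, so the character budget depends on the script.

```python
NAME_MAX = 255  # bytes per path component on Linux


def fits(name: str) -> bool:
    """True if the UTF-8 encoding of name fits in one path component."""
    return len(name.encode("utf-8")) <= NAME_MAX


def max_repeats(ch: str) -> int:
    """How many copies of ch fit within NAME_MAX bytes."""
    return NAME_MAX // len(ch.encode("utf-8"))


for sample in ("a", "日", "🦄"):
    print(sample, len(sample.encode("utf-8")), "bytes each, max", max_repeats(sample))
```

Emoji built from several code points (skin-tone or ZWJ sequences) use more than 4 bytes each, which is where "or maybe less" comes in.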

nmstoker 5 months ago

Unfortunately ambiguous uses of the word "drop" throughout the otherwise excellent article.

knorker 5 months ago

So in the future I may not be able to even type the name of another user? Admins and other users not being able to type usernames sounds very bad.

And I say that as someone whose native language has more letters than English.

hwc 5 months ago

My work machine uses my complete email address as a username (this was done by someone in the IT department). Vim gets confused when I use the `gf` command to open a path that contains an '@' character.

johnisgood 5 months ago

> If a keyboard input system provides the former sequence of bytes, but the username is stored in the login infrastructure using the latter sequence of [bytes], then a naive comparison will not find the user "émollier" in the system. Unicode defines in Annex 15 a few normalization forms as a way to work around this problem. But a correct use of these normalization forms still requires coordination and standardization among all programs accessing the data.

ICU could work, but it adds an extra dependency; there is also GNU's libunistring.
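
The mismatch in the quoted passage is easy to reproduce. This sketch uses Python's standard `unicodedata` module rather than ICU or libunistring, purely for illustration; `same_username` is a hypothetical helper, and choosing NFC over the other forms is an assumption of the example, not something the article prescribes.

```python
import unicodedata

composed = "\u00e9mollier"     # "é" as a single code point (U+00E9)
decomposed = "e\u0301mollier"  # "e" followed by a combining acute (U+0301)


def same_username(a: str, b: str) -> bool:
    # Normalize both sides to the same form (NFC here) before comparing.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)               # naive comparison
print(same_username(composed, decomposed))  # comparison after normalization
```

Both strings render identically, but the naive comparison fails; every program that stores or looks up usernames would have to agree on the same form for the second comparison to be reliable.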

chikere232 5 months ago

oh yes, let's break things to gain nothing of value

nineteen999 5 months ago

I have an affectionate place in my heart for Debian: the community is passionate and they have wonderful ideals. Hell, over 20 years ago I even helped found a charity, still running today, which distributes it to disadvantaged people on used PCs discarded by large companies. It was my favourite distro for a long time after I moved on from Slackware in the late 90s; I used it at home, and I used it in my job at a small ISP on everything from x86 to Sun SPARC to DEC Alpha hardware. We are lucky in the Linux community to have them. I couldn't care less about derivatives like Ubuntu, which seem to be one step too far removed.

But over the years the bikeshedding and some of the poor technical decisions started to wear on me. The debconf approach of asking a million questions on install bothered me. In my current job we use it on small industrial ARM PCs, and it does a great job there at large scale, distributed over a wide variety of environments and geographical areas: scorching heat, freezing cold and everything in between. But that's easy, because it's a single system image which we deploy to hundreds of devices, and it only requires minimal customisation to perform the required tasks.

But our datacenter servers remain RHEL for the simple reason that the deployment and broad per-server customisation process is easy, LDAP integration is straightforward, and the customer wants to pay for support from the vendor even though we never use it. Security updates and bugfixes are delivered quickly and the vendor's commitment to stability is commendable. It's a no-brainer. More and more companies started to move their workloads to RHEL once it came out, and unfortunately it just didn't make sense to bother with distributions outside of RHEL/Fedora for my personal use anymore; some sort of work/life balance is needed, and I don't want to spend my personal computing time remembering all the idiosyncrasies between different Linux distributions.

I would argue that Debian is pretty idiosyncratic and opinionated if you have come from more traditional UNIX systems in the 90s, while RHEL/Fedora more closely model an "evolution" of those classic systems, if you like. It will be interesting to see what happens to RHEL in the coming years as Red Hat becomes more and more absorbed into the IBM environment.

codedokode 5 months ago

Don't you think that it would be better to get rid of usernames in the UI? They only provide unique data for fingerprinting and do almost nothing useful on a single-user system. Wouldn't it be better to simply have a default name like "primary user" or "main user" for the first user and skip one step in the installation process? It would also free you from typing a username at login on a single-user system.

cratermoon 5 months ago

My take: user names are not strings, though they may be represented as strings. As such, a type, e.g. Username, would provide a constrained and consistent range of allowed values, much as a type like float32 allows (within IEEE 754 rules).

It's time for programmers to stop treating everything that can be represented by a string as anything representable by a string type.
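
One way to read that suggestion is a small value type that refuses to exist unless its contents are valid. The sketch below is an assumption-laden illustration: the `Username` class and the `_ALLOWED` pattern are invented here, and the pattern is a deliberately conservative ASCII rule chosen for the example, not the rule used by Debian or shadow-utils.

```python
import re
from dataclasses import dataclass

# Illustrative rule only: a lowercase ASCII letter or underscore first, then
# lowercase letters, digits, underscores or hyphens, 32 characters at most.
_ALLOWED = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


@dataclass(frozen=True)
class Username:
    value: str

    def __post_init__(self) -> None:
        # fullmatch rejects anything outside the allowed range, including
        # a trailing newline that a "$" anchor would let through.
        if not _ALLOWED.fullmatch(self.value):
            raise ValueError(f"not a valid username: {self.value!r}")

    def __str__(self) -> str:
        return self.value


print(Username("alice"))
try:
    Username("bob;>/hacked")
except ValueError as exc:
    print(exc)
```

Code that accepts a `Username` instead of a `str` can then rely on the constraint having been checked once, at construction.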

biglost 5 months ago

I'm the type of person who doesn't use spaces in files, usernames, directories or folders. I'm too old to keep fighting scripts; I would vote for ASCII only, with spaces forbidden almost everywhere. But the right thing is to use UTF-8 and hope for the best.

ipython 5 months ago

This sounds like a security nightmare just waiting to happen. Nothing like embedding gigantic libraries like libicu into security critical code bases so you can do things like Unicode normalization and comparison functions on usernames.

okasaki 5 months ago

Aren't pretty much all devices nowadays owned by a single person?

What's the use case for non-system usernames at all?

Why not just "user" and "root"?

jmclnx 5 months ago

Company I work at moved to an ID like [A-Z]Employee-number. Moot point for them :

seu 5 months ago

The fact that this whole discussion happens in English partially explains why there is a discussion at all. The whole problem could have been avoided if the development of computers had been a more international effort.

rurban 5 months ago

They are so stupid, I cannot believe it!

Names are identifiers, and as such need to stay identifiable. There exist Unicode security guidelines and rules for identifiers that they don't know about. My libu8ident library would help with that.

quectophoton 5 months ago

chown is getting "fun", I guess.

UniverseHacker 5 months ago

Clearly we should open up usernames to be an unlimited size set of mixed data types: e.g. the first “character” could be a hand drawn picture of a cat, the second the entire text of the US constitution in unicode, and so on. We could then extend this flexibility to filenames, passwords, and Unix commands. Internally, this could involve replacing all text strings with folders on a filesystem where you can put any files you want in any desired order. /s

bjourne 5 months ago

Honestly, it is super brain-dead that Linux and other operating systems still have such massive problems with "special" characters. Just the other day I had to help someone who had trouble building. The cause turned out to be that they had dropped filenames with parentheses in the source directory which, apparently, confused bash which make relies on. Such trash is everywhere on Linux systems. Eventually you learn to only use [a-zA-Z0-9-_.] in names because anything else will inevitably confuse some tool or another (even capital letters can be a PITA)... I so wish someone would take it upon themselves to clean up this mess, but it's probably too much work and too many who are nay-sayers conditioned to it who don't see the need for changes.

abigail95 5 months ago

if you cannot handle UTF-8 anywhere anything approaching text could be, your program is malformed and should be deprecated and removed.

if you wrote code that couldn't handle bob;>/hacked in a username, you would and should be laughed at.

why are we using this ancient stuff?

IshKebab 5 months ago

> Most Debian users don't work with useradd, or groupadd, directly. Instead, Debian has long supplied its own adduser (and addgroup) utilities, originally written by founder Ian Murdock. These act as simpler front ends to useradd

One of the dumbest things Debian has done.

resource_waste 5 months ago

This is important because the Debian family is used on many servers?

Debian seems to just squander resources on things a few powerful people care about.

All my servers have been Debian-based, so I can't be too hard on them, but whenever I see someone recommend a Debian-family distro as a desktop OS, I feel like I need to call the police.

card_zero 5 months ago

> naming things is one of the hard things to do in computer science

I've been thinking about that a lot lately. Code is text, it's arranged linearly, code has to be readable, identifiers are thus short strings that try to express short essays about the purpose of the variable or whatever it is, and then ideally there's a longer version of the essay in a comment, but not too long because that would clutter up the code as well (because it's text, arranged linearly). And we have code folding to tidy them up, for what good it does, and ideally an even longer version of the essay in documentation, except nobody writes that.

What if it wasn't text, and wasn't linear, and we didn't have an expectation that code should be strings of stupid over-terse names and hieroglyphic symbols? So I was thinking vaguely about investigating graphic-based programming, but it's probably worse, IDK. It could automatically assign arbitrary icons* instead of identifiers, and you could write tooltip-like comments to describe them as and when you want to, and everything could be laid out nicely with diagrams and different pages instead of like a text file. I suppose this is all merely cosmetic? The insistence on code being written as strings of text feels very primitive, is all. It causes this problem.

* Which doesn't solve the problem, I admit, because now you have to remember what the icons mean, but maybe that's easier?

tiahura 5 months ago

When you think about all the time, money and effort that have been wasted on Unicode...
