Guys! The point of this article is not to prescribe the only method of displaying human-readable file sizes. Obviously one could use `ls -lh`; the author clearly demonstrates that he is willing and able to read man pages to find answers.<p>Rather, this is a <i>pretty interesting</i> look into what it actually entails to make what ought to be a very simple and straightforward change.<p>It turns out that these simple changes are hard! Not just because the piece of code to modify is hard to identify, but because man pages are often incomplete or unclear. It also illustrates the complexities behind making software portable - in this case, using the nation-neutral place separator. It also reminds us that solving what is on the surface a simple problem lets one uncover all sorts of interesting and messy details underneath - including more problems to solve!<p>These are steps that he'd have to take no matter what the code or feature. This article is not "complexity for complexity's sake", it's illustrating the complexity of making changes to any piece of code - and that it is surprisingly difficult for something that one would think is very easy!
I enable this for GNU ls like:<p><pre><code> alias ls="BLOCK_SIZE=\'1 ls --color=auto"
</code></pre>
The above is a bit hacky and not very UNIXy as it's
lumping more logic into ls, rather than splitting out
into functional units.<p>Number formatting being a very
common requirement, I've proposed a design for a new
numfmt GNU coreutil<p><a href="http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html" rel="nofollow">http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085...</a><p>which would be used like:<p><pre><code> ls -l | numfmt --field=5 --format="%'d"</code></pre>
Mr. Lehey managed to improve the system in such a way that it will make subsequent changes easier for him and others, regardless of whether the specific change to `ls` is ever adopted. It's “five whys” applied to “why is this hard” and “how can I make it easier”. It's more effort, with a greater chance that much of it will survive the current context and requirements.<p>Some people improve the area they travel through, others leave debris, and many are noops who make no difference to those who come after. If there aren't enough entropy fighters like Mr. Lehey working a system, it turns to kipple.
In case anyone wonders, I recently looked into how locales work with respect to LANG, LC_ALL and LC_*:
<a href="http://c.i3wm.org/6799926" rel="nofollow">http://c.i3wm.org/6799926</a><p>By the way, by looking at <a href="http://www.lemis.com/grog/index.php" rel="nofollow">http://www.lemis.com/grog/index.php</a> you can see that the author uses FreeBSD, just in case you were wondering about /usr/src
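<p>For reference, the lookup order POSIX describes for a single locale category is: LC_ALL first, then the category's own LC_* variable, then LANG. Below is a rough shell sketch of that order. The function name effective_locale is made up for illustration, and the final fallback to "C" is an assumption (POSIX leaves the default implementation-defined).

```shell
# effective_locale CATEGORY
# Print the locale name that would apply to CATEGORY (e.g. LC_NUMERIC),
# following the POSIX precedence: LC_ALL > LC_<category> > LANG.
# Hypothetical helper for illustration, not part of any standard tool.
effective_locale() {
    category=$1
    # Indirectly read the variable whose name is stored in $category.
    # (eval is acceptable here only because the caller controls the name.)
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumption: fall back to "C"; the real default is
        # implementation-defined.
        printf 'C\n'
    fi
}
```

So with LANG=en_US.UTF-8 and LC_NUMERIC=de_DE.UTF-8 set and LC_ALL empty, effective_locale LC_NUMERIC prints de_DE.UTF-8, which is the category that decides the thousands separator.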
Most annoying is that gcc warns about perfectly valid and logical code. That causes people to ignore warnings, and before you know it, you have a piece of software that has more warnings than lines of code.<p>Alternatively, when you cleverly figure out how to work around the warning, like the author does, you now prevent that rule from triggering even when it's right. Clearly a better unit test is needed.
Incidentally, I completed ls's set of -a-z options recently.<p><a href="http://joeyh.name/~joey/blog/entry/ls:_the_missing_options/" rel="nofollow">http://joeyh.name/~joey/blog/entry/ls:_the_missing_options/</a><p>(Well, actually, I never got around to writing -z, but it's clear what it should do, and any ls hackers are encouraged to finish that up.)
gobble.wa@gmail.com made a similar post to the freebsd-questions mailing list a month ago. In his case the question was how to print an md5sum along with the file names in a given directory. I saved it because I thought it was a clever hack.<p><a href="http://lists.freebsd.org/pipermail/freebsd-questions/2012-September/244933.html" rel="nofollow">http://lists.freebsd.org/pipermail/freebsd-questions/2012-Se...</a><p>A lot of times I catch myself in the mindset of taking a step back and saying "here is the set of tools I have at hand to accomplish a task" without realizing that I should simultaneously be taking a step "in"--so to speak--and acknowledging that the tools I have to work with are not immutable tools cast of iron; they are malleable and can be re-tooled to suit my purposes... and that sometimes going that route can be the simplest--and in fact "best"--solution.
Here's a wrapper I wrote for ls a while ago that allows you to spell "--color" as "--colour":<p><a href="http://ubuntuforums.org/showthread.php?t=684239" rel="nofollow">http://ubuntuforums.org/showthread.php?t=684239</a>
Or, you could use 'ls -h'...<p>(that said, I do see the utility, since it gives a more obvious visual cue as to the order of size differences... but if you're doing anything with the sizes programmatically, you have to remove the commas afterwards... Short version: if you're going to do this, make it a unique flag, or a new flag modifier to the -l flag... don't overload the -l flag without recourse...)
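<p>To make the "remove the commas afterwards" point concrete, here is a minimal sketch of the extra step a script would need before doing arithmetic on grouped sizes. The name sum_grouped_sizes is invented for this example, and it assumes the separator is a comma (other locales use other characters) and that the input is one size per line.

```shell
# sum_grouped_sizes: read one size per line on stdin, optionally with
# comma thousands separators, and print the total as a plain integer.
# Hypothetical helper; assumes "," is the grouping character.
sum_grouped_sizes() {
    # tr strips every comma; awk then sums the first field of each line.
    # %.0f avoids integer-range limits that some awk versions put on %d.
    tr -d ',' | awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Note that this is only safe on a column that holds nothing but sizes. Running tr over whole ls lines would also eat commas in file names.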
FYI, there is no need to change GNU ls to get that behavior. You can make it use your locale's separator with either the --block-size="'1" option or by setting the LS_BLOCK_SIZE envvar to that same string:<p><pre><code> $ LC_ALL=en_US.UTF8 ls -og --block-size="'1" .
-rw-------. 1 5,145,416 Oct 5 16:44 A
-rw-------. 1 5,137,692 Oct 4 14:37 B
-rw-------. 1 5,147,168 Oct 8 07:52 C
</code></pre>
This feature is documented in the "Block size" section of the coreutils manual: i.e., you can type this to see it:<p><pre><code> info coreutils 'block size'</code></pre>
Now let's consider the software lifecycle in a larger context: longevity of forks.<p>If he doesn't send the changes off to upstream, and make a case good enough for them to be approved, then all this dooms him to maintaining his fork on all the platforms where he wants it until he gets sick of it or convinces someone else to do it for him.
Man, such fragile stuff. Why not code a function yourself that turns a number into its decimal string with a comma every three digits? I normally like and use good library functions and standards, but if they're that fragile and depend on your environment then no thanks.
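<p>A hand-rolled version of that idea might look like the sketch below. It is written in POSIX shell for brevity rather than in C like the article's patch, the name group_digits is made up, and the comma is hard-coded, so it deliberately ignores the locale (which is exactly the trade-off being argued about).

```shell
# group_digits N
# Print the integer N with a comma between every group of three digits,
# counted from the right. Hypothetical helper; the separator is fixed.
group_digits() {
    n=$1
    sign=
    # Peel off a leading minus sign so it is not counted as a digit.
    case $n in
        -*) sign=-; n=${n#-} ;;
    esac
    out=
    # Repeatedly move the last three digits from n to the front of out.
    while [ "${#n}" -gt 3 ]; do
        rest=${n%???}              # everything except the last 3 chars
        out=",${n#"$rest"}$out"    # the last 3 chars, comma-prefixed
        n=$rest
    done
    printf '%s%s%s\n' "$sign" "$n" "$out"
}
```

For example, group_digits 5145416 prints 5,145,416. The function does no input validation, so it assumes its argument really is a decimal integer.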
For such large numbers, would it make more sense to use groups of 6 instead of 3? This would allow you to easily identify the megabyte position with the next separator at the terabyte position.
Surely the appropriate option character for this new, human-readable output is "-h".<p>Makes you wonder whether anyone ever considered the problem before...
Is this front-page-worthy material? I mean, it's good you took the time to add a ', but doing the same with sed would have been faster.<p>Alternatively, do you know about ls -lhrS? It prints sizes in human-readable format and reverse-sorts the files by size, i.e. the biggest files end up at the end of the list.
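<p>For what it's worth, the sed route could be sketched roughly like this, using the well-known looping substitution. The wrapper name add_commas is invented here, and the comma is hard-coded rather than taken from the locale.

```shell
# add_commas: filter stdin, inserting a comma before each trailing group
# of three digits in every run of four or more digits.
# Hypothetical wrapper around a classic sed idiom.
add_commas() {
    # :a defines a label; the s command inserts one comma per pass;
    # ta jumps back to the label for as long as a substitution was made.
    sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}
```

Used as ls -l | add_commas. The catch is that it rewrites every long digit run on the line, not just the size column, so a four-digit year in the date column or digits in a file name get commas too.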