I wrote something similar (but never really finished it), called 'gut', in Go a few years back. Funny thing is, I literally never use it. I thought splitting on regexes and that sort of thing would be super useful, but it turns out I just use Perl one-liners instead. And Perl is available on something like 99.99% of all *nix machines, which my own 'cut' substitute isn't.<p>Still a good exercise for me to write it, and I assume for OP too.
Love seeing these modern alternatives to coreutils! Ripgrep, fd, hyperfine, bat, exa, bottom, gdu, wc, sd, hexyl...<p>Yet to find a GNU 'tr' alternative though
Nice work!<p>I don't know whether anyone here has used Rexx. The 'parse' instruction in Rexx was incredibly powerful, breaking up text by field/position/delimiter and assigning to variables all in one line.<p>I've often wondered if there was a command-line equivalent. Awk is great but you have to 'program' the parsing spec, rather than declare it.
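For anyone who hasn't seen it, Rexx's parse matches a declarative template against a line (Rexx syntax from memory, treat it as approximate), whereas in awk you program the split and the assignments yourself:

```shell
# Rexx (declarative template):   parse var line name ':' value
# awk equivalent (imperative parsing spec):
printf 'user: alice\n' | awk -F': ' '{ name=$1; value=$2; print name "=" value }'
```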
It is interesting to note how it compares to "choose" (also in Rust) in the benchmarks.<p>single character<p><pre><code> hck 1.494 ± 0.026s
hck (no-mmap) 1.735 ± 0.004s
choose 4.597 ± 0.016s
</code></pre>
multi character<p><pre><code> hck 2.127 ± 0.004s
hck (no-mmap) 2.467 ± 0.012s
choose 3.266 ± 0.011s
</code></pre>
The single-pass optimization trick[1] seems to be helping a lot in the single-character case.<p>Of course, doing away with a pass should give at most 2x, and I am wondering whether the regex constraint led to this "side-effect".<p>[1] fast mode - <a href="https://github.com/sstadick/hck/blob/master/src/lib/core.rs#L194" rel="nofollow">https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...</a>
<a href="https://github.com/sstadick/hck/blob/master/src/lib/core.rs#L324-L331" rel="nofollow">https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...</a>
I came across `hck` recently on twitter, and was impressed to see support for compressed files. From the current todo list, I especially hope complement gets implemented.<p>I see negative index is currently "unlikely". I'm writing a similar tool [0], but with bash+awk. I solved negative index support with a `-n` option, which changes the range syntax to `:` instead of the `-` character.<p>My biggest trouble came with the literal field separator [1], because FS can only be specified as a string in awk and backslash is a metacharacter in both strings and regexps.<p>[0] <a href="https://github.com/learnbyexample/regexp-cut" rel="nofollow">https://github.com/learnbyexample/regexp-cut</a><p>[1] <a href="https://learnbyexample.github.io/escaping-madness-awk-literal-field-separator/" rel="nofollow">https://learnbyexample.github.io/escaping-madness-awk-litera...</a>
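A minimal example of the escaping problem (my own, not from the linked article): a multi-character FS in awk is a regex, and a backslash has to survive both string and regex processing, so a literal dot takes two backslashes:

```shell
# FS "a.b" is a regex: the dot also matches the "x" in "axb"
printf '1a.b2axb3\n' | awk -F'a.b' '{print NF}'     # 3 fields

# doubling the backslash gets a literal dot through string *and* regex parsing
printf '1a.b2axb3\n' | awk -F'a\\.b' '{print NF}'   # 2 fields
```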
<offtopic>
I have implemented a `_split` command that splits a line on a separator, and a `_stat` command that does basically `sort | uniq -c | sort -nr`, counting elements and sorting by frequency. Really useful operations for me.<p>When my one-liners grow to 2-3 lines I switch to a regular script, but I also log all my shell commands going back years and have something a bit better than `history | grep word` to search them.</>
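A minimal sketch of such a `_stat` helper (the name is the commenter's; the body is just the pipeline they describe):

```shell
# count occurrences of each input line, most frequent first
_stat() { sort | uniq -c | sort -nr; }

printf 'a\nb\na\na\nb\nc\n' | _stat
```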
Nice one op. It’s mostly due to my lack of knowledge of rust, but the code is not easy to read, unlike golang. Does anyone feel the same? (btw, nothing to do with how op wrote it, but rather the language itself)