TXR – A Programming Language for Convenient Data Munging

118 points by joshumax about 6 years ago

13 comments

kazinator about 6 years ago
Author here. Currently working on a debugger. (Threw the old crappy one out.) Backtraces are working. Some of the remaining work is going to require long, uninterrupted concentration that is hard to come by due to taking care of a six-month-old baby.

I have over 50 unreleased patches. There are some bugfixes, including a compiler one, involving dynamically scoped variables used as optional parameters:

    (defvar v)
    (defun f (: (v v)))
    (call (compile 'f)) ;; blows up in virtual machine with "frame level mismatch"

Patch for that:

    diff --git a/share/txr/stdlib/compiler.tl b/share/txr/stdlib/compiler.tl
    index e76849db..ccdbee83 100644
    --- a/share/txr/stdlib/compiler.tl
    +++ b/share/txr/stdlib/compiler.tl
    @@ -868,7 +868,7 @@
     ,*(whenlet ((spec-sub [find have-sym specials : cdr]))
     (set specials [remq have-sym specials cdr])
     ^((bindv ,have-bind.loc ,me.(get-dreg (car spec-sub))))))))))
    - (benv (if specials (new env up nenv co me) nenv))
    + (benv (if need-dframe (new env up nenv co me) nenv))
     (btreg me.(alloc-treg))
     (bfrag me.(comp-progn btreg benv body))
     (boreg (if env.(out-of-scope bfrag.oreg) btreg bfrag.oreg))

There is now support in the printer for limiting the depth and length.

I added a derived hook into the OOP system; a struct being notified that it is being inherited.
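For readers unfamiliar with printer limits: the idea is that deeply nested or very long objects are printed with the excess elided. The following is a minimal Python sketch of that general idea, not TXR's implementation; the function name `limited_repr` and its parameters `max_depth` and `max_length` are hypothetical names chosen for illustration.

```python
def limited_repr(obj, max_depth=2, max_length=3, _depth=0):
    """Render nested lists, eliding anything past the depth or length limit.

    Hypothetical helper for illustration only; it handles plain Python
    lists and falls back to repr() for everything else.
    """
    if not isinstance(obj, list):
        return repr(obj)
    if _depth >= max_depth:
        # Too deep: show a placeholder instead of descending further.
        return "[...]"
    shown = [
        limited_repr(item, max_depth, max_length, _depth + 1)
        for item in obj[:max_length]
    ]
    if len(obj) > max_length:
        # Too long: mark the truncated tail.
        shown.append("...")
    return "[" + ", ".join(shown) + "]"


print(limited_repr([1, [2, [3, [4]]], 5, 6, 7]))  # prints [1, [2, [...]], 5, ...]
```

Python's standard library offers a comparable facility in the `reprlib` module.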
otoburb about 6 years ago
> TXR Lisp programs are shorter and clearer than those written in some mainstream languages "du jour" like Python, Ruby, Clojure, Javascript or Racket. If you find that this isn't the case, the TXR project wants to hear from you; give a shout to the mailing list. If a program is significantly clearer and shorter in another language, that is considered a bug in TXR.

That section made me chuckle. Admirable if true.
notafraudster about 6 years ago
This seemed interesting, but when I went through the "Accepted Stack Overflow" links on the main page, I thought "how would I do this in an R tidyverse stack?" and set the goal that my responses should be shorter, clearer, or ideally both, favouring clearer answers over code golf. One caveat: when posting to HN I collapse the code into a single line, whereas in R there would be linebreaks at each semicolon or after each pipe operator (%>%). Here are three examples:

"Customized sort based on multiple columns of CSV". In R, something like this: `library(tidyverse); read_delim("file.tsv", delim = "@") %>% arrange(.[[2]]) %>% group_by(.[[2]]) %>% arrange(match(.[[3]], c("arch.", "var.", "ver.", "anci.", "fam.")), .[[3]]) %>% group_by(.[[2]], .[[3]]) %>% mutate(n = n()) %>% arrange(desc(n)) %>% ungroup() %>% select(1:4)`

"Extract text from HTML table". In R, something like this would suffice: `library(rvest); library(tidyverse); read_html(URL_GOES_HERE) %>% html_nodes("div.scoreTableArea") %>% html_table() %>% write_delim("out.csv", delim = "\t")`

"Get n-th Field of Each Create Referring to Another File". In R: `library(tidyverse); file1 = read_delim("file1.txt", delim = " ", col_names = FALSE); chunks = readChar("file2.txt", 999999) %>% str_split(";") %>% unlist() %>% map(function(x) { matches = str_match(str_trim(x), '^create table "(.*)"([^(]*)\\(((.|\n)*)\\)$'); title = matches[, 2]; fields = matches[, 4] %>% str_split(",") %>% unlist() %>% str_trim(); return(tibble(table_name = rep(title, length(fields)), n = 1:length(fields), field = fields)) }) %>% bind_rows(); file1 %>% left_join(chunks, by = c("X1" = "table_name", "X2" = "n"))`

The third example trades off a little clarity for a little robustness by using a regex instead of assuming the SQL table definition is one field per line.
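For comparison with the third example, here is a rough Python sketch of the same approach: split the SQL text on semicolons, pull the table name and field list out of each `create table` statement with a regex, then look up the n-th field for each requested (table, n) pair. The function names `parse_tables` and `lookup_fields` and the sample data are hypothetical, and the input format is assumed from the comment rather than taken from the original Stack Overflow question. Like the R version, it splits fields on commas, so a type such as `numeric(10,2)` would be split incorrectly.

```python
import re

# Table name in double quotes, then the parenthesised field list.
# DOTALL lets the field list span several lines.
CREATE_RE = re.compile(r'^create table "(.*?)"[^(]*\((.*)\)$', re.DOTALL)


def parse_tables(sql_text):
    """Map each table name to its ordered list of field definitions."""
    tables = {}
    for chunk in sql_text.split(";"):
        match = CREATE_RE.match(chunk.strip())
        if match is None:
            continue  # Skip empty chunks and non-matching statements.
        name, body = match.groups()
        tables[name] = [field.strip() for field in body.split(",")]
    return tables


def lookup_fields(requests, tables):
    """For each (table_name, n) pair, return the n-th field (1-based) or None."""
    results = []
    for name, n in requests:
        fields = tables.get(name, [])
        field = fields[n - 1] if 1 <= n <= len(fields) else None
        results.append((name, n, field))
    return results


# Made-up sample input, for illustration only.
sql = (
    'create table "users" (\n  id int,\n  name varchar(20)\n);\n'
    'create table "orders" (order_id int, user_id int, total int);\n'
)
for row in lookup_fields([("users", 2), ("orders", 3)], parse_tables(sql)):
    print(row)
```

Keeping the parse step separate from the lookup step mirrors the R pipeline, where `chunks` is built first and then joined against `file1`.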
anentropic about 6 years ago
> The PDF rendition of the reference manual, which takes the form of a large Unix man page, is over 600 pages long, with no index or table of contents. There are many ways to solve a given data processing problem with TXR.

"Good luck, you're on your own!"
js8 about 6 years ago
It would be interesting to have a DSL for data munging, but I am afraid TXR is not it. My requirements would be that the language should be functional and total.

Most transformations that we do on data do not require Turing completeness or recursion. I think it would be useful to write these down in a language with semantics that are easy to analyze.
cstross about 6 years ago
From where I'm standing this looks like someone put a *lot* of effort into re-inventing Perl, minus the documentation and user community.
usgroup about 6 years ago
I'm ashamed to say I had never heard of this before. Could anyone add any colour re:

1. Parsimony.
2. Performance vs awk and friends.
3. Multi-threading.
4. Ideal use cases.
uptownfunk about 6 years ago
We already have this: it is R with tidyverse. What we need is a fully baked transpiler from R/tidyverse to SQL.
mcguire about 6 years ago
Confusingly, there's another language called TXL (https://en.wikipedia.org/wiki/TXL_(programming_language)) that's both obscure and neat.
theon144 about 6 years ago
Well, this looks great, but I'm not about to start digesting the self-admitted 600-page tome just to see if it's worth learning for the tasks I encounter - surely there's a "tutorial" somewhere?
mark_l_watson about 6 years ago
Interesting Lispy language. Off topic, but I find the domain name nongnu.org to be amusing for a GNU/FSF web site. "nongnu" to me reads as "not gnu".
jdmoreira about 6 years ago
Very interesting. I'm wondering why they didn't implement the Lisp version on top of CL with macros.
vcdimension about 6 years ago
Has anyone run any benchmarks of TXR against awk, R, Python, or Miller?