This article is somewhat puzzling for me. On one hand, the OP clearly knows Clojure very well. The disadvantages of laziness are real and well described.<p>On the other hand, though, this sounds like a theoretical/academic article to me. I've been using Clojure for 15 years now, 8 of those developing and maintaining a large complex SaaS app. I've also used Clojure for data science, working with large datasets. The disadvantages described in the article bothered me in the first 2 years or so, and never afterwards.<p>Laziness does not bother me, because I very rarely pass lazy sequences around. The key here is to use transducers: that lets you write composable and reusable transformations that do not care about the kind of sequence they work with. Using transducers also forces you to explicitly realize the entire resulting sequence (note that this does not imply that you will realize the entire <i>source</i> sequence!), thus limiting the scope of lazy sequences and avoiding a whole set of potential pitfalls (with dynamic binding, for example), and providing fantastic performance.<p>I do like laziness, because when I need it, it's there. And when you need it, you are <i>really</i> happy that it's there.<p>In other words, it's something I don't think much about anymore, and it doesn't inconvenience me in any noticeable way. That's why I find the article puzzling.
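To make the transducer point concrete, here is a minimal sketch (toy data, not from the article):<p><pre><code>```clojure
;; A transducer describes a transformation without committing to a
;; particular source or sink:
(def xform (comp (map inc) (filter even?)))

;; `into` realizes the whole result eagerly -- no lazy seq escapes:
(into [] xform (range 10))
;; => [2 4 6 8 10]

;; The same xform is reusable against any reducible source:
(transduce xform + 0 (range 10))
;; => 30
```</code></pre>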
As I posted on Reddit:<p>It might also be good to mention Injest<p><a href="https://github.com/johnmn3/injest">https://github.com/johnmn3/injest</a><p>Which makes transducers more ergonomic to use if you are like me and use threading macros everywhere<p>Would be curious to hear how others feel about it
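For anyone unfamiliar, the ergonomic gap injest targets is roughly this (plain core Clojure for comparison; a toy pipeline, not taken from the library's docs):<p><pre><code>```clojure
;; idiomatic thread-last: each step builds an intermediate lazy seq
(->> (range 100)
     (map inc)
     (filter odd?)
     (reduce +))
;; => 2500

;; transducer version: same logic, no intermediate seqs, but a
;; noticeably different shape from the threading style
(transduce (comp (map inc) (filter odd?)) + (range 100))
;; => 2500
```</code></pre>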
It’s too bad that transducers were created long after Clojure’s inception. Can you always replace a lazy seq with a transducer? Could the language theoretically be redesigned to replace all default usages of lazy seqs with transducers, even if it were a major breaking change? And have lazy operations be very explicit?
Not sure if the "transducer" approach suggested as a workaround makes your life easier or further adds to the mental overhead. See <a href="https://www.astrecipes.net/blog/2016/11/24/transducers-how-to/" rel="nofollow noreferrer">https://www.astrecipes.net/blog/2016/11/24/transducers-how-t...</a> for some example transducers.
I've been circling around Lisp for a couple of years. I'm starting in a month, and I'll spend several hours a day on it. I still don't know which language I want to learn.<p>I was drawn to Clojure because it looked like a Lisp for getting stuff done. But a few things put me off, and this article puts me off more. I want to get the semantics down before I have to think about what's going on under the hood.
The main issue is that the Clojure compiler doesn't really optimize lazy sequences, right? Most language compilers do; Rust's lazy iterators, for example, often outperform explicit for loops.<p>And Clojure also doesn't give an error/warning when lazy sequences aren't realized.
I've been using the trick of enforcing realization by serializing to a string a few times. Slow, but quite useful in many contexts. However, instead of using `(with-out-str (pr ...`, there's simply `pr-str`, which is easier to remember.<p>I'm typically using it like so:<p><pre><code> ;; pr-str walks the whole value, forcing realization;
 ;; doto returns v itself and discards the string
 (defn realize [v] (doto v pr-str))
(binding [*some* binding]
(realize (f some-nested-lazy-seq)))</code></pre>
F# is similar in that it supports lazy sequences but is mostly eager otherwise, and often handles errors using exceptions. One does have to be careful, but the benefits far outweigh the risks in my experience.
I think the main issue with lazy sequences is understanding and controlling their scope. Transducers, particularly when the result is realized eagerly with `into`, can encapsulate laziness very neatly; you can still opt into laziness explicitly (via `sequence`), and the OP's own numbers show the clear performance advantage. I think the article would be more effective if it shifted tone to "Clojure laziness best practices" rather than damning the idea wholesale. There be dragons, for sure.
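A small sketch of the scoping point: eager `into` keeps laziness contained, while `sequence` deliberately lets it escape (toy example):<p><pre><code>```clojure
(def xf (comp (map inc) (take 3)))

;; `into` realizes everything inside this expression -- safe even on
;; an infinite source, and no laziness leaks out:
(into [] xf (range))
;; => [1 2 3]

;; `sequence` is the explicitly lazy counterpart; the caller now
;; holds a lazy seq and inherits the usual caveats:
(take 2 (sequence xf (range)))
;; => (1 2)
```</code></pre>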
TXR Lisp also fails this test:<p><pre><code> 1> (len
(with-stream (s (open-file "/usr/share/dict/words"))
(get-lines s)))
** error reading #<file-stream /usr/share/dict/words b7ad7270>: file closed
** during evaluation of form (len (let ((s (open-file "/usr/share/dict/words")))
(unwind-protect
(get-lines s)
(close-stream s))))
** ... an expansion of (len (with-stream
(s (open-file "/usr/share/dict/words"))
(get-lines s)))
** which is located at expr-1:1
</code></pre>
The built-in solution is that when you create a lazy list which reads lines from a stream, that lazy list takes care of closing the stream when it is done.<p>If the lazy list isn't processed to the end, then the stream semantically leaks; it has to be cleaned up by the garbage collector when the lazy list becomes unreachable.<p>We can see with strace that the stream is closed:<p><pre><code> $ strace txr -p '(flow "/usr/share/dict/words" open-file get-lines len)'
[...]read(3, "d\nwrapper\nwrapper's\nwrappers\nwra"..., 4096) = 4096
read(3, "zigzags\nzilch\nzilch's\nzillion\nzi"..., 4096) = 826
read(3, "", 4096) = 0
close(3) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
write(1, "102305\n", 7102305
) = 7
exit_group(0) = ?
+++ exited with 0 +++
</code></pre>
It is possible to address the error issue with reference counting. Suppose that we define a stream with a reference count, such that it has to be closed that many times before the underlying file descriptor is closed.<p>I programmed a proof of concept of this today. (I ran into a small issue in the language run-time that I fixed; the close-stream function calls the underlying method and then caches the result, preventing the solution from working.)<p><pre><code> (defstruct refcount-close stream-wrap
stream
(count 1)
(:method close (me throw-on-error-p)
(put-line `close called on @me`)
(when (plusp me.count)
(if (zerop (dec me.count))
(close-stream me.stream throw-on-error-p)))))
(flow
(with-stream (s (make-struct-delegate-stream
(new refcount-close
count 2
stream (open-file "/usr/share/dict/words"))))
(get-lines s))
len
prinl)
</code></pre>
With my small fix in stream.c (already merged, going into Version 292), the output is:<p><pre><code> $ ./txr lazy2.tl
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7aecee0> count 2)
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7aecee0> count 1)
102305
</code></pre>
One close comes from the with-stream macro, the other from the lazy list hitting EOF when its length is being calculated.<p>Without the fix, I don't get the second call; the code works, but the descriptor isn't closed:<p><pre><code> $ txr lazy2.tl
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7b70f10> count 2)
102305
</code></pre>
In the former we see the call to close in strace; in the latter we don't.