HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Rich Hickey - Inside Transducers

ClojureTV · Youtube · 112 HN points · 0 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention ClojureTV's video "Rich Hickey - Inside Transducers".
Youtube Summary
Rich Hickey, the author of Clojure and designer of Datomic, is a software designer with over 25 years of experience in various domains. Rich has worked on scheduling systems, broadcast automation, audio analysis and finger printing, database design, yield management, exit poll systems, and machine listenings, in a variety of languages. Rich is the CTO of Cognitect, Inc.
HN Theater Rankings

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Nov 21, 2014 · 112 points, 49 comments · submitted by tsm
ultimape
For those not familiar with Clojure, here's a great demonstration of the concept done up in JavaScript: http://jlongster.com/Transducers.js--A-JavaScript-Library-fo...
kbeaty
There is also Transducers Explained [1] if you are familiar with JavaScript.

(Shameless plug in the hope it may be helpful to someone)

[1]: http://simplectic.com/blog/2014/transducers-explained-1/

nilliams
For JavaScripters, there's also the 'Like Underscore, but lazier' Lazy.js http://danieltao.com/lazy.js/
jonahx
From that link:

"The reduce function is the base transformation; any other transformation can be expressed in terms of it (map, filter, etc)."

This seems so obvious in retrospect -- I can't believe I had never made that connection before.

ultimape
I had a similar epiphany while learning about regular expressions in Perl. That connection with text processing is what helped me understand list comprehensions. They seem strikingly similar.

For JS, I personally prefer lo-dash for this kind of work, or dropping in polyfills from MDN.

For some context, here are .map functions applied to normal arrays http://jsperf.com/native-vs-array-js-vs-underscore/54

I've been looking for similar libraries that work on typed-arrays because they are so much more efficient when working with web-workers or with raw canvas data. My attempts at hacking it in feel like they are just bad ideas: http://jsperf.com/float32array-map/2

grayrest
Most people use underscore/lodash for this stuff. The difference is that the js transducers libraries don't create intermediate arrays, only do enough work to produce the requested output, and work on top of anything that can be coerced into the iteration protocol. I've seen demos of using Facebook's Immutable JS, CSP.js [1], and I don't see why you couldn't put them on top of Typed Arrays or a FRP library like Kefir.

[1] http://jlongster.com/s/nationjs-slides/

ultimape
Yeah, the overhead of creating intermediate typed arrays is exactly what I'm trying to avoid with this.
jdd
Lo-Dash 3.0 will have support for lazy evaluation in its chaining syntax that supports shortcut fusion and avoids intermediate arrays as well.
ultimape
Do you know if they work with typed arrays?
tincholio
Graham Hutton has a very nice tutorial [0] on the universality of fold, which shows this elegantly (in Haskell, though).

[0] Graham Hutton, "A tutorial on the universality and expressiveness of fold", J. Functional Programming 9(4): 355–372, July 1999. http://www.cs.nott.ac.uk/~gmh/fold.pdf

ajuc
For me the "ephiphany" was when I realized reduce don't need to be (T,T)->T, but can be (T1, T2) -> T1.
dkersten
They can also be (T1,T2)->T3
ajuc
Not really?

When you reduce with function f(T1, T2) -> T3 the result of previous iteration becomes first argument for the next iteration, so they must be of the same "type".

Or am I missing something?

dkersten
Ah, no, you're right.

I was thinking that mapping can be f(T1)->T2 and how reducing can change types too, but I guess I wasn't paying enough attention because of course the reduce signature is f(accumulator, input) and the output is accumulator, so yea, you're absolutely right: f(T1,T2)->T1

cousin_it
Yeah. That's actually how you implement lists in lambda calculus, as opaque functions that accept a "visitor". There are two different ways of doing it:

    -- Mogensen-Scott encoding
    data ScottList a = ScottList (forall r. (a -> ScottList a -> r) -> r -> r)

    -- Boehm-Berarducci encoding
    data ChurchList a = ChurchList (forall r. (a -> r -> r) -> r -> r)
Roughly, the first encoding gives you pattern matching, and the second gives you foldr (reduce). Either of these operations is sufficient to do anything with the list.

Also note that both of these are encodings of lazy (potentially infinite) lists. To encode strict (guaranteed finite) lists, you really need algebraic data types like in ML, the visitor pattern can't do that.

ultimape
Here's another video by Rick Hickey about transducers: https://www.youtube.com/watch?v=6mTbuzafcII

- it was referenced in this great post detailing the transducer notion with Clojure, Scala, and Haskell: http://blog.podsnap.com/ducers2.html

charlysisto
So Transducers generalize the usage of Enumerable functions such as map, reduce, filter, flatMap (...).

Please could someone tell me if this is conceptually different from ruby's Enumerable module which only needs the class its included in to implement `each` so anything can be enumerable ? Or is it a similar but just in translated to the FP world ?

drostie
Transducers are a different kind of thing, not just a generalized usage of enumerable functions.

Clojure transducers are isomorphic to functions of type `a -> [b]`, that is, functions which take some thing and return lists of other things. You can implement map and filter in this language:

    tmap f a = [f a]
    tfilter p a = if p a then [a] else []
You can also compose two such functions:

    concat :: [[x]] -> [x]
    concat [] = []
    concat (x : xs) = x ++ concat xs
    
    concatMap f list = concat (map f list)
    
    compose lb_a lc_b = \a -> concatMap lc_b (lb_a a)
One reduction function (`foldr`) turns out to just be the identity function [a] -> [a], as I recall.

The interesting thing about Clojure transformers is that they have a strange continuation-passing form, such that this function `compose` that I wrote above for the type `a -> [b]` is actually in Clojure just a function composition. That is, instead of `a -> [b]` we see `forall r. (b -> r -> r) -> (a -> r -> r)`, which composes with normal function composition.

moomin
What's you're thinking of is more like ISeq in Clojure, or Foldable is Haskell. The more interesting generalisation is not between arrays, vectors and hashsets, but between data structures and event streams.
YuriNiyazov
Yes; There are differences.

1. Transducers are parallel under the covers. Since the expectations is that the code that the predicate or mapping functions that you pass into map, filter and reduce are pure (no variables are changed, no state is modified, just a calculation that is only dependent on arguments) the parallelism is hidden away from you, but it's there. Ruby's Enumerable can't do parallelism

2. When you compose transducers, there are no intermediate sequences generated. The simplest example is this:

In Ruby: [1, 2, 3].map {|x| x + 1}.map {|x| x+ 2} will generate an intermediate array [2, 3, 4] after the first map, and then will generate [4, 5, 6] when it has evaluated the whole expression.

In Clojure (I am sure I got the syntax wrong for this one) (transduce (map #(+ 2 %) (map inc)) [1 2 3]) will create an intermediate mapping function that will first increment, then add 2 to its argument, and then will map once using that intermediate function.

charlysisto
Thanks for the answer. Indeed Ruby in general has the limitations inherent to its "non functional" nature. However regarding 2) Enumeration can now (in ruby 2.0) be lazy and thus take each item one after the other through the whole chain.

(1..Float::INFINITY).lazy.map{|a| p a; a*10}.map{|a| p a+1} # => 1 11 2 21 etc...

So it can work with streams & so on... But I know (from a Jessica Kerr presentation) it also has limitations there.

cfeduke
Transducers _can be_ parallelized (TODO) but some transducer implementations do contain state, like `take` and therefore cannot be parallelized.

The big thing is no intermediate results.

nickik
The big thing is not even that, the big thing is that you can seperate your logic from the data that you want to process. So instead of (map inc [1 2 3 4]) you can do (def inc-transducer (map inc)) and then use that for any datastructures, channels or whatever other transducer context people come up with. To be sure, you would not do that with (map inc) but if you have more logic, you could.
seanmcdirmid
So transducers are like .Net/LINQ enumerables?
Skinney
Nah, Clojure seqs are more like .NET/LINQ enumerables.

Functions that work over seqs/enumerables return a new seq/enumerable. From what I understand, and I'm sure I am wrong, a transducer receives and transforms a value, and may call the next transducer with the transformed value.

The nice thing is that a transducer does not create intermediate results (a seq/enumerable), and that it doesn't make any assumptions on the underlying datastructure. So you can, for instance, send a value through a transducer and directly place it in a list, instead of creating a set of enumerables and then call .ToList on that. Since transducers can work with any value, you also aren't tied to enumerables. So transducers should be a little more universal and performant than enumerables.

icefox
A contrived example in C#/LINQ

  Enumerable.Range(1, 100)
	  .Where(n => n % 2 == 0)
	  .Select(n => n * 2)
          .ToList();
And as a transducer it could looks something like this?

  sequence(
    Enumerable.Range(1, 100),
    compose(
      filter(n => n %2 == 0),
      map(x => x + 1)
    )
  ).ToList();
On the linq side it would create only 2 enumerable objects (one for where and select) and the ToList would result in a copy the object pointer as it was passing through each enumerable. For the transducer example rather than having 2 enumerable's we would have 2 transducer objects?

I absolutely understand the case where a language's map/filter/reduce/etc results in a whole new list, but for the above case the performance/memory overhead improvement doesn't see that huge and really minor. The fact that it removes the overhead of having to support IEnumerable and as you mentioned is only dealing with functions is what seems like a win. At first I was thinking that plain old LINQ is also more readable, but my contrived c# transducer example doesn't look that bad in the end. Am I incorrect on any of this?

ajanuary
With Linq you can't really create an object/function that describes the transformation that should happen. I start with an enumerable, describe how to transform /that/ enumerable, and in the end have something that describes a transformation applied to a particular initial data. I can't do anything with it but laziy read the transformed values for the original input.

With transducers I can describe a transformation that should happen to /some/ reducable input, and in the end have something that represents just the transformation. I. An then apply that to any reducable input I have.

[edit]

Your example becomes:

  var incrementedEvens = compose(
      filter(n => n %2 == 0),
      map(x => x + 1)
  );
  transduce(Enumerable.range(0, 100), incrementEvens);

  var lowerAlph = compose(
    filter(x => Character.isAlpha(x)),
    map(x => x.ToLowercase())
  );
  System.in.setTransducer(lowerAlpha);
You can really do that in Linq.
YuriNiyazov
If you watch the first Rich Hickey Transducers talk, he shows that some transducers (the ones that rely in the underlying reduce implementation that uses fork/join and assumes that reducing the collection is associative) are already parallelized. I agree with you regarding `take`
cfeduke
IIRC from yesterday's talk he was explaining that these will be easy to parallelize, and its slated for 1.7. I had thought they were already parallel myself until I saw the talk; and possibly some of them are, but the item is still Open.

http://dev.clojure.org/jira/browse/CLJ-1553

YuriNiyazov
I was just talking about the "reduce fold" being already parallel.
nickik
Again a more technical talk. I really liked it. Specially the end bit about pipelines, that seams like it would be really usful.

Transducers overall are quite intressting, I am exited to see what other transducer context people come up with.

moomin
It's not that clear there will be one. Transducers have three steps: begin, during and end. In practice, start is only used for reduce operations and outside of reduce operations end can only be used for its side effects.

There's also definitional issues e.g. can you have an async transducer?

cfeduke
Yes you can have async and blocking transducers. Forward to the point in the video about channels for a discussion.
moomin
But if you use an async transducer on a channel, it fails. (Actual tested code, sadly.)

Which might be okay, but my basic point that it's definitionally fuzzy still holds.

kbeaty
You can indeed use transducers in asynchronous contexts. In fact, that is what intrigues me most about the abstraction.

Transducers are defined such that you can abstract the context of input, output and iteration and focus on the transformation of each element from an independent source individually. The implementation of each transducer accepts another transformation in a "pipeline" (eventually the output sink) and accepts an input during an external iteration process. The implementation decides what to do with it (map changes with a function, filter ignores certain values with a predicate, etc.) You can also define transducers that send multiple values for each input (think string.split) or some that do not send any until completion (think buffered results).

Since the iteration source and sink are abstracted from the transformation, you can use the same transformation in other contexts (Promises, event streams, IO streams, CSP, etc.)

I've been experimenting with transducers in asynchronous contexts in JavaScript if anyone is interested, for example:

[1]: http://simplectic.com/projects/underscore-transducer/ [2]: http://simplectic.com/projects/underarm/ [3]: https://github.com/transduce/transduce-async [4]: https://github.com/kevinbeaty/transduce-stream [5]: https://github.com/transduce/transduce-push [6]: https://github.com/transduce/transduce-then

sitkack
These aren't materially different from itertools in Python.
kushti
For those who know Russian, here's awesome critics of "transreducers" http://thedeemon.livejournal.com/87320.html
vanderZwan
Could you try to give a summary for those who don't?
juskrey
The summary: "Hickey did not invent FP, how dares he?"
nickik
People keep attacking him whenever he interduces something. Why? Every time he releases something, he also shows from what research it originated. Go back to the first transducer talk and you will see the references.
lomnakkus
Personally, I don't mind him too much even though I'm a typeful-programming weenie. That said, he does have a tendency to somewhat unfoundedly say "... and you couldn't do this in a typed language" (or at least allude to it)... whereas people again and again show that, yes, it could be done in a statically typed language. In fact, this is trivially true in a sense as shown by Bob Harper[1] another person who is undoubtedly hugely influential, but who I have trouble with because of his obvious bias and trollish ways.

(I think Rich is absolutely spot on on many of the more abstract things about or industry/craft, but this is just one of those details that irks me.)

[1] http://existentialtype.wordpress.com/2011/03/19/dynamic-lang...

ajanuary
In the strangeloop talk his argument seemed to be that you couldn't use types to describe the entire and exact contract of transducers. Which is true (and true of almost anything), but you can at least describe a significant enough chunk of it to reduce errors.

As always it's a trade off. As you try to capture more of the contract, you massively increase complexity, sometimes to the point of reducing usability (i.e. staring at an obtuse type error, or trying to work out how to do basic things with a function just from a Scala type)

It's my interpretation that "you can't type transducers" is an invitation to think more deeply about types and the limitations in what they can express, mixed in with some "get off my back about creating a dynamically typed language".

seanmcdirmid
Isn't the problem with transducer typing is that transducers aren't very functional? In one of your posts you wrote:

   System.in.setTransducer(lowerAlpha);
In the LINQ/functional origami world, you could only produce new streams, not transform an existing stream in place. Because a new stream is computed, a new type can be computed. However, from what I've seen with transducers, they would be limited to basically Func<T, Predicate<T>, T> signatures.
lomnakkus
That's part of it, but of course you can embed monadic types/effects within the appropriate combinators. (I think "tel" showed an example of this in the original "transducers" post)
Blackthorn
Even if you forget entirely about Clojure for a second, Rich has this very rare gift to take a complicated subject and make it easy for the audience to understand.
lomnakkus
This is a double-edged sword. Forgive me a small anecdote: I once had a wonderful tutor who could explain any advanced concept in highly intuitive terms and it just made sense... until I got out of the classroom. At which point I'd just have this feeling of "Hang on, whaaaaaa...?". The first time I attempted the exam in this particular subject matter, I failed miserably (rightly so). Having learned my lesson, I went back to study the actual source material more thoroughly instead of mostly just listening to the probably-best-educator on the subject. That time I actually understood the material and passed with flying colours. (I still think the particular educator had a major role in me passing at all, but I digress.)

That that for what you will, but please be aware that a "gift for simplification" sometimes is a double-edged sword and can leave the audience thinking that they've understood when, actually, they haven't.

I'm not saying that's the case here, just something to be wary of.

agumonkey
I had similar experiences many times, and I don't know what's best, looking at an abstraction for weeks until you get the "ohh they just meant this" or knowing in advance it's not that obscure and then work to imprint said abstraction deeper.
lomnakkus
Ever since that experience, I've tried to somehow try to practice working with "X" in some concrete way, even though it might feel like a classic "math problem" situation. Personally, I find it helps me get a grasp on "X", whatever it might be. (Mind you, this is just personal experience so YMMV.)
brudgers
The difference here is that Rich Hickey is explaining a fundamental computing mechanism that many programmers have come in with but few think about enough to give a name: the finite state transducer [see Wikipedia].

The name "transducers" is not an invention, it's basic computer science. The idea of using transducers to abstract over an input is not really controversial in the sense that it's the essence of what turns source into executable code on digital computers. Sure, Ther are tradeoffs relative to which flavor of Turing machine underpins any abstraction, but at least this is an area where we can achieve clarity.

This is something Hickey implemented, not invented. Like all of automata, understanding the concept is likely non-trivial for some pretty smart people since it requires thinking about first principles of computing.

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.
~ yaj@
;laksdfhjdhksalkfj more things
yahnd.com ~ Privacy Policy ~
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.