HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
Purely Functional Data Structures

Chris Okasaki · 2 HN points · 23 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "Purely Functional Data Structures" by Chris Okasaki.
Amazon Summary
Most books on data structures assume an imperative language such as C or C++. However, data structures for these languages do not always translate well to functional languages such as Standard ML, Haskell, or Scheme. This book describes data structures from the point of view of functional languages, with examples, and presents design techniques that allow programmers to develop their own functional data structures. The author includes both classical data structures, such as red-black trees and binomial queues, and a host of new data structures developed exclusively for functional languages. All source code is given in Standard ML and Haskell, and most of the programs are easily adaptable to other functional languages. This handy reference for professional programmers working with functional languages can also be used as a tutorial or for self-study.
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
I see. So, let's say I have a tree with an arbitrary number of nodes branching into further nodes and so on. (My 'state')

Can I now make a single call to some API-function that gives me an almost identical (immutable) tree as a result except that some leaf somewhere has a modified value, or where some node has been deleted?

And, would this internally happen without much copying?

Technically the answer to all your questions is "yes", but the way you've worded them makes me think just saying "yes" would be a bit misleading to you. So let's take a real-world example. I'll use pseudo-Scala syntax since that's fairly similar to a lot of imperative languages.

Let's construct a tree data structure from scratch and see how to update it.

  sealed abstract class BinaryTreeOfInts
  final class Leaf(val value: Int) extends BinaryTreeOfInts
  final class Node(val value: Int, val left: BinaryTreeOfInts, val right: BinaryTreeOfInts) extends BinaryTreeOfInts
All fields here are by reference.

So because everything is final here, we've constructed an immutable tree. You cannot update any of these values once it's been constructed.

Let's see how you might construct the following binary tree:

     5
    / \
   6   1
      / \
     2   3

  myTree = Node(5,
    Leaf(6),
    Node(1, Leaf(2), Leaf(3))
  )
I want to modify 5 now to be 10

  newTree = Node(10, myTree.left, myTree.right)
Done. This is effectively one operation. newTree and myTree are both still immutable, but there's a ton of sharing going on. Now this is the best case scenario. Modifying a leaf is the worst case scenario, but even that requires only as many operations as there are levels in your tree, which, as long as your tree is fairly balanced, grows only logarithmically in the number of elements you have.

To illustrate let's make another change, this time changing 3 to 30.

  anotherTree = Node(5,
    myTree.left,
    Node(1,
      myTree.right.left,
      Leaf(30)
    )
  )
Note I've had to instantiate a new Node for each level of the tree, and I've made a new Leaf, but otherwise everything else is shared. Now this is the low-level way of performing these changes. There are various APIs and tricks to make this look a lot cleaner in code so writing this out isn't as tedious, but I figured I'd present the low-level way to make things less magical.
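The same path-copying idea can be sketched self-containedly in Python (a toy illustration under the assumptions above, not code from the book; the `Leaf`/`Node` names are hypothetical):

```python
from typing import NamedTuple, Union

class Leaf(NamedTuple):
    value: int

class Node(NamedTuple):
    value: int
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]

# The tree from the comment above: 5 at the root, 6 on the left,
# and 1 on the right with leaves 2 and 3.
my_tree = Node(5, Leaf(6), Node(1, Leaf(2), Leaf(3)))

# "Modify" the root: one new Node, both subtrees shared by reference.
new_tree = Node(10, my_tree.left, my_tree.right)

# "Modify" leaf 3 -> 30: copy only the nodes on the path to it.
another_tree = Node(my_tree.value, my_tree.left,
                    Node(my_tree.right.value, my_tree.right.left, Leaf(30)))

assert new_tree.left is my_tree.left        # subtree shared, not copied
assert another_tree.left is my_tree.left    # untouched branch shared
assert my_tree.right.right.value == 3       # the original tree is unchanged
```

The `is` checks make the sharing visible: only the nodes on the path from the root to the change are freshly allocated.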

BTW, the logarithmic time factor to update a leaf is why you will often see logarithmic factors in time complexities of various operations on immutable data structures.

> BTW, the logarithmic time factor to update a leaf is why you will often see logarithmic factors in time complexities of various operations on immutable data structures.

A mutable tree requires the same logarithmic time factor to update a leaf, because it also requires logarithmic time to access the leaf. It seems like the real difference between the mutable and the immutable structure is that the mutable structure requires O(1) space to do the update, while the immutable one requires O(log n) space in addition to the common requirement of O(log n) time.

Sort of. Structural sharing generally actually results in amortized constant space operations if you don't need the copies. In this case, if you aren't actually using the other tree, the nodes could be GC-ed. In the most ideal case they could be GC-ed as you're constructing the new nodes, though I imagine this essentially never happens (hence amortized). But regardless you're right that the thing to focus on is space rather than time when looking at how it would be different in an immutable vs mutable context.

Also my comment was incomplete. What I should've added is that most immutable data structures are trees under the hood in some form or another, hence many operations have a logarithmic factor. Mutable data structures provide more options when you want constant time access.

A great resource for understanding how persistent data structures are implemented is Okasaki’s Purely Functional Data Structures:

Also available as a free PDF:

> A data structure cannot be functional.

I think the author is using the term "purely functional data structure" to mean a data structure that lends itself to an implementation in a purely functional language [1,2].



I feel like I already said "I understand but I disagree".

A round wheel is more useful for a car than a triangular wheel. That doesn't mean it's a "car wheel". It's just as good on a horse wagon or a bike.

You can disagree if you want. I just don't want other readers to be misled into thinking that "purely functional data structure" isn't a term of art. Given the number of references to Okasaki's book in this thread, the interested reader is sufficiently armed to learn more, so I don't feel the need to continue this discussion.

You may not find the definition useful, and that's alright. I won't try to convince you.

Yup. Purely Functional Data Structures (the thesis as linked; there is also a book) is awesome. I especially found some of the tree-based recursive algorithms eye-opening (once you spend the time to wrap your head around them).

IMHO required reading if you're doing any heavy FP work.

And there is a UChicago course that follows parts of this book and implements them in the Elm language.

The canonical text discussing how this is achieved refers to data structures optimized for immutability as "purely functional data structures". [1] These types of data structures are the default for Clojure as well, and there are a number of blog posts and presentations discussing their implementation. [2] [3]

It's a reasonably complicated topic, but the basic idea is that since the data structures are immutable, much of their structure can be shared between versions with few changes. Most of these data structures end up using a Tree of some sort. Performance characteristics can be influenced to an extent by using bucketing to control the width/height of the tree.
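As a rough illustration of the bucketing idea (a hypothetical two-level, 32-way trie, deliberately simplified; Clojure's actual implementation adds tails, depth growth, and more):

```python
BITS = 5
WIDTH = 1 << BITS  # 32 children per node

def update(trie, index, value):
    """Return a new two-level trie with `index` set to `value`.
    Only the root node and one child node are copied; the other
    31 children are shared with the old trie."""
    child_i = index >> BITS        # which child node holds the slot
    slot = index & (WIDTH - 1)     # position within that child
    new_child = list(trie[child_i])
    new_child[slot] = value
    new_root = list(trie)
    new_root[child_i] = new_child
    return new_root

old = [[0] * WIDTH for _ in range(WIDTH)]  # 32 * 32 = 1024 slots
new = update(old, 37, 99)

assert new[37 >> BITS][37 & (WIDTH - 1)] == 99
assert old[1][5] == 0      # the original version is untouched
assert new[0] is old[0]    # untouched children are shared, not copied
```

Widening the nodes from 2 to 32 children is exactly the "bucketing" mentioned above: the tree gets much shallower, so an update copies only a handful of small arrays.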

[1] Purely Functional Data Structures [2] [3]

Going back to the basics to solidify my foundation, one subject each quarter. Good practice makes one a better engineer!

Digital Electronics using [1]
Operating Systems using [2]
Functional Data Structures using [3]
Graphics Algorithms using [4]

Any recommendations for these subjects sincerely appreciated. Thanks.

[1] [2] [3] [4]

"The more you practice, the more you can, the more you want to, the more you enjoy it, the less it tires you." ― Robert A. Heinlein, The Cat Who Walks Through Walls

Operating Systems basics are well covered in this course.
Thanks, this looks, on a first skim, like a very detailed course. And while we are at it, there is also a post on the HN front page which contains more resources for learning about operating systems.
> Digital Electronics using [1] Operating Systems using ...

also, in case you are not aware of it, there is always the nand2tetris [] thingy (currently running on coursera btw). the book is also pretty good imho.

Thanks for posting the link. I just signed up for the course. Always wanted to learn how simple logic gates end up becoming all-purpose CPUs. I've always thought that someday we'll have the same concepts in a cell which becomes a full Turing machine, and anyone can grow it.
thanks signa11 for the nand2tetris reminder. I have worked through that book and it is really awesome. Worth the time and effort for anyone inclined. I had posted my review on Amazon as well. [1]

I think I should enroll for the Coursera thingy and have at least 1 certificate in my kitty ;-)


> I have worked through that book and it is really awesome. Worth the time and effort for anyone inclined.

very cool :)

in case you want something more, i have _very_ fond memories of zvi kohavi's book (switching and finite automata theory) as well. you might find it useful/instructive.

Depending on your level of programming ability, one algorithm a day, IMHO, is completely doable. A number of comments and suggestions say that one per day is an unrealistic goal (yes, maybe it is) but the idea of setting a goal and working through a list of algorithms is very reasonable.

If you are just learning programming, plan on taking your time with the algorithms but practice coding every day. Find a fun project to attempt that is within your level of skill.

If you are a strong programmer in one language, find a book of algorithms using that language (some of the suggestions here in these comments are excellent). I list some of the books I like at the end of this comment.

If you are an experienced programmer, one algorithm per day is roughly doable. Especially so, because you are trying to learn one algorithm per day, not produce working, production level code for each algorithm each day.

Some algorithms are really families of algorithms and can take more than a day of study; hash-based lookup tables come to mind. First there are the hash functions themselves. That would be day one. Next there are several alternatives for storing entries in the hash table, e.g. open addressing vs chaining, days two and three. Then there are methods for handling collisions, linear probing, secondary hashing, etc.; that's day four. Finally there are important variations, perfect hashing, cuckoo hashing, robin hood hashing, and so forth; maybe another 5 days. Some languages are less appropriate for playing around and can make working with algorithms more difficult; instead of a couple of weeks this could easily take twice as long. After learning other methods of implementing fast lookups, it's time to come back to hashing and understand when it's appropriate, when alternatives are better, and how to combine methods into more sophisticated lookup schemes.
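For instance, the "chaining" variant mentioned above fits in a few lines (a toy sketch for study purposes, not production code; `ChainedHashTable` is a made-up name):

```python
class ChainedHashTable:
    """Toy hash table resolving collisions by chaining:
    each bucket holds a list of (key, value) pairs."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # otherwise chain a new pair

    def get(self, key, default=None):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for k, v in bucket:
            if k == key:
                return v
        return default

t = ChainedHashTable(n_buckets=2)  # a tiny table forces collisions
for k in ("a", "b", "c"):
    t.put(k, k.upper())
assert t.get("b") == "B"
t.put("b", "beta")
assert t.get("b") == "beta"
assert t.get("missing") is None
```

Swapping the bucket lists for probing within one flat array gives the open-addressing variant, which makes a nice "day three" exercise.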

I think you will be best served by modifying your goal a bit and saying that you will work on learning about algorithms every day and cover all of the material in a typical undergraduate course on the subject. It really is a fun branch of Computer Science.

A great starting point is Sedgewick's book/course, Algorithms [1]. For more depth and theory try [2], Cormen and Leiserson's excellent Introduction to Algorithms. Alternatively the theory is also covered by another book by Sedgewick, An Introduction to the Analysis of Algorithms [3]. A classic reference that goes far beyond these other books is of course Knuth [4], suitable for serious students of Computer Science less so as a book of recipes.

After these basics, there are books useful for special circumstances. If your goal is to be broadly and deeply familiar with Algorithms you will need to cover quite a bit of additional material.

Numerical methods -- Numerical Recipes 3rd Edition: The Art of Scientific Computing by Teukolsky and Vetterling. I love this book. [5]

Randomized algorithms -- Randomized Algorithms by Motwani and Raghavan. [6], Probability and Computing: Randomized Algorithms and Probabilistic Analysis by Michael Mitzenmacher, [7]

Hard problems (like NP) -- Approximation Algorithms by Vazirani [8]. How to Solve It: Modern Heuristics by Michalewicz and Fogel. [9]

Data structures -- Advanced Data Structures by Brass. [10]

Functional programming -- Pearls of Functional Algorithm Design by Bird [11] and Purely Functional Data Structures by Okasaki [12].

Bit twiddling -- Hacker's Delight by Warren [13].

Distributed and parallel programming -- this material gets very hard so perhaps Distributed Algorithms by Lynch [14].

Machine learning and AI related algorithms -- Bishop's Pattern Recognition and Machine Learning [15] and Norvig's Artificial Intelligence: A Modern Approach [16]

These books will cover most of what a Ph.D. in CS might be expected to understand about algorithms. It will take years of study to work through all of them. After that, you will be reading about algorithms in journal publications (ACM and IEEE memberships are useful). For example, a recent, practical, and important development in hashing methods is called cuckoo hashing, and I don't believe that it appears in any of the books I've listed.

[1] Sedgewick, Algorithms, 2015.

[2] Cormen, et al., Introduction to Algorithms, 2009.

[3] Sedgewick, An Introduction to the Analysis of Algorithms, 2013.

[4] Knuth, The Art of Computer Programming, 2011.

[5] Teukolsky and Vetterling, Numerical Recipes 3rd Edition: The Art of Scientific Computing, 2007.

[6] Motwani and Raghavan, Randomized Algorithms.

[7] Mitzenmacher, Probability and Computing: Randomized Algorithms and Probabilistic Analysis.

[8] Vazirani, Approximation Algorithms.

[9] Michalewicz and Fogel, How to Solve It: Modern Heuristics.

[10] Brass, Advanced Data Structures.

[11] Bird, Pearls of Functional Algorithm Design.

[12] Okasaki, Purely Functional Data Structures.

[13] Warren, Hacker's Delight.

[14] Lynch, Distributed Algorithms.

[15] Bishop, Pattern Recognition and Machine Learning.

[16] Norvig, Artificial Intelligence: A Modern Approach.

Well, when you pass a variable around, it doesn't and cannot change. This means that different threads can't get in each other's way anymore, for instance, but also that a whole class of mistakes can't be made at all.

They also have serious disadvantages: they can't be memory-managed in the traditional way (since they tend to reuse other instances' memory in complex ways), and thus require a GC (refcounting can work, but ...). They are VERY allocation-intensive, and their constant factors are worse than those of most non-persistent data structures. Assuming an O(1) allocator they can match non-persistent data structures in O-ness (i.e. when making an invalid assumption that is quite popular in academia. In practice memory allocation is O(1) for small values, then O(n^2) once you get close to the system's memory capacity (scanning for holes in a long list) but don't go over it, and then O(oh fuck it let's just reboot this bloody BOAT ANCHOR) when crossing that line).

Clojure is famous for having good persistent data structures. Rich Hickey went touring academia touting the benefits of immutable/persistent/functional data structures:

There's also a famous book:

> Well, when you pass a variable around, it doesn't and cannot change

Yeah but how's that different than a const?

It's not, but it still has update methods. It's a const with update methods. Example for a map:

  x = map{"five": 5}
  y = x.put("six", 6)

Now x is map{"five": 5} and y is map{"five": 5, "six": 6}. If any other thread was using x, it hasn't changed.
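A naive sketch of such a "const with update methods" in Python (the `PMap` name and full-copy `put` are my own illustration; real implementations share structure via a tree instead of copying everything):

```python
from types import MappingProxyType

class PMap:
    """Toy persistent map: put() returns a NEW map and the
    original is never modified. (A real implementation would
    use a trie for structural sharing rather than a full copy.)"""

    def __init__(self, items=()):
        self._d = MappingProxyType(dict(items))  # read-only view

    def put(self, key, value):
        return PMap({**self._d, key: value})     # build a fresh map

    def __getitem__(self, key):
        return self._d[key]

    def __contains__(self, key):
        return key in self._d

x = PMap({"five": 5})
y = x.put("six", 6)
assert "six" in y and "six" not in x  # x is unchanged
assert y["five"] == 5                 # y also sees the old entries
```

Any thread holding x can keep reading it safely; y is simply a different value.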

Generally you build the new list using the tail of the old list (which is unchanged), your inserted value, and each of the values in front inserted onto that. So, immutability is preserved. Any existing references will not see their values changed. And the new references will share some memory with the old ones (how much depends on where in the list the value was inserted).
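In a cons-list that sharing is easy to see (a minimal Python sketch using `(head, tail)` pairs; `cons` and `insert_at` are illustrative names, not from any library):

```python
def cons(head, tail):
    """A list cell: (head, rest-of-list); None is the empty list."""
    return (head, tail)

def insert_at(lst, i, value):
    """Return a new list with `value` inserted at position i.
    The first i cells are rebuilt; everything after is shared."""
    if i == 0:
        return cons(value, lst)
    head, tail = lst
    return cons(head, insert_at(tail, i - 1, value))

old = cons(1, cons(2, cons(3, None)))  # the list [1, 2, 3]
new = insert_at(old, 1, 99)            # the list [1, 99, 2, 3]

assert old == (1, (2, (3, None)))       # original untouched
assert new == (1, (99, (2, (3, None))))
assert new[1][1] is old[1]              # the tail [2, 3] is shared
```

Inserting near the front is cheap; inserting near the end rebuilds most of the cells, which matches the "how much depends on where the value was inserted" point above.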

You can learn more in Okasaki's great book:

This book is more or less the definitive book on functional persistent data structures. It's more about how to design and reason about them than a collection of recipes.

I'm curious if anyone who has read (or is familiar with the data structures described in) Purely Functional Data Structures[0] can weigh in on how this persistent vector compares to, say, the real-time deque, which is described to have worst-case runtime of O(1) for cons/head/tail/snoc/last/init (along with plenty of other structures which have similar amortized time guarantees, etc.). Do these structures not have practical performance characteristics, or is there another reason an O(log n) solution was chosen? Like the other poster mentioned here, it's a bit upsetting, having read something that rigorously analyzes worst-case behavior, to have an article hand-wave away log32 as "effectively constant time". Especially because amortization is even trickier to consider with persistent data structures, since you have to deal with the fact that many operations may take place on your worst-case node (several pushes on your edge-case base vector, etc). Perhaps these tries actually account for that very elegantly, but I certainly don't know enough to intuit it from what I've seen here.


Edit: PDF version:

AFAIR the deques in PFDS don't permit anything better than O(log n) random access -- and the ones that do are hideously complicated (again AFAIR, it's been years since I read it). Also, the thing is, constants matter, and that is what the "log_32(n)" vs. "log(n)" discussion is driving at.

(You're still not going to get anywhere close to a linear scan of an array with directly embedded values, but there's just no way to achieve that kind of raw performance for pure functional data structures.)

The key point here is that Clojure-style vectors have logarithmic random reads/writes, while Okasaki's aforementioned deques do the same in linear time. Clojure uses the persistent vector by default, so it's essential that people can port their usual imperative vector algorithms and expect to get acceptable performance. Hence the focus on random access.

As to amortization, operations on Clojure-style vectors have pretty much zero variance (modulo the size of the vector). There is no resizing, no reordering, no rotation or whatever, it's just the exact same algorithm every time.
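The "same algorithm every time" is just radix indexing on the element's index. A hypothetical sketch of the read path for a 32-way trie (simplified: real persistent vectors also keep a tail buffer and grow in depth):

```python
BITS, MASK = 5, 31  # 5 bits per level => 32-way branching

def lookup(root, index, depth):
    """Read element `index` from a 32-way trie of the given depth.
    Each level consumes 5 bits of the index, so the cost is one
    pointer hop per level -- log32(n) -- with no variance."""
    node = root
    for level in range(depth, -1, -1):
        node = node[(index >> (BITS * level)) & MASK]
    return node

# A depth-1 trie: 32 leaf arrays of 32 elements = 1024 slots,
# filled so that slot i holds the value i.
trie = [[32 * i + j for j in range(32)] for i in range(32)]
assert lookup(trie, 0, 1) == 0
assert lookup(trie, 37, 1) == 37
assert lookup(trie, 1023, 1) == 1023
```

Every lookup runs the identical loop with the identical number of iterations for a given vector size, which is why the variance of these operations is essentially zero.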

For those of you who aren't familiar with the author's other works, it's worth taking a peek at his other books.

I can wholeheartedly recommend Pearls of Functional Algorithm Design.

It's a good cross between two other excellent books:

- Jon Bentley's Programming Pearls


- Chris Okasaki's Purely Functional Data Structures

If you haven't read all three, it's well worth your while to do so!

And of course, if you are going down the rabbit hole of reading Pearls of Functional Algorithm Design, then you need to read the "how to read Pearls of Functional Algorithm Design" as well.

Seconded! Also, I'd highly recommend "Introduction to Functional Programming using Haskell" by the same author. This was the book that set me forth on the path of FP during my postgraduate studies. Though, professionally, I've been programming in Java for close to a decade now, most of the FP principles have held me in good stead.

[edit] The preface says "The present book is a completely rewritten version of the second edition of my Introduction to Functional Programming using Haskell", so my recommendation is moot.

Sep 17, 2014 · 2 points, 0 comments · submitted by tosh
Of course it is obligatory to mention the classic: Okasaki's "Purely Functional Data Structures" book.

Also check out this link for additional data structures worked on since then or simply not included in the book:

Is this more required reading? Has anyone else's homework load doubled since college?
A minor nit: most of the data structures provided through Mori via ClojureScript are not based heavily on Okasaki's work; rather, they are based on Bagwell's paper on mutable Ideal Hash Trees and Rich Hickey's efficient immutable implementations in Java for Clojure.
Jul 20, 2013 · lispm on JavaScript Isn't Scheme
> Lots of Lisps have been backwards-incompatible with previous Lisps. Scheme, Common Lisp, Emacs Lisp, and even MACLISP and LISP 1.5 were all significantly backwards-incompatible with their predecessors.

Right. That's what I'm saying. Clojure does not care to be backwards compatible with Lisp.

> Your taxonomy of immutability and persistence is interesting

That's not mine.

Clojure took its base data structures from Haskell and modern ML.

Not Lisp.


The book comes with examples in ML and Haskell.

I don't think we're going to get anywhere further in this conversation, although I really appreciate everything you've posted so far, and I heartily second your recommendation of Okasaki's book, even though I haven't finished it myself. And I hope that I have avoided being anything like Erik Naggum in this conversation, despite my frustrations.
This conversation was incredible!
Jun 26, 2013 · tel on Clojure Tradeoffs
Immutability means that if you're "modifying" the structure you must actually be "making a new copy with a small change". Structural sharing means that usually only the small change itself gets new allocation, but it's still more pointer-chasing than a contiguous block of memory.

Lots of algorithms don't work with this kind of copying semantics, but you have all of Purely Functional Data Structures [1] to help.


Fair enough, but I wasn't talking about modification, I asked about iteration.
Oh, true—I jumped his point
Just iterating doesn't make any copies. Replace "iterating over" with "modifying".

Efficient immutable data structures are easily one of Clojure's best features. Easiest trade-off ever (though the actual hit to performance isn't that bad) imo.

Not only that, but you can also make a persistent structure locally transient for the duration of the (modifying) iteration, so in lots of cases you do not have to pay the persistence penalty.

functional data structures - how to bend your brain in a functional world - nearly twenty years old. Still have to take a run-up to read it.

How different is this PDF from the book version? Regardless, thanks for the link.
The book expands upon his thesis, and has Haskell source code in an appendix. I'm more familiar with the book than that PDF (and definitely recommend it), but comparing their tables of contents would probably help.
I have read a few chapters of this very beautiful book by Richard Bird called "Pearls of Functional Algorithm Design":

You should definitely check it out. If you already know Haskell and you love algorithms, that book will probably be more interesting than the Intro to Haskell book.

Also check out Okasaki's book of functional data structures:

Not sure if this is what you want, but the data structures in Chris Okasaki's "Purely Functional Data Structures" can be accessed from multiple threads, because changes result in new versions.

The data structures in that book are also geared towards shared memory; actually, it doesn't cover parallel programming at all, it's just that functional data structures happen to be a good fit for concurrent access in shared memory systems.

You might have more success finding what you want in HPC literature, or maybe in the Erlang community.

That said, it's a decent book, although not exactly a light read; I did struggle to follow some of the numerous proofs. You'll probably find it easier if you have a CompSci background.

Haven't read it, but there is the book Purely Functional Data Structures
Have (tried to ;) read it, but this just describes patterns for constructing the smallest building blocks for what I described.
Apr 24, 2008 · carterschonwald on Algorithms in Lisp
Chris Okasaki's "Purely Functional Data Structures" covers exactly that which is lacking in a standard reference such as CLRS "Introduction to Algorithms". Read / work through both and you're well on your way to being great at algorithmic problem solving.

edit: for those who are too lazy to google, here's the book on amazon

He also has a very good blog:

Thanks, it seems to be freely available too; first hit on Google.

Actually that's just Bob's copy of Chris' thesis; the book covers that plus a bunch of techniques that other people also cooked up. But the thesis covers enough stuff that it'll give you a good sense of whether or not the work is a worthwhile investment.

Be warned: what you get out of this material in terms of understanding is strictly proportional to your comfort with mathematical reasoning.

Right, that's his thesis ... His blog indicates that the book has more than what's in the thesis ...
Also available in extended dead-tree form:

I highly recommend the book if you're implementing or using a functional programming language. There're lots of data structures that aren't mentioned at all in traditional imperative-language textbooks. Some of them even have decent performance.

HN Books is an independent project and is not operated by Y Combinator.
~ [email protected]