Purely Functional Data Structures in Clojure: Leftist Heaps

This post is part of a series about Chris Okasaki’s Purely Functional Data Structures. You can see all posts in the series by visiting the functional-data-structures category in this blog.


Last year I started reading a book called Purely Functional Data Structures. It’s a fascinating book and if you’ve ever wondered how Clojure’s persistent data structures work, it’s mandatory reading.

However, all code samples in the book are written in ML - with Haskell versions in the end of the book. This means I got stuck in Chapter 3, where the ML snippets start.

I had no clue about Haskell’s - much less ML’s! - syntax and I was finding it very difficult to follow along. What I did notice is that their syntaxes are not so different from each other.

So I put the book down and read Learn You a Haskell For Great Good! with the hopes that learning more about haskell’s syntax - in particular, learning how to read its type signatures - would help me get going with Puretly Functional Data Structures.

Luckily, I was right - and I recommend you do the same if you’re not familiar with either of those languages. Learn You a Haskell For Great Good! is a great book and I got a lot out of it. My series on Monads is a product of reading it.

Enough background though.

The purpose of this post is two-fold: One is to share the github repository I created and that will contain the Clojure versions of the data structures in the book as well as most solutions to the exercises - or at least as many as my time-poor life allows me to implement.

The other is to walk you through some of the code and get a discussion going. Hopefully we will all learn something - as I certainly have when implementing these. Today, we’ll start with Leftist Heaps.

Leftist Heaps

Leftist Heaps - or trees - are a variant of binary heaps that can be used as priority queues. On top of the standard invariants of binary heaps, it obeys the leftist property:

  • Every node has a rank, which is the distance from its right spine to the nearest leaf
  • A node’s left child has a rank at least as large as its right child

In a nutshell, these are the operations we need to be able to perform on a leftist heap:

  • insert a value into an existing heap
  • merge two heaps
  • find the minimum value in a heap
  • delete the minimum value, returning a new heap

Since the book uses ML/Haskell, it starts with a data type definition for Heaps that exposes these and a couple of other auxiliary functions. I decided to take a stab at writing the solution using Clojure’s protocols and records:

1
2
3
4
5
6
7
8
9
(defprotocol Heap
  (is-empty?   [this])
  (insert      [this v])
  (merge       [this other])
  (rank        [this])
  (find-min    [this])
  (delete-min  [this]))

(defrecord LeftistHeap [rank value left right])

When implementing the algorithms the base case for the recursive solutions will involve dealing with nil values which at first seems like it wouldn’t be a problem. However, protocol functions dispatch on the type of its first argument so what happens if I call the function is-empty? on nil?

Luckily, Clojure allows us to extend a protocol to core types:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
(extend-protocol Heap
  nil
  (rank [_] 0)
  (merge [_ other] other)
  (is-empty? [_] true)

  LeftistHeap
  (is-empty? [this]
    (nil? this))

  (rank [this]
    (:rank this))

  (merge [{val-this :value left-this :left right-this :right :as this}
          {val-other :value left-other :left right-other :right :as other}]
    (cond
     (is-empty? other) this
     (<= val-this val-other) (ensure-leftist left-this
                                             (merge right-this other)
                                             val-this)
     :else (ensure-leftist left-other
                           (merge this right-other)
                           val-other)))

  (insert [this v]
    (merge (->LeftistHeap 1 v nil nil)
           this))

  (find-min [{v :value}] v)

  (delete-min [{left :left right :right}]
    (merge right left)))

Note how I extended a few of the protocol functions to the nil data type, allowing me to continue with this implementation with no nasty hacks.

There’s one last bit missing: a function that will ensure each heap retains the leftist property:

1
2
3
4
5
6
7
(defn ensure-leftist
 [this other v]
 (let [rank-this (rank this)
       rank-other (rank other)]
   (if (>= rank-this rank-other)
     (->LeftistHeap (inc rank-other) v this other)
     (->LeftistHeap (inc rank-this) v other this))))

The reason this function is isolated is that the Heap protocol defined above is fairly generic and could be used for defining other types of heaps - and I didn’t feel it warranted its own interface.

We can now play with it and create a new leftist heap:

1
2
3
4
5
6
7
(-> (->LeftistHeap 1 3 nil nil)
                   (insert 2)
                   (insert 7)
                   (insert 4)
                   (insert 10)
                   (insert 1)
                   (insert 20))

While I quite like this approach, I thought I’d also implement this solution using Clojure’s core data types - maps in this case - and no protocols. The code is shown below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
(defn mk-heap [rank value left right]
  {:rank rank :value value :left left :right right})

(defn heap-rank [heap]
  (if (nil? heap)
    0
    (:rank heap)))

(defn ensure-leftist-heap [value heap-a heap-b]
  (let [rank-a (heap-rank heap-a)
        rank-b (heap-rank heap-b)]
    (if (>= rank-a rank-b)
      (mk-heap (inc rank-b) value heap-a heap-b)
      (mk-heap (inc rank-a) value heap-b heap-a))))

(defn merge-heaps [{val-a :value left-a :left right-a :right :as heap-a}
                   {val-b :value left-b :left right-b :right :as heap-b}]
  (cond
   (nil? heap-a) heap-b
   (nil? heap-b) heap-a
   (<= val-a val-b) (ensure-leftist-heap val-a
                                         left-a
                                         (merge-heaps right-a heap-b))
   :else (ensure-leftist-heap val-b
                              left-b
                              (merge-heaps heap-a right-b))))

(defn heap-insert [value heap]
  (merge-heaps (mk-heap 1 value nil nil)
               heap))

(defn heap-find-min [{v :value}] v)

(defn heap-delete-min [{left :left right :right}]
  (merge-heaps right left))

Using it is equally simple:

1
2
3
4
5
6
7
(->> (mk-heap 1 3 nil nil)
                    (heap-insert 2)
                    (heap-insert 7)
                    (heap-insert 4)
                    (heap-insert 10)
                    (heap-insert 1)
                    (heap-insert 20))

That’s it for now.

As I implement more of the book’s code and exercises I’ll add them to the github repo - it also includes tests for all implementations.

Enjoy :)

Comments