OCaml and sequences from scratch, part 1

I really like ML languages (OCaml, Elm, F#). Recently I was doing the problems in 99 problems in OCaml and found my lack of understanding in some basic patterns with lists.

I decided to reimplement some basic functions and structures in OCaml from scratch as an exercise. I usually do this in a Jupyter notebook but I took my notes and I will be publishing them in my blog from now on (as a way to remember where I put that bloody notebook). You can practice and redo this samples with the Jupyter kernel for OCaml, in fact, I already have a small docker image with it ready for you to try if you want to, go and pull it from Docker Hub.

The definition of sequences

Let’s start defining a sequence, we will call it seq:

type 'a seq =
  | End
  | Next of 'a * 'a seq
Out[*]:
type 'a seq = Nil | Next of 'a * 'a seq

As you can see we use a generic parameter 'a for the value of the sequence. We can create a sequence with numbers from 0 to 3 and an empty sequence:

let sample = Next(1, Next(2, Next(3, End))) ;;
let empty = End
Out[*]:
val sample : int seq = Next (1, Next (2, Next (3, End)))
val empty : 'a seq = End

Basic operations

Let’s create the simplest operations in town: length, head, tail, cons and nth

let seq_length x =
  let rec length' acc = function
    | End -> acc
    | Next (_, t) -> length' (acc + 1) t
  in
  length' 0 x
;;

seq_length sample ;;
seq_length empty
Out[*]:
val seq_length : 'a seq -> int = <fun>
- : int = 3
- : int = 0

This pattern is very common in OCaml, I call it inner accumulator loop. There is a variation of this pattern and we will see later.

The standard library in OCaml and Caml has the function length in the List module

Now it is time for the head operator, or well, it is named hd in the List module in Caml, it will return only the first element of a sequence or throw an invalid argument if the sequence is empty.

let seq_head = function
  | End -> raise (Invalid_argument "seq_head")
  | Next (x, _) -> x
;;

seq_head sample
Out[*]:
val seq_head : 'a seq -> 'a = <fun>
- : int = 1

The contrapart of this operator is its brother tail, or tl in the List module.

let seq_tail = function
  | End -> raise (Invalid_argument "seq_tail")
  | Next (_, t) -> t
;;

seq_tail sample
Out[*]:
val seq_tail : 'a seq -> 'a seq = <fun>
- : int seq = Next(2, Next(3, End))

Easy!

It is time for the cons operator, this is basically adding an element to the head of the sequence, in OCaml and Standard ML is denoted as well with the operator ::

let seq_cons x al = Next(x, l) ;;

seq_cons 0 sample
Out[*]:
val seq_cons : 'a -> 'a seq -> 'a seq = <fun>
- : int seq = Next(0, Next(1, Next(2, Next(3, End))))

Last but not least, it is time to implement the nth operator, it will return the element at the index n starting from 0.

let seq_nth n l =
  if n < 0 then raise (Invalid_argument "nth") else
  match l with
    | End -> raise (Failure "nth")
    | Next (x, tl) -> if n = 0 then x else nth_seq (n - 1) tl
;;

seq_nth 0 sample ;;
seq_nth 2 sample
Out[*]:
val seq_nth : int -> 'a seq -> 'a
- : int = 1
- : int = 3

Interestinly, this function had the potential to use the inner accumulator loop but if we see clearly it is not needed, remember, if we discard the value of a function when it returns in a recursive call it is basically tail recursive, and here we do nothing with the return value of the function until we need it.

An easy way to check if a function call is tail recursive is asking the compiler to tell us so (available since OCaml 4.04)

let seq_nth n l =
  if n < 0 then raise (Invalid_argument "nth") else
  match l with
    | End -> raise (Failure "nth")
    | Next (x, tl) -> if n = 0 then x else ((nth_seq [@tailcall]) (n - 1) tl)

In this case, if that marked call is not tail recursive (the place where we suspect is not), the compiler will throw a warning.

That is all for today, we will continue another day!.

NOTE: If somebody is interested in the Jupyter notebook for this blog post (and the whole series), it is located in my GitHub repository.