OCaml List rev_map vs map

Reading time ~3 minutes

If you found this page, you are probably very familiar with OCaml already!

So, OCaml has a map function whose purpose is pretty clear: apply a function to each element of a list ('a -> 'b) -> 'a list -> 'b list.

And rev_map, which seems more complex as it applies a function to each element of a list and returns a reversed version the list. Why would OCaml developers bother to have something that exotic in the standard library?

Besides, it would be super easy to implement. List.map f l |> List.rev and we are done! Aren’t we?

A not so subtle issue with large lists

Let’s try to do the following with a large list:

let l = List.init 2000000 (fun i -> i)

let () = 
  let _ = List.map (fun x -> x mod 2) l in
  ()

If you run this code, you should run into a stack overflow error:

Stack overflow during evaluation (looping recursion?).

While the following (which does the same thing) runs perfectly well:

let l = List.init 2000000 (fun i -> i)

let () = 
  let _ = List.rev_map (fun x -> x mod 2) l |> List.rev in
  ()

So what just happened?

Let’s jump to the code!

If you are using vim and merlin, just type gd in normal mode to jump to the definition of the function you need.

let rec map f = function
    [] -> []
  | a::l -> let r = f a in r :: map f l

And rev_map

let rev_map f l =
  let rec rmap_f accu = function
    | [] -> accu
    | a::l -> rmap_f (f a :: accu) l
  in
  rmap_f [] l

As you can see, rev_map holds an extra variable: accu, the accumulator. When the case [] is reached, the program only has to return the accumulator and can somehow forget the call stack (but has to remember the return value inside of the accumulator).

On the other hand, the map function above has to “remember” of the stacked calls to the (recursive) map function.

The compiler can do all sort of optimizations and distinguishes between what is called tail recursive functions (rev_map) and non tail recursive functions (map).

The details of tail recursive functions are beyond the scope of this article, let’s just keep in mind that a tail recursive function does not have to “remember” the nested function calls, even if the function is recursive!

Is my function tail recursive?

The developers propose the keyword tailcall that will issue a warning if the compiler detects that a function is not tail recursive. The following snippet, per example:

let l = List.init 200 (fun i -> i)

let rec map f = function
    [] -> []
  | a::l -> let r = f a in r :: (map [@tailcall]) f l

let rev_map f l =
  let rec rmap_f accu = function
    | [] -> accu
    | a::l -> (rmap_f [@tailcall]) (f a :: accu) l
  in
  rmap_f [] l

let () = 
  let _ = map (fun x -> x mod 2) l |> List.rev in
  ()

Would output the following warning upon compilation.

File "./test.ml", line 5, characters 32-53:
5 |   | a::l -> let r = f a in r :: (map [@tailcall]) f l
                                    ^^^^^^^^^^^^^^^^^^^^^
Warning 51 [wrong-tailcall-expectation]: expected tailcall

What if my function cannot be turned into a tail-recursive function

It is not always possible to turn a recursive function into a tail recursive function. If your input is too large and your program crashes you will either have to think about another way to achieve what you want, or you can increase the stack limit (per example with ulimit on Linux)!

How to optimize PyTorch code ?

Optimizing some deep learning code may seem quite complicated. After all, PyTorch is already super optimized so why (and how) one could i...… Continue reading

Acronyms of deep learning

Published on March 10, 2024

AI with OCaml : the tic tac toe game

Published on September 24, 2023