Monday, May 31, 2010

List<>.collect Implementations with Benchmarks

I had a situation come up in my proof-of-concept where I needed to do a collect on a .NET List<>. So I did some experiments with how it might best be accomplished. A key factor was that this is something that may be done literally millions of times a day if the application makes it to deployment; so I was concerned with performance. I came up with several versions that I benchmarked. All assume an “open System.Collections.Generic.”

As always, presented "as-is" and without warranty or implied fitness of any kind; use at your own risk.

This is the most straightforward; it’s how you would immediately think of doing it in an imperative style:

let listCollect0<'a,'b> (f:'a->List<'b>)
                        (la:List<'a>) : List<'b> =
  let lb = new List<'b>()
  for a in la do
    for b in (f a) do
      lb.Add b
  lb

These two versions are probably the most straightforward functional style:

let listCollect1<'a,'b> (f:'a->List<'b>)
                        (la:List<'a>) : List<'b> =
  let lb = new List<'b>()
  Seq.iter (fun a->Seq.iter lb.Add (f a)) la
  lb
 
 
let listCollect2<'a,'b> (f:'a->List<'b>)
                        (la:List<'a>) : List<'b> =
  let lb = new List<'b>()
  Seq.iter lb.Add (Seq.collect f la)
  lb

This version is the same as one of the above functional styles, but with some “hints” to help with optimization:

let listCollect3<'a,'b> (f:'a->List<'b>)
                        (la:List<'a>) : List<'b> =
  let fAdd (l:List<'b>) (a:'a) = Seq.iter l.Add (f a)
  let lb = new List<'b>()
  let fAddb = fAdd lb
  Seq.iter fAddb la
  lb

This version uses the fold function to manage the returned List<’b>:

let listCollect4<'a,'b> (f:'a->List<'b>)
                        (la:List<'a>) : List<'b> =
  Seq.fold 
    (fun (l:List<'b>) (b:'b) -> l.Add b; l) 
    (new List<'b>()) 
    (Seq.collect f la) 

These versions mirror 1 and 2, but use sequences until the last step.

let listCollect5<'a,'b> (f:'a->seq<'b>)
                        (la:List<'a>) : List<'b> =
  let lb = new List<'b>()
  Seq.iter (fun a->Seq.iter lb.Add (f a)) la
  lb
 
 
let listCollect6<'a,'b> (f:'a->seq<'b>)
                        (la:seq<'a>) : List<'b> =
  let lb = new List<'b>()
  Seq.iter lb.Add (Seq.collect f la)
  lb

And here are the benchmarks in %. (I left out the test code; it’s not particularly relevant. Suffice to say all were compiled and run in F# 2, release mode, using tests cleverly designed to foil any optimization.)

listCollect0 = 100%
listCollect1 = 210%
listCollect2 = 169%
listCollect3 = 214%
listCollect4 = 205%
listCollect5 = 434%
listCollect6 = 546%


It turns out that the naïve version worked the best by a pretty big margin. The straightforward functional versions (including the one with “hints”) were second. The ones which mixed lists and sequences were the slowest. What does this imply?

First: it says that if performance is an issue, you simply have to run benchmarks. Furthermore, these benchmarks should include at least one version which is designed to be as simple and direct as possible.

Second: although the straightforward functional versions were slower than the naïve version, they were still plenty fast enough for applications where iteration counts may be in the thousands rather than the millions. They might be preferable in the former instances because they provide a familiar idiom for anyone maintaining the code.

Third: attempts to make the code overly general may impair performance. Stick to the level of abstraction you need, neither more nor less. This is where F# has a real advantage, since it is a language that rewards quick experimentation and the local customization of code snippets.

Bottom line: I’ll probably use the naïve version (listCollect0) in my application. If I come across a similar situation which requires lower performance, I’ll probably use something more like listCollect1 or listCollect2.

-Neil

2 comments:

Michael Robin said...

I think fastest solution is also the easiest to read - I think this would even be true even for funcational coders. What we're doing is using a new instance of a .Net mutable data-structure anyway (List<'T> not an f# 't list), - so I don't know why you'd want to put a functional veil over the code anyway.

TechNeilogy said...

Thanks for commenting, Michael!

In the case of this example, I'm inclined to agree: the fastest solution is also very easy to read. The views I express above may be somewhat slanted by the fact that I am trying to force myself to think more functionally -- even to an excess for a while -- as a learning exercise, while also trying to internalize things like the efficiency of the generated code.