Saturday, July 31, 2010

F# Sequence Lazy Evaluation Gotchas

Today’s post will be old news to experienced F# programmers. Actually, it’s old news to me, relative newcomer that I am. However, it’s one of those things I tend to forget until it jumps up to bite me, as it did today. (As always, all the code here is presented "as-is" and without warranty or implied fitness of any kind; use at your own risk.)

Here’s a perfectly valid snippet of F# code, along with its typical output:

 
let mutable myList:int list = []
 
let sideEffect v = 
  printf "(%i) " v
  myList<-v::myList
  v
 
seq{ for i=0 to 99 do yield i }
  |> Seq.map sideEffect
  |> ignore
printfn ""
printfn "len=%i" myList.Length
 




Nothing from the sequence is added to list myList, and the printf side-effect is never called! What’s going on here?

What’s going on is that the sequence generated by the comprehension is evaluated in a lazy manner. That is, it is evaluated on demand. That part is easy to remember. What’s harder to remember is that neither the Seq.map nor the ignore generates such a demand. No demand means that the sequence is never evaluated and so the side effects of updating the list and printing are never evaluated either.

Now look at this example:

 
let mutable myList:int list = []
 
let sideEffect v = 
  printf "(%i) " v
  myList<-v::myList
  v
 
let mySeq =
  seq{ for i=0 to 99 do yield i }
    |> Seq.map sideEffect
printfn ""
printfn "len=%i mySeq=%A" myList.Length mySeq
 




In the foregoing example, the call to printfn of mySeq at the end causes the evaluation of five elements from the sequence. The side effect printfn’s end up sandwiched inside the original printfn and the second printfn at the end shows that the list has been modified.

And finally, an example which calls Seq.length, a function which will assure that the entire sequence is evaluated. A side-effect printf is called for each element in the sequence comprehension, and the list is updated to contain the entire sequence:

 
let mutable myList:int list = []
 
let sideEffect v = 
  printf "(%i) " v
  myList<-v::myList
  v
 
seq{ for i=0 to 99 do yield i }
  |> Seq.map sideEffect
  |> Seq.length 
  |> ignore 
printfn ""
printfn "len=%i" myList.Length
 




What’s the takeaway here?

First: always think about what things look like from the computer’s point of view. From a human point of view, a Seq.map piped to an ignore looks like it’s saying “map the sequence and ignore the result.” From the computer’s standpoint, however, it says “map the sequence when you need to, but the ignore means you’ll never need to.” Minus the side effects, same result; add the side effects, and the result is very different.

Second: this is why a lot of very smart programmers (e.g. Don Syme, Erik Meijer, Matthew Podwysocki) never tire of making Channel 9 videos reminding us the importance of immutability, the importance of avoiding (or at least taming) side-effects, etc. In the turmoil of the trenches, it can be tough to adhere to pure functional programming. But that same turmoil can be lessened by at least learning the lessons functional programming teaches.

-Neil

p.s. I gave this blog entry a title that I might type into a search engine the next time I forget about this gotcha and go searching for information about it. Good engineering pre-planning at work, lol!

No comments: