Saturday, July 3, 2010

Dijkstra’s Shunting-Yard Algorithm

NOTE: OOPS! THE CODE IN THE ORIGINAL VERSION OF THIS POST CONTAINED A BUG. Fortunately, it caused an exception to be thrown so if you used that version it will at least fail and not propagate an error. The corrected function has been substituted below. The error was in the switching logic of the "close" function, which handles close parentheses. The exception was "Mismatched parenthesis."

I like domain-specific languages (DSLs). But one perennial problem when parsing input for a DSL is the need to convert from infix notation into some more amenable kind of order or to an abstract syntax tree. The original syntax is often limited enough in scope that a full-blown Lex/Yacc or other sophisticated parser seems like overkill. Sometimes all that’s needed is a simple tokenization, followed by a basic conversion.

Enter Dijkstra’s shunting-yard algorithm.

Although I had earlier implemented this algorithm in C#, to this point in my F# career, I had not implemented it in that language. It’s not a super complex algorithm, but it is tricky to get right in a new programming paradigm. Below is my first attempt; I’m not sure it’s correct, so take it with a grain of salt and proceed with caution. I did test it using worked examples both of my own and from online, but if there are any obvious errors, I’d be happy hear about it and to post corrections (with attribution and thanks).

I hope to extend it soon to include unary operators.

Without further ado, here is my first F# implementation of Dijkstra’s shunting-yard algorithm, including a basic test (based on the example at Wikipedia). This is probably not a minimal or most-efficient implementation, but it is a starting place.

(As always, this code is presented "as-is" and without warranty or implied fitness of any kind; use at your own risk.)

 
/// Simple token class.
type Token =
  /// An argument.
  | Arg of string
  /// A function call.
  | Fun of string
  /// Function argument separator.
  | Sep
  /// Operator of tag,precedence,left-assoc
  | Op of string*int*bool
  /// Open-parenthesis.
  | Open
  /// Close-parenthesis.
  | Close
 
 
/// Dijkstra’s shunting-yard algorithm.
/// Convert an infix stream to postfix.
let infixToPostfix =
 
  /// Handle a function separator.
  let rec sep qs qo = 
    match qs with 
    | Open::t -> qs,qo 
    | h::t -> t,(h::qo)
    | [] -> failwith "Syntax error in function call."
 
  /// Handle an operator.
  let rec op qs qo s1 p1 a1 = 
    match qs with 
    | Op(s2,p2,a2)::t -> 
      // Precendence and associativity.
      match p1<p2 with
        // < precedence, either associativity.
      | true -> op t ((Op(s2,p2,a2))::qo) s1 p1 a1
      | _ ->
        match (p2<p1) || a1 with
        | true -> ((Op(s1,p1,a1))::qs),qo
          // = precedence, right-associative.
        | _ -> op t ((Op(s2,p2,a2))::qo) s1 p1 a1
    | _ -> ((Op(s1,p1,a1))::qs),qo
 
  /// Handle close-parenthesis.
  let close qs qo = 
    let rec f qs qo =
      match qs with
      | Open::t -> t,qo
      | h::t -> f t (h::qo)
      | [] -> failwith "Mismatched parenthesis."
    match qs with 
    | [] -> failwith "Mismatched parenthesis."
    | _ ->
      let (qs0,qo0) = f qs qo
      match qs0 with 
      | h::t -> 
        match h with 
        | Fun(s) -> t,h::qo0
        | _ -> qs0,qo0
      | _ -> qs0,qo0
 
  /// Handle remaining operators.
  let rec outputOps qo = function 
    | [] -> qo
    | Open::t | Close::t -> failwith "Mismatched parenthesis."
    | h::t -> outputOps (h::qo) t
 
  /// Uncurried function.
  let rec f (qs,qo) = function
    | Arg(s)::t -> f (qs,((Arg(s))::qo)) t
    | Fun(s)::t -> f (((Fun(s))::qs),qo) t
    | Sep::t -> f (sep qs qo) t
    | Op(s,p,a)::t -> f (op qs qo s p a) t
    | Open::t -> f ((Open::qs),qo) t
    | Close::t -> f (close qs qo) t
    | [] -> outputOps qo qs
 
  // Curried function call.  
  f ([],[]) 
 
 
/// Simple test print.
let rec printqo = function
  | [] -> ()
  | Arg(s)::t | Fun(s)::t | Op(s,_,_)::t -> 
    printf "%s " s
    printqo t
  | _ -> failwith "Syntax error." 
 
// Basic operators.
let plus  = Op("+",10,false)
let minus = Op("-",10,false)
let times = Op("*",11,false)
let div   = Op("/",11,false)
let pwr   = Op("^",12,true)
 
// Test from Wikipedia entry.
// Infix: 3 + 4 * 2 / ( 1 − 5 ) ^ 2 ^ 3
// Postfix: 3 4 2 * 1 5 - 2 3 ^ ^ / +
let x0 = [
  Arg("3");
  plus;
  Arg("4");
  times;
  Arg("2");
  div;
  Open;
  Arg("1");
  minus;
  Arg("5");
  Close;
  pwr;
  Arg("2");
  pwr;
  Arg("3");
  ]
 
// Convert and print.
printqo (infixToPostfix x0 |> List.rev)
 
printfn "Done."

No comments: