Tuesday, May 4, 2010

S-Expressions for Semantic Networks

Continuing from the last post, this post shows how to use a simple tokenizer and evaluator to parse an S-expression (e.g. Lisp- or Scheme-like syntax) into a pared-down semantic network. This also builds on some of my earlier posts regarding using F# itself as a DSL for specifying semantic networks.

I apologize for the rudimentary nature of the semantic network code, I wanted to get something just complex enough to demonstrate the principles. It does at least provide a single, simple interface function: “Node.Link,” which could also front a much more sophisticated set of data structures.

Why do it this way instead of using F# itself as a DSL as I did earlier? Nothing definitive, it’s a judgment call really. If the F# compiler is readily available and the use-cases support the use of F# source, that’s still a good option. However, sometimes one needs data to be data, and compiled code to be compiled code, and in that case, some kind of processing of the data is required.

Why do it this way instead of say, Xml? Mostly readability. Some things will be more readable and more easily expressed using Xml. Other things will be more readable and more easily expressed using S-expressions, others by using custom infix operators, etc. Again, it’s a judgment call.

As before, please refer to Ashley Feniello’s excellent “Code Monkey Have Fun” Scheme REPL code for a more complete example of S-expressions in F#.

I wrote this fairly quickly between more mundane matters like work and life (lol); beware of bugs and typos. As always, presented "as-is" and without warranty or implied fitness of any kind; use at your own risk.

open System
open System.Collections.Generic

// This is a super-basic semantic network
// node designed only for demonstration.
// I apologize for making the node dictionary
// a static; I wanted to keep this really simple.
// In real life there would be separate graph container,
// probably passed into the evaluator below.
[<System.Diagnostics.DebuggerDisplay("{tag}")>]
type Node (tag:string) =

static let nodes = new Dictionary<string,Node>()

static let tryAddNode tag =
match nodes.TryGetValue tag with
| true,n -> n
| _ ->
let n = Node(tag)
nodes.Add(tag,n)
n

let mutable linksTo:(Node*string) list = []
let mutable linksFrom:(string*Node) list = []

static member Nodes with get() = nodes

static member Link tagFrom tagLink tagTo =
let nodeFrom = tryAddNode tagFrom
let nodeTo = tryAddNode tagTo
nodeFrom.AddLinkFrom tagLink nodeTo
nodeTo.AddLinkTo tagLink nodeFrom

member this.Tag with get() = tag

member this.AddLinkTo l n =
linksTo<-(n,l)::linksTo

member this.AddLinkFrom l n =
linksFrom<-(l,n)::linksFrom


// This little tokenizer is a gross simplification of
// Ashley Feniello's Scheme tokenizer posted 2010.01.15 at
// http://blogs.msdn.com/ashleyf/default.aspx
// See earlier posts for more about Ashley's excellent blog

type Token =
| Open
| Close
| Symbol of string


let tokenize source =
let rec symbol (a:string) l =
match l with
| (')'::_) as t -> a, t
| w::t when Char.IsWhiteSpace(w) -> a, t
| [] -> a, []
| h::t -> symbol (a+(h.ToString())) t
| _ -> failwith "Unexpected character."
let rec tokenize' a = function
| w::t when Char.IsWhiteSpace(w) -> tokenize' a t
| '('::t -> tokenize' (Open::a) t
| ')'::t -> tokenize' (Close::a) t
| h::t ->
let n,t' = symbol (h.ToString()) t
tokenize' (Symbol(n)::a) t'
| [] -> a
| _ -> failwith "Unexpected character."
tokenize' [] source


// The evaluator generates the network into
// the Node static dictionary.
let eval l =
let rec fLink nFrom link = function
| Symbol(nTo)::t ->
Node.Link nFrom link nTo
fLink nFrom link t
| Close::t -> Symbol(nFrom)::t
| _ -> failwith "Syntax error."
let fFrom = function
| Symbol(nFrom)::Symbol(link)::t -> fLink nFrom link t
| _ -> failwith "Syntax error."
let rec eval' a = function
| [] -> a
| Open::t ->
let a' = fFrom a
eval' a' t
| Close::t -> eval' (Close::a) t
| Symbol(s)::t -> eval' (Symbol(s)::a) t
eval' [] l


// Breakpoint and examine these.
let tokenList =
tokenize
(List.ofSeq "((Man isA Mammal Primate) hasA Name)")

let evalList = eval tokenList // [Symbol("Man")]

let nodes = Node.Nodes

printfn "All done."

System.Console.ReadLine() |> ignore

2 comments:

Anonymous said...

I never realized how much I relied on syntax highlighting until seeing code without it. Please colorize the code!

TechNeilogy said...

I'd love to! Unfortunately, the online formatter I've been using is language neutral. But it suddenly dawned on me that I might be able to accomplish this using MS Word or something. Let me experiment and if it works I'll retrofit some of the more recent entries.