Monday, March 22, 2010

Semantic Net Search 0.1

Here is the latest; a step towards deep search.

I decided to implement the search using an auxiliary class. This allows the network itself to remain more pure, with details such as the transitivity of links being a function of the search rather than the structure. (The epistemological implications of this are intriguing.)

I did augment the network with a dictionary which stores a table of the last item in a relationship keyed by the first item and the link. This is for efficiency only; it could also have been accomplished using the existing Find function.

Lastly, my first attempt at a search function is rather grim, messy, and imperative. It works, but I’m not proud of it. I tried to construct a version based on continuations, but it defeated my resolve to get something that at least worked posted to the blog today. Watch for a cleaner, more functional version in the next blog post or two. Until then, consider it an example of an anti-pattern, lol.

As always, no warranties or implied fitness. Use at your own risk. I rushed to post this, so there may be bugs.

Library:
module SemanticNet

open System.Collections.Generic


// I'm a fool for syntactic sugar.
type internal Rel = string*string*string
type internal RelHash = HashSet<Rel>
type internal RelDict = Dictionary<string,RelHash>
type internal FinderVal = bool*(RelHash option)
type internal RelChain = Dictionary<(string*string),HashSet<string>>



// Computation expression builder for finds.
type internal RelFinder () =

// "let!" function.
member this.Bind (((b,h):FinderVal),
(f:FinderVal->FinderVal)) =
match b with
| true -> f (b,h)
| _ -> (b,h)

// "return" function.
member this.Return bh = bh



// Holds the relationships.
type Graph () =

// Holds all the relationships.
let relHash = new RelHash()

// Holds all the relationships as chains.
// Supports deep search.
let relChain = new RelChain()

// Maps a node in each position
// to the relationships for that node.
let n0ToRelHash = new RelDict()
let n1ToRelHash = new RelDict()
let n2ToRelHash = new RelDict()

// Computation expression builder for finds.
let relFinder = RelFinder()

// Internal function for finds.
let relFinderComp (d:RelDict)
(so:string option)
(bhi:FinderVal) =
match so with
| None -> bhi
| Some(s) ->
match d.TryGetValue s with
| (true,h) ->
match bhi with
// Copy the first hash found.
| (true,None) -> (true,Some(RelHash(h)))
| (true,Some(hi)) ->
hi.IntersectWith(h)
(true,Some(hi))
| _ -> failwith "Internal program error."
| _ -> (false,None)

// // For a slightly more efficient first call,
// // this can be used.
//
// let relFinderCompFst (d:RelDict)
// (so:string option) =
// match so with
// | None -> (true,None)
// | Some(s) ->
// match d.TryGetValue s with
// | (true,h) -> (true,Some(RelHash(h)))
// | _ -> (false,None)

// Add a relationship to a dictionary.
let tryAddRel (d:RelDict) k r =
match d.TryGetValue k with
| (true,h) ->
h.Add r |> ignore
| _ ->
let h = new RelHash()
h.Add r |> ignore
d.Add(k,h)

// Add a relationship to the chain dictionary.
let tryAddRelChain (s0,s1,s2) =
(match relChain.TryGetValue((s0,s1)) with
| (true,h) -> h
| _ ->
let h = HashSet<string>()
relChain.Add((s0,s1),h)
h).Add s2 |> ignore

// Remove a relationship from a dictionary.
let tryRemoveRel (d:RelDict) k r =
match d.TryGetValue k with
| (true,h) ->
let rmv = h.Remove r
// Clean up empty entries.
if (h.Count=0) then d.Remove k |> ignore
//rmv
| _ -> ()

// Add a relationship to the chain dictionary.
let tryRemoveRelChain (s0,s1,s2) =
match relChain.TryGetValue((s0,s1)) with
| (true,h) ->
let rmv = h.Remove s2
// Clean up empty entries.
if (h.Count=0) then relChain.Remove((s0,s1)) |> ignore
//rmv
| _ -> ()

// Add a relationship.
member this.Add r =
match relHash.Add r with
| false -> ()
| _ ->
let (s0,s1,s2) = r
tryAddRelChain (s0,s1,s2) |> ignore
tryAddRel n0ToRelHash s0 r
tryAddRel n1ToRelHash s1 r
tryAddRel n2ToRelHash s1 r

// Add a bunch of relationships.
member this.Add (e:IEnumerable<Rel>) =
for r in e do
this.Add r

// Return a hash of all relationships.
member this.All =
RelHash(relHash)

// Return true if a relationship exists.
member this.Exists r =
relHash.Contains r

// Return a hash option of the matched relationships.
// Note: Uses string option; None acts as a wildcard in Find.
member this.Find (s0,s1,s2) =
match
(relFinder {
// // See note above.
// let! r0 = relFinderCompFst n0ToRelHash s0
let! r0 = relFinderComp n0ToRelHash s0 (true,None)
let! r1 = relFinderComp n1ToRelHash s1 r0
return relFinderComp n2ToRelHash s2 r1 }) with
// Three wildcards.
| (true,None) -> Some(RelHash(relHash))
// Something found.
| (true,h) -> h
// Nothing found.
| _ -> None

// Return a hash option of the matched conclusions.
// More efficient than Find(s0,s1,s2) with wildcard.
member this.Find (s0,s1) =
relChain.TryGetValue((s0,s1))

// Return a hash option of the matched relationships.
// Note: Uses strings; explicit wildcard.
member this.FindW w (s0,s1,s2) =
this.Find (
(if s0=w then None else Some(s0)),
(if s1=w then None else Some(s1)),
(if s2=w then None else Some(s2)))

// Remove a relationship.
member this.Remove r =
match relHash.Remove r with
| true ->
let (s0,s1,s2) = r
tryRemoveRelChain (s0,s1,s2)
tryRemoveRel n0ToRelHash s0 r
tryRemoveRel n1ToRelHash s1 r
tryRemoveRel n2ToRelHash s2 r
| _ -> ()

// Remove a bunch of relationships.
member this.Remove (e:IEnumerable<Rel>) =
for r in e do
this.Remove r

// + operator adds a relationship, returns graph.
static member (+) ((g:Graph),(r:Rel)) =
g.Add r
g

// - operator removes a relationship, returns graph.
static member (-) ((g:Graph),(r:Rel)) =
g.Remove r
g



// Abstract interface for a searcher.
[<AbstractClass>]
type SearcherBase () =
//abstract member Search: Graph -> Rel -> bool
abstract member Search: Graph -> Rel -> Rel list



// A simple searcher.
type SimpleSearcher () =
inherit SearcherBase()

// Used to prevent cycling.
let tested = new HashSet<string>()

// Indicates transitive links.
let transitive =
let transHash = new HashSet<string>()
transHash.Add "isA" |> ignore
transHash.Add "hasA" |> ignore
(fun s -> transHash.Contains(s))

// Search internal.
let rec searchDeep (gs:Graph) ((s0,s1,s2):Rel) =
match tested.Add(s0) with
| false -> None
| true ->
match gs.Exists (s0,s1,s2) with
| true -> Some([(s0,s1,s2)])
| _ ->
match gs.Find(s0,s1) with
| (false,_) -> None
| (_,h) ->
let mutable e = h.GetEnumerator()
let mutable l = true
let mutable rtn = None
while l && (e.MoveNext()) do
match searchDeep gs (e.Current,s1,s2) with
| None -> ()
| Some(l1) ->
l <- false
rtn <- Some((s0,s1,e.Current)::l1)
rtn

// Search interface.
override this.Search (gs:Graph) (r:Rel) =
let (_,s1,_) = r
match transitive s1 with
| true ->
tested.Clear()
match searchDeep gs r with
| None -> []
| Some(l) -> l
| _ ->
match gs.Exists r with
| true -> [r]
| _ -> []


Test:
open SemanticNet

// Create a graph and
// add some relationships.

let g =
Graph() +
("mammal","isA", "animal") +
("insect","isA", "animal") +
("dog", "isA", "mammal") +
("flea", "isA", "insect") +
("dog", "isA", "pet") +
("dog", "hasA", "tag") +
("chow", "isA", "dog") +
("dog", "scratches", "flea") +
("flea", "scratches", "itch") +
("cat", "isA", "pet")

let simpleSearch = (SimpleSearcher()).Search g

// Search.

let l0 = simpleSearch ("chow","isA", "dog")
let l1 = simpleSearch ("chow","isA", "animal")
let l2 = simpleSearch ("chow","isA", "sasquatch")
let l3 = simpleSearch ("dog", "scratches","itch")
let l4 = simpleSearch ("flea", "isA", "animal")
let l5 = simpleSearch ("flea", "isA", "mammal")

printfn "All done!"

3 comments:

Tom Berend said...

this seems like such a neat function. thank you.

i converted it to PHP for a small project i'm working on.

stripped debug to try to squeeze into your comment limits. blogger doesn't allow PRE or CODE, hope this formats OK.



Class semanticNet{

var $graphLeft = array(); // the KEY is "left&rel&right" for all three, so can find them quickly
var $graphRel = array(); // the VALUE is just the left, relation, or right
var $graphRight = array(); // Note, we only use $graphRel here, but we will use the other parts later

var $relationsGraph = array(); // a subgraph used for recursive search, only contains the elements with the same Relation as the search


// Add a relationship.
function add($s0,$s1,$s2){
$combo = $s0.'&'.$s1.'&'.$s2;

if (array_key_exists ($combo, $this->graphLeft))
echo "Cannot ADD $combo, it already exists in Graph
";
else{
$this->graphLeft [$combo] = $s0;
$this->graphRel [$combo] = $s1;
$this->graphRight[$combo] = $s2;
}
}


// Remove a relationship from a dictionary.
function remove($s0,$s1,$s2){
$combo = $s0.'&'.$s1.'&'.$s2;

if (array_key_exists ($combo, $graphLeft)){
unset($this->graphLeft [$combo]);
unset($this->graphRel [$combo]);
unset($this->graphRight[$combo]);
}else
echo "Cannot REMOVE $combo, it does not exist in Graph
";
}


// this is a recursive function that searches $subGraph (already prefiltered so only has relations of $r1)
function searchDeep($r0,$r1,$r2){
$combo = $r0.'&'.$r1.'&'.$r2; // create a hash string

// we start by seeing if the link we want is in the subset
if (array_key_exists ($combo, $this->relationsGraph)) return true;

// otherwise iterate on the RIGHT side. so if we have A IsA B and we are looking for A IsA C we recurse looking for B IsA C.
// we start by getting ALL the records that use this type of link
foreach($this->relationsGraph as $key=>$value){
$cArray = explode('&',$key);

// if the left side doesn't match, then no point is hunting on the right side
if ($cArray[0] !== $r0) continue; // we are looking for A IsA C, and this isn't an A statement

// let's not bother with useless searches
if ($cArray[2] == $r2) continue; // useless test - it will just be B IsA B
if ($cArray[2] == $r0) continue; // useless test - it will just be B IsA A (and we know A IsA B)

if ($this->searchDeep($cArray[2],$r1,$r2)){
return (true);
}

}

// if we get here, then the deep search did not find a match
return (false);
}



// A simple searcher.
function search($r0,$r1,$r2){

// we start by getting ONLY the records that use this type of link
$this->relationsGraph = array_filter($this->graphRel,create_function('$n',"return (\$n=='$r1');"));
return ($this->searchDeep($r0,$r1,$r2)); // see if A is part of C
}


}

TechNeilogy said...

Cool! Thanks for posting that.

One of the "old-fashioned" things I like about F# is the possibility for code and idea re-use. In “modern times” this has become a synonym for libraries and assemblies. But in the old days, it encompassed the idea of code that could be quickly copied and modified to new purposes, sometimes in a different language. Code snippets provide a little bit of this, but a well-designed language will make it easier to copy, modify and translate larger blocks while still repelling bugs and spaghetti code.

Also, lurkers, be sure to check out Tom’s blog from the link in his comment, it’s worth the trip!

Tom Berend said...

i couldn't really read your code. but your recursive structure and use of hashing is clear. just seeing how small it was gave me the courage to try.

however, there's a mistake in my code. for my application, i only needed something a bit stronger than a 'tree'. so i didn't consider chaining different relationships.

if (Mammal HasA WarmBlood) and (Dog IsA Mammal), then (Dog HasA WarmBlood). this code won't find that. extending the function required only a few lines, but it becomes too big to post as a comment.

thanks again for the inspiration.