Tuesday, June 30, 2009

Enumerators for Dependency Injection

A situation came up where I needed to select elements based on dependency injection. The elements were portions of a larger bitmap that were being used as tiles in a 2D graphics engine. Selection might be sequential, random, fixed, etc., but needed to be hidden from downstream processes. In particular, I needed to inject not the enumerators themselves, but enumerator factories.

At first I was going to use a custom interface. After some thought, however, I decided that the .NET IEnumerator<> interface fit the bill nicely. Technically an enumeration is some kind of ordered or unordered sequence from a collection. But a randomized sequence fits that bill, and it’s not too much of stretch to think of an infinitely looping sequence as an enumerator. (For example, think of the original collection as specifying the set of allowable items of an infinite sequence.)

So I came up the following helper functions:
static IEnumerator<TOutput> MakeEnumerator <TInput,TOutput,TState> (
TInput inputIn,
TState stateIn,
Func<TInput,TState,TState> funcMoveNext,
Func<TInput,TState,bool> funcMovedNextOk,
Func<TInput,TState,TOutput> funcCurrent)
{
while (true)
{
stateIn = funcMoveNext(inputIn,stateIn);

if (!funcMovedNextOk (inputIn,stateIn))
break;

yield return funcCurrent(inputIn,stateIn);
}
}


static Func<TInput,IEnumerator<TOutput>> MakeEnumeratorFunc <TInput,TOutput,TState> (
TState stateIn,
Func<TInput,TState,TState> funcMoveNext,
Func<TInput,TState,bool> funcMovedNextOk,
Func<TInput,TState,TOutput> funcCurrent)
{
return
(i)=> (
MakeEnumerator(
i,
stateIn,
funcMoveNext,
funcMovedNextOk,
funcCurrent));
}

The first helper is a generic enumerator builder that takes three functions and a state which control the enumeration. The functions transition from state to state, determine whether a transition has exhausted the enumerator, and select an enumerated item. (I separated the move next functionality and inverted its typical order because it makes things a lot more convenient.)

The second helper uses this first helper to build an enumerator factory function.

Below shows how this could be used to build an enumerator factory for canonical list enumeration. (You wouldn’t want to use these functions for this in a typical scenario; the traditional way would suffice there. They are mainly useful in dependency injection situations.)

Func<List<int>,IEnumerator<int>> xMakeEnumeratorFunc =
MakeEnumeratorFunc<List<int>,int,int>(
-1,
(i,s)=>(s+1),
(i,s)=>(s<i.Count),
(i,s)=>(i[s]));

List<int> xList = new List<int>() { 1, 3, 5, 7 };

IEnumerator<int> xEnumerator = xMakeEnumeratorFunc(xList);

// Inject xEnumerator here.

Fun, func, function!

(Tomorrow: I find a situation where a tradition boolean MoveNext is superior and post the overloaded functions.)

-Neil

Thursday, June 11, 2009

Enumerating Bits

OK, so the enumerating the set of subsets of non-zero bits is not a common task. But what about enumerating the set of non-zero bits in an integer? That’s something that tends to happen at least one or twice in any large project. Here’s how that can be done use a related function.

Given an integer (i), the least significant non-zero bit (b) can be found using this equation:

b = (i & -i)

(This is discussed as equation (37) in Donald Knuth’s The Art of Computer Programming, Volume 4, pre-fascicle 1A, page 8.)

It is easy to turn this into a loop which enumerates the bits in an integer:
while (xInteger>0)
{
int xBit = (xInteger & -xInteger);
xInteger &= ~xBit;
// Do something with xBit here.
}

(This is probably a case where one wouldn’t want to use a generic “yield return” enumerator, since if one is working at this level of detail, efficiency is probably the main concern.)

This can also be made to work in C# with unsigned integers by using various types of casting. But if efficiency is paramount, pay attention to the generated MSIL, since casting at some points generates more efficient code than does casting at other points.

Tests in release mode show that this construct is nearly four times faster in C# (.NET 3.5 version) than is shifting! But it looks arcane at first glance, so be sure to document the code well.

Tuesday, June 9, 2009

Power Set of a Bitmask

Here’s a cute trick for enumerating the power set of an integer bitmask. It’s based on equation (84) in Donald Knuth’s The Art of Computer Programming, Volume 4, pre-fascicle 1A, page 18, available in an "alpha test" version at:

http://www-cs-faculty.stanford.edu/~knuth/taocp.html

The idea is that, given a subset (x) of a mask (M) the next largest lexicographic subset (x') is:

x’ = (x – M) & M

Combined with the C# "yield return" statement, this allows a simple enumeration of the power set of a mask. You simply seed the subset with zero, and iterate until it is once again zero.

public static IEnumerable<int> PowerSet (
int maskIn)
{
int xSet = 0;

do
{
yield return xSet;
xSet = (xSet - maskIn) & maskIn;
}
while (xSet!=0);
}

For example, calling PowerSet(0x19) returns an enumerator which yields (in binary):

00000
00001
01000
01001
10000
10001
11000
11001


I can’t think of a specific use for it right off hand, but if a project uses enough bit manipulation, I’m sure something would arise.

Code Generator Oddity

The following is a exerpted example of my implementation of a well-known permutation algorithm. Note the code generation oddity flagged by the comment. This occurs in .NET 3.5 and Visual C# 2008. I leave the solution to the mystery to readers who like to disassemble and read MSIL, and think about what the JIT compiler may do to it.

protected static void Permute <TDatum> (
int beginIn,
int endIn,
TDatum[] dataIn,
Action<TDatum[]> dataActionIn)
where TDatum : IComparable
{
TDatum xDatumBegin;

if ((endIn-beginIn)<=1)
{
dataActionIn(dataIn);
}
else
{
Permute(
beginIn+1,
endIn,
dataIn,
dataActionIn);

for (int i=beginIn+1; i<endIn; i++)
{
xDatumBegin = dataIn[beginIn];

dataIn[beginIn] = dataIn[i];
dataIn[i] = xDatumBegin;

Permute(
beginIn+1,
endIn,
dataIn,
dataActionIn);

// Believe it or not, the compiler/loader
// will optimize the first block below
// to be almost 10% faster than the second.

#if thisIsFasterInDotNet
xDatumBegin = dataIn[beginIn];
dataIn[beginIn] = dataIn[i];
dataIn[i] = xDatumBegin;
#else
dataIn[i] = dataIn[beginIn];
dataIn[beginIn] = xDatumBegin;
#endif
}
}
}