Wednesday, January 14, 2009

Interfacing Outside the Box

Happy New Year, everyone! Another hiatus is over and I hope to make it the last one. I've never been one for New Year's resolutions before, but I guess there's a first time for everything: this year I resolve to post at least once a month.

I gave myself a head start by accumulating a number of topics I want to write about. After some careful consideration, I decided to start off with a bang: by introducing a new design pattern!

Hacking the Language

A while ago, I wrote about design patterns and how they're often a result of having to work your way around language limitations. No language is perfect, so you're bound to run into an annoying limitation sooner or later, even if you're not spending your days coding in Blub. If you're lucky, there's already a design pattern that allows you to hack your way around it. If not, you'll have to invent a solution yourself.

One such limitation in C# is the boxing that occurs when you cast a value type to an interface. I've written about a related topic before, back when I was just learning about value types and boxing in C#. You could probably tell, because I made a spectacular mistake in my speculations.

Safe Deposit Box

In the course of devising a solution for the problem described in Enum Conundrum, I erroneously suggested that a possible solution might have been to have my enum implement the IEquatable interface, had the language allowed it. That, of course, wouldn't have helped. The enum would still have been boxed, although for a different reason.

Why do value types get boxed when cast to an interface? The reason is simple enough: for safety. For example, if you have a local value type variable, it resides on the stack and gets destroyed after the call ends. Without boxing, if you stored an interface reference to that value type instance in an object field, that field would wind up with an invalid reference as soon as the original call is done and the variable goes out of scope.
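One visible consequence of boxing is that the interface reference points at a copy of the value, not at the original variable. Here is a minimal sketch of that; the names ICounter, Counter and BoxingDemo are made up for illustration and don't come from any library:

```csharp
using System;

// Hypothetical interface and struct, used only to make the copy visible.
interface ICounter
{
    void Increment();
    int Count { get; }
}

struct Counter : ICounter
{
    private int count;
    public int Count { get { return count; } }
    public void Increment() { count++; }
}

static class BoxingDemo
{
    // Returns { count seen through the interface, count of the original variable }.
    public static int[] IncrementThroughInterface()
    {
        Counter original = new Counter();
        ICounter boxed = original;   // boxing: the value is copied into a heap object
        boxed.Increment();           // changes the boxed copy only
        return new int[] { boxed.Count, original.Count };
    }

    static void Main()
    {
        int[] counts = IncrementThroughInterface();
        Console.WriteLine(counts[0]); // prints 1
        Console.WriteLine(counts[1]); // prints 0
    }
}
```

The original variable never notices the increment, because the interface reference only ever talks to the box.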

Of course, you might not care about boxing. After all, boxing was invented precisely so that people could use value types and reference types on equal footing: the alternative to it is all that nasty wrapping code that was so widespread in Java before it incorporated boxing. So boxing is neat, isn't it?

Get Back in Line

When you define a value type, you (should) do it for a good reason. Most of the time, it's because you need a type with value semantics; in other words, you want a type that behaves in such a way that it's not possible to affect one variable of that type by performing operations on another variable of the same type.

The way value semantics is enforced is through memory allocation. Value types are allocated in-line. This means that the memory allocated for a value type instance is a part of the memory allocated for whatever contains that instance. If it's a local variable, the instance is allocated directly on the stack. If it's a field, the instance is contained in the memory allocated for the field's object. Only when it gets boxed does a value type wind up on the heap independently, all by itself.

At times, this is precisely the reason why you'll choose a value type: not so much for its semantics, as for its memory allocation. For example, you might be writing a game in XNA and you don't want the garbage collector kicking in and ruining your frame rate.

Whatever your reasons for wanting to keep them in line, the fact remains that you cannot cast your value types to an interface without them getting boxed. What, then, is the alternative?

Generic Static

Fun fact: a static field in a generic class is allocated separately for each closed constructed type of that generic class. What does this mean? Take a look at the following code:
using System;

namespace BeardsEye
{
    class GenericStatic<T>
    {
        public static int SomeVal;
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Each closed constructed type gets its own copy of SomeVal.
            GenericStatic<int>.SomeVal = 1;
            GenericStatic<string>.SomeVal = 2;

            Console.WriteLine(GenericStatic<int>.SomeVal);
            Console.WriteLine(GenericStatic<string>.SomeVal);
        }
    }
}

If you run it, it will print out 1 and 2. Of course, such behavior stands to reason; if you're not sure why, just change the type of SomeVal to T.
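To spell out that last hint, here is a small sketch of the same class with the field typed as T (I renamed it GenericStaticT so it doesn't clash with the listing above). Once the field's type depends on T, a single shared slot couldn't possibly hold both an int and a string, so the two closed types must each have their own:

```csharp
using System;

class GenericStaticT<T>
{
    // The field's type now varies with T, so it cannot be shared across closed types.
    public static T SomeVal;
}

class Program
{
    static void Main()
    {
        GenericStaticT<int>.SomeVal = 1;
        GenericStaticT<string>.SomeVal = "two";

        Console.WriteLine(GenericStaticT<int>.SomeVal);    // prints 1
        Console.WriteLine(GenericStaticT<string>.SomeVal); // prints two
    }
}
```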

Why am I suddenly talking about this, anyway? Because, in some cases, you can use this trick as a substitute for what would otherwise be a dictionary. Specifically, for some uses, you could replace a Dictionary<Type, Something> with:
public static class MyTypeDictionary<TType>
{
    public static Something Value;
}

Of course, this is a far cry from a fully functional dictionary. You don't have the Count property, there's no way to remove values, iterate over them or find out whether a type has an associated value or not. And you can't have an arbitrary number of these "dictionaries".

On the other hand, when you need to associate something with a type and that "something" changes from type to type, this technique is just perfect.
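As a rough illustration of that use, here is a sketch where the "something" is just a string label. TypeLabel and Labels are names I made up for this example, and the default value is my own assumption about how you might handle types that were never given a label:

```csharp
using System;

// Plays the role of Dictionary<Type, string>: one slot per closed type.
public static class TypeLabel<TType>
{
    // The initializer runs separately for every closed type,
    // so unregistered types simply see this default.
    public static string Value = "(no label)";
}

public static class Labels
{
    // Generic code can look up the value for T without touching typeof(T).
    public static string LabelOf<T>()
    {
        return TypeLabel<T>.Value;
    }
}

class Program
{
    static void Main()
    {
        TypeLabel<int>.Value = "integer";
        TypeLabel<DateTime>.Value = "timestamp";

        Console.WriteLine(Labels.LabelOf<int>());      // prints integer
        Console.WriteLine(Labels.LabelOf<DateTime>()); // prints timestamp
        Console.WriteLine(Labels.LabelOf<string>());   // prints (no label)
    }
}
```

The "lookup" here is an ordinary static field access, so there is no hashing and no boxing of a key involved.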

Fun With Functions

Now we finally get to the solution itself. Suppose you have an interface IConfusticatable, for all types that can be confusticated to a certain degree:
public interface IConfusticatable
{
    double Confusticate(int degree);
}

You want to be able to confusticate classes that implement this interface, but you would also like to confusticate value types, without boxing them. You could use the following trick:
public static class Confusticator<T>
{
    public static Func<T, int, double> Confusticate = ((what, degree) => ((IConfusticatable) what).Confusticate(degree));
}

What do you have to do in your value type? Not much:
public struct MyStruct
{
    static MyStruct()
    {
        Confusticator<MyStruct>.Confusticate = ((me, degree) => me.Confusticate(degree));
    }

    public MyStruct(double val)
    {
        Val = val;
    }

    public double Val;

    public double Confusticate(int degree)
    {
        return Val + degree;
    }
}

How do you use this when you want to confusticate something? Like this:
MyStruct myVar = new MyStruct(17);
Console.WriteLine(Confusticator<MyStruct>.Confusticate(myVar, 3));

I'm sure that, at this point, you're thinking "So what good is this? If I already know that myVar is a MyStruct, I can just call its own Confusticate method directly."

This, of course, is perfectly true, unless you're writing generic code. In that case you don't know beforehand what your type parameter will be.
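To make that concrete, here is a sketch of what such generic code might look like. Algorithms.ConfusticateAll is a hypothetical helper of my own, and to keep the example independent of static constructor timing I register the lambda explicitly in Main instead of in MyStruct's static constructor:

```csharp
using System;

public interface IConfusticatable
{
    double Confusticate(int degree);
}

public static class Confusticator<T>
{
    // Default: fall back to the interface (this boxes value types).
    public static Func<T, int, double> Confusticate =
        (what, degree) => ((IConfusticatable) what).Confusticate(degree);
}

public struct MyStruct
{
    public double Val;
    public MyStruct(double val) { Val = val; }
    public double Confusticate(int degree) { return Val + degree; }
}

public static class Algorithms
{
    // Knows nothing about T except that Confusticator<T> can handle it.
    public static double ConfusticateAll<T>(T[] items, int degree)
    {
        double total = 0;
        foreach (T item in items)
        {
            total += Confusticator<T>.Confusticate(item, degree);
        }
        return total;
    }
}

class Program
{
    static void Main()
    {
        // Explicit registration, replacing the boxing default for MyStruct.
        Confusticator<MyStruct>.Confusticate = (me, degree) => me.Confusticate(degree);

        MyStruct[] items = { new MyStruct(1), new MyStruct(2) };
        Console.WriteLine(Algorithms.ConfusticateAll(items, 3)); // (1 + 3) + (2 + 3): prints 9
    }
}
```

ConfusticateAll never mentions MyStruct or IConfusticatable, yet each element is passed to the registered lambda as a plain MyStruct, with no cast to the interface along the way.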

Warning: May Contain .NUTS

A few words of caution here. The proposed implementation of MyStruct relies on a static constructor, which is supposed to "register" the confustication lambda with the Confusticator. The problem with this is that it's a bit tricky to make sure that a value type static constructor is invoked.

If you dig around the C# Annotated Standard, you'll find some fascinating incompatibilities between different standards and their respective implementations.

According to the C# Standard, a static constructor for a value type is executed only when the first of the following occurs:
  • An instance member of the struct is referenced.
  • A static member of the struct is referenced.
  • An explicitly declared constructor of the struct is called.
On the other hand, the CLI Standard requires execution only when the first of the following occurs:
  • A static member of the struct is referenced.
  • An explicitly declared constructor of the struct is called.
To further complicate matters, the CLI Standard does not allow execution when an instance member of the struct is referenced. You'll note that this is in direct conflict with the C# Standard.

If you actually test what happens when you run a C# program on the CLR, you'll see that the static constructor is called as soon as the first of the following occurs:
  • An instance method of the struct is referenced.
  • An instance property of the struct is referenced.
  • A static member of the struct is referenced.
  • An explicitly declared constructor of the struct is called.
However, the static constructor is not called if you reference an instance field of the struct. Not only does the CLR implementation fail to conform to either standard, it does so in a way that seems spectacularly arbitrary. I'm sure there are some perfectly valid -- if arcane and obscure -- reasons for this.

All in all, it would be a lot more reliable to stick the "registration" code in some initialization method that you know will be called. That's what you would have to do anyway, if you wanted to register a confustication function for an enum, for example.
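Here is one way such an initialization method might look. ConfusticationSetup, Register and the Mood enum are all hypothetical names, and the confustication formulas are arbitrary; the only point is that the registration happens at a moment you control:

```csharp
using System;

public static class Confusticator<T>
{
    // No interface fallback in this sketch; every type must be registered.
    public static Func<T, int, double> Confusticate;
}

// A hypothetical enum: it has no static constructor to hook into.
public enum Mood { Calm = 1, Puzzled = 2, Baffled = 3 }

public static class ConfusticationSetup
{
    // Call this once at startup, before any generic code needs the delegates.
    public static void Register()
    {
        Confusticator<Mood>.Confusticate = (mood, degree) => (int) mood * degree;
        Confusticator<int>.Confusticate = (number, degree) => number + degree;
    }
}

class Program
{
    static void Main()
    {
        ConfusticationSetup.Register();

        Console.WriteLine(Confusticator<Mood>.Confusticate(Mood.Baffled, 2)); // 3 * 2: prints 6
        Console.WriteLine(Confusticator<int>.Confusticate(40, 2));            // 40 + 2: prints 42
    }
}
```

The same method also lets you register functions for types you don't own, such as int here.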

Finishing Touches

What if your IConfusticatable interface has more than one method? There are several possible solutions, but the simplest one would be to give your Confusticator<T> one static delegate field per IConfusticatable method.
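A sketch of that approach, assuming the interface gains a second method. Bebother is a made-up method for this example, and I register the lambdas in Main to sidestep the static constructor issue described earlier:

```csharp
using System;

public interface IConfusticatable
{
    double Confusticate(int degree);
    double Bebother(int degree);   // hypothetical second method
}

public static class Confusticator<T>
{
    // One delegate field per interface method, each with the boxing fallback.
    public static Func<T, int, double> Confusticate =
        (what, degree) => ((IConfusticatable) what).Confusticate(degree);

    public static Func<T, int, double> Bebother =
        (what, degree) => ((IConfusticatable) what).Bebother(degree);
}

public struct MyStruct
{
    public double Val;
    public MyStruct(double val) { Val = val; }
    public double Confusticate(int degree) { return Val + degree; }
    public double Bebother(int degree) { return Val * degree; }
}

class Program
{
    static void Main()
    {
        // Register a non-boxing implementation for each method.
        Confusticator<MyStruct>.Confusticate = (me, degree) => me.Confusticate(degree);
        Confusticator<MyStruct>.Bebother = (me, degree) => me.Bebother(degree);

        MyStruct myVar = new MyStruct(17);
        Console.WriteLine(Confusticator<MyStruct>.Confusticate(myVar, 3)); // 17 + 3: prints 20
        Console.WriteLine(Confusticator<MyStruct>.Bebother(myVar, 3));     // 17 * 3: prints 51
    }
}
```

If you forget to register one of the delegates, calls for that method fall back to the interface cast, which fails at run time for a struct like this one that doesn't implement the interface.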

Speaking of Confusticator<T>, if you find it too ugly to write Confusticator<MyStruct>.Confusticate(myVar, 3), you can use type inference to introduce some syntactic sugar:
public static class Confusticator<T>
{
    public static Func<T, int, double> Confusticate = ((what, degree) => ((IConfusticatable) what).Confusticate(degree));
}

public static class Confusticator
{
    public static double Confusticate<T>(T what, int degree)
    {
        return Confusticator<T>.Confusticate(what, degree);
    }
}

With that, you would be able to simply write Confusticator.Confusticate(myVar, 3) and have the compiler infer the type automagically.

Share and Enjoy

Well, that's all for now. For those of you who have a legitimate need to avoid boxing your value types, I hope this proves useful. For the rest, I hope that you found this an interesting trick.

I'm planning to write more recipes for getting stuff that C# doesn't have, such as multiple dispatch, so stay tuned!