Referential Identity and Internship

I was thinking about a gotcha in Java, and started wondering about the idea of making it convenient to intern reference objects.

“Interning” refers to the idea of ensuring that there is only one instance of an object with a given internal configuration – the classic example being strings in Java and .NET. When you construct a new string from a string literal, code is inserted to ensure that you get back a reference to the same string object for any given string literal value. So if you have the quoted string "Hello, world" in two places in your source code, they will point to the same object. This is a form of referential identity.

This predictably causes confusion in Java, because learners are led to think they can compare string values with ==. That is not generally true, because other operations that create strings (such as concatenation, or reading from a file) do not take part in interning. It’s less of a problem in C# because a type can overload == to compare the actual values instead of references, and this is exactly what the built-in System.String type does.
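
To make the difference concrete, here is a small sketch in C#. The class and method names (StringIdentityDemo, BuildAtRunTime) are made up for illustration. It builds a string at run time so that it is a fresh object rather than the literal, then compares it by value and by reference.

```csharp
using System;

public static class StringIdentityDemo
{
    // Builds "Hello, world" at run time, so the result is a new object
    // and not the string that backs the literal.
    public static string BuildAtRunTime()
    {
        return string.Concat("Hello, ", new string(new[] { 'w', 'o', 'r', 'l', 'd' }));
    }

    public static void Main()
    {
        string literal = "Hello, world";
        string built = BuildAtRunTime();

        // System.String overloads ==, so this compares the characters.
        Console.WriteLine(literal == built);                         // True

        // ReferenceEquals bypasses the overload and compares identity.
        Console.WriteLine(object.ReferenceEquals(literal, built));   // False

        // string.Intern maps equal values onto a single pooled instance.
        Console.WriteLine(object.ReferenceEquals(
            string.Intern(literal), string.Intern(built)));          // True
    }
}
```

The last line is the behaviour the rest of this post tries to reproduce for arbitrary reference types: equal "values" mapped onto one shared instance.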

But then there are types that frequently occur in C# programs where you don’t have the ability to overload the == operator. An example is an interface type: interfaces cannot declare operator overloads, so == on two interface references is always a reference comparison. In the code I’m working on, a lot of things are described by COM interfaces, which from C# look just like ordinary interfaces. I’d frequently like to be able to intern references to such things, allowing me to casually compare two references in a meaningful way. This is especially true where I’m dealing with an external COM library that just allocates objects on the fly to represent data, so the identity of each object is not currently being put to good use and is completely meaningless.

But the right way to do this depends on some detail of the interface. There is also the question of the scope in which internship should apply – a whole process, or just the current thread?

Trivial example:

public interface IMyWeirdInterface
{
    string Name { get; }
}

What I’d like is to be able to assume that any two objects of that type will be the same instance if they have the same Name. So what I need is an object that manages the “internship” in that way:

var i = new Internship<string, IMyWeirdInterface>(m => m.Name);

IMyWeirdInterface x = new MockMyWeirdInterface { Name = "Fred" };
IMyWeirdInterface y = new MockMyWeirdInterface { Name = "Fred" };
IMyWeirdInterface z = new MockMyWeirdInterface { Name = "Jim" };

i.Intern(ref x);
i.Intern(ref y);
i.Intern(ref z);

Debug.Assert(x == y); // same Name, so now the same instance
Debug.Assert(x != z); // different Name, so still distinct
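
MockMyWeirdInterface is not shown in this post. A minimal stand-in, assuming it is nothing more than a settable Name property, might look like the following. The small check underneath (MockDemo is a made-up name) shows the starting point before any interning: two mocks with the same Name are still distinct objects.

```csharp
using System;

public interface IMyWeirdInterface
{
    string Name { get; }
}

// Assumed shape of the mock: just enough to satisfy the interface.
public sealed class MockMyWeirdInterface : IMyWeirdInterface
{
    public string Name { get; set; }
}

public static class MockDemo
{
    public static void Main()
    {
        IMyWeirdInterface x = new MockMyWeirdInterface { Name = "Fred" };
        IMyWeirdInterface y = new MockMyWeirdInterface { Name = "Fred" };

        Console.WriteLine(x.Name == y.Name);   // True: equal by value
        Console.WriteLine(x == y);             // False: == on an interface type compares references
    }
}
```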

So no matter where a method obtains its references to IMyWeirdInterface, it can intern them and they will have referential identity. It just needs access to the right Internship object to manage this. Most COM objects are “apartment threadsafe”, which is a coy way of saying that they aren’t particularly threadsafe at all and should not be shared between threads. So for such objects, the right thing to do is make the Internship object thread local:

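public static class Internships
{
    // [ThreadStatic] gives every thread its own copy of this field.
    [ThreadStatic] private static Internship<string, IMyWeirdInterface> WeirdThreadLocal;

    public static Internship<string, IMyWeirdInterface> Weird
    {
        get
        {
            // Created lazily, the first time each thread asks for it.
            if (WeirdThreadLocal == null)
                WeirdThreadLocal = new Internship<string, IMyWeirdInterface>(m => m.Name);

            return WeirdThreadLocal;
        }
    }
}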

As well as being absolutely necessary to ensure objects don’t get mysteriously shared between threads, this also has the nice quality of avoiding the need for locks to coordinate between threads – it’s a per-thread singleton. Now, any method can say:

Internships.Weird.Intern(ref x);
Internships.Weird.Intern(ref y);

And know that if x and y have the same Name, they will now be the same object.
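
As a quick check of the per-thread behaviour, here is a small sketch that uses the same lazy [ThreadStatic] pattern. PerThreadDemo and its members are made-up names, and a plain object stands in for the Internship instance so the example is self-contained.

```csharp
using System;
using System.Threading;

public static class PerThreadDemo
{
    // Stand-in for the per-thread Internship; each thread has its own copy of this field.
    [ThreadStatic] private static object _perThread;

    public static object Current
    {
        get { return _perThread ?? (_perThread = new object()); }
    }

    // Returns true if a second thread gets a different instance from the caller's.
    public static bool OtherThreadSeesDifferentInstance()
    {
        object mine = Current;
        object theirs = null;

        var worker = new Thread(() => theirs = Current);
        worker.Start();
        worker.Join();   // wait, so that 'theirs' has been assigned

        return !object.ReferenceEquals(mine, theirs);
    }

    public static void Main()
    {
        Console.WriteLine(object.ReferenceEquals(Current, Current));   // True: same thread, same instance
        Console.WriteLine(OtherThreadSeesDifferentInstance());        // True: the worker thread got its own
    }
}
```

The flip side is that two references interned on different threads are not guaranteed to end up as the same instance, which is exactly what you want for objects that must not cross threads.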

Here’s the declaration of Internship (not exactly rocket science, is it?):

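public class Internship<TKey, TRef>
{
    // One canonical instance per key.
    private readonly Dictionary<TKey, TRef> _map = new Dictionary<TKey, TRef>();
    private readonly Func<TRef, TKey> _keySelector;

    public Internship(Func<TRef, TKey> keySelector)
    {
        _keySelector = keySelector;
    }

    public void Intern(ref TRef obj)
    {
        TKey key = _keySelector(obj);
        TRef result;
        if (_map.TryGetValue(key, out result))
            obj = result;          // key already seen: swap in the canonical instance
        else
            _map.Add(key, obj);    // first sighting: this instance becomes canonical
    }
}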

The keySelector parameter to the constructor is the thing that controls which instances are treated as identical, by picking a key value from each object (the Name property in my examples above).

Categories: .NET, C#, Interop, threads