Home > Uncategorized > Thread-Safe Mutable Objects using WSL (When, Silently and Loudly)

Thread-Safe Mutable Objects using WSL (When, Silently and Loudly)

What does this look like?

class ThreadSafeQueue<T> : ThreadStateful
{
    private readonly Queue<T> _queue = new Queue<T>();

    public void Enqueue(T item)
    {
        Loudly(() => _queue.Enqueue(item));
    }

    public T Dequeue()
    {
        return When(() => _queue.Count != 0)
               .Silently(() => _queue.Dequeue());
    }
}


Okay, it looks a bit weird. But basically, if you strip away the weird parts, it’s almost the same as:

class NotThreadSafeQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();

    public void Enqueue(T item)
    {
        _queue.Enqueue(item);
    }

    public T Dequeue()
    {
        for (;;)
        {
            if (_queue.Count != 0)
                break;
        }
        return _queue.Dequeue();
    }
}

So it’s a queue that you can push items into freely, and when you try to pull an item out with Dequeue, the call will hang up until there’s an item to give you – in a way that will seriously heat up your CPU, as it just loops at full speed until the queue has something in it. It’s pretty pointless in this form, because if your one thread is hung up waiting for an item to appear in the queue, then you obviously need another thread to insert something in the queue, and that’s no good because this implementation is hopelessly non-threadsafe.

The advantage of the weird kind, as indicated by the name, is that it’s thread safe, and also it has a more efficient way of waiting than spinning in a loop. The methods When, Silently and Loudly are threading primitives. What do they mean?

When means “Wait until this condition is true before allowing one thread to execute what comes next.” It must always be followed by a “chained” call to one of the other two primitives. If two or more threads are waiting at the same time, when the condition becomes true only one thread will return from the wait.

Silently means “I need to access some mutable data, but I’m not going to affect the value of any of my When conditions.”

Loudly means “I need to modify some mutable data so as to possibly cause a When condition to become true.”

Both Loudly and Silently can be used without waiting first, i.e. without a When prefix.

The distinction between Loudly and Silently is actually not totally essential for correct programs. A When condition only needs to be re-evaluated right after some Loudly code executes. You could forgo Silently and simply use Loudly all the time. Silently is just a way of helping the system to minimize the time it spends re-evaluating When conditions.

So this makes writing a thread safe queue very neat. And it’s applicable to other more complex scenarios; regardless of how complex the state of our object is, and our inter-thread dependencies, by inheriting from ThreadStateful and using the three primitives we can keep the code pretty readable.

A hatch is a kind of window in an interior wall, like a serving hatch between a kitchen and a dining room. It has a small shelf, only big enough to hold one thing at a time. So it’s either occupied with a single occupant, or not. If you want to put something into the hatch, you have to wait for it to become unoccupied. If you want to take something from the hatch, you have to wait for it to become occupied. With our new primitives, this is ridiculously simple:

class Hatch<T> : ThreadStateful
{
    private T _occupant;
    private bool _occupied;

    public void Put(T item)
    {
        When(() => !_occupied).Loudly(() =>
        {
            _occupant = item;
            _occupied = true;
        });
    }

    public T Get()
    {
        return When(() => _occupied).Loudly(() =>
        {
            _occupied = false;
            return _occupant;
        });
    }
}

No need to use locks, events, whatever. Inside the primitives, we can just write ordinary code, because we know only one thread at a time is running the code inside the primitives. No need to worry about the exact order in which we update variables. And the fluent, chained way the primitives work means that our code stays left-to-right readable.

Here’s a little test that combines both the classes we’ve come up with so far:

var hatch1 = new Hatch<int>();
var hatch2 = new Hatch<int>();
var queue = new ThreadSafeQueue<int>();

const int count = 10000;

ThreadPool.QueueUserWorkItem(o => 
{
    for (var n = 0; n < count; n++)
    {
        hatch1.Put(n);
        queue.Enqueue(hatch2.Get());
    }
});

ThreadPool.QueueUserWorkItem(o =>
{
    for (var n = count; n < (count * 2); n++)
    {
        queue.Enqueue(hatch1.Get());
        hatch2.Put(n);
    }
});

for (var n = 0; n < count; n++)
{
    Debug.Assert(queue.Dequeue() == n);
    Debug.Assert(queue.Dequeue() == n + count);
}

Console.WriteLine("Done");

There are three threads there. Two are passing things back and forth via a couple of hatches, and also logging their activities to the queue, and the third thread is monitoring what appears in the queue, checking that everything appears in exactly the right order.

An important type of resource is a “many readers, single writer”. Rather like a file handle, it has one kind of use (“reading”) that any number of threads can indulge in simultaneously, but another kind of use (“writing”) that is exclusive. And writing is so exclusive, it even clashes with reading, not just other attempts at writing. So to open the resource for reading, you have to wait until any writer has closed the resource. To open it for writing, you have to wait for all readers or any writer to close it first. How hard is it to implement with these three primitives?

Firstly, for demonstration purposes only, we need something to represent a handle to the resource. The simplest version of that must be:

public class Handle : IDisposable
{
    private readonly Action _closeAction;

    public Handle(Action closeAction) 
    {
        _closeAction = closeAction; 
    }

    public void Dispose() 
    { 
        if (_closeAction != null)
        {
            _closeAction();
            _closeAction = null;
        }
    }
}

It’s really just an Action delegate wrapped up in IDisposable, to make it seem more resource-like. It means the caller will be able to say:

using (var handle = resource.OpenForWriting())
{
    // blah...
}

Obviously for a real resource, we’d make OpenForWriting return a class with operations that allow mutation of the state of the resource, and OpenForReading return a class that only has simple getter properties. Also I’m being extra picky about getting IDisposable right – according to the docs, the caller is allowed to call Dispose multiple times on the same object and it should be no different to calling it once. Strange but true. So I just guard against that by discarding the delegate after it’s been called once. In the same way, a real implementation would throw ObjectDisposedException if any of its other methods were called after Dispose.

Anyway, so on to the implementation:

class OneWriterManyReaders : ThreadStateful
{
    private bool _writing;
    private int _reading;

    public IDisposable OpenForReading()
    {
        When(() => !_writing)
            .Silently(() => { _reading++; });

        return new Handle(() => Loudly(() => { _reading--; }));
    }

    public IDisposable OpenForWriting()
    {
        When(() => !_writing && _reading == 0)
            .Silently(() => { _writing = true; });

        return new Handle(() => Loudly(() => { _writing = false; }));
    }
}

Behold the simplicity! It practically writes itself. The state is just ordinary variables. The code is almost plain English. You can see it’s correct. Each lambda that we pass to one of the primitives is like a tiny oasis of calm, where we don’t have to worry about the complexities of threading.

One more interesting class: a channel. This is something described in Tony Hoare’s famous book CSP. It’s a lot like a hatch, except when you put something on it, you hang around and wait until someone picks it up.

So it seems like it should be a simple matter of modifying Hatch. But this first version has a bug in it. Can you spot it?

class BrokenChannel<T> : ThreadStateful
{
    private T _occupant;
    private bool _occupied;

    public void Send(T item)
    {
        When(() => !_occupied).Loudly(() =>
        {
            _occupied = true;
            _occupant = item;
        });

        When(() => !_occupied)
            .Silently(() => { });
    }

    public T Receive()
    {
        return When(() => _occupied).Loudly(() =>
        {
            _occupied = false;
            return _occupant;
        });
    }
}

All I’ve really done (apart from some creative renaming) is put in that extra wait (passing a no-op, because I just want to wait, and When has no effect unless it is followed by another primitive). After storing a new occupant, I wait until there is no occupant. What could possibly be wrong with that?

The problem is that a second thread could put in a new occupant! So then we might never return from our wait; it depends on a third thread removing the second thread’s item. Our original thread can’t remove it, because it’s still mistakenly waiting for its own item to be removed. So this could lead to a disastrous permanent deadlock. This is an example of the ABA problem – the mistaken assumption that, because things now look the same as they did the last time I looked, then obviously nothing happened during that time.

But by taking advantage of the simplicity of the primitives, here’s one way to solve it:

class Channel<T> : ThreadStateful
{
    private T _occupant;
    private bool _occupied;
    private uint _counter;

    public void Send(T item)
    {
        uint myCounter = 0;

        When(() => !_occupied).Loudly(() =>
        {
            _occupied = true;
            _occupant = item;
            myCounter = ++_counter;
        });

        When(() => _counter != myCounter)
            .Silently(() => { });
    }

    public T Receive()
    {
        return When(() => _occupied).Loudly(() =>
        {
            _occupied = false;
            ++_counter;
            return _occupant;
        });
    }
}

We give each new occupant (or lack of occupant) an effectively unique identity by using a counter. This makes it extremely easy to see if our occupant is still in there – we just wait for the counter to change. The only danger is if the counter wraps all the way round during a single call to Send, but that will take 4.2 billion increments, so it’s pretty safe. If you’re concerned you could use ulong, and allow for 18 billion billion increments, but I think that would be a little paranoid!

So all the remains is to present the implementation of the primitives. It’s actually very simple. All they do is encapsulate a simple usage pattern for Monitor.Wait and Monitor.Pulse.

Firstly, to allow the fluent-style chained method calls, we need a type that we can return from When:

public struct WaitingOperation
{
    private readonly object _locker;
    private readonly Func<bool> _waitCondition;

    public WaitingOperation(object locker, Func<bool> waitCondition)
    {
        _locker = locker;
        _waitCondition = waitCondition;
    }

    public T Silently<T>(Func<T> readOp)
    {
        Debug.Assert(_locker != null);
        Debug.Assert(_waitCondition != null);

        lock (_locker)
        {
            while (!_waitCondition())
                Monitor.Wait(_locker);

            return readOp();
        }
    }

    public T Loudly<T>(Func<T> writeOp)
    {
        Debug.Assert(_locker != null);
        Debug.Assert(_waitCondition != null);

        lock (_locker)
        {
            while (!_waitCondition())
                Monitor.Wait(_locker);

            T result = writeOp();
            Monitor.Pulse(_locker);
            return result;
        }
    }

    public void Silently(Action readOp)
    {
        Silently(() => { readOp(); return 0; });
    }

    public void Loudly(Action writeOp)
    {
        Loudly(() => { writeOp(); return 0; });
    }
}

It just stores the wait condition (as a Func delegate) and an object to lock on. Then it has the Loudly and Silently methods, which encapsulate the lock/wait/pulse pattern. The only different between them is that Silently doesn’t bother to pulse. For convenience there are also overloads of the two methods which accept Action delegates, where no return value is needed.

Then there’s the base class used in all the above examples:

public class ThreadStateful
{
    private readonly object _locker = new object();

    protected WaitingOperation When(Func<bool> waitFor)
    {
        return new WaitingOperation(_locker, waitFor);
    }

    protected T Silently<T>(Func<T> readOp)
    {
        lock (_locker) { return readOp(); }
    }

    protected T Loudly<T>(Func<T> writeOp)
    {
        lock (_locker)
        {
            T result = writeOp();
            Monitor.Pulse(_locker);
            return result;
        }
    }

    protected void Silently(Action readOp)
    {
        Silently(() => { readOp(); return 0; });
    }

    protected void Loudly(Action writeOp)
    {
        Loudly(() => { writeOp(); return 0; });
    }
}

It creates a private object to use for locking, and has the When method which is just a factory for instances of WaitingOperation. It also has its own simpler versions of Loudly and Silently which don’t wait on a condition.

Throughout the design, it is ensured that when the code passed into the primitives is executing, the _locker object is owned by one thread. So in all the code passed to the primitives, you can directly access the mutable state of your derived class.

As a wrapping layer, it’s so thin it’s almost see-through. But I find it much more instantly comprehensible than using the Monitor APIs directly. They are actually quite difficult to explain clearly; they have somewhat spooky behaviour with regard to the state of the _locker object, temporarily unlocking it to allow other threads to progress. If you don’t have language support for anonymous methods with closure over local variables then the Monitor APIs are probably the simplest interface that could be offered. But as C# does have such support, we can make the programming interface more natural.

About these ads
Categories: Uncategorized Tags: ,
  1. jalf
    March 12, 2010 at 1:45 pm | #1

    Clever. Closures can definitely simplify a lot of otherwise nasty code, but this is one of the better examples. :)

    • earwicker
      March 16, 2010 at 12:10 pm | #2

      Thanks, incidentally this is a situation where Scala’s very succinct syntax would be nice. Basically if a method accepts Func, you can pass it an expression of type bool. If this was in C#, it would be like being able to omit the () => in front of all these lambdas. On the downside, it makes it less clear from the call site where/when the code will be executed. But some argue that it is the job of the enclosing method name/documentation to clarify that.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: