Archive for February, 2013

Eventless Programming – Part 2: Computed observables

February 28, 2013

Welcome back for more dry philosophical musings (I promise there will be an actual working GUI with checkboxes and stuff in the next part). By the way, the code is here: github.com/danielearwicker/eventless/. Also note that I just realised that Getable and Setable would make more sense in a C# context than Readable and Writeable, so I’ve done some renaming.

Last time we got as far as defining a class that could hold a single value in a property called Value, and would fire an event called Changed whenever the value was modified.

Minor detour: for ultimate convenience let’s make a static helper method so we can take full advantage of the type inference that kicks in for generic methods but is sadly absent for constructors of generic classes:

public static class Setable
{
    public static Setable<T> From<T>(T initVal)
    {
        return new Setable<T>(initVal);
    }
}

Now we can really ramp up the hysteria as we add two numbers together:

var i = Setable.From(3);
var j = Setable.From(4);
var sum = Computed.From(() => i + j);

Now, I know it’s exciting but try to contain yourself. The point is, if (and only if) we modify i or j, we need sum to be recomputed (and thus in turn trigger recomputation of anything that consumes sum), just like it would be if these were cells in Excel. This will give us simplicity combined with optimal (that is, minimal) computation. When these are no longer integers but instead complicated data structures or large pieces of UI, it should seem a lot more useful.

So there has to be some magic plumbing involved that notices that we refer to i and j in our definition of sum.

First, Computed.From is just another handy static helper method:

public static class Computed
{
    // Excerpt...
    public static Computed<T> From<T>(Func<T> compute)
    {
        return new Computed<T>(compute);
    }
}

No surprises there, so let’s take a look at the more substantial generic Computed<T> class:

public sealed class Computed<T> : Forwarder<Setable<T>, T>, IGetable<T>, IDisposable
{
    private readonly Func<T> _compute;

    private ISet<IGetable> _subscriptions = Computed.EmptySubscriptions;

    public Computed(Func<T> compute)
        : base(new Setable<T>())
    {
        _compute = compute;
        Recompute();
    }

    private void Recompute()
    {
        var newSubscriptions = new HashSet<IGetable>();

        Computed.Listeners.Push(o => newSubscriptions.Add(o));
        var newVal = _compute();
        Computed.Listeners.Pop();
        Impl.Value = newVal;
        newSubscriptions.Remove(this);
        newSubscriptions.Remove(Impl);

        foreach (var sub in _subscriptions.Where(s => !newSubscriptions.Contains(s)))
            sub.Changed -= Recompute;
        
        foreach (var sub in newSubscriptions.Where(s => !_subscriptions.Contains(s)))
            sub.Changed += Recompute;
        
        _subscriptions = newSubscriptions;
    }

    public T Value
    {
        get { return Impl.Value; }
    }

    public void Dispose()
    {
        foreach (var sub in _subscriptions)
            sub.Changed -= Recompute;
    }
}

First key point: it only implements IGetable, not ISetable, so it has no setter for the value. This makes sense because it holds its own definition of what the value should be at any given moment.

Second (and this is really another detour), it kinda-sorta inherits from Setable, because it needs exactly the same kind of logic for triggering the Changed event only if the new value really is different. But I didn’t want to use full implementation inheritance, because then the features of Setable (such as the Value setter) would all be public, which we’ve already established doesn’t make sense. So instead I inherit a class called Forwarder, specifically designed to encapsulate an implementation of IGetable for the purposes of reuse. We’ll skip over the details of that, but basically if you’re familiar with private inheritance in C++, I’ve roughly simulated that. The end result is that we have a property Impl that holds the Setable we created in the base constructor call, and its Changed event is exposed for us.
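To make the private-inheritance idea a little more concrete, here is a rough sketch of the same shape. It is in JavaScript rather than C# purely for brevity, and it is not the Forwarder class itself (whose details are skipped above): makeSetable and readOnlyView are hypothetical names invented for this illustration. The wrapper holds a writable observable and forwards only the read side, so the change-detection logic is reused without the setter becoming public.

```javascript
// Hypothetical names throughout: none of this is Eventless code.
function makeSetable(initial) {
    var value = initial;
    var handlers = [];
    return {
        get: function() { return value; },
        set: function(newValue) {
            if (newValue === value) return;   // only fire on a real change
            value = newValue;
            handlers.slice().forEach(function(h) { h(); });
        },
        subscribe: function(h) { handlers.push(h); }
    };
}

// Wraps a setable but exposes only reading and subscribing.
function readOnlyView(impl) {
    return {
        get: function() { return impl.get(); },
        subscribe: function(h) { impl.subscribe(h); }
    };
}

var impl = makeSetable(1);
var view = readOnlyView(impl);
var fired = 0;
view.subscribe(function() { fired++; });

impl.set(2);   // the owner can still write through impl
impl.set(2);   // same value again: no event
console.log(view.get(), fired, typeof view.set);
```

Whoever holds only view can watch the value change but has no way to set it, which is roughly the position a consumer of Computed is in.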

Third: the meat of the class lies in its management of a set of other observables to which it subscribes for the Changed event. The currently subscribed observables are held in the _subscriptions set. The constructor calls Recompute the first time, and then Recompute has to make subscriptions so it will be called in the future, like this:

sub.Changed += Recompute;

The cunning part is this:

var newSubscriptions = new HashSet<IGetable>();

Computed.Listeners.Push(o => newSubscriptions.Add(o));
var newVal = _compute();
Computed.Listeners.Pop();
Impl.Value = newVal;

Remember that mystery call to Computed.Listeners.Notify(this) we encountered in part 1? Of course you do – you’ve been barely able to sleep for thinking about it constantly. Really, you look terrible. Just read the rest of this part and then try to get some rest.

Computed.Listeners is just defined as:

internal static readonly ListenerStack<IGetable> Listeners = new ListenerStack<IGetable>();

And ListenerStack is quite a handsome young utility class:

public class ListenerStack<T>
{
    private readonly ThreadLocal<Stack<Action<T>>> Stack =
                    new ThreadLocal<Stack<Action<T>>>(() => new Stack<Action<T>>());

    public void Push(Action<T> listener)
    {
        Stack.Value.Push(listener);
    }

    public void Pop()
    {
        Stack.Value.Pop();
    }

    public void Notify(T obs)
    {
        if (Stack.Value.Count != 0)
            Stack.Value.Peek()(obs);
    }
}

It allows us to notify (that is, pass some value of some type or other to) the top-most listener in a stack of listeners on the current thread. The thread detail is something that knockout.js doesn’t have to worry about, but in the CLR it’s a must. If your UI thread is using Eventless and meanwhile some background worker has also found a use for it, they had better not get their wires crossed. It would be a fine mess if the user of a library were careful not to share non-thread-safe objects (such as Setable) between threads, and then the library got them tangled up anyway.

Computed registers its own listener at the top of the stack of listeners: a delegate that just records every Getable it is notified about. And (as we saw in part one), Getables notify about themselves whenever their values are accessed.

So this means that Computed can (and does) easily find out exactly what Getables are read during the execution of its definition. It can then subscribe to them. In fact by using a HashSet it can pretty rapidly track any changes in its immediate dependencies. And they can be read anywhere while the definition is executing – way down the stack in some other method, without any need to explicitly pass any extra parameters around. The implications of that for constructing software are, I think, rather mind-blowing, and if your mind isn’t blown to smithereens by it, then either (a) you don’t fully get the implications yet, or (b) your mind is already in smithereens from previously discovering this.
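The whole mechanism can be sketched in a few dozen lines. This is a hedged JavaScript analogue, not the Eventless source: listeners, notifyRead, setable and computed are names made up for the illustration, a plain array stands in for ListenerStack (JavaScript is single-threaded, so the thread-local part is moot), and the try/finally around the definition is my own addition.

```javascript
var listeners = [];   // stack of "someone is computing right now" callbacks

function notifyRead(observable) {
    if (listeners.length !== 0) {
        listeners[listeners.length - 1](observable);
    }
}

function setable(initial) {
    var value = initial;
    var handlers = [];
    var self = {
        get: function() {
            notifyRead(self);                 // announce the read to the top listener
            return value;
        },
        set: function(newValue) {
            if (newValue === value) return;   // only fire on a real change
            value = newValue;
            handlers.slice().forEach(function(h) { h(); });
        },
        subscribe: function(h) { handlers.push(h); },
        unsubscribe: function(h) {
            var i = handlers.indexOf(h);
            if (i !== -1) handlers.splice(i, 1);
        }
    };
    return self;
}

function computed(compute) {
    var impl = setable(undefined);
    var subscriptions = new Set();

    function recompute() {
        var found = new Set();
        listeners.push(function(o) { found.add(o); });
        var newValue;
        try {
            newValue = compute();             // every observable read in here is recorded
        } finally {
            listeners.pop();
        }
        impl.set(newValue);
        found.delete(impl);                   // never depend on ourselves
        subscriptions.forEach(function(s) {
            if (!found.has(s)) s.unsubscribe(recompute);
        });
        found.forEach(function(s) {
            if (!subscriptions.has(s)) s.subscribe(recompute);
        });
        subscriptions = found;
    }

    recompute();
    // Expose only the read side of impl.
    return { get: impl.get, subscribe: impl.subscribe, unsubscribe: impl.unsubscribe };
}

var i = setable(3);
var j = setable(4);
var runs = 0;
var sum = computed(function() { runs++; return i.get() + j.get(); });
var doubled = computed(function() { return sum.get() * 2; });

var before = sum.get();   // 7
i.set(10);                // recomputes sum, which in turn recomputes doubled
j.set(4);                 // unchanged value: nothing is recomputed
var after = sum.get();    // 14
```

Note that nothing tells sum which observables it depends on. It finds out by being the top listener while its definition runs, and doubled picks up its dependency on sum in exactly the same way.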

One little bit of extra housekeeping:

newSubscriptions.Remove(Impl);

Rationale: it can be handy for a computed to read its old value, and while the definition delegate is executing that old value is there to be read, so we may as well allow it. But it doesn’t make sense for it to consequently be dependent on itself.

Bonus Content

The other thing that will be very handy is a collection class. The idea is that (in part 3) we’re going to be displaying the contents of the collection in the UI, so it should be an ordered collection. So, not an ISet or an IDictionary. It must be possible to modify it through random access. In other words, it’s an IList. The CLR has an ObservableCollection class, but I’d like to start with a plain interface and build up from there (that way, it will be easy to bridge to any other frameworks):

public interface ISetableList<T> : IGetable<IList<T>>, IList<T>
{
    event Action<int> Added;
    event Action<int> Updated;
    event Action<int> Removed;
    event Action Cleared;
}

The events that take an int refer to the index within the collection. Note that the collection itself is an IGetable, not an ISetable – you can’t substitute in a new list. You have to clear it down and add new elements. This is no great hardship and simplifies a few things. But it is (effectively) a list of Setables, because you can update individual elements, hence the name ISetableList.

It is also an IList of the given element type. So in our implementation, SetableList, we have to implement all the methods of IList and make them trigger the appropriate events. And of course, whenever the value of an element is read, we had better notify the Computed on the top of the stack (if any). Look at the code on GitHub for the details.
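For a feel of what those index-carrying events look like in use, here is a small JavaScript sketch. It is only an illustration under assumptions: setableList, on, insert, removeAt and friends are hypothetical names, not the SetableList class from the repository. It also leaves out the read notification to the current Computed, and it fires Updated on every assignment, which the real implementation may not do.

```javascript
// Hypothetical sketch of a list that reports index-based changes.
function setableList() {
    var items = [];
    var handlers = { added: [], updated: [], removed: [], cleared: [] };

    // Call every handler registered for the named event, passing the index (if any).
    function fire(name, index) {
        handlers[name].slice().forEach(function(h) { h(index); });
    }

    function checkIndex(index, limit) {
        if (index < 0 || index >= limit) throw new RangeError('index out of range');
    }

    function insert(index, value) {
        checkIndex(index, items.length + 1);   // inserting at the end is allowed
        items.splice(index, 0, value);
        fire('added', index);
    }

    return {
        on: function(name, h) { handlers[name].push(h); },
        count: function() { return items.length; },
        get: function(index) { checkIndex(index, items.length); return items[index]; },
        set: function(index, value) {
            checkIndex(index, items.length);
            items[index] = value;
            fire('updated', index);
        },
        insert: insert,
        add: function(value) { insert(items.length, value); },
        removeAt: function(index) {
            checkIndex(index, items.length);
            items.splice(index, 1);
            fire('removed', index);
        },
        clear: function() {
            items.length = 0;
            fire('cleared');
        }
    };
}

var log = [];
var list = setableList();
['added', 'updated', 'removed'].forEach(function(name) {
    list.on(name, function(index) { log.push(name + ' ' + index); });
});
list.on('cleared', function() { log.push('cleared'); });

list.add('a');
list.add('b');
list.set(0, 'c');
list.removeAt(1);
list.clear();
console.log(log);
```

A UI bound to such a list can react to each event by touching only the row at the given index, rather than rebuilding everything.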

Okay, so we have most of the building blocks in place. Thank you for your patience. Next stop, part 3 and a working UI.

Categories: Uncategorized

Eventless Programming – Part 1: Why the heck?

February 25, 2013 11 comments

An “event” is a list of routines that all get executed (in unpredictable order) whenever something occurs. It is the low-level building block of complex interactive software.

But as an abstraction, it fits poorly with our needs. Rather than the abstract idea of an event occurring, it is far more useful to think more specifically of a value changing. That is, we should always pair together two things: storage for a unit of information, and the event that the information has changed. This pairing is an observable.

In Windows Forms this pattern occurs, but only informally. For example, controls have a property called Enabled, and a corresponding event called EnabledChanged. Many controls have a property called Text, and an event called TextChanged. And so on. Unfortunately nothing links these together apart from the naming pattern.

WPF/Silverlight tried to fix this through Dependency Properties. A dependency property has a value, and there is a standard mechanism for notifying of a change in that value. But the results aren’t pretty: essentially it is just more manual event firing and handling, leading people to cook up various complex techniques to avoid writing so much boilerplate code.

There is a basic difficulty that occurs with stateful software. In mathematics once a value is defined and given a name, we don’t change it some time later on in the proof. In most programming languages that rule is tossed out carelessly. If x is said to be 5 and y is said to be x + 5, then y is surely 10… until x changes. Of course we could define y as a function of x, and recompute it every time we need to know the value, but in real programs we soon find that we’re repeating a lot of identical calculations, so instead we cache our values in variables, and then try to figure out manually when it would be an appropriate time to recompute them.
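The x and y example takes only a few lines to reproduce. The snippet is JavaScript simply because it is short to type into a console; the variable names stale and fresh are just labels for this illustration.

```javascript
var x = 5;
var y = x + 5;       // cached once: 10

x = 7;               // x changes, but nothing tells y

var stale = y;       // still 10
var fresh = x + 5;   // 12, but only because we recomputed by hand
console.log(stale, fresh);
```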

Events are the way we do this, but it’s crazy because we are repeatedly building, by hand, something that should be happening automatically, as simply as it does in Excel. If you put a 5 in cell B3 and then in another cell you put =B3+5, the spreadsheet is smart enough to know that it will need to recompute the second cell whenever the value in B3 changes. Excel gives us a model for declaring what our variables should contain, expressed in terms of other variables, and it harnesses the power of events to keep everything consistent, but we as users never have to manually hook up a single event handler, and nor should we.

Brothers and sisters, let us, here and now, build a C# framework for providing us with the convenience and minimal recomputation of Excel, but with the considerable advantage that we get to define the UI ourselves (and specify our own meaningful variable names instead of unhelpful cell references…)

(By the way, if you’re familiar with knockout.js, this is all going to seem eerily familiar, but a lot more statically typed.)

Now, buried beneath whatever we cook up, we will need an event. Let’s see how far we can get with a single solitary event doing all the work.

public interface IGetable
{
    event Action Changed;
}

[Sidenote: I was going to call it IObservable but as Brandon Wallace pointed out in the comments, there’s already System.IObservable<T>. Then I called it IReadable, but then I realised properties have getters and setters, and as you’ll see, we’re really just building an abstraction over a property with a Changed event.]

I’m breaking the rules of the event pattern by using the super-bland Action delegate, but I don’t care. A basic principle here is that Changed will never need to provide any further information about the event, other than the fact that it has occurred. By the way, that interface is already finished! I’m not going to add anything more to it, ever.

Well, kind of. It is just a base for us to extend:

public interface IGetable<out T> : IGetable
{
    T Value { get; }
}

This turns out to be very useful because it means from an event-handling perspective we can treat all observables alike, regardless of what type of value they hold. And again, we’re entirely done with that interface. It’s already being all it will ever need to be. Note that we only have a getter for the value, however. Which leads us to:

public interface ISetable<T> : IGetable<T>
{
    new T Value { get; set; }
}

All very pretty and abstract, but still no code that does anything. Patience, I’m building to it. Good grief, this is only part 1! Seriously, don’t you ever shut up? Alright, alright, here’s our first class, the actual implementation of an observable:

public sealed class Setable<T> : ISetable<T>, IEquate<T>
{
    private T _value;
    private bool _changing;

    public Setable(T value = default(T))
    {
        _value = value;
        EqualityComparer = DefaultEqualityComparer;
    }

    public T Value
    {
        get
        {
            Computed.Listeners.Notify(this);
            return _value;
        }

        set
        {
            if (EqualityComparer(_value, value))
                return;

            if (_changing)
                throw new RecursiveModificationException();

            _value = value;

            var evt = Changed;
            if (evt == null) 
                return;

            _changing = true;
            try { evt(); } 
            finally { _changing = false; }
        }
    }

    public event Action Changed;

    public Func<T, T, bool> EqualityComparer { get; set; }

    public static bool DefaultEqualityComparer(T a, T b)
    {
        return ReferenceEquals(a, b) || (!ReferenceEquals(a, null) && a.Equals(b));
    }

    public static implicit operator T(Setable<T> from)
    {
        return from.Value;
    }
}

It fully implements ISetable – not especially hard to do. The critically important part is that the Value property setter doesn’t fire the event unless the value has truly changed.

[Update: In the comments Mark Knell mentioned the problem of re-entrance. The check for equality before firing the event is a partial solution to that; it silently allows recursive attempts to set Value to the same value it already holds, without triggering a further event, and so endless unnecessary ping-ponging events are eliminated. But originally I also allowed recursive modification of Value during the firing of Changed, in mimicry of Knockout.js. As an experiment, I’m going to ban this outright. Hence the _changing field, which is true while the event is being fired. If Value is genuinely modified again while Changed is firing, that suggests a problem has arisen in the way events have been wired up. So a “Please don’t do that” assertion seems called for. End of update]
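Both rules, firing only on a real change and refusing a genuine change made while the event is firing, can be tried out with a small stand-in. This is a hypothetical JavaScript analogue of the setter logic above, not a port of the class: setable and subscribe are illustrative names, and a plain Error stands in for RecursiveModificationException.

```javascript
function setable(initial) {
    var value = initial;
    var changing = false;
    var handlers = [];
    return {
        get: function() { return value; },
        set: function(newValue) {
            if (newValue === value) return;   // same value: silently ignored
            if (changing) throw new Error('recursive modification');
            value = newValue;
            changing = true;
            try {
                handlers.slice().forEach(function(h) { h(); });
            } finally {
                changing = false;
            }
        },
        subscribe: function(h) { handlers.push(h); }
    };
}

var s = setable(1);
var fired = 0;
s.subscribe(function() { fired++; });
s.set(1);   // no change, so no event
s.set(2);   // fires once

// A handler that sets the value it already holds is tolerated...
s.subscribe(function() { s.set(s.get()); });
s.set(3);   // fires once more, and the nested set is a no-op

// ...but one that genuinely changes the value mid-event is rejected.
var t = setable(0);
t.subscribe(function() { t.set(t.get() + 1); });
var threw = false;
try { t.set(1); } catch (e) { threw = true; }
console.log(fired, threw, t.get());
```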

Everything else in the class is about allowing you to supply an alternative way to compare values for equality. That other interface IEquate is part of that stuff, and is only there to support some generic constraints that aren’t particularly important:

public interface IEquate<T>
{
    Func<T, T, bool> EqualityComparer { get; set; }
}

The only puzzle here is that call to Computed.Listeners.Notify(this) in the Value getter. How can it be that a getter needs to do any notification? To see why that is, we’ll have to start trying to build support for computed observables, in part 2.


How to write a “class” in JavaScript

February 16, 2013 2 comments

Of all the problems with JavaScript, by far the worst is the way people try to [ab]use it based on their training in Java and Object Orientation. They want to write a class, and then they want to write another class that inherits from the first class, because That How You Write A Software’s™.

And this is made worse by a few features in the language that encourage people to make these mistakes: new, prototype, this.

The first hurdle to get over is inheritance. Consult any current text on OO and it will tell you that interface inheritance is fine, but implementation inheritance is a heap of trouble, so try to avoid it. In JavaScript, interface inheritance is completely unnecessary because there is no static typing at all. So that leaves implementation inheritance, which is what people usually mean by “inheritance” when they’ve been to Hollywood Upstairs Java College, and this is the part that even Java experts try to warn you away from.

So to get this absolutely clear: when you try to do inheritance in JavaScript, you’re trying to replicate something from Java that experts tell you not to do in Java. If you already know that, and you’re still insistent on trying, I can’t help you. You’re obviously insane.

For everyone else, this is the simplest way to write something that serves the approximate purpose of a “class” in JavaScript. It’s a factory function: when you call it, it manufactures an object:

var person = function(firstName, lastName) {
    return {
        getFullName: function() {
            return firstName + ' ' + lastName;
        },
        setFirstName: function(newName) {
            firstName = newName;
        },
        setLastName: function(newName) {
            lastName = newName;
        }
    };
};

// Usage:
var henryJones = person('Henry', 'Jones');

console.log(henryJones.getFullName());

henryJones.setFirstName('Indiana');

Note how there was no need to copy the constructor parameters into some separate fields. They already behave like private fields (actually more private than in the JVM or CLR, because in those runtimes you can use reflection to get to private fields).
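A quick check makes the privacy claim visible. The factory is repeated so the snippet runs on its own; publicKeys, leaked and fullName are just labels for this illustration.

```javascript
var person = function(firstName, lastName) {
    return {
        getFullName: function() {
            return firstName + ' ' + lastName;
        },
        setFirstName: function(newName) {
            firstName = newName;
        },
        setLastName: function(newName) {
            lastName = newName;
        }
    };
};

var henryJones = person('Henry', 'Jones');

// The returned object carries only the three functions.
var publicKeys = Object.keys(henryJones);

// The captured parameter is not a property, so there is nothing to read...
var leaked = henryJones.firstName;          // undefined

// ...and assigning one just adds an unrelated property that the closure ignores.
henryJones.firstName = 'Walter';
var fullName = henryJones.getFullName();    // 'Henry Jones'
console.log(publicKeys, leaked, fullName);
```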

There is a difference between the above pattern and using prototype on a constructor function that has to be called with new (I mean aside from the convenience of not having to use new and this everywhere). The difference is that for each instance of a person, we create four objects: the one we return and the three functions stored in it.

If instead we did it the messy way:

function Person(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
}

Person.prototype.getFullName = function() {
    return this.firstName + ' ' + this.lastName;
};

Person.prototype.setFirstName = function(newName) {
    this.firstName = newName;
};

Person.prototype.setLastName = function(newName) {
    this.lastName = newName;
};

// Usage:
var henryJones = new Person('Henry', 'Jones');

Now the three function objects are shared between all instances, so only one object gets created per person instance. And to the typical, irrational programmer who places premature optimisation above all other concerns, this is vastly preferable. Never mind that the fields firstName and lastName are now public! That is a minor concern compared to the terrible thought of creating three extra function objects per instance.
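Both halves of that trade-off can be observed directly. The snippet below trims each version down to a single method so it stays short; the variable names are only for illustration.

```javascript
// Prototype version: one function object shared by every instance.
function Person(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
}
Person.prototype.getFullName = function() {
    return this.firstName + ' ' + this.lastName;
};

// Factory version: a fresh function object per instance.
var person = function(firstName, lastName) {
    return {
        getFullName: function() {
            return firstName + ' ' + lastName;
        }
    };
};

var a = new Person('Henry', 'Jones');
var b = new Person('Indiana', 'Jones');
var sharedOnPrototype = a.getFullName === b.getFullName;   // true

var c = person('Henry', 'Jones');
var d = person('Indiana', 'Jones');
var sharedInFactory = c.getFullName === d.getFullName;     // false

// The price of sharing: the fields are ordinary public properties.
a.firstName = 'Marcus';
var tampered = a.getFullName();                            // 'Marcus Jones'
console.log(sharedOnPrototype, sharedInFactory, tampered);
```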

But hang on one second: have you considered the following factors?

  • How many person instances are going to exist at any time while your code is running? If it’s small, then the cost per instance is irrelevant.
  • What other things are you going to do, per instance? If you’re going to build a load of DOM nodes with animations on them and fire off AJAX calls every second (come on, admit it, that’s exactly what you’re going to do), then the cost of a few function objects per instance is lost in the noise.
  • Have you realised just how amazingly good modern JS runtimes are? They love allocating and throwing away lots of little objects.

If you do ever run into a situation where the difference is significant, you can apply the necessary optimisation in one place: inside your person factory function, without changing any other code:

var person = (function() {
    var vtable = {
        getFullName: function() {
            return this._firstName + ' ' + this._lastName;
        },
        setFirstName: function(newName) {    
            this._firstName = newName;
        },
        setLastName: function(newName) {
            this._lastName = newName;
        }
    };
    return function(firstName, lastName) {
        var p = Object.create(vtable);
        p._firstName = firstName;
        p._lastName = lastName;
        return p;
    };
})();

But in the vast majority of situations, there will never be any need to do that (i.e. when you do it, the difference will be practically unmeasurable), so why bother until you have a realistic model of your application with which to carry out your performance testing? (And you’ve also made sure that you have nothing better to do with your time…)

To find out precisely how much difference this makes, and thus when it might be worth worrying about it, all we have to do is try allocating some objects in the Chrome console. Let’s pick a ridiculously big number: a million.

var start = new Date().getTime();
var p = [];
for (var n = 0; n < 1000000; n++) {
    p.push(person('Henry', 'Jones' + n));
}
console.log(new Date().getTime() - start);

We can then run this test three times each with the plain and optimised implementations and find out what the overhead actually amounts to.
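If you would rather not repeat the runs by hand, the loop can be wrapped in a small helper. This is only a sketch: averageAllocationTime is a hypothetical name, the person factory here is a cut-down stand-in, and running all the repetitions in one page is not the same as the freshly-opened Chrome instance used for the figures below, so expect the numbers to differ.

```javascript
// Stand-in for the plain factory (one method only, to keep this short).
var person = function(firstName, lastName) {
    return {
        getFullName: function() {
            return firstName + ' ' + lastName;
        }
    };
};

// Times `count` calls to `factory`, repeats that `runs` times,
// and returns the mean elapsed time in milliseconds.
function averageAllocationTime(factory, count, runs) {
    var total = 0;
    for (var r = 0; r < runs; r++) {
        var start = new Date().getTime();
        var p = [];   // keep the objects alive for the duration of the run
        for (var n = 0; n < count; n++) {
            p.push(factory('Henry', 'Jones' + n));
        }
        total += new Date().getTime() - start;
    }
    return total / runs;
}

var averageMs = averageAllocationTime(person, 100000, 3);
console.log(averageMs + ' ms on average');
```

Swap in the optimised factory as the first argument to compare the two implementations under the same conditions.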

The first point of interest is that the allocation of a million objects takes about two and a half seconds in a freshly-opened Chrome instance on my run-of-the-mill Dell notebook, with either implementation. Taking the average of three runs with each, the “fast” version took 2390 ms and the “slow” took 2570 ms. That’s 180 ms difference over a million allocations! We’re talking about roughly 180 nanoseconds per instance, or about 60 nanoseconds per extra function object being allocated. Just forget about speed being an issue. It’s not.

What about memory? The relevant Chrome process showed an increase of 81 MB for the “fast” version, versus 204 MB for the “slow”, so a total overhead of 123 MB caused by those extra function objects. Again, that’s shared across a million objects, so the overhead per object is just 130 bytes or so. Is that a big deal?

If you’re writing a game and trying to model specks of dust floating around the screen, it might be realistic to allocate a million of something. But in real business apps, we don’t create a million person objects all at once. We create objects based on downloaded data that we retrieve in small pages from the server (otherwise if a thousand users try to open the app at the same time, the server will have to transmit gigabytes of stuff that none of those users have enough lifetime left to ever read).

So a hundred objects is more realistic. Then the total overhead in this example would be 13 KB – the space consumed by a small image, the kind used in most real web apps without a moment’s thought.
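For anyone who wants to check the arithmetic, here it is spelled out. The one assumption is that the “MB” in the process figures means 2^20 bytes, which is what makes the per-object number come out near 130 rather than 123.

```javascript
var MB = 1024 * 1024;            // assumption: process sizes are in units of 2^20 bytes
var fastMB = 81;                 // "fast" version, from the measurement above
var slowMB = 204;                // "slow" version, from the measurement above
var instances = 1000000;

var overheadBytes = (slowMB - fastMB) * MB;     // 123 MB of extra function objects
var perInstance = overheadBytes / instances;    // about 129 bytes per person
var perHundred = perInstance * 100;             // about 12,900 bytes for a hundred people

console.log(Math.round(perInstance) + ' bytes per instance');
console.log(Math.round(perHundred / 1000) + ' KB per hundred instances');
```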

Another comparison is to look at the overhead of DOM nodes. The objects we create in JavaScript are typically there to create and manage our user interface elements in the browser DOM. How much does a single empty DIV cost?

var p = []; 
for (var n = 0; n < 1000000; n++) {
    p.push(document.createElement('div')); 
}

That added 128 MB to a fresh Chrome process. What a coincidence! 134 bytes per empty DIV, a fraction more overhead than our three function objects.

So the conclusion must be: relax, stop worrying about illusory “performance” concerns, and focus instead on writing code that is readable, maintainable, hard to get wrong, hard to break accidentally when modifying it, and so on.
