
Eventless Programming – Part 3: Binding to a UI

March 2, 2013

Posts in this series: parts 1 and 2 are below. (And if you've been following so far, have another quick read-through – there's been some renaming again…)

Last time we figured out how to create a second kind of observable, one that defines its value using a function. The burning question is: how are we going to tie these things to a UI?

Consider knockout, the inspiration for this whole exercise. There, the UI is defined by HTML. Special data-bind attributes are added to easily link observables up to various properties.

Now, in the C# Windows GUI world there has been some turmoil over the last year or so (to put it mildly). Do we go with WPF, Silverlight or Windows Runtime? Wait a second, we’re forgetting poor old reliable, sleepy, uncontroversial, it-just-works Windows Forms! For fun, let’s start by rewriting the past. How should we have been making Windows apps in 2003?

Aside from anything else, Windows Forms has a very easy-to-use GUI design tool for laying out controls. But there’s no “mark-up” for us to extend with custom attributes. Aw, don’t cry! We’re not going to give up just yet. We can just provide a way to set up bindings from code and it’s all going to be fine.

var newNoteText = Setable.From(string.Empty);

textBoxNewNote.BindText(newNoteText);

// 'Add' button is only enabled if text box is non-empty
buttonAdd.BindEnabled(Computed.From(() => newNoteText.Value.Length != 0));

See how it works? I declare a Setable, and then I have an extension method on TextBox called BindText that I can call to bind the TextBox to the Setable.

Then I have a Button I want to enable only when the TextBox is non-empty. Instead of having to say what steps to carry out and when to carry them out (“evented” imperative programming), I just declare the facts that must be true at all times, under all states the system might get into, also known as the invariants of the system.

So our pattern here is simply to define extension methods on the existing standard Windows Forms controls, with names like BindSomething, where Something is one of those properties that has an associated event SomethingChanged.
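
Here, roughly, is what a two-way BindText could look like – a sketch only (the real version is in the library on GitHub), leaning on the BindChanged helper and BindingLog that we’ll meet in a moment:

public static void BindText(this TextBox textBox, ISetable<string> to)
{
    // Observable --> control. The guard avoids pointlessly resetting
    // the caret when the control already shows the right text.
    BindChanged(to, () =>
    {
        if (textBox.Text != to.Value)
            textBox.Text = to.Value;
    });

    // Control --> observable. (A real version would also log an
    // unbinder for this subscription - see below.)
    textBox.TextChanged += (s, ev) => to.Value = textBox.Text;
}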

The implementation would not be particularly tricky, were it not for one thing: we will need a convenient way to unbind all the event bindings we set up. This is not always necessary, but when you have a very dynamic UI with chunks of content appearing and disappearing, or one chunk being re-used to bind to different objects, it’s essential.

We can use an approach inspired by transactions, where we capture a log of actions that will (when “played back”) unbind everything that has been bound. To see how it works, let’s start with something very simple. Here’s the implementation of a fundamental binding for practically all interesting controls: Enabled.

public static void BindEnabled(this Control control, IGetable<bool> to)
{
    BindChanged(to, () => control.Enabled = to.Value);
}

It’s a one-way binding, obviously, because the user can’t change the “enabledness” of a control. Hence we only need IGetable, and we only need to say what we’re going to do when the observable changes: no need to bind to an event on the control. Here’s BindChanged:

public static void BindChanged<T>(IGetable<T> getable, Action changed)
{
    changed();
    getable.Changed += changed;
    BindingLog.Notify(() => getable.Changed -= changed);
}

The first thing it does is run the changed action we pass in, so that the control will be initialised to match the state of getable. Then it subscribes to the Changed event and finally it “logs” an action that would, if called, unsubscribe from that event. And so how is BindingLog implemented? Why, it’s our old friend from part 2:

private static readonly ListenerStack<Action> BindingLog = new ListenerStack<Action>();

And so elsewhere, whenever we want to do some random binding and capture an action that we execute later to undo that binding, we should carry out the binding activity inside a context:

public static Action CaptureUnbind(Action bindingActivity)
{
    try
    {
        Action multicastUnbinders = null;
        BindingLog.Push(nestedUnbinder => multicastUnbinders += nestedUnbinder);
        bindingActivity();
        return multicastUnbinders ?? EmptyAction;
    }
    finally
    {
        BindingLog.Pop();
    }
}

[Trivia: Whenever you are building up a list of actions so that later you can loop through them all and execute them, remember that C# actually has language support for that pattern: the multicast delegate. You can say action += anotherAction and now action has been replaced by a new action that calls the original two actions.]
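
For instance:

Action action = () => Console.Write("a");
action += () => Console.Write("b");
action();   // prints "ab" - both targets run, in the order they were added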

Usage of CaptureUnbind:

var unbind = Binding.CaptureUnbind(() =>
{
    textBoxNewNote.BindText(newNoteText);

    buttonAdd.BindEnabled(Computed.From(
          () => newNoteText.Value.Length != 0));

    // Any other bindings in this batch...
});

And then to unbind:

unbind();

So, to business. Here’s version one of my simple demo UI.

[Screenshot: the NoteTaker demo UI]

You type a new “note” into the top text box and click Add. We display the current list of notes in a checked list box. You can check items and then click Delete to delete all checked items. And you toggle selection of all/none by clicking the Select all check box. It’s a nice example because the things on the screen are related in ways that aren’t as simple as they first appear.

In persistent terms, a note is just a string. But in our UI, a note is also either selected or not. So we really should model a note like this:

public class Note
{
    public readonly ISetable<string> Text = new Setable<string>();
    public readonly ISetable<bool> IsSelected = new Setable<bool>();
}

Now, you might be tempted to “correct” this as follows, in order to do things by-the-book (what are you, the encapsulation police?):

public class Note
{
    private readonly ISetable<string> _text = new Setable<string>();
    private readonly ISetable<bool> _isSelected = new Setable<bool>();

    public ISetable<string> Text { get { return _text; } }
    public ISetable<bool> IsSelected { get { return _isSelected; } }
}

But in this simple example (and many real-world cases), it would add nothing. Each of our properties is already encapsulated behind ISetable, which gives us total flexibility in the future to change how they are implemented internally without affecting our clients. ISetable is just another way of encapsulating a mutable property, with the added bonuses that (a) we can pass around a reference to it where necessary (rather like a “property delegate”, something absent from C# itself), and most important of all (b) it has a built-in Changed event.
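
To illustrate point (a): because each property is a first-class object, we can write helper methods that operate on any property of the right type. A hypothetical example (note1 and note2 being any two Note instances):

static void Swap(ISetable<string> a, ISetable<string> b)
{
    var temp = a.Value;
    a.Value = b.Value;
    b.Value = temp;
}

// Works on any two string "properties", whichever objects own them:
Swap(note1.Text, note2.Text);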

Now we need a little nugget of UI to represent a note in the list. I’ll do something a bit weirder than a mere CheckBox, just to demonstrate a more powerful capability. We’ll define a new Form-derived class, NoteListItemForm, which is laid out very compactly:

[Screenshot: the NoteListItemForm layout – an unlabelled CheckBox next to a TextBox]

There’s a CheckBox without a label, so next to it I can put a TextBox. This will mean that the user can edit the note text directly in the list (though it will make selection a bit less intuitive – however, we’ll come back to that).

The code-behind for this mini-form is very simple:

public partial class NoteListItemForm : Form, IBindsTo<Note>
{
    public NoteListItemForm()
    {
        InitializeComponent();
        TopLevel = false;
        Visible = true;
    }

    public void Bind(Note note)
    {
        textBoxContent.BindText(note.Text);
        checkBoxIsSelected.BindChecked(note.IsSelected);
    }
}

In the constructor we make the standard changes necessary to make it suitable for adding as a child of a Panel. Then there’s just a Bind method that wires it up to a Note (thanks to the magic unbinding context we built above, there’s never a need to write a corresponding Unbind method). By the way, that Bind method implements the IBindsTo interface, which will turn out to be important in a moment.

Strange as it may seem, the rest of our code can be put very neatly in the constructor of our main Form-derived class. First we declare the list of notes:

var notes = new SetableList<Note>();

Then we have a large Panel on our main Form where we’d like the list to appear. There’s a rather nifty extension method on Panel that we can use like this:

panelSelectionList.BindForEach(notes).As<NoteListItemForm>();

It has a slightly peculiar chained declaration, just to get the most out of type inference. BindForEach only needs one direct parameter, an ISetableList<TItem>, which is the source of data. Then that returns something on which you have to call As, which only needs a type parameter. Here that is our mini-Form, but it could be any class derived from the Windows Forms Control class which implements the IBindsTo interface and has a parameter-less constructor – or to put it another way:

where TControl : Control, IBindsTo<TItem>, new()

Every time the contents of the list change, the binding automatically creates (or destroys) child controls in the panel, laying them out vertically, and calling each child control’s Bind method to set it up. (Important: this only happens when we put different Note objects in the list. Nothing has to happen here when one of the internal properties of a Note changes, because those have their own bindings.)
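
The real implementation is on GitHub, but to give the general shape, a sketch might look like this (ForEachBinding and Relayout are names I’m inventing for illustration; Updated and Cleared would be handled along the same lines as Added and Removed):

public static class ForEachBindings
{
    public static ForEachBinding<TItem> BindForEach<TItem>(
        this Panel panel, ISetableList<TItem> items)
    {
        return new ForEachBinding<TItem>(panel, items);
    }
}

public class ForEachBinding<TItem>
{
    private readonly Panel _panel;
    private readonly ISetableList<TItem> _items;
    private readonly List<Action> _unbinders = new List<Action>();

    public ForEachBinding(Panel panel, ISetableList<TItem> items)
    {
        _panel = panel;
        _items = items;
    }

    public void As<TControl>() where TControl : Control, IBindsTo<TItem>, new()
    {
        _items.Added += index =>
        {
            var control = new TControl();

            // Bind inside a capture context, so we can undo it all later
            _unbinders.Insert(index, Binding.CaptureUnbind(
                () => control.Bind(_items[index])));

            _panel.Controls.Add(control);
            _panel.Controls.SetChildIndex(control, index);
            Relayout();
        };

        _items.Removed += index =>
        {
            _unbinders[index]();        // undo that child's bindings
            _unbinders.RemoveAt(index);

            var control = _panel.Controls[index];
            _panel.Controls.RemoveAt(index);
            control.Dispose();
            Relayout();
        };

        // (Updated and Cleared are handled along the same lines, and any
        // items already in the list get controls up front the same way.)
    }

    private void Relayout()
    {
        // Stack the children vertically, full width
        var y = 0;
        foreach (Control child in _panel.Controls)
        {
            child.SetBounds(0, y, _panel.ClientSize.Width, child.Height);
            y += child.Height;
        }
    }
}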

Next we arrange for new notes to be added by the user (which is actually the same code I used as an example above):

var newNoteText = Setable.From(string.Empty);
textBoxNewNote.BindText(newNoteText);

// 'Add' button only enabled if text box is non-empty
buttonAdd.BindEnabled(Computed.From(() => newNoteText.Value.Length != 0));

buttonAdd.Click += (s, ev) =>
{
    notes.Add(new Note { Text = { Value = newNoteText.Value } });
    textBoxNewNote.Text = string.Empty;
    textBoxNewNote.Focus();
};

Oh look! An actual explicitly wired-up event handler. Button.Click is the only kind of event we should ever have to deal with like this: it’s intrinsically imperative in nature. Otherwise, it’s declarative everywhere, including our first Computed.

A more thorny problem is that of the Select all CheckBox, which has three states. If we only select some of the notes, the CheckBox should be “indeterminate”. But with a couple more Computeds we can declare what we want it to do, and it’s really very easy, even though it would be surprisingly gnarly in traditional evented-imperative code.

Firstly, let’s compute ourselves a list of just those notes that are currently selected (which is going to be useful for the Delete button also):

var selectedNotes = Computed.From(() => notes.Where(n => n.IsSelected.Value).ToList());

Two things to note:

  1. Just because it’s a list, it doesn’t need to be an ISetableList. It’s just an ordinary IGetable that happens to contain a computed list as its value.
  2. It’s really important that we say ToList(), because we have to access the IsSelected property directly during the execution of the lambda we pass to Computed.From, otherwise the library can’t detect our dependency on that property. If we leave it as a lazily evaluated list, we apparently won’t be dependent on anything – the sketch below makes this concrete.
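
In other words:

// Wrong: Where() just builds a lazy query - no IsSelected.Value is read
// while Computed.From is listening, so no dependencies get recorded.
var selectedWrong = Computed.From(() => notes.Where(n => n.IsSelected.Value));

// Right: ToList() forces every IsSelected.Value to be read right now.
var selectedRight = Computed.From(() => notes.Where(n => n.IsSelected.Value).ToList());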

But how do we come up with something we can bind to our three-state checkBoxAllNotes? There is a suitable binding extension method, BindCheckState, that supports all three states using the standard enum CheckState as the value. But what about the value we’ll bind to?

Happily there’s another trick up our sleeves: the setable computed. This essentially lets you define on-the-fly a property with custom getter and setter methods.

checkBoxAllNotes.BindCheckState(Computed.From(

    get: () => selectedNotes.Value.Count == 0
                    ? CheckState.Unchecked
                    : selectedNotes.Value.Count == notes.Count
                            ? CheckState.Checked
                            : CheckState.Indeterminate,

    set: state =>
        {
            // only pay attention when setting to a definite state
            if (state == CheckState.Indeterminate)
                return;

            foreach (var note in notes)
                note.IsSelected.Value = state == CheckState.Checked;
        }));

We’re using an overload of Computed.From that takes two parameters, get and set. I’m using named parameter syntax even though it’s totally redundant here, just for clarity (and because it resembles the declaration of a property).

The getter lambda literally says: “if none are selected I’m unchecked; if all are selected I’m checked; otherwise I’m indeterminate.” The setter is even simpler: it sets all the notes to be either selected or not selected, depending on whether the new state is Checked or Unchecked. If the new state happens to be Indeterminate, it does nothing.

And we don’t really need this dang CheckBox to work at all if there aren’t currently notes to select:

checkBoxAllNotes.BindEnabled(Computed.From(() => notes.Count != 0));

Now the Delete button is child’s play:

buttonDelete.BindEnabled(Computed.From(() => selectedNotes.Value.Count != 0));
buttonDelete.Click += (s, ev) => notes.RemoveAll(selectedNotes.Value);

Recall that all this is in the constructor of our main Form. To recap:

var notes = new SetableList<Note>();
            
// For each note, create a NoteListItemForm
panelSelectionList.BindForEach(notes).As<NoteListItemForm>();

var newNoteText = Setable.From(string.Empty);
textBoxNewNote.BindText(newNoteText);

// Add button only enabled if text box is non-empty
buttonAdd.BindEnabled(Computed.From(() => newNoteText.Value.Length != 0));
buttonAdd.Click += (s, ev) =>
{
    notes.Add(new Note { Text = { Value = newNoteText.Value } });
    textBoxNewNote.Text = string.Empty;
    textBoxNewNote.Focus();
};

// list of currently selected notes (important: ToList to avoid lazy evaluation)
var selectedNotes = Computed.From(() => notes.Where(n => n.IsSelected.Value).ToList());

// Two-way binding
checkBoxAllNotes.BindCheckState(Computed.From(

    get: () => selectedNotes.Value.Count == 0
                    ? CheckState.Unchecked
                    : selectedNotes.Value.Count == notes.Count
                            ? CheckState.Checked
                            : CheckState.Indeterminate,

    set: state =>
        {
            // only pay attention when setting to a definite state
            if (state == CheckState.Indeterminate)
                return;

            foreach (var note in notes)
                note.IsSelected.Value = state == CheckState.Checked;
        }));

checkBoxAllNotes.BindEnabled(Computed.From(() => notes.Count != 0));

// Delete button only enabled if selection is non-empty
buttonDelete.BindEnabled(Computed.From(() => selectedNotes.Value.Count != 0));
buttonDelete.Click += (s, ev) => notes.RemoveAll(selectedNotes.Value);

You can get the code from GitHub to see the NoteTaker application in action. Next time on this channel: let’s make the UI unnecessarily complicated and powerful, so it has multiple simultaneously updating views on the same information, all of which stay mutually consistent, yet we barely have to write any more code.


Eventless Programming – Part 2: Computed observables

February 28, 2013


Welcome back for more dry philosophical musings (I promise there will be an actual working GUI with checkboxes and stuff in the next part). By the way, the code is here: github.com/danielearwicker/eventless/. Also note that I just realised that Getable and Setable would make more sense in a C# context than Readable and Writeable, so I’ve done some renaming.

Last time we got as far as defining a class that could hold a single value in a property called Value, and would fire an event called Changed whenever the value was modified.

Minor detour: for ultimate convenience let’s make a static helper method so we can take full advantage of the type inference that kicks in for generic methods but is sadly absent for constructors of generic classes:

public static class Setable
{
    public static Setable<T> From<T>(T initVal)
    {
        return new Setable<T>(initVal);
    }
}

Now we can really ramp up the hysteria as we add two numbers together:

var i = Setable.From(3);
var j = Setable.From(4);
var sum = Computed.From(() => i + j);

Now, I know it’s exciting but try to contain yourself. The point is, if (and only if) we modify i or j, we need sum to be recomputed (and thus in turn trigger recomputation of anything that consumes sum), just like it would be if these were cells in Excel. This will give us simplicity combined with optimal (that is, minimal) computation. When these are no longer integers but instead complicated data structures or large pieces of UI, it should seem a lot more useful.

So there has to be some magic plumbing involved that notices that we refer to i and j in our definition of sum.

First, Computed.From is just another handy static helper method:

public static class Computed
{
    // Excerpt...
    public static Computed<T> From<T>(Func<T> compute)
    {
        return new Computed<T>(compute);
    }
}

No surprises there, so let’s take a look at the more substantial generic Computed<T> class:

public sealed class Computed<T> : Forwarder<Setable<T>, T>, IGetable<T>, IDisposable
{
    private readonly Func<T> _compute;

    private ISet<IGetable> _subscriptions = Computed.EmptySubscriptions;

    public Computed(Func<T> compute)
        : base(new Setable<T>())
    {
        _compute = compute;
        Recompute();
    }

    private void Recompute()
    {
        var newSubscriptions = new HashSet<IGetable>();

        Computed.Listeners.Push(o => newSubscriptions.Add(o));
        var newVal = _compute();
        Computed.Listeners.Pop();
        Impl.Value = newVal;
        newSubscriptions.Remove(this);
        newSubscriptions.Remove(Impl);

        foreach (var sub in _subscriptions.Where(s => !newSubscriptions.Contains(s)))
            sub.Changed -= Recompute;
        
        foreach (var sub in newSubscriptions.Where(s => !_subscriptions.Contains(s)))
            sub.Changed += Recompute;
        
        _subscriptions = newSubscriptions;
    }

    public T Value
    {
        get { return Impl.Value; }
    }

    public void Dispose()
    {
        foreach (var sub in _subscriptions)
            sub.Changed -= Recompute;
    }
}

First key point: it only implements IGetable, not ISetable, so it has no setter for the value. This makes sense because it holds its own definition of what the value should be at any given moment.

Second (and this is really another detour), it kinda-sorta inherits from Setable, because it needs exactly the same kind of logic for triggering the Changed event only if the new value really is different. But I didn’t want to use full implementation inheritance, because then the features of Setable (such as the Value setter) would all be public, which we’ve already established doesn’t make sense. So instead I inherit a class called Forwarder, specifically designed to encapsulate an implementation of IGetable for the purposes of reuse. We’ll skip over the details of that, but basically if you’re familiar with private inheritance in C++, I’ve roughly simulated that. The end result is that we have a property Impl that holds the Setable we created in the base constructor call, and its Changed event is exposed for us.
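
Purely to give the flavour, Forwarder could be as small as this (a guess at its shape from the description above – the real source is on GitHub):

public abstract class Forwarder<TImpl, T> : IGetable
    where TImpl : IGetable<T>
{
    protected TImpl Impl { get; private set; }

    protected Forwarder(TImpl impl)
    {
        Impl = impl;
    }

    // Re-expose the wrapped observable's Changed event as our own
    public event Action Changed
    {
        add { Impl.Changed += value; }
        remove { Impl.Changed -= value; }
    }
}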

Third: the meat of the class lies in its management of a set of other observables to which it subscribes for the Changed event. The currently subscribed observables are held in the _subscriptions set. The constructor calls Recompute the first time, and then Recompute has to make subscriptions so it will be called in the future, like this:

sub.Changed += Recompute;

The cunning part is this:

var newSubscriptions = new HashSet<IGetable>();

Computed.Listeners.Push(o => newSubscriptions.Add(o));
var newVal = _compute();
Computed.Listeners.Pop();
Impl.Value = newVal;

Remember that mystery call to Computed.Listeners.Notify(this) we encountered in part 1? Of course you do – you’ve been barely able to sleep for thinking about it constantly. Really, you look terrible. Just read the rest of this part and then try to get some rest.

Computed.Listeners is just defined as:

internal static readonly ListenerStack<IGetable> Listeners = new ListenerStack<IGetable>();

And ListenerStack is quite a handsome young utility class:

public class ListenerStack<T>
{
    private readonly ThreadLocal<Stack<Action<T>>> Stack =
                    new ThreadLocal<Stack<Action<T>>>(() => new Stack<Action<T>>());

    public void Push(Action<T> listener)
    {
        Stack.Value.Push(listener);
    }

    public void Pop()
    {
        Stack.Value.Pop();
    }

    public void Notify(T obs)
    {
        if (Stack.Value.Count != 0)
            Stack.Value.Peek()(obs);
    }
}

It allows us to notify (pass some value of some type or other) to the top-most listener in a stack of listeners on the current thread. The thread detail is something that knockout.js doesn’t have to worry about, but in the CLR it’s a must. If your UI thread is using Eventless and meanwhile some background worker has also found a use for it, they better not get their wires crossed. It would be a fine mess if the user of a library is careful not to share non-thread-safe objects (such as Setable) between threads, and then the library gets them tangled up anyway.

Computed registers its own listener at the top of the stack of listeners, which is a delegate that just records all the Getables it is notified about. And (as we saw in part one), Getables notify about themselves whenever their values are accessed.

So this means that Computed can (and does) easily find out exactly what Getables are read during the execution of its definition. It can then subscribe to them. In fact by using a HashSet it can pretty rapidly track any changes in its immediate dependencies. And they can be read anywhere while the definition is executing – way down the stack in some other method, without any need to explicitly pass any extra parameters around. The implications of that for constructing software are, I think, rather mind-blowing, and if your mind isn’t blown to smithereens by it, then either (a) you don’t fully get the implications yet, or (b) your mind is already in smithereens from previously discovering this.

One little bit of extra housekeeping:

newSubscriptions.Remove(Impl);

Rationale: it can be handy for a computed to read its old value, and while the definition delegate is executing that old value is there to be read, so we may as well allow it. But it doesn’t make sense for it to consequently be dependent on itself.
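
For example, this (contrived) computed reads its own previous value to build up a running log, and thanks to that bit of housekeeping it doesn’t end up subscribed to itself:

var price = Setable.From(100);
Computed<string> history = null;
history = Computed.From(() =>
{
    // On re-runs, Impl still holds the old value at this point
    var soFar = history == null ? "" : history.Value + " ";
    return soFar + price.Value;
});

price.Value = 120;   // history.Value is now "100 120"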

Bonus Content

The other thing that will be very handy is a collection class. The idea is that (in part 3) we’re going to be displaying the contents of the collection in the UI, so it should be an ordered collection. So, not an ISet or an IDictionary. It must be possible to modify it through random access. In other words, it’s an IList. .NET has an ObservableCollection class, but I’d like to start with a plain interface and build up from there (that way, it will be easy to bridge to any other frameworks):

public interface ISetableList<T> : IGetable<IList<T>>, IList<T>
{
    event Action<int> Added;
    event Action<int> Updated;
    event Action<int> Removed;
    event Action Cleared;
}

The events that take an int refer to the index within the collection. Note that the collection itself is an IGetable, not an ISetable – you can’t substitute in a new list. You have to clear it down and add new elements. This is no great hardship and simplifies a few things. But it is (effectively) a list of Setables, because you can update individual elements, hence the name ISetableList.

It is also an IList of the given element type. So in our implementation, SetableList, we have to implement all the methods of IList and make them trigger the appropriate events. And of course, whenever the value of an element is read, we better notify the Computed on the top of the stack (if any). Look at the code on GitHub for the details.
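
To give the flavour, here’s a sketch of how the indexer and Add might look (an excerpt only, with the interface plumbing and remaining members omitted, and details guessed rather than taken from the real source):

public class SetableList<T>   // excerpt: interface plumbing omitted
{
    private readonly List<T> _list = new List<T>();

    public event Action Changed;
    public event Action<int> Added;
    public event Action<int> Updated;

    public T this[int index]
    {
        get
        {
            Computed.Listeners.Notify(this);    // record the dependency
            return _list[index];
        }
        set
        {
            // (a real version would skip firing if the value is
            // unchanged, just as Setable does)
            _list[index] = value;
            if (Updated != null) Updated(index);
            if (Changed != null) Changed();
        }
    }

    public void Add(T item)
    {
        _list.Add(item);
        if (Added != null) Added(_list.Count - 1);
        if (Changed != null) Changed();
    }
}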

Okay, so we have most of the building blocks in place. Thank you for your patience. Next stop, part 3 and a working UI.


Eventless Programming – Part 1: Why the heck?

February 25, 2013


An “event” is a list of routines that all get executed (in unpredictable order) whenever something occurs. It is the low-level building block of complex interactive software.

But as an abstraction, it fits poorly with our needs. Rather than the abstract idea of an event occurring, it is far more useful to think more specifically of a value changing. That is, we should always pair together two things: storage for a unit of information, and the event that the information has changed. This pairing is an observable.

In Windows Forms this pattern occurs, but only informally. For example, controls have a property called Enabled, and a corresponding event called EnabledChanged. Many controls have a property called Text, and an event called TextChanged. And so on. Unfortunately nothing links these together apart from the naming pattern.

WPF/Silverlight tried to fix this through Dependency Properties. A dependency property has a value, and there is a standard mechanism for notifying of a change in that value. But the results aren’t pretty: essentially it is just more manual event firing and handling, leading people to cook up various complex techniques to avoid writing so much boilerplate code.

There is a basic difficulty that occurs with stateful software. In mathematics once a value is defined and given a name, we don’t change it some time later on in the proof. In most programming languages that rule is tossed out carelessly. If x is said to be 5 and y is said to be x + 5, then y is surely 10… until x changes. Of course we could define y as a function of x, and recompute it every time we need to know the value, but in real programs we soon find that we’re repeating a lot of identical calculations, so instead we cache our values in variables, and then try to figure out manually when it would be an appropriate time to recompute them.
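
In the framework we’re about to build, that x and y example will end up looking like this (Setable.From and Computed.From are helpers we’ll define in part 2):

var x = Setable.From(5);
var y = Computed.From(() => x.Value + 5);

// y.Value is 10 now; after the next line it is automatically 12
x.Value = 7;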

Events are the way we do this, but it’s crazy because we are repeatedly building, by hand, something that should be happening automatically, as simply as it does in Excel. If you put a 5 in cell B3 and then in another cell you put =B3+5, the spreadsheet is smart enough to know that it will need to recompute the second cell whenever the value in B3 changes. Excel gives us a model for declaring what our variables should contain, expressed in terms of other variables, and it harnesses the power of events to keep everything consistent, but we as users never have to manually hook up a single event handler, and nor should we.

Brothers and sisters, let us, here and now, build a C# framework for providing us with the convenience and minimal recomputation of Excel, but with the considerable advantage that we get to define the UI ourselves (and specify our own meaningful variable names instead of unhelpful cell references…)

(By the way, if you’re familiar with knockout.js, this is all going to seem eerily familiar, but a lot more statically typed.)

Now, buried beneath whatever we cook up, we will need an event. Let’s see how far we can get with a single solitary event doing all the work.

public interface IGetable
{
    event Action Changed;
}

[Sidenote: I was going to call it IObservable but as Brandon Wallace pointed out in the comments, there’s already System.IObservable. Then I called it IReadable, but then I realised properties have getters and setters, and as you’ll see, we’re really just building an abstraction over a property with a Changed event.]

I’m breaking the rules of the event pattern by using the super-bland Action delegate, but I don’t care. A basic principle here is that Changed will never need to provide any further information about the event, other than the fact that it has occurred. By the way, that interface is already finished! I’m not going to add anything more to it, ever.

Well, kind of. It is just a base for us to extend:

public interface IGetable<out T> : IGetable
{
    T Value { get; }
}

This turns out to be very useful because it means from an event-handling perspective we can treat all observables alike, regardless of what type of value they hold. And again, we’re entirely done with that interface. It’s already being all it will ever need to be. Note that we only have a getter for the value, however. Which leads us to:

public interface ISetable<T> : IGetable<T>
{
    new T Value { get; set; }
}

All very pretty and abstract, but still no code that does anything. Patience, I’m building to it. Good grief, this is only part 1! Seriously, don’t you ever shut up? Alright, alright, here’s our first class, the actual implementation of an observable:

public sealed class Setable<T> : ISetable<T>, IEquate<T>
{
    private T _value;
    private bool _changing;

    public Setable(T value = default(T))
    {
        _value = value;
        EqualityComparer = DefaultEqualityComparer;
    }

    public T Value
    {
        get
        {
            Computed.Listeners.Notify(this);
            return _value;
        }

        set
        {
            if (EqualityComparer(_value, value))
                return;

            if (_changing)
                throw new RecursiveModificationException();

            _value = value;

            var evt = Changed;
            if (evt == null) 
                return;

            _changing = true;
            try { evt(); } 
            finally { _changing = false; }
        }
    }

    public event Action Changed;

    public Func<T, T, bool> EqualityComparer { get; set; }

    public static bool DefaultEqualityComparer(T a, T b)
    {
        return ReferenceEquals(a, b) || (!ReferenceEquals(a, null) && a.Equals(b));
    }

    public static implicit operator T(Setable<T> from)
    {
        return from.Value;
    }
}

It fully implements ISetable – not especially hard to do. The critically important part is that the Value property setter doesn’t fire the event unless the value has truly changed.
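
So, for example:

var count = Setable.From(42);
count.Changed += () => Console.WriteLine("changed!");

count.Value = 42;   // no output - the value is equal, so no event
count.Value = 43;   // prints "changed!"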

[Update: In the comments Mark Knell mentioned the problem of re-entrance. The check for equality before firing the event is a partial solution to that; it silently allows recursive attempts to set Value to the same value it already holds, without triggering a further event, and so endless unnecessary ping-ponging events are eliminated. But originally I also allowed recursive modification of Value during the firing of Changed, in mimicry of Knockout.js. As an experiment, I’m going to ban this outright. Hence the _changing field, which is true while the Changed event is being fired. If Value is genuinely modified again while Changed is firing, that suggests a problem has arisen in the way events have been wired up. So a “Please don’t do that” assertion seems called for. End of update]

Everything else in the class is about allowing you to supply an alternative way to compare values for equality. That other interface IEquate is part of that stuff, and is only there to support some generic constraints that aren’t particularly important:

public interface IEquate<T>
{
    Func<T, T, bool> EqualityComparer { get; set; }
}
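
For example, a string observable can be made case-insensitive, so that overwriting a value with a differently-cased copy doesn’t count as a change:

var name = Setable.From("hello");
name.EqualityComparer = (a, b) =>
    string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

name.Changed += () => Console.WriteLine("changed!");
name.Value = "HELLO";   // considered equal - no event
name.Value = "world";   // fires Changed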

The only puzzle here is that call to Computed.Listeners.Notify(this) in the Value getter. How can it be that a getter needs to do any notification? To see why that is, we’ll have to start trying to build support for computed observables, in part 2.


How to write a “class” in JavaScript

February 16, 2013

Of all the problems with JavaScript, by far the worst is the way people try to [ab]use it based on their training in Java and Object Orientation. They want to write a class, and then they want to write another class that inherits from the first class, because That How You Write A Software’s™.

And this is made worse by a few features in the language that encourage people to make these mistakes: new, prototype, this.

The first hurdle to get over is inheritance. Consult any current text on OO and it will tell you that interface inheritance is fine, but implementation inheritance is a heap of trouble, so try to avoid it. In JavaScript, interface inheritance is completely unnecessary because there is no static typing at all. So that leaves implementation inheritance, which is what people usually mean by “inheritance” when they’ve been to Hollywood Upstairs Java College, and this is the part that even Java experts try to warn you away from.

So to get this absolutely clear: when you try to do inheritance in JavaScript, you’re trying to replicate something from Java that experts tell you not to do in Java. If you already know that, and you’re still insistent on trying, I can’t help you. You’re obviously insane.

For everyone else, this is the simplest way to write something that serves the approximate purpose of a “class” in JavaScript. It’s a factory function: when you call it, it manufactures an object:

var person = function(firstName, lastName) {
    return {
        getFullName: function() {
            return firstName + ' ' + lastName;
        },
        setFirstName: function(newName) {
            firstName = newName;
        },
        setLastName: function(newName) {
            lastName = newName;
        }
    };
};

// Usage:
var henryJones = person('Henry', 'Jones');

console.log(henryJones.getFullName());

henryJones.setFirstName('Indiana');

Note how there was no need to copy the constructor parameters into some separate fields. They already behave like private fields (actually more private than in the JVM or CLR, because in those runtimes you can use reflection to get to private fields).

There is a difference between the above pattern and using prototype on a constructor function that has to be called with new (aside from the convenience of not having to use new and this everywhere). The difference is that for each instance of a person, we create four objects: the one we return and the three functions stored in it.

If instead we did it the messy way:

function Person(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
}

Person.prototype.getFullName = function() {
    return this.firstName + ' ' + this.lastName;
};

Person.prototype.setFirstName = function(newName) {
    return this.firstName = newName;
};

Person.prototype.setLastName = function(newName) {
    return this.lastName = newName;
};

// Usage:
var henryJones = new Person('Henry', 'Jones');

Now the three function objects are shared between all instances, so only one object gets created per person instance. And to the typical, irrational programmer who places premature optimisation above all other concerns, this is vastly preferable. Never mind that the fields firstName and lastName are now public! That is a minor concern compared to the terrible thought of creating three extra function objects per instance.

But hang on one second: have you considered the following factors?

  • How many person instances are going to exist at any time while your code is running? If it’s small, then the cost per instance is irrelevant.
  • What other things are you going to do, per instance? If you’re going to build a load of DOM nodes with animations on them and fire off AJAX calls every second (come on, admit it, that’s exactly what you’re going to do), then the cost of a few function objects per instance is lost in the noise.
  • Have you realised just how amazingly good modern JS runtimes are? They love allocating and throwing away lots of little objects.

If you do ever run into a situation where the difference is significant, you can apply the necessary optimisation in one place: inside your person factory function, without changing any other code:

var person = (function() {
    var vtable = {
        getFullName: function() {
            return this._firstName + ' ' + this._lastName;
        },
        setFirstName: function(newName) {    
            this._firstName = newName;
        },
        setLastName: function(newName) {
            this._lastName = newName;
        }
    };
    return function(firstName, lastName) {
        var p = Object.create(vtable);
        p._firstName = firstName;
        p._lastName = lastName;
        return p;
    };
})();

But in the vast majority of situations, there will never be any need to do that (i.e. when you do it, the difference will be practically unmeasurable), so why bother until you have a realistic model of your application with which to carry out your performance testing? (And you’ve also made sure that you have nothing better to do with your time…)

To find out precisely how much difference this makes, and thus when it might be worth worrying about it, all we have to do is try allocating some objects in the Chrome console. Let’s pick a ridiculously big number: a million.

var start = new Date().getTime();
var p = [];
for (var n = 0; n < 1000000; n++) {
    p.push(person('Henry', 'Jones' + n));
}
console.log(new Date().getTime() - start);

We can then run this test three times each with the plain and optimised implementations and find out what the overhead actually amounts to.

The first point of interest is that the allocation of a million objects takes about two and a half seconds in a freshly-opened Chrome instance on my run-of-the-mill Dell notebook, with either implementation. Taking the average of three runs with each, the “fast” version took 2390 ms and the “slow” took 2570 ms. That’s 180 ms difference over a million allocations! We’re talking about mere nanoseconds per extra function object being allocated. Just forget about speed being an issue. It’s not.

What about memory? The relevant Chrome process showed an increase of 81 MB for the “fast” version, versus 204 MB for the “slow”, so a total overhead of 123 MB caused by those extra function objects. Again, that’s shared across a million objects, so the overhead per object is just 130 bytes or so. Is that a big deal?

If you’re writing a game and trying to model specks of dust floating around the screen, it might be realistic to allocate a million of something. But in real business apps, we don’t create a million person objects all at once. We create objects based on downloaded data that we retrieve in small pages from the server (otherwise if a thousand users try to open the app at the same time, the server will have to transmit gigabytes of stuff that none of those users have enough lifetime left to ever read).

So a hundred objects is more realistic. Then the total overhead in this example would be 13 KB – the space consumed by a small image, the kind used in most real web apps without a moment’s thought.

Another comparison is to look at the overhead of DOM nodes. The objects we create in JavaScript are typically there to create and manage our user interface elements in the browser DOM. How much does a single empty DIV cost?

var p = []; 
for (var n = 0; n < 1000000; n++) {
    p.push(document.createElement('div')); 
}

That added 128 MB to a fresh Chrome process. What a coincidence! 134 bytes per empty DIV, a fraction more overhead than our three function objects.

So the conclusion must be: relax, stop worrying about illusory “performance” concerns, and focus instead on writing code that is readable, maintainable, hard to get wrong, hard to break accidentally when modifying it, and so on.


Arguing with a Decade-Old Eric Lippert Post

January 7, 2013

But it’s on the first page of Google results for c++ smart pointer addref release so I have to do this, dammit!

The post in question is: Smart Pointers are Too Smart.

Eric asked:

Does this code look correct?

map[srpFoo.Disown()] = srpBar.Disown();

It sure looks correct, doesn’t it?

Nope, it looks like a disaster waiting to happen. The map ought to have keys and values that are smart pointers, so the correct code would be:

map[srpFoo] = srpBar;

If the map holds raw pointers, that in itself is the error, because it will necessitate some complicated dance to transfer ownership in and out, which means bandying around raw pointers casually, and in C++ that leads to disaster, because it can only be done safely in code paths that can be proven to not throw exceptions.

In fact it’s even more general and simple than that. If you write some code in your C++ program that mutates any state at all, and the previous state will not be restored automatically during stack unwinding, and you haven’t taken the time to prove that no exceptions will be thrown while in this state, you’re doing it wrong.

Far from showing what goes wrong when you use smart pointers, Eric showed what goes wrong when you stop using them, even for a nanosecond. If you never use them, the wrongitude will be uninterrupted.

It’s quite possible that forcing yourself to put all the AddRef and Release calls in the right places by hand will have a sort of positive side-effect: you will be so distracted by this effort that you will not have sufficient time or patience to be productive, and so this will force you to write only very simple programs, which is definitely a good thing, but perhaps arrived at for the wrong reasons.

And conversely, when a problem becomes rarer, it seems more egregious. “I remember the good old days, in the Blitz, when we were being bombed every day, but at least we were used to it!”

The real problem underlying all of COM is probably that IUnknown doesn’t provide a way to attach a listener for when the object is destroyed. If it had this, we could create a safe “weak” pointer, one that doesn’t keep the object in existence but is automatically set to nullptr when the object expires. This is addressed in the more recent WinRT variety of COM, where all sensible objects should implement IWeakReferenceSource, and in fact classes created with C++/CX do this automatically.

And ultimately this is perhaps the simplest response to Eric’s blog post: with C++/CX, smart pointers are baked into the language. Reference count cycles will still occur, no question, which is why the whole exercise is a silly step backwards, but only relative to something much better: pick just about any managed GC-enabled runtime.


A pure library approach to async/await in standard JavaScript

December 28, 2012

I’m very keen on JavaScript gaining actual language support for the one thing it is mostly used for: asynchronous programming, so I never shut up about it.

The C# approach was trailed very nicely here back in Oct 2010. Over the years I’ve occasionally searched for usable projects that might implement this kind of thing in JavaScript, but they all seem to be abandoned, so I stopped looking.

And then I had an idea for how to do it purely with an ordinary JS library – a crazy idea, to be sure, but an idea nonetheless. So I searched to see if anyone else had come up with it (let me rephrase that: like all simple ideas in software, it’s a certainty that other people came up with it decades ago, probably in LISP, but I searched anyway).

I haven’t found anything yet, but I did find Bruno Jouhier’s streamline.js, which looks like a very nice (and non-abandoned!) implementation of the precompiler approach.

So what was my crazy idea? Well, as Streamline’s creator said in a reply to a comment on his blog:

But, no matter how clever it is, a pure JS library will never be able to solve the “topological” issue that I tried to describe in my last post… You need extra power to solve this problem: either a fiber or coroutine library with a yield call, a CPS transform like streamline, or direct support from the language (a yield operator).

Well, that sounds like a challenge!

If we really really want to, we can in fact solve this problem with a pure library running in ordinary JavaScript, with no generators or fibers and no precompiler. Whether this approach is practical is another matter, partly because it chews the CPU like a hungry wolf, but also because the main selling point for this kind of thing is that it makes life easier for beginners, and unfortunately to use my approach you have to understand the concept of a pure function and pay close attention to when you’re stepping outside of purity.

To set the stage, Bruno Jouhier’s example is a node.js routine that recurses through file system directories. Here’s a simplified version using the sync APIs:

var fs = require('fs');
var path = require('path');

var recurseDir = function(dir) {
    fs.readdirSync(dir).forEach(function(child) {
        if (child[0] != '.') {
            var childPath = path.join(dir, child);
            if (fs.statSync(childPath).isDirectory()) {
                recurseDir(childPath);
            } else {
                console.log(childPath);
            }
        }
    });
};

recurseDir(process.argv[2]);

And – ta da! – here’s a version that uses the async APIs but appears not to:

var fs = require('fs');
var path = require('path');

var Q = require('q');
var interrupt = require('./interrupt.js');

var readdir = interrupt.bind(Q.nfbind(fs.readdir));
var stat = interrupt.bind(Q.nfbind(fs.stat));
var consoleLog = interrupt.bind(console.log);

interrupt.async(function() {

    var recurseDir = function(dir) {
        readdir(dir).forEach(function(child) {
            if (child[0] != '.') {
                var childPath = path.join(dir, child);
                if (stat(childPath).isDirectory()) {
                    recurseDir(childPath);
                } else {
                    consoleLog(childPath);
                }
            }
        });
    };

    recurseDir(process.argv[2]);
});

The core of the program, the recurseDir function, looks practically identical. The only difference is that it calls specially wrapped versions of readdir, stat and console.log, e.g.

var readdir = interrupt.bind(Q.nfbind(fs.readdir));

The inner wrapper Q.nfbind is from the lovely q module that provides us with promises with (almost) the same pattern as jQuery.Deferred. Q.nfbind wraps a node API so that instead of accepting a function(error, result) callback it returns a promise, which can reduce yuckiness by up to 68%.

But interrupt.bind is my own fiendish contribution:

exports.bind = function(impl) {
    return function() {
        var that = this;
        var args = arguments;
        return exports.await(function() {
            return impl.apply(that, args);
        });
    };
};

So it wraps a promise-returning function inside that interrupt.await thingy. To understand what that is for, we have to go back to the start of the example, where we say:

interrupt.async(function() {

The function we pass there (let’s call it our “async block”) will be executed multiple times – this is a basic fact about all interruptible coroutines. They can be paused and restarted. But standard JavaScript doesn’t provide a way to stop a function and then start up again from the same point. You can only start again from the beginning.

In order for that to work, when a function reruns all its activity a second, third, fourth… (and so on) time, repeating everything it has already done, then it has to behave exactly the same as it did the previous time(s). Which is where functional purity comes in. A pure function is one that returns the same value when provided with the same arguments. So Math.random is not pure. Nor is reading the file system (because it might change under your feet). But quite a lot of things are pure: anything that only depends on our parameters, or the local variables containing whatever we figured out so far from our parameters.

So, inside interrupt.async we can do anything pure without headaches. But whenever we want to know about the outside world, we have to be careful. The way we do that is with interrupt.await, e.g.

var stuff = interrupt.await(function() {
    return $.get('blah-de-blah');
});

The first time the async block runs, when it goes into interrupt.await, it executes the function we pass to it (the “initializer”), which in this case starts a download and returns a promise that will be resolved when the download is ready. But then interrupt.await throws an exception, which cancels execution of the async block. When the promise is resolved, the async block is executed again, and this time interrupt.await totally ignores the function passed to it, but instead returns the result of the download from the promise created on the first run, which I call an externality (because it’s data that came from outside).

The internal representation is actually quite simple. Here’s interrupt.async:

function Interrupt() {}

var currentContext = null;

exports.async = function(impl) {
    log('Creating async context');
    var thisContext = {
        ready: [],
        waiting: null,
        slot: 0,
        result: defer(),
        attempt: function() {
            log('Swapping in context for execution attempt');
            var oldContext = currentContext;
            currentContext = thisContext;
            currentContext.slot = 0;
            try {
                thisContext.result.resolve(impl());
                log('Completed successfully');
            } catch (x) {
                if (x instanceof Interrupt) {
                    log('Execution was interrupted');
                    return;
                } else {
                    log('Exception occurred: ' + JSON.stringify(x));
                    throw x;
                }
            } finally {
                log('Restoring previous context');
                currentContext = oldContext;
            }
        }
    }
    log('Making first attempt at execution');
    thisContext.attempt();
    return getPromise(thisContext.result);
};

The important part is the context, which has an array, ready, of previously captured externalities, and an integer, slot, which is the index in the ready array where the next externality will be recorded.

The more fiddly work is done in interrupt.await:

exports.await = function(init) {
    if (!currentContext) {
        throw new Error('Used interrupt.await outside of interrupt.async');
    }
    var ctx = currentContext;
    if (ctx.ready.length > ctx.slot) {
        log('Already obtained value for slot ' + ctx.slot);
        var val = ctx.ready[ctx.slot];
        if (val && val.__exception) {
            log('Throwing exception for slot ' + ctx.slot);
            throw val.__exception;
        }
        log('Returning value ' + JSON.stringify(val) + ' for slot ' + ctx.slot);
        ctx.slot++;
        return val;
    }
    if (ctx.waiting) {
        log('Still waiting for value for ' + ctx.slot + ', will interrupt');
        throw new Interrupt();
    }
    log('Executing initializer for slot ' + ctx.slot);
    var promise = init();
    if (promise && promise.then) {
        log('Obtained a promise for slot ' + ctx.slot);
        var handler = function(val) {
            if ((ctx.slot != ctx.ready.length) ||
                (ctx.waiting != promise)) {
                throw new Error('Inconsistent state in interrupt context');
            }
            log('Obtained a value ' + JSON.stringify(val) + ' for slot ' + ctx.slot);
            ctx.ready.push(val);
            ctx.waiting = null;
            log('Requesting retry of execution');
            ctx.attempt();
        };
        promise.then(handler, function(reason) {
            log('Obtained an error ' + JSON.stringify(reason) + ' for slot ' + ctx.slot);
            handler({ __exception: reason });
        });
        ctx.waiting = promise;
        throw new Interrupt();
    }
    if (ctx.slot != ctx.ready.length) {
        throw new Error('Inconsistent state in interrupt context');
    }
    // 'promise' is not a promise!
    log('Obtained a plain value ' + JSON.stringify(promise) + ' for slot ' + ctx.slot);
    ctx.ready.push(promise);
    ctx.slot++;
    return promise;
};

It can deal with an initializer that returns a plain value, and avoids the overhead of interrupting and restarting in that case, but still enforces the same behaviour of capturing the externality so it can be returned on any subsequent repeat run instead of running the initializer again.

In fact we have an example of this in the original example: console.log is not a pure function. It has side-effects: conceptually, it returns a new state-of-the-universe every time we call it. So it has to be wrapped in interrupt.await, just like any other impure operation, and we faithfully record that it returned undefined so we can return that next time we execute the same step. In this case we’re not really recording a particular external value, but we are recording the fact that we’ve already caused a particular external side-effect, so we don’t cause it multiple times.

As long as the await-wrapping rule is followed, it all works perfectly. The problem, of course, is that if there are a lot of asynchronous promises and side-effecting calls involved, then it will start to slow down as it repeatedly stops and re-executes everything it has done so far. Although in fact, it doesn’t repeat everything. A lot of the hard work involves interacting with the OS, and that is (of necessity) wrapped in interrupt.await, and so only happens once. On subsequent executions the value cached in the ready array is reused, which is looked up by position, which is quite fast. So each re-execution only involves “going through the motions” to get back to where it left off.

Even so, this extra grinding of the CPU does start to slow it down very noticeably after a healthy number of interrupts (modern JavaScript is fast, but not a miracle worker). The recursion of the file system is a very good test, because it has to effectively recursively revisit all the places it already visited so far, and has to do this for every single file (due to the stat call) and twice for directories.

One way to “cheat” would be to replace the Array.prototype.forEach function with something that understood how to interact with the current async context, and could skip forward to the right place in the iteration… but I’ll save that for another rainy day.


Use of Knockout in Nimbah

December 27, 2012

As well as Nimbah being a tool that does something I occasionally need, it’s also a chance to experiment. (Also by writing this up, I’m probably going to spot some things I can simplify.)

While I’ve been using Knockout regularly in my work for a few months, it’s been as a tool that I apply here and there where appropriate. But in Nimbah I decided to go for broke and base the entire thing on Knockout. So in essence it’s a single view model: the whole document is bound once to the view model, and when you edit the pipeline you are just editing the view model.

A simplified schematic version of the view model is:

var viewModel = {
    inputText: ko.observable(localStorage.getItem('savedInputText') || ''),
    selected: ko.observable(),
    pipeline: ko.observable(null),
    layout: ko.observable('vertical')
};

ko.computed(function() {
    var root = viewModel.pipeline();
    if (root) {
        root.inputValue(viewModel.inputText());
    }
});

viewModel.outputText = ko.computed(function() {
    var root = viewModel.pipeline();
    return root ? viewModel.stringify(root.outputValue()) : '';
});

In the UI the inputText and outputText properties are each bound to a textarea (the output one being readonly).

But internally, thanks to the ko.computed parts, they are really little more than aliases for two properties on the root pipeline: inputValue and outputValue. The underlying model of Nimbah is based on nodes that all follow this same pattern. A pipeline is a node and operators are nodes, and some operators contain their own pipelines as child nodes, and so on.

The building block is the node function, which builds a raw node – again in simplified schematic form:

var node = function() {
    var model = {};
    model.readOnlyChildren = ko.observableArray();

    model.insertAfter = function(newChild, existingChild) ...
    model.insertBefore = function(newChild, existingChild) ...
    model.remove = function() ...
    model.parent = ko.computed({
        read: function() ...
        write: function(val) ...
    });
    model.firstChild = ...
    model.lastChild = ...
    model.nextSibling = ...
    model.previousSibling = ...

    return model;
};

Each node has (possibly null) references to its parent, firstChild, lastChild, nextSibling and previousSibling, all of which are shielded by ko.computed with read/write operations so that they remain consistent. For example, if you assign node A to be the nextSibling of node B, that’s equivalent to B.parent().insertAfter(B, A), which ensures that B also becomes the previousSibling of A and that B’s parent is now also A’s parent.

There’s an observable array of the node’s children, called readOnlyChildren to try to emphasise that it isn’t meant to be directly modified. I have to expose it because it’s what the UI binds to, but its contents are automatically maintained (via individual insertions and deletions) by the above “proper” node properties.

Why do it this way? Because of the way nodes obtain their input values. If an operator has a previousSibling, it uses that sibling’s outputValue as its own inputValue. If it has a null previousSibling, it’s the first operator in its pipeline, so it should be using the pipeline’s inputValue. And guess what? The pipeline is its parent, so it has no problem getting the inputValue from it. Hence for any operator node, inputValue looks like this:

model.inputValue = ko.computed(function() {
    return model.previousSibling() ? model.previousSibling().outputValue() :
           model.parent() ? model.parent().inputValue() : null;
});

In a pipeline it’s a lot simpler:

model.inputValue = ko.observable(null);

This is because pipelines can act as the root of everything, or be used in whatever way an operator wants to use them. So it’s up to the owner of the pipeline to feed it with the right inputValue.

Of course, the outputValue of a node has to be entirely its own responsibility – the whole point of a node is how it turns a given input into the right output. For a pipeline, it’s just the last child node’s outputValue (or for an empty pipeline it’s just the inputValue):

model.outputValue = ko.computed(function() {
    return model.lastChild() ? model.lastChild().outputValue() : 
           model.inputValue();
});

Each kind of operator has to implement outputValue differently. Here’s split:

model.outputValue = ko.computed(function() {
    var input = model.inputValue();
    if (input && typeof input.split == 'function') {
        return input.split(model.separator());
    }
    return input;
});

So the upshot of all this is that whenever you edit the text in the input textarea, it ripples through all the operators in the pipeline completely automatically, through the miracle of ko.computed. If a node is unplugged from some location and then plugged in somewhere else, everything updates accordingly. It’s just beautiful!

And that’s before we even get on to the joy of how undo/redo just bolts onto the side without any real effort…
