Home > Uncategorized > SICP-style Streams in JavaScript

SICP-style Streams in JavaScript

In the not-famous-enough book Structure and Interpretation of Computer Programs (Abelson & Sussman, or “The Wizard book”) we learn about streams.

A stream is a tempting variation on the old school Lisp style of linked list. To get a plain old list, we can set up objects like this:

var a = {
    value: 'apple',
    next: null
};

var b = {
    value: 'banana',
    next: a
};

var c = {
    value: 'cantaloupe',
    next: b
};

So here our whole list is represented by c, and we can loop through it and print all the fruits:

for (var i = c; i != null; i = i.next) {
    console.log(i.value);
}

So far, so boring. The idea with a stream is very simple. Instead of storing the next object in the next property, we store a function that, if called, will return the next object. That is, we make it lazy. Note that our loop would still look much the same:

for (var i = c; i != null; i = i.next()) {
    console.log(i.value);
}

The only difference is we call next() instead of just reading it. And to set up the objects we’d have to say:

var a = {
    value: 'apple',
    next: function() { return null; }
};

var b = {
    value: 'banana',
    next: function() { return a; }
};

var c = {
    value: 'cantaloupe',
    next: function() { return b; }
};

So far, so pointless. But the value of this does not come from silly hand-built examples. In real software you would use this to generate streams from other data sources, or from other streams. It’s like Linq-to-objects in C#, but the foundations are actually more purely functional, because even the iteration process involves only immutable objects, and so everything is repeatable, nothing is destroyed merely by using it. Part-way through a stream you can stash the current node, and come back to it later. It will still represent “the rest of the stream”, even though you already used it once.

It is this extreme level of generality that persuaded me try using streams in a real JavaScript library. I want to write a rich text editor for HTML Canvas (more of that in a later post, hopefully). So I would have streams of characters, streams of words, streams of lines, etc. It seemed to fit, and also I have a week off work and it’s fun to re-invent the wheel.

I start with an object representing the empty stream. This is nicer than using null, because I want to provide member functions on streams. If you had to check whether a stream was null before calling methods on it, that would suck mightily.

var empty = {};

function getEmpty() {
    return empty;
}

Then we need a way to make a non-empty stream:

function create(value, next) {
    return Object.create(empty, {
        value: { value: value },
        next: { value: next || getEmpty }
    });
}

It uses the empty stream as its prototype, and adds immutable properties for value and the next function. If no next function is passed, we substitute getEmpty. So calling create('banana') would make a stream of just one item.

One very handy building block is range:

var range = function(start, limit) {
    return start >= limit ? empty : create(start, function() {
        return range(start + 1, limit);
    });
};

Note the pattern, as it is typical: the next works by calling the outer function with the arguments needed to make it do the next step. And you may be thinking – AHGGHGH! Stack overflow! But no, as long as we loop through the stream using our for-loop pattern, the stack will not get arbitrarily deep.

Here’s a favourite of mine, so often forgotten about:

var unfold = function(seed, increment, terminator) {
    return create(seed, function() {
        var next = increment(seed);
        return next === terminator ? empty :
            unfold(next, increment, terminator);
    });
};

You call it with a seed value, which becomes the first value of the stream, and also an increment function that knows how to get from one value to the next, and a terminator value that would be returned by the increment function when it has no more values. So in fact you could implement range in terms of unfold:

var range = function(start, limit) {
    return unfold(start, function(v) { return v + 1; }, limit);
};

It can also turn a traditional linked list into a stream:

var fromList = function(front) {
    return unfold(front, function(i) { return i.next; }, null);
};

Groovy! Now we have several ways to originate a stream, so lets add some methods. Recall that empty is the prototype for streams, so:

empty.forEach = function(each) {
    for (var s = this; s !== empty; s = s.next()) {
        each(s.value)
    }
};

Nothing to it! And we can use forEach to get a stream into an array:

empty.toArray = function() {
    var ar = [];
    this.forEach(function(i) { ar.push(i); });
    return ar;
};

Of course, how could we live without the awesome power of map?

empty.map = function(mapFunc) {
    var self = this;
    return self === empty ? empty : create(mapFunc(self.value), function() {
        return self.next().map(mapFunc);
    });
};

Again, that lazy-recursive pattern. And now we can very easily implement converting an array into a stream:

var fromArray = function(ar) {
    return range(0, ar.length).map(function(i) {
        return ar[i];
    });
}

How about concat? Well, this has a slight wrinkle in that if the argument is a function, I treat it as a lazy way to get the second sequence:

empty.concat = function(other) {
    function next(item) {
        return item === empty
            ? (typeof other === 'function' ? other() : other)
            : create(item.value, function() { return next(item.next()); });
    }
    return next(this);
};

And with concat we can easily implement the holy grail of methods, bind (known as SelectMany in Linq and flatMap in Scala):

empty.bind = function(bindFunc) {
    var self = this;
    return self === empty ? empty : bindFunc(self.value).concat(function() {
        return self.next().bind(bindFunc);
    });
};

Think that one through – it’s a mind-bender. The bindFunc returns a sub-stream for each item in the outer stream, and we join them all together. So:

assertEqual(

    // ordinary array of numbers
    [1, 2, 3, 4, 5, 6, 7, 8, 9],

    // making that same array in an interesting way
    Stream.fromArray(
        [[1, 2, 3], [4], [5, 6], [], [7], [], [], [8, 9]]
    ).bind(function(ar) {
        return Stream.fromArray(ar);
    }).toArray()

);

Anyway, I wrote my rich text layout engine using this stream foundation, and (as I like to do with these things) I set up an animation loop and watched it repeatedly carry out the entire word-break and line-wrap process from scratch in every frame, to see what frame rate I could get. Sadly, according to the browsers’ profilers, the runtime was spending a LOT of time creating and throwing away temporary objects, collecting garbage and all the other housekeeping tasks that I’d set for it just so I could use this cool stream concept. Interestingly, in terms of this raw crunching through objects, IE 10 was faster than Chrome 30. But I know that by using a simpler basic abstraction it would be much faster in both browsers.

How do I know? Well, because I found that I could speed up my program very easily by caching the stream of words in an ordinary array. And guess what… I could just use arrays in the first place. I am only scanning forward through streams and I definitely want to cache all my intermediate results. So I may as well just build arrays. (Even though I haven’t started the rewrite yet, I know it will be way faster because of what the profilers told me).

So, for now, we say: fair well streams, we hardly knew ye.

Advertisements
  1. December 6, 2013 at 2:17 pm

    Fantastic, thanks!
    Glad that I learnt the concept of streams and its manipulation.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: