Static vs. Dynamic Typing (Reflections on the Original Wiki)

I’m not sure how I stumbled on the original Wiki at C2. It was probably (as for a lot of people) through an interest in the hype around “extreme programming” (remember that? “If it doesn’t work, you’re doing it wrong” – great times!).

I’m pretty sure all of my tiny flurry of activity on it was concentrated into a few months during 2000 (possibly a little in 2001), when it was already a sprawling activity-hive. So at least I wasn’t able to graffiti over it too painfully.

It’s hard to be certain, because web.archive.org apparently didn’t start indexing the site until 2002, by which time all the violent arguments on my user page had been cleared down. And this was pioneering Wiki software, so… no editing history! Seems like an incredible omission now.

But it does at least have reverse-linking – so if you signed your edits with your username, then you can easily find the latest version of them by clicking on the title of your user page. So today all I can see is that I left 30 or so signed comments scattered around before I got bored. (Of course, there may have been more that have since been deleted by others.)

Despite the brevity, there was definitely one thing that stayed with me from this experience. As can be seen from several of my contributions, I had a zealot’s horror on encountering people who appeared to seriously believe that a dynamically-typed language could be used for successful large scale development. So it was immediately my mission to save them from themselves. Instead (like a lot of people assuming a missionary position) I rapidly learned a thing or two.

People who work with a dynamically typed language are immediately faced with a problem. Every time they touch their code, they accidentally break something and they don’t have a clue what, or how badly. So to get anywhere at all, they are practically forced to write a lot of separate test code that systematically checks their assumptions about the code. For every possible way of correctly using a class or method, there has to be a test. And also a test for each known likely incorrect way it might be used, to ensure that it fails in the expected, helpful (unambiguous) way.

Once the programmer has gone to this effort, they are in a considerably better position than the user of a static language who relies on their compiler to check for errors. The reason is that static type systems can only express a limited set of “correctness requirements” on a program. User-written tests can check absolutely anything necessary. So a substantial set of tests will dig much deeper for errors than the mere type-compatibility checks possible in a compiler.
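As a rough sketch of what that kind of test code looks like: the `average` function and the two helpers below are made up for illustration, and no particular test framework is assumed. There is one test for a correct use, two for likely incorrect uses that must fail helpfully, and one for a requirement that a typical type checker has no way to express.

```javascript
// Hypothetical function under test: the mean of a non-empty array of numbers.
function average(numbers) {
    if (!Array.isArray(numbers) || numbers.length === 0) {
        throw new TypeError("average: expected a non-empty array of numbers");
    }
    var total = 0;
    for (var i = 0; i < numbers.length; i++) {
        if (typeof numbers[i] !== "number") {
            throw new TypeError("average: element " + i + " is not a number");
        }
        total += numbers[i];
    }
    return total / numbers.length;
}

// Minimal hand-rolled helpers (hypothetical, standing in for a test framework).
function expectEqual(actual, expected, label) {
    if (actual !== expected) {
        throw new Error(label + ": expected " + expected + " but got " + actual);
    }
}

function expectThrows(fn, messagePart, label) {
    try {
        fn();
    } catch (e) {
        if (String(e.message).indexOf(messagePart) === -1) {
            throw new Error(label + ": failed with an unhelpful message: " + e.message);
        }
        return;
    }
    throw new Error(label + ": expected a failure but the call succeeded");
}

// A correct use.
expectEqual(average([1, 2, 3]), 2, "mean of 1, 2, 3");

// Likely incorrect uses must fail in an unambiguous way.
expectThrows(function () { average("123"); }, "non-empty array", "string argument");
expectThrows(function () { average([1, "2"]); }, "element 1", "string element");

// A requirement beyond type compatibility: the order of the input does not matter.
expectEqual(average([3, 1, 2]), average([1, 2, 3]), "order independence");
```

The first three tests roughly cover what a compiler would have checked for free. The last one is the kind of check that only a test can make.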

Yes, of course, a static language user can also write tests. But doesn’t this mean that any ceremony they are forced to go through (due to the type system) is a waste of time?

This is still a bone of contention amongst academic experts, as well as a religious flashpoint for workaday hackers. Day to day, I continually switch back and forth between static (mostly C#) and dynamic (mostly JavaScript) languages, and see good and bad in both. From C# I’ve learned how incredibly productive “intellisense” and auto-completion are (to keep it general, let’s say: Interactive Type System, or ITS), when the implementation works reliably. In C++, where the complexity of the language is a serious challenge, a typical semi-working ITS had far less value.

So the user interface of the language becomes easier (at least on the type-consuming side) with static typing. JavaScript is the big practical case study here: it is the most widely distributed dynamic language, and ITS support for JavaScript is a much requested (and sometimes attempted) feature of IDEs. So the challenge becomes: how can pseudo-static type information be inferred from the source of a dynamically-typed program?

var util = {
    foo: function() { alert("Hello, world."); }
};

util.

After I type that dot, an interactive type inference engine is pretty secure in suggesting foo as something I would want to say next, and that I would therefore want to follow that with (), because it’s a function. But how about:

var util = {
    foo: function() { alert("Hello, world."); }
};

um(); // <------------ added this line

util.

The function um (actually it’s a variable pointing to a function object, or is it a named property of the global object, i.e. window["um"]? Whatever…) may be defined in a different file. Maybe a file that gets downloaded from the server dynamically and materialized with eval. And it might do this:

var um = function() {
    util.foo = 5;
};

In JavaScript there are only vague clues for a practical type inference engine to follow, so ITS doesn’t really work that well. It’s not a whole lot more than a gimmick. You can’t usefully impose static types on a very dynamic language as an afterthought.
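To see the problem actually happen, here are the snippets above joined into one runnable piece. Two things are my own changes for the sake of running it outside a browser: `foo` returns its greeting instead of calling `alert`, and `um` is defined in the same file rather than arriving from elsewhere.

```javascript
var util = {
    foo: function () { return "Hello, world."; }
};

// Imagine this definition arrived from another file, or via eval.
var um = function () {
    util.foo = 5;
};

var before = typeof util.foo;   // what an inference engine sees in this file
um();
var after = typeof util.foo;    // what is actually there at run time

console.log(before, after);     // prints: function number

// Calling util.foo() at this point throws a TypeError, because 5 is not callable.
```

Nothing in the file that defines `util` hints that `foo` will stop being a function, so any suggestion made after the dot is a guess.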

As long as you don’t have to explicitly declare types, languages with a firm statically-typed basis are just much more convenient to use – completely contrary to most people’s assumptions. The big irony here is that in “scripting” activities, you consume types far more often than you define them, and so you could be consuming ITS like crazy. You just want to quickly write the right thing first time, using a lot of types that other people went to the trouble of declaring for you, without you having to check a bunch of documentation. So static type systems actually make more sense in scripting scenarios. Why is the world so back-to-front on this?

So there’s that. Then there’s the fact that we need serious help when writing concurrent, parallel or multi-threaded code. This is one place where “my program definitely works – I wrote unit tests!” just doesn’t cut it. Threads are all about timing. You may have covered all your code with tests, but have you covered every possible meshing intersection between operations that can occur simultaneously on 32 cores and might interfere with each other? Short answer: no, you haven’t.
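A back-of-envelope calculation shows why. Assume, as a big simplification, that each thread is a fixed sequence of atomic steps and that the scheduler may interleave them in any order. The number of distinct orderings is then a multinomial coefficient. The function names below are mine, purely for illustration.

```javascript
// Binomial coefficient C(n, k), built up so every intermediate value is an integer.
function choose(n, k) {
    var result = 1;
    for (var i = 1; i <= k; i++) {
        result = result * (n - k + i) / i;
    }
    return result;
}

// Number of ways to interleave threads, given how many atomic steps each one has.
// Exact only while the result stays below 2^53 (ordinary JavaScript numbers).
function interleavings(stepsPerThread) {
    var total = 0;
    var count = 1;
    for (var t = 0; t < stepsPerThread.length; t++) {
        total += stepsPerThread[t];
        count *= choose(total, stepsPerThread[t]);
    }
    return count;
}

console.log(interleavings([2, 2]));        // 6
console.log(interleavings([10, 10]));      // 184756
console.log(interleavings([5, 5, 5, 5]));  // 11732745024
```

Four threads of five steps each is a toy program, and it already has over eleven billion possible schedules. A test run exercises one of them.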

A language that lets you attempt any operation at runtime feels freeing, and dynamic typing enthusiasts use words like “straitjacket” to characterise static typing. But parallel programming is simply not a tractable problem without rigidly enforced restrictions. Total freedom for the programmer means explosive degrees of freedom for the program, which then becomes an utterly uncontrollable situation for multi-threaded programs. So to be productive on very parallel machines, people are going to have to accept some restrictions on what they are allowed to do. And that makes it even more useful to have early feedback from your development tools.

Yes, you could perhaps write by hand a lot of unit tests that attempt to completely prove that you don’t break the formal set of rules that have to be followed in your safe parallelism model (whatever those rules happen to be). But surely you’d be crazy to do that kind of thing by hand. Much easier to let a formal system do it all for you. Just add a keyword to a method that means “this method must have no side-effects” and let your tools check it for you. Why would it make sense to hand craft the necessary tests to check whether a method might ever have side effects? (Think about all the things in the universe you’d have to examine to see if they’ve changed.)
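Here is roughly what the hand-crafted version ends up looking like. It is a sketch only: `settings`, `log`, both total functions and the checker are invented for the example, and the snapshot uses `JSON.stringify`, so it only sees plain data.

```javascript
// Hypothetical shared state that a method under test might touch.
var settings = { retries: 3 };
var log = [];

function pureTotal(items) {
    var total = 0;
    for (var i = 0; i < items.length; i++) {
        total += items[i];
    }
    return total;
}

function impureTotal(items) {
    log.push("totalling");      // side effect on shared state
    return pureTotal(items);
}

// Reports whether calling fn changed any of the watched objects.
// It can only see the objects it is handed: everything else goes unchecked.
function changesWatchedState(fn, watched) {
    var before = JSON.stringify(watched);
    fn();
    return JSON.stringify(watched) !== before;
}

var watched = [settings, log];
console.log(changesWatchedState(function () { pureTotal([1, 2]); }, watched));   // false
console.log(changesWatchedState(function () { impureTotal([1, 2]); }, watched)); // true
```

The checker catches `impureTotal` only because `log` happens to be on the watched list. A method that scribbles on something nobody thought to watch passes this test.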

Or it may be that (as I’ve been hoping for about a decade) threads will die out utterly as a programming model, and instead we’ll have something like multiple isolated processes communicating only via explicitly declared channels. But then I think… for efficiency, you want to be able to pass immutable objects through such channels (without having to deep copy them), because that’s perfectly safe to do, so why not? So provable immutability is going to be important, and static type systems are surely the sane way to achieve that, aren’t they?
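For contrast, here is about the best a dynamic language can offer today. The sketch assumes an ES5-capable engine for `Object.freeze`, assumes messages are plain acyclic data, and uses a made-up `send` function as the channel.

```javascript
// Freezes a plain, acyclic data structure all the way down.
function deepFreeze(value) {
    if (value === null || typeof value !== "object") {
        return value;
    }
    Object.keys(value).forEach(function (key) {
        deepFreeze(value[key]);
    });
    return Object.freeze(value);
}

// Hypothetical channel: hands the same object to the receiver, with no copy made.
// The check runs on every send, and Object.isFrozen only inspects the top level.
function send(receiver, msg) {
    if (!Object.isFrozen(msg)) {
        throw new TypeError("send: message must be frozen");
    }
    receiver(msg);
}

var message = deepFreeze({ id: 1, tags: ["a", "b"] });
var received = null;

send(function (msg) {
    try {
        msg.tags.push("c");     // rejected at run time: the array is frozen
    } catch (e) {
        // the attempted write is only discovered here, while running
    }
    received = msg;
}, message);

console.log(received === message, message.tags.length);   // true 2
```

It works, but every guarantee is checked while the program is running, and only as deeply as the check bothers to look. That is the gap a static system would close.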

So on the whole, I think static is the way to go. But this is based only on pure reason, which has its critics.

I’m also surprised looking back at the old Wiki to notice my early interest in JavaScript and its relation to Self. I had been fascinated by Self (I mean the language, not my self, although I am rather fascinating) since stumbling across it in a web search in about 1996, and got a bit excited about the similarities when I began messing around with JavaScript, which became unavoidable once it was added to IE3 that year. So this little report on the SelfLanguage page would have been old hat to me by the time I wrote it:

The objects in JavaScript are rather like this. Objects are just associative arrays or maps, from names to members, and members can be functions as well as variables, allowing you to reassign them whenever. There is a standard prototype ‘slot’ (or property, in JavaScript terminology) that causes the object to effectively inherit features of another object (the value of that property.) So in reality, something rather like Self is most likely enabled in the Web browser you are reading this page with. 🙂 It might make sense to consider the experiences of JavaScript users if you want a picture of what large scale Self usage would be like! Probably not much fun. (Personally I would miss StaticTypeSafety…)
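A small sketch of what that report describes, written with `Object.create`, which arrived with ES5 and so is an assumption about your browser (older code gets the same effect through a constructor function’s `prototype` property). The `base` and `child` objects are just illustrative names.

```javascript
var base = {
    greet: function () { return "Hello from " + this.name; }
};

var child = Object.create(base);   // child delegates to base for missing members
child.name = "child";

console.log(child.greet());                   // Hello from child
console.log(child.hasOwnProperty("greet"));   // false: greet is inherited

// Members can be reassigned whenever, and every delegating object sees the change.
base.greet = function () { return "Goodbye from " + this.name; };
console.log(child.greet());                   // Goodbye from child
```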

Doug Crockford has a slide in his very entertaining and still-ongoing Crockford on JavaScript presentation, which states that the first important discovery of the 21st century is that JavaScript has some good parts. I can assure you some of us were hip to that way back in the 20th century, daddio!

  1. Mikhail
    March 24, 2010 at 9:25 pm

    I find it disturbing that ThoughtWorks so strongly support Ruby for large projects, see for example this presentation: “Rails in the Large: How Agility Allows Us to Build One Of the World’s Biggest Rails Apps” http://www.infoq.com/presentations/ford-large-rails

    They never mention in their presentations that 80% of software cost is maintenance. I think they are building an enormous programming debt now, and I feel sorry for the folks who will have to refactor their code.

    • earwicker
      March 24, 2010 at 9:51 pm

      @Mikhail – I wouldn’t go that far… there is no reason why code written in a specific language should necessarily be a debt that will be hard to refactor. We can create a huge pile of crap in any language! As long as they are covering their code well enough with tests, they can refactor with confidence. Tests play the same role as static typing in terms of refactoring. As dynamic languages go, Ruby is a fine language – at least from my (so far) limited experience of it, it seems very well equipped, and I have little difficulty following the intention of other people’s Ruby code. I think that is a characteristic of good, high-level languages: not too much ceremonial distraction, just getting on with the job. And so easier to see the programmer’s intention.

      Also I should clarify that my point about thread safe coding is somewhat moot today, at least for Java and C#, because they don’t (yet) have support for declared purity and immutability in their type systems, so at this point they aren’t really any better than dynamic languages. That was more about the future.

  2. jalf
    March 25, 2010 at 5:24 am

    One thing I’ve noticed is that often, when people seem to prefer dynamically typed languages, what it really means is that they prefer languages with extensive type inference. It’s not really the type system being dynamic that makes the language so convenient, it’s simply *not having to specify all those damn types everywhere*.

    Of course, the C family of languages are all horrible at this, even with C#’s var or C++0x’s auto, but a lot of functional languages do an impressive job of this, eliminating the need for almost all type information from the code.

    I wish I still had the link to the source, but I read that at some MS seminar, the attendees were asked which was their favorite dynamically typed language, and the majority answered F#. 🙂

    Most people don’t really seem to distinguish between “dynamically typed” and “no need to explicitly specify types in source code”.

    • earwicker
      March 25, 2010 at 10:01 am

      @jalf – Absolutely, that’s a frequent problem. Another error is believing that static typing means “total absence of types at runtime”, which it pretty much does in C, but certainly not in more modern languages (or more accurately, their standard runtime environments and APIs).

      So when you ask someone to provide an example of something they do in their favourite dynamic language to demonstrate its superiority, they very often mention some use of reflection, which of course is very frequently used in Java and C# (and can easily be used to simulate dynamic access to objects).

  3. April 3, 2010 at 4:59 pm

    I think Kant would have shared your disdain for the threaded programming model.
