Dao De Web (or: The Painful History of the URL)

Think of an application as being like a building. It has several rooms, and the URL tells you which room you’re in:

  • reception
  • accounts-dept
  • meeting-rooms/1
  • meeting-rooms/2
  • bathroom

This is the ideal way to use the URL because it will fit perfectly with how the user expects the browser to work. The URL tells the application which room to display, and so the user can:

  • Press the Back button to go back to the room they were in just now.
  • Press the Forward button if they change their mind again.
  • Bookmark (or “favourite”) the rooms they visit most frequently.
  • Look at their History to see which rooms they’ve been in.
  • Press the Refresh button to see an up-to-date view.
  • Copy the URL from the address bar, paste it in an email and send it to a co-worker to say “Look at this!”
  • Click a link (an underlined piece of text concealing a URL) to get to a different room.
  • Right-click and choose Open in new tab to see a room on a different tab in their browser.

The above operations are the core features of the web browser. People expect them to work, and they are part of the reason why web applications have an advantage over custom thick client software. The mantra should always be:

If you want to know where you are, look at the URL

The interesting thing is that in many interactive web-based applications, including some of the most widely-used commercial sites on the Web, URLs are not used in this way, and they break much of that idealised functionality.

State

In the real world we learn to expect our environment to be stateful. Suppose you go up to your desk and arrange things to your liking, and then you turn away for a second. But it’s a magical desk: when you turn back to look at it again, it has “reset” itself! All the things you moved have somehow moved back to their original positions.

That would be very counter-intuitive and unhelpful (not to mention scary). So there’s nothing more annoying than an application that plays dumb and doesn’t remember what you were doing with it five minutes ago.

On the other hand, imagine checking into your hotel room only to find that the room is still in the state that the previous guest created: wet towels on the floor, bed not made, and so on. There should be a strict boundary between hotel guests so they don’t have visibility of another guest’s room state. So state has to have a well-defined and limited scope during which it is applicable.

So, can the URL store the state of a web application? Obviously there’s more to the state of a building than the name of the room you’re currently in. It is possible to encode various other pieces of information into plain text and then append that to the URL. But would that be a good idea?

The Accidental Time Machine

Consider an online store. Suppose when you visit the page that shows you what’s in your cart, the URL was:

shoppingCart?contents=12,8,442,23

That list of numbers contains the product codes of all the things you put in your cart. Each time you add a product to your cart, the URL changes, and the previous URL is added to your browsing history. And this in turn means that your browsing history contains previous versions of that list of purchases.

So that’s a crazy implementation! When you press the Back button to go back to your shopping cart, it won’t show you what it contains now. Instead it will show you what it contained at some previous time.

So instead of the Back button allowing you to retrace your steps, it allows you to move backwards through time! Or to put it less outrageously, it sort of works a bit like an Undo button.
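The effect is easy to reproduce with a toy model of the browser history. This is only an illustrative sketch: `entries`, `visit`, `back` and `cartContents` are hypothetical stand-ins invented for the example, not real browser APIs.

```javascript
// Toy model: the browser history is just a list of URLs already visited.
const entries = [];
let position = -1;

function visit(url) {
    // Visiting a new URL discards any "forward" entries, like a real browser.
    entries.length = position + 1;
    entries.push(url);
    position = entries.length - 1;
    return url;
}

function back() {
    if (position > 0) position -= 1;
    return entries[position];
}

// The cart page derives its contents purely from the URL.
function cartContents(url) {
    const match = /[?&]contents=([^&#]*)/.exec(url);
    return match && match[1] ? match[1].split(',').map(Number) : [];
}

visit('shoppingCart?contents=12,8');
visit('shoppingCart?contents=12,8,442');
visit('shoppingCart?contents=12,8,442,23');

// Pressing Back returns the previous URL, and with it the previous cart.
const staleUrl = back();
console.log(cartContents(staleUrl)); // the three-item cart 12, 8, 442
```

The fourth product has vanished from the page, even though nothing was removed from the cart.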

You may have encountered this problem in real online stores; one example for many years was play.com, although it is fixed now.

The Receptionist Fired Twice

An even worse problem occurs when the URL in the address bar represents a command. Suppose you want to go into reception and fire the receptionist, and suppose that command is represented in the address bar as:

reception?fireReceptionist=true

The page comes back telling you that you successfully fired the receptionist. You then visit a meeting room to tell them the news. While you’re in that room, a new receptionist is sent by the agency.

Then you hit the Back button to go back to reception, and… oh dear. You just fired the new receptionist as well! When you go back in the browser history, the URL is sent to the server exactly the same as the first time, and the server obeys it. This is even worse than the time machine bug. So applications have to employ all kinds of trickery to stop commands being obeyed more than once.
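Here is the same story as a toy simulation. Everything in it is made up for illustration: `handleRequest` plays the server, while `goTo` and `goBack` play a simplified browser that re-requests a URL when you go back to it.

```javascript
// Toy server: handling this URL has a side effect every time it is requested.
const firedReceptionists = [];
let currentReceptionist = 'first receptionist';

function handleRequest(url) {
    if (url === 'reception?fireReceptionist=true' && currentReceptionist) {
        firedReceptionists.push(currentReceptionist);
        currentReceptionist = null;
        return 'Receptionist fired';
    }
    return 'Showing ' + url;
}

// Toy browser: going Back simply requests the earlier URL again.
const visitedUrls = [];

function goTo(url) {
    visitedUrls.push(url);
    return handleRequest(url);
}

function goBack() {
    visitedUrls.pop();
    return handleRequest(visitedUrls[visitedUrls.length - 1]);
}

goTo('reception?fireReceptionist=true');     // fires the first receptionist
goTo('meeting-rooms/1');                     // off to share the news
currentReceptionist = 'second receptionist'; // the agency sends a replacement
goBack();                                    // replays the command

console.log(firedReceptionists.length); // 2
```

The server has no way to tell a deliberate command from a replayed history entry, which is why the URL is the wrong place for a command.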

You may have seen some sites that plead with you not to press the Back button during some process – issuing a credit card payment, for example. It’s obviously not a great experience for the user, but for many years it was the best that could be done in some situations.

Another Page Expires

The most common problem of all is caused by POST. This is another way to send requests to the server, favoured because it allows a larger amount of data to be sent, even large files, but it has a very unfortunate side-effect. It puts an entry into the browser’s history that doesn’t work properly. When the user presses their Back button, they typically just get a confusing error message:

The page you requested was created using information you submitted in a form. This page is no longer available. As a security precaution, Internet Explorer does not automatically resubmit your information for you.

This still happens today on Amazon when you try to back out of buying whatever’s in your shopping cart. One workaround favoured by experienced users is to right-click on the Back button (click and hold on the Mac) to reveal a popup menu of recent history, so you can skip the problematic entry in your history and instead go to the page before that one. But you can’t do that on the iPad. In any case, instead of being a helpful UI, the browser has been transformed into a puzzle to be solved.

A large proportion of web applications have historically suffered from these and other issues. It’s only in the last five years or so that a pattern has emerged that allows us to build dynamic applications and yet still consistently deliver the original intended user experience of the web.

Some Modern Websites

Next time you look at a website, look closely at the URL in your browser’s address bar. Here’s what the Gmail address bar looks like:

[Image: the Gmail address bar]

And here’s Twitter (when you’re signed in):

[Image: the Twitter address bar]

The special ingredient these have in common is the hash symbol: #. That marks a division in a URL. Everything on the left, before the hash, is an instruction to the remote Web server, and so is interpreted on the server side. Everything on the right, starting with the hash, is never used by the server. In fact the browser doesn’t even send it.

In normal Web development terminology (which can be confusingly overloaded) the part of the URL starting at the hash symbol is also known as “the hash” of the URL. So in the Gmail example, the hash is:

#inbox
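You can see the division for yourself with the standard `URL` class, available in browsers and in Node. The address below is a hypothetical one chosen for the example, not Gmail’s real URL.

```javascript
// Hypothetical address, used only to show how a URL is split at the hash.
const address = new URL('https://example.com/mail/?tab=all#inbox');

// The path and query string that travel to the server in the request:
const sentToServer = address.pathname + address.search;

// The part that stays behind in the browser:
const keptInBrowser = address.hash;

console.log(sentToServer);  // "/mail/?tab=all"
console.log(keptInBrowser); // "#inbox"
```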

If the server is going to ignore this part of the URL, why is it there?

A Brief History of Hash

The original purpose of the hash (and still a perfectly good use for it) was to allow the creation of links to other places within the same document. For example, visit this page on Wikipedia:

http://en.wikipedia.org/wiki/Web_browser

It has a contents box that just contains a few links to sections of the document. When you click the Features link, the browser’s address bar changes to:

http://en.wikipedia.org/wiki/Web_browser#Features

The key point is that the browser doesn’t contact the server when the hash changes or is added. That part of the URL is of no interest to the server, so there’s no need to contact it again. By default, all the browser does when the hash changes is scan through the HTML looking for an element whose ID matches the text after the hash symbol, and scroll the page so that the identified element is at the top of the window.

If there is no element in the document with that ID, it doesn’t do anything. So you can edit the hash to say anything you want, and your browser will just ignore you.
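That default behaviour can be approximated in a few lines. This is a rough model rather than what any browser literally runs (real browsers handle a few extra cases), and `scrollToFragment` is a name invented for the sketch.

```javascript
// Rough model of the browser's default reaction to a new hash.
// `doc` is anything with a getElementById method, such as the real `document`.
function scrollToFragment(doc, hash) {
    const id = hash.replace(/^#/, '');
    const target = id ? doc.getElementById(id) : null;
    if (!target) {
        return false; // no matching element: nothing happens
    }
    target.scrollIntoView(); // by default, aligns the element with the top of the window
    return true;
}
```

In a browser you would call it as `scrollToFragment(document, location.hash)`.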

But otherwise, hash URLs are treated just like server URLs: they go into the browser history. You can press the Back button to revisit them, bookmark them, and so on.

This is the key to solving the various issues with browser-based interactive applications. The hash part of a URL is for linking to things within the current document, instead of linking to other documents.

But now suppose you were to accept the following astonishing principle:

An interactive application is a single document.

Therefore it follows unavoidably that to support navigation within the application, you would use only hash URLs. That is, you would only vary the stuff after the hash symbol. Every navigation performed by the user within the application will not directly result in a round-trip to the server. Something else has to happen instead.

The default behaviour, scrolling to an element with the specified ID, isn’t much use generally speaking. But with JavaScript we can respond to a change in the URL in any way we like.

Detecting a Hash Change

Newer versions of many browsers have a built-in scripting event, hashchange, that is raised whenever the hash part of the URL changes. In less capable browsers it is necessary to start a timer that will periodically check the URL to see if it has changed. Of course, it’s simple enough to hide this inside a jQuery plugin, and sure enough, such plugins exist.
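If you are curious what such a plugin has to do internally, here is a minimal sketch of the two strategies. It is not the code of any particular plugin: `watchHash` is a hypothetical helper, and the window object is passed in as `win` purely so the sketch can be tried outside a browser.

```javascript
// Calls onChange(newHash) whenever the hash part of the URL changes.
// `win` is normally the browser's `window` object.
function watchHash(win, onChange, intervalMs = 100) {
    let lastHash = win.location.hash;

    function check() {
        const currentHash = win.location.hash;
        if (currentHash !== lastHash) {
            lastHash = currentHash;
            onChange(currentHash);
        }
    }

    if ('onhashchange' in win) {
        // Capable browsers raise an event for us.
        win.addEventListener('hashchange', check);
    } else {
        // Fallback: poll the URL on a timer.
        win.setInterval(check, intervalMs);
    }
}
```

In a page you would write something like `watchHash(window, function (hash) { console.log('now at', hash); });`.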

For even greater convenience, you will want to bind different handlers to a specific URL pattern. You could do that by handling the hashchange event and then parsing the URL manually, or you could use a library to do this for you. For this I’ve found sammy.js to be excellent. For example, you can register a handler to render meeting rooms:

this.get('#/meeting-rooms/:roomNumber', function(context) {
    // render the requested meeting room here
});

Within the handler, you can access context.params.roomNumber to get the variable part of the URL, because any path segment prefixed with a colon denotes a parameter.
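To make the colon convention less mysterious, here is a small stand-alone sketch of how pattern matching of this kind can work. It is not how sammy.js is implemented; `compileRoute`, `get` and `dispatch` are names made up for the illustration.

```javascript
// Turns a pattern such as '#/meeting-rooms/:roomNumber' into a matcher function.
function compileRoute(pattern) {
    const names = [];
    const source = pattern
        .split('/')
        .map((segment) => {
            if (segment.startsWith(':')) {
                names.push(segment.slice(1));
                return '([^/]+)'; // a parameter matches one whole path segment
            }
            // Anything else is literal text, escaped for use in a regex.
            return segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        })
        .join('/');
    const regex = new RegExp('^' + source + '$');

    return function match(hash) {
        const found = regex.exec(hash);
        if (!found) return null;
        const params = {};
        names.forEach((name, i) => {
            params[name] = decodeURIComponent(found[i + 1]);
        });
        return params;
    };
}

const routes = [];

function get(pattern, handler) {
    routes.push({ match: compileRoute(pattern), handler });
}

// Runs the first handler whose pattern matches the given hash.
function dispatch(hash) {
    for (const route of routes) {
        const params = route.match(hash);
        if (params) return route.handler({ params });
    }
    return undefined;
}

get('#/meeting-rooms/:roomNumber', (context) => 'Meeting room ' + context.params.roomNumber);

console.log(dispatch('#/meeting-rooms/2')); // "Meeting room 2"
```

Wire `dispatch` up to the hashchange event and you have the bare bones of hash-based navigation.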

The remaining question is: now that we’ve utterly eliminated the server from this picture, how do we bring it back?

Ajax To the Max

The answer is that we call it when we want to. The browser doesn’t dictate anything. Calls to “the backend” are decoupled from the UI, and it’s entirely up to us how we re-couple them.

The short answer is simple and obvious: when you need some more data, or you need to update some data stored on the server, you make a call. You may even find it convenient to make multiple calls in sequence or in parallel, depending on how much control you have over the backend’s API “surface”.

If you sign into Twitter and use a logging proxy like the wondrous Fiddler, you will see that once the main page of the application has been downloaded, practically all subsequent calls back to the server are to api.twitter.com and they return very clean JSON data structures, rather than HTML formatted on the server. All the updates to the page are performed in the client-side JavaScript.

And this allows the user experience to take on a new slickness – it isn’t necessary to discard and rebuild the whole page. You can update some patch of it, and perform an animated transition. There are suddenly a lot more possibilities for making your application more modern, fresh and engaging, and they are easy to do, especially with jQuery and its army of plugins – the Web is full of examples, because jQuery is so widely used already.
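As a sketch of what one of those calls might look like, here is a handler-side helper that fetches JSON and updates a single element. The `/api/meeting-rooms/` endpoint and the `name` and `seats` fields are assumptions invented for the example, and it uses the standard `fetch` function of current browsers rather than jQuery.

```javascript
// Hypothetical endpoint: the path and the JSON shape are assumptions for illustration.
function roomApiUrl(roomNumber) {
    return '/api/meeting-rooms/' + encodeURIComponent(roomNumber);
}

// Fetches one room as JSON. `fetchImpl` defaults to the standard fetch function.
async function loadRoom(roomNumber, fetchImpl = fetch) {
    const response = await fetchImpl(roomApiUrl(roomNumber), {
        headers: { Accept: 'application/json' },
    });
    if (!response.ok) {
        throw new Error('Request failed with status ' + response.status);
    }
    return response.json();
}

// Updates only the affected patch of the page, leaving the rest alone.
function renderRoom(element, room) {
    element.textContent = room.name + ' (seats ' + room.seats + ')';
}
```

A route handler could then do `loadRoom(2).then(function (room) { renderRoom(someElement, room); });` where `someElement` is whichever part of the page shows the room.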

What About Search Engines?

The scope of this kind of architecture is limited by a simple fact: search engines (which is to say: Google) will not execute JavaScript code during their indexing processes, and so will not follow your hash-links.

So if you want Google to index your site, you either cannot use this approach, or you’ll need to develop a basic old-school version of the site so that Google can index it. It doesn’t have to be as snazzy or helpful as the JavaScript version – it just has to serve up all the indexable text.

Of course, a great many web applications are behind an “authentication wall” – the user must sign in before they can see anything. Such applications don’t want Google to index their data, and so they don’t need to worry about this.

Another possibility is that you may want to support users who suffer from the same limitation as Google’s indexer – they don’t want to execute JavaScript and choose to disable it. It’s up to you whether you see that as a market you need to serve, of course. It’s a rapidly shrinking market – and in fact some newer browsers don’t even have an option to switch JavaScript off.

Much more important is the issue of accessibility: how does a blind person access your site? But this is not actually solved automatically by the classic web model – you have to design accessibility. And you can do this for a heavily JavaScript-based application too.

The unfortunate truth is that most companies neglected accessibility when working under the classic web model, and they’ll probably continue to neglect it in the future, but this is an underlying problem that probably cannot be solved by technology.

The Unnameable Way

The term AJAX was introduced to describe an early variant of this pattern: the X stood for XML, which has nothing to recommend it nowadays. And when most people say AJAX, they are probably still talking about a traditional web application that contains a few specific uses of scripted calls to the server, but otherwise suffers from the flaws we identified earlier.

By fully embracing the pattern, those flaws can be solved. And to be clear, the pattern is:

  • A single static .html document establishes the permanent structural framework of the application, and pulls in the script that registers handlers for certain hash URL patterns.
  • Those handlers update the UI to reflect the location specified in the current hash URL, probably by making calls back to a clean JSON-based API exposed at the server.
  • State that is not related to navigational location never appears in the URL, and nor do commands that have side-effects on state, because URLs can be revisited via the browser’s history features.

Naturally there’s an ironic footnote to this story.

The Ironic Footnote

Back in the early 1990s, people wrote things that were grandly called “client-server applications”. In practice, this meant a Visual Basic UI with an RDBMS storing the data. Microsoft dominated the desktop, and it seemed only a matter of time before they would dislodge Oracle from the server side as well.

The web seemed to blow this model out of the water, because people were – quite rightly – impressed by the browser as a universal window to the information world and began to view it as something that applies to everything. The old model was derided as “thick client”, in contrast to the much-admired “thin client” represented by the browser.

In truth, the client-server approach was evolving towards a “not-too-thick” client model, because the RDBMS was a fairly capable development platform itself. Business logic could be moved into stored procedures. Even so, that implied an unnecessary dependency on a highly proprietary platform, and was generally avoided for that reason.

One proposed alternative, prior to the Web, was the business logic server, embodied in products like Microsoft’s Transaction Server (MTS). There was much talk of tiered applications: the client (in Visual Basic, of course), the business logic in MTS, and the data layer in SQL Server. But again, it was obviously proprietary, and it was hard to see how it might fit into the all-conquering Web, so it was swept into the dustbin of history.

The LAMP stack was all-conquering! Or so it seemed.

The strange truth is that the built-in capabilities of the browser actually did not make it at all easy to support the browser’s navigation model in an interactive application. In fact they made it very easy to break. If you wanted a nice browsable UI, it would have been easier to just add Back and Next buttons to your crummy Visual Basic application!

Where we have now arrived, the world doesn’t look too dissimilar to the proposed pre-Web model. The vital difference is that it is a victory for the vendor-neutral approach:

  • There is now a great variety of ways to store data, not all of them SQL-based.
  • The business logic API is exposed by any Web server, simply so it can be called directly in standard ways from any browser.
  • And within the browser we can write “not-too-thick” clients that present an ideal user interface into the business logic.

We took quite a tortured route to get to the destination. But we’re there now. Phew.
