Sunday, October 24, 2010

A MXHR loader - Part 2 - How does it work

In the last post I described the overhead problem when transferring lots of files to your browser. The basic idea of Multipart XHR is to have a little dealer script (that is my personal name for it) which sits and waits for requests. A request could look like:

Javascript: "Hey dealer, deliver me foobar.jpg, crockford.jpg, example.html and test.js"

That dealer script does nothing but open, read and concatenate the files' contents into one big chunk of data. Images need to be encoded as base64, since we are only transferring a "string".
One more job for that script is to "mark" each data block with a MIME type, so our Javascript knows how to interpret the incoming data. The last thing to do here is to separate the data chunks. That sounds easier than it is, because we're reading and concatenating all kinds of data (maybe even binary data encoded as base64), so we can't just declare a slash (or whatever) as our boundary character. We need a dead sure delimiter. As it turns out, the founding fathers of ASCII will help us here.

"ASCII reserves the first 32 codes (numbers 0–31 decimal) for control characters: codes originally intended not to represent printable information, but rather to control devices (such as printers) that make use of ASCII, or to provide meta-information about data streams such as those stored on magnetic tape."

Sounds pretty sweet. And it is indeed a great way to separate a stream. In Perl and PHP you can just call

chr(1) . "some text";

That prepends the ASCII control character with code 1 (SOH, "start of heading") to "some text". In Javascript we can write it as a Unicode escape like

var fieldDelimiter = '\u0001';
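To make the idea more concrete, here is a minimal sketch of what the dealer's concatenation step could look like, written in Javascript for Node.js. It is not the actual dealer code: the wire format, the second delimiter (recordDelimiter) and the helpers isBinary and buildPayload are assumptions made up for illustration.

```javascript
// Assumed wire format (an illustration, not the real dealer's format):
//   <MIME type> FIELD <content> RECORD <MIME type> FIELD <content> RECORD ...
var fieldDelimiter = '\u0001';  // SOH, separates the MIME type from the content
var recordDelimiter = '\u0002'; // STX, hypothetical choice to terminate each file

// Hypothetical helper: treat images as binary data that must be base64-encoded.
function isBinary(mimeType) {
  return mimeType.indexOf('image/') === 0;
}

// files: array of { mimeType: string, content: Buffer } (Node.js Buffers)
function buildPayload(files) {
  var payload = '';
  for (var i = 0; i < files.length; i++) {
    var file = files[i];
    var body = isBinary(file.mimeType)
      ? file.content.toString('base64') // base64 never contains control characters
      : file.content.toString('utf8');
    payload += file.mimeType + fieldDelimiter + body + recordDelimiter;
  }
  return payload;
}
```

The sketch relies on text files not containing the two control characters themselves, which should hold for ordinary HTML, CSS and Javascript sources.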

The dealer script can be written in any language, since it is really doing trivial work. So far I've only written versions for Perl and PHP. Since I'm releasing the complete code on Github alongside these posts, feel free to contribute. I'd love to see a Rails or NodeJS version.

This whole concept is not actually new. I believe some of the very first experiments and releases came from the Digg User Interface Library guys. I picked up the topic from the book "High Performance JavaScript" by Nicholas C. Zakas.
So a ton of credit goes to these guys.

Back to topic. The reason for using MXHR instead of normal XHR is performance. It's simply faster to transfer n files in one request than in n requests (see Part 1).
But if we're after performance, we should look for more! We are now receiving one big chunk of data instead of several small ones. That data stream is delimited with ASCII control characters, so we should try to do something useful with the data as soon as we have one complete chunk, instead of waiting until we have fetched the whole data string. That is where the XMLHttpRequest "Interactive" state comes in handy. While receiving our data, we ask the browser to inform us whenever data has arrived. We can do this by checking the readyState of the XHR object. If it is 3, responseText should contain all the data received so far; see The XMLHttpRequest Object from W3C.
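A rough sketch of that incremental reading could look like the following. It is not the code of my loader: the record delimiter ('\u0002') and the names createChunkReader and load are assumptions chosen for this example, and the payload is assumed to have the form "MIME type, field delimiter, content, record delimiter" for each file.

```javascript
var fieldDelimiter = '\u0001';  // separates the MIME type from the content
var recordDelimiter = '\u0002'; // assumed terminator after each file

// Returns a function that takes everything received so far and calls
// onChunk(mimeType, content) once for every newly completed chunk.
function createChunkReader(onChunk) {
  var offset = 0; // index of the first character not yet handed out
  return function read(text) {
    var end;
    while ((end = text.indexOf(recordDelimiter, offset)) !== -1) {
      var record = text.substring(offset, end);
      var split = record.indexOf(fieldDelimiter);
      onChunk(record.substring(0, split), record.substring(split + 1));
      offset = end + 1; // continue right after the record delimiter
    }
  };
}

// Hypothetical wiring to an XHR object (browser only).
function load(url, onChunk) {
  var xhr = new XMLHttpRequest();
  var read = createChunkReader(onChunk);
  xhr.onreadystatechange = function () {
    // 3 = data is arriving, 4 = request complete
    if (xhr.readyState === 3 || xhr.readyState === 4) {
      try {
        read(xhr.responseText);
      } catch (e) {
        // Assumption: some older browsers refuse access to responseText
        // before readyState 4; the final call then delivers every chunk.
      }
    }
  };
  xhr.open('GET', url, true);
  xhr.send(null);
}
```

Because the reader remembers its offset, calling it again with a longer responseText only hands out chunks that were not delivered before, and the call at readyState 4 picks up whatever is left.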

All major browsers support readyState 3 nowadays. Older versions (especially of Internet Explorer) might have some trouble with it. On those browsers we have to work around the issue. But if the browser supports it, we gain even more speed when loading data on the client.

Part 3 will finally show some code snippets and the complete source of "Supply", my MXHR loader script.
