Sunday, October 24, 2010

A MXHR loader - Part 2 - How does it work

In the last post I described the overhead problem when transferring lots of files to your browser. The basic idea of Multipart XHR is to have a little dealer script (that is my personal name for it) on the server which sits and waits for requests. A request could look like:

Javascript: "Hey dealer, deliver me foobar.jpg, crockford.jpg, example.html and test.js"

That dealer script does nothing but open, read and concatenate the files' contents into one big chunk of data. Images need to be encoded as base64, since we are only transferring a "string".
One more job for that script is to "mark" each data block with a MIME type, so our Javascript knows how to interpret the incoming data. The last thing to do is to separate the data chunks from each other. That sounds easier than it is, because we're reading and concatenating all kinds of data (maybe even binary data encoded as base64), so we can't just declare a slash (or whatever) as our boundary character. We need a dead sure delimiter, one that can never show up in the data itself. As it turns out, the forefathers of ASCII will help us here.

"ASCII reserves the first 32 codes (numbers 0–31 decimal) for control characters: codes originally intended not to represent printable information, but rather to control devices (such as printers) that make use of ASCII, or to provide meta-information about data streams such as those stored on magnetic tape."

Sounds pretty sweet. And it is indeed a great way to separate a stream. In Perl and PHP you can just call

chr(1) . "some text";

That would prepend the first ASCII control character (SOH, code 1) to "some text". In Javascript we can write the same character as a Unicode escape sequence, like

var fieldDelimiter = '\u0001';
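
To make the delimiter idea a bit more concrete, here is a minimal sketch of how the client side could take such a string apart again. It is not the code from my loader. The layout is an assumption for illustration only: each file is sent as MIME type, then '\u0001', then the data, and files are separated from each other by '\u0002'. The names recordDelimiter and splitPayload are made up for this example.

```javascript
// Assumed layout (illustration only):
//   mime \u0001 data \u0002 mime \u0001 data \u0002 ...
var fieldDelimiter  = '\u0001'; // SOH, separates the MIME type from the data
var recordDelimiter = '\u0002'; // STX, separates one file from the next (assumption)

function splitPayload(payload) {
    var records = payload.split(recordDelimiter),
        result  = [],
        i, pos;

    for (i = 0; i < records.length; i++) {
        if (records[i] === '') { continue; }   // empty piece after the last delimiter
        pos = records[i].indexOf(fieldDelimiter);
        if (pos === -1) { continue; }          // malformed record without a MIME marker
        result.push({
            mime: records[i].substring(0, pos),
            data: records[i].substring(pos + 1)
        });
    }
    return result;
}
```

Since base64 output and ordinary text files never contain these control characters, a plain split is enough here.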

That dealer script can be written in any language, since it is really doing trivial work. I've only written versions for Perl and PHP so far. Since I'm releasing the complete code on Github at the same time as this post, feel free to contribute. I'd love to see a Rails or NodeJS version.
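
Just to illustrate how little the dealer has to do, here is a rough sketch of its core in NodeJS-flavoured Javascript. This is not one of the released versions. It assumes the file contents were already read into Buffers, that text is passed through as UTF-8 while everything else gets base64 encoded, and that '\u0002' is used as a second control character between files. The names FIELD, RECORD and buildPayload are hypothetical.

```javascript
var FIELD  = '\u0001'; // SOH, between MIME type and data
var RECORD = '\u0002'; // STX, after each file (assumption for this sketch)

// files: [{ mime: 'image/jpeg', content: <Buffer> }, ...]
// In a real dealer the Buffers would come from reading the requested files.
function buildPayload(files) {
    var out = '', i, file, isText;

    for (i = 0; i < files.length; i++) {
        file   = files[i];
        // Treat text/* and Javascript as text, everything else as binary.
        isText = /^text\//.test(file.mime) || file.mime === 'application/javascript';
        out += file.mime + FIELD +
               (isText ? file.content.toString('utf8')
                       : file.content.toString('base64')) +
               RECORD;
    }
    return out;
}
```

The response is then one string which the browser receives through a single XHR request.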

The whole concept is not actually new. I believe one of the very first releases and experiments came from the Digg User Interface Library guys. I picked up the topic from the book "High Performance Javascript" by Nicholas C. Zakas.
So a ton of credit goes to these guys.

Back to the topic. The reason for using MXHR instead of normal XHR is performance. It's just faster to transfer n files in one request than in n requests (see Part 1).
But if we're after performance, we should look for more! So now we're receiving one big chunk of data instead of several small ones. That data stream is delimited with ASCII control characters, so we should try to do something useful with the data as soon as we have one complete chunk, instead of waiting until we have fetched the whole data string. That is where the XMLHttpRequest "Interactive" mode comes in handy. While receiving our data, we ask the browser to inform us whenever new data has arrived. We can do this by checking the readyState of the XHR object. If it is 3, responseText should contain all data that was received so far, see The XMLHttpRequest Object from W3C.
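
Here is a minimal sketch of that "process each chunk as soon as it is complete" idea. It is only an illustration and not the Supply code. It assumes that every file in the stream is terminated by '\u0002' and that the MIME type and the data are separated by '\u0001'. createStreamParser and onChunk are hypothetical names. The returned function is meant to be called with the full responseText each time the browser reports new data. It remembers how far it got and only hands over chunks that have arrived completely.

```javascript
var fieldDelimiter  = '\u0001'; // between MIME type and data
var recordDelimiter = '\u0002'; // after each file (assumption for this sketch)

function createStreamParser(onChunk) {
    var offset = 0; // how much of responseText we have already consumed

    return function(responseText) {
        var end, record, pos;

        // Look for complete records only. An unfinished tail stays
        // untouched until a later call delivers the rest of it.
        while ((end = responseText.indexOf(recordDelimiter, offset)) !== -1) {
            record = responseText.substring(offset, end);
            offset = end + 1;

            pos = record.indexOf(fieldDelimiter);
            if (pos !== -1) {
                onChunk(record.substring(0, pos), record.substring(pos + 1));
            }
        }
    };
}
```

Because the offset is kept between calls, no chunk is reported twice, even though responseText always contains everything received so far.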

All major browsers support readyState 3 nowadays. Older versions (especially of Internet Explorer) might have some trouble with it. In those browsers we have to work around the issue and wait for the complete response. But if the browser supports it, we gain even more speed when loading data on the client.
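
One possible shape of such a workaround, as a sketch under assumptions: use the data at readyState 3 where it is available, and always handle readyState 4 so that browsers without partial data still get everything at the end. As far as I know, reading responseText at readyState 3 can throw on the old ActiveX objects, so that access is wrapped in a try block. watchResponse and onData are hypothetical names for this example.

```javascript
// xhr:    an XMLHttpRequest (or ActiveX) object that has not been sent yet
// onData: hypothetical callback, called as onData(text, isComplete)
function watchResponse(xhr, onData) {
    xhr.onreadystatechange = function() {
        var text;

        if (xhr.readyState === 3) {
            try {
                text = xhr.responseText; // may not be readable yet in old browsers
            } catch (e) {
                return;                  // no partial data, wait for readyState 4
            }
            if (text) { onData(text, false); }
        } else if (xhr.readyState === 4) {
            onData(xhr.responseText, true); // complete response, works everywhere
        }
    };
}
```

A browser with readyState 3 support calls onData several times with a growing string, any other browser calls it exactly once with the whole response.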

Part 3 will finally show some code snippets and the complete source of "Supply", my MXHR loader script.

Thursday, October 21, 2010

A MXHR loader - Part 1 - The downside of XHR

This is going to be some kind of tutorial on how to create a Multipart XMLHttpRequest Javascript loader script. I don't think I will get through all of it in only one post, so it's probably going to be a series.
I guess a pretty good start for this is to talk about MXHR in general. So,

what is MXHR ?

Ajax had a major impact on modern web development. It revolutionized websites in that we no longer needed to reload a whole page to load new data into it. That was a big part of the so-called "Web 2.0".
Anyway, what is this all about? To create a so-called "AJAX request" we need an "XMLHttpRequest" object in Javascript. That object provides all the necessary methods and properties. A classical example would look like:

var _msxml_progid = [
        'Microsoft.XMLHTTP',     // no readystate === 3 support
        'MSXML2.XMLHTTP.3.0'     // no readystate === 3 support
    ];

var xhr = (function() {
    var req;
    try {
        req = new XMLHttpRequest();
    } catch(e) {
        // no native object, so try the ActiveX strings in reversed order
        var len = _msxml_progid.length;
        while (len--) {
            try {
                req = new ActiveXObject(_msxml_progid[len]);
                break; // stop at the first string that works
            } catch(e2) { }
        }
    } finally { return req; }
}());

Ok, to be honest, that is a little bit more than a classical example, but it's probably the way a standard XHR setup should look. If you don't understand what is going on here, a brief description:

_msxml_progid is an array with possible ActiveX XHR strings (used by the Internet Explorers of this world). I then call a self-invoking anonymous function to initialize the XHR object. It tries to create a "standard" XMLHttpRequest object first. If that fails, we are most likely in an Internet Explorer environment.
In that case, it tries to create an ActiveXObject with the strings of my array in reversed order.
The finally block returns the newly created XHR object, or undefined if we had no success.

So far so good. Where is the Multipart part, you might wonder? Well, it's not here yet.

A big downside of an Ajax request is the overhead. The overhead consists of header information, cookies and other data besides the data you actually want to send or receive. That is a bad thing. Imagine we want to transfer five small HTML pages with Ajax requests. That would be a huge waste of bandwidth and performance, due to the overhead each HTTP request creates.

But it's not only Ajax requests that show this behavior. Let's assume you're displaying five images on your site. To do that, you place <img> tags in your markup. Guess what, each <img> tag will create a request to your server to load the file you specify in the src attribute. In other words, that method will also create overhead information. Of course that information is not completely unnecessary: your browser is telling the server what encodings, charsets, MIME types, etc. it accepts. The response from the server, on the other hand, is telling your browser what data the server is actually going to send. So we need that information, but we don't need or want it for every single file!
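
A quick back-of-the-envelope calculation shows why this adds up. The numbers below are pure assumptions for illustration, not measurements: five files and about 500 bytes of request and response headers per HTTP request. headerOverhead is a made-up helper name.

```javascript
// All numbers used here are assumptions for illustration, not measurements.
function headerOverhead(fileCount, headerBytesPerRequest) {
    return {
        separate: fileCount * headerBytesPerRequest, // one set of headers per file
        combined: headerBytesPerRequest              // one set of headers in total
    };
}

var estimate = headerOverhead(5, 500);
// estimate.separate is 2500 bytes, estimate.combined is 500 bytes
```

With these assumed values, four out of five header sets are pure overhead, and the gap grows with every additional file.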

Multipart XHR to the rescue

If there were a way to transfer multiple files in one request, that would be really great. That is the point where MXHR enters the stage. It does exactly that: transferring multiple files over a single XHR request.
In a few days part 2 will follow with the topic "How does it work?"

Tuesday, October 12, 2010

jQuery: Go for Gold!

I was pretty busy over the last few weeks, which is the reason I haven't posted any content here.
Anyway, for this occasion I'm taking the time. You may know that I am a pretty active member of the Stackoverflow community. Yesterday I had quite an accomplishment: I earned the "jQuery Gold Badge". That makes me the 13th guy in the community to hold that badge.

Not too shabby... :-)

Enough self-adulation for now. I will come up with a series of MXHR (Multipart XMLHttpRequest) posts soon. What that is? You will know in a few days.