Cross-domain XHR, Access-Control, preflight

It looks like my previous post about browsers sending an OPTIONS request instead of GET has nothing to do with Dojo, which became obvious when I saw that Prototype behaves the same way. I’ve researched the topic and here are my findings.

It turns out that IE8, Safari 4, FF 3.5 and Chrome implement a new specification that allows cross-domain XHR. This means the pure JS implementation I demonstrated wasn’t supposed to work at all without this new specification. Here’s what the old XHR spec has to say about cross-domain (cross-origin) requests, taken from http://www.w3.org/TR/XMLHttpRequest/#the-open-method:

If the origin of url is not same origin with the XMLHttpRequest origin the user agent should raise a SECURITY_ERR exception and terminate these steps.

Not allowing cross-domain XHR was, and still is, a real deal breaker: it pretty much stops you from implementing SOA (service-oriented architecture) flexibly. But the restriction exists for good reasons.

Here are a few theoretical scenarios:

  1. Imagine you are visiting attacker.com, which serves a script that requests bank.com/?action=money_transfer&to=attacker&amount=999999. Assuming you have an active session with the bank, if your browser sends this request to bank.com along with the session cookie, the attacker would be able to transfer money to himself. This is called CSRF (cross-site request forgery).
  2. Imagine you are visiting attacker.com, which serves a script that requests 10.0.0.50/confidential_intranet_document.html and sends it back to the attacker. This means any client in the trusted LAN might leak information from the LAN to the internet.
  3. Imagine you are visiting trusted.com, which happens to have a security hole that lets an attacker inject malicious code into its web pages. For instance, imagine you could embed Javascript in Facebook messages. When other users see that message and the Javascript code you injected runs in their browser, you could read their cookies and hence steal their Facebook session. This is called XSS (cross-site scripting).

There are, however, other transport mechanisms, such as the <script> element, that are not restricted by this Same Origin Policy. So far these mechanisms have been used instead of the obvious XHR method to achieve cross-domain requests.

There is a new specification being drafted to address these issues, http://www.w3.org/TR/access-control/, which is the reason an OPTIONS request was being sent instead of GET in my previous post. The new spec says that it is OK to send a simple request (defined as GET, HEAD or POST) cross-domain as long as there’s no custom header in it. If these conditions are not met, there should be a preflight request to ensure that the domain we’re requesting the document from allows us to fetch it, much like Flash’s policy file.

Note the custom headers clause above. That’s the exact reason why Prototype and Dojo were causing an OPTIONS request instead of GET, whereas regular JS was simply sending a GET request: Dojo and Prototype add custom headers to their requests.

So you might ask: cross-domain XHR was disallowed for a good reason, so why is it being allowed now?

Yes, cross-domain XHR is allowed now, but apparently it is no different from the cross-domain requests you can send via img or script elements. Remember that you could always make cross-domain requests with the img element too, but the img element has two features that keep it from being a security problem:

  1. img can only send the cookies for the domain it is loaded from, i.e. it is hard to use a remote session since it won’t send the target site’s cookie.
    Consider the first scenario above. If the request does not include a cookie for bank.com, there’ll be no session. It will be an anonymous request. (Unless, of course, the target site uses the session ID as part of the URL and the attacker got that SID, which is very unlikely. And if he has the SID he’ll hijack your session altogether anyway.)
  2. You cannot read the contents of an img element, hence you cannot steal sensitive information which you aren’t supposed to read.

Now, I demonstrated in my previous post that cross-domain XHR worked out fine: my server received the GET request. BUT in the client, xhr.responseText was empty and xhr.status was 0 (not 200). The request was actually made, but the script cannot read the contents of the resource. Here’s what the access-control spec says about this in http://www.w3.org/TR/access-control/#requirements:

  1. Should not allow loading and exposing of resources from 3rd party servers without explicit consent of these servers as such resources can contain sensitive information.

One of the requirements of the spec is not to expose resources without explicit consent. From what I understand, explicit consent here means the Access-Control-Allow-Origin header. If the third-party server allows other hosts to read its resources via this header, everything will be fine. So this means that the new XHR is no bigger a security hole than the img element itself.

In fact, I’ve tested this. It turns out that when you add this header to your resource, cross-domain XHR starts to work to the fullest, i.e. the content of the requested resource becomes readable in xhr.responseText.

For your information, you can add any header to your resources with the mod_headers module of Apache httpd. Just add this directive for whatever directory you want:

Header set Access-Control-Allow-Origin "*"

Keep in mind that this will expose all of your resources in that directory for anyone to read. So do this only if your resources are public anyway, or allow just the hosts you want. It could be better to do this in the programming layer, such as PHP or ASP.NET.

So in conclusion, with the new access-control spec, XHR is pretty similar to Flash’s security design. The browser checks whether the third-party host allows you to read its resources; if so, your script is allowed to read them. Note that a simple request is sent anyway, but reading the response is not allowed without that consent.

This is a nice step forward, but since it will take some time before the majority of the market uses browsers that implement this new spec, web developers are bound to keep using iframe or script transports for cross-domain requests.

First Dojo impressions

I started implementing a daemon in Java. Essentially, all our devices will connect to it and wait for commands over TCP/IP. Additionally, it will offer a REST web service API over HTTP, so that administrators can send data to and receive data from the devices. This is basically a relaying architecture between devices and administrators. It overcomes network topology problems (e.g. NAT traversal) at the price of the performance penalty and bandwidth costs of relaying. The web service API is going to be totally self-sufficient, such that totally static HTML pages with Javascript can interact with it.

I’ve implemented the skeleton of the daemon: devices connect to it and it also offers the web service API. I tested the web service API by simply requesting its URL from the browser, as I would any other URL, and confirmed that the correct (JSON) response is given. It was time to see what it is like to build a web UI for it. Given its reputation and apparent support, I chose Dojo to implement the web UI with totally static HTML pages. Here’s my experience.

Documentation

The main Dojo site stated that the latest stable release was 1.4.0, and it was the default download link, so I grabbed it. By looking at an example in the demo section, I got an Ajax query written in seconds, only to find out that it was not working for me: instead of a GET request it was sending an OPTIONS request (more on this later). Obviously, I wanted to look at the documentation. Clicking the Documentation link on the main Dojo site takes you to a place where documentation for 1.4.0 is not offered.

Luckily there were a handful of helpful folks in #dojo @ irc.freenode.net, who told me that a new documentation UI is on its way.

The first one was inaccurate at the time of writing: its dojo.xhrGet property list was quite short. I found doc-staging to be more accurate, and since it is documentation rather than just a reference, it also offered much more detail.

Dojo.xhrGet results in OPTIONS request instead of GET

The first thing I tried with Dojo was obviously the Ajax API.

function getText() {
  dojo.xhrGet({
    url: "http://localhost:8182/hello",
    load: function(responseObject, ioArgs){
      return responseObject;
    },
    error: function(response, ioArgs){
      dojo.byId("toBeReplaced").innerHTML =
        "An error occurred, with response: " + response;
      return response;
    },
    handleAs: "json"
  });
}

This code snippet is taken from the Dojo examples on the official web site. I removed the body of the first function, though; it was supposed to do DOM manipulation.

When I ran this code, I noticed an OPTIONS request in my daemon’s logs. When I requested the same URL by typing it into the browser’s address bar, all I saw was GET requests in my logs, as expected.

Then I tried a pure JS implementation.

var client = new XMLHttpRequest();
client.onreadystatechange = handler;
client.open("GET", "http://192.168.1.94:8182/hello");
client.send();

With this simple implementation I started seeing GET requests on my server, as expected. So Dojo must be doing something differently. I stepped through the Dojo code, thanks to Firebug, but it turned out that the code is indeed very similar to the regular JS code and there were no obvious bugs, as I expected. Then I examined the HTTP requests with the invaluable Wireshark.

Here’s what I got for the Dojo request.

OPTIONS /hello HTTP/1.1
Host: 192.168.1.94:8182
Connection: keep-alive
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.49 Safari/532.5
Cache-Control: max-age=0
Access-Control-Request-Method: POST
Origin: file://
Access-Control-Request-Headers: X-Prototype-Version, X-Requested-With, Content-type, Accept
Accept: */*
Accept-Encoding: gzip,deflate
Accept-Language: tr-TR,tr;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-9,utf-8;q=0.7,*;q=0.3

And here’s what a regular JS XHR looks like.

GET /hello HTTP/1.1
Host: 192.168.1.94:8182
Connection: keep-alive
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.49 Safari/532.5
Cache-Control: max-age=0
Origin: file://
Accept: */*
Accept-Encoding: gzip,deflate
Accept-Language: tr-TR,tr;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-9,utf-8;q=0.7,*;q=0.3

Obviously the difference is the Access-Control-* headers. I tracked the source of this down to lines 10474 to 10477 of dojo.js:

 // FIXME: is this appropriate for all content types?
10474 xhr.setRequestHeader("Content-Type", args.contentType || _defaultContentType);
10475 if(!args.headers || !("X-Requested-With" in args.headers)){
10476 xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest");
10477 }

Funny thing, there’s a FIXME there :) Anyway, when I commented out these lines, the request headers were the same for both the pure JS and the Dojo Ajax implementations, and both now send GET requests as expected.

I consulted the folks at #dojo about this problem and at first they couldn’t reproduce it. Then I explained that the web UI is hosted at X and the web service API of the daemon is at X:P, so it is a cross-domain request. To be precise, the script was hosted at http://192.168.1.94/test.html and the web service API of the daemon was available at http://192.168.1.94:8182/hello. sfoster from #dojo generously spent some time testing this situation and he also confirmed that OPTIONS requests were being sent. Apparently, when the Access-Control-* headers are set and it is a cross-domain request, browsers decide to send an OPTIONS request instead of GET. This was tested with Chrome and Firefox.

Though I believe the Access-Control-* headers are there for a good reason. The same problem can also be demonstrated with the Prototype JavaScript framework; apparently it takes the same approach.

I’m not sure what the best practice is yet; I’ll try to consult some core Dojo developers about this and figure it out.

Application deployment on a jailbroken iPhone 3.1.2 with Xcode 3.2.1

I recently got a Macbook to develop an application for our company to show off at the industrial automation fair this year. I’ll probably post about the project later, if I can manage to build it.

Since I have no intention of putting any application on the App Store, and I don’t want to wait for the approval process, I decided to deploy my application on a jailbroken iPhone. Here’s how:

  1. Visit blackra1n.com and jailbreak your iPhone.
  2. Follow these instructions carefully. (backup)
  3. The above instructions do not mention that you have to explicitly select the certificate you have just created in Xcode. You can do it by selecting your project in the “Groups & Files” pane, then hitting Command + I. This will pop up the Get Info window for the project. In the Build tab, under the Code Signing category, select the certificate you have just created.
  4. At this point you may get the infamous “No provisioned iPhone OS device is connected.” error, as I did. Inspired by this article (backup), in Xcode I went to Window -> Organizer, selected my iPhone and clicked Use for development.
  5. Now I’m able to deploy my apps to the jailbroken iPhone.

Final words… I must say, in contrast to this, programming a Windows Mobile device is as easy as plugging the device into your computer and clicking the debug button.

Even though Apple’s intention to strictly supervise the applications going into App Store is a good choice (because you don’t have crap-ware that cripples your device as you’d see in Windows platforms), restricting one from programming his own device is plain stupid.

NOTE: The above method is a pain in the ass and it does not support the build-and-go/debug feature of Xcode, though there is documentation that explains how to do it. I ended up buying a subscription for $99; the whole process took 16h, and I had to fax a signed document to Apple. So if you are in a region where you can subscribe online, you’d be done in much less time.

Conclusion: Buy a subscription :)

Fikifiki – Very simple sudoku solver in C

It was about 4 months ago. I was waiting indefinitely for something in a hospital. Luckily, I had my cute old iBook with me, which has gcc on it! Even Eclipse! Then I saw the sudoku puzzle in the paper, so I quickly coded a sudoku solver in C in a couple of hours. I could have added many algorithms to it, but I just added the simplest one and it surprisingly worked on my first try :) This one simple algorithm is able to solve easy sudoku puzzles, though one can add as many algorithms as necessary. Everything is 655 lines of C code, including all the formatting and the comments (if any). Here’s the code. This will probably be used by some lazy ass students :)

You can compile by either invoking “make” in Release or Debug directories, or just import the project in Eclipse and enjoy there.

Macros in C/C++, the right way.

When used appropriately, macros are very useful, yet they are very easy to misuse. Before getting into the pros and cons, let’s first make sure we really know what a macro is and how it works.

Roughly, there are three stages of compilation in C: preprocessing, compiling and linking. It really makes sense and it is very easy to understand; don’t think of this as a complex deal. We’ll mostly talk about the preprocessing stage in this post. As the name implies, this stage just pre-processes the source code before compiling it; it all happens prior to compiling and that’s it. Preprocessing includes defining macros and replacing each macro found in the source code with its value, so that the result can be compiled. So get this right once and for all: macros are processed before compilation, and they won’t be in the run-time code.

As an exercise, you can use the -E option of GCC, which makes it stop right after the preprocessing stage and print the preprocessed source.

Now, let’s begin with a few examples:

#define ERROR_SEEK 152
if( errno == ERROR_SEEK )
{
    // Handle the error.
}

This directive defines a macro ERROR_SEEK. The preprocessor will replace every occurrence of ERROR_SEEK with 152 prior to compiling. So the code that is sent to the compiler will contain “if( errno == 152 )”, not ERROR_SEEK, because the compiler simply does not know what ERROR_SEEK is. Nothing fancy, dead simple. But it makes the code much more readable. Look at this example:

#ifdef __GNUC__
#define COMPILER "gcc"
#else
#define COMPILER "proprietary"
#endif

If __GNUC__ is defined somehow, then every occurrence of COMPILER will be replaced with “gcc”, and with “proprietary” otherwise. Please note that this all happens before compilation, and it does not result in any executable code. In this special case __GNUC__ is defined by GCC itself. We can also define a macro ourselves, either via the #define directive in the source code or with the -D option of the compiler (which works with most compilers).

Actually, the name preprocessor says it all: all these preprocessor directives are processed before compilation. Makes sense, huh? Now I think we are beginning to understand the nature of macros and the preprocessor.

Macros can be used to

  • Improve readability (ERROR_SEEK is much more meaningful than 152)
  • Improve maintainability (When you change ERROR_SEEK in one header, it will be replaced all over the source code)
  • Ensure that a block of code is inlined (more on this later)

Though, if misused, the first two list items above will do just the opposite :)

As for the possible disadvantages of the preprocessor:

  • Could make debugging harder, as the source lines before and after preprocessing differ, so the debugger can get confused while stepping through the executable code and matching it to source lines.
  • It is easy to misuse the preprocessor.
  • Could cause trouble to static code analyzers.

Now, possible misuse scenarios:

1. Operator precedence

The most common misuse scenario, which you’ll see in almost every C book, is this:

#define FOO_CONST 83+22
printf("%d\n", FOO_CONST * 5 ); // This macro will expand to 83+22 * 5, which evaluates to 193.

One could expect 525 to be printed. But operator precedence makes the calculation 22 * 5 first and then adds 83 to the result, so you’ll see 193 instead of 525. Fixing this problem is easy:

#define FOO_CONST (83+22)
printf("%d\n", FOO_CONST * 5 ); // This macro will expand to (83+22) * 5, which evaluates to 525.

Enclosing the value of a macro in parentheses is an easy way to ensure that it is evaluated in the right order.

2. Multiple statements

You can use multiple-statement macros to force inlining of a code block, though note that this will increase the code size of your program. Anyway, the problem with multiple-statement macros is less well known. Now, look at this.

#define HELLO(X) printf("Hello "); printf( X "\n" );

The problem with this macro is that if you use it as the body of an “if”, it will not do what you intend.

if(0)
    HELLO("world"); // You'd expect that this HELLO() is never "executed".

Above code will expand into

if(0)
    printf("Hello ");
printf( "world" "\n" );

As you can see in the above example, only the first statement in the macro is conditionally executed, which is not what we intended. The above code will always print “world\n”.

So, the first thing that comes to mind is to put the statements in curly brackets. Now let’s evaluate that.

#define HELLO(X) { printf("Hello "); printf( X "\n" ); }

The above code looks fine at first glance, but it introduces a sneaky bug that results in a compilation error. Imagine the macro is used like this:

if(1)
    HELLO("world");
else
    HELLO("baby");

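This will expand into the following. The semicolon after the closing brace ends the if statement, so the else that follows has no if to attach to: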

if(1)
{
    printf("Hello ");
    printf("world" "\n");
}; // WE GOT ERROR HERE
else
{
    printf("Hello ");
    printf("baby" "\n");
};

So, here’s the trick that works perfectly: the do {} while(0) trick. OK, let’s see.

#define HELLO(X) do { printf("Hello "); printf( X "\n"); } while(0)

You can use this as you please, imagine the if else scenario.

if(1)
    HELLO("world");
else
    HELLO("baby");

This will expand into;

if(1)
    do { printf("Hello "); printf( "world" "\n" ); } while(0);
else
    do { printf("Hello "); printf( "baby" "\n" ); } while(0);

So do {} while(0) does not break when you place a semicolon after it, and it also has its own scope. That makes it ideal for making sure your multiple-statement macros execute as you expect.

Well, I guess that’s all I’ve got to say.

PENSE – oPEN Simulation Environment

Now that I’ve got an IDE/SATA USB case, I started looking at my very old HDDs. I found some very old code of mine, and this is one example. PENSE (oPEN Simulation Environment) was my thesis project. It is a framework that you can use to implement simulations easily. I wrote it in C++. The only dependency is GNU’s libmatheval, used to implement algorithms out of mathematical expressions easily. I even wrote documentation in LaTeX! :)

Anyway, here’s libpense and pensedemo. Please note the autoconf masterpiece in libpense :) It was a bitch to get it working, but once it is working… well, it works. I remember compiling this code on WIN32, Mac OS X and GNU/Linux without a single problem. Yes, I was young and stupid. I developed this on GNU/Linux :)

Oh, the documentation in LaTeX, PDF and the presentation in PPT format is available. Also there’s a reference manual for libpense, I guess I just had too much time :)

A-hem, and you have to excuse any lameness you spot, since this is 4-year-old code ;)

Sample code from pensedemo:

	Environment env;
	Device::Source::PWM pwm( "pwm", &env );
	pwm.setOn( true );
	pwm.setFrequency( pwm_freq );

	Device::Source::VoltageSource vs( 0, 4.8, "voltage source", &env );
	vs.setOn( true );
	vs["output"] = 4.8;

	Device::Plant::DCMotor motor( "Maxon_118465", &env );
	motor.setLoad( "0.0" );
	motor["J_r"] = 0.0000000503;
	motor["k_n"] = 252.374609;
	motor["I_o"] = 0.029;
	motor["V"] = 0.0;
	motor["R"] = 2.16;

	Device::Controller::FuzzyLogic f( 3, "fuzzy logic controller", &env );

	f.setSetPoint( set_point );
	f.setInputDomainWidth( 5 );
	f.setOutputDomainRange( 0, 100 );

	// This is where we connect the devices together to form a feedback loop.
	// We connect the PWM controller to the Voltage Source so that PWM can turn
	// the VS on and off. Then we connect the voltage source to the DC Motor, so
	// that it can, well, run. Then we connect the angular velocity parameter of the
	// motor to the Fuzzy Logic controller, so that it can adjust the PWM controller
	// and control the speed of the motor.
	connect( &pwm, "output", &vs, "on" );
	connect( &vs, "output", &motor, "V" );
	connect( &motor,"w", &f, "input" );
	connect( &f, "output", &pwm, "duty" );

Some declaration details for pointers in C/C++

First of all, let’s remember what a pointer is. A pointer is an address pointing to an object. The term “address” is open for debate; some pedantic asses would argue against it.

So let’s start giving examples.

const char *str = "Foo";
str[0] = 'B'; // error: assignment of read-only location
str = "Bar"; // Perfectly OK

So, it is obvious that the first line declares str as a pointer to a constant object: the object is not modifiable, but the pointer itself is.

Let’s review another example.

char * const str = "Foo";
str[0] = 'B'; // Perfectly OK (See the note below)
str = "Bar"; // error: assignment of read-only variable

Here we can see that the declaration tells the compiler that the pointer value (address) is constant but the object it points to can be modified. Please note that even though the above assignment “str[0] = ‘B’” compiles just fine, it will most probably result in a segmentation fault at runtime, since modifying a string literal is undefined behavior and your compiler will most likely put the literal in a read-only memory region.

Now let’s make another example.

const char * const str = "Foo";
str[0] = 'B'; // error: assignment of read-only location
str = "Bar"; // error: assignment of read-only variable

This declaration tells the compiler that neither the object the pointer points to nor the pointer value (address) itself can be modified, i.e. nothing is modifiable :)

Conclusion

So, we can conclude that qualifiers (const, volatile) placed before the * apply to the object the pointer is pointing to, and qualifiers placed after the * apply to the pointer value itself, i.e. the address.

So we can have a weird-looking declaration like this one:

extern volatile const char * volatile const str = "Foo";

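I know it looks a bit unusual, but it is valid at file scope, although compilers such as GCC will warn about an extern declaration that also has an initializer.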

Note: For hardware programming, the volatile keyword is a must. If you don’t know what it is, read this article at wiki.answers.com first. Here’s the backup, just in case.