Apr 9, 2011

REST and Stateless Session IDs

Nowadays there's a general reluctance to introduce (more) server-side session state because of scalability. And there's specific reluctance to session state in RESTful web services, due to design principles.

In the stateless requirement of REST we read:

"The client–server communication is constrained by no client context being stored on the server between requests. Each request from any client contains all of the information necessary to service the request, and any session state is held in the client. The server can be stateful; this constraint merely requires that server-side state be addressable by URL as a resource." [Wikipedia]

This is a tough requirement, especially if we want features such as authentication and sessions.

So, can we have session ids without server-side session state? Yes.

The Relation Between Sessions and Authentication
There's often a tight relationship between authenticating users and holding their sessions. Anonymous sessions are not very sensitive, whereas authenticated sessions have to be protected against hijacking, fixation, forging, and replay. In fact, a valid session token authenticates the session, so you're basically authenticating yourself with every request. Which leads us to the first stateless session solution ...

No Sessions => Authenticate Every Request
If session ids are in fact authentication tokens, we might as well use the mental model of no sessions and instead authenticate each request. The old HTTP Basic Authentication does this by having the client remember your username and password and resend them with every subsequent request. But there are more advanced versions, such as authentication in Amazon's S3 REST API.

They use a custom HTTP scheme based on a keyed HMAC (Hash-based Message Authentication Code). To authenticate a request, you first concatenate selected elements of the request to form a string. You then use a shared "AWS Secret Access Key" to calculate the HMAC of that string, i.e. you sign the request. Finally, you add this signature to the request in the Authorization header.

GET /photos/puppy.jpg HTTP/1.1
Host: johnsmith.s3.amazonaws.com
Date: Mon, 26 Mar 2007 19:37:58 +0000
Authorization: AWS 0PN5J17HBGZHT7JJ3X82:frJIUN8DYpKDtOLCwo//yllqDzg=


Looking deeper into how this scheme works you find the following spec:


Authorization = "AWS" + " " + AWSAccessKeyId + ":" + Signature;


Signature = Base64( HMAC-SHA1( YourSecretAccessKeyID, UTF-8-Encoding-Of( StringToSign ) ) );


StringToSign = HTTP-Verb + "\n" +
        Content-MD5 + "\n" +
        Content-Type + "\n" +
        Date + "\n" +
        CanonicalizedAmzHeaders +
        CanonicalizedResource;


Canonicalization of course requires some processing such as converting headers to lower-case and sorting them lexicographically. The date works as a timestamp and narrows the replay window.
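To make the canonicalization step a bit more concrete, here is a minimal JavaScript sketch of the lower-casing and sorting described above. It is a simplification, not Amazon's full algorithm: it ignores folded and repeated headers, and the function name canonicalizeAmzHeaders is made up for this example.

```javascript
// Sketch only: keep the custom "x-amz-" headers, lower-case their names,
// sort them lexicographically and emit one "name:value\n" line per header.
// Assumption: no folded or repeated headers (the real rules cover those too).
function canonicalizeAmzHeaders(headers) {
  return Object.keys(headers)
    .map((name) => [name.toLowerCase(), String(headers[name]).trim()])
    .filter(([name]) => name.startsWith('x-amz-'))
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, value]) => `${name}:${value}\n`)
    .join('');
}

const canonical = canonicalizeAmzHeaders({
  'X-Amz-Meta-Author': 'alice',
  'Content-Type': 'image/jpeg',
  'X-Amz-Acl': 'public-read',
});

console.log(JSON.stringify(canonical));
// prints "x-amz-acl:public-read\nx-amz-meta-author:alice\n"
```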

The server then does the same signing with the shared secret associated with the AWSAccessKeyId. So we're switching from server-side session state to more cycles and latency on both client and server.
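The signing and verification round trip could look roughly like the Node.js sketch below. The access key id, the secret, the SECRETS lookup table, the shape of the request object and the 15 minute clock skew limit are all assumptions made for illustration, and canonicalization is assumed to have happened already.

```javascript
const crypto = require('crypto');

// Hypothetical credential store: access key id -> shared secret.
const SECRETS = new Map([['EXAMPLEKEYID', 'example-shared-secret']]);
const MAX_SKEW_MS = 15 * 60 * 1000; // assumed replay window

// Build the string to sign from selected elements of the request.
function stringToSign(req) {
  return [
    req.method,
    req.headers['content-md5'] || '',
    req.headers['content-type'] || '',
    req.headers['date'],
  ].join('\n') + '\n' + (req.canonicalizedAmzHeaders || '') + req.canonicalizedResource;
}

function sign(secret, req) {
  return crypto.createHmac('sha1', secret).update(stringToSign(req), 'utf8').digest('base64');
}

// Client side: produce the Authorization header value.
function authorizationHeader(accessKeyId, secret, req) {
  return `AWS ${accessKeyId}:${sign(secret, req)}`;
}

// Server side: look up the secret, check the date, redo the signing.
function verify(req, now = Date.now()) {
  const match = /^AWS ([^:]+):(.+)$/.exec(req.headers['authorization'] || '');
  if (!match) return false;
  const secret = SECRETS.get(match[1]);
  if (!secret) return false;
  const sentAt = Date.parse(req.headers['date']);
  if (Number.isNaN(sentAt) || Math.abs(now - sentAt) > MAX_SKEW_MS) return false;
  const expected = Buffer.from(sign(secret, req));
  const presented = Buffer.from(match[2]);
  // Constant-time comparison of the two signatures.
  return expected.length === presented.length && crypto.timingSafeEqual(expected, presented);
}

const request = {
  method: 'GET',
  headers: { date: new Date().toUTCString() },
  canonicalizedResource: '/johnsmith/photos/puppy.jpg',
};
request.headers.authorization =
  authorizationHeader('EXAMPLEKEYID', SECRETS.get('EXAMPLEKEYID'), request);

console.log(verify(request)); // true
request.canonicalizedResource = '/johnsmith/photos/kitten.jpg';
console.log(verify(request)); // false, the signature no longer matches
```

Note that the only thing the server looks up is the shared secret for the access key id. Nothing is stored per session or per request.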

Worth noting:
  • Signed requests are much stronger than mere session ids. Cross-site request forgeries will be mitigated with this scheme.
  • By authenticating all requests with a shared secret we don't have any time-bound sessions or timeouts. Just fire whenever you want.
  • The persistent shared secret is much more sensitive than a temporary session id. A cross-site scripting attack will steal the shared secret which is much worse than session hijacking. This means the scheme is less suitable for browsing sessions and more suitable for machine-to-machine communication.

Stateless, Hashed Session ID and Salt
The server can generate session id cookies by hashing usernames and a random global salt:

sessionIdCookie_v1 = username ":" SHA256(username + global salt)

The salt is shared by all sessions but is only valid for a certain timeframe, say 15 minutes. A new salt is produced every 5 minutes, and incoming session ids produced with a previous but still valid salt are exchanged for a new session id with the fresh salt. Since a session id is always issued with a salt that is at most 5 minutes old, that means an idle session timeout of at least 15-5=10 minutes.
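A minimal Node.js sketch of this rotation scheme follows, under a few assumptions of my own: salts are created lazily when a request arrives rather than by a timer, the cookie is username ":" hex digest, and issue and validate are made-up names. The list of salts is server-side state, but it stays tiny and does not grow with the number of sessions.

```javascript
const crypto = require('crypto');

const ROTATE_MS = 5 * 60 * 1000; // a new salt every 5 minutes
const VALID_MS = 15 * 60 * 1000; // each salt is accepted for 15 minutes

// Global salts, newest first.
const salts = [];

// Return the freshest salt, rotating and pruning as needed.
function currentSalt(now) {
  if (salts.length === 0 || now - salts[0].createdAt >= ROTATE_MS) {
    salts.unshift({ value: crypto.randomBytes(32), createdAt: now });
  }
  while (now - salts[salts.length - 1].createdAt >= VALID_MS) salts.pop();
  return salts[0];
}

// SHA256(username + global salt), hex encoded.
function hash(username, salt) {
  return crypto.createHash('sha256').update(username, 'utf8').update(salt).digest('hex');
}

function issue(username, now = Date.now()) {
  return `${username}:${hash(username, currentSalt(now).value)}`;
}

// Returns { username, cookie } or null. The returned cookie is a fresh one
// if the presented cookie was made with an older but still valid salt.
function validate(cookie, now = Date.now()) {
  const sep = cookie.lastIndexOf(':');
  if (sep < 0) return null;
  const username = cookie.slice(0, sep);
  const presented = Buffer.from(cookie.slice(sep + 1));
  const fresh = currentSalt(now);
  for (const salt of salts) {
    const expected = Buffer.from(hash(username, salt.value));
    if (expected.length === presented.length && crypto.timingSafeEqual(expected, presented)) {
      return { username, cookie: salt === fresh ? cookie : issue(username, now) };
    }
  }
  return null;
}

const minute = 60 * 1000;
const t0 = Date.now();
const cookie = issue('alice', t0);
console.log(validate(cookie, t0 + 4 * minute).cookie === cookie); // true, salt still fresh
console.log(validate(cookie, t0 + 9 * minute).cookie !== cookie); // true, exchanged for a new id
console.log(validate(cookie, t0 + 16 * minute));                  // null, the salt has expired
```

If I were building this for real I would probably use an HMAC keyed with the salt instead of a plain hash of the concatenation, but the sketch sticks to the formula above.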

If we truly want to go stateless we cannot kill such a session, since that would require a server-side table of revoked session ids. So in the stateless case an attacker who steals a session id has a 15-minute replay window in which they can refresh the session and thereby gain endless access.

Stateless, Encrypted Session ID
By just storing a server-side symmetric crypto key we can effectively decrypt incoming session IDs and trust their contents. Imagine a cookie based on:

sessionIdCookie_v2 = nonce + AES_GCM(128 bit key, nonce, username + timestamp) + auth tag

This means we don't have to store each session ID. Instead we pay the price of decrypting incoming cookies and checking that the timestamp is within a timeframe, say 15 minutes. For every incoming session ID older than 5 minutes we issue a fresh cookie, which effectively gives us a 15-5=10 minute session timeout window.
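Here is a rough Node.js sketch of such an encrypted cookie. The cookie layout (random nonce, then the 16-byte GCM auth tag, then the ciphertext, all base64 encoded), the JSON payload and the names seal and open are my own choices for the example. The key lives only in memory here, so a real deployment would need to share it between servers.

```javascript
const crypto = require('crypto');

const KEY = crypto.randomBytes(16);      // the 128 bit server-side key
const MAX_AGE_MS = 15 * 60 * 1000;       // accept cookies up to 15 minutes old
const REFRESH_AFTER_MS = 5 * 60 * 1000;  // reissue cookies older than 5 minutes

// Encrypt username + timestamp into a cookie value.
function seal(username, now = Date.now()) {
  const nonce = crypto.randomBytes(12); // must be unique per cookie
  const cipher = crypto.createCipheriv('aes-128-gcm', KEY, nonce);
  const plaintext = JSON.stringify({ username, issuedAt: now });
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([nonce, cipher.getAuthTag(), ciphertext]).toString('base64');
}

// Decrypt and check a cookie. Returns { username, cookie } or null.
function open(cookie, now = Date.now()) {
  try {
    const raw = Buffer.from(cookie, 'base64');
    const decipher = crypto.createDecipheriv('aes-128-gcm', KEY, raw.subarray(0, 12));
    decipher.setAuthTag(raw.subarray(12, 28));
    const plaintext = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
    const { username, issuedAt } = JSON.parse(plaintext.toString('utf8'));
    const age = now - issuedAt;
    if (age < 0 || age > MAX_AGE_MS) return null;
    return { username, cookie: age > REFRESH_AFTER_MS ? seal(username, now) : cookie };
  } catch (err) {
    return null; // tampered, malformed, or encrypted with another key
  }
}

const minute = 60 * 1000;
const t0 = Date.now();
const cookie = seal('alice', t0);
console.log(open(cookie, t0 + 1 * minute).username);          // alice
console.log(open(cookie, t0 + 6 * minute).cookie !== cookie); // true, a fresh cookie was issued
console.log(open(cookie, t0 + 16 * minute));                  // null, too old
```

The auth tag is what lets the server trust the decrypted contents: if anyone changes a single bit of the cookie, decipher.final() throws and the cookie is rejected.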

Again, if we want to go stateless we cannot kill such a session, since that would require a server-side table of revoked session ids. So an attacker who steals a session id has a 15-minute replay window in which they can refresh the session and thereby gain endless access.

Conclusion
There are three competing parameters to prioritize among:
  • Server CPU cycles per request
  • Server-side session state
  • The replay window
The tradeoff between CPU cycles and memory footprint will change with new technologies such as non-blocking IO in node.js. So yesterday's best practice might not be valid today.

The difference between regular session hijacking and hijacking of stateless session ids is that a stolen stateless session id authenticates the attacker even after the victim has logged out. Remember, the server doesn't store the session state. And even if the server stored a boolean isLoggedIn for each user, an old session id would become valid again when the user logs back in, as long as it hasn't timed out.

So ask yourselves what your tradeoff between CPU cycles and server-side state is. Then consider the replay+refresh leverage of a successful cross-site scripting attack.

Apr 8, 2011

Friday JavaScript & Web Dev Links

I'm summing up some reading tips for JavaScript and web development. Just thought you'd like 'em.

JavaScript

Command-Line JavaScript on Rhino
So you want to write command-line JavaScript on Rhino? Here's how you do it on Mac OS:
  1. Download Rhino 1.7R2: http://www.mozilla.org/rhino/download.html
  2. Unzip Rhino into, for instance, /Applications/Utilities/Java
  3. Download JLine: http://jline.sourceforge.net/
  4. Unzip JLine into, for instance, /Applications/Utilities/Java
  5. Move jline-0.9.94.jar to /Library/Java/Extensions
  6. In a shell: cd /Applications/Utilities/Java/rhino_1_7R2
  7. In the very same shell: java org.mozilla.javascript.tools.shell.Main
Code away!

Building Large-Scale jQuery Applications
A good read on RIA architecture and links to lib and framework choices, not only for jQuery junkies:

JavaScript Primitive Types Becoming Objects
About JavaScript's primitive types and how they become objects when their properties are used:

Scoping and Hoisting in JavaScript
If you haven't looked into scoping and variable assignments in JavaScript, read this and improve your programs:

'String'.replace() Only Replaces First Instance
String.prototype.replace, i.e. 'yourString'.replace(), only replaces the first match unless you pass a regexp with the global (g) flag. So beware. Twitter made the mistake and became vulnerable because of it. Read about it and a suggested patch:
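As a quick illustration of the pitfall itself (this is my own toy example, not the patch from the linked post), compare a string pattern with a global regexp:

```javascript
const input = '<b>one</b> <b>two</b>';

// A string pattern replaces only the first occurrence.
const first = input.replace('<', '&lt;');

// A regexp with the global (g) flag replaces every occurrence.
const all = input.replace(/</g, '&lt;');

console.log(first); // &lt;b>one</b> <b>two</b>
console.log(all);   // &lt;b>one&lt;/b> &lt;b>two&lt;/b>
```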

Non-Blocking JavaScript Loading (and more) With head.js
With Head JS your scripts load like images - completely separated from page rendering, and in parallel!

Web Development

RESTful Design, Patterns and Anti-Patterns
A nice webcast on REST design. For instance, it brings up the idea of session ids with constant state on the server. But as always, I wonder when the CSRF storm is going to hit all these REST services out there.

Chrome Web Dev Extensions
Google Chrome is becoming many web developers' favorite browser. The bundled developer tools are good. But check out the extensions too, for instance the CSS reloader:

iframe Loading Techniques and How They Affect Performance
Want your iframes to stop blocking and allow onLoad to fire earlier? Check these techniques out:
http://www.aaronpeters.nl/blog/iframe-loading-techniques-performance


Did I miss a good resource or read? Just fire away below.