Apr 9, 2011

REST and Stateless Session IDs

Nowadays there's a general reluctance to introduce (more) server-side session state because of scalability. And there's specific reluctance to session state in RESTful web services, due to design principles.

In the stateless requirement of REST we read:

"The client–server communication is constrained by no client context being stored on the server between requests. Each request from any client contains all of the information necessary to service the request, and any session state is held in the client. The server can be stateful; this constraint merely requires that server-side state be addressable by URL as a resource." [Wikipedia]

This is a tough requirement, especially if we want features such as authentication and sessions.

So, can we have session ids without server-side session state? Yes.

The Relation Between Sessions and Authentication
There's often a tight relationship between authenticating users and holding their sessions. Anonymous sessions are not very sensitive, whereas authenticated sessions have to be protected against hijacking, fixation, forging, and replay. In fact, a valid session token authenticates the session, so you're basically authenticating yourself with every request. Which leads us to the first stateless session solution ...

No Sessions => Authenticate Every Request
If session ids are in fact authentication tokens, we might as well drop the mental model of sessions and authenticate each request instead. The old HTTP Basic Authentication does this: the client keeps your username and password and resends them with every subsequent request. But there are more advanced versions, such as authentication in Amazon's S3 REST API.

They use a custom HTTP scheme based on a keyed HMAC (Hash-based Message Authentication Code). To authenticate a request, you first concatenate selected elements of the request to form a string. You then use a shared "AWS Secret Access Key" to calculate the HMAC of that string, i.e. you sign the request. Finally, you add this signature to the request, in this example in the Authorization header:

GET /photos/puppy.jpg HTTP/1.1
Host: johnsmith.s3.amazonaws.com
Date: Mon, 26 Mar 2007 19:37:58 +0000
Authorization: AWS 0PN5J17HBGZHT7JJ3X82:frJIUN8DYpKDtOLCwo//yllqDzg=


Looking deeper into how this scheme works you find the following spec:


Authorization = "AWS" + " " + AWSAccessKeyId + ":" + Signature;

Signature = Base64( HMAC-SHA1( YourSecretAccessKeyID, UTF-8-Encoding-Of( StringToSign ) ) );

StringToSign = HTTP-Verb + "\n" +
        Content-MD5 + "\n" +
        Content-Type + "\n" +
        Date + "\n" +
        CanonicalizedAmzHeaders +
        CanonicalizedResource;


Canonicalization of course requires some processing such as converting headers to lower-case and sorting them lexicographically. The date works as a timestamp and narrows the replay window.

The server then does the same signing with the shared secret associated with the AWSAccessKeyId. So we're switching from server-side session state to more cycles and latency on both client and server.
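To make the signing step concrete, here is a minimal Python sketch of the spec above, using only the standard library. The function names, the sample key id, and the sample secret are my own placeholders, and the header canonicalization is simplified (lower-casing, trimming, and sorting only), so treat it as an illustration rather than a complete S3 client.

```python
import base64
import hashlib
import hmac


def build_string_to_sign(verb, content_md5, content_type, date, amz_headers, resource):
    """Concatenate the selected request elements in the order given by the spec."""
    # Simplified canonicalization: lower-case the header names, trim, sort by name.
    lowered = {name.lower().strip(): value.strip() for name, value in amz_headers.items()}
    canonical_headers = "".join(f"{name}:{lowered[name]}\n" for name in sorted(lowered))
    return f"{verb}\n{content_md5}\n{content_type}\n{date}\n{canonical_headers}{resource}"


def sign(secret_key, string_to_sign):
    """Signature = Base64(HMAC-SHA1(secret, UTF-8(StringToSign)))."""
    mac = hmac.new(secret_key.encode("utf-8"), string_to_sign.encode("utf-8"), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


def authorization_header(access_key_id, secret_key, string_to_sign):
    """Build the value of the Authorization header."""
    return f"AWS {access_key_id}:{sign(secret_key, string_to_sign)}"


def verify(secret_key, string_to_sign, presented_signature):
    """Server side: recompute the signature and compare in constant time."""
    expected = sign(secret_key, string_to_sign).encode("utf-8")
    return hmac.compare_digest(expected, presented_signature.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    to_sign = build_string_to_sign(
        "GET", "", "", "Mon, 26 Mar 2007 19:37:58 +0000", {}, "/johnsmith/photos/puppy.jpg"
    )
    print(authorization_header("EXAMPLEKEYID", "example-secret", to_sign))
```

The server looks up the secret by access key id, rebuilds the same string from the incoming request, and calls verify. Any change to the verb, date, headers, or resource produces a different signature.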

Worth noting:
  • Signed requests are much stronger than mere session ids. Cross-site request forgeries will be mitigated with this scheme.
  • By authenticating all requests with a shared secret we don't have any time-bound sessions or timeouts. Just fire whenever you want.
  • The persistent shared secret is much more sensitive than a temporary session id. A cross-site scripting attack will steal the shared secret which is much worse than session hijacking. This means the scheme is less suitable for browsing sessions and more suitable for machine-to-machine communication.

Stateless, Hashed Session ID and Salt
The server can generate session id cookies by hashing usernames and a random global salt:

sessionIdCookie_v1 = username ":" SHA256(username + global salt)

The salt is used for all sessions but only valid for a certain timeframe, say 15 minutes. A new salt is produced every 5 minutes and incoming session ids produced with the previous but still valid salt will be exchanged for a new session id with the fresh salt. That means a session timeout of 15-5=10 minutes.

If we truly want to go stateless we cannot kill such a session since that would require a server-side table of revoked session ids. So in the stateless case an attacker will have a 15 minute replay window in which he/she will refresh the session and have endless access.
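Here is a rough Python sketch of the rotating-salt scheme, assuming a single server process that keeps the recent salts in memory. The class and function names are made up for illustration, and the clock is passed in explicitly so the timing is easy to follow.

```python
import hashlib
import hmac
import secrets

SALT_ROTATE_SECONDS = 5 * 60   # a new salt is produced every 5 minutes
SALT_VALID_SECONDS = 15 * 60   # each salt is accepted for 15 minutes


def make_cookie(username, salt):
    """sessionIdCookie_v1 = username ":" SHA256(username + global salt)"""
    digest = hashlib.sha256(username.encode("utf-8") + salt).hexdigest()
    return f"{username}:{digest}"


class SaltRing:
    """Holds the current global salt plus the older salts that are still valid."""

    def __init__(self, now):
        self._salts = [(now, secrets.token_bytes(32))]  # newest first

    def _rotate(self, now):
        newest_created, _ = self._salts[0]
        if now - newest_created >= SALT_ROTATE_SECONDS:
            self._salts.insert(0, (now, secrets.token_bytes(32)))
        # Drop salts that have outlived their validity window.
        self._salts = [(t, s) for t, s in self._salts if now - t < SALT_VALID_SECONDS]

    def issue(self, username, now):
        self._rotate(now)
        return make_cookie(username, self._salts[0][1])

    def check(self, cookie, now):
        """Return (username, cookie_to_send_back), or None if the cookie is invalid."""
        self._rotate(now)
        username, _, _ = cookie.rpartition(":")
        for index, (_, salt) in enumerate(self._salts):
            candidate = make_cookie(username, salt)
            if hmac.compare_digest(candidate.encode("utf-8"), cookie.encode("utf-8")):
                # Exchange ids made with an older salt for one made with the fresh salt.
                fresh = cookie if index == 0 else make_cookie(username, self._salts[0][1])
                return username, fresh
        return None
```

Note that the only server-side state is the handful of salts, independent of the number of users. The sketch also shows the weakness described above: as long as a stolen cookie is presented before its salt expires, check hands back a fresh one.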

Stateless, Encrypted Session ID
By just storing a server-side symmetric crypto key we can effectively decrypt incoming session IDs and trust their contents. Imagine a cookie based on:

sessionIdCookie_v2 = AES_GCM(128 bit key, auth tag, username + timestamp)

This means we don't have to store each session ID. Instead we pay the price of decrypting incoming cookies and checking that the timestamp is within a timeframe, say 15 minutes. For all incoming session IDs older than 5 minutes we regenerate a new cookie to effectively run a 15-5=10 minute session timeout window.

Again, if we want to go stateless we cannot kill such a session since that would require a server-side table of revoked session ids. So an attacker will have a 15 minute replay window in which he/she will refresh the session and have endless access.
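The timestamp check and the refresh step can be sketched in a few lines of Python. To keep the example runnable with the standard library alone, I have swapped AES-GCM for an HMAC-SHA256 signature, so this cookie is authenticated but not encrypted (the username is readable by the client). The function names and the cookie layout are my own assumptions; the window logic is the same either way.

```python
import base64
import hashlib
import hmac

MAX_AGE_SECONDS = 15 * 60       # cookies older than this are rejected
REFRESH_AFTER_SECONDS = 5 * 60  # cookies older than this are reissued


def _b64(data):
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_cookie(key, username, now):
    """Cookie = base64(timestamp ":" username) "." base64(HMAC-SHA256 tag)."""
    payload = f"{int(now)}:{username}".encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(tag)


def check_cookie(key, cookie, now):
    """Return (username, cookie_to_send_back), or None if invalid or expired."""
    try:
        payload_b64, tag_b64 = cookie.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return None  # malformed cookie
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # forged or tampered
    # The payload is authenticated from here on, so parsing it is safe.
    issued_text, _, username = payload.decode("utf-8").partition(":")
    age = now - int(issued_text)
    if age < 0 or age > MAX_AGE_SECONDS:
        return None
    if age > REFRESH_AFTER_SECONDS:
        return username, issue_cookie(key, username, now)
    return username, cookie
```

The server stores nothing per session, only the key. An attacker holding a stolen cookie can keep calling in within the 15 minute window and receive a fresh cookie each time, which is exactly the replay and refresh problem described above.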

Conclusion
There are three competing parameters to prioritize between:
  • Server CPU cycles per request
  • Server-side session state
  • The replay window
The tradeoff between CPU cycles and memory footprint will change with new technologies such as non-blocking IO in node.js. So yesterday's best practice might not be valid today.

The difference between regular session hijacking and hijacking of stateless session ids is that successful theft of a stateless session id authenticates the attacker even if the victim has logged out. Remember, the server doesn't store the session state. And even if the server stored a boolean isLoggedIn for each user, an old session id would become valid again if the user logged back in, as long as it hadn't timed out.

So ask yourselves what your tradeoff between CPU cycles and server-side state is. Then consider the replay+refresh leverage of a successful cross-site scripting attack.

6 comments:

  1. AES_CBC is probably not a great recommendation; a naive implementation would be vulnerable to padding oracle attacks; possibly CBC-R exploits would even allow creating any sessionid remotely.

    Consider changing to AES_GCM, which adds an authentication tag to the crypto stream; that ought to prevent attempts to exploit a padding oracle.

  2. Thanks, PM. I changed it to AES_GCM.

  3. I like your thinking here. Thanks for writing this up!

    One perfectly legitimate way to access REST-based web services would be through a browser. I'm curious: Have you given any thought as to how one could implement stateless session ids in a browser?

  4. Any sort of stateless token can work, used as a cookie. HMACs, for example, work nicely (and don't require storing any state on the service side).

  5. Any tips on how to share the global salt between servers for creating a hashed session id? Also, should you use bcrypt instead of SHA256 for the hashing function?
