May 25, 2013

Should String Be An Abstract Class?

Why are HTTP headers handled as plain strings in programming?

Is there anything in software engineering that is just a string? If not, shouldn't String be an abstract class, forcing developers to subtype and at least name datatypes?

Domain-Driven Security

Former colleague Dan Bergh Johnsson, application security expert Erlend Oftedal, and I have been evangelizing the idea of Domain-Driven Security. We truly believe proper domain and data modeling will kill many of the standard security bugs such as SQL injection and cross-site scripting.

This blog post is a case for Domain-Driven Security and a case against strings.

The addHeader() Method in Java

Let's be concrete and dive directly into programming with HTTP headers.

In Java EE's interface HttpServletResponse we find the following method (ref):

void addHeader(java.lang.String name,
               java.lang.String value)

Not a heavily debated method as far as I know. On the contrary, it looks the way most such interfaces do. An implementation of the interface may look like this (ref):

public void addHeader(String name, String value) {
  if (isCommitted())
    return;

  if (included)
    return;     // Ignore any call from an included servlet

  synchronized (headers) {
    ArrayList values = (ArrayList) headers.get(name);
    if (values == null) {
      values = new ArrayList();
      headers.put(name, values);
    }
    values.add(value);
  }
}

It shows we can really set any string as an HTTP header. And that's convenient, right?

The Ubiquitous String

java.lang.String is the ubiquitous datatype that solves all our problems. It can contain anything and nothing and of course it has its sibling in any popular programming language out there. Let's have a look at what a string is.

Java uses Unicode strings in UTF-16 code units, which can represent over 100,000 characters. As far as I know C# and JavaScript do the same. The max size of strings is often limited by the max size of integers, typically 2^31 - 1, which is just over 2 billion.

So, a string …
  • is anything between 0 and 2 billion in length, 
  • can contain 100,000 different characters, and 
  • can be null.
Hardly a good spec for HTTP headers.

HTTP Headers By the Spec

RFC 2616 gives us the formal specification of how HTTP headers should look. An excerpt will suffice for our discussion.

message-header = field-name ":" [ field-value ]
field-name     = token
field-value    = *( field-content | LWS )
field-content  = <the OCTETs making up the field-value
                 and consisting of either *TEXT or
                 combinations of token, separators,
                 and quoted-string>

token          = 1*<any CHAR except CTLs or separators>

CHAR           = <any US-ASCII character (octets 0 - 127)>

CTL            = <any US-ASCII control character
                 (octets 0 - 31) and DEL (127)>

separators     = "(" | ")" | "<" | ">" | "@" |
                 "," | ";" | ":" | "\" | <"> |
                 "/" | "[" | "]" | "?" | "=" |
                 "{" | "}" | SP | HT

LWS            = [CRLF] 1*( SP | HT )

CRLF           = CR LF

OCTET          = <any 8-bit sequence of data>

TEXT           = <any OCTET except CTLs,
                 but including LWS>

Let's summarize.
  • HTTP header names can consist of ASCII chars 32-126, except the 19 chars called separators.
  • Then there shall be a colon.
  • Finally, the header value can consist of any ASCII chars 9 (HT) and 32-126 … or a mix of tokens, separators, and quoted strings.
  • On top of this, web servers impose length constraints on headers; Apache's default limit is 8,190 bytes per header field, for instance.
There's clearly a huge difference between just a string and RFC 2616.

The Dangers of Unvalidated HTTP Headers

Can this go wrong? Is there any real danger in using plain strings for setting HTTP headers? Yes. Let's look at HTTP response splitting as an example.

We have built a site where an optional URL parameter tells the server which language to use.

www.example.com/?lang=Swedish

… redirects to …

www.example.com/

… with a custom header telling the web client to use Swedish. After all, we don't want that language parameter pestering our beautiful URL for the rest of the session.

So in the redirect response we do the following:

response.addHeader("Custom-Language",
                   request.getParameter("lang"));

The result is an HTTP response like this:

HTTP/1.1 302 Moved Temporarily
Date: Wed, 24 Dec 2013 12:53:28 GMT
Location: http://www.example.com/
Set-Cookie: 
JSESSIONID=1pMRZOiOQzZiE6Y6iivsREg82pq9Bo1ape7h4YoHZ62RXj
ApqwBE!-1251019693; path=/
Custom-Language: Swedish
Connection: Close

But what if the request looks like this (%0d is carriage return, %0a is linefeed):

www.example.com/?lang=foobar%0d%0aContent-Length:%200%0d%0a%0d%0aHTTP/1.1%20200%20OK%0d%0aContent-Type:%20text/html%0d%0aContent-Length:%2019%0d%0a%0d%0a<html>Well, hello!</html>

That would generate the following HTTP response (linefeeds included):

HTTP/1.1 302 Moved Temporarily
Date: Wed, 24 Dec 2013 15:26:41 GMT
Location: http://www.example.com/
Set-Cookie: 
JSESSIONID=1pwxbgHwzeaIIFyaksxqsq92Z0VULcQUcAanfK7In7IyrCST
9UsS!-1251019693; path=/
Custom-Language: foobar
Content-Length: 0

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 19 
<html>Well, hello!</html>
Content-Type: text/html

… which will be interpreted as two responses by the web browser. This is an example of the security attack called HTTP response splitting (link to WASC, from which I've adapted my example). And that's just one of the dangers of letting users mess with headers. Setting or deleting cookies is another. In fact, the whole header section is in danger.

The HTTP splitting vulnerability has been fixed under the hood in at least Tomcat 6+, Glassfish 2.1.1+, Jetty 7+, JBoss 3.2.7+. (Thanks for that info, Jeff Williams.)

Should We Fix the addHeader() API?

Now we can ask ourselves two different things. The first is – should we fix the addHeader() and related APIs? Yes. They should look something like this:

void addHeader(javax.servlet.http.HttpHeaderName name,
               javax.servlet.http.HttpHeaderValue value)

… where the two domain classes HttpHeaderName and HttpHeaderValue accept strings to their constructors and validate that the strings adhere to the RFC 2616 specification. In one blow, all Java developers are relieved of the burden of writing that validation code themselves, and of having to remember to run it.

Should String Be An Abstract Class?

The larger question is about strings in general. Yes, they are super convenient. But we're fooling ourselves. We think the time we save by not modeling our domain, by not writing that validation code, by not narrowing down our APIs to do exactly what they're supposed to, is better spent on other activities. It's not.

I truly believe nothing is just a string. Nothing is an arbitrary sequence of 100,000 possible characters, anywhere between 0 and 2 billion in length.

Therefore String should be an abstract class, forcing us developers to subtype and think about what we're really handling.

Even better, why not have a way to declare that a class can only be used in object composition? That way programmers could choose if an "is-a" relation or a "has-a" relation is most suitable for narrowing down the String class.

May 13, 2013

Introduction to Software Security

On April 22, 2013 I successfully defended my PhD in computer science, more specifically in the area of software security [fulltext pdf]. I thought I'd share some parts of the thesis in a more digestible format and allow myself to augment our results, comment, and have opinions – things you typically don't see in academic publications.

Let's start with my introductory chapter …


The cover.

"To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem. In this sense the electronic industry has not solved a single problem, it has only created them, it has created the problem of using its products."
–Edsger W. Dijkstra, The Humble Programmer, 1972

Computer software products are among the most complex artifacts, if not the most complex artifacts, mankind has created (see Dijkstra's quote above). Securing those artifacts against intelligent attackers who try to exploit flaws in software design and construction is a great challenge too.

Our research contributes to the field of software security: software as in an artifact meant to interact with its environment, including humans; security in the sense of withstanding active intrusion attempts against benign software.

Software Vulnerabilities

Software can be intentionally malicious such as viruses (programs that replicate and spread from one computer to another and cause harm to infected ones), trojans (malicious programs that masquerade as benign) and software containing logic bombs (malicious functions set off when specified conditions are met).

However, attacks against computer systems are not limited to intentionally malicious software. Benign software can contain vulnerabilities and such vulnerabilities can be exploited to make the benign software do malicious things. A successful exploit has traditionally been the same as an intrusion. But in the era of web application vulnerabilities that term is not used as often. Nevertheless, a successful cross-site scripting attack (XSS) can be seen as executing arbitrary code inside the web application. And arbitrary code execution in a web application may very well be of high impact if the application handles sensitive information (password fields, credit card numbers etc) or is authorized to do sensitive state changes on the server (money transfers, profile updates, message posting etc). I would therefore argue that XSS is an intrusion attack.

Vulnerabilities can be responsibly reported to the public by creating a so-called CVE Identifier – a unique, common identifier for a publicly known information security vulnerability. Identifiers are created by CVE Numbering Authorities for acknowledged vulnerabilities. Larger software vendors typically handle identifiers for their own products. Some of these participating vendors are Apple, Oracle, Ubuntu Linux, Microsoft, Google, and IBM.

The National Institute of Standards and Technology (NIST) has a statistical database of reported software vulnerabilities with a publicly accessible search interface. Two types of vulnerabilities are of specific interest in the context of our research, namely buffer overflows and format string vulnerabilities in software written in the programming language C. The statistics for Buffer Errors and Format String Vulnerabilities are shown below.



Reported software vulnerabilities due to buffer errors have increased significantly since 2002. Their percentage of the total number of reported vulnerabilities has also increased, from 1-4 % between 2002 and 2006 to 10-16 % between 2008 and 2012. These statistics are in stark contrast to the statistics from CERT that Wagner et al. used to show that buffer overflows represented 50 % of all reported vulnerabilities in 1999 [pdf]. We have not investigated whether there are significant differences in how the two statistics were produced. Still, up to 16 % of all reported vulnerabilities is a significant number.

The reported format string vulnerabilities peaked between 2007 and 2009 but have never reached 0.5 % of the total. Our experience is that format string vulnerabilities are less prevalent, easier to fix, and harder to exploit than buffer overflow vulnerabilities. Nevertheless, format string vulnerabilities are still being exploited, for instance by the Corona iOS jailbreak tool.

Avoiding Software Intrusions

Intrusion attempts or attacks are made by malicious users or attackers against victims. A victim can be either a machine holding valuable assets or another human computer user. Securing software against intrusions calls for anti-intrusion techniques as defined by Halme and Bauer. We have taken the liberty of adapting and reproducing Halme and Bauer's figure showing anti-intrusion approaches, see below.


  1. Preempt – strike offensively against likely threat agents prior to an intrusion attempt. May affect innocents.
  2. Prevent – severely handicap the likelihood of a particular intrusion’s success.
  3. Deter – increase the necessary effort for an intrusion to succeed, increase the risk associated with an attempt, and/or devalue the perceived gain that would come with success.
  4. Deflect – leads an intruder to believe that he or she has succeeded in an intrusion attempt, whereas in fact the intrusion was redirected to where harm is minimized.
  5. Detect – discriminate intrusion attempts and intrusion preparation from normal activity and alert operations. Detection can also be done in a post-mortem analysis.
  6. Actively countermeasure – counter an intrusion as it is being attempted.

Avoiding the Vulnerabilities

There are many ways to achieve more secure software, i.e. to avoid having vulnerabilities. Microsoft's Security Development Lifecycle (SDL) defines seven phases where security-enhancing activities and technologies apply:

  1. Training
  2. Requirements
  3. Design
  4. Implementation
  5. Verification
  6. Release
  7. Response

Further things can be done in an even wider scope. Programming languages can be constructed with security primitives which allow programmers to express security properties of the system they are writing – so called security-typed languages, a part of language-based security [pdf]. Operating systems and deployment platforms can be hardened and secured both in construction and configuration.

Our research objectives have focused on the Requirements and Implementation phases of Microsoft's SDL and on hardening of the runtime environment for software applications. Want to know what we found out? Stay tuned for upcoming posts where we dive into the details of our studies.

Jan 6, 2013

Review of The Tangled Web by Michal Zalewski


My family and I spent Christmas and New Year abroad so I got the chance to do some reading. The Tangled Web by Michal Zalewski had been sitting on my bookshelf for far too long, so it was time to take it on.


The Best Book on Web Application Security

Let me start by saying The Tangled Web is the best text I've read on web application security. Yes, that includes blog posts, articles, and papers. This is a must-read for any technical OWASPer or the like.

The book covers a lot of ground, serves nice examples, and shows why Michal is one of the most highly regarded experts in the appsec field. Additionally, Chris Evans served as technical reviewer and people like Adam Barth and Tavis Ormandy are frequently referenced in the text, which means many of Google's best security people helped make this book what it is.

Detailed But Soon Dated

Most of The Tangled Web is spent on browser security, meaning how browsers implement web standards and de facto standards. Michal has in-depth knowledge of both the history and the current state of how browsers work and why. This means you'll get plenty of browser quirks and curious differences between browsers and browser versions, as well as up-and-coming features.

The downside of this is that the book quickly gets dated. Several of the features presented as forthcoming or WebKit-only are in fact available and in use today, a year after the book was published. As a reader you're either aware of this or will have to check what has changed.

For this book to remain the reference I'd love it to be, Michal will have to update it. And if he didn't plan for that ahead of time I'm guessing a second edition will be a major undertaking and not as rewarding for him as the first edition.

So my advice is to read it now while it's still highly relevant. Then hope for new editions with release notes on what has changed. Otherwise The Tangled Web will become a historic exposé of web security.

More For Security Pros Than For Developers

While the subtitle of the book says "A Guide To Securing Modern Web Applications", it's more geared toward security professionals and pentesters than toward developers. And in my experience the developers are the ones who need to secure the applications, whereas security professionals either review proposals or prove that the system is not secure enough.

I'm not saying The Tangled Web is a guide on how to break web apps. On the contrary, every chapter has a "Security Engineering Cheat Sheet" and the text is mostly about how to avoid pitfalls. But developers (such as myself) always look for concrete ways to test our code, and Michal neither provides guidance on how to do that nor suggests data sets for security testing. If you read between the lines you start getting ideas for good test data yourself, so the information is probably in Michal's head. I'm not looking for pen test guidance but for security unit and integration testing guidance.

To be concrete – I would have liked a description of and an online reference to a data set for unit testing my URL parser(s). Michal's coverage of problems in parsing URLs is great, so you are totally ready to put your code to the test after reading it.

From a developer's perspective The Tangled Web is more of an exposé of web security problems than a guide on how to secure your apps.

My Technical Takeaways

Here's a glimpse of what I underlined and took notes of in the book.

  • Newline handling quirks. Michal's coverage gives new life to header injection or response splitting, a subject not discussed too often these days. Especially worrying is the discrepancy between how Apache and IIS handle a lone CR and how browsers handle it.
  • Attacker-controlled 401 Unauthorized responses. Any resource hosted by the attacker (images etc) can in an instant be configured to respond with 401 and provoke a Basic Authentication dialog to appear in the browser. The victim(s) have no chance of seeing which origin is asking for credentials and thus will believe it's the main site.
  • HTML parsing behavior. Although people like Mario Heiderich and Gareth Heyes regularly treat me to new parsing flaws in browsers I really liked Michal's coverage of the subject.
  • Multiple parsing steps of inline JavaScript. User input in nested JavaScript such as event handlers or setTimeouts is almost impossible to get right in a complex application. Michal shows why with an example where the user input has to first be double encoded using JavaScript backslash sequences and then encoded again using HTML entities. In that exact order.
  • Cookie problems with SOP and country-level TLDs. Some countries require example.co.[country code] domains for businesses. Others allow example.[country code]. Japan allows both. This messes up the restrictions for which hosts cookies can be set for. They aren't allowed to be set for *.com, but what about *.com.pl? With the idea of arbitrary generic TLDs we will head back into the dark ages of cookie leakage and overwriting.
  • The X-Content-Type-Options: nosniff response header. Should be default on all your content HTTP responses unless you actually want browsers to try to sniff content type and do all kinds of dangerous interpretation of untyped responses. As of 2011 only 0.6% of top 10,000 sites use this header.

Things I Miss in the Book

While weighing in at 268 pages, there are some things I expected to see in there but didn't.

Parameter pollution

Parameter pollution is mentioned but no proper coverage is provided. Both HTTP and JSON are susceptible to multiple instances of the same parameter; HTTP parameters, HTTP cookies, and JSON padding (not JSONP) are examples. I would like a general discussion on this topic and its implications for the server side.

DOM clobbering

Again, some parts of this topic are mentioned but there is no real coverage. I would like to read Michal's take on global DOM ids and CSS classes being generated, injected, or mistakenly duplicated in mashups. He does reference Heiderich et al.'s "Web Application Obfuscation" though.

CSRF in-depth

Cross-site request forgeries are mentioned very briefly. I believe there's room for a few pages on the many nuances such as GET vs POST, Referer header reliance, blindness of the attack, multi-step CSRF, and how CORS eases the attack scenarios.

Script inclusion from outside the app

Working with CSP you quickly see that JavaScript gets included from several places outside the application. ISPs, hotels, and venues add their stuff via proxies. And browser plugins add JavaScript in the browser, which means not even HTTPS will help you avoid it. The dangers and diversity of this situation are not covered in the book.

Handling of legacy web

Many organizations are in a legacy state with ten-year-old ASP, JSP, or PHP sites. There are techniques for moving such applications forward but the book doesn't cover them at all. On the contrary, Michal specifically advises against a technique that works wonders in my experience – sandboxing same-origin legacy content in iframes to allow for CSP and a clean global footprint in new code.

Security of infrastructure and frameworks

The frontend technology stack is overflowing with frameworks and micro frameworks such as jQuery, Bootstrap, Backbone, Ext JS, and Dojo. Many of them offer flawed security controls. Even more are plagued with insecure defaults. Some are making steady progress in security whereas others bluntly ignore even reported flaws. This whole situation is very tangible on the web today but not covered in the book. A detailed guide to the top frameworks is probably out of scope, but given the prevalent use of frameworks something should be written on the topic.

Architectural considerations

The web has interesting aspects such as the typical mix of programming languages within one system (frontend and backend), the loose and untyped coupling between server and client, and the problems of mixing code, content, and style. This has bearing on security. System-wide static analysis, for one, is unheard of. Versioning and validity checksums are super hard to get right. I would love to read Michal's thoughts on where we are and where we should be headed in terms of secure architecture on the web.

The Best Part – The Epilogue

While the technical parts of this book (95 % that is) are really great, I cannot help but think Michal's epilogue was the best part. It's short and not in the least the kind of self-indulgent stuff you typically come across. Instead Michal challenges the whole security industry and academia. Are we really helping society with our paranoia and foil hats? Or are we a breed of IT pros about to go extinct? After all, no other part of mankind or society is "secure". It's all about trust and the tradeoff between development and risk.

I think Michal is on to something important and it makes me happy I decided long ago to go 70 % development and 30 % security. That's the app sec productivity sweet spot in my opinion.


Disclaimer: I got a free copy of the book from the publisher.

Nov 24, 2012

Is XSS Solved?

In academic research a problem is solved when it is fully understood and a solution is shown to work in a practical setting. If we define "XSS solved" as every instance of XSS eradicated from earth we will probably not see a solution in our lifetime. So, from a research perspective, is XSS solved already?

Geeks in a Castle

In early October I attended a weeklong seminar on web application security at Dagstuhl castle in southwest Germany. An awesome opportunity to socialize and discuss with leading experts in web appsec academia.

Group photo outside the castle (original).


"XSS Is Solved"

One of the break-out sessions was on XSS. The day before, someone had voiced the opinion that XSS is already solved. The break-out session took the claim seriously and hashed it out.

From a principled standpoint, which is the typical standpoint of academic research, a problem like XSS is solved when a) we fully understand the problem and its underpinnings, and b) we have a PoC solution that is practical enough to be rolled out and has the potential to solve the problem fully.

Do we fully understand XSS and its underpinnings?


Important Papers on XSS

Looking at recent publications we arrived at the following short list that we felt summarizes how academia understands XSS today:
  • Context-Sensitive Auto-Sanitization in Web Templating Languages Using Type Qualifiers [pdf]
  • ScriptGard: Preventing Script Injection Attacks in Legacy Web Applications with Automatic Sanitization [pdf]
  • A Symbolic Execution Framework for JavaScript [pdf]
  • Gatekeeper: Mostly Static Enforcement of Security and Reliability Policies for JavaScript Code [pdf]
  • Scriptless Attacks – Stealing the Pie Without Touching the Sill [pdf]
The conclusion was that yes, we think the understanding of XSS is fairly good. But we lack a definition of XSS that would summarize this understanding and allow new attack forms to be deemed XSS or Not XSS.


Current Definitions of XSS

Can you believe that? We still don't have a reasonable definition of XSS.

Wikipedia says "XSS enables attackers to inject client-side script into web pages viewed by other users". 

But it can easily be shot down. Do we need "web pages" to have XSS? Does an attack have to be "viewed by other users" to be XSS? More importantly the Wikipedia definition doesn't say whether the attackers' scripts have to be executed or not or in what context. With default CSP in place you can still inject the script into a page, right? With sandboxed JavaScript you can both inject and execute without causing an XSS attack. And what about these "attackers"? Can they be compromised trusted third parties, legitimate users of the system, or even clumsy business partners?

OWASP says "Cross-Site Scripting attacks are a type of injection problem, in which malicious scripts are injected into the otherwise benign and trusted web sites. Cross-site scripting (XSS) attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user."

Again "web sites" seem to be a prerequisite, but are they? Here the injected scripts have to be "malicious", but do they? And does the target web site have to be "benign and trusted"? OWASP just like Wikipedia fails to state that the injected script has to be executed. Then OWASP changes its mind and says XSS happens when an attacker "uses a web application to send malicious code". Clearly, this widens the scope beyond JavaScript. But look at that sentence and imagine Alice using gmail.com to send an email to Bob containing a malicious code sample. Alice has done XSS since she used a web application to send malicious code.

I know I'm nit-picking here. Neither Wikipedia nor OWASP have proposed an academic definition of XSS. They're trying to be pedagogical and reach out to non-appsec people.

But we still need a (more) formal definition. To be clear, we need a definition of XSS that allows us to say whether a certain vulnerability or attack is XSS or not. Without such a definition we cannot know whether countermeasures such as CSP "solve XSS" or not.

Also, Dave Wichers brought up an interesting detail at this year's OWASP AppSec Research conference in Athens. We need to redefine reflected XSS, stored XSS, and DOM-based XSS into server-side XSS (reflected and stored) and client-side XSS (reflected and stored).

Current, insufficient categorization of XSS.

Proposed new categorization of XSS.


A New Candidate Definition of XSS

To get the juices flowing at the castle we came up with a candidate definition of XSS that the rest of the participants could shoot down.

Candidate definition of XSS: An XSS attack occurs when a script from an untrusted source is executed in rendering a page.

It was shot down thoroughly, in part by yours truly :). 

Terms more or less undefined in the candidate definition:
  • Script. JavaScript, any web-enabled script language, or any character sequence that sort of executes in the browser?
  • Untrusted. What does trusting and not trusting a script mean? Who expresses this trust or distrust?
  • Source. Is it a domain, a server, a legal entity such as Google, or the attacker multiple steps away in the request chain?
  • Executed. Relates to "Script" above. Does it mean running on the JavaScript engine, invoking a browser event, invoking an HTTP request, or what?
  • Rendering. Does rendering have to happen for an attack to be categorized as XSS?
  • Page. Is a page a prerequisite for XSS? Can XSS happen without a page existing?


So Is XSS Solved?

Back to the original question. The feeling at Dagstuhl was that CSP is the mechanism we're all betting on to solve XSS. Not that it's done in version 1.0, not even 1.1. But it's a workhorse that we can use to beat XSS in the long run.

What we need right now is a satisfactory definition of XSS. That way we can find the gaps in current countermeasures (including CSP) and get to work on filling them. Don't be surprised if the gaps are fairly few and academic researchers start saying "XSS is solved" within a year. Hey, they need to work on application security problems of tomorrow, not the XSS plague in all the legacy web apps out there.

Please chip in by commenting below. If you can give a good definition of XSS, even better!

Nov 7, 2012

The Rugged Developer

I took part in the intense, weeklong Rugged Summit this spring. Rugged as in Rugged Software. Rugged Software as in secure and robust software. The major outcome of the summit and the homework afterwards was a strawman of The Rugged Handbook. It's free, available here (as docx).

My main contribution to the handbook was being lead author of The Rugged Developer chapter. I'd like to share it with you as a stand-alone blog post below. Hopefully you can give me and the other authors some feedback!

The Rugged Developer
As a Rugged Developer, I want my software to be secure against attacks, interference, corruption, random events, and more.

To achieve my goals I have come to value...

  • Software Quality over Security Products
  • Defensive Code over Patching
  • Ruggedizing Your Own Systems over Waiting To Be Hacked

Your Mission as a Rugged Developer
Good news – you're already Rugged ... in part. We developers do all sorts of things to ensure our code is robust and maintainable. To become a fully rugged developer you only need to add security to the quality goals you try to achieve.

The interesting part of security is that you're protecting your code against intelligent adversaries, not just random things. So in addition to being robust against chaotic users and errors in other systems, your code also has to withstand attacks from people who really know the nuts and bolts of your programming languages, your frameworks, and your deployment platform.

Being a Rugged developer means you have a key role in your project’s security story. The story should tell you what security defenses are available, when they are to be used, and how they are to be used.  Your job is to ensure these things happen. You should also strive to integrate security tests into your development life cycle, and even try to hack your own systems. Better you than a “security” guy or bad guy, right?

Ideas for Being an Effective Rugged Developer
Add Security Unit Tests. Perhaps you've been to security training or you've read a blog post on a new attack form. Make it a habit to add unit tests for attack input you come across. They will add to your negative testing and make your application more robust. A certain escape character, for instance ', may be usable in a nifty security exploit but may just as well produce numerous errors for benign users. A user registration with the name Olivia O'Hara should not fizzle your SQL statement execution. You as a developer have the deepest knowledge of how the system is designed and implemented, and thus you are in the best position to implement and test security.

Model Your Data Instead of Using Strings. Almost nothing is just a string. Strings are super convenient for representing input, but they are also capable of transmitting source code, SQL statements, escape characters, null values, markup etc. Write wrappers around your strings and narrow down what they can contain. Even if you don't spend time on draconian regular expressions or check for syntax and semantics, a simple input restriction to unicode letters + digits + simple punctuation may prove extremely powerful against attacks. Attackers love string input. Rugged developers deny them the pleasure.

Hack Your Own Systems. Even more fun, do it with your team. If management has a problem, tell them it's better you do it than someone on the outside.

Get Educated. There are many materials available to help you learn secure coding, including websites, commercial secure coding training, and vulnerable applications like WebGoat. Also, although top lists can be lame, the OWASP Top 10 and CWE Top 25 are great places to start. As luck would have it, most of the issues in these lists are concrete and you can take action in code today. There are a lot more good materials available at both OWASP and MITRE.  

Make Sure You Patch Your Application Frameworks and Libraries. Know which frameworks and libraries you use (Struts, Spring, .NET MVC, jQuery etc) and their versions. Make sure your regression test suite allows you to upgrade frameworks quickly. Make sure you get those patch alerts. A framework with a security bug can quickly open up several or all of your applications to attacks. All the major web frameworks have been found to have severe security bugs in the last two years, so the problem is very much a reality today.

Metrics
The Rugged Developer should evaluate success based on how well their code stands up to both internal and external stresses. How many weaknesses are discovered after the code is released for testing? How often are mistakes repeated? How long does it take to remediate a vulnerability? How many security-related test cases are associated with a project?

Mar 23, 2012

Rugged Summit Summary

I spent the last week in Washington DC as an invited expert to the Rugged Summit, part of the Rugged Software initiative.

The very minute I announced I'd be participating I got several messages on Twitter saying Rugged is a failure and I shouldn't go. Those messages were sent by people I like and trust. Sure, I was skeptical of a manifesto written for developers by security experts. Also, I hadn't heard much since the Rugged announcement in 2010.

But shouldn't I try to bring my view – a developer's view – to the table? Of course I should!

At the summit I got to work with some amazing people. Ken van Wyk, Joshua Corman, Nick Coblentz, Jeff Williams, Chris Wysopal, John Pavone, Gene Kim, Jason Li, and Justin Berman. Four very intense days. And still no silver bullet :).

Rugged Software In Short
My take on rugged is defensible software free from well-known bug types. A rugged application should be able to withstand a real attack as long as the attack doesn't exploit unknown bugs in the platform or unknown bug categories in the app. If the rugged application is breached the developers and operations should be able to recover gracefully.

Rugged also applies to operations and there's an ongoing Rugged DevOps initiative.

Why Should Organizations Become Rugged?
We first focused on *why* organizations should produce or require rugged software. Does software security also enhance software quality? Should we try to measure return on investment, reduced cost, reduced risk or what? What would make a CIO/CTO decide to go rugged?

Fundamentally we believe rugged software is part of engineering excellence. And we all need to do better. Software enhances our lives and revolutionizes almost everything mankind does. We want software to be good enough to enable further revolution.

Software security is currently in a state of vulnerability management. That's a negative approach and it hasn't made frequent breaches go away. Rugged is a more positive approach where you're not supposed to find a bunch of vulnerabilities in pentesting.

Here are three examples of motives for rugged that we worked on.

"Telling your security story" could be a competitive advantage. Look at http://www.box.com/enterprise/security-and-architecture/ and put it in the context of Dropbox's recent security failures. Imagine the whole chain of people involved in a system being built to chip in to produce evidence of why their product or service is secure.

Another idea is to define tests that prove you're actually more secure after becoming rugged than before. We believe executives feel security is a black art, and a pentest+patch doesn't show whether the organization is 90 % done or 1 % done. HDMoore's Law could be such a test (it works without Rugged too, of course). How to actually test against Metasploit will have to be figured out.

Third, if buyers of software started demanding more secure software that would drive producers to adopt something like Rugged. So we worked on a Buyers' Bill of Rights and a Buyer's Guide. Buyer empowerment if you will.

The Rugged Software Table of Contents
The rest of the summit was spent on various aspects of how we think software and security can meet successfully. Our straw man results will be published further on and there will be plenty of chances to help make it the right thing.

But the table of contents may give you an impression of where we're headed:

  1. Why Rugged?
  2. This is Rugged
    _
  3. The Rugged Tech Executive
  4. The Rugged Architect
  5. The Rugged Developer (I will write the first version of this section)
  6. The Rugged Tester
  7. The Rugged Analyst
    _
  8. Becoming a Rugged Organization
  9. Proving Ruggedosity
  10. Telling the Rugged Story (internally and publicly)
  11. How Rugged Fits in with Existing Work
  12. Success Case Study

Feb 18, 2012

We Need a Free Browser, Not Just an Open Source Browser

The security community was shocked when Trustwave came clean and revoked a subordinate root certificate it had sold to a third party which explicitly said it would use it to introspect SSL traffic.

The news of Trustwave's severe malpractice sparked demands for removing the Trustwave root certificates from the Mozilla trust stores (Firefox, Thunderbird, SeaMonkey). The demand was filed as a bug in Bugzilla and the issue has also gotten a fair amount of attention on the Mozilla-dev-security-policy mailing list.

We experienced the same process – Bugzilla + mailing list outbursts – during the recent DigiNotar and Comodo scandals.

To Kill or Not to Kill, That's the Question
According to Trustwave they had to sell the man-in-the-middle certificate since other CAs do it. That in itself is extremely worrying. These bastards have been charging us $$$ for maintaining trust on the Internet. They've not only been negligent in their security operations but have also done business selling out the trust built into all browsers.

So, should Mozilla kill the Trustwave root because of their misconduct? Tricky question.

On the one hand I feel Trustwave's CA business deserves nothing less than the ditch. They did the wrong thing with open eyes.

On the other hand we probably have a large-scale problem on our hands – CAs worldwide have been issuing subCA certs that allow employers, governments, and agencies to intercept the traffic we all thought was authenticated, encrypted, and integrity checked. Killing the Trustwave root doesn't fix that.

Think about it. The whole trust model crumbles. Can customers now claim someone else must have manipulated their buy order for the stock that later plummeted? Can payment providers who leak credit cards now claim somebody must have MitMed them? Will the increase in online shopping continue once mainstream media understands and writes about this issue?

Whichever path we take it has to lead to reestablished trust in the CA model. The alternatives such as building on DNSSEC or Moxie's excellent Convergence are nowhere near mainstream roll-out.

But you know what? Democracy and openness seem to work. Mozilla has made the right decision.

CAs world-wide have until April 27 to come clean. Mozilla says the following on its security blog:

"Earlier today we sent an email to all certificate authorities in the Mozilla root program to clarify our expectations around certificate issuance. In particular, we made it clear that the issuance of subordinate CA certificates for the purposes of SSL man-in-the-middle interception or traffic management is unacceptable. We made it clear that this practice remains unacceptable even when the intended deployment of such a certificate is restricted to a closed network.


In addition to this clarification, we have made several requests. We have requested that any such certificates be revoked, and their HSMs destroyed. We have requested the serial numbers of those certificates and fingerprints of their signing roots so that we, and other relying parties, can detect and distrust these subCA certificates if encountered. We have requested that any CAs who have issued subCA certificates fulfill these requests no later than April 27, 2012."

Where else did you see such a clear message to the CAs who have abused our trust? Mozilla makes me proud.

We Need a Free Browser, Not Only an Open Source Browser
The handling of Comodo, DigiNotar and Trustwave tells me we truly need Mozilla and Firefox. Nowhere else in the web community have I seen such openness, freedom of speech, and focus on regular users' interests. Hey, even internet trolls get their say on the mailing list :).

Sure, I love my Chrome, I know Google is a high-paying partner to Mozilla, and I know Firefox has been lagging behind in performance and developer tools. But there's something really great and important in a free alternative.

Speaking of lagging behind ... JavaScript performance and Chrome's V8 have been the industry standard for a few years. But when I run the SunSpider 0.9.1 benchmark on my new MacBook Air (lower is better) I get:

  • Chrome v17.0.963.56: 280.8 ms
  • Firefox v10.0.2: 233.0 ms

Therefore I would like to urge the Mozilla Foundation to get us tab sandboxing and silent auto-upgrades in Firefox so I can go all-in!

We need a free browser, not just an open source browser.