In academic research, a problem is solved when it is fully understood and a solution is shown to work in a practical setting. If we define "XSS solved" as every instance of XSS eradicated from earth, we will probably not see a solution in our lifetime. So, from a research perspective, is XSS solved already?
Geeks in a Castle
In early October I attended the week-long seminar on web application security at Dagstuhl castle in southwest Germany. It was an awesome opportunity to socialize and discuss with leading experts in web appsec academia.
Group photo outside the castle.
"XSS Is Solved"
One of the break-out sessions was on XSS. The day before, someone had voiced the opinion that XSS is already solved. The break-out session took the claim seriously and hashed it out.
From a principled standpoint, which is the typical standpoint of academic research, a problem like XSS is solved when a) we fully understand the problem and its underpinnings, and b) we have a PoC solution that is practical enough to be rolled out and has the potential to solve the problem fully.
Do we fully understand XSS and its underpinnings?
Important Papers on XSS
Looking at recent publications we arrived at the following short list that we felt summarizes how academia understands XSS today:
- Context-Sensitive Auto-Sanitization in Web Templating Languages Using Type Qualifiers [pdf]
- ScriptGard: Preventing Script Injection Attacks in Legacy Web Applications with Automatic Sanitization [pdf]
- A Symbolic Execution Framework for JavaScript [pdf]
- Gatekeeper: Mostly Static Enforcement of Security and Reliability Policies for JavaScript Code [pdf]
- Scriptless Attacks – Stealing the Pie Without Touching the Sill [pdf]
The conclusion was that yes, we think the understanding of XSS is fairly good. But we lack a definition of XSS that would summarize this understanding and let us classify new attack forms as XSS or not XSS.
Current Definitions of XSS
Can you believe that? We still don't have a reasonable definition of XSS.
Wikipedia says "XSS enables attackers to inject client-side script into web pages viewed by other users". 
But that definition is easily shot down. Do we need "web pages" to have XSS? Does an attack have to be "viewed by other users" to be XSS? More importantly, the Wikipedia definition doesn't say whether the attackers' scripts have to be executed, or in what context. With default CSP in place you can still inject the script into a page, right? With sandboxed JavaScript you can both inject and execute without causing an XSS attack. And what about these "attackers"? Can they be compromised trusted third parties, legitimate users of the system, or even clumsy business partners?
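To make the inject-versus-execute distinction concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `render_greeting` is a hypothetical, deliberately vulnerable template, and `inline_script_allowed` is a heavily simplified reading of CSP that only looks at `script-src` (falling back to `default-src`) and the `'unsafe-inline'` keyword. It ignores nonces, hashes, and everything else a real browser evaluates.

```python
from typing import Dict, List, Optional


def effective_script_sources(policy: str) -> Optional[List[str]]:
    """Return the source list that governs scripts, or None if scripts are unrestricted."""
    directives: Dict[str, List[str]] = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            # Keep the first occurrence of a directive and ignore later duplicates.
            directives.setdefault(tokens[0].lower(), tokens[1:])
    if "script-src" in directives:
        return directives["script-src"]
    # script-src falls back to default-src when it is absent.
    return directives.get("default-src")


def inline_script_allowed(policy: str) -> bool:
    """Simplified check (assumption): inline scripts run only with 'unsafe-inline'."""
    sources = effective_script_sources(policy)
    if sources is None:
        return True  # the policy says nothing about scripts
    return "'unsafe-inline'" in sources


def render_greeting(name: str) -> str:
    """Hypothetical, deliberately vulnerable template: no output encoding."""
    return f"<html><body><p>Hello {name}</p></body></html>"


payload = "<script>alert(1)</script>"
page = render_greeting(payload)
policy = "default-src 'self'"

injected = payload in page                # the script made it into the page
executes = inline_script_allowed(policy)  # but would the browser run it?
print(f"injected={injected} executes={executes}")
```

Under these assumptions the script is injected but not executed, which is exactly the case the Wikipedia wording cannot classify.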
OWASP says "Cross-Site Scripting attacks are a type of injection problem, in which malicious scripts are injected into the otherwise benign and trusted web sites. Cross-site scripting (XSS) attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user."
Again "web sites" seem to be a prerequisite, but are they? Here the injected scripts have to be "malicious", but do they? And does the target web site have to be "benign and trusted"? OWASP, just like Wikipedia, fails to state that the injected script has to be executed. Then OWASP changes its mind and says XSS happens when an attacker "uses a web application to send malicious code". Clearly, this widens the scope beyond JavaScript. But look at that sentence and imagine Alice using gmail.com to send Bob an email containing a malicious code sample. By that wording, Alice has done XSS, since she used a web application to send malicious code.
I know I'm nit-picking here. Neither Wikipedia nor OWASP has proposed an academic definition of XSS. They're trying to be pedagogical and reach out to non-appsec people.
But we still need a (more) formal definition. To be clear, we need a definition of XSS that allows us to say whether a certain vulnerability or attack is XSS or not. Without such a definition we cannot know whether countermeasures such as CSP "solve XSS" or not.
Also, Dave Wichers brought up an interesting detail at this year's OWASP AppSec Research conference in Athens. We need to replace the current categories (reflected XSS, stored XSS, and DOM-based XSS) with a two-by-two split: server-side XSS, reflected or stored, and client-side XSS, reflected or stored.
Current, insufficient categorization of XSS.
Proposed new categorization of XSS.
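The proposed categorization can be read as two independent axes. The sketch below models that reading in Python. The type names (`Location`, `Persistence`, `XssCategory`) and the `LEGACY_TO_PROPOSED` mapping are my own assumptions for illustration, not part of the proposal itself; in particular, mapping DOM-based XSS to both client-side cells is just one plausible interpretation.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    SERVER = "server-side"
    CLIENT = "client-side"


class Persistence(Enum):
    REFLECTED = "reflected"
    STORED = "stored"


@dataclass(frozen=True)
class XssCategory:
    location: Location
    persistence: Persistence

    def label(self) -> str:
        return f"{self.location.value} {self.persistence.value} XSS"


# The four cells of the proposed two-by-two categorization.
ALL_CATEGORIES = [XssCategory(loc, per) for loc in Location for per in Persistence]

# Assumed mapping from the old names to the new cells (illustrative only).
LEGACY_TO_PROPOSED = {
    "reflected XSS": [XssCategory(Location.SERVER, Persistence.REFLECTED)],
    "stored XSS": [XssCategory(Location.SERVER, Persistence.STORED)],
    "DOM-based XSS": [
        XssCategory(Location.CLIENT, Persistence.REFLECTED),
        XssCategory(Location.CLIENT, Persistence.STORED),
    ],
}

for category in ALL_CATEGORIES:
    print(category.label())
```

Seen this way, the old three-way split is insufficient because it mixes the two axes: "reflected" and "stored" describe persistence, while "DOM-based" describes location.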
A New Candidate Definition of XSS
To get the juices flowing at the castle we came up with a candidate definition of XSS that the rest of the participants could shoot down.
Candidate definition of XSS: An XSS attack occurs when a script from an untrusted source is executed in rendering a page.
It was shot down thoroughly, in part by yours truly :). 
Terms more or less undefined in the candidate definition:
- Script. JavaScript, any web-enabled script language, or any character sequence that sort of executes in the browser?
- Untrusted. What does trusting and not trusting a script mean? Who expresses this trust or distrust?
- Source. Is it a domain, a server, a legal entity such as Google, or the attacker multiple steps away in the request chain?
- Executed. Relates to "Script" above. Does it mean running in the JavaScript engine, invoking a browser event, issuing an HTTP request, or what?
- Rendering. Does rendering have to happen for an attack to be categorized as XSS?
- Page. Is a page a prerequisite for XSS? Can XSS happen without a page existing?
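One way to see why the undefined terms matter is to write the candidate definition as a predicate and leave "trust" as a parameter. This is only a sketch: `ScriptEvent`, `is_xss`, the two trust policies, and the example origins are all hypothetical names invented here, and the fields simply assume that "executed" and "rendering" can be observed as booleans.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ScriptEvent:
    source: str             # whatever "source" ends up meaning; here an origin string
    executed: bool          # did the script actually run?
    during_rendering: bool  # did it happen while rendering a page?


def is_xss(event: ScriptEvent, is_trusted: Callable[[str], bool]) -> bool:
    """Candidate definition: an untrusted script is executed in rendering a page."""
    return event.executed and event.during_rendering and not is_trusted(event.source)


def strict_trust(source: str) -> bool:
    # Hypothetical policy: only the site's own origin is trusted.
    return source == "https://shop.example"


def lenient_trust(source: str) -> bool:
    # Hypothetical policy: the site also trusts its CDN.
    return source in {"https://shop.example", "https://cdn.example"}


cdn_widget = ScriptEvent("https://cdn.example", executed=True, during_rendering=True)
blocked_inline = ScriptEvent("https://evil.example", executed=False, during_rendering=True)

print(is_xss(cdn_widget, strict_trust))      # True
print(is_xss(cdn_widget, lenient_trust))     # False
print(is_xss(blocked_inline, strict_trust))  # False
```

The same event flips between XSS and not XSS depending on who gets to express trust, and an injected script that never executes is not XSS at all under this wording. The definition is only as precise as its terms.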
So Is XSS Solved?
Back to the original question. The feeling at Dagstuhl was that CSP is the mechanism we're all betting on to solve XSS. Not that it's done in version 1.0, or even 1.1. But it's a workhorse that we can use to beat XSS in the long run.
What we need right now is a satisfactory definition of XSS. That way we can find the gaps in current countermeasures (including CSP) and get to work on filling them. Don't be surprised if the gaps are fairly few and academic researchers start saying "XSS is solved" within a year. Hey, they need to work on the application security problems of tomorrow, not the XSS plague in all the legacy web apps out there.
Please chip in by commenting below. If you can give a good definition of XSS, even better!



 




