Philippe De Ryck

Hi, I'm Philippe, and I help developers protect companies through better web security. As the founder of Pragmatic Web Security, I travel the world to teach practitioners the ins and outs of building secure software.


Are you causing XSS vulnerabilities with JSON.stringify()?

JSON.stringify() is perhaps one of the most mundane APIs in modern browsers. The functionality to translate a JavaScript object into a string-based representation is hardly thrilling. But when the stars align, a simple JSON serialization operation can result in a significant XSS vulnerability. This article tells the nail-biting story of such a vulnerability in Scully, an Angular-based static site generator.

26 January 2021 SPA Security XSS, Angular, Scully, Single Page Applications

Server-side rendering is gaining a lot more traction, making it a part of my training curriculum. In my React Security classes, I spend an entire module talking about the dangers of XSS in server-side rendered React. Recently, I needed to dig a bit deeper into server-side rendering in Angular applications, so I reached out to Sam Vloeberghs. Sam has been working with both Angular Universal and Scully for a while, so I was sure I could learn a thing or two from him.

While digging into concrete scenarios, we accidentally discovered an XSS vulnerability in Scully. The vulnerability is caused by using the JSON.stringify() API in the wrong circumstances. Scully was not the first framework to make this mistake, which is why I knew to look for this exact pattern.

But we’re getting ahead of ourselves here. Let’s take a step back and dig a bit deeper into the problem. Keep reading to learn about …

  • Typical XSS attack vectors in server-side rendering
  • The vulnerability caused by using JSON.stringify()
  • The specifics of the vulnerability in Scully
  • A way to avoid XSS through JSON.stringify()
  • The process I followed to report a vulnerability in an npm package

XSS in server-side rendering

If you have been following my security content for a while, you’re probably well aware that Angular offers secure defaults to render user-provided data into HTML pages. Other frameworks, such as React and Vue, offer a secure starting point, but beyond that, the developer has to be aware of several secure coding guidelines to avoid XSS. We discuss these guidelines specifically for React in this three-part blog post series.

With server-side rendering, the same security advice holds. Since the JS framework puts data into the pre-rendered pages, the data will either be encoded or sanitized, effectively mitigating XSS vulnerabilities. In a nutshell, following secure coding guidelines for your framework also helps avoid XSS vulnerabilities in server-side rendered pages.

So where’s the problem then?

The vulnerability we will discuss below appears as a side-effect of server-side rendering. When the browser renders the pre-rendered HTML, the data is visible, but interactive components are not yet loaded. To make the page interactive, a server-side rendered page also contains code to rehydrate components. This process happens in the background, while the user is viewing the page’s content. By the time the user is ready to interact, the full application will be loaded in the browser, giving the user a snappy experience.

Part of the rehydration process is loading state into client-side components. This state is embedded in the page by the server-side rendering process. And this is where something interesting happens …

The vulnerability

A couple of years ago, the React world got shaken up by a nasty XSS vulnerability. This vulnerability was present in an official code example included in the Redux documentation pages. As a result, many applications contain this exact vulnerability.

The code snippet below contains the vulnerable code. Can you spot the problem?

The vulnerable code snippet from the Redux documentation


function renderFullPage(html, preloadedState) {
  return `
    <!doctype html>
    <html>
      <head>
        <title>Redux Universal Example</title>
      </head>
      <body>
        <div id="root">${html}</div>
        <script>
          window.__PRELOADED_STATE__ = ${JSON.stringify(preloadedState)}
        </script>
        <script src="/static/bundle.js"></script>
      </body>
    </html>
    `
}

Don’t worry if you can’t. This code looks perfectly fine, especially to our extraordinary human minds. Let’s take a closer look at how this code can cause problems.

The example below shows the same code, but populated with actual data from the database. As you can see, the data includes the different fields of a restaurant review.

The vulnerable code snippet containing actual state, as sent to the browser


function renderFullPage(html, preloadedState) {
  return `
    <!doctype html>
    <html>
      <head>
        <title>Redux Universal Example</title>
      </head>
      <body>
        <div id="root">${html}</div>
        <script>
          window.__PRELOADED_STATE__ = 
            {"title":"The best meal ever!","content":"This place is amazing! ...","restaurantId":1,"id":1}
        </script>
        <script src="/static/bundle.js"></script>
      </body>
    </html>
    `
}

Again, this example is perfectly fine. There is nothing wrong here. But what if the attacker provides a malicious review? A sneaky attacker may include HTML in their review, which results in the following code being generated.

The vulnerable code snippet containing attacker-provided state, as sent to the browser


function renderFullPage(html, preloadedState) {
  return `
    <!doctype html>
    <html>
      <head>
        <title>Redux Universal Example</title>
      </head>
      <body>
        <div id="root">${html}</div>
        <script>
          window.__PRELOADED_STATE__ = 
            {"title":"oh shit!","content":"</script><script>alert('gotcha!')</script>","restaurantId":1,"id":1}
        </script>
        <script src="/static/bundle.js"></script>
      </body>
    </html>
    `
}

At first sight, there is still nothing wrong. But that’s because your brain uses implicit context information to process this code snippet (and the lack of code highlighting does not help either).

In reality, we don’t parse HTML, but the browser does. The browser is responsible for parsing and rendering pages into nice visualizations. One of the first things the browser does is to run an HTML parser on the page’s contents. This parser looks for HTML elements being opened and matches the opening tags to closing tags. When the parser encounters the <script> tag on line 10 of the snippet above, it starts looking for the closing tag. In your mind, you recognized the </script> tag on line 13 as the intended closing tag. However, the browser does not know this. Instead, it looks for the first occurrence of </script>, which is on line 12, inside the review’s content. The code example below only shows the HTML, using proper HTML syntax highlighting. This should illustrate the problem a lot more clearly.

The HTML as generated by the server-side rendering process


<!doctype html>
<html>
    <head>
    <title>Redux Universal Example</title>
    </head>
    <body>
    <div id="root">${html}</div>
    <script>
        window.__PRELOADED_STATE__ = 
        {"title":"oh shit!","content":"</script><script>alert('gotcha!')</script>","restaurantId":1,"id":1}
    </script>
    <script src="/static/bundle.js"></script>
    </body>
</html>

If we analyze this code block, we can see that the browser picks up two different script blocks, as illustrated below.

The first code block that the browser finds in the page


<script>
    window.__PRELOADED_STATE__ = 
    {"title":"oh shit!","content":"</script>

The second code block that the browser finds in the page, which contains attacker-provided code


<script>alert('gotcha!')</script>

To a human eye, this process makes no sense, but the browser literally does not know any better. The HTML parser first identifies JavaScript code blocks, which are then processed by the JavaScript engine. Because of these different contexts, the browser sees an HTML element in a JavaScript string as HTML code, which results in an XSS vulnerability.

This vulnerability is the same vulnerability that existed in the Redux code example. The cause of this vulnerability is parser confusion in the browser, caused by having HTML elements in a JavaScript string. By itself, the use of JSON.stringify() is fine. Using the output of JSON.stringify() in a JavaScript context will result in the expected behavior. However, using the output of JSON.stringify() in an HTML-based environment will cause problems.
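
The parser confusion can be reproduced without a browser. The sketch below is a deliberately naive simulation, not a real HTML parser: it serializes a malicious review with JSON.stringify(), wraps the result in a script block, and then cuts the block at the first </script> it finds. The names review, page, and firstScriptBody are made up for this illustration.

```javascript
// Hypothetical review data containing an attacker-controlled payload.
const review = {
  title: "oh no!",
  content: "</script><script>alert('gotcha!')</script>",
};

// JSON.stringify() escapes quotes and backslashes, but leaves "<", ">" and "/" alone.
const serialized = JSON.stringify(review);

// Embed the serialized state in a script block, like the vulnerable example does.
const page = `<script>window.__PRELOADED_STATE__ = ${serialized}</script>`;

// Naive stand-in for the HTML parser: the script body ends at the FIRST "</script>".
const start = page.indexOf("<script>") + "<script>".length;
const end = page.indexOf("</script>", start);
const firstScriptBody = page.slice(start, end);

console.log(firstScriptBody);
// → window.__PRELOADED_STATE__ = {"title":"oh no!","content":"
```

The first script body stops in the middle of the JSON string, which is exactly the broken first block shown above. Everything after it is handed back to the HTML parser, including the attacker's script element.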

A closer look at Scully

Scully is an Angular-based static site generator. During its build process, Scully will render all the pages in your application as full HTML pages. These pages contain all data to be rendered statically. They also contain the Scully code, which will bootstrap the Angular application after rendering the page.

One of the strengths of Scully is the ability to run the actual Angular application during the server-side rendering process. When the application fetches data, such as a list of reviews, the rendered pages will contain this data. While this process sounds worrying, it is quite safe. The data is put into the page by Angular, which has built-in strict contextual escaping and HTML sanitization to avoid XSS vulnerabilities.

So far, so good. But Scully does more than that. It also includes the data used to render components in the page. The client-side code uses this data to rehydrate the application components after the page has been rendered. You can see an example of such included data in the code snippet below.

The script block containing state, added by Scully at build time


<script id="ScullyIO-transfer-state">
  window['ScullyIO-transfer-state']=
  /** ___SCULLY_STATE_START___ */
    {"title":"The best meal ever!","content":"This place is amazing! ...","restaurantId":1,"id":1}
  /** ___SCULLY_STATE_END___ */
</script>

This code example should look familiar. The data is serialized as JSON and placed in a script block. Interesting, right? Whenever you see a pattern like this, it’s worth taking a look at the source code. In this case, the source code shows the use of JSON.stringify(), just like we discussed before.

Verifying this vulnerability is quite straightforward. Let’s manipulate one of the reviews to include a payload and see how the browser responds. By adding </script><script>alert(...)</script> to a review, the script block generated by Scully now looks like this.

The script block containing state after the vulnerability has been exploited


<script id="ScullyIO-transfer-state">
  window['ScullyIO-transfer-state']=
  /** ___SCULLY_STATE_START___ */
    {"title":"oh shit!","content":"</script><script>alert('gotcha!')</script>","restaurantId":1,"id":1}
  /** ___SCULLY_STATE_END___ */
</script>

Rendering the page in the browser will trigger the execution of the JavaScript code.

This example uncovers an XSS vulnerability in Scully, triggered by malicious data used in the build process. While this vulnerability is dangerous, its level of exploitability is relatively low. Many Scully projects only use trusted data in the rendering process, instead of untrusted data coming from users.

Fixing the problem

Fortunately, the fix for this problem is not too complicated. If we can avoid the browser getting confused about HTML tags in JSON data, the entire problem goes away. The easiest way to achieve that is by encoding the characters used to define HTML elements. The specific type of encoding is less relevant, as long as the browser does not see the data as code anymore.

How do you do this encoding in practice? Well, there are three practical examples we can investigate:

  1. The fix used in the Redux example
  2. The code used by Angular Universal
  3. The fix deployed by Scully

Let’s look at each in a bit more detail.

The fix used in the Redux example

The code example below shows the corrected code example for loading state in a React application.

The corrected code snippet from the Redux documentation


function renderFullPage(html, preloadedState) {
  return `
    <!doctype html>
    <html>
      <head>
        <title>Redux Universal Example</title>
      </head>
      <body>
        <div id="root">${html}</div>
        <script>
          // WARNING: See the following for security issues around embedding JSON in HTML:
          // https://redux.js.org/recipes/server-rendering/#security-considerations
          window.__PRELOADED_STATE__ = 
            ${JSON.stringify(preloadedState).replace(/</g, '\\u003c')}
        </script>
        <script src="/static/bundle.js"></script>
      </body>
    </html>
    `
}
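
To see why this replacement is safe, it helps to check two things: the output no longer contains a < character, and the data survives a round trip unchanged. The sketch below does that for a sample payload. The helper name serializeForScriptTag is hypothetical and only wraps the expression from the example above.

```javascript
// Hypothetical helper wrapping the replace() call from the Redux example.
function serializeForScriptTag(state) {
  // In the output, every "<" becomes the six characters \u003c,
  // which JavaScript and JSON both read back as "<" inside a string.
  return JSON.stringify(state).replace(/</g, "\\u003c");
}

const state = { content: "</script><script>alert('gotcha!')</script>" };
const safe = serializeForScriptTag(state);

console.log(safe.includes("<")); // → false
console.log(JSON.parse(safe).content === state.content); // → true
```

Because < can only appear inside string values in JSON output, replacing it with a Unicode escape changes how the text looks to the HTML parser without changing the data the JavaScript engine sees.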

Alternatively, as discussed in the original post about this vulnerability, the state can be serialized using the Serialize JavaScript library to transform a JavaScript object into a JSON data structure. The result of running this function is JSON output with properly encoded HTML characters.

An alternative solution using the Serialize JavaScript library


var serialize = require("serialize-javascript");

function renderFullPage(html, preloadedState) {
  return `
    <!doctype html>
    <html>
      <head>
        <title>Redux Universal Example</title>
      </head>
      <body>
        <div id="root">${html}</div>
        <script>
          // WARNING: See the following for security issues around embedding JSON in HTML:
          // https://redux.js.org/recipes/server-rendering/#security-considerations
          window.__PRELOADED_STATE__ = ${serialize(preloadedState, { isJSON: true })}
        </script>
        <script src="/static/bundle.js"></script>
      </body>
    </html>
    `
}

The code used by Angular Universal

While researching how Angular Universal handles transferring state, Sam and I discovered that Angular Universal already encodes the data before putting it into the page. You can find the exact code from Angular Universal in the code snippet below. The full code is available in this location on GitHub.

A selection of code from Angular Universal to correctly serialize JSON data


// https://github.com/angular/angular/blob/362f45c4bf1bb49a90b014d2053f4c4474d132c0/packages/platform-server/src/transfer_state.ts#L21
script.textContent = escapeHtml(transferStore.toJson());

// https://github.com/angular/angular/blob/6dc43a475bf72a81ddc7d226f722488b04ed2582/packages/platform-browser/src/browser/transfer_state.ts#L12
export function escapeHtml(text: string): string {
  const escapedText: {[k: string]: string} = {
    '&': '&a;',
    '"': '&q;',
    '\'': '&s;',
    '<': '&l;',
    '>': '&g;',
  };
  return text.replace(/[&"'<>]/g, s => escapedText[s]);
}

As you can see in the code snippet, Angular Universal first serializes the data into JSON and then encodes dangerous characters. This approach prevents the browser from becoming confused, so it effectively neutralizes the vulnerability.
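
One consequence of this encoding is that the embedded text is no longer valid JSON, so the client has to reverse it before parsing. The sketch below illustrates that round trip in plain JavaScript. The escape table mirrors the snippet above, while unescapeHtml as written here is an illustrative inverse and not a copy of Angular's own code.

```javascript
// Same mapping as the escapeHtml() snippet above, written as plain JavaScript.
const ESCAPES = { "&": "&a;", '"': "&q;", "'": "&s;", "<": "&l;", ">": "&g;" };

function escapeHtml(text) {
  return text.replace(/[&"'<>]/g, (ch) => ESCAPES[ch]);
}

// Illustrative inverse, built by flipping the table above (an assumption).
const UNESCAPES = Object.fromEntries(
  Object.entries(ESCAPES).map(([raw, escaped]) => [escaped, raw])
);

function unescapeHtml(text) {
  return text.replace(/&[aqslg];/g, (seq) => UNESCAPES[seq]);
}

const state = { content: "</script><script>alert('gotcha!')</script>" };
const embedded = escapeHtml(JSON.stringify(state));
const restored = JSON.parse(unescapeHtml(embedded));

console.log(embedded.includes("<")); // → false
console.log(restored.content === state.content); // → true
```

Encoding & itself is what keeps the round trip unambiguous: a literal "&l;" in the data becomes "&a;l;" and decodes back to "&l;" rather than to "<".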

The fix deployed by Scully

Version 1.0.10 of Scully also neutralizes this problem by encoding dangerous characters in the HTML code. This commit by Scully maintainer Sander Elias shows the fix for this problem. The commit is part of a larger pull request.

If you look at the code, you can see that Scully takes a bit of a different approach than Angular Universal. From my discussions with Sander, it became clear that this was necessary because the state can contain sequences of escaped HTML, which could break with the additional encoding. To avoid issues, Scully uses a custom encoding for transferring the state. The script block with the state now looks like the example below and no longer confuses the browser.

The script block containing state after Scully has encoded the HTML


<script id="ScullyIO-transfer-state">
  window['ScullyIO-transfer-state']=
  /** ___SCULLY_STATE_START___ */
    {"title":"oh shit!","content":"_~l~_~s~script_~g~_~l~script_~g~alert(_~q~gotcha!_~q~)_~l~_~s~script_~g~","restaurantId":1,"id":1}
  /** ___SCULLY_STATE_END___ */
</script>
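
The encoding can be read off the example above: < becomes _~l~, > becomes _~g~, / becomes _~s~, and ' becomes _~q~. The sketch below reproduces that mapping for illustration. It is inferred only from the sample output and is not taken from Scully's source code, which may handle more characters or edge cases. The function names encodeState and decodeState are hypothetical.

```javascript
// Mapping inferred from the example output above; NOT taken from Scully's source.
const ENCODE = { "<": "_~l~", ">": "_~g~", "/": "_~s~", "'": "_~q~" };

// Flip the table to get the decoding direction.
const DECODE = Object.fromEntries(
  Object.entries(ENCODE).map(([raw, encoded]) => [encoded, raw])
);

// Hypothetical encoder: replace each dangerous character with its marker.
function encodeState(text) {
  return text.replace(/[<>\/']/g, (ch) => ENCODE[ch]);
}

// Hypothetical decoder: turn each marker back into the original character.
function decodeState(text) {
  return text.replace(/_~[lgsq]~/g, (seq) => DECODE[seq]);
}

const payload = "</script><script>alert('gotcha!')</script>";
const encoded = encodeState(payload);

console.log(encoded);
// → _~l~_~s~script_~g~_~l~script_~g~alert(_~q~gotcha!_~q~)_~l~_~s~script_~g~
console.log(decodeState(encoded) === payload); // → true
```

Note that this simplified version would wrongly decode data that already contains a literal marker such as _~l~, so it should be read as a sketch of the idea rather than a drop-in encoder.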

Reporting the issue

Reporting security vulnerabilities is always a bit tricky. Dropping the vulnerability in the public GitHub repository exposes it to the world while it is still being triaged and fixed. That’s why I decided to reach out privately before logging the vulnerability in public databases. The whole process was kind of new to me, and since I’m likely not alone, I decided to document my reporting efforts here, hoping it may help someone else in the future.

Step 1 - Reporting the vulnerability privately

First of all, I am not a Scully expert, so while I was pretty sure that the vulnerability was problematic, there was still a chance that Scully never serialized untrusted data. To play it safe, I decided to reach out to the maintainers privately. If this was indeed a problem, the vulnerability should be fixed before announcing it publicly to the world.

I explained the problem in an email and followed up with Sander Elias over a couple of Twitter DMs. Once we got going, Sander quickly found the right solution and dropped a PR for the fix (I guess it’s public information now, after all).

This step, which fixes the problem, is the most crucial. However, to ensure maximum adoption of the updated version, it makes sense to have this vulnerability reported as a security issue. Doing so will trigger dependency monitoring systems to alert developers about the need to update, which will help propagate the patch.

Step 2 - Reporting the vulnerability to Snyk

Since this was my first time reporting a vulnerability, I did not really know where to go. I looked around for a way to report issues on npm or GitHub. Still, I could not find anything intended for security vulnerabilities in code (npm has a button to report malware, though).

I use Snyk for vulnerability management in my projects. Since Snyk is a major player in the vulnerability management space, I looked into their resources on reporting vulnerabilities. This form allows anyone to report a vulnerability. It takes a bit of effort to fill out each of the fields, but it’s a pretty straightforward process. I submitted the form and updated the Scully maintainers, as we agreed upon before.

I never heard back from Snyk, which left me pretty confused about the entire process. However, searching for scully in the Snyk vulnerability database shows that my report has been processed and is now listed as a vulnerability. Digging a bit deeper reveals that the vulnerability is only published in Snyk’s database but is not propagated to other databases. It also does not show up in tools like npm audit.

Step 3 - Help!?

Fortunately, I know a few people from Snyk from hanging out at conferences (at a different time). After checking in with Liran Tal, it turned out that there was a snafu in the process. I should have been notified. Also, the vulnerability should have been assigned a CVE number. A nudge from Liran went a long way: both issues have since been fixed. The vulnerability is now available in the National Vulnerability Database.

In the meantime, I also asked about the issue on Twitter. As it turns out, many people are somewhat confused about reporting vulnerabilities, and there is no exact process to follow. Perhaps this is something we should clarify as a community.

Step 4 - Analyzing dependencies

At the time of writing, the vulnerability has been fixed for about a month and publicly reported for two to three weeks. The vulnerability is listed in Snyk’s database and in the NVD. Scully shows up as a vulnerable dependency when running the snyk test command, but not in npm’s built-in npm audit tool. From a security perspective, it’s a bit disappointing to see how difficult vulnerability management still is.

A key takeaway here is not relying on a single database but attempting to use multiple sources of vulnerability data instead.

Summary

To summarize, serializing JavaScript objects with JSON.stringify() is perfectly fine in a JavaScript context. When the serialized data ends up in an HTML context, it can lead to XSS vulnerabilities.

To avoid XSS vulnerabilities, dangerous HTML characters should be encoded. For most projects, the use of the Serialize JavaScript library should suffice, but if necessary, custom encodings can be applied.


