Updated Jan 14, 2009 by mikesamuel
JsHtmlSanitizer  
How to use Caja as a stand-alone client-side sanitizer

Introduction

The Caja project includes an HTML sanitizer written in JavaScript that can be used independently of the cajoler. You can use it to remove potentially executable JavaScript from a snippet of HTML. To use it, first build html-sanitizer-minified.js by running ant.

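Use a <script> tag to include the resulting com/google/caja/plugin/html-sanitizer-minified.js in your page. To sanitize a snippet of HTML, call html_sanitize(htmlSnippet, urlTransformer, nameIdClassTransformer), where htmlSnippet is the HTML string to sanitize, urlTransformer is a function applied to each URL in the snippet, and nameIdClassTransformer is a function applied to ids, names and classes.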

The return value is the HTML snippet with all script and style tags removed (style tags can contain code that some browsers interpret as JavaScript), and with URLs, ids, names and classes rewritten according to the transformers.
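As a rough illustration of what the two transformers might look like, here is a minimal sketch. The names allowlistUrl, prefixNames and ALLOWED_HOSTS, the example hosts, and the "user-" prefix are all hypothetical and not part of Caja. This page does not say whether a class attribute is passed whole or one token at a time, so the sketch splits on whitespace to cope with either.

```javascript
// Hypothetical allowlist of hosts; not part of Caja.
var ALLOWED_HOSTS = { 'example.com': true, 'images.example.com': true };

// Returns the URL unchanged when it is http(s) and its host is allowlisted.
// Otherwise returns undefined, mirroring urlX in this page's example.
// URLs with userinfo ("user@host") or backslashes in the authority are rejected.
function allowlistUrl(url) {
  var m = /^https?:\/\/([^\/?#:@\\]+)(?::\d+)?(?:[\/?#]|$)/i.exec(String(url));
  if (m && Object.prototype.hasOwnProperty.call(ALLOWED_HOSTS, m[1].toLowerCase())) {
    return url;
  }
  return undefined;
}

// Prefixes every whitespace-separated token so that sanitized ids, names
// and classes cannot collide with those used by the host page.
function prefixNames(value) {
  var tokens = String(value).split(/\s+/);
  var out = [];
  for (var i = 0; i < tokens.length; i++) {
    if (tokens[i]) { out.push('user-' + tokens[i]); }
  }
  return out.join(' ');
}
```

With the sanitizer loaded, these would be passed as html_sanitize(htmlSnippet, allowlistUrl, prefixNames).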

If you need more control, you can use html.makeSaxParser to create your own SAX-style processor. makeSaxParser takes as its argument an object containing event handlers such as:

var mySaxParser = html.makeSaxParser(
    {
      startDoc: function (x) { /* called first before processing starts */ },
      startTag: function (tagNameLowerCase, attribs, x) {
        // called on start tags.  may modify attribs.
      },
      endTag: function (tagName, x) {
        // called on end tags.
      },
      pcdata: function (plainText, x) {
        // plainText has entities replaced with the literal value.
      },
      rcdata: function (plainText, x) {
        // contents of a TITLE, TEXTAREA, or similar tag.
      },
      cdata: function (plainText, x) {
        // contents of a SCRIPT, STYLE, XMP, or similar tag.
      },
      endDoc: function (x) {
        // called when processing finished.
      }
    });

After this call, mySaxParser is a function that takes HTML text and an arbitrary value that will be passed as the parameter x to the event handlers above.
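To make the role of x concrete, here is a minimal sketch of a handler set that keeps only text content, using an array as the value passed through to the handlers. The names textOnlyHandlers and extractText are hypothetical. makeSaxParser is taken as a parameter so the sketch does not rely on a global; in a page that has loaded the sanitizer you would pass html.makeSaxParser.

```javascript
// Handlers that keep only text content. `out` is the arbitrary value
// given as the parser's second argument (here, an array of strings).
var textOnlyHandlers = {
  startDoc: function (out) {},
  startTag: function (tagNameLowerCase, attribs, out) {},
  endTag: function (tagName, out) {},
  pcdata: function (plainText, out) { out.push(plainText); },
  rcdata: function (plainText, out) { out.push(plainText); },
  cdata: function (plainText, out) {},  // drop SCRIPT, STYLE, XMP contents
  endDoc: function (out) {}
};

// Runs the parser over htmlText and joins the collected text chunks.
function extractText(makeSaxParser, htmlText) {
  var out = [];
  var parse = makeSaxParser(textOnlyHandlers);
  parse(htmlText, out);
  return out.join('');
}
```

Because pcdata receives text with entities already replaced by their literal values, the result is plain text and should be escaped again before being inserted into a page as HTML.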

Example

    <script src="html-sanitizer-minified.js"></script>
    <script>
      // Allow only http and https URLs; anything else yields undefined.
      function urlX(url) { if (/^https?:\/\//.test(url)) { return url; } }
      // Leave ids, names and classes unchanged.
      function idX(id) { return id; }
      alert(html_sanitize('<b>hello</b><img src="http://asdf"><a href="javascript:alert(0)"><script src="http://dfd"><\/script>', urlX, idX));
    </script>
