{"id":1413,"date":"2013-09-24T13:27:45","date_gmt":"2013-09-24T20:27:45","guid":{"rendered":"http:\/\/blog.mozilla.org\/security\/?p=1413"},"modified":"2013-09-24T13:27:45","modified_gmt":"2013-09-24T20:27:45","slug":"introducing-html2dom-an-alternative-to-setting-innerhtml","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/","title":{"rendered":"Introducing html2dom, an alternative to setting innerHTML"},"content":{"rendered":"<article>Having spent significant time <a href=\"https:\/\/wiki.mozilla.org\/Security\/Reviews\/Gaia\">reviewing the source code of some Firefox OS core apps<\/a>, I noticed that a lot of developers like to use <code>innerHTML<\/code> (or <code>insertAdjacentHTML<\/code>). It is indeed a useful API for inserting HTML from a given string without hand-crafting objects for each and every node you want to insert into the DOM.<br \/>\nThe dilemma begins, however, when the string is not hardcoded but constructed dynamically. If the string contains user input (or something from a malicious third party &#8211; be it app or website), it may well inject markup that changes the application logic (Cross-Site Scripting). The typical example is a <code>&lt;script&gt;<\/code> tag that runs code on the attacker&#8217;s behalf and reads, modifies or forwards the current content to a third party. 
<a title=\"Content Security Policy\" href=\"https:\/\/developer.mozilla.org\/en\/docs\/Security\/CSP\">CSP<\/a>, which we use in Firefox OS, can only mitigate some of these attacks, but <a href=\"http:\/\/lcamtuf.blogspot.de\/2011\/12\/notes-about-post-xss-world.html\">certainly not all<\/a>.<\/p>\n<h4>Using innerHTML is bad (Hint: DOM XSS)<\/h4>\n<p>What&#8217;s also frustrating about these pieces of code is that analyzing them requires you to manually trace every function call and variable back to its definition to see whether it is indeed tainted by user input.<\/p>\n<p>With code changing frequently, those reviews don&#8217;t really scale. One possible approach is to avoid using <code>innerHTML<\/code> for good. Even though this idea sounds a bit naive, I have dived into the world of automated HTML parsing and code generation to see how feasible it is.<\/p>\n<h4>Enter html2dom<\/h4>\n<p>For the sake of experimentation (and solving this neatly self-contained problem), I have created <a href=\"https:\/\/github.com\/freddyb\/html2dom\">html2dom<\/a>. html2dom is a tiny library that accepts an HTML string and returns equivalent JavaScript source code that builds the DOM nodes explicitly. Example:<\/p>\n<div>\n<pre>&lt;p id=\"greeting\"&gt;Hello &lt;b&gt;World&lt;\/b&gt;&lt;\/p&gt;<\/pre>\n<\/div>\n<p>This yields the following (as a string):<\/p>\n<div>\n<pre>var docFragment = document.createDocumentFragment();\r\n\/\/ this fragment contains all DOM nodes\r\nvar greeting = document.createElement('P');\r\ngreeting.setAttribute(\"id\", \"greeting\");\r\ndocFragment.appendChild(greeting);\r\nvar text = document.createTextNode(\"Hello \");\r\ngreeting.appendChild(text);\r\nvar b = document.createElement('B');\r\ngreeting.appendChild(b);\r\nvar text_0 = document.createTextNode(\"World\");\r\nb.appendChild(text_0);<\/pre>\n<\/div>\n<p>As you can see, html2dom tries to use meaningful variable names to make the code readable. If you want, you can try the <a href=\"http:\/\/freddyb.github.io\/html2dom\/\">demo here<\/a>. 
Now we could just as well replace the <code>\"World\"<\/code> string with a JavaScript variable. It cannot do any harm, as the content of a text node is always rendered as text.<\/p>\n<h4>When it comes to HTML parsers, you <em>also<\/em> don&#8217;t want to write your own.<\/h4>\n<p>Luckily, there are several very useful APIs that made the development of html2dom fairly easy. First, there is the <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/API\/DOMParser\">DOMParser API<\/a>, which took care of all HTML parsing. Using the DOM tree output, I could just iterate over all nodes and their children and emit a specific piece of JavaScript depending on each node&#8217;s type (e.g., element or text). For this, the <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/DOM\/Document.createNodeIterator\">NodeIterator<\/a> turned out to be really valuable.<\/p>\n<p>I have also written a few <a href=\"http:\/\/freddyb.github.io\/html2dom\/tests\/tests.html\">unit tests<\/a>, so if you want to start messing with my code, I suggest you start by checking them out right away.<\/p>\n<h4>Known Bugs &amp; Security<\/h4>\n<p>This tool doesn&#8217;t really save you from all of your troubles. But if you can make sure that user input always ends up in a text node, html2dom can protect you from a great deal of harm. <a href=\"http:\/\/freddyb.github.io\/html2dom\/\">Give it a try!<\/a><\/p>\n<h4>On the horizon<\/h4>\n<p>I have also been looking at attempts to rewrite potentially dangerous JavaScript automatically. This is at an early stage and still experimental, but you can look at a <a href=\"http:\/\/people.mozilla.com\/%7Efbraun\/falafler\/\">prototype<\/a>.<\/p>\n<\/article>\n","protected":false},"excerpt":{"rendered":"<p>Having spent significant time reviewing the source code of some Firefox OS core apps, I noticed that a lot of developers like to use innerHTML (or insertAdjacentHTML). 
It is &hellip; <a class=\"go\" href=\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/\">Read more<\/a><\/p>\n","protected":false},"author":405,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[69],"tags":[],"coauthors":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Introducing html2dom, an alternative to setting innerHTML - Mozilla Security Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Frederik Braun\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/\",\"url\":\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/\",\"name\":\"Introducing html2dom, an alternative to setting innerHTML - Mozilla Security Blog\",\"isPartOf\":{\"@id\":\"https:\/\/blog.mozilla.org\/security\/#website\"},\"datePublished\":\"2013-09-24T20:27:45+00:00\",\"dateModified\":\"2013-09-24T20:27:45+00:00\",\"author\":{\"@id\":\"https:\/\/blog.mozilla.org\/security\/#\/schema\/person\/9a9b6565cbac3c698b84dbd7447e438f\"},\"breadcrumb\":{\"@id\":\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/blog.mozilla.org\/security\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Introducing html2dom, an alternative to setting innerHTML\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/blog.mozilla.org\/security\/#website\",\"url\":\"https:\/\/blog.mozilla.org\/security\/\",\"name\":\"Mozilla Security Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/blog.mozilla.org\/security\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/blog.mozilla.org\/security\/#\/schema\/person\/9a9b6565cbac3c698b84dbd7447e438f\",\"name\":\"Frederik Braun\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/blog.mozilla.org\/security\/#\/schema\/person\/image\/f188d5ece9062fd6ec08fbeb06809792\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1f41f3ef916e1c1fc9401cf3212a6708?s=96&d=identicon&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1f41f3ef916e1c1fc9401cf3212a6708?s=96&d=identicon&r=g\",\"caption\":\"Frederik Braun\"},\"description\":\"Frederik Braun defends Mozilla Firefox as a Staff Security Engineer in Berlin. He's also a member of the W3C Web Application Security Working Group and co-authored the Subresource Integrity standard.\",\"sameAs\":[\"https:\/\/frederik-braun.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Introducing html2dom, an alternative to setting innerHTML - Mozilla Security Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/","twitter_misc":{"Written by":"Frederik Braun","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/","url":"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/","name":"Introducing html2dom, an alternative to setting innerHTML - Mozilla Security Blog","isPartOf":{"@id":"https:\/\/blog.mozilla.org\/security\/#website"},"datePublished":"2013-09-24T20:27:45+00:00","dateModified":"2013-09-24T20:27:45+00:00","author":{"@id":"https:\/\/blog.mozilla.org\/security\/#\/schema\/person\/9a9b6565cbac3c698b84dbd7447e438f"},"breadcrumb":{"@id":"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/blog.mozilla.org\/security\/2013\/09\/24\/introducing-html2dom-an-alternative-to-setting-innerhtml\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/blog.mozilla.org\/security\/"},{"@type":"ListItem","position":2,"name":"Introducing html2dom, an alternative to setting innerHTML"}]},{"@type":"WebSite","@id":"https:\/\/blog.mozilla.org\/security\/#website","url":"https:\/\/blog.mozilla.org\/security\/","name":"Mozilla Security Blog","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/blog.mozilla.org\/security\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/blog.mozilla.org\/security\/#\/schema\/person\/9a9b6565cbac3c698b84dbd7447e438f","name":"Frederik 
Braun","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/blog.mozilla.org\/security\/#\/schema\/person\/image\/f188d5ece9062fd6ec08fbeb06809792","url":"https:\/\/secure.gravatar.com\/avatar\/1f41f3ef916e1c1fc9401cf3212a6708?s=96&d=identicon&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1f41f3ef916e1c1fc9401cf3212a6708?s=96&d=identicon&r=g","caption":"Frederik Braun"},"description":"Frederik Braun defends Mozilla Firefox as a Staff Security Engineer in Berlin. He's also a member of the W3C Web Application Security Working Group and co-authored the Subresource Integrity standard.","sameAs":["https:\/\/frederik-braun.com"]}]}},"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/posts\/1413"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/users\/405"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/comments?post=1413"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/posts\/1413\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/media?parent=1413"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/categories?post=1413"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/tags?post=1413"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/blog.mozilla.org\/security\/wp-json\/wp\/v2\/coauthors?post=1413"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}