{"id":1617,"date":"2012-07-09T12:13:26","date_gmt":"2012-07-09T01:13:26","guid":{"rendered":"http:\/\/blog.mozilla.org\/nnethercote\/?p=1617"},"modified":"2012-07-10T10:45:55","modified_gmt":"2012-07-09T23:45:55","slug":"how-to-compare-the-memory-efficiency-of-web-browsers","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nnethercote\/2012\/07\/09\/how-to-compare-the-memory-efficiency-of-web-browsers\/","title":{"rendered":"How to Compare the Memory Efficiency of Web Browsers"},"content":{"rendered":"<p><strong>TL;DR: Cross-browser comparisons of memory consumption should be avoided.\u00a0 If you want to evaluate how efficiently browsers use memory, you should do cross-browser comparisons of <em>performance<\/em> across several machines featuring a range of memory configurations.<\/strong><\/p>\n<h3>Cross-browser Memory Comparisons are Bad<\/h3>\n<p>Various tech sites periodically compare the performance of browsers.\u00a0 These often involve some cross-browser comparisons of memory efficiency.\u00a0 A typical one would be this:\u00a0 open a bunch of pages in tabs, measure memory consumption, then close all of them except one and wait two minutes, and then measure memory consumption again.\u00a0 Users sometimes do similar testing.<\/p>\n<p>I think comparisons of memory consumption like these are (a) very difficult to make correctly, and (b) very difficult to interpret meaningfully.\u00a0 I have suggestions below for alternative ways to measure memory efficiency of browsers, but first I&#8217;ll explain why I think these comparisons are a bad idea.<\/p>\n<h4>Cross-browser Memory Comparisons are Difficult to Make<\/h4>\n<p>Getting apples-to-apples comparisons is really difficult.<\/p>\n<ol>\n<li>Browser memory measurements aren&#8217;t easy.\u00a0 In particular, all browsers use multiple processes, and accounting for shared memory is difficult.<\/li>\n<li>Browsers are non-deterministic programs, and this can cause wide variation in memory 
consumption results.\u00a0 In particular, whether or not the JavaScript garbage collector has run recently can greatly affect memory consumption.\u00a0 If you get unlucky and the garbage collector runs just after you measure, you&#8217;ll get an unfairly high number.<\/li>\n<li>Browsers can exhibit adaptive memory behaviour.\u00a0 If running on a machine with lots of free RAM, a browser may choose to take advantage of it;\u00a0 if running on a machine with little free RAM, a browser may choose to discard regenerable data more aggressively.<\/li>\n<\/ol>\n<p>If you are comparing two versions of the same browser, problems (1) and (3) are avoided, and so if you are careful with problem (2) you can get reasonable results.\u00a0 But comparing different browsers hits all three problems.<\/p>\n<p>Indeed, Tom&#8217;s Hardware de-emphasized memory consumption measurements in their latest <a href=\"http:\/\/www.tomshardware.com\/reviews\/windows-7-chrome-20-firefox-13-opera-12,3228-12.html\">Web Browser Grand Prix<\/a> due to problem (3).\u00a0 Kudos to them!<\/p>\n<h4>Cross-browser Memory Comparisons are Difficult to Interpret<\/h4>\n<p>Even if you could get the measurements right, memory consumption is still not a good thing to compare.\u00a0 Before I can explain why, I&#8217;ll introduce a couple of terms.<\/p>\n<ul>\n<li>A <em>primary metric<\/em> is one a user can directly perceive.\u00a0 Metrics that measure performance and crash rate are good examples.<\/li>\n<li>A <em>secondary metric<\/em> is one that a user can only indirectly perceive via some kind of tool.\u00a0 Memory consumption is one example.\u00a0 The L2 cache miss rate is another example.<\/li>\n<\/ul>\n<p>(I made up these terms; I don&#8217;t know if there are existing terms for these concepts.)<\/p>\n<p>Primary metrics are obviously important, precisely because users can detect them.\u00a0 They measure things that users notice:\u00a0 &#8220;this browser is fast\/slow&#8221;, &#8220;this browser crashes all the 
time&#8221;, etc.<\/p>\n<p>Secondary metrics are important because they can affect primary metrics:\u00a0 memory consumption can affect performance and crash rate;\u00a0 the L2 cache miss rate can affect performance.<\/p>\n<p>Secondary metrics are also difficult to interpret.\u00a0 They can certainly be suggestive, but there are lots of secondary metrics that affect each primary metric of interest, so focusing too strongly on any single secondary metric is not a good idea.\u00a0 For example, if browser A has a higher L2 cache miss rate than browser B, that&#8217;s suggestive, but you&#8217;d be unwise to draw any strong conclusions from it.<\/p>\n<p>Furthermore, memory consumption is harder to interpret than many other secondary metrics.\u00a0 If all else is equal, a higher L2 cache miss rate is worse than a lower one.\u00a0 But that&#8217;s not true for memory consumption.\u00a0 There are all sorts of time\/space trade-offs that can be made, and there are many cases where using more memory can make browsers faster;\u00a0 JavaScript JITs are a great example.<\/p>\n<p>And I haven&#8217;t even discussed which memory consumption metric you should use.\u00a0 Physical memory consumption is an obvious choice, but I&#8217;ll discuss this more below.<\/p>\n<h3>A Better Methodology<\/h3>\n<p>So, I&#8217;ve explained why I think you shouldn&#8217;t do cross-browser memory comparisons.\u00a0 That doesn&#8217;t mean that efficient usage of memory isn&#8217;t important! 
However, instead of directly measuring memory consumption &#8212; a secondary metric &#8212; it&#8217;s far better to measure the effect of memory consumption on primary metrics such as performance.<\/p>\n<p>In particular, I think people often use memory consumption measurements as a proxy for performance on machines that don&#8217;t have much RAM.\u00a0 If you care about performance on machines that don&#8217;t have much RAM, <em>you should measure performance on a machine that doesn&#8217;t have much RAM<\/em> instead of trying to infer it from another measurement.<\/p>\n<h4>Experimental Setup<\/h4>\n<p>I did exactly this by doing something I call <em>memory sensitivity testing<\/em>, which involves measuring browser performance across a range of memory configurations.\u00a0 My test machine had the following characteristics.<\/p>\n<ul>\n<li>CPU: Intel i7-2600 3.4GHz (quad core with hyperthreading)<\/li>\n<li>RAM: 16GB DDR3<\/li>\n<li>OS: Ubuntu 11.10, Linux kernel version 3.0.0.<\/li>\n<\/ul>\n<p>I used a Linux machine because Linux has a feature called <a href=\"http:\/\/en.wikipedia.org\/wiki\/Cgroups\">cgroups<\/a> that allows you to restrict the machine resources available to one or more processes.\u00a0 I followed Justin Lebar&#8217;s <a href=\"http:\/\/jlebar.com\/2011\/6\/15\/Limiting_the_amount_of_RAM_a_program_can_use.html\">instructions<\/a> to create the following configurations that limited the amount of physical memory available: 1024MiB, 768MiB, 512MiB, 448MiB, 384MiB, 320MiB, 256MiB, 192MiB, 160MiB, 128MiB, 96MiB, 64MiB, 48MiB, 32MiB.<\/p>\n<p>(The more obvious way to do this is to use <code>ulimit<\/code>, but as far as I can tell it doesn&#8217;t work on recent versions of <a href=\"http:\/\/stackoverflow.com\/questions\/3043709\/resident-set-size-rss-limit-has-no-effect\/3043778#3043778\">Linux<\/a> or on <a href=\"http:\/\/forums.macrumors.com\/showthread.php?t=573616\">Mac<\/a>.\u00a0 And I don&#8217;t know of any way to do this on 
Windows.\u00a0 So my experiments had to be on Linux.)<\/p>\n<p>I used the following browsers.<\/p>\n<ul>\n<li>Firefox 12 Nightly, from 2012-01-10 (64-bit)<\/li>\n<li>Firefox 9.0.1 (64-bit)<\/li>\n<li>Chrome 16.0.912.75 (64-bit)<\/li>\n<li>Opera 11.60 (64-bit)<\/li>\n<\/ul>\n<p>IE and Safari aren&#8217;t represented because they don&#8217;t run on Linux.\u00a0 Firefox is over-represented because that&#8217;s the browser I work on and care about the most \ud83d\ude42\u00a0 The versions are a bit old because I did this testing about six months ago.<\/p>\n<p>I used the following benchmark suites:\u00a0 Sunspider v0.9.1, V8 v6, Kraken v1.1.\u00a0 These are all JavaScript benchmarks and are all awful for gauging a browser&#8217;s memory efficiency;\u00a0 but they have the key advantage that they run quite quickly.\u00a0 I thought about using Dromaeo and Peacekeeper to benchmark other aspects of browser performance, but they take several minutes each to run and I didn&#8217;t have the patience to run them a large number of times.\u00a0 This isn&#8217;t ideal, but I did this exercise to test-drive a benchmarking methodology, not make a definitive statement about each browser&#8217;s memory efficiency, so please forgive me.<\/p>\n<h4>Experimental Results<\/h4>\n<p>The following graph shows the Sunspider results.\u00a0 (Click on it to get a larger version.)<\/p>\n<p><a href=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/SS091.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-large wp-image-1625\" title=\"SS091\" src=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/SS091-1024x640.png\" alt=\"sunspider results graph\" width=\"500\" height=\"312\" srcset=\"https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/SS091-1024x640.png 1024w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/SS091-300x187.png 300w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/SS091.png 1689w\" sizes=\"(max-width: 500px) 100vw, 500px\" 
\/><\/a><\/p>\n<p>As the lines move from right to left, the amount of physical memory available drops.\u00a0 Firefox was clearly the fastest in most configurations, with only minor differences between Firefox 9 and Firefox 12pre, but it slowed down drastically below 160MiB;\u00a0 this is exactly the kind of curve I was expecting.\u00a0 Opera was next fastest in most configurations, and then Chrome, and neither of them showed any noticeable degradation at any memory size, which was surprising and impressive.<\/p>\n<p>All the browsers crashed\/aborted if memory was reduced enough.\u00a0 The point at which the graphs stop on the left-hand side indicates the lowest size that each browser successfully handled.\u00a0 None of the browsers ran Sunspider with 48MiB available, and FF12pre failed to run it with 64MiB available.<\/p>\n<p>The next graph shows the V8 results.<\/p>\n<p><a href=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/V8v6.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-large wp-image-1626\" title=\"V8v6\" src=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/V8v6-1024x641.png\" alt=\"v8 results graph\" width=\"500\" height=\"312\" srcset=\"https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/V8v6-1024x641.png 1024w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/V8v6-300x187.png 300w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/V8v6.png 1689w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><\/p>\n<p>The curves go the opposite way because V8 produces a score rather than a time, and bigger is better.\u00a0 Chrome easily got the best scores.\u00a0 Both Firefox versions degraded significantly.\u00a0 Chrome and Opera degraded somewhat, and only at lower sizes.\u00a0 Oddly enough, FF9 was the only browser that managed to run V8 with 128MiB available;\u00a0 the other three only ran it with 160MiB or more available.<\/p>\n<p>I don&#8217;t particularly like V8 as a benchmark.\u00a0 
I&#8217;ve always found that it doesn&#8217;t give consistent results if you run it multiple times, and these results concur with that observation.\u00a0 Furthermore, I don&#8217;t like that it gives a score rather than a time or inverse-time (such as runs per second), because it&#8217;s unclear how different scores relate.<\/p>\n<p>The final graph shows the Kraken results.<\/p>\n<p><a href=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/Kraken11.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-large wp-image-1627\" title=\"Kraken11\" src=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/Kraken11-1024x641.png\" alt=\"kraken results graph\" width=\"500\" height=\"312\" srcset=\"https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/Kraken11-1024x641.png 1024w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/Kraken11-300x187.png 300w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/01\/Kraken11.png 1690w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><\/p>\n<p>As with Sunspider, Chrome barely degraded and both Firefoxes degraded significantly.\u00a0 Opera was easily the slowest to begin with and degraded massively;\u00a0 nonetheless, it managed to run with 128MiB available (as did Chrome), which neither Firefox managed.<\/p>\n<h4>Experimental Conclusions<\/h4>\n<p>Overall, Chrome did well, and Opera and the two Firefoxes had mixed results. 
But I did this experiment to test a methodology, not to crown a winner.\u00a0 (And don&#8217;t forget that these experiments were done with browser versions that are now over six months old.)\u00a0 My main conclusion is that <strong>Sunspider, V8 and Kraken are not good benchmarks when it comes to gauging how efficiently browsers use memory<\/strong>.\u00a0 For example, none of the browsers slowed down on Sunspider until memory was restricted to 128MiB, which is a ridiculously small amount of memory for a desktop or laptop machine;\u00a0 it&#8217;s small even for a smartphone.\u00a0 V8 clearly stresses memory consumption more, but it&#8217;s still not great.<\/p>\n<p>What would a better benchmark look like?\u00a0 I&#8217;m not completely sure, but it would certainly involve opening multiple tabs and simulating real-world browsing. Something like Membench (see <a href=\"http:\/\/gregor-wagner.com\/?p=79\">here<\/a> and <a href=\"http:\/\/gregor-wagner.com\/?p=95\">here<\/a>) might be a reasonable starting point.\u00a0 To test the impact of memory consumption on performance, a clear performance measure would be required, because Membench currently lacks one.\u00a0 To test the impact of memory consumption on crash rate, Membench could be modified to just keep opening pages until the browser crashes.\u00a0 (The trouble with that is that you&#8217;d lose your count when the browser crashed!\u00a0 You&#8217;d need to log the current count to a file or something like that.)<\/p>\n<p>BTW, if you are thinking &#8220;you&#8217;ve just measured the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Working_set_size\">working set size<\/a>&#8221;, you&#8217;re exactly right! 
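The count-logging idea above can be sketched as follows. This is a hedged illustration in Python rather than an actual Membench modification; `open_page` and the `membench_count.txt` filename are hypothetical stand-ins for whatever mechanism actually drives the browser.

```python
from pathlib import Path

COUNT_FILE = Path("membench_count.txt")  # hypothetical log file name

def open_pages_with_logging(pages, open_page):
    """Persist the running count *before* each page is opened, so the
    count survives a browser crash.  `open_page` is a placeholder
    callback that loads one page in the browser."""
    for count, page in enumerate(pages, start=1):
        COUNT_FILE.write_text(str(count))  # logged before the risky step
        open_page(page)
```

If the browser dies while opening page N, the file still contains N, so the count isn't lost along with the crashed process.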
I think working set size is probably the best metric to use when evaluating memory consumption of a browser.\u00a0 Unfortunately it&#8217;s hard to measure (as we&#8217;ve seen) and it is best measured via a curve rather than a single number.<\/p>\n<h3>A Simpler Methodology<\/h3>\n<p>I think memory sensitivity testing is an excellent way to gauge the memory efficiency of different browsers.\u00a0 (In fact, the same methodology can be used for any kind of program, not just browsers.)<\/p>\n<p>But the above experiment wasn&#8217;t easy:\u00a0 it required a Linux machine, some non-trivial configuration of that machine that took me a while to get working, and at least 13 runs of each benchmark suite for each browser.\u00a0 I understand that tech sites would be reluctant to do this kind of testing, especially when longer-running benchmark suites such as Dromaeo and Peacekeeper are involved.<\/p>\n<p>A simpler alternative that would still be quite good would be to perform all the performance tests on several machines with different memory configurations.\u00a0 For example, a good experimental setup might involve the following machines.<\/p>\n<ul>\n<li>A fast desktop with 8GB or 16GB of RAM.<\/li>\n<li>A mid-range laptop with 4GB of RAM.<\/li>\n<li>A low-end netbook with 1GB or even 512MB of RAM.<\/li>\n<\/ul>\n<p>This wouldn&#8217;t require nearly as many runs as full-scale memory sensitivity testing would.\u00a0 It would avoid all the problems of cross-browser memory consumption comparisons:\u00a0 difficult measurements, non-determinism, and adaptive behaviour.\u00a0 It would avoid secondary metrics in favour of primary metrics.\u00a0 And it would give results that are easy for anyone to understand.<\/p>\n<p>(In some ways it&#8217;s even better than memory sensitivity testing because it involves real machines &#8212; a machine with a 3.4GHz i7-2600 CPU and only 128MiB of RAM isn&#8217;t a realistic configuration!)<\/p>\n<p>I&#8217;d love it if tech sites 
started doing this.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>TL;DR: Cross-browser comparisons of memory consumption should be avoided.\u00a0 If you want to evaluate how efficiently browsers use memory, you should do cross-browser comparisons of performance across several machines featuring a range of memory configurations. Cross-browser Memory Comparisons are Bad Various tech sites periodically compare the performance of browsers.\u00a0 These often involve some cross-browser comparisons [&hellip;]<\/p>\n","protected":false},"author":139,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[30,4555,4558,4544,4546,4562,311],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/1617"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/comments?post=1617"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/1617\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/media?parent=1617"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/categories?post=1617"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/tags?post=1617"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}