{"id":3014,"date":"2014-11-04T10:56:10","date_gmt":"2014-11-03T23:56:10","guid":{"rendered":"http:\/\/blog.mozilla.org\/nnethercote\/?p=3014"},"modified":"2014-11-04T10:56:10","modified_gmt":"2014-11-03T23:56:10","slug":"please-grow-your-buffers-exponentially","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nnethercote\/2014\/11\/04\/please-grow-your-buffers-exponentially\/","title":{"rendered":"Please grow your buffers exponentially"},"content":{"rendered":"<p>If you record every heap allocation and re-allocation done by Firefox you find some interesting things. In particular, you find some sub-optimal buffer growth strategies that cause a lot of heap churn.<\/p>\n<p>Think about a data structure that involves a contiguous, growable buffer, such as a string or a vector. If you append to it and it doesn&#8217;t have enough space for the appended elements, you need to allocate a new buffer, copy the old contents to the new buffer, and then free the old buffer. <code>realloc()<\/code> is usually used for this, because it does these three steps for you.<\/p>\n<p>The crucial question: when you have to grow a buffer, how much do you grow it? One obvious answer is &#8220;just enough for the new elements&#8221;. That might seem\u00a0 space-efficient at first glance, but if you have to repeatedly grow the buffer it can quickly turn bad.<\/p>\n<p>Consider a simple but not outrageous example. Imagine you have a buffer that starts out 1 byte long and you add single bytes to it until it is 1 MiB long. If you use the &#8220;just-enough&#8221; strategy you&#8217;ll cumulatively allocate this much memory:<\/p>\n<blockquote><p>1 + 2 + 3 + &#8230; + 1,048,575 + 1,048,576 = 549,756,338,176 bytes<\/p><\/blockquote>\n<p>Ouch. O(<em>n<\/em><sup>2<\/sup>) behaviour really hurts when <em>n<\/em> gets big enough. Of course the peak memory usage won&#8217;t be nearly this high, but all those reallocations and copying will be slow.<\/p>\n<p>In practice it won&#8217;t be this bad because heap allocators round up requests, e.g. if you ask for 123 bytes you&#8217;ll likely get something larger like 128 bytes. The allocator used by Firefox (an old, extensively-modified version of jemalloc) rounds up all requests between 4 KiB and 1 MiB to the nearest multiple of 4 KiB. So you&#8217;ll actually allocate approximately this much memory:<\/p>\n<blockquote><p>4,096 + 8,192 + 12,288 + &#8230; + 1,044,480 + 1,048,576 = 134,742,016 bytes<\/p><\/blockquote>\n<p>(This ignores the sub-4 KiB allocations, which in total are negligible.) Much better. And if you&#8217;re lucky the OS&#8217;s virtual memory system will do some magic with page tables to make the copying cheap. But still, it&#8217;s a lot of churn.<\/p>\n<p>A strategy that is usually better is exponential growth. Doubling the buffer each time is the simplest strategy:<\/p>\n<blockquote><p>4,096 + 8,192 + 16,384 + 32,768 + 65,536 + 131,072 + 262,144 + 524,288 + 1,048,576 = 2,093,056 bytes<\/p><\/blockquote>\n<p>That&#8217;s more like it; the cumulative size is just under twice the final size, and the series is short enough now to write it out in full, which is nice &#8212; calls to <code>malloc()<\/code> and <code>realloc()<\/code> aren&#8217;t that cheap because they typically require acquiring a lock. I particularly like the doubling strategy because it&#8217;s simple and it also avoids wasting usable space due to <a href=\"https:\/\/blog.mozilla.org\/nnethercote\/2011\/08\/05\/clownshoes-available-in-sizes-2101-and-up\/\">slop<\/a>.<\/p>\n<p>Recently I&#8217;ve converted &#8220;just enough&#8221; growth strategies to exponential growth strategies in <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1084114\">XDRBuffer<\/a> and <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1048044\">nsTArray<\/a>, and I also found a case in <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1088143\">SQLite<\/a> that Richard Hipp has fixed. These pieces of code now match numerous places that already used exponential growth: pldhash, JS::HashTable, mozilla::Vector, JSString, nsString, and pdf.js.<\/p>\n<p>Pleasingly, the nsTArray conversion had a clear positive effect. Not only did the exponential growth strategy reduce the amount of heap churn and the number of <code>realloc()<\/code> calls, it also <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1048044#c35\">reduced heap fragmentation<\/a>: the &#8220;heap-overhead&#8221; part of the purple measurement on <a href=\"https:\/\/areweslimyet.com\/\">AWSY<\/a> (a.k.a. &#8220;<span class=\"quote\">RSS: After TP5, tabs closed [+30s, forced GC]&#8221;) <\/span>dropped by 4.3 MiB! This makes sense if you think about it: an allocator can fulfil power-of-two requests like 64 KiB, 128 KiB, and 256 KiB with less waste than it can awkward requests like 244 KiB, 248 KiB, 252 KiB, etc.<\/p>\n<p>So, if you know of some more code in Firefox that uses a non-exponential growth strategy for a buffer, please fix it, or let me know so I can look at it. Thank you.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you record every heap allocation and re-allocation done by Firefox you find some interesting things. In particular, you find some sub-optimal buffer growth strategies that cause a lot of heap churn. Think about a data structure that involves a contiguous, growable buffer, such as a string or a vector. If you append to it [&hellip;]<\/p>\n","protected":false},"author":139,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4557,30,4556,4544,4546],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/3014"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/comments?post=3014"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/3014\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/media?parent=3014"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/categories?post=3014"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/tags?post=3014"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}