{"id":3094,"date":"2015-06-03T10:34:09","date_gmt":"2015-06-02T23:34:09","guid":{"rendered":"http:\/\/blog.mozilla.org\/nnethercote\/?p=3094"},"modified":"2015-06-03T10:34:09","modified_gmt":"2015-06-02T23:34:09","slug":"measuring-data-structure-sizes-firefox-c-vs-servo-rust","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nnethercote\/2015\/06\/03\/measuring-data-structure-sizes-firefox-c-vs-servo-rust\/","title":{"rendered":"Measuring data structure sizes: Firefox (C++) vs. Servo (Rust)"},"content":{"rendered":"<p>Firefox&#8217;s <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Mozilla\/Performance\/about:memory\">about:memory page<\/a> presents fine-grained measurements of memory usage. Here&#8217;s a short example.<\/p>\n<pre>725.84 MB (100.0%) -- explicit\r\n\u251c\u2500\u2500504.36 MB (69.49%) -- window-objects\r\n\u2502 \u251c\u2500\u2500115.84 MB (15.96%) -- top(https:\/\/treeherder.mozilla.org\/#\/jobs?repo=mozilla-inbound, id=2147483655)\r\n\u2502 \u2502 \u251c\u2500\u2500\u250085.30 MB (11.75%) -- active\r\n\u2502 \u2502 \u2502 \u251c\u2500\u250084.75 MB (11.68%) -- window(https:\/\/treeherder.mozilla.org\/#\/jobs?repo=mozilla-inbound)\r\n\u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250036.51 MB (05.03%) -- dom\r\n\u2502 \u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250016.46 MB (02.27%) \u2500\u2500 element-nodes\r\n\u2502 \u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250013.08 MB (01.80%) \u2500\u2500 orphan-nodes\r\n\u2502 \u2502 \u2502 \u2502 \u2502 \u2514\u2500\u2500\u25006.97 MB (00.96%) ++ (4 tiny)\r\n\u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250025.17 MB (03.47%) -- js-compartment(https:\/\/treeherder.mozilla.org\/#\/jobs?repo=mozilla-inbound)\r\n\u2502 \u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250023.29 MB (03.21%) ++ classes\r\n\u2502 \u2502 \u2502 \u2502 \u2502 \u2514\u2500\u2500\u25001.87 MB (00.26%) ++ (7 tiny)\r\n\u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250021.69 MB (02.99%) ++ layout\r\n\u2502 \u2502 \u2502 \u2502 \u2514\u2500\u2500\u25001.39 MB (00.19%) ++ (2 tiny)\r\n\u2502 \u2502 \u2502 \u2514\u2500\u2500\u25000.55 MB (00.08%) ++ window(https:\/\/login.persona.org\/communication_iframe)\r\n\u2502 \u2502 \u2514\u2500\u2500\u250030.54 MB (04.21%) ++ js-zone(0x7f131ed6e000)<\/pre>\n<p>A typical about:memory invocation contains many thousands of measurements. Although they can be hard for non-experts to interpret, they are immensely useful to Firefox developers. For this reason, I&#8217;m currently implementing a similar system in <a href=\"https:\/\/github.com\/servo\/servo\">Servo<\/a>, which is a next-generation browser engine that&#8217;s implemented in <a href=\"http:\/\/www.rust-lang.org\/\">Rust<\/a>. Although the implementation in Servo is heavily based on the Firefox implementation, Rust has some features that make the Servo implementation a lot nicer than the Firefox implementation, which is written in C++. This blog post is a deep dive that explains how and why.<\/p>\n<h3>Measuring data structures in Firefox<\/h3>\n<p>A lot of the measurements done for about:memory are of heterogeneous data structures that live on the heap and contain pointers. We want such data structures to be able to measure themselves. Consider the following simple example.<\/p>\n<pre>struct CookieDomainTuple\r\n{\r\n  nsCookieKey key;\r\n  nsRefPtr&lt;nsCookie&gt; cookie;\r\n \r\n  size_t SizeOfExcludingThis(mozilla::MallocSizeOf aMallocSizeOf) const;\r\n};<\/pre>\n<p>The things to immediately note about this type are as follows.<\/p>\n<ul>\n<li>The details of <code>nsCookieKey<\/code> and <code>nsCookie<\/code> don&#8217;t matter here.<\/li>\n<li><code>nsRefPtr<\/code> is a smart pointer type.<\/li>\n<li>There is a method, called <code>SizeOfExcludingThis<\/code>, for measuring the size of a <code>CookieDomainTuple<\/code>.<\/li>\n<\/ul>\n<p>That measurement method has the following form.<\/p>\n<pre>size_t\r\nCookieDomainTuple::SizeOfExcludingThis(MallocSizeOf aMallocSizeOf) const\r\n{\r\n  size_t amount = 0;\r\n  amount += key.SizeOfExcludingThis(aMallocSizeOf);\r\n  amount += cookie-&gt;SizeOfIncludingThis(aMallocSizeOf);\r\n  return amount;\r\n}<\/pre>\n<p>Things to note here are as follows.<\/p>\n<ul>\n<li><code>aMallocSizeOf<\/code> is a pointer to a function that takes a pointer to a heap block and returns the size of that block in bytes. Under the covers it&#8217;s implemented with a function like <code>malloc_usable_size<\/code>. Using a function like this is superior to computing the size analytically, because (a) it&#8217;s less error-prone and (b) it measures the actual size of heap blocks, which is often larger than the requested size because <a href=\"https:\/\/blog.mozilla.org\/nnethercote\/2011\/08\/05\/clownshoes-available-in-sizes-2101-and-up\/\">heap allocators round up some request sizes<\/a>. It will also naturally measure any padding between members.<\/li>\n<li>The two data members are measured by invocations to size measurement methods that they provide.<\/li>\n<li>The first of these is called <code>SizeOfExcludingThis<\/code>. The &#8220;excluding <code>this<\/code>&#8221; here is necessary because <code>key<\/code> is an <code>nsCookieKey<\/code> that sits within a <code>CookieDomainTuple<\/code>. We don&#8217;t want to measure the <code>nsCookieKey<\/code> struct itself, just any additional heap blocks that it has pointers to.<\/li>\n<li>The second of these is called <code>SizeOfIncludingThis<\/code>. The &#8220;including <code>this<\/code>&#8221; here is necessary because <code>cookie<\/code> is just a pointer to an <code>nsCookie<\/code> struct, which we <em>do<\/em> want to measure, along with any additional heap blocks it has pointers to.<\/li>\n<li>We need to be careful with these calls. If we call <code>SizeOfIncludingThis<\/code> when we should call <code>SizeOfExcludingThis<\/code>, we&#8217;ll likely get a crash due to calling <code>aMallocSizeOf<\/code> on a non-heap pointer. And if we call <code>SizeOfExcludingThis<\/code> when we should call <code>SizeOfIncludingThis<\/code>, we&#8217;ll miss measuring the struct.<\/li>\n<li>If this struct had a pointer to a raw heap buffer &#8212; e.g. a <code>char*<\/code> member &#8212; it would measure it by calling <code>aMallocSizeOf<\/code> directly on the pointer.<\/li>\n<\/ul>\n<p>With that in mind, you can see that this method is itself a <code>SizeOfExcludingThis<\/code> method, and indeed, it doesn&#8217;t measure the memory used by the struct instance itself. A method that did include that memory would look like the following.<\/p>\n<pre>size_t\r\nCookieDomainTuple::SizeOfIncludingThis(MallocSizeOf aMallocSizeOf)\r\n{ \r\n  return aMallocSizeOf(this) + SizeOfExcludingThis(aMallocSizeOf);\r\n}<\/pre>\n<p>All it does is measure the <code>CookieDomainTuple<\/code> struct itself &#8212; i.e. <code>this<\/code> &#8212; and then call the <code>SizeOfExcludingThis<\/code> method, which measures all child structures.<\/p>\n<p>There are a few other wrinkles.<\/p>\n<ul>\n<li>Often we want to ignore a data member. Perhaps it&#8217;s a scalar value, such as an integer. Perhaps it&#8217;s a non-owning pointer to something and that thing would be better measured as part of the measurement of another data structure. Perhaps it&#8217;s something small that isn&#8217;t worth measuring. In these cases we generally use comments in the measurement method to explain why a field isn&#8217;t measured, but it&#8217;s easy for these comments to fall out-of-date. It&#8217;s also easy to forget to update the measurement method when a new data member is added.<\/li>\n<li>Every <code>SizeOfIncludingThis<\/code> method body looks the same: <code>return aMallocSizeOf(this) + SizeOfExcludingThis(aMallocSizeOf);<\/code><\/li>\n<li>Reference-counting complicates things, because you end up with pointers that conceptually own a fraction of another structure.<\/li>\n<li>Inheritance complicates things.<\/li>\n<\/ul>\n<p>(The <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Mozilla\/Performance\/Memory_reporting\">full documentation<\/a> goes into more detail.)<\/p>\n<p>Even with all the wrinkles, it all works fairly well. Having said that, there are a lot of <code>SizeOfExcludingThis<\/code> and <code>SizeOfIncludingThis<\/code> methods that are boilerplate-y and tedious to write.<\/p>\n<h3>Measuring data structures in SERVO<\/h3>\n<p>When I started implementing a similar system in Servo, I naturally followed a similar design. But I soon found I was able to improve upon it.<\/p>\n<p>With the same functions defined for lots of types, it was natural to define a Rust trait, like the following.<\/p>\n<pre>pub trait HeapSizeOf { \r\n\u00a0 fn size_of_including_self(&amp;self) -&gt; usize;\r\n  fn size_of_excluding_self(&amp;self) -&gt; usize; \r\n}<\/pre>\n<p>Having to repeatedly define <code>size_of_including_self<\/code> when its definition always looks the same is a pain. But heap pointers in Rust are handled via the parameterized <code>Box<\/code> type, and it&#8217;s possible to implement traits for this type. This means we can implement <code>size_of_excluding_this<\/code> for all <code>Box<\/code> types &#8212; thus removing the need for <code>size_of_including_this<\/code> &#8212; in one fell swoop, as the following code shows.<\/p>\n<pre>impl&lt;T: HeapSizeOf&gt; HeapSizeOf for Box&lt;T&gt; {\r\n  fn size_of_excluding_self(&amp;self) -&gt; usize {\r\n    heap_size_of(&amp;**self as *const T as *const c_void) + (**self).size_of_excluding_self()\r\n  }\r\n}<\/pre>\n<p>The pointer manipulations are hairy, but basically it says that if <code>T<\/code> implements the <code>HeapSizeOf<\/code> trait, then we can measure <code>Box&lt;T&gt;<\/code> by measuring the <code>T<\/code> struct itself (via <code>heap_size_of<\/code>, which is similar to the <code>aMallocSizeOf<\/code> function in the Firefox example), and then measuring the things hanging off <code>T<\/code> (via the <code>size_of_excluding_self<\/code> call). Excellent!<\/p>\n<p>With the including\/excluding distinction gone, I renamed <code>size_of_excluding_self<\/code> as <code>heap_size_of_children<\/code>, which I thought communicated the same idea more clearly; it seems better for the name to describe what it <em>is<\/em> measuring rather than what it is <em>not<\/em> measuring.<\/p>\n<p>But there was still a need for a lot of tedious boilerplate code, as this example shows.<\/p>\n<pre>pub struct DisplayList {\r\n  pub background_and_borders: LinkedList&lt;DisplayItem&gt;,\r\n  pub block_backgrounds_and_borders: LinkedList&lt;DisplayItem&gt;,\r\n  pub floats: LinkedList&lt;DisplayItem&gt;,\r\n  pub content: LinkedList&lt;DisplayItem&gt;,\r\n  pub positioned_content: LinkedList&lt;DisplayItem&gt;,\r\n  pub outlines: LinkedList&lt;DisplayItem&gt;,\r\n  pub children: LinkedList&lt;Arc&lt;StackingContext&gt;&gt;,\r\n}\r\n\r\nimpl HeapSizeOf for DisplayList {\r\n  fn heap_size_of_children(&amp;self) -&gt; usize {\r\n    self.background_and_borders.heap_size_of_children() +\r\n    self.block_backgrounds_and_borders.heap_size_of_children() +\r\n    self.floats.heap_size_of_children() +\r\n    self.content.heap_size_of_children() +\r\n    self.positioned_content.heap_size_of_children() +\r\n    self.outlines.heap_size_of_children() +\r\n    self.children.heap_size_of_children()\r\n  }\r\n}<\/pre>\n<p>However, the Rust compiler has the ability to automatically derive implementations for some built-in traits. Even better, the compiler lets you write plug-ins that do arbitrary transformations of the syntax tree, which makes it possible to write a plug-in that does the same for non-built-in traits on request. And the delightful\u00a0<a href=\"http:\/\/manishearth.github.io\/\">Manish Goregaokar<\/a> <a href=\"https:\/\/github.com\/servo\/servo\/pull\/6188\">has done exactly that<\/a>. This allows the example above to be reduced to the following.<\/p>\n<pre>#[derive(HeapSizeOf)]\r\npub struct DisplayList {\r\n  pub background_and_borders: LinkedList&lt;DisplayItem&gt;,\r\n  pub block_backgrounds_and_borders: LinkedList&lt;DisplayItem&gt;,\r\n  pub floats: LinkedList&lt;DisplayItem&gt;,\r\n  pub content: LinkedList&lt;DisplayItem&gt;,\r\n  pub positioned_content: LinkedList&lt;DisplayItem&gt;,\r\n  pub outlines: LinkedList&lt;DisplayItem&gt;,\r\n  pub children: LinkedList&lt;Arc&lt;StackingContext&gt;&gt;,\r\n}<\/pre>\n<p>The first line is an annotation that triggers the plug-in to do the obvious thing: generate a <code>heap_size_of_children<\/code> definition that just calls <code>heap_size_of_children<\/code> on all the struct fields. Wonderful!<\/p>\n<p>But you may remember that I mentioned that sometimes in Firefox&#8217;s C++ code we want to ignore a particular member. This is also true in Servo&#8217;s Rust code, so the plug-in supports an <code>ignore_heap_size<\/code> annotation which can be applied to any field in the struct definition; the plug-in will duly ignore any such field.<\/p>\n<p>If a new field is added which has a type for which <code>HeapSizeOf<\/code> has not been implemented, the compiler will complain. This means that we can&#8217;t add a new field to a struct and forget to measure it. The <code>ignore_heap_size_of<\/code> annotation also requires a string argument, which (by convention) holds a brief explanation why the member is ignored, as the following example shows.<\/p>\n<pre>#[ignore_heap_size_of = \"Because it is a non-owning reference.\"]\r\npub image: Arc&lt;Image&gt;,<\/pre>\n<p>(An aside: the best way to handle <code>Arc<\/code> is an open question. If one of the references is clearly the owner, it probably makes sense to count the full size for that one reference. Otherwise, it is probably best to divide the size equally among all the references.)<\/p>\n<p>The plug-in also has a <code>known_heap_size_of!<\/code> macro that lets us easily dictate the heap size of built-in types (such as integral types, whose heap size is zero). This works because Rust allows implementations of custom traits for built-in types. It provides additional uniformity because built-in types don&#8217;t need special treatment. The following line says that all the built-in signed integer types have a <code>heap_size_of_children<\/code> value of zero.<\/p>\n<pre>known_heap_size_of!(0, i8, i16, i32, i64, isize);<\/pre>\n<p>Finally, if there is a type for which the measurement needs to do something more complicated, we can still implement <code>heap_size_of_children<\/code> manually.<\/p>\n<h3>Conclusion<\/h3>\n<p>The Servo implementation is much nicer than the Firefox implementation, in the following ways.<\/p>\n<ul>\n<li>\u00a0There is no need for an including\/excluding split thanks to trait implementations on <code>Box<\/code>. This avoids boilerplate some code and makes it impossible to accidentally call the wrong method.<\/li>\n<li>Struct fields that use built-in types are handled the same way as all others, because Rust trait implementations can be defined for built-in types.<\/li>\n<li>Even more boilerplate is avoided thanks to the compiler plug-in that auto-derives <code>HeapSizeOf<\/code> implementations; it can even ignore fields.<\/li>\n<li>For ignored fields, the required string parameter makes it impossible to forget to explain why the field is ignored.<\/li>\n<\/ul>\n<p>These are possible due to several powerful language and compiler features of Rust that C++ lacks. There may be some C++ features that could improve the Firefox code &#8212; and I&#8217;d love to hear suggestions &#8212; but it&#8217;s never going to be as nice as the Rust code.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Firefox&#8217;s about:memory page presents fine-grained measurements of memory usage. Here&#8217;s a short example. 725.84 MB (100.0%) &#8212; explicit \u251c\u2500\u2500504.36 MB (69.49%) &#8212; window-objects \u2502 \u251c\u2500\u2500115.84 MB (15.96%) &#8212; top(https:\/\/treeherder.mozilla.org\/#\/jobs?repo=mozilla-inbound, id=2147483655) \u2502 \u2502 \u251c\u2500\u2500\u250085.30 MB (11.75%) &#8212; active \u2502 \u2502 \u2502 \u251c\u2500\u250084.75 MB (11.68%) &#8212; window(https:\/\/treeherder.mozilla.org\/#\/jobs?repo=mozilla-inbound) \u2502 \u2502 \u2502 \u2502 \u251c\u2500\u250036.51 MB (05.03%) &#8212; dom \u2502 [&hellip;]<\/p>\n","protected":false},"author":139,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4550,30,4544,16179,16180],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/3094"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/comments?post=3094"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/3094\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/media?parent=3094"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/categories?post=3094"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/tags?post=3094"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}