{"id":508,"date":"2011-02-01T17:09:07","date_gmt":"2011-02-01T06:09:07","guid":{"rendered":"http:\/\/blog.mozilla.org\/nnethercote\/?p=508"},"modified":"2011-02-01T17:09:07","modified_gmt":"2011-02-01T06:09:07","slug":"memory-profiling-firefox-on-techcrunch","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nnethercote\/2011\/02\/01\/memory-profiling-firefox-on-techcrunch\/","title":{"rendered":"Memory profiling Firefox on TechCrunch"},"content":{"rendered":"<p><a href=\"http:\/\/blog.mozilla.org\/rob-sayre\/\">Rob Sayre<\/a> suggested <a href=\"http:\/\/techcrunch.com\/\">TechCrunch<\/a> to me as a good stress test for Firefox&#8217;s memory usage:<\/p>\n<pre>  {sayrer} take a look at techcrunch.com if you have a chance. that one is brutal.<\/pre>\n<p>So I measured space usage <a href=\"http:\/\/blog.mozilla.org\/nnethercote\/2011\/01\/07\/memory-profiling-firefox-with-massif-part-2\/\">with Massif<\/a> for a single TechCrunch tab, on 64-bit Linux.  Here&#8217;s the high-level result:<\/p>\n<pre>  34.65% (371,376,128B) _dl_map_object_from_fd (dl-load.c:1199)\r\n  21.14% (226,603,008B) pthread_create@@GLIBC_2.2.5 (allocatestack.c:483)\r\n  08.93% (95,748,276B) in 4043 places, all below massif's threshold (00.20%)\r\n  06.26% (67,112,960B) pa_shm_create_rw (in \/usr\/lib\/libpulsecommon-0.9.21.so)\r\n  03.10% (33,263,616B) JSC::ExecutablePool::systemAlloc(unsigned long) (ExecutableAllocatorPosix.cpp:43)\r\n  02.67% (28,618,752B) NewOrRecycledNode(JSTreeContext*) (jsparse.cpp:670)\r\n  01.90% (20,414,464B) js::PropertyTree::newShape(JSContext*, bool) (jspropertytree.cpp:97)\r\n  01.57% (16,777,216B) GCGraphBuilder::AddNode(void*, nsCycleCollectionParticipant*) (nsCycleCollector.cpp:596)\r\n  01.48% (15,841,208B) JSScript::NewScript(JSContext*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned short, unsigned short, JSVersion) (jsutil.h:210)\r\n  01.45% 
(15,504,752B) ChangeTable (pldhash.c:563)\r\n  01.44% (15,478,784B) g_mapped_file_new (in \/lib\/libglib-2.0.so.0.2400.1)\r\n  01.41% (15,167,488B) GCGraphBuilder::NoteScriptChild(unsigned int, void*) (mozalloc.h:229)\r\n  01.37% (14,680,064B) js_NewFunction(JSContext*, JSObject*, int (*)(JSContext*, unsigned int, js::Value*), unsigned int, unsigned int, JSObject*, JSAtom*) (jsgcinlines.h:127)\r\n  00.97% (10,383,040B) js::mjit::Compiler::finishThisUp(js::mjit::JITScript**) (jsutil.h:214)\r\n  00.78% (8,388,608B) js::StackSpace::init() (jscntxt.cpp:164)\r\n  00.69% (7,360,512B) pcache1Alloc (sqlite3.c:33491)\r\n  00.62% (6,601,324B) PL_DHashTableInit (pldhash.c:268)\r\n  00.59% (6,291,456B) js_NewStringCopyN(JSContext*, unsigned short const*, unsigned long) (jsgcinlines.h:127)\r\n  00.59% (6,287,516B) nsTArray_base::EnsureCapacity(unsigned int, unsigned int) (nsTArray.h:88)\r\n  00.52% (5,589,468B) gfxImageSurface::gfxImageSurface(gfxIntSize const&amp;, gfxASurface::gfxImageFormat) (gfxImageSurface.cpp:111)\r\n  00.49% (5,292,184B) js::Vector::growStorageBy(unsigned long) (jsutil.h:218)\r\n  00.49% (5,283,840B) nsHTTPCompressConv::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (nsMemory.h:68)\r\n  00.49% (5,255,168B) js::Parser::markFunArgs(JSFunctionBox*) (jsutil.h:210)\r\n  00.49% (5,221,320B) nsStringBuffer::Alloc(unsigned long) (nsSubstring.cpp:209)\r\n  00.43% (4,558,848B) _dl_map_object_from_fd (dl-load.c:1250)\r\n  00.42% (4,554,740B) nsTArray_base::EnsureCapacity(unsigned int, unsigned int) (nsTArray.h:84)\r\n  00.39% (4,194,304B) js_NewGCString(JSContext*) (jsgcinlines.h:127)\r\n  00.39% (4,194,304B) js::NewCallObject(JSContext*, js::Bindings*, JSObject&amp;, JSObject*) (jsgcinlines.h:127)\r\n  00.39% (4,194,304B) js::NewNativeClassInstance(JSContext*, js::Class*, JSObject*, JSObject*) (jsgcinlines.h:127)\r\n  00.39% (4,194,304B) JS_NewObject (jsgcinlines.h:127)\r\n  00.35% (3,770,972B) 
js::PropertyTable::init(JSRuntime*, js::Shape*) (jsutil.h:214)\r\n  00.35% (3,743,744B) NS_NewStyleContext(nsStyleContext*, nsIAtom*, nsCSSPseudoElements::Type, nsRuleNode*, nsPresContext*) (nsPresContext.h:306)\r\n  00.34% (3,621,704B) XPT_ArenaMalloc (xpt_arena.c:221)\r\n  00.31% (3,346,532B) nsCSSSelectorList::AddSelector(unsigned short) (mozalloc.h:229)\r\n  00.30% (3,227,648B) js::InitJIT(js::TraceMonitor*) (jsutil.h:210)\r\n  00.30% (3,207,168B) js::InitJIT(js::TraceMonitor*) (jsutil.h:210)\r\n  00.30% (3,166,208B) js_alloc_temp_space(void*, unsigned long) (jsatom.cpp:689)\r\n  00.28% (2,987,548B) nsCSSExpandedDataBlock::Compress(nsCSSCompressedDataBlock**, nsCSSCompressedDataBlock**) (mozalloc.h:229)\r\n  00.27% (2,883,584B) js::detail::HashTable::SetOps, js::SystemAllocPolicy&gt;::add(js::detail::HashTable::SetOps, js::SystemAllocPolicy&gt;::AddPtr&amp;, unsigned long const&amp;) (jsutil.h:210)\r\n  00.26% (2,752,512B) FT_Stream_Open (in \/usr\/lib\/libfreetype.so.6.3.22)\r\n  00.24% (2,564,096B) PresShell::AllocateFrame(nsQueryFrame::FrameIID, unsigned long) (nsPresShell.cpp:2098)\r\n  00.21% (2,236,416B) nsRecyclingAllocator::Malloc(unsigned long, int) (nsRecyclingAllocator.cpp:170)<\/pre>\n<p>Total memory usage at peak was 1,071,940,088 bytes.  
Let&#8217;s go through some of these entries one by one.<\/p>\n<pre>  34.65% (371,376,128B) _dl_map_object_from_fd (dl-load.c:1199)\r\n  21.14% (226,603,008B) pthread_create@@GLIBC_2.2.5 (allocatestack.c:483)\r\n  06.26% (67,112,960B) pa_shm_create_rw (in \/usr\/lib\/libpulsecommon-0.9.21.so)\r\n<\/pre>\n<p>These three, although the biggest single entries, can be more or less ignored;\u00a0 I explained why <a href=\"http:\/\/blog.mozilla.org\/nnethercote\/2011\/01\/07\/memory-profiling-firefox-with-massif-part-2\/\">previously<\/a>.<\/p>\n<pre>  03.10% (33,263,616B) JSC::ExecutablePool::systemAlloc() (ExecutableAllocatorPosix.cpp:43)\r\n<\/pre>\n<p>This is for code generated by JaegerMonkey.\u00a0 I know very little about JaegerMonkey&#8217;s code generation so I don&#8217;t have any good suggestions for reducing it.\u00a0 As I understand it, very little effort has been made to minimize the size of the generated code so there may well be some easy wins there.<\/p>\n<pre>  02.67% (28,618,752B) NewOrRecycledNode() (jsparse.cpp:670)\r\n<\/pre>\n<p>This is for JSParseNode, the basic type from which JS parse trees are constructed.\u00a0 <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=626932\">Bug 626932<\/a> is open to shrink JSParseNode;\u00a0 there are a couple of good ideas but not much progress has been made.\u00a0 I hope to do more here but probably not in time for Firefox 4.0.<\/p>\n<pre>  01.90% (20,414,464B) js::PropertyTree::newShape() (jspropertytree.cpp:97)\r\n<\/pre>\n<p>Shapes are a structure used to speed up JS property accesses.\u00a0 Increasing the MAX_HEIGHT constant from 64 to 128 (which reduces the number of JS objects that are converted to &#8220;dictionary mode&#8221;, and thus the number of Shapes that are allocated) may reduce this by 3 or 4 MB with negligible speed cost.\u00a0 I opened <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=630456\">bug 630456<\/a> for this.<\/p>\n<pre>  01.57% (16,777,216B) GCGraphBuilder::AddNode() 
(nsCycleCollector.cpp:596)\r\n\u00a0 01.41% (15,167,488B) GCGraphBuilder::NoteScriptChild() (mozalloc.h:229)\r\n<\/pre>\n<p>This is the cycle collector.\u00a0 I know almost nothing about it, but I see it allocates 32,768 PtrInfo structs at a time.\u00a0 I wonder if that strategy could be improved.<\/p>\n<pre>  01.48% (15,841,208B) JSScript::NewScript() (jsutil.h:210)\r\n  01.37% (14,680,064B) js_NewFunction() (jsgcinlines.h:127)\r\n<\/pre>\n<p>Each JS function has a JSFunction associated with it, and each JSFunction has a JSScript associated with it.\u00a0 Each of them stores various bits of information about the function.\u00a0 I don&#8217;t have any good ideas for how to shrink these structures.\u00a0 Both of them are reasonably large, with lots of fields.<\/p>\n<pre>  00.97% (10,383,040B) js::mjit::Compiler::finishThisUp() (jsutil.h:214)\r\n<\/pre>\n<p>Each function compiled by JaegerMonkey also has some additional information associated with it, including all the inline caches.\u00a0 This is allocated here.\u00a0 <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=611400\">Some<\/a> <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=619622\">good<\/a> <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=619849\">progress<\/a> has already been made here, and I have <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=629601\">some<\/a> <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=630445\">more<\/a> <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=630447\">ideas<\/a> for getting it down a bit further.<\/p>\n<pre>  00.59% (6,291,456B) js_NewStringCopyN() (jsgcinlines.h:127)\r\n  00.49% (5,292,184B) js::Vector::growStorageBy() (jsutil.h:218)\r\n<\/pre>\n<p>These entries are for space used during JS scanning (a.k.a. lexing, tokenizing).\u00a0 Identifiers and strings get <em>atomized<\/em>, i.e. 
put into a table so there&#8217;s a single copy of each one.\u00a0 Take an identifier as an example.\u00a0 It starts off stored in a buffer of characters.\u00a0 It gets scanned and copied into a js::Vector, with any escaped chars being converted along the way.\u00a0 Then the copy in the js::Vector is atomized, which involves copying it again into a malloc&#8217;d buffer of just the right size.\u00a0 I thought about avoiding this copying in <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=588648\">bug 588648<\/a>, but it turned out to be difficult.\u00a0 (I did manage to remove <em>another<\/em> extra copy of every character, though!)<\/p>\n<p>In summary, there is definitely room for more improvement.\u00a0 I hope to get a few more space optimizations in before Firefox 4.0 is released, but there&#8217;ll be plenty of other work to do afterwards.\u00a0 If anyone can see other easy wins for the entries above, I&#8217;d love to hear about them.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Rob Sayre suggested TechCrunch to me as a good stress test for Firefox&#8217;s memory usage: {sayrer} take a look at techcrunch.com if you have a chance. that one is brutal. So I measured space usage with Massif for a single TechCrunch tab, on 64-bit Linux. 
Here&#8217;s the high-level result: 34.65% (371,376,128B) _dl_map_object_from_fd (dl-load.c:1199) 21.14% (226,603,008B) [&hellip;]<\/p>\n","protected":false},"author":139,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/508"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/comments?post=508"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/508\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/media?parent=508"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/categories?post=508"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/tags?post=508"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}