{"id":31,"date":"2011-10-17T15:10:50","date_gmt":"2011-10-17T15:10:50","guid":{"rendered":"http:\/\/blog.mozilla.org\/nfroyd\/?p=31"},"modified":"2011-10-17T15:14:58","modified_gmt":"2011-10-17T15:14:58","slug":"reordering-plugins-and-edge-profiles","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nfroyd\/2011\/10\/17\/reordering-plugins-and-edge-profiles\/","title":{"rendered":"reordering plugins and edge profiles"},"content":{"rendered":"<p>Mike Hommey and I have been working with <a href=\"http:\/\/gcc.gnu.org\/ml\/gcc-patches\/2011-09\/msg01440.html\">a linker plugin that uses the callgraph to reorder functions<\/a> on Linux and Android. One feature of the linker plugin is that it dumps a simplified representation of the callgraph to a text file. The representation really is simple: just edge counts for all the edges in the callgraph. It&#8217;s not perfect, since the ordering of caller and callee in the dump file is not always consistent. Also, you get a few spurious entries in there, like <tt>__builtin_expect<\/tt> and SSE builtins calling things.\u00a0 (<tt>__builtin_expect<\/tt> just provides a hint to the compiler about how to order basic blocks for more cache-friendly control flow; SSE builtins should compile down to one or a few instructions and never actually call anything at runtime.)<\/p>\n<p>Nevertheless, looking at <a href=\"http:\/\/people.mozilla.org\/~nfroyd\/final_layout_libxul.txt\">the file for libxul.so<\/a> can be illuminating. Actually, it&#8217;s probably more illuminating to narrow things down to <a href=\"http:\/\/people.mozilla.org\/~nfroyd\/libxul_top100k_calls.txt\">the edges with 100k+ call counts and demangled function names<\/a>. 
Doing that, we can see that:<\/p>\n<ul>\n<li>We call <tt>fgets<\/tt> 115k times.\u00a0 I believe this is <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/extensions\/spellcheck\/hunspell\/src\/filemgr.cpp#69\">solely for the spellchecker&#8217;s data<\/a>.\u00a0 At least we&#8217;re not using <tt>gets<\/tt>.<\/li>\n<li>We call <tt>floor<\/tt> over a million times.\u00a0 The only caller here is <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/gfx\/src\/nsRect.h#226\"><tt>nsRect::ToNearestPixels<\/tt><\/a>, which winds up calling <tt>floor<\/tt> through the magic of inlining.<\/li>\n<li>We call the related functions <tt>floorf<\/tt> and <tt>ceilf<\/tt> over 250k times.\u00a0 The caller here is <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/gfx\/src\/nsRect.h#232\"><tt>nsRect::ScaleToOutsidePixels<\/tt><\/a>, again through the magic of inlining.<\/li>\n<li>We have a specialized <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/intl\/uconv\/src\/nsUTF8ToUnicodeSSE2.cpp#47\"><tt>mozilla::SSE::Convert_ascii_run<\/tt><\/a> function; this function gets called 366 times by <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/intl\/uconv\/src\/nsUTF8ToUnicode.cpp#57\"><tt>nsUTF8ToUnicode::Convert<\/tt><\/a>.\u00a0 It&#8217;s apparently doing a lot of work though; the plugin records that it makes over a million calls to various SSE builtins.<\/li>\n<li><a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/xpcom\/glue\/pldhash.cpp#605\"><tt>PL_DHashTableOperate<\/tt><\/a> is a pretty busy function with several million calls to it.\u00a0 Big users are <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/xpcom\/base\/nsCycleCollector.cpp#1699\"><tt>GCGraphBuilder::NoteXPCOMChild<\/tt><\/a> (167k), <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/xpcom\/ds\/nsStaticNameTable.cpp#216\"><tt>nsStaticCaseInsensitiveNameTable::Lookup<\/tt><\/a> (100k), <a 
href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/layout\/style\/nsCSSRuleProcessor.cpp#632\"><tt>RuleHash::EnumerateAllRules<\/tt><\/a> (108k), <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/layout\/base\/nsPresArena.cpp#448\"><tt>nsPresArena::AllocateBySize<\/tt><\/a> (129k) and <a href=\"http:\/\/mxr.mozilla.org\/mozilla-central\/source\/layout\/base\/FramePropertyTable.cpp#92\"><tt>FramePropertyTable::Get<\/tt><\/a> (315k).<\/li>\n<li>We, um, do a lot of string operations.\u00a0 I won&#8217;t enumerate them all here.<\/li>\n<li>I\/O operations get called quite a bit; the workload profiled here is the same one PGO builds run to collect profile feedback.<\/li>\n<\/ul>\n<p>Anyway, maybe people have already seen all this stuff, but I certainly hadn&#8217;t.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Mike Hommey and I have been working with a linker plugin that uses the callgraph to reorder functions on Linux and Android. One feature of the linker plugin is that it dumps a simplified representation of the callgraph to a text file. 
The simplified representation is really simple, just edge counts for all the edges [&hellip;]<\/p>\n","protected":false},"author":320,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/posts\/31"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/users\/320"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/comments?post=31"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/posts\/31\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/media?parent=31"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/categories?post=31"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/tags?post=31"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}