This is not the security blog

Planet Mozilla’s been a little mixed up for the past few days, claiming that I was the author of posts on the Mozilla Security Blog. The good news is that this problem appears to have been fixed, thanks to Mike Hoye.

However, it’s likely that very few people saw the post I made a few days ago about the new per-class measurements in about:memory. So please take a look if you’re interested in that sort of thing. Thanks.

Internet Banking Fail

My bank’s online banking service is generally very good.  Having said that, I got this today.

"Sorry we're unable to retrieve your Interest Statement details right now. Please try again between 7AM-9PM Mon-Fri (AEST/AEDT), excludes public holidays."

Sigh.

Bleg for a new machine: outcome

Recently I blegged (here and here) for help in designing a new machine.  My goals:  fast browser and JS shell builds, quietness, and a setup that wasn’t too complicated.  I now have the new machine and have done some comparisons to the old machine.

New vs old

The most important components of the new machine are:  an Intel i7-4770 CPU (I’m using the integrated graphics), 32 GiB of RAM, a 512 GB Samsung 840 Pro SSD, and a Fractal Design Define R4 case.

In comparison, the equivalent components in the old machine were: an Intel i7-2600 CPU, 16 GiB of RAM, a magnetic hard disk, and an Antec Sonata III 500 case.

A basic comparison

The new machine is definitely faster.  Compile times are about 1.5x faster;  I can do a debug browser build with clang in under 13 minutes, and one with GCC in under 17 minutes.  (I hadn’t realized that clang was so much faster than GCC.)

Furthermore, disk-intensive operations are massively faster.  Just as importantly, disk-intensive operations vary in speed much less.  With a magnetic disk, if you’re doing something where the data is already in the disk cache, it’ll be pretty fast;  otherwise it’ll be horribly slow.  The SSD doesn’t suffer that dichotomy.

Finally, the new case, while not silent, is certainly quieter… maybe half as loud as the old one.  It’s also bigger than I expected — it’s 1–2 inches bigger in every dimension than the old one. There must be a lot of empty space inside.  And although it has a pleasingly minimalist aesthetic — it’s about as plain a black box as you could imagine — it does have an obnoxiously bright, blue power indicator light at the top of the front panel, which I quickly covered with a small strip of black electrical tape.

A detailed performance comparison

Building and testing

All builds are 64-bit debug builds.  I used clang 3.2-1~exp9ubuntu1 and gcc-4.7.real (Ubuntu/Linaro 4.7.3-1ubuntu1) for compilation. I measured each operation only once, and the old machine in particular would vary in its times due to the magnetic disk.  So don’t treat individual measurements as gospel.  In all cases I give the old machine’s time first.

  • Browser clobber build (clang): 19.7 minutes vs 12.7 minutes (1.56x faster).  I didn’t measure a GCC browser build on the old machine, but on the new machine it was 16.8 minutes (1.32x slower than clang).
  • Browser no-change build (clang): 48 seconds vs 31 seconds (1.55x faster).
  • Browser clobber build, with ccache, with an empty cache (clang): 23.3 minutes vs 14.8 minutes (1.57x faster).  These are 1.18x slower and 1.17x slower than the corresponding non-ccache builds.
  • Browser clobber build, with ccache, with a full cache (clang): 6.2 minutes vs 2.6 minutes (2.4x faster).  These are 3.18x faster and 4.89x faster than the corresponding non-ccache builds.  Here the effect of the SSD becomes clear — the new machine gets a much bigger benefit from ccache.
  • Two concurrent browser builds (clang): 45.9 & 45.4 minutes vs 22.5 & 22.5 minutes (2.03x faster).  Interestingly, the amortized per-build time on the old machine (22.9 minutes) was 1.16x slower than a single build, but the amortized per-build time on the new machine (11.3 minutes) was 1.12x faster than a single build.  The new machine, despite having the same number of cores, clearly provides more parallelism, and a single browser build doesn’t take full advantage of that parallelism.
  • JS shell everything-but-ICU build (clang): 59 seconds vs 42 seconds (1.4x faster).  It’s worth noting that JS shell builds spend a higher proportion of their time doing C++ compilation than browser builds.
  • JS shell everything-but-ICU build (GCC): 130 seconds vs 81 seconds (1.60x faster).  These are 2.20x slower and 1.93x slower than the corresponding clang builds!
  • JS jit tests (compiled with clang): 179 seconds vs 137 seconds (1.31x faster).  These tests are much more CPU-bound and less disk-bound than compilation, so the smaller speed up isn’t surprising.
  • SunSpider: 156 ms vs 127 ms (1.23x faster).  Again, CPU is the main factor.
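The ratios in the list above can be recomputed with a short script. This is only a sketch: the timings are copied from the list, while the names `timings`, `speedup` and `amortized_per_build` are made up for illustration. Because the published times are already rounded, a recomputed ratio can differ from the quoted one in the last digit. The amortized time for concurrent builds is taken here as the wall-clock time of the slowest build divided by the number of builds, which is an assumption about how those figures were derived.

```python
# Timings copied from the list above as (old machine, new machine).
# Each pair uses a single unit (minutes or seconds), so the ratio is unit-free.
timings = {
    "browser clobber build (clang), minutes": (19.7, 12.7),
    "browser no-change build (clang), seconds": (48, 31),
    "browser clobber build, ccache, empty cache, minutes": (23.3, 14.8),
    "browser clobber build, ccache, full cache, minutes": (6.2, 2.6),
    "JS shell build (clang), seconds": (59, 42),
    "JS shell build (GCC), seconds": (130, 81),
    "JS jit tests, seconds": (179, 137),
}


def speedup(old, new):
    """Return how many times faster the new time is than the old time."""
    return old / new


def amortized_per_build(wall_clock_times):
    """Assumed definition: slowest concurrent build divided by build count."""
    return max(wall_clock_times) / len(wall_clock_times)


speedups = {name: speedup(old, new) for name, (old, new) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x faster")

# Two concurrent clang browser builds, in minutes, from the list above.
old_amortized = amortized_per_build([45.9, 45.4])
new_amortized = amortized_per_build([22.5, 22.5])
print(f"amortized per-build time: {old_amortized:.2f} vs {new_amortized:.2f} minutes")
```

Keeping the raw timings in one table makes it easy to add a row for another machine or compiler and recompute every ratio consistently.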

Next, here are the times for some disk-intensive operations.  The results here, especially for the old machine, could be highly variable.

  • Delete a build directory: 10.5 seconds vs 1.4 seconds (7.5x faster).
  • Do a local clone of mozilla-inbound: 7.7 minutes vs 10 seconds (46x faster).
  • Recursive grep of .cpp/.h/.idl files in a repository, first time: 53.2 seconds vs 0.8 seconds (67x faster).
  • The same operation, immediately again: 0.2 seconds vs 0.2 seconds (same speed).

Those last two comparisons really drive home the impact of the SSD, and the reduction in variability it provides. It’s hard to describe how pleasing this is.  On the old machine I always knew when libxul.so was linking, because my whole machine would grind to a halt and trivial operations like saving a file in vim would take multiple seconds.  I don’t have that any more!
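The first-run versus second-run difference in the grep measurements can be reproduced with a small timing harness. The following is a minimal sketch, not the method used for the numbers above: `time_command` and `first_and_repeat` are hypothetical helper names, and the command shown is a placeholder to be replaced with the operation of interest. The first run only counts as a cold run if the data is not already in the operating system's disk cache, for example just after a reboot.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run the command once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def first_and_repeat(argv):
    """Time the same command twice in a row: the first run, then a repeat."""
    first = time_command(argv)
    repeat = time_command(argv)
    return first, repeat


if __name__ == "__main__":
    # Placeholder command: substitute a recursive grep or a repository clone.
    command = [sys.executable, "-c", "pass"]
    first, repeat = first_and_repeat(command)
    print(f"first run: {first:.2f} s, immediate repeat: {repeat:.2f} s")
```

A single pair of measurements is noisy, especially on a magnetic disk, so repeating the whole experiment a few times gives a better picture of the variability.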

And this is relevant to ccache, too.  I tried ccache again recently on my old machine, and while it did speed up compilations somewhat, the extra load on the disk noticeably affected everything else — I had even more of those unpredictable pauses when doing anything other than building.  This was annoying enough that I disabled it.  But ccache should be much more attractive on the new machine.  I will try it again soon, once I’ve had the new machine long enough that I will be well-attuned to its performance.

Conclusion

The CPU is a decent improvement over the old one.  It accounts for roughly half the improvement in build times.

The SSD is a fantastic improvement over the old one.  It too accounts for roughly half the improvement in build times, and it also makes disk-intensive operations much faster.  Its performance is also much less variable and thus more predictable.

clang is up to 2x faster than GCC!  This surprised me greatly.  I’d be interested to hear if others have seen such a large difference.

Thanks again to everybody who helped me design the new machine.  It’s been well worth the effort!

Bleg for a new machine (part 2)

Last week I blegged for help in designing a new machine, and I got almost 50 extremely helpful comments and a handful of private emails.  Many thanks to all those who gave advice.

I mentioned that I want browser and JS shell builds to be fast, and that I want the machine to be quiet.  There were two other things that I didn’t mention, that affect my choices.

  • I’m not a hardware tinkerer type.  I don’t particularly enjoy setting up machines — I’m a programmer, not a sysadmin :)  I like vanilla configurations, so that problems are unlikely, and so that when they do occur there’s a good chance someone else has already had the same problem and found a solution.  So that’s a significant factor in my design.
  • I turn off my machine at night. And I use lots of repository clones (I have 10 copies of inbound present at all times), typically switching between two or three of them in one session.  So I stress the disk cache in ways that other people might not.

Here’s my latest configuration.  I don’t expect anything other than perhaps minor changes to this, though I’d still love to hear your thoughts.

  • CPU.  The Intel i7-4770.  I originally chose the i7-4770K, which is 0.1 GHz faster and is overclockable, but it lacks some of the newer CPU features, such as VT-d virtualization support and transactional memory (TSX).  Since I won’t overclock (as I said, I’m not the tinkerer type), several people suggested the i7-4770 would be better.
  • Motherboard. ASUS Z87-Plus.  I originally chose the ASUS Z87-C, but was advised that a board with an Intel NIC would be better.
  • Memory. 32 GiB of Kingston 2133 MHz RAM.  No change.
  • Disk. Samsung 840 Pro Series 512 GB.  No change. Multiple people said this was overkill — that 256 GB should be enough, or that the cheaper 840 EVO was almost as good.  But I’ll stick with it;  those disks have a really good reputation, it should last a long time, and I really like the idea of not having to worry about disk space, especially with two OSes installed. And apparently the performance of those drives diminishes once they get about 80% full, so having some excess capacity sounds good.
  • Graphics card.  Multiple people agreed that the Intel integrated graphics was powerful enough, and that the Intel driver situation on Linux is excellent, which is great — I don’t like mucking about with drivers!
  • Case. The Fractal Design Define R4 (Black) was recommended by two people.  It looks fantastic (my wife is in love with it already) and is reputedly very quiet.
  • Optical drive.  A Samsung DVD-RW drive. Unchanged.
  • Software. Several people suggested using VirtualBox instead of VMware for my Windows VM.  I didn’t know about VirtualBox, so that was a good tip.  Someone also suggested I get Windows 7 Professional instead of Home Premium because the latter only supports 16 GiB of RAM.  Ugh, typical Microsoft segmented software offerings.
  • I didn’t mention monitor, keyboard and mouse because I’m happy with my current ones.

This looks like an excellent set-up for a single-CPU, quad-core machine.  However, multiple people suggested that I go for more cores, either by choosing 6-core or 8-core server CPUs, or using dual-sockets, or both.  I spent a lot of time investigating this option, and I considered several configurations, including a dual-socket machine with two Xeon E5-2630 CPUs (giving 12 cores and 24 threads) or a single-socket machine with an i7-3970X (giving 6 cores and 12 threads) or a Xeon E5-2660 (giving 8 cores and 16 threads).  But I have a mélange of concerns: (a) a more complex configuration (esp. dual-socket), (b) lack of integrated graphics, (c) higher power consumption, heat and noise, and (d) probably worse single-threaded performance.  These were enough that I have put it into the too-hard basket for now.

Ideally, I’d love to build two or three machines, benchmark them, and give all but one back.  Or, it would be nice if Intel’s rumoured Haswell-E 8-core consumer machines were available now.

Still, daydreams aside, compared to my current machine, the above machine should give a nice speed bump (maybe 15–20% for CPU-bound operations, and who-knows-how-much for disk-bound operations), should be quieter, and will allow me to do Windows builds much more easily.

Thanks again to everyone who gave such good advice!  I promise that once I purchase and set up the new machine, I’ll blog about its performance compared to the old machine, so that any other Mozilla developers who want to get a new machine have some data points.

Bleg for a new machine

I last upgraded my desktop machine in June 2011.  This improved Firefox clobber build times from ~25 minutes to ~12 minutes, which was a huge productivity win.  Since then, build times have crept back up to, yep, ~25 minutes.  Time for a new machine.

I’m no hardware guru, so I’m asking for help.  I’m a Linux desktop user, and my main goal is to make compilation of Firefox fast.  I frequently build both Firefox and the JS shell, and it’s not unusual for me to have two (and occasionally three) builds running concurrently.  I’d also like for the machine to be quiet, as it will sit on top of my desk, not far from where I sit.

I have a quote for a machine from a company called CPL, who I’ve bought from before.  (Note that all prices on their website are in AUD.)  I gave them partial guidance on components (based on some very helpful advice from Naveed Ihsanullah), and here’s what they came up with.

  • CPU.  The Intel i7-4770K.  It is a 3.5 GHz CPU with 4 physical cores, 8 virtual cores, and an 8 MiB cache.  This is the fastest of the new Haswell CPUs that CPL has and so seems like a good choice.
  • Motherboard.  ASUS Z87-C.  Naveed said ASUS is a good brand.
  • Memory. 32 GiB of Kingston 2133 MHz RAM.  I currently have 16 GiB, and while I don’t feel like that limits me at the moment, I might as well give myself some breathing room.
  • Disk. Samsung 840 Pro Series 512 GB.  Naveed recommended this specific model.  I currently have a magnetic disk, so I’m hoping to see some big performance improvements here (both while compiling and when doing things like grepping through the whole tree).  It’s interesting that this is easily the most expensive component;  SSDs still ain’t cheap.
  • Graphics card.  I currently do very little graphics-intensive stuff, other than look at the occasional WebGL demo.  Apparently the i7-4770K has a reasonably fast integrated graphics card, so I’m on the fence about whether I should get a separate graphics card.  If so, I guess it would be good to know if the driver situation on Linux is a happy one.
  • Case.  The Zalman MS800 Plus ATX Mid Tower.  This was entirely CPL’s suggestion, and I don’t much like the look of it.  It’s tall — about 10cm taller than my current case — and it also has an air-vent on top, whereas my current case doesn’t and so I’m able to put a printer on top of it.  On the plus side, it is pretty plain looking — no glowing lights or racing stripes — which I appreciate.
  • Optical drive.  A bog-standard Samsung DVD-RW drive.  I think I have the exact same one in my current machine, and it works fine.  I pretty much only use it to install OSes.
  • Software.  I use Ubuntu, and I want to install Windows 7 in a VM for the occasional times when I have some Windows-specific behaviour that I need to investigate.  (My current machine dual boots between Linux and Windows, and having to reboot is an enormous hassle.)  CPL has Windows 7 Home Premium 64-bit (I’m pretty sure I don’t want to deal with Windows 8), but they don’t sell VMware, so I’ll need to get that somewhere else.

Overall, I feel pretty good about everything except the graphics card and the case.  Please let me know if you have any suggestions!  Thanks.

Mozilla Research is on a roll

I was struck by the following slide from Brendan’s Mozilla-at-15 post.

Current Mozilla Research projects

Hot damn, those are some impressive projects.

Don’t make me guess who you are

I read Planet Mozilla through Google Reader.  When I see a post with a title that sounds interesting, I open it in a new tab.  If I’m lucky, I’ll get a blog that looks like this:

blog header

It’s totally clear whose blog this is.  This is good.  Bonus points for a subtitle that gives a good idea of what the blog is about.

But all too often the author’s name isn’t prominent.  Maybe it’s down the bottom of the post:  “Posted by J. Programmer.”  Sometimes it’s tucked away on a separate “about” page.  Sometimes people use a nickname that is meaningless to me.  And sometimes there is no name at all.

It’s even worse when this lack of identification is combined with one of the common WordPress templates — then I can’t even tell if I’m reading the same blog as the last one I saw that had that same theme.

So:  put your name somewhere prominent on your blog.  Preferably in the title or subtitle.  Please don’t make me guess who you are.

This blog’s URL has changed

This blog recently moved from blog.mozilla.com/nnethercote to blog.mozilla.org/nnethercote.  This move broke the captcha I use for comments for a couple of days.  The breakage has now been fixed; sorry for any inconvenience!

Testing the top 100 add-ons for memory leaks

Yesterday I mentioned that the plan for testing the top 100 add-ons for memory leaks had hit a snag — our list is that of the top 100 installed add-ons, rather than the top 100 enabled add-ons, and the latter is what we really want.

However, after some thought, I’ve concluded that there is likely to be a lot of overlap between the two lists.  For example, I’d be surprised if any of the top 50 enabled add-ons are not in the top 100 installed add-ons list.  Furthermore, this kind of testing will never give perfect coverage anyway.

Therefore, there’s little point in delaying the testing while we wait for a better top 100 list.  If you are interested in helping, please jump in.  And if you contact me privately I’ll be able to suggest some add-ons that are towards the more popular end of the top 100 list.  Thanks, and apologies for any confusion I have caused!

MemShrink progress, week 20

Surprise of the week

[Update: This analysis about livemarks may be wrong.  Talos results from early September show that the MaxHeaps number increased, and the reduction when the “Latest Headlines” livemark was removed has undone that increase.  So livemarks may not be responsible at all; it could just be a coincidence.  More investigation is needed!]

Jeff Muizelaar removed the “Latest Headlines” live bookmark from new profiles.  This was in the Bookmarks Toolbar, and so hasn’t been visible since Firefox 4.0, and most people don’t use it.  And it uses a non-zero amount of memory and CPU.  Just how non-zero was unclear until Marco Bonardo noticed some big performance improvements.  First, in the Talos “MaxHeap” results on WinNT5.2:

Talos MaxHeap graph

And secondly in the Talos “Allocs” results on WinNT5.2 and Mac10.5.2:

Talos Allocs graph

In the WinNT5.2 cases, it looks like we had a bi-modal distribution previously, and this patch changed things so that the higher of the two cases never occurred.  In the Mac10.5.2 case we just had a simple reduction in the number of allocations.  On Linux the results were less conclusive, but there may have been a similar if smaller effect.

This surprised me greatly.  I’ve done a lot of memory profiling of Firefox and never seen anything that pointed to the feed reader as using a lot of memory.  This may be because the feed reader’s usage is falling into a larger, innocuous bucket, such as JS or generic strings.  Or maybe I just missed the signs altogether.

Some conclusions and questions:

  • If you have live bookmarks in your bookmarks toolbar, remove them! [Update: I meant to say “unused live bookmarks”.]
  • We need to work out what is going on with the feed reader, and optimize its memory usage.
  • Can we disable unused live bookmarks for existing users?

Apparently nobody really owns the feed reader, because previous contributors to it have all moved on.  So I’m planning to investigate, but I don’t know the first thing about it.  Any help would be welcome!

Other stuff

There was a huge memory leak in the Video DownloadHelper add-on v4.9.5, and possibly earlier versions.  This has been fixed in v4.9.6a3 and the fix will make it into the final version v4.9.6 when it is released.  That’s one more add-on leak down;  I wonder how many more there are to go.

TraceMonkey, the trace JIT, is no longer built by default.  This means it’s no longer used, and this saves both code and data space.  The old combination of TraceMonkey and JaegerMonkey is slower than the new combination of JaegerMonkey with type inference, and TraceMonkey is also preventing various clean-ups and simplifications, so it’ll be removed entirely soon.

I refined the JS memory reporters in about:memory to give more information about objects, shapes and property tables.

I avoided creating small property tables, removed KidHashes when possible, and reduced the size of KidHashes with few entries.

I wrote about various upcoming memory optimizations in the JS engine.

Justin Lebar enabled jemalloc on MacOS 10.5 builds.  This was expected to be a space improvement, but it also reduced the time for the “Tp5 MozAfterPaint” page loading benchmark by 8%.

Robert O’Callahan avoided excessive memory usage in certain DOM animations on Windows.

Drew Willcoxon avoided excessive memory usage in context menu items created by the add-on SDK.

Bug Counts

  • P1: 35 (-1, +1)
  • P2: 116 (-2, +5)
  • P3: 55 (-2, +3)
  • Unprioritized: 5 (-4, +5)

At this week’s MemShrink meeting we only had 9 bugs to triage, which is the lowest we’ve had in a long time.  It feels like the MemShrink bug list is growing more slowly than in the past.
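The bug counts above can be tallied with a few lines of Python. This is only a sketch under an assumption: the pair in parentheses is read as (bugs closed, bugs opened) since the previous report, which the list itself does not spell out. The names `bug_counts`, `total_open` and `net_change` are made up for illustration.

```python
# Counts copied from the list above: (open now, closed delta, opened delta).
# Reading the parenthesised pair as closed/opened since last week is an assumption.
bug_counts = {
    "P1": (35, -1, +1),
    "P2": (116, -2, +5),
    "P3": (55, -2, +3),
    "Unprioritized": (5, -4, +5),
}

# Total number of open bugs across all priorities.
total_open = sum(open_now for open_now, _, _ in bug_counts.values())

# Net weekly change: the closed deltas are negative, so adding them subtracts.
net_change = sum(closed + opened for _, closed, opened in bug_counts.values())

print(f"total open bugs: {total_open}")
print(f"net change this week: {net_change:+d}")
```

Tracking the net change from week to week is one way to check the impression that the list is growing more slowly.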