You lose more when slow than you gain when fast

By Nicholas Nethercote
May 31, 2011

SpiderMonkey is Firefox’s JavaScript engine. In Firefox 3.0 and earlier versions, it was just an interpreter. In Firefox 3.5, a tracing JIT called TraceMonkey was added. TraceMonkey was able to massively speed up certain parts of programs, such as tight loops; parts of programs it couldn’t speed up continued to be interpreted. TraceMonkey provided large speed improvements, but JavaScript performance overall still didn’t compare that well against that of Safari and Chrome, both of which used method JITs that had worse best-case performance than TraceMonkey, but much better worst-case performance.

If you look at the numbers, this isn’t so surprising. If you’re 10x faster than the competition on half the cases, and 10x slower on half the cases, you end up being 5.05x slower overall. Firefox 4.0 added a method JIT, JaegerMonkey, which avoided those really slow cases, and Firefox’s JavaScript performance is now very competitive with other browsers.
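To make the arithmetic concrete, here is a minimal Python sketch. It assumes two equally weighted cases that each take one time unit on the competing browser; the function name `relative_total_time` is made up for illustration.

```python
def relative_total_time(speedups):
    """Return our total time divided by the baseline's total time.

    Each entry in `speedups` is how many times faster we are than the
    baseline on one case (a value below 1 means we are slower). The
    baseline is assumed to take 1 time unit per case.
    """
    our_times = [1.0 / s for s in speedups]   # time we take on each case
    baseline_total = float(len(speedups))     # 1 unit per case
    return sum(our_times) / baseline_total


# 10x faster on one case, 10x slower on the other.
ratio = relative_total_time([10.0, 0.1])
print(f"{ratio:.2f}x the baseline's total time")
```

The fast case saves 0.9 time units while the slow case costs 9, so the slow case dominates the total and the ratio comes out at 5.05.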

The take-away message: you lose more when slow than you gain when fast. Your performance is determined primarily by your slowest operations. This is true for two reasons. First, in software you can easily get such large differences in performance: 10x, 100x, 1000x and more aren’t uncommon. Second, in complex programs like a web browser, overall performance (i.e. what a user feels when browsing day-to-day) is determined by a huge range of different operations, some of which will be relatively fast and some of which will be relatively slow.

Once you realize this, you start to look for the really slow cases. You know, the ones where the browser slows to a crawl and the user starts cursing and clicking wildly and holy crap if this happens again I’m switching to another browser. Those are the performance effects that most users care about, not whether their browser is 2x slower on some benchmark. When they say “it’s really fast”, they probably actually mean “it’s never really slow”.

That’s why memory leaks are so bad — because they lead to paging, which utterly destroys performance, probably more than anything else.

It also makes me think that the single most important metric when considering browser performance is page fault counts. Hmm, I think it’s time to look again at Julian Seward’s VM profiler and the Linux perf tools.

18 replies on “You lose more when slow than you gain when fast”

This is also why using geometric mean for computing overall benchmark scores is bad: it puts just as much weight on making something already fast even faster as it does on making slow things fast.
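As a rough illustration of this point, the sketch below compares the two means on the 10x faster / 10x slower example from the post. The per-test ratios are invented numbers, and `statistics.geometric_mean` needs Python 3.8 or later.

```python
from statistics import geometric_mean, mean

# Per-test time ratios (our time / competitor's time); below 1 is a win.
ratios = [0.1, 10.0]   # 10x faster on one test, 10x slower on the other

geo = geometric_mean(ratios)    # the two results cancel each other out
arith = mean(ratios)            # the slow test dominates the score

print(f"geometric mean:  {geo:.2f}")
print(f"arithmetic mean: {arith:.2f}")
```

The geometric mean comes out at about 1.00, as if the two browsers were equal, while the arithmetic mean of the time ratios is 5.05, which matches the total time a user would actually wait.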

Are memory leaks still a problem? If not, how can we show/prove to users that it isn’t a problem? Lots of people I know, technical people, engineers, still consider Firefox to leak lots of memory and I never have a good response to them.

Gen: yes, they’re still a problem. See here for a list of leaks and leak-like reports against Firefox 4 betas and onwards. I think the best response is “we’re working on it”. Firefox 5 is definitely better than Firefox 4. The rapid release cycle should help too, as new big leaks are less likely to be introduced when features are added gradually instead of in one big hit every year or so.

For the most part, my Firefox(es) work fine until paging issues are involved (I guess I need more RAM). However, for some reason, if I go away from Firefox, leave it open, and then come back to it 8+ hours later (or so) when I click or scroll something in Firefox it will freeze up (i.e. Windows will say ‘Not responding’), sometimes for a minute, sometimes longer. If I check the Task Manager, the CPU looks idle and there doesn’t seem to be any unusual page swapping or I/O. It seems to happen more when I have more tabs open, but the correlation isn’t perfect.

How can I find out what Firefox is doing when it’s frozen like this so that I can write up a useful bug report? And given I have only slow means to replicate it (i.e. wait 8+ hours and hope this is one of those occasions it will freeze) — how do I replicate it faster?

voracity: I don’t have good answers to your questions. That kind of problem is a nightmare to debug 🙁

I fully agree.

When I was doing perf tests of some software (or hardware) I usually used the 95th percentile of the value being measured as the metric to follow for improvement.
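A minimal sketch of why a high percentile is a useful metric here, assuming the nearest-rank definition of a percentile and a non-empty sample list. The timings and the helper name `percentile_nearest_rank` are hypothetical.

```python
from statistics import mean, median


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (integer p in 1..100) by nearest rank."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100   # ceil(p * n / 100) in integers
    return ordered[rank - 1]


# Hypothetical timings in seconds: 18 fast runs and 2 slow ones.
timings = [0.2] * 18 + [5.0] * 2

p50 = median(timings)
p95 = percentile_nearest_rank(timings, 95)
print(f"median: {p50}s, mean: {mean(timings):.2f}s, 95th percentile: {p95}s")
```

With these numbers the median stays at 0.2 s and hides the slow runs entirely, while the 95th percentile is 5.0 s, which is the value a user is likely to remember.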

I specifically spent a lot of time improving set top box zapping time and I do know that the users only remember the “slow” zappings if they happen more than a few times during the evening.

Ben

You are right. It happens really often for me that FF just freezes completely because of I/O (maybe paging related) when the whole system is under VM pressure. (It happens more often under Windows, although my Windows system is far more powerful, with both more memory and faster swap.)

Mozilla needs good Talos tests for these cases. I was also wondering why the great work you have done on memory reduction so far is not reflected on areweslimyet.com at all?

This definitely needs better performance tracking tools.

Arpad: the Talos tests on areweslimyet.com aren’t very good, there’s heaps of problems they can miss. See https://wiki.mozilla.org/Performance/MemShrink (which I just rewrote this morning) for some more thoughts about this.

@voracity: having a stack trace with symbols would help; see https://developer.mozilla.org/en/how_to_get_a_stacktrace_with_windbg

hej, this topic was covered at the falsyvalues.com conference in Warsaw, Poland, very interesting indeed.

I understand that you would like to see more work on speeding up those slow parts rather than constantly improving those that are already fast.

I think they should focus more on DOM/CSS operations. On the other hand, JS is gaining popularity as a language outside of browsers, so I understand the effort to speed it up.

anyway nice post;]

I do not think I could agree more aggressively than *absofreakinglutely*.

Nicholas your posts of late make that much sense, I nominate you for lead engineer and Mitchell Baker’s position all in one.

It seems you are finally breaking down the barriers of semantics that exist between real-world users and mozilla-world developers and policy makers.

I can concur with your theory because it happens to me daily. I run a relatively old system at home. It has just 1GB of DDR memory which is shared with onboard graphics. No SSD, only SATA II HDD support but it does have gigabit ethernet. Most of the time Firefox takes up ever more of my memory until it is using well over 500MB (more than half) and often even higher.

Despite this, my system usually runs fine and I’ve plenty of extensions and tabs open. The *only* time my system and Firefox feel slow is when memory leaks seem to be eating up memory and page faults are sky high.

If Firefox cannot take advantage of some instruction sets on my 6 year old CPU, cannot run without slowdowns caused by memory leaks to the point of the window freezing, page file thrashing and the browser almost crashing (even when there’s still one or two hundred MB of RAM that Firefox is yet to eat), then Firefox is really not coded all that well.

I love where you are going with your thinking Nicholas. It is highly overdue that Mozilla has people like yourself looking at Firefox performance from an objective and ongoing viewpoint. You do not seem to be sucked in by the sort of hype such as that which occurred with JaegerMonkey and so forth.

Please keep fighting hard investigating where real performance improvements might come from in Firefox. After all with the major-release-every-time-someone-coughs decision, it’s going to look very embarrassing when Firefox hits ‘version 10’ early next year and it is still thrashing hard drives, paging out things like simple frecency calls to the places database when a user is trying to click-load a favourite website from the awesomebar, as just happened to me (took about 5 seconds or more).

FWIW, could you add the issue of Add-On memory leaks to your agenda? Add-On developers are being named and shamed for startup impact but not memory impact yet. Sounds silly to me.

Also xtalos which can count page faults too: https://wiki.mozilla.org/Auto-tools/Projects/XTalos

I don’t know much about math so I’m hoping someone can explain this to me: “If you’re 10x faster than the competition on half the cases, and 10x slower on half the cases, you end up being 5.05x slower overall.”

Asa: imagine you have two operations you care about. On browser A they both take 1 second. On browser B, the first is 10x faster and so takes 0.1s, but the second is 10x slower and so takes 10s. Total time for both operations is 2 seconds on browser A and 10.1 seconds on browser B. 10.1 / 2 = 5.05. QED 🙂

Asa: and that example is simplified, of course. But we saw it with SunSpider a lot; if you suck really badly at one or two benchmarks compared to the competition you don’t have a chance overall. And for normal users, the time lost when Firefox grinds to a halt for 30 seconds is likely to exceed any wins from faster basic operations (JavaScript, DOM, whatever) in most cases.

Would it make sense for us to keep metrics on the browser’s avg memory usage, and also on page faults, and if we’re both hitting lots of page faults and above avg size, pop up a window suggesting that the user restart their session to get better performance?

Jason, I suspect that would send a signal that we’re not even going to try to fix the memory issues, just provide hacky workarounds. I think the effort would be better spent trying to fix leaks.

njn, thanks for the math help. makes good sense.
