TL;DR: Any single change that reduces Firefox’s memory consumption can affect Firefox’s speed, stability and reputation in a variety of ways, some of which are non-obvious. Some examples illustrate this.
The MemShrink wiki page starts with the following text.
“MemShrink is a project that aims to reduce Firefox’s memory consumption. There are three potential benefits. Speed. […] Stability. […] Reputation.”
I want to dig more deeply into these benefits and the question of what it means to “reduce Firefox’s memory consumption”, because there are some subtleties involved. In what follows I will use the term “MemShrink optimization” to refer to any change that reduces Firefox’s memory consumption.
People tend to associate low memory consumption with speed. However, time/space trade-offs abound in programming, and an explicit goal of MemShrink is to not slow Firefox down — the wiki page says:
Changes that reduce memory consumption but make Firefox slower are not desirable.
There are several ways that MemShrink optimizations can improve performance.
The case that people probably think of first is paging. If physical memory fills up and the machine needs to start paging, i.e. evicting virtual memory pages to disk, it can be catastrophic for performance. This is because disk accesses are many thousands of times slower than RAM accesses.
However, some MemShrink optimizations are far more likely to affect paging than others. The key idea here is that of the working set size — what’s important is not the total amount of physical or virtual memory being used, but the fraction of that memory that is touched frequently. For example, consider two programs that allocate and use a 1GB array. The first one touches pages within the array at random. The second one touches every page once and then touches the first page many times. The second program will obviously page much less than the first if the system’s physical memory fills up.
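The two access patterns can be written out as a toy sketch (this is illustrative Python, not Firefox code; the page size is assumed to be the typical 4 KiB):

```python
import random

PAGE_SIZE = 4096  # assume typical 4 KiB OS pages

def touch_random(buf, num_pages, touches):
    # Pattern 1: touch pages uniformly at random. Every page stays in
    # the working set, so under memory pressure almost any eviction
    # causes a page fault later.
    for _ in range(touches):
        buf[random.randrange(num_pages) * PAGE_SIZE] ^= 1

def touch_mostly_one_page(buf, num_pages, touches):
    # Pattern 2: touch every page once, then hammer the first page.
    # After the initial pass the working set is a single page, so the
    # OS can evict the rest cheaply.
    for page in range(num_pages):
        buf[page * PAGE_SIZE] ^= 1
    for _ in range(touches):
        buf[0] ^= 1
```

Both functions touch the same amount of virtual memory, but only the first keeps the whole buffer in the working set.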
The consequence of this is that a change that reduces the size of data structures that are accessed frequently is much more likely to reduce paging than a change that reduces the size of data structures that are accessed rarely. Note that this is counter-intuitive! It’s natural to want to optimize data structures that are wasteful of space, but “wasteful of space” often means “hardly touched” and so such optimizations don’t have much effect on paging.
Measuring the working set size of a complex program like a web browser is actually rather difficult, which means that gauging the impact of a change on paging is also difficult. Julian Seward’s virtual memory profiler offers one possible way. Another complication is that results vary greatly between machines. If you are running Firefox on a machine with 16GB of RAM, it’s likely that no change will affect paging, because Firefox is probably never paging in the first place. If you are on a netbook with 1GB of RAM, the story is obviously different. Also, the effects can vary between different operating systems.
Some MemShrink optimizations can also reduce cache pressure. For example, a change that makes a struct smaller would allow more of them to fit into a cache line. Like paging, these effects are very difficult to quantify, and changes that affect hot structures are more likely to reduce cache pressure significantly and improve performance.
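One common way to shrink a hot struct is to reorder its fields so that alignment padding disappears. This can be sketched with Python’s `ctypes`; the layouts below are hypothetical, not actual Firefox structures:

```python
import ctypes

class Padded(ctypes.Structure):
    # An 8-byte field sandwiched between two 1-byte fields forces the
    # compiler to insert alignment padding before and after it.
    _fields_ = [("tag", ctypes.c_char),
                ("count", ctypes.c_longlong),
                ("kind", ctypes.c_char)]

class Reordered(ctypes.Structure):
    # Same fields, widest first: most of the padding disappears, so
    # more instances fit in each cache line (and each page).
    _fields_ = [("count", ctypes.c_longlong),
                ("tag", ctypes.c_char),
                ("kind", ctypes.c_char)]

# On a typical 64-bit platform: sizeof(Padded) is 24 bytes,
# sizeof(Reordered) is 16 bytes.
```

The exact sizes are platform-dependent, but the reordered layout is never larger.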
The third way is faster structure traversals. Some parts of Firefox, such as the garbage collector and cycle collector, regularly traverse large data structures, and shrinking those structures makes the traversals faster. However, only a small fraction of MemShrink optimizations will speed up structure traversals.
If Firefox (or any program) uses too much memory, it can lead to aborts and crashes. These are sometimes called “OOMs” (out of memory). There are two main kinds of OOM: those involving virtual memory, and those involving physical memory.
A “virtual OOM” occurs when the virtual address space fills up and Firefox simply cannot refer to any more memory. This is mostly a problem on Windows, where Firefox is distributed as a 32-bit application, and so it can only address 2GB or 4GB of memory (the addressable amount depends on the OS configuration). This is true even if you have more than 4GB of RAM. In contrast, Mac OS X and Linux builds of Firefox are 64-bit and so virtual memory exhaustion is essentially impossible because the address space is massively larger.
(I don’t want to get distracted by the question of why Firefox is a 32-bit application on Windows. I’ll just mention that (a) many Windows users are still running 32-bit versions of Windows that cannot run 64-bit applications, and (b) Mozilla does 64-bit Windows builds for testing purposes. Detailed discussions of the pros and cons of 64-bit builds can be read here and here.)
The vast majority of MemShrink optimizations will reduce the amount of virtual memory consumed. (The only counter-examples I can think of involve deliberately evicting pages from physical memory. E.g. see the example of the GC decommitting change discussed below.) And any such change will obviously reduce the number of virtual OOMs. Furthermore, the effect of any reduction is obvious and straightforward — a change that reduces the virtual memory consumption by 100MB on a particular machine and workload is twice as good as one that reduces it by 50MB. Of course, any improvement will only be noticed by those who experience virtual OOMs, which typically is people who have 100s of tabs open at once. (It may come as a surprise, but some people have that many tabs open regularly.)
A “physical OOM” occurs when physical memory (and any additional backing storage such as swap space on disk) fills up. This is mostly a problem on low-end devices such as smartphones and netbooks, which typically have small amounts of RAM and may not have any swap space.
The situation for physical memory is similar to that for virtual memory: almost any MemShrink optimization will reduce Firefox’s physical memory consumption. (One exception is that it’s possible for a memory allocation to consume virtual memory but not physical memory if it’s never accessed; more about this in the examples section below.) And any reduction in physical memory consumption will in turn reduce the number of physical OOMs. Finally, the effects are again obvious and straightforward — a 100MB reduction is twice as good as a 50MB reduction.
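The “allocated but never touched” distinction can be demonstrated with anonymous `mmap` (a sketch; the behaviour is OS-dependent, but most systems demand-page such mappings):

```python
import mmap

PAGE_SIZE = 4096
SIZE = 64 * 1024 * 1024  # reserve 64 MiB of address space

# On most OSes this consumes 64 MiB of virtual memory immediately,
# but physical pages are only committed when first touched.
buf = mmap.mmap(-1, SIZE)

# Touching one byte on each of the first four pages commits roughly
# 16 KiB of physical memory -- nowhere near 64 MiB.
for page in range(4):
    buf[page * PAGE_SIZE] = 1
```

Freeing such a mapping reduces virtual memory consumption by the full 64 MiB, but physical memory consumption only by the handful of committed pages.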
Finally, we have reputation. The obvious effect here is that if MemShrink optimizations cause Firefox to become faster and more stable over time, people’s opinion of Firefox will rise, either because their own experience improves or because they hear that other people’s experience has improved.
But I want to highlight a less obvious aspect of reputation. People often gauge Firefox’s memory consumption by looking at a utility such as the Task Manager (on Windows) or ‘top’ (on Mac/Linux). Interpreting the numbers from these utilities is rather difficult — there are multiple metrics and all sorts of subtleties involved. (See this Stack Overflow post for evidence of the complexities and how easy it is to get things wrong.) In fact, in my opinion, the subtleties are so great that people should almost never look at these numbers and instead focus on metrics that are influenced by memory consumption but which they can observe directly as users, i.e. speed and crash rate… but that’s a discussion for another time.
Nonetheless, a non-trivial number of people judge Firefox on this metric. Imagine a change that caused Firefox’s numbers in these utilities to drop but had no other observable effect. (Such a change may be impossible in practice, but that doesn’t matter in this thought experiment.) One thing that has consistently surprised me is that some people view memory consumption as something approaching a moral issue: low memory consumption is virtuous and high memory consumption is sinful. As a result, this hypothetical change would, rightly or wrongly, improve Firefox’s reputation.
Let’s call this aspect of Firefox’s reputation the “reputation-by-measurement”. I suspect the most important metric for reputation-by-measurement is the “private bytes” reported by the Windows Task Manager, because that’s what people seem to most often look at. Private bytes measures the virtual memory of a process that is not shared with any other process. It’s my educated guess that in Firefox’s case the amount of shared memory isn’t that high, and so the situation is similar to virtual OOMs above — just about any change that reduces the amount of virtual memory will reduce the private bytes by the same amount, and in terms of reputation-by-measurement, a 100MB reduction is twice as good as a 50MB reduction.
Some examples help bring this discussion together. Consider bug 609905, which removed a 512KB block of memory that was found to be allocated but never accessed. (This occurred because some code that used that block was removed but the allocation wasn’t removed at the same time.) What were the benefits of this change?
- The 512KB never would have been in the working set, so performance would not have been affected.
- Virtual memory consumption would have dropped by 512KB, slightly reducing the likelihood of virtual OOMs.
- Physical memory consumption probably didn’t change — because the block was never accessed, it probably never made it into physical memory.
- Private bytes would have dropped by 512KB, slightly improving reputation-by-measurement.
Next, consider the GC decommitting change mentioned above, in which unused chunks of the JavaScript garbage collector’s heap are decommitted, i.e. evicted from physical memory while their virtual address space remains reserved. What were the benefits of that change?
- Performance may have improved slightly due to reduced paging, on machines where paging happens. Those chunks are clearly not in the working set when they are decommitted, but if paging occurs, the pre-emptive removal of some pages from physical memory may prevent the OS from having to evict some other pages, some of which might have been in the working set.
- Virtual memory consumption would not have changed at all, because decommitted memory still takes up address space.
- Physical memory consumption would have dropped by the full decommit amount — 10s or even 100s of MBs in many cases when decommitting is triggered — significantly reducing the likelihood of physical OOMs.
- Private bytes would not have changed, leaving reputation-by-measurement unaffected. [Update: Justin Lebar queried this. This page indicates that decommitting memory does reduce private bytes, which means that this change would have improved reputation-by-measurement.]
Another interesting one is bug 676457. It fixed a problem where PLArenaPool was requesting lots of allocations of ~4,128 bytes. jemalloc rounded these requests up to 8,192 bytes, so almost 50% of each 8,192-byte block was wasted, and there could be many of these blocks. The patch fixed this by reducing the requests to 4,096 bytes, which is a power of two that jemalloc does not round up (and also usually the size of an OS virtual memory page). What were the benefits of this change?
- Performance may have improved due to less paging, because the working set size may have dropped. The size of the effect depends on how often the final 32 used bytes of each chunk — those that spilled onto a second page — are accessed. For at least some of the blocks those 32 bytes would never be touched.
- Virtual memory consumption dropped significantly, reducing the likelihood of virtual OOMs.
- Physical memory consumption may have dropped, but it’s not clear by how much. In cases where the extra 32 bytes are never accessed, the second page might not have ever taken up physical memory.
- Private bytes would have dropped by the same amount as virtual memory, improving reputation-by-measurement.
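The arithmetic behind this fix can be sketched with a simplified model of large-allocation rounding (real jemalloc size classes are more involved than plain rounding to pages):

```python
PAGE_SIZE = 4096

def rounded_size(request, page_size=PAGE_SIZE):
    # Simplified model: allocations of around a page or more are
    # rounded up to a whole number of pages.
    return -(-request // page_size) * page_size  # ceiling division

# Before the fix: ~4,128-byte requests spill onto a second page,
# wasting almost half of each 8,192-byte block.
before = rounded_size(4128)   # 8192
# After the fix: 4,096-byte requests fit a page exactly.
after = rounded_size(4096)    # 4096
```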
Finally, let’s think about a compacting, generational garbage collector, something that is being worked on by the JS team at the moment.
- Performance improves for three reasons. First, paging is reduced because of the generational behaviour: much of the JS engine activity occurs in the nursery, which is small; in other words, the memory activity is concentrated within a smaller part of the working set. Second, paging is further reduced because of the compaction: this reduces fragmentation within pages in the tenured heap, reducing the total working set size. Third, the tenured heap grows more slowly because of the generational behaviour: many objects are collected earlier (in the nursery) than they would be with a non-generational collector, which means that structure traversals done by the garbage collector (during full-heap collections) and cycle collector are faster.
- Virtual memory consumption drops in two ways. First, the compaction minimizes waste due to fragmentation. Second, the heap grows more slowly.
- Physical memory consumption drops for the same two reasons.
- Private bytes also drops for the same two reasons.
A virtuous change indeed.
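As a toy model (nothing like SpiderMonkey’s actual implementation), the generational idea is that most objects die during nursery collections and so never reach the tenured heap:

```python
class GenerationalHeapSketch:
    # Hypothetical sketch: new objects go into a small nursery; a
    # minor collection promotes survivors into the tenured heap and
    # discards the rest without touching tenured memory at all.
    def __init__(self):
        self.nursery = []
        self.tenured = []

    def alloc(self, obj):
        self.nursery.append(obj)

    def minor_gc(self, live):
        # Only survivors are promoted. Because most objects die young,
        # the tenured heap grows far more slowly than total allocation,
        # and full-heap traversals stay cheap.
        self.tenured.extend(o for o in self.nursery if o in live)
        self.nursery.clear()
```

In this sketch, allocating 100 objects of which only a few survive leaves a tenured heap of just those few, which is the mechanism behind the slower heap growth described above.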
Reducing Firefox’s memory consumption is a good thing, and it has the following benefits.
- It can improve speed, due to less paging, fewer cache misses, and faster structure traversals. These changes are likely to be noticed more by users on lower-end machines.
- It improves stability by reducing virtual OOM aborts, which mostly helps heavy tab users on Windows. It also improves stability by reducing physical OOM aborts, which mostly affects heavy-ish tab users on small devices like smartphones and netbooks.
- It improves reputation among those whose browsing experience is improved by the above changes, and also among users who make judgments according to memory measurements with utilities like the Windows Task Manager.
Furthermore, when discussing MemShrink optimizations, it’s a good idea to describe improvements in these terms. For example, instead of saying “this change reduces memory consumption”, one could say “this change reduces physical and virtual memory consumption and may reduce paging”. I will endeavour to do this myself from now on.
35 replies on “The benefits of reducing memory consumption”
Great analysis and distillation of the impact of memory bugs on a product. Thanks Nicholas.
What’s the schedule for Generational GC? This year? Next?
I was talking to Dave Mandelin about this just today. Bill McCloskey and Terrence Cole are actively working on it, and Steve Fink is about to join them. Dave’s rough estimate was 5-6 months. I’d be pleased if we got it that quickly… it’s a big, complex change.
Is the incremental GC expected to come sooner, or are the changes coupled?
Incremental GC should be finished quite soon, hopefully in the current development cycle. It’s independent from generational GC.
Why only three developers on such a critical element of the company’s flagship product, its raison d’être?
Thank you for your valuable contribution to the memshrink effort.
I’ll remind you of Brooks’ law. Throwing more people at a project — especially a single, complex feature with no opportunities for parallelism — wouldn’t make it faster.
Naw, I heard that two women *can* birth a child in half the time.
I don’t know if Brooks’ Law applies when going from only 2 developers to 3 developers 🙂
Dave Mandelin said to me that an extra person probably wouldn’t make it happen faster, but he thought it might reduce the risk of the schedule slipping. He also said there might not be that much generational GC stuff that can happen in parallel, and Steve might end up doing other stuff.
I’m not too worried about generational GC; it has multiple good people working on it.
Thanks for this explanation of the complexities of memory – and what really makes a difference to users.
I’d like to highlight one area which you have not directly addressed, and which contributes to the poor reputation of Firefox memory usage. This is the impact that Firefox memory usage has on other applications that are running on the same machine. I suspect this is somewhere between physical and private memory.
My work laptop, provided by a large enterprise where FF is the preferred browser, has 3GB memory. This laptop is allocated to me until some time in 2015. The primary usage is MS Office, email and specific work applications. Based on previous experience, the memory requirements of these applications will grow over the next three years, and at some point my laptop will become memory constrained and start excessive paging. At that point, like many users, I will look at Task Manager to see what is using all the memory. On earlier 1GB and 2GB machines, I would have to shut FF to run some business apps.
Right now, according to Task Manager FF 10 is using almost 8 times as much memory as the next largest application, despite me having multiple large presentations and documents open. All that FF is used for is to browse the internal intranet plus a few news sites (e.g. BBC). The good news is that thanks to memshrink FF is much better behaved, in that it no longer grows so quickly and it gives back memory as I close tabs or windows.
So, I believe that reducing FF memory is a good thing in its own right, not just as an aspect of FF performance.
Those slow-downs are all caused by paging. And as you say, Firefox’s memory consumption also has implications for other programs running on the same machine. So it’s the working set size that’s the primary factor.
Ulrich Drepper’s LWN series on memory performance has a section on page fault optimisation. Apparently he also has a valgrind-based tool for measuring the working set influx, called pagein. How does it compare to Julian Seward’s prototype, did the prototype get merged into cachegrind or callgrind?
I haven’t heard of that. Thanks for the pointer! I’ll read it with interest.
I’ve read through the two threads you linked to about the win-64 builds. Has Asa followed up on his opening post in the 2nd and published the product team plan yet?
Not that I know of. https://wiki.mozilla.org/Features/Desktop/64bit_Firefox_Windows_7 is the feature page. Except for a trivial edit a few days ago it hasn’t been modified since October, and Asa hasn’t touched it since July. It’s possible that the plan is written down somewhere else, but if so, I can’t find it.
It’s good to see that Mozilla is finally beginning to act on the fact that their reputation is going to be staked on the quality of their addons, whether third party or not. The leak checking during review is an excellent first step. Offering developer assistance will also promote goodwill within the addon community. I think that any efforts made to improve addon quality will pay dividends in the long run, so I hope it continues.
One other area that reduced memory utilization can have an effect is on how much memory in the system is available for things like a disk cache.
I’ve had webservers where performance took a nose dive on busy days while the systems still had lots of CPU and RAM available to them. We finally identified that what was happening was that the working set of files being served to the Internet was larger than the memory available for the disk cache, and the result was that instead of serving the pages from memory, the webserver was having to wait for disk I/O.
But while this was happening, the CPU was idle, waiting for disk, and the memory utilization of the webserver was only slightly higher than on days when this didn’t happen.
So the effects of using more memory are sometimes rather subtle.
This is similar to paging that you do refer to above, but much less likely to be noticed.
I wonder, how does a compartment for about:blank get to be 21.55MB o.O This is on Aurora 12.0a2
│ ├───21.55 MB (01.91%) ++ compartment(about:blank)
Can you show me all the about:compartment entries that you’ve omitted for this compartment? (Click on the entry to expand it sub-tree; and then click on any entries within that sub-tree that have a ‘++’ to expand them too.)
Sure… [hours later]
│ ├───21.81 MB (02.80%) — compartment(about:blank)
│ │ ├──18.29 MB (02.35%) — gc-heap
│ │ │ ├──10.97 MB (01.41%) — arena
│ │ │ │ ├──10.69 MB (01.37%) ── unused 
│ │ │ │ └───0.28 MB (00.04%) — (2 tiny)
│ │ │ │ ├──0.14 MB (00.02%) ── headers 
│ │ │ │ └──0.14 MB (00.02%) ── padding 
│ │ │ └───7.32 MB (00.94%) — (4 tiny)
│ │ │ ├──4.42 MB (00.57%) — shapes
│ │ │ │ ├──2.66 MB (00.34%) ── tree 
│ │ │ │ ├──1.55 MB (00.20%) ── base 
│ │ │ │ └──0.21 MB (00.03%) ── dict 
│ │ │ ├──2.62 MB (00.34%) — objects
│ │ │ │ ├──2.18 MB (00.28%) ── function 
│ │ │ │ └──0.44 MB (00.06%) ── non-function 
│ │ │ ├──0.20 MB (00.03%) ── type-objects 
│ │ │ └──0.07 MB (00.01%) ── scripts 
│ │ └───3.53 MB (00.45%) — (2 tiny)
│ │ ├──2.55 MB (00.33%) — shapes-extra
│ │ │ ├──2.13 MB (00.27%) ── compartment-tables 
│ │ │ ├──0.17 MB (00.02%) ── tree-tables 
│ │ │ ├──0.15 MB (00.02%) ── tree-shape-kids 
│ │ │ └──0.09 MB (00.01%) ── dict-tables 
│ │ └──0.98 MB (00.13%) ── object-slots 
Hmm. Disabled all extensions (in case firebug/noscript/stylish was the problem), and straight after a browser restart:
│ ├───25.46 MB (07.14%) — compartment(about:blank)
│ │ ├──21.47 MB (06.02%) — gc-heap
│ │ │ ├──13.21 MB (03.70%) — arena
│ │ │ │ ├──12.89 MB (03.61%) ── unused 
│ │ │ │ └───0.33 MB (00.09%) — (2 tiny)
│ │ │ │ ├──0.17 MB (00.05%) ── headers 
│ │ │ │ └──0.16 MB (00.04%) ── padding 
│ │ │ ├───5.01 MB (01.40%) — shapes
│ │ │ │ ├──2.98 MB (00.84%) ── tree 
│ │ │ │ ├──1.77 MB (00.50%) ── base 
│ │ │ │ └──0.26 MB (00.07%) ── dict 
│ │ │ └───3.25 MB (00.91%) — (4 tiny)
│ │ │ ├──2.95 MB (00.83%) — objects
│ │ │ │ ├──2.48 MB (00.69%) ── function 
│ │ │ │ └──0.47 MB (00.13%) ── non-function 
│ │ │ ├──0.22 MB (00.06%) ── type-objects 
│ │ │ ├──0.08 MB (00.02%) ── scripts 
│ │ │ └──0.00 MB (00.00%) ── strings
│ │ └───3.99 MB (01.12%) — (3 tiny)
│ │ ├──2.93 MB (00.82%) — shapes-extra
│ │ │ ├──2.51 MB (00.70%) ── compartment-tables 
│ │ │ ├──0.20 MB (00.06%) ── tree-tables 
│ │ │ ├──0.11 MB (00.03%) ── dict-tables 
│ │ │ └──0.11 MB (00.03%) ── tree-shape-kids 
│ │ ├──1.06 MB (00.30%) ── object-slots 
│ │ └──0.00 MB (00.00%) ── string-chars
What on earth is it doing.
Those suffixes indicate that there are actually 391 about:blank compartments. Do you have an enormous session (i.e. many tabs) and also have “don’t load tabs until selected” selected? That would explain it. We have https://bugzilla.mozilla.org/show_bug.cgi?id=681201 open to reduce the cost of these unrestored tabs, though in your case it’s costing only 25.46 / 391 = 66.7KB per tab.
If you don’t have an enormous session with 391 unrestored tabs… then I don’t know what’s going on.
Yeah, I use panorama to store work sessions for projects I do and then push them to the background until I need to revisit them. Didn’t know it used about:blank to populate empty unloaded tabs, I am now educated.
Does a large sessionstore.js have an IO latency cost as well? I wonder if it would be less if one switched to sqlite (even with its fsync acrobatics).
“what’s important is not the total amount of physical or virtual memory being used, but the fraction of that memory that is touched frequently.”
No, the total amount of virtual memory used is also very important on modest hardware like netbooks and old laptops. I know this from years of bothersome experience trying to use FF with its remarkable memory leak. For years now, FF left open for a day or so will create a huge amount of virtual memory for the paging system to handle which is unparalleled by any other applications I run. The idea that this is fine as long as the virtual mem is ‘not touched frequently’ is an unreal excuse for the long-running situation. Mozilla developers need more experience running on modest hardware and to revise their policy on the leak. K-Meleon reveals what FF could be like without its terrible memory leak; KM just doesn’t leak and never gets over 200 MB.
That sentence was in the section about paging. I addressed virtual memory elsewhere in the article.
If you doubt Mozilla’s and my dedication to reducing memory consumption, please read https://wiki.mozilla.org/Performance/MemShrink, especially the links under “Minutes, Progress Reports and Presentations”.
As for the steadily increasing memory consumption that you see — are you running any add-ons? In our experience, with recent versions of Firefox (e.g. version 7 and later) it’s most often one or more badly-written add-ons that is the cause of excessive memory consumption. Try restarting in safe mode (http://support.mozilla.org/en-US/kb/Safe%20Mode) which disables all add-ons and see if the problem persists. If the problem goes away, it’s very likely one of your add-ons that’s at fault; you can then selectively disable them one at a time to determine which one is causing the problem.
I’m sorry for sounding confrontational, however I have tried most versions of Firefox with and without add-ons. I look after a few computers and for some years now, every version of FF I tried leaks like nothing else. I have also read scores of articles and discussions which include dismissive statements about the situation. Every update mentions memory improvements and I’m fed up with my personal testing cycle – version 10 sounded really hopeful, but it still leaks here. I’ve read too many statements to the effect that the virtual memory demand is pageable so it’s no big problem.
The FF 3.6 branch seems to leak the least, but K-Meleon, which is based on it, doesn’t leak at all.
Good luck with it Nicholas, and sorry for chewing your ear, but I don’t believe it. I don’t think FF needs to be ‘more efficient’; I think Mozilla should own up to it and even make it less efficient if need be, in order to fix the *leak*!
I understand some people are dismissive but I’ve been working full-time on reducing Firefox’s memory consumption for over a year. So I want to hear about these things.
Can you be more specific about the leaks? Most people who have a recent Firefox and don’t have badly-written add-ons don’t see bad leaks, so I’d like to know what is different about your machine and configuration.
You said you tried FF10. With no add-ons? Can you give data about the leaks, e.g. explain the steps to reproduce and cut+paste the contents of about:memory? Is this on Windows?
“don’t see bad leaks”
You don’t see FF10 use 400+ MB after a few hours running? I haven’t tested it in safe mode recently, so I will do this in the next day or two and post about:memory. If it is due to an add-on I can isolate, that will be great; however, shouldn’t Mozilla protect against add-ons leaking hundreds of megs?
On my favourite laptop I run without a pagefile and with a memory level icon in the systray so I’m aware of memory usage. It is a customised machine, but I’ve observed huge FF memory demand on my desktop and family computers too.
Thanks for your interest, I’ll get back to you in a day or two with the FF10 test.
It’s only a memory leak if you then close all those tabs and the memory consumption doesn’t drop.
I’m typing on K-Meleon, which is an MFC compile of the FF 3.6 codebase. I’ve had Gmail, Facebook, and around a dozen tabs open for days and it’s at 152 MB private bytes at the moment.
If you close all tabs but one, what does your usage drop to? You don’t have to do it now, but are you sure? Check sometime?
All recent Firefox versions I’ve tried want restarting every few hours and are plain dangerous if left running indefinitely. I’ll try FF10 soon and post details; I’m hoping it won’t be yet another wild goose/add-on chase.
Comparisons between different systems are very hard to interpret, because there are so many factors that can cause differences. If you can compare K-meleon and FF10 on the same workload on your machine that’ll be much more useful information.
I’ve been running FF10 now with no add-ons (although I should have turned a couple on because it’s unrealistic). It opened an old 12-tab session and started up with 130/135 private bytes/working set. After a couple of links and lying open for 40 mins it’s up to 164/170, which looks like a 45/45 MB per hour leak…
But it’s early days, I’ll see how it goes…
I’m happy until it starts tipping 250.
The about:memory page looks very promising.
K-Meleon is remarkable: it is at 150/120 and has been open and active for days. It’s got privacy and development plugins installed but no Firebug. It’s just a little glitchy on some sites, but that’s worth it for netbook stability, and it exemplifies what is possible.
I’m happy, and a little sheepish, to report FF10 not doing too badly with add-ons turned off. It stayed below 200 MB after a couple of hours of light usage and returned to 143/149 MB after closing all but 2 tabs.
I just commented/complained quickly here and should have read your blog first! It is very interesting and promising. When I have time I will try to discover which add-ons in my profile have been causing runaway demand and report them. A whitelist and blacklist would be very useful.
There is still room for improvement: the K-Meleon/commMeleon compile returns to 120/93 MB for 2 tabs after running for ages with a number of add-ons installed. That makes it brilliant to have as the always-open browser, but it is slightly old and unpolished.
This netbook and tablet generation of hardware is putting fresh pressure on memory design and management. I was very impressed with Google’s memory optimisation of Java in Android.
Thanks for your help and patience, Nicholas.
Excellent! One thing that can cause memory usage to increase at start-up is Firefox’s safe browsing service, which has to download a database of known bad sites from a server. That often surprises people, because it causes memory usage to go up even when the browser is apparently idle. In upcoming versions of Firefox the amount of memory used during these downloads has been greatly reduced.
And we’re still working hard to reduce memory consumption in many other ways. Thanks for trying Firefox again!