Feed on
Posts
Comments

This is really a RFC.

It seems to me that many end-user bugzilla reports about excessive memory use share a common structure.  First, the user reports that “I did A, B and C, had a coffee, went back to my machine and found that Firefox was sitting on N gigabytes of memory.”  Then follows lots of
discussion along the lines of “oh, so the cycle collector did/didn’t run while Fx was idle”, and “but no, that’s not right, because the tertiary FooBar timer inhibit mechanism will have made the XYZ collector run instead” kind of thing.

This goes on and on, while everybody tries to figure out what Firefox was actually doing in the period leading up to the too-much-memory observation.  Meanwhile the original reporter gets bored with the discussion and moves on to something else, and there’s general confusion, annoyance, and lack of reproducibility all round.

So the idea is simple: make a log file listing events which are known to have a significant impact on memory usage.  Nothing heavyweight, just one line per event, timestamped, plus brief relevant stats.  For example:

  • GC started / ended, total size N, reclaimed M bytes
  • GC mapped in new heap / unmapped heap
  • CC started / ended, total size N, reclaimed M bytes
  • image discard ran
  • jemalloc mmap’d more pages / munmapped some pages
  • nanojit / mjit mapped / unmapped code pages

Perhaps with some indirectly relevant events such as

  • downloaded another chunk of the phishing database
  • no user input seen for 17 minutes
  • new tab created; now there are 23 of them
  • user requested about:memory (+ what it produced)
  • extension Xyzzy loaded/initialised

That way, we’d at least have some information about the space history leading up to a situation where a user says “urk!  massive memory leak!”.

The log would be low overhead, so it can be used in production.  For sure we don’t want to log more than a couple of events per second.  Perhaps a moderate sized circular buffer (64KB? 1MB?) that could be dumped to disk on request.  Then, when someone reports excessive memory use, the first thing we ask for is the log file.

Partial implementations of this exist already.  There’s javascript.options.mem.log, for example, and I’m sure other subsystems have their own schemes.  But AFAIK there’s no uniform,  system-wide, lightweight, easy-to-use mechanism.  Would one be useful?

10 Responses to “A low overhead, always-on, system-wide memory event log”

  1. on 26 May 2011 at 12:03 pm glandium

    I think that could be interesting, and that this should probably be part of the recently born telemetry.

  2. on 26 May 2011 at 2:32 pm bjacob

    I think that would be a good idea, and I can see how I would use it. I would suggest keeping the log in a buffer instead of writing it to a file, to reduce the overhead (otherwise I’m afraid it’s going to be too expensive for me) and/or maybe have it disabled by default (settable at runtime) as people know when they want it.

  3. on 26 May 2011 at 2:45 pm jseward

    bjacob: keeping it in a buffer is important, since writing
    continuously to disk could cause large numbers of otherwise-pointless
    disk wakeups on mobile systems, which is definitely bad.

    I think it should be enabled by default, though. It’s most valuable
    exactly when memory use is unexpectedly high and we have no clear idea
    how to reproduce it. If we’re putting more than one or two lines per
    second in the log, we’re misusing the mechanism. The cost of one or
    two sprintfs per second plus a 64k circular buffer is probably
    undetectably low.

  4. on 26 May 2011 at 2:46 pm Wladimir Palant

    Yes, sounds like a good idea. Instead of dumping the buffer to disk I would display its contents on about:memory – the user should attach about:memory contents to the bug report anyway.

  5. on 26 May 2011 at 11:40 pm azakai

    Very good idea in my opinion.

  6. on 26 May 2011 at 11:59 pm Nicholas Nethercote

    Not surprisingly, I like the idea. I don’t think it has to be restricted to memory-related events, though obviously they are good to have in there.

    Some other events that would be interesting:
    - Open/close a tab, including the URL
    - Create/destroy a JSCompartment, including the principle (the URL, basically)

  7. on 27 May 2011 at 12:01 am Nicholas Nethercote

    And I think the in-memory circular buffer is a good idea, with a way of saving it to disk. Perhaps there could be a command-line option or about:config option to increase its size. So you could make it quite big if you wanted to run the browser for a long time and not miss events.

  8. on 27 May 2011 at 12:28 am Nicholas Nethercote

    Instead of dumping the buffer to disk I would display its contents on about:memory

    I’d put it in a separate page — about:logging or something — because it doesn’t have to be only memory-related events.

    It could just be non-localized, plaintext, IMO. Google Chrome has about:histograms which is like that.

  9. on 08 Jun 2011 at 10:06 pm Frank Nestel

    Good logging is allways a way to get ahead of things.

    Keeping it in memory for performance is generally a good idea. However then you won’t get information when the browser gets unresponsive. FF4 was getting unusuable slow when approaching certain internal (probably memory) boundaries and it could be nice to have some information in case of a crash or hang.

    So, yes, keep it in memory, but provide automaticall, periodic dumping to disk! I mean writing a (say) 1MB file every handful of seconds can’t be much more fileio, than all the sqllite stuff being done.

  10. on 12 Jul 2011 at 7:20 am Colin

    This could certainly be a helpful feature, but how many users post on the forums to announce a ‘massive memory leak’ vs the number of active Firefox users? I wouldn’t add a feature like this to prod… unless you also intended to report back to Mozilla for every user who agrees to the ToS and then used the data to find previously unknown mem leaks (not a bad idea…)

    I can be obsessive about efficiency at times, but I’m not sure its not worth it to make millions of people run this code when only a handful will ever use it. Firefox tends to be fairly solid, whenever I seen FF breach 100MB its because I’m running an intensive site like Pandora.