Faster Plugin Enumeration + Help Wanted

In addition to slow font enumeration, we were suffering from a similar problem: slow plugin enumeration. Just as with fonts, the plugin enumeration code is different on every platform. Unlike the font situation, plugin enumeration is done completely within our code (i.e. it is easy to fix).

Plugin enumeration is often triggered by JavaScript code (for example by checking if a Java handler is present). This means that enumeration is a blocking operation that must happen quickly. XPerf made me wonder why so many plugin-like .dll files were being read. This led me to a fun set of perf fixes.

The Algorithm

  1. Files in plugin directories are listed.
  2. A platform-specific IsPluginFile function determines which files look like plugins (e.g. np*.dll on Windows).
  3. Code then checks if the files + their timestamps are known by pluginreg.dat. If so, cached info is used and the following steps are skipped.
  4. For each library-file that isn’t found in pluginreg.dat, we use platform-specific GetPluginInfo to load the library-file to see if it is indeed a valid plugin (and to see what mimetypes it handles/etc).
  5. Valid plugins are recorded in pluginreg.dat.
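
The five steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Firefox code: the dict `registry` stands in for pluginreg.dat, the callable `get_plugin_info` stands in for the platform-specific GetPluginInfo, and the names `looks_like_plugin` and `enumerate_plugins` are made up for this example.

```python
import fnmatch
import os


def looks_like_plugin(filename):
    # Step 2: name-only check. The np*.dll pattern is the Windows example
    # from the text; other platforms would use a different pattern.
    return fnmatch.fnmatchcase(filename.lower(), "np*.dll")


def enumerate_plugins(plugin_dirs, registry, get_plugin_info):
    """Return info for every valid plugin found in plugin_dirs.

    registry: dict mapping path -> (mtime, info); hypothetical stand-in
        for pluginreg.dat.
    get_plugin_info: callable(path) -> info, or None if the file is not a
        valid plugin; hypothetical stand-in for GetPluginInfo.
    """
    plugins = []
    for directory in plugin_dirs:
        for name in sorted(os.listdir(directory)):      # step 1
            if not looks_like_plugin(name):             # step 2
                continue
            path = os.path.join(directory, name)
            mtime = os.path.getmtime(path)
            cached = registry.get(path)
            if cached is not None and cached[0] == mtime:
                plugins.append(cached[1])               # step 3: cache hit
                continue
            info = get_plugin_info(path)                # step 4: expensive load
            if info is not None:
                registry[path] = (mtime, info)          # step 5
                plugins.append(info)
    return plugins
```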

This process took up to 3 seconds on a user’s computer. WTF? There were gotchas at almost every step of the way.

  1. Windows directory listing code would request metadata for every bloody file in the directory. This resulted in the easiest optimization ever: pure code deletion.
  2. IsPluginFile on Windows/Mac sneakily did more than just check the filename. It also checked if the file was loadable, which on Windows loaded the dll and all of the dependencies. Mac code was satisfied with merely doing a little extra IO.
  3. This part was right.
  4. #2 was easily fixed by moving file IO here.
  5. Files that failed the check in #4 were doomed to cause extra IO for all of eternity. Scott Greenlay fixed that by recording invalid plugin-like files too.
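
The fix for #5 amounts to negative caching: remember files that were probed and rejected, so they are only loaded again if their timestamp changes. A minimal Python sketch under the same assumptions as before (the dict `registry` stands in for pluginreg.dat, `get_plugin_info` for GetPluginInfo, and the function name is invented for illustration):

```python
import os


def probe_with_negative_cache(path, registry, get_plugin_info):
    """Return plugin info for path, or None if it is not a valid plugin.

    registry maps path -> (mtime, info_or_None). An entry whose info is
    None marks a plugin-like file that was already probed and rejected.
    """
    mtime = os.path.getmtime(path)
    cached = registry.get(path)
    if cached is not None and cached[0] == mtime:
        # Hit for a valid OR an invalid file: no library load needed.
        return cached[1]
    info = get_plugin_info(path)     # the expensive load
    registry[path] = (mtime, info)   # record the result even when info is None
    return info
```

With this in place, a broken np*.dll costs one load the first time it is seen and nothing afterwards, until the file is modified.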

This was a rare fix that resulted in seconds saved on crapware-loaded computers. Usually I have to count my progress in milliseconds :(

Help Wanted

I have plans for vastly improving Firefox startup, but I need help to get there. If you enjoy beating under-performing code into submission and want to work for Mozilla, please send me your resume (taras at mozilla dot com). Example projects: a better performance testsuite (e.g. tracking IO, CPU instructions), better infrastructure for profiling addons, optimizing away various CSS/XUL markup, etc. A low-level approach to solving problems is helpful; compiler/linker/kernel hackers are well-suited (but not required) for this.

7 comments

  1. Hi Tara!
    Finally Mozilla can see this problem…. great idea to hire one or two guys dedicated to solve it.
    I hope you find your hAcKerz :)
    Thanks!

  2. Hooray for fixing this! The delays can even sometimes be of the order of 20 seconds or above — pretty crazy.

  3. Does this mean that extensions can drop all their ABI-specific plugins (e.g. Linux_x86-gcc3 and Linux_x86_64-gcc3) into the main extension plugin folder and this will just work? (Platform-specific plugin folders got dropped in bug 568691).

  4. Does this focus to the Core performance now also mean that bugs including patches for Core performance optimization will get reviewed now?

    For example:
    313282 In strcstr.c there is an ‘obvious improvement’ waiting to be performed
    456547 Only create the offline cache when really needed
    490700 Let qcms directly transform to Cairo pixel format to save later conversions
    11736 nsJARInputStream.cpp: release dependency on nsJar when doing directory read
    417154 nsMemoryCacheDevice::EvictEntriesIfNecessary could be optimized
    331032 Decom nsICacheEntryInfo, replace by ExpirationTime attribute to nsICachingChannel
    230675 ‘decom’ of nsICacheVisitor.idl: saves 10% / 150K from nkcache_s.lib
    399223 make aToken a member of nsCSSScanner instead of passing as function argument
    617897 Replace calls to AppendASCII(‘*’) with Append(‘*’)
    405407 Merge nsDiskCacheStreamIO and nsDiskCacheStreamOutput

    I also have submitted a patch to optimize nsDirEnum itself somewhat (saving a NEW and string copies).

  5. I think 11736 should be 511736 in the previous comment.

    Keep up the good work, I hope you get some good applicants.

  6. Alfred,
    I reviewed the patch in 617897

    bug 405407 does seem relevant and it’s too bad it wasn’t picked up in time for ff4. I see no reason to not review/land that asap once ff4 branches.

    The rest of the patches appear to be waiting for your input so they can progress.

  7. I really rest when it comes to run vtune at work. The problem typically is to set up useful and measurable experiment (program run).

    I think if you post or otherwise publish cases where “beating under-performing code” is really needed some people may do it as part of their hobby. Of course if they could get to it through myriad of obstacles in a weekend. Last time I checked ff code infrastructure looked quite monstrous though.