Quick stubs

Quick stubs are an optimization, so you probably shouldn’t care about them. For the curious few, here’s an explanation.

XPConnect, the main customs office on the border between JavaScript and C++, is bureaucratic and slow. When JavaScript calls a method or accesses a property of an XPCOM object—for example, the nodeType property of a DOM node—this happens:

  1. The JavaScript engine looks for a property named nodeType on the node and sees that it isn’t there. So it goes off looking for it, and after checking for a number of possible special cases and querying several interested parties, XPConnect detects that there’s an XPIDL property with that name on one of the interfaces the object implements. It creates a getter function and a setter function and defines the property on the node’s prototype. Later nodeType accesses in the same context, on nodes of the same class, will perform some parts of this search again but will ultimately find the property created the first time through. Are we having fun yet?
  2. The getter function is called. The same getter function, XPC_WN_GetterSetter, is used for all JS-to-C++ getters and setters, so this is some very generic code.
  3. The getter creates an XPCCallContext. Hundreds of lines of code execute, gathering all sorts of information about the current property access and storing it in this object, which is added to an XPConnect context stack. (Most of this information probably won’t be used.)
  4. Now XPCWrappedNative::CallMethod is called. This code is even more generic. It’s about 700 lines of code, but packed with branches, so on any given call, most of it is skipped. It checks the JavaScript arguments, handles errors, converts the types of arguments from JavaScript values to C++, performs security checks, and so on. When executing a getter, there are no arguments; we skip most of it. About 500 lines in, we call the C++ method. This happens via the magic of xptcall, which knows how to fake the C++ calling convention and call a specific virtual method of a C++ object.
  5. A one-line DOM method executes, returning a constant value.
  6. XPCWrappedNative::CallMethod cleans up any data structures it allocated and converts the return value and any out parameters back from C++ to JavaScript.
  7. The XPCCallContext object is removed from the context stack and dismantled. Control returns to JavaScript.
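
The steps above can be sketched as a toy model in Python. This is only an illustration of the shape of the generic path, not XPConnect code: every name in it (Node, INTERFACE_INFO, context_stack, convert, generic_getter_setter) is hypothetical, and the real work happens in C++ with far more cases.

```python
# Toy model only: every name here is hypothetical, not a real XPConnect API.

class Node:
    """Stands in for a C++ DOM object with a trivial one-line method."""

    def get_node_type(self):
        return 1  # a constant, like the one-line DOM method in step 5


# Stands in for the interface metadata the generic code consults on each call.
INTERFACE_INFO = {
    "nodeType": {"method": "get_node_type", "params": [], "result": "uint16"},
}

context_stack = []  # stands in for the XPConnect context stack


def convert(value, type_name):
    """Generic conversion, chosen by inspecting a type tag at run time."""
    if type_name == "uint16":
        return int(value) & 0xFFFF
    if type_name == "string":
        return str(value)
    raise TypeError("unsupported type: " + type_name)


def generic_getter_setter(obj, name, js_args=()):
    """One function serves every member, so it rediscovers everything per call."""
    # Step 3: build a call context and push it on the stack.
    call_context = {"object": obj, "member": name, "info": INTERFACE_INFO[name]}
    context_stack.append(call_context)
    try:
        info = call_context["info"]
        # Step 4: check and convert arguments (a getter has none to convert).
        if len(js_args) != len(info["params"]):
            raise TypeError("wrong number of arguments")
        native_args = [convert(a, t) for a, t in zip(js_args, info["params"])]
        # Step 4, continued: find the native method dynamically and call it.
        native_result = getattr(obj, info["method"])(*native_args)
        # Step 6: convert the result back for the caller.
        return convert(native_result, info["result"])
    finally:
        # Step 7: remove the call context from the stack.
        context_stack.pop()
```

Calling generic_getter_setter(Node(), "nodeType") returns the node type, but only after a metadata lookup, an argument loop, a dynamic method lookup, and a type-tag switch, all to reach a method that returns a constant.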

This seemed like a pretty fat optimization target. The trick was to make this faster while retaining as much of XPConnect’s behavior as possible.

There’s a long comment in js/src/xpconnect/src/qsgen.py that explains what quick stubs are and how they work. I’ll quote that here.

About quick stubs

qsgen.py generates “quick stubs”, custom SpiderMonkey getters, setters, and methods for specified XPCOM interface members. These quick stubs serve at runtime as replacements for the XPConnect functions XPC_WN_GetterSetter and XPC_WN_CallMethod, which are the extremely generic (and slow) SpiderMonkey getter/setter/methods otherwise used for all XPCOM member accesses from JS.

There are two ways quick stubs win:

  1. Pure, transparent optimization by partial evaluation.
  2. Cutting corners.

Partial evaluation

Partial evaluation is when you execute part of a program early (before or at compile time) so that you don’t have to execute it at run time. In this case, everything that involves interpreting xptcall data (for example, the big methodInfo loops in XPCWrappedNative::CallMethod and the switch statement in XPCConvert::JSData2Native) might as well happen at build time, since all the type information for any given member is already known. That’s what this script does. It gets the information from IDL instead of XPT files. Apart from that, the code in this script is very similar to what you’ll find in XPConnect itself. The advantage is that it runs once, at build time, not in tight loops at run time.
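
As a rough illustration of the idea (not code from qsgen.py), here is a minimal Python sketch of build-time specialization. The member description, the TO_JS table, and generate_getter are all hypothetical; the sketch emits Python source for brevity, whereas the real script reads IDL and emits C++.

```python
# Hypothetical sketch of build-time specialization; not taken from qsgen.py.

# Stands in for one member description read from IDL at build time.
MEMBER = {"name": "nodeType", "method": "get_node_type", "result": "uint16"}

# Conversion templates. This lookup happens once, while generating code,
# instead of in a type switch on every call.
TO_JS = {
    "uint16": "int({expr}) & 0xFFFF",
    "string": "str({expr})",
}


def generate_getter(member):
    """Return source code for a getter specialized to a single member."""
    call = "obj.{0}()".format(member["method"])
    body = TO_JS[member["result"]].format(expr=call)
    return "def quick_get_{0}(obj):\n    return {1}\n".format(member["name"], body)


class Node:
    """Stands in for a native object with a trivial one-line method."""

    def get_node_type(self):
        return 1


# "Build time": generate and compile the stub once.
source = generate_getter(MEMBER)
namespace = {}
exec(source, namespace)
quick_get_nodeType = namespace["quick_get_nodeType"]
```

The generated stub contains no metadata lookup and no type switch: the decision about how to convert a uint16 was made when the source was generated, so each later call goes straight to the method and the one conversion it needs.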

Cutting corners

The XPConnect versions have to be slow because they do tons of work that’s only necessary in a few cases. The quick stubs skip a lot of that work. So quick stubs necessarily differ from XPConnect in potentially observable ways. For many specific interface members, the differences are not observable from scripts or don’t matter enough to worry about; but you do have to be careful which members you decide to generate quick stubs for.

The complete list of known differences is in qsgen.py for the curious. That list is the fine print of quick stubs. It is long; the gist is that many methods should not have quick stubs.

The end result was a speed-up of about 20% on the Dromaeo DOM benchmark suite. Some of those tests spend lots of time in a single DOM call, not in thousands of quick calls from JS to C++. Quick stubs are no win there. But some tests gained 60% or more, indicating that most of the time was being spent in XPConnect.

Peter Van der Beken and I met in Mountain View last week and did some work on bug 457897, a follow-up to quick stubs that will likely win another 20%.

Like most native methods, XPCOM methods, including quick stubs, cannot be JITted in TraceMonkey as it stands. More on that later.
