Automatic DeCOMtamination: Roadmap For Automated Refactorings

This is an update on the ongoing deCOMtamination work from the automated rewriting perspective. I think it’s pretty exciting that Mozilla is the first large-scale C++ project to attempt automated, large-scale source code cleanups and optimizations. The tools are finally getting mature enough for the job.

The downside is that there isn’t a published roadmap of what we are planning to achieve with deCOMtamination, as we are still in the planning stages. The upside is that there is still time for any interested parties to think up the next great improvement on how things are done, check out the refactoring toolchain and either extend an existing tool or implement a new one.

Below is a list of things that I plan to have working in the near future. The idea is to try to implement various optimizations that would be impractical (or impossible?) to do manually and see if they yield the expected performance and code quality benefits.

Step 1: Outparam Elimination

QueryInterface() and other ok/fail methods have a redundant nsresult value which can be eliminated without changing any logic in the code. The QueryInterface() rewrite is my first serious tree-wide refactoring attempt. QueryInterface() is probably the best-known and most frequently used method within Mozilla. It is also one of the most CPP-encumbered methods in the tree, so it made for a good test of my CPP-aware elsa work.

The getting-the-code-to-compile phase is over. Currently Benjamin is working on getting the modified code to run, which requires XPConnect changes, debugging the manually-rewritten macros and verifying that the generated patch is correct.

The next step will be to eliminate local nsresult variables in the callers when they are used only to store & check return values of QueryInterface(). This is basically a make-resulting-source-look-prettier optimization.
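To make the idea concrete, here is a minimal sketch of the rewrite and the follow-up cleanup. Every name in it (nsresult_t, IID, Foo, QueryInterfaceOld, QueryInterfaceNew, GetFooOld, GetFooNew) is a hypothetical stand-in I made up for illustration; it is not the real XPCOM declaration or the generated patch, and it assumes a null pointer is an acceptable way to signal failure.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only, not the real XPCOM types.
using nsresult_t = std::uint32_t;
constexpr nsresult_t kOk = 0;
constexpr nsresult_t kNoInterface = 0x80004002;

struct IID { int id; };
constexpr IID kFooIID{1};
constexpr IID kBarIID{2};

struct Foo {
    // Before: status code plus outparam. The two always agree
    // (kOk exactly when *out is non-null), so one of them is redundant.
    nsresult_t QueryInterfaceOld(const IID& iid, void** out) {
        if (iid.id == kFooIID.id) { *out = this; return kOk; }
        *out = nullptr;
        return kNoInterface;
    }

    // After: the pointer is the return value; nullptr means failure.
    void* QueryInterfaceNew(const IID& iid) {
        return iid.id == kFooIID.id ? this : nullptr;
    }
};

// Caller before the rewrite: a local status variable exists only to be tested.
inline Foo* GetFooOld(Foo& obj) {
    void* raw = nullptr;
    nsresult_t rv = obj.QueryInterfaceOld(kFooIID, &raw);
    if (rv != kOk) return nullptr;
    return static_cast<Foo*>(raw);
}

// Caller after the rewrite and the local-variable cleanup.
inline Foo* GetFooNew(Foo& obj) {
    return static_cast<Foo*>(obj.QueryInterfaceNew(kFooIID));
}

// Usage: both callers behave the same for a supported interface.
inline void Demo() {
    Foo f;
    assert(GetFooOld(f) == &f);
    assert(GetFooNew(f) == &f);
}
```

The logic is unchanged between the two versions; only the redundant status value and the variable that held it disappear.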

I hope to measure a speed-up and slight footprint decrease with the new QueryInterface call.

Step 2: Try Mozilla Without Reference Counting?

In my mind the most exciting part of Moz2 is Tamarin. Few things are cooler than an elegant JIT VM.

Tamarin comes with a modern garbage collector.

The goal of this rewrite is to aid Jason and Benjamin with switching XPCOM from reference counting to garbage collection. This might result in gigantic patches to rid the stack of nsCOMPtr objects. This might be hard, as it will affect a lot of code and might reveal more shortcomings in my version of elsa and MCPP. Details are in the wiki.
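As a rough illustration of what ridding the stack of nsCOMPtr objects could mean, here is a sketch with a toy refcounted object and a toy smart pointer. Node and ComPtr are hypothetical, simplified stand-ins rather than the real XPCOM classes, and the raw-pointer version assumes a collector that can find pointers held on the stack.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a refcounted XPCOM object.
struct Node {
    int refcnt = 0;
    int traffic = 0;  // counts AddRef/Release calls, for illustration only
    int value = 42;
    void AddRef()  { ++refcnt; ++traffic; }
    void Release() { --refcnt; ++traffic; }  // a real Release would delete at zero
};

// Hypothetical, simplified stand-in for nsCOMPtr.
template <typename T>
class ComPtr {
public:
    explicit ComPtr(T* p) : p_(p) { if (p_) p_->AddRef(); }
    ComPtr(const ComPtr&) = delete;
    ComPtr& operator=(const ComPtr&) = delete;
    ~ComPtr() { if (p_) p_->Release(); }
    T* operator->() const { return p_; }
private:
    T* p_;
};

// Today: a smart pointer on the stack pays one AddRef and one Release.
inline int ReadCounted(Node* n) {
    ComPtr<Node> local(n);
    return local->value;
}

// With a collector that sees the stack (an assumption here), a raw
// pointer would keep the object alive and no counting is needed.
inline int ReadRaw(Node* n) {
    Node* local = n;
    return local->value;
}

// Usage: same result, but only the counted version touches the refcount.
inline void Demo() {
    Node n;
    assert(ReadCounted(&n) == 42);
    assert(n.traffic == 2);
    assert(ReadRaw(&n) == 42);
    assert(n.traffic == 2);
}
```

The rewrite would have to turn the first form into the second everywhere it is safe to do so, which is why the patches could get so large.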

Step 3: Try C++ Exceptions

This is the most ambitious rewrite I know of for Mozilla 2. It’s similar to step 1 in that the goal is to eliminate outparameters and nsresult error codes, except in this case it would happen for all functions. Brendan mentioned this in his blog.

The idea is that exceptions in Mozilla happen in exceptional circumstances, thus most of the time the return value will be NS_OK and the outparameter will have something valid in it. So we should rewrite all methods that return nsresult to return the outparameter value through the return value and have the errors thrown as exceptions.

In the simplest case this would involve rewriting return statements into throw statements and rewriting callers to use try/catch. Instead of

rv = bla(&outparam)…if(rv == foo) .. else if(rv == boo) return rv

the code would be

try{ outparam=bla() } catch(foo) {…} catch(boo) {throw boo}

Note this is still inefficient since the code would be manually unwinding the stack instead of letting the exceptions do that. So the next iteration of the rewrite would get rid of the

catch(boo) {throw boo}

code to streamline the execution path. Ideally this would provide a significant reduction of footprint (due to getting rid of the error propagation code) and provide a speed boost.
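Spelled out a bit more, the three stages might look like the sketch below. All names (Status, FooError, BooError, BlaOld, Bla, CallerOld, CallerNaive, Caller) are hypothetical and chosen to mirror the foo/boo/bla fragments above; this is only an illustration of the shape of the rewrite, not output from the tools.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical names throughout; none of this is real Mozilla API.
enum Status { kOk, kFoo, kBoo };

struct FooError : std::runtime_error { FooError() : std::runtime_error("foo") {} };
struct BooError : std::runtime_error { BooError() : std::runtime_error("boo") {} };

// Error-code style: result through an outparam, status through the return value.
inline Status BlaOld(int input, int* out) {
    if (input == 1) return kFoo;
    if (input == 2) return kBoo;
    *out = input * 10;
    return kOk;
}

inline Status CallerOld(int input, int* out) {
    Status rv = BlaOld(input, out);
    if (rv == kFoo) { *out = -1; return kOk; }  // handle foo locally
    if (rv == kBoo) return rv;                  // propagate boo by hand
    return kOk;
}

// Exception style: the result is the return value, errors are thrown.
inline int Bla(int input) {
    if (input == 1) throw FooError();
    if (input == 2) throw BooError();
    return input * 10;
}

// First iteration: mechanical translation that still rethrows by hand.
inline int CallerNaive(int input) {
    try { return Bla(input); }
    catch (const FooError&) { return -1; }
    catch (const BooError&) { throw; }
}

// Second iteration: the rethrow clause is gone; stack unwinding does the work.
inline int Caller(int input) {
    try { return Bla(input); }
    catch (const FooError&) { return -1; }
}

// Usage: all three callers agree on the success and handled-error paths.
inline void Demo() {
    int out = 0;
    assert(CallerOld(3, &out) == kOk && out == 30);
    assert(CallerNaive(3) == 30 && Caller(3) == 30);
    assert(CallerNaive(1) == -1 && Caller(1) == -1);
}
```

In the last version the caller contains no code at all for the error it does not handle, which is where the hoped-for footprint reduction would come from.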

However, there are a lot of issues that need to be solved. How to ensure that the C++ code is exception-safe (everything has destructors to do appropriate cleanup)? How to deal with the case of the stack being a mix of platform C, C++, JavaScript, Python, etc? Most runtimes are not aware of C++ exceptions.
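For the exception-safety question, the usual C++ answer is to tie every cleanup action to a destructor. A minimal sketch, with a hypothetical Guard type and Work function invented for this example:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource guard; `released` records whether cleanup ran.
struct Guard {
    bool& released;
    explicit Guard(bool& flag) : released(flag) { released = false; }
    ~Guard() { released = true; }  // also runs during stack unwinding
};

// Acquires the guard, then optionally fails with an exception.
inline void Work(bool& released, bool fail) {
    Guard g(released);
    if (fail) throw std::runtime_error("failure");
}

// Returns true if the guard's cleanup ran even though Work() threw.
inline bool CleanupRanAfterThrow() {
    bool released = false;
    try { Work(released, true); } catch (const std::runtime_error&) {}
    return released;
}

// Usage: cleanup happens on the exceptional path without explicit code.
inline void Demo() {
    assert(CleanupRanAfterThrow());
}
```

Code that instead releases resources with explicit calls before each return would skip that cleanup once errors start arriving as exceptions, so it would have to be found and converted first.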

Infrastructure Work

Unfortunately, not all of the automated refactoring work is about exciting rewrites. Elsa is still tied to the stone-age gcc 3.4 as it can’t yet process C++ headers from the newer gcc releases due to template complexity.

There is also work to be done to get elsa & mcpp supported on OSX as well as on Linux. I think very little work remains there.

Another big issue is getting elsa/mcpp to work on Windows. This may involve teaching elsa about the Microsoft flavour of C++, or getting Mozilla to reliably build with mingw and merely teaching elsa about mingw’s flavour of Windows C++.

There is also an issue of maturity. Mozilla is probably the biggest codebase to make use of elsa and mcpp, so there are teething issues to solve. Having said that, the current version of MCPP in svn should be able to compile Mozilla. Elsa can process all of Mozilla with a small patch to two files attached in the QueryInterface() bug.

Overall I’m doing less and less infrastructure work as time goes by; hopefully the tools will mostly just work from now on.

4 comments

  1. What about changing Mozilla from using NSPR to standard C++?

  2. James,
    That’s a todo item that we have to formalize too. We have to decide which parts of the ‘standard C++’ are safe to use.
    For example, it turns out C++ strings are really inefficient in the recent MSVC++, so it doesn’t seem to make sense to switch to them.

  3. I am new to deCOMtamination, but would like to contribute. I was just wondering if it is worth working on deCOMtamination bugs listed in Mozilla wiki or just wait for you guys to complete this effort. Once you complete these tools, will they automatically fix 90% of these bugs?

  4. SK,
    There is an infinite amount of stuff to do :)
    Feel free to contribute.