I spent some time proving to myself that it is possible to automate DeCOMtamination. The result is squash – a tool that aims to accomplish the first step of DeCOMtamination. My near-term goals are to be able to squash together a real-life XPCOM interface and implementation such that:
- The resulting code compiles
- Firefox runs correctly
- Simple functions using out parameters are converted to use return values instead
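To make the last goal concrete, here is a minimal before/after sketch of the out-parameter conversion. The class names, `nsresult_t` and `kOk` are hypothetical stand-ins invented for illustration (real XPCOM code uses `nsresult` and `NS_OK`), and the "after" form is written by hand rather than produced by squash.

```cpp
#include <cassert>

// Hypothetical stand-ins for XPCOM's nsresult / NS_OK.
typedef unsigned int nsresult_t;
const nsresult_t kOk = 0;

// Before: XPCOM style. The status is the return value and the
// actual result travels through an out parameter.
class SheetBefore {
public:
  SheetBefore() : mRuleCount(3) {}
  nsresult_t GetRuleCount(int* aCount) const {
    *aCount = mRuleCount;
    return kOk;
  }
private:
  int mRuleCount;
};

// After: the getter cannot fail, so the result is returned directly.
class SheetAfter {
public:
  SheetAfter() : mRuleCount(3) {}
  int GetRuleCount() const { return mRuleCount; }
private:
  int mRuleCount;
};

// Checks that both calling conventions deliver the same value.
void checkOutParamConversion() {
  SheetBefore before;
  int count = 0;
  nsresult_t rv = before.GetRuleCount(&count);
  assert(rv == kOk);

  SheetAfter after;
  assert(count == after.GetRuleCount());
}
```

The conversion is only safe for functions like this one, where the status code carries no information; a function that can really fail still needs some way to report the error.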
Currently I am approximately halfway through the first requirement. In its current form squash lives as a patch to the oink suite of tools.
- Pick a simple XPCOM implementation class to squash. I like
- Produce a .i file because Oink/Elsa do not have an integrated preprocessor yet.
make CXX="gcc -E" nsCSSLoader.o; mv nsCSSLoader.o nsCSSLoader.i
- Squash it:
./squash -o-lang GNU_Cplusplus -sq-implementation CSSLoaderImpl nsCSSLoader.i -sq-exclude-include string/nsTString.h -sq-include "nsString.h" > cssloader.patch
- Apply the patch:
patch -p0 < cssloader.patch
- Run make, deduce the problem with the patch and start over
Current functionality at the header level:
- Class member merging
- virtual interface members are replaced with non-virtual ones from implementation
- Class members and their parameters are renamed, e.g. s/CSSLoaderImpl/nsICSSLoader/
- Additional members from the implementation are added to the interface
- Extra class/struct definitions used by the implementation members are moved up into the header as needed
- Header and forward declaration inference
- The resulting class will usually not compile because more headers are needed. Squash computes the set difference between the class definitions used by the implementation and those used by the interface. Part of the resulting list ends up as forward declarations; the rest is translated further into #include statements.
- The above is a tricky algorithm to implement fully, so in the meantime there are also manual overrides with -sq-include and -sq-exclude-include
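The header inference step can be sketched with plain standard-library sets. This is an illustration, not squash's actual code: the function and type names are hypothetical, and the rule for choosing between a forward declaration and an #include (whether the full definition is needed) is an assumption, since the exact criterion is not spelled out here.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

typedef std::set<std::string> NameSet;

// Classes used by the implementation that the interface header
// does not already know about (a plain set difference).
NameSet missingClasses(const NameSet& usedByImpl,
                       const NameSet& knownToInterface) {
  NameSet missing;
  std::set_difference(usedByImpl.begin(), usedByImpl.end(),
                      knownToInterface.begin(), knownToInterface.end(),
                      std::inserter(missing, missing.end()));
  return missing;
}

struct HeaderFixups {
  NameSet forwardDecls;  // emitted as "class Foo;"
  NameSet includes;      // translated further into #include lines
};

// Assumed rule: a class whose full definition is required needs an
// #include; everything else gets by with a forward declaration.
HeaderFixups classify(const NameSet& missing,
                      const NameSet& needsFullDefinition) {
  HeaderFixups out;
  for (NameSet::const_iterator it = missing.begin();
       it != missing.end(); ++it) {
    if (needsFullDefinition.count(*it))
      out.includes.insert(*it);
    else
      out.forwardDecls.insert(*it);
  }
  return out;
}

// Small worked example with made-up inputs.
void exampleHeaderInference() {
  NameSet used, known, full;
  used.insert("nsICSSLoader");
  used.insert("nsIURI");
  used.insert("nsString");
  known.insert("nsICSSLoader");
  full.insert("nsString");

  HeaderFixups fix = classify(missingClasses(used, known), full);
  assert(fix.forwardDecls.count("nsIURI") == 1);
  assert(fix.includes.count("nsString") == 1);
}
```

The manual overrides fit naturally into this picture: -sq-include would add an entry to the include set and -sq-exclude-include would remove one, whatever the inference produced.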
At the source level squash renames the class part of function definitions along with their arguments.
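As a rough picture of the combined header-level and source-level changes, the sketch below shows a toy interface and implementation before squashing and a hand-written version of what the merged class might look like afterwards. `nsIExampleLoader` and `ExampleLoaderImpl` are hypothetical names, and the two versions live in separate namespaces only so that the example compiles as one file.

```cpp
#include <cassert>

namespace before {
// Interface as it would appear in the header: pure virtual methods only.
class nsIExampleLoader {
public:
  virtual ~nsIExampleLoader() {}
  virtual int LoadSheet(int aSheetId) = 0;
};

// Implementation class, normally defined in the .cpp file.
class ExampleLoaderImpl : public nsIExampleLoader {
public:
  ExampleLoaderImpl() : mLoadCount(0) {}
  virtual int LoadSheet(int aSheetId);
  int LoadCount() const { return mLoadCount; }  // not in the interface
private:
  int mLoadCount;
};

int ExampleLoaderImpl::LoadSheet(int aSheetId) {
  ++mLoadCount;
  return aSheetId;
}
}  // namespace before

namespace after {
// Squashed class: keeps the interface name, but the virtual member is
// replaced by the non-virtual one and the extra members are merged in.
class nsIExampleLoader {
public:
  nsIExampleLoader() : mLoadCount(0) {}
  int LoadSheet(int aSheetId);
  int LoadCount() const { return mLoadCount; }
private:
  int mLoadCount;
};

// Source level: ExampleLoaderImpl:: has been renamed to nsIExampleLoader::.
int nsIExampleLoader::LoadSheet(int aSheetId) {
  ++mLoadCount;
  return aSheetId;
}
}  // namespace after

// The squashed class should behave exactly like the original pair.
void checkSquashPreservesBehaviour() {
  before::ExampleLoaderImpl oldLoader;
  after::nsIExampleLoader newLoader;
  assert(oldLoader.LoadSheet(7) == newLoader.LoadSheet(7));
  assert(oldLoader.LoadCount() == newLoader.LoadCount());
}
```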
Challenges and Limitations
I am really grateful for Elsa’s ability to parse C++ as used in Mozilla. I think this opens a lot of doors to automation, optimizations and new features that would be too laborious to even consider implementing otherwise. However, Elsa still has some maturing to do. We lose typedef information, and pretty-printing is still spotty and produces output that looks too different from the parsed C++, but the biggest challenge is the lack of “end-of-ASTNode” position information. All of these will be addressed in the future, but in the meantime I have to make the unfortunate choice between getting distracted by work on Elsa/Oink internals and using workarounds so that I can continue writing tools.
My short-term goal is to finish step 1 and to get squash into the oink SVN.