Squash Milestone Reached
Squash can now produce a patch that squashes my testcase class nsCSSLoaderImpl into the nsICSSLoader interface such that the resulting code compiles, links and runs!
Patching function bodies turned out easier than expected. Since the last post, I’ve added the ability to rewrite variable declarations, casts and static method calls. This was enough to get nsCSSLoader.cpp compiling.
I also ran into an issue where some methods need to remain virtual so that they can be referenced from other modules. I added a -sq-virtual flag to specify the names of methods that need to stay virtual.
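To make the transformation concrete, here is a rough hand-written sketch of what squashing an implementation class into its interface looks like at the source level. The names ILoader, LoaderImpl, PendingCount and Use are hypothetical stand-ins, not the real nsICSSLoader declarations, and the "after" code shows the intended shape rather than actual squash output.

```cpp
#include <cassert>
#include <string>

// Before: an abstract interface with a single concrete implementation.
namespace before {
struct ILoader {
    virtual ~ILoader() {}
    virtual int LoadSheetSync(const std::string& url) = 0;
    virtual int PendingCount() const = 0;
};

class LoaderImpl : public ILoader {
public:
    LoaderImpl() : mPending(0) {}
    int LoadSheetSync(const std::string& url) override { return url.empty() ? -1 : 0; }
    int PendingCount() const override { return mPending; }
private:
    int mPending;
};

inline int Use(ILoader* loader) {
    // A cast from interface to implementation: one of the constructs
    // that has to be rewritten once the two classes are merged.
    LoaderImpl* impl = static_cast<LoaderImpl*>(loader);
    return impl->PendingCount();
}
}  // namespace before

// After: the implementation's members live directly in the interface class.
namespace after {
class ILoader {
public:
    ILoader() : mPending(0) {}
    virtual ~ILoader() {}
    // Kept virtual, as if it had been named with -sq-virtual.
    virtual int LoadSheetSync(const std::string& url) { return url.empty() ? -1 : 0; }
    // No longer virtual: there is only one implementation left.
    int PendingCount() const { return mPending; }
private:
    int mPending;
};

inline int Use(ILoader* loader) {
    // The cast is gone and the variable declaration is no longer needed.
    return loader->PendingCount();
}
}  // namespace after

// Sanity check that both shapes behave the same for a caller.
inline void CheckEquivalent() {
    before::LoaderImpl oldStyle;
    after::ILoader newStyle;
    assert(before::Use(&oldStyle) == after::Use(&newStyle));
}
```

Only the methods that other modules call through a vtable keep the virtual keyword in this sketch; everything else becomes an ordinary member function.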
I discovered that the implementation class can be used from other source files, so squash can now work on multiple files. Unfortunately, this made me run into another Elsa misfeature: memory allocation. Elsa data structures do not attempt to clean up in their destructors, so once an AST is produced it remains in memory for the duration of execution. This is an issue because merely parsing all of the .i files in layout/style/ takes over 600 MB of memory, even though squash is strictly sequential and processes a single file at a time. Hopefully, converting Elsa to use auto_ptr is feasible and I won’t have to resort to funny fork() tricks to reclaim memory.
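As a sketch of what owning pointers would buy, the following toy AST frees a whole tree as soon as the per-file root goes out of scope. Node and processFile are hypothetical and are not Elsa types or functions; std::unique_ptr is used in place of auto_ptr, which was deprecated in C++11 and removed in C++17.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node, not an Elsa type.
struct Node {
    static int live;                                // nodes currently allocated
    std::string label;
    std::vector<std::unique_ptr<Node>> children;    // a parent owns its children

    explicit Node(std::string l) : label(std::move(l)) { ++live; }
    ~Node() { --live; }                             // children are freed automatically
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;

    // Append a child and return a non-owning pointer to it.
    Node* add(std::string l) {
        children.push_back(std::unique_ptr<Node>(new Node(std::move(l))));
        return children.back().get();
    }
};
int Node::live = 0;

// Stand-in for parsing and processing one .i file.
// Returns how many nodes were alive while the file was being processed.
int processFile(const std::string& name) {
    std::unique_ptr<Node> root(new Node(name));
    Node* fn = root->add("function");
    fn->add("body");
    root->add("class");
    assert(root->children.size() == 2);
    return Node::live;
}   // root is destroyed here, which releases the whole tree
```

With this kind of ownership, peak memory is bounded by the largest single file rather than the sum of all files. One caveat of the sketch: destruction is recursive, so a very deep tree could need an iterative teardown instead.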
ML vs C++ for Compilers: Rant
I wonder why people insist on using C++ for symbolic manipulation instead of an ML-family language like O’Caml, and then either give up or, more frequently, reinvent features such as the ML type system, list processing, garbage collection or pattern matching. Isn’t it more productive to not have to deal with segfaults and slow compilation times, and to get a tenfold reduction in code size?
First I grep the build directory for usage of CSSLoaderImpl. This is imprecise and will eventually be handled by squash itself, but first either the memory deallocation issue has to be addressed or an index of the whole source tree needs to be built.
find -name \*.o | xargs grep CSSLoaderImpl
This returned nsCSSLoader.o and nsLayoutStatics.o. Now .i files are produced by running make in their respective directories.
For convenience I gather the .i files in a moz directory and run squash.
./squash -o-lang GNU_Cplusplus -sq-exclude-include string/nsTString.h -sq-include nsString.h -sq-include nsCOMArray.h -sq-virtual LoadSheetSync -sq-virtual LoadSheet -sq-implementation CSSLoaderImpl moz/nsCSSLoader.i moz/nsLayoutStatics.i > cssloader.patch
Turns out pretty printing C++ is hard and Oink/Elsa’s pretty printer still needs a lot of work. By producing patches, squash rewrites only the code that needs changing. This avoids pretty printer bugs and preserves comments and the original code structure as much as possible. The wackiness of the pretty printed code is apparent in cssloader.patch, especially in the function bodies.
I am happy to see that patching is viable even without precise source coordinates or preprocessor support in Elsa. My near-term goals for squash are:
- Push squash upstream
- Add the ability to translate out-parameters to return values where possible
- Get a list of candidates for DeCOMtamination and improve squash enough to process all of them
- Work on a source code indexer. This would be useful to squash as a semantic grep database and could also be used to improve lxr.
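The out-parameter goal in the list above could look roughly like this. It is only a sketch under my own assumptions: Result, kOk, Sheet, GetTitle and Title are hypothetical names rather than Mozilla code, and "where possible" is taken to mean that the method can never return a failure code.

```cpp
#include <cassert>
#include <string>

typedef unsigned int Result;   // hypothetical stand-in for a COM-style status code
const Result kOk = 0;

class Sheet {
public:
    explicit Sheet(const std::string& title) : mTitle(title) {}

    // Before: a status code is returned and the value comes back
    // through an out-parameter, even though the call cannot fail.
    Result GetTitle(std::string* aTitle) const {
        *aTitle = mTitle;
        return kOk;
    }

    // After: the value is returned directly.
    const std::string& Title() const { return mTitle; }

private:
    std::string mTitle;
};

// Caller written against the out-parameter version.
inline std::string DescribeBefore(const Sheet& s) {
    std::string title;
    Result rv = s.GetTitle(&title);
    assert(rv == kOk);
    (void)rv;                  // silence unused-variable warnings in release builds
    return "sheet: " + title;
}

// The same caller once the out-parameter has become a return value.
inline std::string DescribeAfter(const Sheet& s) {
    return "sheet: " + s.Title();
}
```

The caller loses a temporary and an error check that could never fire, which is the kind of cleanup an automated rewrite would have to apply at every call site.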
In the longer term, I would also like to see some Elsa changes:
- Figure out a memory de-allocation strategy
- Resolve the pain caused by Elsa having its own “string” class, which makes using the STL an exercise in namespace verbosity. If Elsa were to switch to C++ strings, the above de-allocation job would be simplified too.
- Elsa is lossy when it comes to C++ extensions, so the Elsa AST may need to be extended a little
- It would be nice to improve Elsa memory consumption further. This would be hard.
- It would be great to make Elsa’s C++ const-correct