ZOMG JSON CFGs!
One thing at a time:
I’ve been working on a CFG extractor for C++ using Taras’ gcc plugin system, and I’ve got a basic prototype working. CFG stands for “control flow graph”, which is a low-level representation of a function suitable for automatic analysis. A CFG has two main features: it shows the control structures of the function in flow-chart format, and instead of having C++ code it has a sort of bytecode. (In my case, GIMPLE expressions, which is how gcc represents CFGs.)
Here is a graphical view of the CFG for the method nsServerSocket::TryAttach.
As suggested in the title, what my plugin really does is output CFGs in JSON format. I don’t know if Taras entirely approves, but I think it’s a great idea, because (a) then I don’t have to mess with JSAPI (the C++ API for the Spidermonkey JS interpreter), (b) in my experience it really helps to have textual intermediate formats for complex analysis processes, and (c) you can write code to explore the CFG in your favorite language, in my case Python.
The immediate practical goal I have in mind is analyzing how error codes (
nsresults) are used in Firefox, so I can figure out how to replace them with C++ exceptions. As a test example, I started writing a script to find the propagate pattern, which is anything that looks like:
nsresult rv = callSomeMethod();
if (NS_FAILED(rv)) return rv;
The significance of propagate is that refactoring it to use exceptions is easy: just remove rv and all the code that uses it.
My idea for the script is to find method calls, then for each call do a DeHydra-like exploration of all forward paths that can be reached when rv is a failure code (i.e., rv != 0). If all the failure paths return rv before doing anything else, then we have the propagate pattern.
I’ve coded up a crude version of this idea that’s just good enough to find the two instances of propagate in TryAttach. And it runs in about .007 seconds on TryAttach, which is a rate of 500,000 methods per hour. Over all of FF, it will almost certainly not run that fast, but at least I haven’t already proven it’s too slow.
Next steps will be trying the analysis on more FF methods, or cleaning up the CFG extractor and adding it to the Dehydra-gcc repository.