07
Mar 07

Dehydra 0.1

Dehydra does a form of basic symbolic execution now. It’s enough to avoid the unfeasable branches. It enabled me to add known_zero() and known_notzero() so dehydra supports all of the functions documented in the UNO paper.

I believe dehydra can now match all of the features of uno’s DSL. Download it here. See malloc.js for an example of how to port UNO scripts. There is no documentation, except for this blog, but the docs on the UNO website should work too.

This is the first release so there are bugs, but the tool should be able to find some API-misuse issues, etc. I need some early adopters to help direct the evolution of this tool.


02
Mar 07

Uno, dos… DeHydra!

A Creature so Fierce…

I’ve been wrestling with control flow graphs like this. I eliminated those pesky “empty” nodes found in the previous incarnations, improved branching to track conditions and realized that what I’m really doing is developing a tool to bite off the excessive necks and heads (otherwise known as edges and basic blocks) of the Hydra monster that is represented by the bloody graph. Thus, say hello to DeHydra which was once known as dos.

DeHydra Plans

I am quite happy with how dehydra is turning out. JavaScript is a nice language to pattern match code in a flow sensitive fashion and the uno-style api seems appropriate for the task.

In the near term I really need some value-awareness through abstract interpretation. For example, this graph of a simple function demonstrates several things that dehydra will need to notice:

  1. There is a path from hydra-unfeasable.c:6 to hydra-unfeasable.c:10 and that those blocks are part of only a single control flow, not 4.
  2. hydra-unfeasable.c:12 requires a contradiction in the condition in order to have flow through it and should be dropped or even reported as dead code.

I pondered a number of approaches ranging from value analysis, using a theorem prover and reversing conditions via prolog code. A simple C++ symbolic expression interpreter in C++ seems like the most productive thing at this point. So I will focus on that in the next short while.

While I was reviewing my options I found a paper on null pointer analysis for Java to be a useful stating point for references to other relevant literature.

Longer Term DeHydra Plans

There is literature on how source-to-source transformation (like squash but also like the future C++ to ES4 tool) is assisted by flow sensitive static analysis tools (like dehydra) which I will have to absorb.
DeHydra will gain whole-program analysis capability by:

  1. Inlining control flow graphs into the main() function
  2. Supporting user-defined function attributes and using some sort of attribute inference (like in ML) to propagate them. This method would have higher performance, but less precision.

For the above to work will also need support for a user-correctable/suggestable callgraph to compensate for function pointers and dynamic module loading. This should also be enough to do dead code and module detection. Oh and I want dehydra to support JavaScript for doing cross-language analysis.

There is also Whaley’s paper which uses prolog as a concise and expressive query engine for the source code. I would really like to see dehydra evolve such capabilities.

All this will take more than a little time on my own, but this stuff is too exciting for other impatient people to idly observe.