Valgrind’s Memcheck tool works on Linux and macOS, but not on Windows. Interestingly, there is something like it for Windows: “Dr Memory”. Similar in style to Memcheck, Dr Memory is an open source memory checking tool built on top of a JIT-based instrumentation framework called DynamoRIO. It provides essentially identical functionality: detection of invalid memory accesses, uninitialised value uses and memory leaks. Dr Memory claims to be considerably faster than Memcheck, so I was curious to see how it performed.
I recently tried Dr Memory 1.9.0-RC1 on Windows 7, running 32-bit Firefox builds, to see to what extent it can provide coverage for the Windows-specific parts of Gecko.
Installing and getting started isn’t difficult. There are command line flags to direct the output, control the level of instrumentation, specify files listing errors to hide, and so on. As you’d expect.
Despite considerable efforts with Dr Memory, I came away feeling it was a promising tool, but just a bit too hard to use. I encountered two kinds of problems.
Firstly, about half of my Firefox startups ended up spinning. Some of the time, Firefox would start (slowly, of course) and be usable after a couple of minutes. Other runs would spin for an hour or more and still not produce a usable browser. I never figured out why. This seems to be related to the instrumentation, because when I ran Firefox uninstrumented on the DynamoRIO core, the equivalent of Valgrind’s --tool=none, it worked reliably.
A second problem was the considerable number of uninitialised memory read errors. I tried out both non-optimised (“/Zi /Od”) and optimised (“/Zi /O2 /Oy- /Ob0”) builds of Firefox.
For the non-optimised builds, Dr Memory reports no invalid accesses and a few uninitialised memory reads, which is what I’d expect. But it’s unusably slow, because the unoptimised build lacks reasonable register allocation, which easily doubles the number of memory accesses that have to be checked.
So my next step was to try an optimised build. This runs a great deal faster. There’s a downside, though: the number of uninitialised memory reads reported goes way up. Most of these must be false positives, because they weren’t reported in the unoptimised runs.
I investigated further. It is likely that one source of false positives is Dr Memory’s incomplete description of the Windows system call interface. Valgrind’s description of the Linux syscall interface is itself complex, and it is said that the Windows interface makes the Linux interface look simple. Given that, I’m impressed that Dr Memory works as well as it does.
The other source of false positives appears to be bitfields. Dr Memory tracks the definedness state of each byte of memory using one bit for each byte. Consequently it has no way to accurately model partially initialised bytes, and so must unavoidably either report false positives, or miss real errors, depending on which of the two available shadow states partially initialised bytes are mapped to.
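To make that trade-off concrete, here is a toy model of a single byte that holds several bitfields. It is only a sketch: the types `BitShadow` and `ByteShadow` and the `PartialPolicy` enum are invented for illustration and are not taken from Dr Memory or Memcheck. It compares a tracker with one definedness bit per data bit against one with a single definedness bit per byte, under both possible mappings for a partially written byte.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: hypothetical types, not code from any real tool.

// Bit-granular shadow: one definedness bit for each data bit in the byte.
struct BitShadow {
    std::uint8_t defined = 0;  // a set bit means "this data bit is defined"
    void write_bits(std::uint8_t mask) { defined |= mask; }
    bool read_ok(std::uint8_t mask) const { return (defined & mask) == mask; }
};

// How a byte-granular tracker maps a partially written byte.
enum class PartialPolicy { TreatAsDefined, TreatAsUndefined };

// Byte-granular shadow: a single definedness bit for the whole byte.
struct ByteShadow {
    bool defined = false;
    PartialPolicy policy;
    explicit ByteShadow(PartialPolicy p) : policy(p) {}

    void write_bits(std::uint8_t mask) {
        if (mask == 0xFF) {
            defined = true;  // the whole byte was written
        } else if (!defined) {
            // Partial write to an undefined byte: the policy has to pick a side.
            defined = (policy == PartialPolicy::TreatAsDefined);
        }
    }
    // The mask is ignored: the tracker cannot tell individual bits apart.
    bool read_ok(std::uint8_t /*mask*/) const { return defined; }
};

// Scenario: a 3-bit field (mask 0x07) is written. Reading it back is
// legitimate; reading the neighbouring bit (mask 0x08) is a real error.
inline void check_scenario() {
    BitShadow bits;
    bits.write_bits(0x07);
    assert(bits.read_ok(0x07));   // legitimate read accepted
    assert(!bits.read_ok(0x08));  // real error caught

    ByteShadow strict(PartialPolicy::TreatAsUndefined);
    strict.write_bits(0x07);
    assert(!strict.read_ok(0x07));  // false positive on the legitimate read

    ByteShadow lax(PartialPolicy::TreatAsDefined);
    lax.write_bits(0x07);
    assert(lax.read_ok(0x08));  // the real error is missed
}
```

Whichever policy the byte-granular tracker picks, one of the two reads is misclassified, which is the dilemma described above.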
One way to detect probable false-positive bitfield errors in cross-platform Gecko code is to check whether Memcheck reports errors at the same places. In many cases it doesn’t. I created a suppressions file, which tells Dr Memory to hide errors I identified as clearly false. A second line of defense is to add extra initialisation code for bitfields purely in order to keep Dr Memory happy. Neither of these is really what one wants to do, though.
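As an illustration of the second approach, a dummy initialisation might look something like the sketch below. The struct `NodeFlags` and its fields are hypothetical and not taken from Gecko. The idea is to give every bit of the storage byte a name and write all of them in the constructor, so that no later read of any field touches a byte that was only partly written. Whether the compiler turns this into a single whole-byte store is an assumption that would need checking in the generated code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example type; the names are invented for illustration.
struct NodeFlags {
    std::uint8_t mIsDirty   : 1;
    std::uint8_t mIsVisible : 1;
    std::uint8_t mDepth     : 4;
    std::uint8_t mUnused    : 2;  // explicit padding so every bit has a name

    // Dummy initialisation: every bitfield, including the padding, is
    // written up front, even if the program logic always sets the
    // meaningful fields again before reading them.
    NodeFlags() : mIsDirty(0), mIsVisible(0), mDepth(0), mUnused(0) {}
};

// Usage: setting one field later leaves the others at their
// initialised values rather than at whatever was in memory.
inline void example_usage() {
    NodeFlags flags;
    flags.mIsDirty = 1;
    assert(flags.mIsDirty == 1);
    assert(flags.mIsVisible == 0);
    assert(flags.mDepth == 0);
}
```

The cost is extra stores that exist only to satisfy the tool, which is why this is not an attractive general fix.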
The false positive problem seriously compromises Dr Memory’s usefulness on optimised Gecko code, compared to Memcheck. The effect is to create a lot more undefined value errors needing investigation. The situation is exacerbated because Dr Memory doesn’t have an equivalent to Memcheck’s origin-tracking feature, which makes it more difficult to analyse the errors and to determine where, if anywhere, dummy initialisations should be placed.
Dr Memory does have a “light” mode, which restricts it to invalid-address and leak checking only. This increases usability at the expense of losing undefined value checking. If you’re looking for possible heap corruption on Windows, this would be worth a try.