A new entry in the annals of unfortunate software release dates:
- On August 19, Valgrind 3.5.0 was released. It added support for Mac OS 10.5.
- On August 28, Mac OS 10.6 was released.
- Valgrind 3.5.0 does not support Mac OS 10.6.
If you try to install Valgrind on a machine running Mac OS 10.6, it will fail at configure-time. If you hack the configure file appropriately so that the install completes, Valgrind will run but crash quickly on any program. Bug 205241 has the details. Greg Parker says he has a series of patches to make Valgrind work and he’s just waiting for the open source release of xnu (the core of Mac OS X) before making them public. With some luck, these fixes will make it into Valgrind 3.5.1 relatively soon.
However, once that’s fixed, there’s another problem. Mac OS 10.6 uses 64-bit executables by default. In comparison, 10.5 uses 32-bit executables by default, even though it’s capable of creating and running 64-bit executables. Unfortunately Valgrind’s support for 64-bit executables on Mac OS X isn’t very good. The main problem is that start-up is sloooooow, which means that even Hello World takes over four seconds to run on my MacBook Pro. Fixing this one will be harder, as it will require reworking the Mac OS X start-up sequence. Bug 205938 is tracking this problem.
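A rough way to put a number on that start-up cost is to time the same command with and without Valgrind in front of it. The sketch below is only an illustration: the helper names `time_command` and `valgrind_overhead` are made up for this example, and it assumes a `valgrind` executable is on the PATH. The `--tool=none` option selects Nulgrind, so instrumentation cost is kept to a minimum and most of the difference is start-up.

```python
import subprocess
import time


def time_command(argv):
    """Return the wall-clock seconds taken to run argv to completion."""
    start = time.perf_counter()
    # Output is discarded; only the elapsed time matters here.
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def valgrind_overhead(target_argv, valgrind_argv=("valgrind", "--tool=none")):
    """Return extra seconds spent when target_argv runs under Valgrind.

    valgrind_argv is an assumption: it expects Valgrind on the PATH.
    """
    native = time_command(list(target_argv))
    wrapped = time_command(list(valgrind_argv) + list(target_argv))
    return wrapped - native
```

For a Hello World binary, something like `valgrind_overhead(["./hello"])` (with `./hello` standing in for your own program) gives the extra seconds; a single run is noisy, so repeat it a few times before trusting the figure.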
Related to this: does anyone know if there is an easy way to have both 10.5 and 10.6 installed on a single machine? That would be a big help when it comes to developing and testing Valgrind’s Mac OS X support.
Valgrind 3.5.0 has been released! It’s the first release that supports Mac OS X. It also adds a number of other new features and a whole lot of bug-fixes. See the release notes for details. Many thanks to everyone who contributed to this release.
We’re now in the preparation phase for the 3.5.0 release of Valgrind, which will be the first release with Mac OS X support. We’ve absorbed some Mozilla culture in the Valgrind development process: we’re now using Bugzilla much more effectively. We have 17 open blockers (and 18 closed blockers), and 41 open “wanted” bugs (and 7 closed ones). Any contributions towards fixing these bugs are most welcome! We’re hoping to release in early August.
It’s time for the June update on the progress of the Mac OS X port of Valgrind.
Progress has been good: the DARWIN branch has been merged to the trunk. With that having happened, we’re now in sight of an actual release (3.5.0) containing Mac OS X support. There’s some polishing and bug-fixing — both for Mac OS X and in general — to be done before that happens, but hopefully we’ll release 3.5.0 in early August. That will be before Snow Leopard comes out; another release may be necessary afterwards, but we want to get this code released sooner rather than later.
One interesting problem we encountered was that some users were having Valgrind abort with a SIGTRAP extremely early. It was very mysterious, and none of the developers were able to reproduce it. It turns out that a program called Instant Hijack, by a company called Rogue Amoeba, was the cause of the problem. Both Valgrind and Instant Hijack do some stuff with dyld, and apparently Instant Hijack’s stuff is a bit dodgy. There’s an easy workaround, which involves temporarily disabling Instant Hijack. This was reported by a Rogue Amoeba developer: fortunately he tried Valgrind himself, had the same SIGTRAP abort, found the bug report, and realised what the problem was. If it weren’t for him, we’d still be scratching our heads!
In the meantime, keep reporting any problems you have, in particular any unimplemented syscall wrappers; a number have been added lately but there are still more to be done. Please report problems via Bugzilla rather than in comments on this blog, as Bugzilla reports are more likely to be acted upon. Thanks!
This morning I merged the DARWIN branch, which had been holding Valgrind’s support for Mac OS X, onto the trunk. The branch is now defunct, and Valgrind-on-Mac users should check out the trunk like so:
svn co svn://svn.valgrind.org/valgrind/trunk <dirname>
and then build it according to the instructions in the README file.
This is a good thing, if only because it means I can spend less time maintaining a branch and more time actually fixing things.
Update: fixed the svn URL.
It’s time for the May update on the progress of the Mac OS X port of Valgrind. In the last month, 133 commits have been made to the DARWIN branch by Julian Seward and myself.
Here are the current (as of r9898) values of the metrics I have been using as a means of tracking progress.
- The number of regression test failures on Mac was 418/128/43/0. It’s now 421/102/15/0. I.e. the number of failures went from 171 to 117. If we ignore the tools Helgrind, DRD and exp-Ptrcheck (which are not widely used and still mostly broken on the branch) the number of failures dropped from 50 to 13. That’s a similar number to what we get on some Linux systems, and we’re in real diminishing-returns territory — the failing tests are all testing very obscure things. So we can basically declare victory on that front.
- The number of “FIXME”-style marker comments that indicate something in the code that needs to be fixed was 274. It’s now 260. Furthermore, the method I used last month to count “FIXME”-style comments was flawed, so the number has actually gone down by more than 14; the comparison next month will be reliable. But a lot of these comments are for very obscure things that won’t need to be fixed even before a release, so you shouldn’t be worried by the high number!
Functionality improvements from the last month are as follows.
- Some extra system calls are handled.
- Some more signal-handling improvements.
- Some debug info reading improvements.
- File descriptor tracking (--track-fds) now works.
- The --auto-run-dsymutil option was added. When used, it makes Valgrind run dsymutil to generate debug info for any files that need it.
- Helgrind sort of works; some of its tests pass. But it’s still probably not usable.
Things are going well enough that we should be ready to merge the branch to the trunk soon! That will be a significant milestone, and will make life easier as I won’t have to maintain the branch in parallel with the trunk. I’m currently going through the branch/trunk differences carefully in order to get ready for the merge; with luck it will happen by the end of this week.
Update, March 19: fixed some HTML tags.
It’s time for the April update on the progress of the Mac OS X port of Valgrind. It’s been a quieter month because I was on vacation for over 3 weeks, and Julian Seward hasn’t had a great deal of time to work on the port either. Even so, in that time 77 commits have been made to the DARWIN branch.
Here are the current (as of r9567) values of the metrics I have been using as a means of tracking progress.
- The number of regression test failures on Mac was 422/172/41/0. It’s now 418/128/43/0. I.e. the number of failures went from 213 to 171. If we ignore the tools Helgrind, DRD and exp-Ptrcheck (which are not widely used and still completely broken on the branch) the number of failures dropped from 92 to 50. So the functionality of the branch is progressing well.
- The size of the diff between the trunk and the branch was 38,248 lines (1.3MB). It’s now 39,027 lines (1.3MB). However, 2,223 of these lines are code that was cut, but was put in a text file for reference. So the more realistic number would be 36,804 lines (1.2MB). This metric was intended to indicate how close the branch is to being ready to merge with the trunk, but it doesn’t do that very well, so I will stop using it in the future.
- Instead, I’m going to use a new metric: the number of “FIXME”-style marker comments that indicate something in the code that needs to be fixed. A lot of these mark Darwin-specific code that works correctly, but hasn’t been abstracted cleanly. When this approaches zero, it will mean that the branch should be very close to merge-ready. (Actually, the branch may be merge-ready before it reaches zero.) The current number of these is 274. (The task-tracking used within Valgrind is mostly pretty informal; you can get away with it when there’s only a handful of frequent contributors!) That number is quite high, but a lot of those will be easy to fix.
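For anyone who wants to track a similar marker-comment metric on their own tree, here is a minimal sketch. It is not the method used for the numbers above: the marker spellings (FIXME, TODO, XXX), the file suffixes and the function name `count_markers` are all assumptions made for illustration, and it counts every line containing a marker word rather than parsing comments properly.

```python
import re
from pathlib import Path

# Assumed marker spellings; the post does not say which ones were counted.
MARKER_RE = re.compile(r"\b(FIXME|TODO|XXX)\b")
# Assumed set of source file suffixes to scan.
SOURCE_SUFFIXES = {".c", ".h", ".S"}


def count_markers(root):
    """Count lines containing a marker word in source files under root."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            text = path.read_text(errors="replace")
            total += sum(1 for line in text.splitlines()
                         if MARKER_RE.search(line))
    return total
```

Calling `count_markers(".")` from the top of a checkout gives one number per revision. As the May update notes, how you count matters, so keep the method fixed if you want month-to-month comparisons to mean anything.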
Functionality improvements are as follows.
- The build system now works with older versions of automake (pre 1.10). automake’s handling of assembly code files (specifically, whether AM_CPPFLAGS is used for them) changed in 1.10, and the build system wasn’t working with older versions.
- Some extra system calls are handled, enough that iTunes apparently now runs (although I haven’t tried it myself).
- -mdynamic-no-pic is now used for compilation of Valgrind. This turns off position-independent code, which (strangely enough) is the default for GCC on Darwin. This speeds up most programs at least a little, and in some cases up to 30%.
- Some more signal-handling improvements.
So things are still moving along well.
Another month has passed since I last wrote about my work on the Mac OS X port of Valgrind. In that time 126 commits have been made to the DARWIN branch (and a similar number to the trunk). I’ve done a lot of them, but Julian Seward has found some time to work on the DARWIN branch and so has been doing some as well.
Here are the current (as of r9455) values of the metrics I have been using as a means of tracking progress.
- The number of regression test failures on Linux was: 484 tests, 4 stderr failures, 1 stdout failures, 0 post failures (which I’ll abbreviate as 484/4/1/0). It’s now 484/0/1/0. I.e. the number of failures went from 5 to 1, and that one failure occurs on my machine even on the trunk (it’s a bad test). In other words, the branch works on Linux as well as the trunk. Now that this metric is the same on the branch as the trunk, I won’t bother tracking it in the future.
- The number of regression test failures on Mac was 402/213/52/0. It’s now 422/172/41/0. I.e. the number of failures went from 265 to 213. Also, 20 extra tests are being run — a broken CPU feature-detection program meant that a number of tests that should have been running were not, and this has been fixed. Once again, this is the most important metric, and it’s improving steadily, but there’s still a long way to go. One encouraging thing here is that 121 of these failures (more than half) involve the tools Helgrind, DRD and exp-Ptrcheck, which are three of the less-used tools in the Valgrind distribution, and which are all completely broken on the branch, and which I haven’t really looked at yet precisely because they are less-used. The other 92 failures involve Memcheck and Nulgrind (the “no-instrumentation” tool, failures for which indicate problems with the testing of Valgrind’s core). A lot of these are problems with non-portable tests, rather than the Darwin port’s functionality. Furthermore, the tools Cachegrind, Callgrind, and Massif pass all of their tests.
- The size of the diff between the trunk and the branch was 41,895 lines (1.5MB). It’s now 38,248 lines (1.3MB). But note, once again, that this is not a very useful metric. I just scanned through the diff and there’s not a great deal in it that can be merged before we reach the point of the big branch-to-trunk merge.
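The tests/stderr/stdout/post abbreviation used above is easy to total mechanically. The following sketch (the function names are made up for this example) adds up the three failure counts the same way the figures in this post are derived:

```python
def parse_summary(summary):
    """Split a 'tests/stderr/stdout/post' string into four integers."""
    tests, stderr, stdout, post = (int(part) for part in summary.split("/"))
    return tests, stderr, stdout, post


def total_failures(summary):
    """Sum the stderr, stdout and post failure counts."""
    _tests, stderr, stdout, post = parse_summary(summary)
    return stderr + stdout + post


previous = total_failures("402/213/52/0")
current = total_failures("422/172/41/0")
print(f"failures went from {previous} to {current}")
# prints "failures went from 265 to 213"
```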
Functionality improvements are as follows.
- Basic signals are now supported, thanks to Julian. This accounted for a lot of the new test passes. This also means that debug builds of Firefox run successfully!
- Some extra system calls are handled.
- 64-bit builds are working. To configure Valgrind for them, pass the option --build=amd64-darwin to ./configure. 64-bit Valgrind is quite slow: it does some very large mmaps at startup which take several seconds. This will need to be fixed. The 64-bit version also hasn’t been tested as much as the 32-bit version, and passes fewer tests.
I’m taking three weeks of vacation starting on Thursday, so progress on Valgrind+Darwin will be minimal over the next month. But I will be visiting Mountain View early next week (Monday, March 23 and Tuesday, March 24) so I’ll be able to actually meet some of the people I work with! I may also give a talk about Valgrind, depending on whether it can be scheduled. Any suggestions for things to talk about are welcome.
With Valgrind now working reasonably well on Darwin, it’s possible to run Valgrind on an iPhone. Well, not directly on an iPhone, because the Darwin port doesn’t work on ARM, but you can run Valgrind on x86 binaries built against the iPhone Simulator SDK.
However, it’s a little tricky, because you can’t run Valgrind from the command line in this environment. Landon Fuller explains here the small hack that is required to get around this limitation.
It’s been a month since I first wrote about my work on the Mac OS X port of Valgrind. In that time I’ve made 85 commits to the DARWIN branch (and a similar number to the trunk).
Here are the current (as of r9192) values of the metrics I defined in the first post as a means of tracking progress.
- The number of regression test failures on Linux was: 477 tests, 220 stderr failures, 53 stdout failures, 25 post failures (which I’ll abbreviate as 477/220/53/25). It’s now 484/4/1/0. I.e. the number of failures went from 298 to 5. A few new tests have been added. Four of the failures are in Helgrind, the data race detector tool, and I haven’t tracked them down yet. The other failure is one that also occurs on the trunk. So almost all the Linux functionality broken by the changes has been restored.
- The number of regression test failures on Mac was 419/293/58/29. It’s now 402/213/52/0. I.e. the number of failures went from 380 to 265. The total number of tests has gone down because some Linux-specific tests are no longer being (inappropriately) run on Mac. This is the most important metric, and it’s improving steadily, but there’s still a long way to go.
- The number of compiler warnings on Linux was 186. It’s now 10, and all of these are from #warning declarations that mark places where improvements need to be made to the Darwin port, but aren’t actually a problem for Linux. The number of compiler warnings on Mac was 461. It’s now 44. Of these, 33 are from #warning declarations, and 10 are from code generated by the Darwin ‘mig’ utility which I have no control over. So compiler warnings aren’t an issue any more, and I won’t bother tracking them as a metric in the future.
- The size of the diff between the trunk and the branch was 55,852 lines (1.9MB). It’s now 41,895 lines (1.5MB). But note that this is not a very useful metric; progress will usually cause it to drop, but it will also increase as missing Darwin functionality is added.
Interestingly enough, although the number of Mac test failures has gone down significantly, if the branch didn’t handle your program a month ago it probably still won’t handle it now (although getsockopt() no longer causes an abort). But Valgrind’s output may well be better (e.g. debugging information will be better utilized). Much of my effort has been in making the tests pass: improving cases where the Darwin port was doing basically the right thing, but its output didn’t exactly match that expected.
One example is that stack traces were a little unclean, in various minor ways. Another example is that I added an --ignore-fn option to Massif (the heap profiler) which allows it to ignore certain heap allocations. This was required because Darwin’s libc always does a few heap allocations at start-up, but Linux’s libc doesn’t. The new option allows the Darwin allocations to be ignored and therefore Massif’s output to be consistent on both platforms.
Few if any of these changes have made the branch closer to handling new programs, at least directly. But there’s no point apologising about this, because the branch won’t reach a highly functional state without a working test suite to serve as a safety net against regressions. And as I progress, getting more tests to pass will require genuine new program functionality to be supported, so improvements should start to occur on that front soon. For example, signals currently aren’t supported at all, and this is why Firefox does not run under Valgrind on Mac yet — all calls to sigaction() currently return -1, which causes an assertion failure somewhere in NSPR.
Something else worth mentioning: I bought a new MacBook Pro, as my old 32-bit-only one was slow and noisy and getting annoying. The new machine is 64-bit capable, but compiles to 32-bit by default, and Valgrind’s configure script identifies it as a 32-bit-only machine. If anybody knows how to make configure recognise that it’s a 64-bit machine I’d love to hear about it.
Update, March 17: fixed a broken link to an earlier post.