Me, Valgrind, and Mac OS X

Welcome to my blog, where I’ll be discussing some of the work I’m doing for Mozilla.

A little about me

I’m Australian. I live in Melbourne. I’ve also lived in Cambridge, England and Austin, Texas, and so I am fluent in at least three dialects of English. I like spending time with my wife Phoebe and baby daughter Keira, eating food, riding my bike, and following US presidential elections obsessively. Two weeks ago I left the academic/research world and started working for Mozilla.

Valgrind

My first big task for Mozilla is to improve support for Mac OS X in Valgrind. I’ve been involved with Valgrind since before the 1.0 release in 2002, and have done lots of work on it, including writing two tools that are in the Valgrind distribution: Cachegrind, a cache profiler, and Massif, a memory profiler. I even wrote a PhD dissertation about it.

And it seems that lots of Mozilla people find Valgrind useful, which is nice. However, it currently only runs on Linux. (Well, it also runs on AIX, but not many people care about that.)

Valgrind on Mac OS X

More than four years ago, on December 16, 2004, an Apple employee named Greg Parker wrote to the Valgrind developers mailing list to tell us that he was working on a port of Valgrind for Mac OS X.  He’s been working on it ever since. (This must be why Mac OS X 10.5 shipped late.)

After such a long time, I’m happy to report that there is now a branch holding Greg’s port in the Valgrind SVN repository.  If you want to check it out, do this:

  svn co svn://svn.valgrind.org/valgrind/branches/DARWIN <workspace-name>
  cd <workspace-name>

and then build it according to the instructions in the README file.  The branch is called DARWIN because Darwin is the name of the Mac OS “core”, which consists of a Mach-based microkernel and a few other bits and pieces.

However, please note that the port currently is, in Greg’s words: “UNSUPPORTED and INCOMPLETE and BUGGY… It may not find bugs in your program, or run your program correctly, or run your program at all.” What Greg has done is very impressive, and goes an awfully long way towards having a complete port of Valgrind on Mac OS X.  But it’s not the cleanest patch ever.  To give you an idea…

  • The patch I imported was 31,144 lines, just over 1MB of text.
  • The patch initially didn’t work on 32-bit Macs.
  • The patch broke Valgrind on Linux.  This took me a couple of days to fix, mostly involving the addition of appropriate #if statements.
  • The patch broke the regression tests;  they wouldn’t even build, let alone run. Once I had fixed them to run again, more than half of the tests failed on Linux, and almost three-quarters failed on Mac.
  • There are lots of compiler warnings.  (The Valgrind trunk has none.)
  • Much of the code in the patch has 4-space indenting;  the rest of the Valgrind code has 3-space indenting.
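
To illustrate the kind of #if guard the Linux fix involved, here is a minimal sketch in C. It is not code from the branch: platform_name() and is_darwin_build() are hypothetical helpers, and the guards test the compiler-predefined macros __APPLE__, __MACH__ and __linux__, which may well differ from the macros Valgrind's own build system defines.

```c
#include <string.h>

/* Hypothetical helper (not from the Valgrind source): names the OS this
   file is being compiled for, using macros the compiler predefines. */
const char *platform_name(void)
{
#if defined(__APPLE__) && defined(__MACH__)
   return "darwin";   /* Mac OS X: Darwin-specific code would go here */
#elif defined(__linux__)
   return "linux";    /* Linux keeps its original code path */
#else
   return "unknown";  /* any other OS, which this sketch does not cover */
#endif
}

/* Returns 1 when built for Darwin, 0 otherwise. */
int is_darwin_build(void)
{
   return strcmp(platform_name(), "darwin") == 0;
}
```

The preprocessor discards the branches that don't match, so Darwin-only code inside such a guard never reaches the Linux compiler. That is what lets a single source tree build on both systems.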

So there’s plenty of work to be done to get the branch into a state where it will be suitable for merging with the trunk.  It’s hard to estimate how long this will take;  it will just be a matter of fixing things one piece at a time.  My guess is that three months might suffice, but it’s really just a guess.  Here are some metrics I can use to judge progress, along with their values just after I got the system and regression tests building and running again on Mac and Linux:

  • The number of regression test failures on Linux: 477 tests, 220 stderr failures, 53 stdout failures, 25 post failures.  (“stderr” failures generally indicate that Valgrind’s output had a problem, “stdout” failures generally indicate that the test program’s output had a problem, and “post” failures indicate that the output of a Valgrind post-processing step had a problem.)  These numbers roughly indicate how much existing functionality has been broken on Linux by the Darwin changes, and should be fairly easy to get down.
  • The number of regression test failures on Mac:  419 tests, 293 stderr failures, 58 stdout failures, 29 post failures.  These numbers are the most important, as they roughly indicate how complete the Mac functionality is, and will be much more work to get down.
  • The number of compiler warnings: 186.  This number should be easy to reduce.   (Update, Jan 20: That’s on Linux. On Darwin it was 461.)
  • The size of the diff between the branch and the trunk: 55,852 lines, 1.9MB.  This is larger than the original patch because some files have been moved on the branch but not yet moved on the trunk, including some tests that are large and have large expected outputs.  This number will go down in fits and starts;  it will never get to zero, as the final merge will happen when there are many differences between the branch and trunk.
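
The three failure counts above don't say directly how many distinct tests failed. The following sketch in C bounds that number, under the assumption (not stated in the post) that a single test may be counted in more than one failure column. The struct run_counts type and the functions min_failed() and max_failed() are hypothetical names made up for this illustration.

```c
/* Counts reported for one platform's regression run. */
struct run_counts {
   int total;   /* tests run */
   int err;     /* "stderr" failures */
   int out;     /* "stdout" failures */
   int post;    /* "post" failures */
};

/* Fewest tests that can have failed: if the three kinds of failure all
   hit the same tests, the largest single count is the floor. */
int min_failed(struct run_counts r)
{
   int m = r.err;
   if (r.out > m) m = r.out;
   if (r.post > m) m = r.post;
   return m;
}

/* Most tests that can have failed: if no test fails in two ways, the
   counts simply add up, capped at the number of tests run. */
int max_failed(struct run_counts r)
{
   int sum = r.err + r.out + r.post;
   return sum < r.total ? sum : r.total;
}
```

With the figures above, this gives between 220 and 298 failing tests out of 477 on Linux (roughly 46% to 62%), and between 293 and 380 out of 419 on Mac (roughly 70% to 91%).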

I’ll occasionally post updates to these numbers so people can track progress.

If Valgrind-on-Mac is of interest to you, please try out the new branch and let me know how it goes. Note that I’m working on an old MacBook Pro which is only 32-bit, so it’s possible that I’ve broken the 64-bit Mac support, and I have no way to check.

9 Responses to Me, Valgrind, and Mac OS X

  1. Great news!

    Have you checked with Apple that they are not working on this? Sounds strange to me that Greg puts all that work in and then Apple doesn’t include it in Snow Leopard…but then again, in that case, he probably wouldn’t have released his version just yet.

  2. Nicholas Nethercote

    It’s not working nearly well enough to include in Snow Leopard. Also, since Valgrind is GPL, I imagine Apple doesn’t want it included in Snow Leopard.

  3. I would highly recommend having a way to run at the very least the test suite on a 64-bit machine. 64-bit is very very much the future of Mac OS X. A 32-bit only patch won’t be very useful for very long.

    It’s possible someone like vlad, alice or jesse would be able to hook you up with a 64-bit OS X machine to ssh into and at least run tests.

  4. Being a fellow Melbournian & bike rider, what’s your usual/favourite bike ride(s)?

  5. Hi,
    I’m trying valgrind with a tool that reads in text from a filehandle and then pushes it out over the network via a socket connection. Valgrind seems to bail out when the socket is being created. Here’s the output:

    valgrind ./bropipe -df /tmp/url-alerts.txt
    ==57151== Memcheck, a memory error detector.
    ==57151== Copyright (C) 2002-2008, and GNU GPL’d, by Julian Seward et al.
    ==57151== Using LibVEX rev 1880, a library for dynamic binary translation.
    ==57151== Copyright (C) 2004-2008, and GNU GPL’d, by OpenWorks LLP.
    ==57151== Using valgrind-3.5.0.SVN, a dynamic binary instrumentation framework.
    ==57151== Copyright (C) 2000-2008, and GNU GPL’d, by Julian Seward et al.
    ==57151== For more details, rerun with: -v
    ==57151==
    ==57151== Listening for debugger on port 2159
    DEBUG: try opening `/tmp/url-alerts.txt’ as input
    DEBUG: got BroConn handle
    DEBUG: attempt to connect to 127.0.0.1:47757…UNKNOWN __sigaction is unsupported. This warning will not be repeated.

    valgrind: m_syswrap/syswrap-generic.c:1452 (vgModuleLocal_generic_PRE_sys_getsockopt): Assertion ‘Unimplemented functionality’ failed.
    valgrind: valgrind
    ==57151== at 0xF00AE0AB: ???
    ==57151== by 0xF00ADF36: ???
    ==57151== by 0xF0134EC3: ???
    ==57151== by 0xF0125537: ???
    ==57151== by 0xF00F2546: ???
    ==57151== by 0xF00EF1BA: ???
    ==57151== by 0xF00F0659: ???
    ==57151== by 0xF011D99F: ???
    ==57151== by 0xFFFFFFFF: ???
    ==57151== by 0xF24BBA5C: ???
    ==57151== by 0xF24BBA08: ???
    ==57151== by 0xF00AE85D: ???
    ==57151== by 0xF24BB9E0: ???
    ==57151== by 0x2000004: ???
    ==57151== by 0x101: ???
    ==57151== by 0xF24BBA5C: ???
    ==57151== by 0x13: ???

    sched status:
    running_tid=1

    Thread 1: status = VgTs_Runnable
    ==57151== at 0x23F302: getsockopt (in /usr/lib/libSystem.B.dylib)
    ==57151== by 0x41BFF0: conn_state (in /usr/lib/libcrypto.0.9.7.dylib)
    ==57151== by 0x41C319: conn_write (in /usr/lib/libcrypto.0.9.7.dylib)
    ==57151== by 0x3EE811: BIO_write (in /usr/lib/libcrypto.0.9.7.dylib)
    ==57151== by 0x3C4EA: __bro_openssl_write (in /usr/local/bro/lib/libbroccoli.2.dylib)
    ==57151== by 0x3A324: io_msg_empty_tx (in /usr/local/bro/lib/libbroccoli.2.dylib)
    ==57151== by 0x3AA2E: io_msg_queue (in /usr/local/bro/lib/libbroccoli.2.dylib)
    ==57151== by 0x3608B: conn_init_configure (in /usr/local/bro/lib/libbroccoli.2.dylib)
    ==57151== by 0x364A5: bro_conn_connect (in /usr/local/bro/lib/libbroccoli.2.dylib)
    ==57151== by 0x2518: make_connection() (in ./bropipe)
    ==57151== by 0x30DE: main (in ./bropipe)

    Note: see also the FAQ.txt in the source distribution.
    It contains workarounds to several common problems.

    If that doesn’t help, please report this bug to: http://www.valgrind.org

    In the bug report, send all the above text, the valgrind
    version, and what OS and OS version you are using. Thanks.

    Illegal instruction

    Looks like some kind of unimplemented functionality is causing an exception?

  6. Nicholas Nethercote

    BlueMM: I used to work at Melbourne Uni, so I did/do the Brunswick-to-Parkville route along the Upfield bike path and Royal Parade a lot.

    For longer rides I usually do the Yarra trail from the CBD round to the Merri Ck at Rushall station, and along Park St in Fitzroy/Carlton.

  7. My apologies for the incorrect indenting. I had emacs on one machine configured for valgrind-style once upon a time, but lost that configuration later.

    No apologies for the rest of the patch. Big ports are always ugly…

  8. Yep – finally! Valgrind coming to OS X!
    Awesome!!

    Can’t wait for the first alpha version to try it out.
    I used valgrind in my day job on Linux, and I’ve been missing something equally powerful in my Mac OS X tool suite for years now.

  9. Nicholas Nethercote

    Jay B: you should check out the code from the trunk and try that, it’s pretty robust — it can handle a non-debugging build of Firefox, which is a pretty big program. And there’s already one bug filed against the port in the Valgrind bug database :)