Tracemonkey – Page 2 – Nicholas Nethercote

After six months of working on Tracemonkey, I’ve built up a particular workflow — how I use Mercurial, arrange my workspaces, run tests, and commit code. I thought it would be worth describing this in case it helps other developers improve their workflow, or perhaps so they can give me ideas on how to improve my own workflow.

Workspace Structure

I have two machines, an Ubuntu Linux desktop and a Mac laptop. For both machines I use the same workspace structure. All my Mozilla work is in a directory ~/moz/. At any one time I have up to 10 workspaces. ~/moz/ws0/ always contains an unmodified clone of the tracemonkey repository, created like so:

hg clone http://hg.mozilla.org/tracemonkey/ ~/moz/ws0

Workspaces ~/moz/ws1 through to ~/moz/ws9 are local clones of ~/moz/ws0/ in which I make modifications. I create these workspaces like this:

hg clone ~/moz/ws0 ~/moz/wsN

Local hg clones are much cheaper than ones done over the network. On my Linux box it takes about 45 seconds, on my Mac somewhere over 2 minutes; it seems that laptops have slower hard disks than desktops. In comparison, cloning hg.mozilla.org/tracemonkey/ can take anywhere from 5 to 30 minutes or more (I don’t know why there’s so much variation there).

I mostly work with the Javascript shell, called ‘js’, so I do most of my work in ~/moz/wsN/js/src/. There are three ways I commonly build ‘js’.

Debug builds go in ~/moz/wsN/js/src/debug/. I use these for most of my development and testing.
Optimised builds go in ~/moz/wsN/js/src/opt/. I use these for measuring performance.
Optimised builds with symbols go in ~/moz/wsN/js/src/optg/. I use these with Cachegrind, which needs optimised code with symbols to be useful.

I have a number of bash aliases I use to move around these directories:

alias m="cd ~/moz/"
alias m0="cd ~/moz/ws0/"
alias j0="cd ~/moz/ws0/js/src/"
alias j0d="cd ~/moz/ws0/js/src/debug/"
alias j0o="cd ~/moz/ws0/js/src/opt/"

and so on for the remaining workspaces ws1 through ws9. I have a common bash config file that I use on both my machines; whenever I change it I copy it to the other machine. This is a manual process, which is not ideal, but in practice it works well enough.

I find nine workspaces for making changes is enough to cover everything I’m doing; if I find myself needing more it’s because some of the existing ones have stagnated and I need to do some cleaning up.

Building ‘js’

I have three scripts, js_conf_debug, js_conf_opt, js_conf_optg, which configure and build from scratch. Here is js_conf_debug, the others are similar:

#! /bin/sh

if [ -z $1 ] ; then
    echo "usage: $0 <dirname>"
elif [ -d $1 ] ; then
    echo "directory $1 already exists"
else
    autoconf2.13
    mkdir $1
    cd $1
    CC='gcc -m32' CXX='g++ -m32' AR=ar ../configure \
        --enable-debug --disable-optimize --target=i686-pc-linux-gnu
    make --quiet -j 2
fi

These are scripts rather than bash aliases or functions because they are quite different on the Linux machine and the Mac.

I also have this alias for incremental builds:

alias mq="make --quiet -j 2"

Testing ‘js’

The program I run most is trace-test.js. So much so that I have more aliases for it:

alias jsott="time opt/js -j trace-test.js"
alias jsdtt="time debug/js -j trace-test.js"

I don’t need an alias for the optg build because that’s only used with Cachegrind, which I run in a different way (see below).

I run the JS unit test with the following script:

function js_regtest
{
    x=$1
    y=$2
    if [ -z $x ] || [ -z $y ] ; then
        echo "usage: js_regtest <ws-number-1> <ws-number-2>"
    else
        xdir=$HOME/moz/ws$x/js/src/debug
        ydir=$HOME/moz/ws$y/js/src/debug
        echo "############################"
        echo "## COMPILING $xdir"
        echo "############################"
        cd $xdir && mq
        echo "############################"
        echo "## COMPILING $ydir"
        echo "############################"
        cd $ydir && mq
        cd $ydir/../../tests
        echo "############################"
        echo "## TESTING $xdir"
        echo "############################"
        time jsDriver.pl \
            -k \
            -e smdebug \
            --opt '-j' \
            -L spidermonkey-n.tests slow-n.tests \
            -f base.html \
            -s $xdir/js && \
        echo "############################"
        echo "## TESTING $ydir"
        echo "############################"
        time jsDriver.pl \
             -k \
             -e smdebug \
             --opt '-j' \
             -L spidermonkey-n.tests slow-n.tests \
             -L base-failures.txt \
             -s $ydir/js
    fi
}

An example invocation would be:

js_regtest 0 3

The above invocation first ensures a debug ‘js’ is built in workspaces 0 and 3. Then it runs ~/moz/ws0/js/src/debug/js in order to get the baseline failures, which are put in base-failures.txt. Then it runs ~/moz/ws3/js/src/debug/js and compares the results against the baseline. The -L lines skip the tests that are really slow; without them it takes hours to run. I time each invocation just so I always know roughly how long it takes; it’s a bit over 10 minutes to do both runs. It assumes that workspace 0 and 3 correspond to the same hg revision; perhaps I could automate that to guarantee it but I haven’t (knowingly) got that wrong yet so haven’t bothered to do so.

Timing ‘js’

I time ‘js’ by running SunSpider. I obtained it like so:

svn http://svn.webkit.org/repository/webkit/trunk/SunSpider ~/moz/SunSpider

I haven’t updated it in a while, I hope it hasn’t changed recently!

I run it with this bash function:

function my_sunspider
{
    x=$1
    y=$2
    n=$3
    if [ -z $x ] || [ -z $y ] || [ -z $n ] ; then
        echo "usage: my_sunspider <ws-number-1> <ws-number-2> <number-of-runs>"
    else
        for i in $x $y ; do
            dir = $HOME/moz/ws$i/js/src/opt
            cd $dir || exit 1
            make --quiet || exit 1
            cd ~/moz/SunSpider
            echo "############################"
            echo "####### TESTING ws$i #######"
            echo "############################"
            time sunspider --runs=$n --args='-j' --shell $dir/js > opt$i
         done

         my_sunspider_compare_results $x $y
    fi
}

function my_sunspider_compare_results
{
    x=$1
    y=$2
    if [ -z $x ] || [ -z $y ] ; then
        echo "usage: my_sunspider_compare_results <ws-number-1> <ws-number-2>"
    else
        sunspider-compare-results \
            --shell $HOME/moz/ws$x/js/src/opt/js opt$x opt$y
    fi
}

An invocation like this:

my_sunspider 0 3 100

will ensure that optimised builds in both workspaces are present, and then compare them by doing SunSpider 100 runs. That usually gives what SunSpider claims as +/-0.1% variation (I don’t believe it, though). On my Mac this takes about 3.5 minutes, and 100 runs is enough that the results are fairly reliable, certainly more so than the default of 10 runs. But when testing a performance-affecting change I like to do some timings, wait until a few more patches have landed in the tree, then update and rerun the timings — on my Mac I see variations of 5-10ms regularly due to minor code differences. Timing multiple versions like this gives me a better idea of whether a timing difference is real or not. Even then, it’s still not easy to know for sure, and this can be frustrating when trying to work out if an optimisation I applied is really giving a 5ms speed-up or not.

On my Linux box, I have to use 1000 runs to get +/-0.1% variation. This takes about 25 minutes, so I rarely do performance-related work on this machine. I don’t know why Linux causes greater timing variation.

Profiling ‘js’ with Cachegrind

I run Cachegrind on ‘js’ running SunSpider with this bash function:

function cg_sunspider
{
    x=$1
    y=$2
    if [ -z $x ] || [ -z $y ] ; then
        echo "usage: cg_sunspider <ws-number-1> <ws-number-2>"
    else
        for i in $x $y ; do
            dir = $HOME/moz/ws$i/js/src/optg
            cd $dir || exit 1
            make --quiet || exit 1
            cd ~/moz/SunSpider
            time valgrind --tool=cachegrind --branch-sim=yes --smc-check=all \
                --cachegrind-out-file=cachegrind.out.optg$i \
                --auto-run-dsymutil=yes \
                $dir/js `cat ss0-args.txt`
            cg_annotate --auto=yes cachegrind.out.optg$i > ann-optg$i
        done
    fi
}

ss0-args.txt contains this text:

-j -f tmp/sunspider-test-prefix.js -f resources/sunspider-standalone-driver.js

What this does is run just the main SunSpider program, once, avoiding all the start-up processes and all that. This is important for Cachegrind — it means that I can safely use –cachegrind-out-file to name a specific file, which is not safe if running Cachegrind on a program involving multiple processes. (I think this is slightly dangerous… if you run ‘sunspider –ubench’ it seems to change one of the above .js files and you have to rerun SunSpider normally to get them back to normal.) I use –branch-sim=yes because I often find it to be useful; at least twice recently it has helped me identify performance problems.

If I want to focus on a particular Cachegrind statistic, e.g. D2mr (level 2 data read misses) or Bim (indirect branch mispredictions) then I rerun cg_annotate like this:

cg_annotate --auto=yes --show=I2mr --sort=I2mr cachegrind.out.optgN > ann-optgN-I2mr

Profiling ‘js’ with Shark

To profile ‘js’ with Shark, I use SunSpider’s –shark20 and –test options. I don’t have this automated yet, I probably should.

Managing Changes with Mercurial

Most of my changes are not that large, so I leave them uncommitted in a workspace. This is primitive, but has one really nice feature: when pulling and updating, hg merges the changes and marks conflicts in the nice “<<<” “>>>” way.

In comparison, with Mercurial queues (which I tried for a while) you have to pop your patches, update, then push them, and it uses ‘patch’ to do the merging. And I hate ‘patch’ because conflicted areas tend to be larger, and because they go in a separate reject file rather than being inserted inline.

I also avoid doing local commits unless I’m working on something really large just because the subsequent merging is difficult (at least, I think it’s difficult; my Mercurial knowledge still isn’t great). In that case I do local commits until the change is finished, then apply the patch (using ‘hg diff’ and ‘patch’) in a single hit to a newly cloned tree — given Mozilla’s use of Bugzilla, the change will have to be a single patch anyway so this aggregation step has to happen at some point.

Pre-push Checklist

Before landing any patch, I do my best to work through the following check-list. I created this list recently after having to back out several commits due to missing one of the above steps; I give examples of breakage I’ve caused in square brackets.

Ensure there are no new compiler warnings for ‘js’ for optimised and debug builds. [I managed to introduce some warnings on an optimised build recently for what was supposedly a whitespace-only change!]
Ensure ‘js’ runs trace-test.js without failures, for optimised builds, debug builds, debug builds with TMFLAGS=full (to test the verbose output) under Valgrind (to test for memory errors). [I’ve had to back out several patches due to breaking TMFLAGS=full]
Ensure lirasm builds and passes its tests for both optimised and debug builds. [I’ve forgotten this numerous times, leaving lirasm in a broken state, which is why I created bug 503449].
Ensure unit tests pass with a debug build. [Amusingly enough, I don’t think I’ve ever caused breakage by forgetting this step!]
(For any commit that might affect performance) Check SunSpider timings with an optimised build.
(For complex changes) Check the patch on the try servers. (Nb: they run optimised builds, so will miss assertion failures among other things)
(For changes affecting the ARM backend) Check the patch builds and runs trace-test.js (using a debug build) on my virtual qemu+ARM/Linux machine.
Check tinderbox to make sure the tree is open for commits. [When the tree is closed, there’s no mechanism that actually prevents you from committing. I had to back-out a patch during a tinderbox upgrade because of this.]

It’s quite a list, and I don’t usually do anything with a browser build, when I probably should, so that would make it even longer. And there are other things to get wrong… for example, I never test the –disable-jit configuration and I broke it once.

Pushing

When I’m ready to push a change, I make sure my workspaces are up-to-date with respect to the Mozilla repo. I then commit the change to my modified repo, then push it from there into ~/moz/ws0/, then check ‘hg outgoing -p’ on that repo to make sure it looks ok, and then push to the Mozilla repo from there. I try to do this quickly so that no-one else lands something in the meantime; this has only happened to me once and I tried to use ‘hg rollback’ to undo my local changes which I think should have worked but seemingly didn’t.

Post-push Checklist

After committing, I do these steps:

Mark the bug’s “whiteboard” field as “fixed-in-tracemonkey”.
Put a link to the commit in a comment for the bug, of the form http://hg.mozilla.org/tracemonkey/rev/<revhash>/. I always test the link before submitting the comment.

Conclusions

That’s a lot of stuff. Two of my more notable conclusions are:

Automation is a wonderful thing. In particular, having scripts for the complicated tasks (e.g. running the unit tests, running sunspider, running sunspider under Cachegrind) has saved me lots of time and typing (and lots of head-scratching and re-running when I realised I forgot some command line option somewhere). And this automation was made much easier once I settled on a standard workspace+build layout.
The pre-push checklist is both disconcertingly long and disconcertingly incomplete. And I had to work it out almost entirely by myself — I’m not aware of any such check-list documented anywhere else. Having lots of possible configurations really hurts testability. I’m not sure how to improve this.

If you made it this far, congratulations! That was pretty dry, especially if you’re not a Tracemonkey developer. I’d love to hear suggestions for improving what I’m doing.