09 Jun 22

Ephemeron Tables aka JavaScript WeakMaps and How They Work

Introduction

I read “Ephemerons explained” today after finding it on Hacker News, and it was good but lengthy. It was also described in terms of the Squeak language and included post-mortem finalization, which is unavailable in JavaScript (and frankly sounds terrifying from an implementation point of view!). I thought I’d try my hand at writing up a shorter and hopefully simpler explanation covering only what is available in JS.

Ephemerons—Effin’ Ron What?

Ephemeron tables are the underlying data structure for JavaScript WeakMaps. WeakMaps are very similar to plain Maps: if you have the map and a key, you can look up a value. The differences are (1) you can only use objects (and soon symbols) as WeakMap keys, (2) the API is limited to prevent retrieving any entries without having the corresponding key in hand, and (3) WeakMaps are hooked into the garbage collector (GC) so that they don’t keep as much stuff alive.

If a regular Map is alive, then so are all of its keys. And values. And anything they might contain, recursively.

If a WeakMap is alive, on the other hand, it won’t keep any key or value alive on its own. Only if something else is keeping a particular key alive will the WeakMap keep the corresponding value alive.

That’s it. Everything else falls out of that.

In terms of usage, WeakMaps are good for annotating an object with data that isn’t useful if the object is no longer needed. You could map an object to some expensive-to-compute cached information, for example. Or maybe you want to track whether an Error object has been logged, or associate a DOM object with some information about it. It’s like adding an invisible property to an object that you can only look at if you look it up in some invisibleProperty WeakMap.
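To make that concrete, here’s a minimal sketch; the invisibleProperty, setLogged, and wasLogged names are made up purely for illustration:

const invisibleProperty = new WeakMap();

function setLogged(err) {
  // Annotate the Error without adding a visible property to it.
  invisibleProperty.set(err, { logged: true });
}

function wasLogged(err) {
  // Retrieving the annotation requires having both the WeakMap and the key in hand.
  return invisibleProperty.has(err);
}

// Once nothing else refers to a given err, its entry no longer keeps the
// annotation alive, even though invisibleProperty itself is still alive.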

WeakMaps and Garbage Collection

Let me expand on that last point a little. Say you have your invisibleProperty WeakMap filled up with a bunch of entries mapping various objects to their properties. If any of the objects dies, then the corresponding value is no longer kept alive simply by being in that WeakMap. (It may not actually die, because something else may refer to it.) Moreover, if you discard the invisibleProperty WeakMap itself (by setting invisibleProperty=null and not having anything else that refers to it), then none of those key/value entries will keep the corresponding value alive anymore.

Mathematically, a WeakMap entry is an edge from a WeakMap WM and a key K to some value V:

    BOTH(WM,K)→V

In order for the entry to keep V alive, both WM and K have to be alive. If either is dead, then the entry has no effect on V’s liveness, and in fact is unobservable. (You can’t look something up in a weakmap that you don’t have. And you can’t look up a key that you don’t have either. So whether the entry exists or not makes no difference to you.)

WeakMap GC Implementation

An important consequence of the above rule is that something might be alive for reasons that require looking through a chain of WeakMap entries. You might have a WeakMap entry value that is itself a WeakMap. Or the value is used as a key in the same or another WeakMap. This complicates the simple marking GC implementation, which is: start from a set of objects known to be live, and mark everything that they contain (or can directly reach in any way) as live, recursively.

With WeakMaps, when you mark some object as being live, you don’t know whether it might be a key in some random WeakMap out there. Or perhaps you do, because you’ve cleverly set a bit on everything used as a WeakMap key—but this does you little good, because you also need to know whether any of the WeakMaps the key is in are themselves alive, and you may not have figured that out yet.

The simple solution: mark everything normally, but collect a list of all of the WeakMaps you discover to be live. Then loop through all of their entries, check each key to see if it’s alive, and if so mark its corresponding value as live.

But one loop may not be enough—if you mark any values, then you may have discovered either a new WeakMap or an object that is used in a known or not-yet-known WeakMap. So you’ll need to keep repeating until no new values are marked. In terms of computational complexity, you’ll visit up to n objects each time through the loop, and you’ll loop up to n times, for a total of O(n²) operations. Perhaps you’re not familiar with that expression? It is written in the language of computer science, where it is pronounced “oh, crap!”.
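In collector-internal pseudo-JS (mark, isMarked, entriesOf, and liveWeakMaps are stand-ins for whatever the real GC uses; you can’t enumerate a real WeakMap from script), the naive version looks roughly like this:

// After normal marking, which also collected the set of live WeakMaps.
// Assume mark() traces recursively and appends any newly found WeakMaps to liveWeakMaps.
let markedSomething = true;
while (markedSomething) {             // may repeat up to n times
  markedSomething = false;
  for (const wm of liveWeakMaps) {
    for (const [key, value] of entriesOf(wm)) {
      if (isMarked(key) && !isMarked(value)) {
        mark(value);                  // may make more keys or WeakMaps reachable
        markedSomething = true;
      }
    }
  }
}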

Now, normally it would be really hard to hit the worst case here. But on the Web, anything—no matter how stupid—can somehow make somebody money. Or amuse them, or whatever. Therefore somebody will do it.

Linear-time Implementation

There’s a straightforward fix, though—don’t do the above. Instead, every time you discover a live WeakMap, add all of its entries to a big hashtable. Additionally, every time you mark an object, look it up in the hashtable and if you find it, mark the values of any entries you find. If n is the number of live objects, you’ll visit each one once and do a constant amount of work, for a running time of O(n) (pronounced “oops I forgot about the constants”).

You wouldn’t actually implement it quite that way, but you can use a little optimism to skip the slow part almost all of the time. And when you can’t—well, O(n²) means for small n, it will take a few extra milliseconds. For large n, it will take until Thursday. So at least you won’t be collecting garbage until Thursday.

Sorry, what?

Do your tracing in two phases:

  • Start from a set of roots. Mark everything reachable from the roots, recursively. Whenever you encounter an ephemeron table (and therefore know it is live, since it is reachable from a root), iterate over all of its entries. If an entry’s key is already known to be live, trace its value. Otherwise, add it to a table of pending entries, keyed by the key. After this phase is done, you will probably have visited the vast majority of the object graph. (Everything remaining is only reachable by going through one or more ephemeron table entries.)
  • At this point, you have a mostly-marked graph, and a table of ephemeron keys, some of which are now marked (but weren’t when you added them to the table). Scan through the table and find all of the now-marked keys, and trace their values. However, now whenever you visit any object (the value or anything reachable from the value), immediately look that object up in the table of pending entries and trace the values of any entries you find. (If you encounter a not-yet-seen ephemeron table during this process, do the same thing as before.)

Every object in the graph will be visited at most twice, and every operation on an object is O(1)—constant time. So the overall scan is O(fast n + slow n) = O(n).
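Here’s a toy sketch of those two phases, as collector-internal pseudo-JS over a pretend heap where each object carries a marked flag, a children list, and (for ephemeron tables) a weakEntries list; roots stands for the root set. None of this is real WeakMap API; it’s just to show the shape of the algorithm.

const pending = new Map();  // unmarked key -> values to trace if that key turns out to be live

function addPending(key, value) {
  if (!pending.has(key)) pending.set(key, []);
  pending.get(key).push(value);
}

function trace(obj, phase2) {
  if (obj.marked) return;
  obj.marked = true;
  if (phase2) {
    // Phase 2 addition: anything we mark might be a key in some live ephemeron table.
    const values = pending.get(obj) || [];
    pending.delete(obj);
    for (const v of values) trace(v, phase2);
  }
  for (const [key, value] of obj.weakEntries || []) {
    if (key.marked) trace(value, phase2);
    else addPending(key, value);
  }
  for (const child of obj.children || []) trace(child, phase2);
}

// Phase 1: ordinary marking from the roots.
roots.forEach(obj => trace(obj, false));

// Phase 2: entries whose keys turned out to be live get their values traced,
// this time checking every newly marked object against the pending table.
for (const [key, values] of pending) {
  if (key.marked) {
    pending.delete(key);
    for (const v of values) trace(v, true);
  }
}

Each object gets marked once, and each pending entry is added and consumed at most once, which is where the linear bound comes from.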


20 Dec 19

Running taskcluster tasks locally

Work right from your own home!

It can be difficult to debug failures in Taskcluster that don’t happen locally. Interactive tasks are very useful for this, but they broke during the last migration — a relevant bug is bug 1596632, which is duped to a just-fixed bug, so maybe it works now? I recently encountered a situation where I really needed to interactively debug something, so I decided to take the plunge and discover the answer to the question: how can I run tasks locally?

Local tasks not only provide the advantages of interactive tasks, they also let you run against your local checkout. That makes for a much faster edit-run-curse-debug cycle, and opens up possibilities for using this in a lot more situations than the usual last-ditch efforts that interactive try server tasks are usually used for. (Or at least, that’s how I use them. And mostly don’t use them.)

I’m going to walk through the process of setting up and running a taskcluster job in a local container. Note that I have no idea how generally applicable this is. I will give the steps necessary to run the SM(gdb) job, which builds the JS shell and runs some gdb prettyprinter tests against it. I have no idea how far it will get you to running something like mochitests.

Getting the image

Taskcluster normally runs Docker images. So the first step is to get your very own copy of the appropriate docker image. There’s a handy blog post by someone who actually knows what he’s talking about that I found well after the fact (of course). But I’m going to give the exact steps that I used:

  • Click on the task you’re trying to replicate in treeherder.
  • Open the full log file.
  • Search for a line that says something like “Downloading artifact “public/image.tar.zst” from task ID: VuFo68PeQjCH7k15tSN2Dg.” near the beginning of the file. Call that ID $IMAGEID.
  • Run ./mach taskcluster-load-image --task-id $IMAGEID from your Gecko checkout.

and then optionally,

  • Curse and flail around when something goes wrong with the docker import process, as it always seems to.
  • Maybe install docker in the first place. Whoops, forgot to mention that.
  • You probably want it to be running as well.

Getting the image up and running

mach will helpfully give you a command to run a shell in the image, something like

/usr/bin/docker-current run -ti --rm debian7-amd64-build:e2e821aea119e4a264340c22b79324ac804955b605577dd225df5f4f8e98e0cc bash

Don’t do that. It’s a great command, but it’s a little overzealous about cleaning up after itself. But grab out that image name: IMAGE=debian7-amd64-build:e2e821aea119e4a264340c22b79324ac804955b605577dd225df5f4f8e98e0cc

Although for now, I guess it’s really not bad. Just remove the --rm option and give it a try.

If you get a shell to pop up, congratulations! Be happy! If not, try asking someone with a clue or, failing that, ask me. I’m sfink in the #developers channel on IRC, or if you’re reading that after we’ve spun up our new Matrix overlord, I’ll probably be moving there. Oh, and if you’re in the Mozilla secret club, I suppose I won’t ignore you if you hit me (@sfink) up on Slack either.

Anyway, we’re going to need to download some stuff into this image, which means we need a network. Mine didn’t start with a network. I don’t know much about Docker, but this got me a network:

  • ifconfig to figure out your local IP address, or do it some other way. My IP was 10.0.0.14.
  • docker network create -o "com.docker.network.bridge.host_binding_ipv4"="10.0.0.14" my-network

    Replace the “10.0.0.14” with your own IP and, if you wish, “my-network” with something cooler-sounding. That’ll spit out some monstrous ID like 1793d9caad6d5973922b7a78ae11a2bce6005781ca18c0e253d1c2c5317f5c93 that you have to read out in Pig Latin in under 5 seconds. Or you can just ignore it.

  • docker ps to get the ID of your running container. (Or add -a if you’re going to be running a container you’ve created already.) Call that $CONTAINER_ID.
  • docker network connect my-network $CONTAINER_ID

Come to think of it, I only did that once with an old container I’m no longer using, and all of the new containers I’ve created come up with a functioning network from the get-go. So you can probably ignore all of the above.

Grafting your source into your container

Now that you have a container with a network running and everything, it’s time to throw it out and start over. I did say “don’t do that”, remember?

The next goal is to start up a container with your local source tree bind-mounted. Let’s call the absolute path to your checkout $SRCDIR.

  • Let’s expand your container-creating command to something like:
    docker run -ti -v $SRCDIR:/builds/worker/source:z $IMAGE bash

    [Note 2]

  • But don’t run that either. Or at least, don’t run it if you actually are trying to run the gdb task, because it requires some extra privileges in order to do the right ptrace magic.
  • Here’s the actual command I use:
    docker run -ti -v $SRCDIR:/builds/worker/source:z --cap-add=SYS_PTRACE --security-opt seccomp=unconfined $IMAGE bash

Ignoring the gdb ptrace goop, what that’s doing is bind-mounting $SRCDIR on your host so that it shows up at /builds/worker/source within your container, and additionally doing the fixup necessary for SELinux to allow you to then access the data from within the container. If you’re worried about stuff running within the container messing up your source checkout, you could add ,ro to the volume portion of that command:

docker run -ti -v $SRCDIR:/builds/worker/source:z,ro --cap-add=SYS_PTRACE --security-opt seccomp=unconfined $IMAGE bash

But honestly, I’ve never tried doing that yet.

Snarfing taskcluster initialization

Hopefully, you now have a shell open in a container that is basically identical to what runs in taskcluster. You’re home free, right?

Not so fast. Taskcluster does some magic setup, I’m not entirely sure how, to provide an environment with a bunch of important settings that don’t come with a default shell. I figured out a bunch of stuff you could do manually to replicate this environment. Here’s a list of steps that I recommend you do not take:

  • Go back to your push on treeherder.
  • Click on the Task link in the bottom left pane.
  • Expand the “payload” section.
  • Somehow convert the whole “env” section to environment variable setting commands. I used to save the whole payload as a JSON file /tmp/task.json, then run
    perl -lne 'if (/"env"/ .. /^\s*\}/) { print "export $1='\''$2'\''" if /"(.*?)": "(.*)"/ }' /tmp/task.json
  • Cut & paste that into the shell running on your container.
  • Also cut & paste
    export TASKCLUSTER_ROOT_URL=https://firefox-ci-tc.services.mozilla.com

    to prevent it from attempting to access stuff via internal URLs that won’t work from your desktop.

  • Now grab the “command” key from that payload and stitch it together into a shell command to paste…
      …but that’s way too much work, and is incomplete besides.

Running the command

That last step, where you grabbed the command out of the payload? It’s not going to work. The main reason is that it attempts to do some hg fingerprinting thing that won’t work when you’re running outside of the data center. But the automation I created to avoid that also does the environment initialization piece, so I’ll whack the two little birdies with one rock.

  • Download https://raw.githubusercontent.com/hotsphink/sfink-tools/master/bin/mk-task-runner or check out all of https://github.com/hotsphink/sfink-tools and find it in bin/.
  • Look at the bottom left pane for the Task field of the job you’re cloning. Copy the magic task ID next to it, something like fpcWJf1hTEinv8F49luc_w. Let’s call that $TASKID.
  • From within your source checkout, run mk-task-runner $TASKID. That’ll download the task descriptor, grab out the relevant pieces, and generate a simple run-task.sh.
  • Because this is in your source checkout, it should be visible from within the container. So run source/run-task.sh.

This should run the whole task, and after it’s done, drop you into another shell with the environment settings preserved in case you want to do some further poking around.

…except, once again, it probably won’t work.

run-task

Tasks run via a script taskcluster/scripts/run-task. Which is great, and it almost works perfectly for our purposes. Except it tries to do its own checkout of gecko, and it spends a bunch of time downloading stuff and then deletes it at the end. Both of those are not so helpful if you’re trying to run and rerun against your own checkout.

I have bug 1605232 open for patches that add options to avoid that, but (1) it hasn’t landed, (2) it hasn’t been reviewed, (3) it may not be the direction The Powers That Be want to go, and (4) they might really rather not be using run-task for both automation and manual running in the first place. All of which could lead to this solution changing. If I’m a good person[Note 1], I’ll come back to this post and update it with the new information when things change.

In the meantime, you have two main options:

1. Edit run-task.sh to get rid of the --keep and --existing-gecko-checkout=... options and let it run against a fresh checkout and re-download stuff, or
2. Apply the patch in the above bug to your local checkout.

Aftermath

I was careful to post this during the holiday season to be sure you wouldn’t read it, but it looks like you somehow did anyway. If taskclustery people who actually work on this stuff would like to correct my undoubtedly numerous mistakes, I would be most appreciative, so please get in touch. Or if you can tell me where I’m making it all too hard.

If you use this and it works for you, I’d be curious to know what you’re using it for. If you try to use it and it doesn’t work, I’d kinda like to know that too (I haven’t tried any other tasks yet). If you try to use it and get angry about it not working, or it eats your data, I’m perfectly okay with you not getting in touch with me.


Footnote 1: I’m not a good person.

Footnote 2: The documentation says you should really be using --mount in place of -v aka --volume. But my version of docker doesn’t have --mount.


17 Aug 18

Type examination in gdb

Sometimes, the exact layout of objects in memory becomes very important. Some situations you may encounter:

  • When overlaying different types as “views” of the same memory location, perhaps via reinterpret_cast, unions, or void*-casting. You want to know where the field in one view lands in another.
  • When examining a struct layout’s packing, to see if there is space being wasted.
  • When looking at a crash at some offset like 0xc8, it’s common for that to be a NULL pointer dereference of a field of some structure, as if you did MyStruct* foo = nullptr; return foo->field;.

A very handy command for looking at the underlying storage of types is pahole. But I’m usually in gdb already when I want to examine types. Plus, I somehow have never gotten comfortable with the separate pahole command — possibly because it has a tendency to seg fault when I try running it. The latest gdb (8.1) also has it built-in if you run ptype/o typename.

Tom Tromey implemented a simple version of pahole inside gdb using the Python scripting APIs. I stole[1] his code, fixed some bugs, mucked with his output layout, added some more bugs, fixed some of those, and added it to my gdb startup script collection. I also abstracted out the type traversal and created an additional command to examine what is at a given offset.

offset

Let’s say we have a crash at the address 0x58. Suspecting a NULL pointer dereference while manipulating a js::RegExpShared object, let’s look at what is at that offset:

(gdb) offset 0x58 js::RegExpShared
Scanning byte offsets 88..95
overlap at byte 88..127 with this.tables : js::RegExpShared::JitCodeTables
overlap at byte 88..95 with this.tables.mBegin : mozilla::UniquePtr *

By default, when you examine an offset, offset will look at a native word’s worth of memory. Here, I’m running on 64-bit, so we’re looking at the 8 bytes starting at 0x58 (aka 88 decimal). (You could do offset/1 88 js::RegExpShared to only look at the single byte offset 88.)

From the above output, we can see that the field tables overlaps the offset range being inspected. tables itself is a js::RegExpShared::JitCodeTables structure, and its mBegin field occupies the relevant offsets.

Let’s look at another example:

(gdb) offset 184 JSContext
Scanning byte offsets 184..191
overlap at byte 184..199 with this.kind_ : js::WriteOnceData
overlap at byte 184..187 with this.kind_.value : js::ContextKind
overlap at byte 188..187 with this.kind_.check : js::CheckUnprotected
overlap at byte 188..191 with this.kind_ : <32-bit hole> in js::WriteOnceData before field 'nwrites'

This has some weirdnesses. First, notice the “byte 188..187” range of kind_.check. That’s an empty js::CheckUnprotected struct[2].

Next, we seem to have collided with a hole in the JSContext structure. We can use pahole to look at the overall structure. But first, let’s switch to a simpler structure (JSContext is huge and the output would be pages long).

(gdb) offset 0 js::jit::ABIArg
Scanning byte offsets 0..7
overlap at byte 0..3 with this.kind_ : js::jit::ABIArg::Kind
overlap at byte 4..7 with this : <32-bit hole> in js::jit::ABIArg before field 'u'

Now we can get a higher-level view with pahole.

pahole

(gdb) pahole js::jit::ABIArg
  offset size
       0   16 : struct js::jit::ABIArg {
       0    4 :   kind_ : js::jit::ABIArg::Kind
       4    4 : --> 32 bit hole in js::jit::ABIArg <--
       8    8 :   u : struct union {...} {
   8  +0    1 :     gpr_ : js::jit::Register::Code
   8  +0    8 :     fpu_ : js::jit::FloatRegister::Code
   8  +0    4 :     offset_ : uint32_t
                  } union {...}
                } js::jit::ABIArg

This displays the full structure, with each field (or hole) displayed beside its offset and size within the overall type. Offsets of fields directly inside of the given type are given directly. Offsets within sub-structures are given as the offset of that structure within the outermost struct, plus an offset within the inner struct.

These outputs can get very noisy when they are deeply nested, possibly displaying a lot more data than you care about. You can clamp the depth of the tree with a /N suffix option if you wish:

(gdb) pahole/1 js::jit::ABIArg
  offset size
       0   16 : struct js::jit::ABIArg {
       0    4 :   kind_ : js::jit::ABIArg::Kind
       4    4 : --> 32 bit hole in struct js::jit::ABIArg <--
       8    8 :   u : union {...}
                } struct js::jit::ABIArg

Probably not useful with this example, since the interesting stuff is now hidden, but it makes large or deeply nested types much more readable.

Installing

If you would like to use this stuff, you can see the full directions for installing my random helper crap, or you can just grab the one file gdbinit.pahole.py and source it from your ~/.gdbinit:

source ~/path/to/gdbinit.pahole.py

Footnotes

1. "Stole" in the GPLv3 sense, that is -- tromey's original and my modified version are both released under the GPLv3 license.

2. Whether empty structs should even be displayed is questionable. What does that even mean?


01 Jun 17

sfink Mozilla workflow

Intro

I thought I’d write up one of those periodic posts describing my workflow. My workflow is not best for everyone. Nor is it the best possible one for me, since I’m a creature of habit and cling to comfortable tools. But it can be helpful to look at what others do, and see what you might be able to steal.

This is going to be more of a summary overview than an in-depth description or tutorial. I am happy to expand on bits you are curious about. Note that there are good docs already for the “normal” workflow at http://mozilla-version-control-tools.readthedocs.io/en/latest/hgmozilla/index.html

A number of things here use local crap that I’ve piled up over time. I’ve published a repository containing some of them. At the moment, I have it uploaded to two different places. I don’t know how long I’ll keep them in sync before giving up on one:

Also, note that the WordPress formatting of this document isn’t very good; you’d probably be better off reading this on github, especially since I will be keeping it up to date there and not here on my blog.

Code Management

I use mercurial. I like mercurial. I used git first, for quite a while, but it just doesn’t stick in my brain.

I formerly used mq, and when I’d finally had enough of it, I tried to make my vanilla hg workflow provide as many of the same benefits as possible. I also use evolve[1], though it’s mostly just to make some things nicer.

I use phases heavily to keep track of what’s “mine”. If you’re pushing to any personal repositories, be sure to mark them non-publishing.

Pulling from upstream

I use the mozilla-unified repository. I have this in my ~/.hgrc:

[paths]
unified = https://hg.mozilla.org/mozilla-unified

so I can pull with

% hg pull unified

Read more on the unified repo. I will usually rebase on top of inbound. ./mach mercurial-setup should set you up with firefoxtree, which will cause the above pull to advance some local tags that will conveniently give you the head of the various repositories. My usual next step is

% hg rebase -d inbound

That assumes you are currently updated to the “patch stack” that you want to rebase, probably with a bookmark at its head.

What’s my state?

The biggest thing I missed from mq was an easy way to see my current “patch stack”. My main tool for this is an alias hg ls:

% hg ls
418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking weakmap.incremental

You can’t see the colors, sorry. (Or if you can, you’re looking at this document on bitbucket and the colors are random and crazy.) But the first line is orange, and is the public[2] revision that my current patch stack is based on. The remaining lines are the ancestry of my current checkout. Note the weird format: I have it display “|” so I can double-click the hash and copy it. If I were smarter, I would teach my terminal to work with the normal ‘:’ separator. Without breaking URL highlighting.

“weakmap.incremental” is green in my terminal. It’s a bookmark name. Bookmarks are my way of keeping track of multiple things I’m working on. They’re sort of feature branches, except I have a bad habit of piling up a bunch of unrelated things in my patch stack. If they start interfering with each other too much, I’ll rebase them onto the tip of my mozilla-inbound checkout and give them their own bookmark names:

% hg rebase -d inbound weakmap.incremental
% hg book -r 9b790021a607 gc.triggers
% hg rebase -d inbound gc.triggers

The implementation of hg ls in my ~/.hgrc is:

[alias]
# Will only show changesets that chain to the working copy.
ls = !if [[ -n "$1" ]]; then r="$1"; else r=.; fi; $HG log -r "parents(::$r and not public()) + ::$r and not public()" --template "{label('changeset.{phase}', '{rev}|{node|short}')} {label('tags.normal', ifeq(tags, '', '', ifeq(tags, 'tip', '', '{tags}\n    ')))}  {desc|firstline} {label('tags.normal', bookmarks)}\n"
sl = ls

(Note that I mistype hg ls as hg sl about 40% of the time. You may not be so burdened.) There are better aliases for this. I think ./mach mercurial-setup might give you hg wip or something now? But I like the terse output format of mine. (Just ignore that monstrosity of a template in the implementation.)

That works for a single stack of patches underneath a root bookmark. To see all of my stacks, I do:

% hg lsb
work                           Force-disable JIT optimization tracking
haz.specialize                 Implement a mechanism to specialize functions on the function pointers passed in
sixgill.tc                     Try marking JSCompartment as a GCThing for the hazard analysis
phase-self-time                phase diagnostics -- it really is what I said, with parallel tasks duration
GCCellPtr.TraceEdge            Implement TraceEdge for GCCellPtr
weakmap.incremental            Bug 1167452 - Incrementalize weakmap marking

‘lsb’ stands for ‘ls bookmarks’. And the above output is truncated, because it’s embarrassing how much outdated crap I have lying around. The implementation of lsb in my ~/.hgrc:

[alias]
lsb = log -r 'bookmark() and not public()' -T '{pad("{bookmarks}", 30)} {desc|firstline}\n'

Note that this displays only non-public changesets. (A plain hg bookmarks will dump out all of them… sorted alphabetically. Bleagh.) That means that when I land something, I don’t need to do anything to remove it from my set of active features. If I land the whole stack, then it’ll be public and so will disappear from hg lsb. If I land part of the stack, then the bookmarked head will still be visible. (But if I bookmarked portions of the stack, then the right ones will disappear. Phases are cool.)

Working on code

Updating, bookmarking

When starting on something new, I’ll update to ‘inbound’ (feel free to use ‘central’ if you don’t want to live dangerously. Given that you’ll have to rebase onto inbound before landing anyway, ‘central’ is probably a much smarter choice.) Then I’ll create a bookmark for the feature/fix I’m working on:

% hg pull unified
% hg update -r inbound
% hg book remove.xul

Notice the clunky name “remove.xul”. I formerly used ‘-‘ to separate words in my bookmark names, but ‘-‘ is a revset operator. It’ll still work for many things (and I think it’d work with everything if you did eg hg log -r 'bookmark("remove-xul")', but that’s too much typing). Using periods as separators, that’s just hg log -r remove.xul.

Making commits

I will start out just editing code. Once it’s in a reasonable state, or I need to switch to something else, I’ll commit normally:

% hg commit -m 'make stuff gooder'

Then while I’m being a good boy and continuing to work on the feature named in the bookmark, I’ll just keep amending that top commit:

% hg amend

hg amend is a command from the mutable-history aka evolve extension[1]. If you’re not using it, you could substitute hg commit --amend, but it will annoyingly keep asking you to update the commit message. There’s a fix, but this document is about my workflow, not yours.

But often, I will get distracted and begin working on a different feature. I could update to inbound or central and start again, but that tends to touch too many source files and slow down my rebuilds, and I have a lot of inertia, so usually I’ll just start hacking within the same bookmarked patch stack. When done or ready to work on the original (or a new) feature, I’ll make another commit.

When I want to go back to working on the original feature, I still won't bother to clean things up, because I'm a bad and lazy person. Instead, I'll just start making a bunch of micro-commits pertaining to various of the patches in my current stack (possibly one at a time with hg commit, or possibly picking apart my working directory changes with hg commit -i; see below). I use a naming convention in the patch descriptions of "M-<feature>". So after a little while, my patch stack might look like:

418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking
418172|deadbeef4dad   M-triggers
418173|deadbeef4dad   M-static
418174|deadbeef4dad   M-triggers
418175|deadbeef4dad   M-weakmap
418176|deadbeef4dad   M-triggers

What a mess, huh? Now comes the fun part. I'm a huge fan of the 'chistedit' extension[3]. The default 'histedit' will do the same thing using your text editor; I just really like the curses interface. I have an alias to make chistedit use a reasonable default for which revisions to show, which I suspect is no longer needed now that histedit has changed to default to something good. But mine is:

[alias]
che = chistedit -r 'not public() and ancestors(.)'

Now hg che will bring up a curses interface showing your patch stack. Use j/k to move the highlight around the list. Highlight one of the patches, say the first "M-triggers", and then use J/K (K in this case) to move it up or down in the list. Reshuffle the patches until you have your modification patches sitting directly underneath the main patch, eg

pick  418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
pick  418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
pick  418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
pick  418173|deadbeef4dad   M-static
pick  418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
pick  418174|deadbeef4dad   M-triggers
pick  418172|deadbeef4dad   M-triggers
pick  418176|deadbeef4dad   M-triggers 
pick  418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking
pick  418175|deadbeef4dad   M-weakmap

Now use 'r' to "roll" these patches into their parents. You should end up with something like:

pick  418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
pick  418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
pick  418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
roll^ 418173|deadbeef4dad
pick  418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
roll^ 418174|deadbeef4dad
roll^ 418172|deadbeef4dad
roll^ 418176|deadbeef4dad
pick  418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking
roll^ 418175|deadbeef4dad

Notice the caret that shows the direction of the destination patch, and that the commit messages for the to-be-folded patches are gone. If you like giving your micro-commits good descriptions, you might want to use 'f' for "fold" instead, in which case all of your descriptions will be smushed together for your later editing pleasure.

Now press 'c' to commit the changes. Whee! Use hg ls to see that everything is nice and pretty.

There is a new hg absorb command that will take your working directory changes and automatically roll them into the appropriate non-public patch. I haven't started using it yet.

(chistedit has other nice tricks. Use 'v' to see the patch. j/k now go up and down a line at a time. Space goes down a page, page up/down work. J/K now switch between patches. Oops, I just noticed I didn't update the help to include that. 'v' to return back to the patch list. Now try 'm', which will bring up an editor after you 'c' commit the changes, allowing you to edit the commit message for each patch so marked.)

From my above example, you might think I use one changeset per bug. That's very bug-dependent; many times I'll have a whole set of patches for one bug, and I'll have multiple bugs in my patch stack at one time. If you do that too, be sure to put the bug number in your commit message early to avoid getting confused[4].

Splitting out changes for multiple patches

I'm not very disciplined about keeping my changes separate, and often end up in a situation where my working directory has changes that belong to multiple patches. Mercurial handles this well. If some of the changes should be applied to the topmost patch, use

% hg amend -i

to bring up a curses interface that will allow you to select just the changes that you want to merge into that top patch. Or skip that step, and instead do a sequence of

% hg commit -i -m 'M-foo'

runs to pick apart your changes into fragments that apply to the various changesets in your stack, then do the above.

Normally, I'll use hg amend -i to select the new stuff that pertains to the top patch, hg commit -i to pick apart stuff for one new feature, and a final hg commit to commit the rest of the stuff.

% hg amend -i  # Choose the stuff that belongs to the top patch
% hg commit -i -m 'Another feature'
% hg commit -i -m 'Yet another feature'
% hg commit -m 'One more feature using the remainder of the changes'

And if you accidentally get it wrong and amend a patch with stuff that doesn't belong to it, then do

% hg uncommit -a
% hg amend -i

That will empty out the top patch, leaving the changes in your working directory, then bring up the interface to allow you to re-select just the stuff that belongs in that top patch. The remnants will be in your working directory, so proceed as usual.

When I want to work on a patch "deeper" in the stack, I use hg update -r or hg prev to update to it, then make my changes and hg amend to include them into the changeset. If I am not at the top changeset, this will invalidate all of the other patches. My preferred way to fix this up is to use hg next --evolve to rebase the old child on top of my amended changeset and update to its new incarnation.

The usual evolve workflow you'll read elsewhere is to run hg evolve -a to automatically rebase everything that needs rebasing, but these days I almost always use hg next --evolve instead just so it does it one at a time and if a rebase runs into conflicts, it's more obvious to me which changeset is having the trouble. In fact, I made an alias

[alias]
advance = !while $HG next --evolve; do :; done

to advance as far as possible until the head changeset is reached, or a conflict occurs. YMMV.

Resolving conflicts

Speaking of conflicts, all this revision control craziness doesn't come for free. Conflicts are a fact of life, and it's nice to have a good merge tool. I'm not 100% happy with it, but the merge tool I prefer is kdiff3:

[ui]
merge = kdiff3

[merge-tools]
kdiff3.executable = ~/bin/kdiff3-wrapper
kdiff3.args = --auto $base $local $other -o $output -cs SyncMode=1
kdiff3.gui = True
kdiff3.premerge = True
kdiff3.binary = False

I don't remember what all that crap is for. It was mostly an attempt to get it to label the different patches being merged correctly, but I did it in the mq days, and these days I ignore the titles anyway. I kind of wish I did know which was which. Don't use the kdiff3.executable setting, since you don't have kdiff3-wrapper[5]. The rest is probably fine.

Uploading patches to bugs

I'm an old fart, so I almost always upload patches to bugzilla and request review there instead of using MozReview. If I already have a bug, the procedure is generally

% hg bzexport -r :fitzgen 1234567 -e

In the common case where I have a patch per bug, I usually won't have put the bug number in my commit message yet, so due to this setting in my ~/.hgrc:

[bzexport]
update-patch = 1

bzexport will automatically prefix my commit message with "Bug 1234567 - ". It won't insert "r=fitzgen" or "r?fitzgen" or anything; I prefer to do that manually as a way to keep track of whether I've finished incorporating any review comments.

If I don't already have a bug, I will create it via bzexport:

% hg bzexport --new -r :fitzgen -e --title 'Crashes on Wednesdays'

Now, I must apologize, but that won't work for you. You will have to do

% hg bzexport --new -C 'Core :: GC' -r :fitzgen -e --title 'Crashes on Wednesdays'

because you don't have my fancy schmancy bzexport logic to automatically pick the component based on the files touched. Sorry about that; I'd have to do a bunch of cleanup to make that landable. And these days it'd be better to rely on the moz.build bug component info instead of crawling through history.

Other useful flags are --blocks, --depends, --cc, and --feedback. Though I'll often fill those in when the editor pops up.

Oh, by the way, if you're nervous about it automatically doing something stupid when you're first learning, run with -i aka --interactive. It will ask before doing each step. Nothing bad will happen if you ^C out in the middle of that (though it will have already done what you allowed it to do; it can't uncreate a bugzilla bug or whatever, so don't say yes until you mean it.)

If I need to upload multiple patches, I'll update to each in turn (often using hg next and hg prev, which come with evolve) and run hg bzexport for each.

Uploading again

I'm sloppy and frequently don't get things right on the first try, so I'll need to upload again. Now this is a little tricky, because you want to mark the earlier versions as obsolete. In mq times, this was pretty straightforward: your patches had names, and it could just find the bug's patch attachment with the matching name. Without names, it's harder. You might think that it would be easiest to look for a matching commit message, and you'd probably be right, but it turns out that I tend to screw up my early attempts enough that I have to change what my patches do, and that necessitates updating the commit message.

So if you are using evolve, bzexport is smart and looks backwards through the history of each patch to find its earlier versions. (When you amend a changeset, or roll/fold another changeset into it, evolve records markers saying your old patch was "succeeded" by the new version.) For the most part, this Just Works. Unless you split one patch into two. Then your bzexport will get a bit confused, and your new patches will obsolete each other in bugzilla. 🙁 My bzexport is more smarterer, and will make a valiant attempt to find an appropriate "base" changeset to use for each one. It still isn't perfect, but I have not yet encountered a situation where it gets it wrong. (Or at least not very wrong. If you fold two patches together, it'll only obsolete one of the originals, for example.) That fix should be relatively easy to land, and I "promise" to land it soon[6].

Remember to use the -r flag again when you re-upload, assuming you're still ready for it to be reviewed. You don't need the bug number (or --new option) anymore, because bzexport will grab the bug number from the commit message, but it won't automatically re-request review from the same person. You might want to just upload the patch without requesting review, after all. But usually this second invocation would look like:

% hg bzexport -r :fitzgen

(the lack of -e there means it won't even bother to bring up an editor for a new comment to go along with the updated attachment. If you want the comment, use -e. Or --comment "another attempt" if you prefer.)

Incorporating review comments

I've already covered this. Update to the appropriate patch, make your changes, hg amend, hg advance to clean up any conflicts right away, then probably hg update -r <...> to get back to where you were.

Landing

I update to the appropriate patch. Use hg amend -m to update the commit message, adding the "r=fitzgen". Or if I need to do a bunch of them, I will run hg che (or just hg chistedit), go to each relevant patch, use 'm' to change the action to 'mess' (short for "message"), 'c' to commit to this histedit action string, and edit the messages in my $EDITOR.

Now I use chistedit to shuffle my landable patches to the top of the screen (#0 is the changeset directly atop a public changeset). Do not reorder them any more than necessary. I'll update to the last changeset I want to land, and hg push mozilla-inbound -r . (Ok, really I use an 'mi' alias, and :gps magic makes '-r .' the default for mozilla repos. So I lied, I do hg push mi.)

Next I'll usually do a final try push. I cd to the top of my source checkout, then run:

% ./mach try 

If you don't know try syntax, use https://mozilla-releng.net/trychooser/ to construct one. I've trained my brain to know various ones useful to what I work on, but you can't go too far wrong with

% ./mach try -b do -p all -u all[x64]

And this part is a lie; I actually use my hg trychooser extension which has a slick curses UI based on a years-old database of try options. That I never use anymore. I do it manually, with something like

% hg trychooser -m 'try: -b do -p all -u all[x64]'

Forking your stack

If you commit a changeset on top of a non-top patch, you will fork your stack. The usual reason is that you've updated to some changeset, made a change, and run hg commit. You now have multiple heads. hg will tell you "created new head". Which is ok, except it's not, because it's much more confusing than having a simple linear patch stack. (Or rather, a bunch of linear patch stacks, each with a bookmark at its head.)

I usually look up through my terminal output history to find the revision of the older head, then rebase it on top of the new head. But if you don't have that in your history, you can find it with the appropriate hg log command. Something like

% hg log -r 'children(.^) and not .'
changeset:   418174:b7f1d672f3cd

will give it to you directly (see also hg help revsets), assuming you haven't done anything else and made a mess. Now rebase the old child on top of your new head:

% hg rebase -b b7f1d672f3cd -d .

It will leave you updated to your new head, or rather the changeset that was formerly a head, but the other changesets will now be your descendants. hg next a bunch of times to advance through them, or use my hg advance alias to go all the way, or do it directly:

% hg update -r 'heads(.::)'

(if you're not used to the crazy revset abbreviations, you may prefer to write that as hg update -r 'heads(descendants(.))'. I'm trying not to use too many abbreviations in this doc, but typing out "descendants" makes my fingers tired.)

Workspace management

So being able to jump all over your various feature bookmarks and things is cool, but I'm working with C++ here, and I hate touching files unnecessarily because it means a slow and painful rebuild. Personally, I keep two checkouts, ~/src/mozilla and ~/src/mozilla2. If I were more disciplined about disk space, I'd probably have a few more. Most people have several more. I used to have clever schemes of updating a master repository and then having all the others pull from it, but newer hg (and my DSL, I guess) is fast enough that I now just hg pull unified manually whenever I need to. I use the default objdir, located in the top directory of my source checkout, because I like to be able to run hg commands from within the objdir. But I suspect it messes me up because hg and watchman have to deal with a whole bunch of files within the checkout area that they don't care about.

watchman

Oh yeah, watchman. It makes many operations way, way faster. Or at least it did; recently, it often slows things down before it gives up and times out. Yeah, I ought to file a bug on it. The log doesn't say much.

I can't remember how to set up watchman, sorry. It looks like I built it from source? Hm, maybe I should update, then. You need two pieces: the watchman daemon that monitors the filesystem, and the python mercurial extension that talks to the daemon to accelerate hg operations. The latter part can be enabled with

[extensions]
fsmonitor =

Maybe ./mach bootstrap sets up watchman for you these days? And ./mach mercurial-setup sets up fsmonitor? I don't know.

Debugging

debug

I have this crazy Perl script that I've hacked in various horrible ways over the years. It's awful, and awfully useful. If I'm running the JS shell, I do

% debug ./obj-js-debug/dist/bin/js somefile.js

to start up Emacs with gdb running inside it in gud mode. Or

% debug --record ./obj-js-debug/dist/bin/js somefile.js

to make an rr recording of the JS shell, then bring up Emacs with rr replay running inside it. Or

% debug --record ./jstests.py ./obj-js-debug/dist/bin/js sometest.js

to make an rr recording of a whole process tree, then find a process that crashed and bring up Emacs with rr replay on that process running inside it. Or

% debug

to bring up Emacs with rr replay running on the last rr recording I've made, again automatically picking a crashing process. If it gets it wrong, I can always do

% rr ps
# identify the process of interest
% debug --rrpid 1234

to tell it which one. Or

% ./mach test --debugger=debug some/test/file

to run the given test with hopefully the appropriate binary running under gdb inside Emacs inside the woman who swallowed a fly I don't know why. Or

% debug --js ./obj-js-debug/dist/bin/js somefile.js

to bring up Emacs running jorendb[7].

rr

I love me my rr. I have a .gdbinit.py startup file that creates a handy command for displaying the event number and tick count to keep me oriented chronologically:

(rr) now
2592/100438197

Or I can make rr display that output on every prompt:

(rr) set rrprompt on
rr-aware prompt enabled
(rr 538/267592) fin
Run till exit from #0 blah blah
(rr 545/267619)

I have a .gdbinit file with some funky commands to set hardware watchpoints on GC mark bits so I can 'continue' and 'reverse-continue' through an execution to find where the mark bits are set. And strangely dear to my heart is the 'rfin' command, which is just an easier to type alias for 'reverse-finish'. Other gdb commands:

(rr) log $thread sees the bad value  # $thread is replaced by eg T1
(rr) log also, obj is now $1         # gdb convenience vars ok
(rr) rfin
(rr) log {$2+4} bytes are required   # {any gdb expr}
(rr) n
(rr) log -dump
562/8443 T2 sees the bad value
562/8443 also, obj is now 0x7ff687749c00
346/945 7 bytes are required
(rr) log -sorted
   346/945 7 bytes are required
=> 562/8443 T2 sees the bad value
   562/8443 also, obj is now 0x7ff687749c00
(rr) log -edit  # brings up $EDITOR on your full log file

The idea is to be able to move around in time, logging various things, and then use log -sorted to display the log messages in chronological order according to the execution. (Note that when you do this, the next point in time coming up will be labeled with "=>" to show you when you are.) You might consider using this in conjunction with command as a simple way of automatically tracing the evolution of some value:

(rr) b HashTable::put
Breakpoint 1 set at HashTable::put(Entry)
(rr) comm 1
> log [$thread] in put(), table size is now {mEntries}
> cont
> end
(rr) c

Boom! You now have the value of mEntries every time put() is called. Or consider doing that with a hardware watchpoint. (But watch out for log messages in breakpoints; it will execute the log command every time you encounter the breakpoint, so if you go forwards and backwards across the breakpoint several times, you'll end up with a bunch of duplicate entries in your log. log -edit is useful for manually cleaning those up.)

Note that the default log filename is based on the process ID, and the logging will append entries across multiple rr replay runs. So if you run multiple sessions of rr replay on the same process recording, all of your log messages will be collected together. Use set logfile to switch to a different file.

Finally, there's a simple pp command, where pp foo is equivalent to python print(foo).


[1] https://www.mercurial-scm.org/wiki/EvolveExtension - install evolve by running hg clone https://bitbucket.org/marmoute/mutable-history somewhere, then adding it into your ~/.hgrc:

[extensions]
evolve = ~/lib/hg/mutable-history/hgext/evolve.py

[2] "public" is the name of a mercurial phase. It means a changeset that has been pushed to mozilla-inbound or similar. Stuff you're working on will ordinarily be in the "draft" phase until you push it.

[3] hg clone https://bitbucket.org/facebook/hg-experimental somewhere, then activate it with

[extensions]
chistedit = ~/lib/hg/hg-experimental/hgext3rd/chistedit.py

[4] When I have one patch per bug, I'll usually use hg bzexport --update to fill in the bug numbers. Especially since I normally file the bug in the first place with hg bzexport --new, so I don't even have a bug number until I do that.

[5] kdiff3-wrapper was pretty useful back in the day; kdiff3 has a bad habit of clearing the execute (chmod +x) bit when merging, so kdiff3-wrapper is a shell script that runs kdiff3 and then fixes up the bits afterwards. I don't know if it still has that issue?

[6] The quotes around "promise" translate more or less to "do not promise".

[7] jorendb is a relatively simple JS debugger that jorendorff wrote, I think to test the Debugger API. I suspect he's amused that I attempt to use it for anything practical. I'm sure the number of people for whom it is relevant is vanishingly small, but I love having it when I need it. (It's for the JS shell only. Nobody uses the JS shell for any serious scripting; why would you, when you have web browser and Node?) (I'm Nobody.)