21
Jan 12

bzexport --new: crash test dummies wanted

Scenario 1: you have a patch to some bug sitting in your mercurial queue. You want to attach it to a bug, but the bugzilla interface is painful and annoying. What do you do?

Use bzexport. It’s great! You can even request review at the same time.

What I really like about bzexport is that while writing and testing a patch, I’m in an editor and the command line. I may not even have a browser running, if I’m constantly re-starting it to test something out. Needing to go to the bugzilla web UI interrupts my flow. With bzexport, I can stay in the shell and move on to something else immediately.

Scenario 2: You have a patch, but haven’t filed a bug yet. Neither has anybody else. But your patch has a pretty good description of what the bug is. (This is common, especially for small things.) Do you really have to go through the obnoxious bug-filing procedure? It sure is tempting just to roll this fix up into some other vaguely related bug, isn’t it? Surely there’s a simple way to do things the right way without bouncing between interfaces?

Well, you’re screwed. Unless you’re willing to test something out for me. If not, please stop reading.


03
Nov 11

Patch reordering

I have a patch queue that looks roughly like:

  initial-API
  consumer-1
  consumer-2
  unrelated
  consumer-3-plus-API-changes-and-consumer-1-and-2-updates-for-new-API

(So my base repo has a patch ‘initial-API’ applied to it, followed by a patch ‘consumer-1’, etc.)

The idea is that I am working on a new API of some sort, and have a couple of independent consumers of that API. The first two are “done”, but when working on the 3rd, I realize that I need to make changes to or clean up the API that they’re all using. So I hack away, and end up with a patch that contains both consumer 3 plus some API changes, and to get it to compile I also update consumers 1 and 2 to accommodate the new changes. All of that is rolled up into a big hairball of a patch.

Now, what I want is:

  final-API
  consumer-1 (new API)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)

But how do I do that (using mq patches)? I can use qcrefresh+qnew to fairly easily get to:

  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  consumer-3 (new API)
  API-changes-plus-API-changes-for-consumers-1-and-2

or I could split out the consumer 1 & 2 API changes:

  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  consumer-3 (new API)
  API-changes
  consumer-2-API-changes
  consumer-1-API-changes

after which I could theoretically qfold the consumer 1 and consumer 2 API changes into their original patches:

  initial-API
  consumer-1 (new API)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)
  API-changes

Unfortunately, consumer-1-API-changes collides with API-changes, so the fold will fail. It shouldn’t collide, really, but it does because part of the code to “register” consumer-1 with the new API happens to sit right alongside the API itself. Even worse, how do I “sink” the ‘API-changes’ patch down so I can fold it into initial-API to produce final-API? (Apologies for displaying my stacks upside-down from my terminology!) A naive qfold will only work if the API-changes stuff is separate from all the consumer-* patches.

My manual solution is to start with the initial queue:

  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  consumer-3-plus-API-changes-and-consumer-1-and-2-updates-for-new-API

and then use qcrefresh to rip the API changes and their effects on consumers 1 & 2 back out, leaving:

  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  API-changes-and-consumer-1-and-2-updates-for-new-API
  (in working directory) consumer-3 (new API)

I qrename/qmv the current patch to ‘api-change’ and qnew ‘consumer-3’ (its original name), cursing about how my commit messages are now on the wrong patch. Now I have

  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  api-change (API changes and consumer 1 and 2 updates for new API)
  consumer-3 (new API)

Now I know that ‘unrelated’ doesn’t touch any of the same files, so I can qgoto consumer-2 and qfold api-change safely, producing:

  initial-API
  consumer-1 (old API)
  consumer-2 (new API, but also with API change and consumer 1 updates)
  unrelated
  consumer-3 (new API)

I again qcrefresh, qmv, qnew to pull a reduced version of the api-change patch, giving:

  initial-API
  consumer-1 (old API)
  api-change (with API change and consumer 1 updates)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)

Repeat. I’m basically taking a combined patch and sinking it down towards its destination, carving off pieces to incorporate into patches as I pass them by. Now I have:

  initial-API
  api-change (with *only* the API change!)
  consumer-1 (new API)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)

and finally I can qfold api-change into initial-API, rename it to final-API, and have my desired result.
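One round of that carving can be written down as a small shell function. This is only a sketch of the steps described above, not a tested tool: `sink_once`, `run` and `DRY_RUN` are made-up names, and running qcrefresh with no arguments is an assumption about how the interactive refresh gets invoked.

```shell
# Run a command, or just print it when DRY_RUN is set (useful for a read-through).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One round of sinking. The combined patch must be the topmost applied patch.
#   $1 = name to give back to the remainder (e.g. consumer-3)
#   $2 = next patch down that should absorb its share (e.g. consumer-2)
sink_once() {
    # Interactive: keep only the API-related hunks in the current patch.
    run hg qcrefresh || return
    # The current patch now holds just those hunks, so rename it.
    run hg qrename api-change || return
    # Whatever is left in the working directory becomes the consumer patch again.
    run hg qnew "$1" || return
    # Pop down to the patch that needs updating for the new API...
    run hg qgoto "$2" || return
    # ...and fold the sinking patch into it.
    run hg qfold api-change
}
```

With `DRY_RUN=1 sink_once consumer-3 consumer-2` it just prints the five hg commands in order. The later rounds would be `sink_once consumer-2 consumer-1` and `sink_once consumer-1 initial-API`, each one leaving api-change a little smaller.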

What a pain in the ass! Though the qcrefresh/qmv/qnew step is a lot better than what I’ve been doing up until now. Without qcrefresh, it would be

 % hg qrefresh -X .
 % hg qcrecord api-change
 % hg qnew consumer-n
 % hg qpop
 % hg qpop
 % hg qpop
 % hg qpush --move api-change
 % hg qpush --move consumer-n
 % hg qfold old-consumer-n

which admittedly preserves the change message from old-consumer-n, which is an advantage over my qcrefresh version.
Or alternatively: fold all of the patches together, and qcrecord until you have your desired final result. In this particular case, the ‘unrelated’ patch was a whole series of patches, and they weren’t unrelated enough to just trivially reorder them out of the way.

Without qcrecord, this is intensely painful, and probably involves hand-editing patch files.

My dream workflow would be to have qfold do the legwork: first scan through all intervening patches and grab out the portions of the folded patch that only modify nonconflicting files. Then try to get clever and do the same thing for the portions of the conflicted files that are independent. (The cleverness isn’t strictly necessary, but I’ve found that I end up selecting the same portions of my sinking patch over and over again, which gets old.) Then sink the patch as far as it will go before hitting a still-conflicting file, and open up the crecord UI to pull out just the parts that belong to the patch being folded (aka sunk). Repeat this for every intervening conflicting patch until the patch has sunk to its destination, then fold it in. If things get too hairy, then at any point abort the operation, leaving behind a half-sunk patch sitting next to the unmodified patch it conflicted with. (Alternatively, undo the entire operation, but since I keep my mq repo revision-controlled, I don’t care all that much.)
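The very first check in that wish list, working out which files the sinking patch shares with a patch in its way, is simple enough to sketch in shell. Treat it as an illustration only: the function names are invented, it assumes patch files with `+++ b/path` lines, and it ignores deleted files and file names containing spaces.

```shell
# List the files a unified diff touches (one per line, sorted, unique).
# Only looks at "+++ b/<path>" lines, so deletions (+++ /dev/null) are skipped.
patch_files() {
    sed -n 's|^+++ b/\([^[:space:]]*\).*|\1|p' "$1" | sort -u
}

# Print the files touched by BOTH patches: the ones that block a blind reorder.
#   $1 = the patch being sunk, $2 = the patch it has to get past
conflicting_files() {
    _pf_a=$(mktemp) || return 1
    _pf_b=$(mktemp) || { rm -f "$_pf_a"; return 1; }
    patch_files "$1" > "$_pf_a"
    patch_files "$2" > "$_pf_b"
    # comm -12 keeps only the lines common to both sorted lists.
    comm -12 "$_pf_a" "$_pf_b"
    rm -f "$_pf_a" "$_pf_b"
}
```

If `conflicting_files api-change consumer-1` prints nothing, the two patches touch disjoint files and can be reordered without any hunk selection. Anything it does print is where the interactive carving would have to start.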

I originally wanted something that would do 3-way merges instead of the crecord UI invocations, but merges really want to move you “forward” to the final result of merging separate patches/lines of development. Here, I want to go backwards to a patch that, if merged, would produce the result I already have. So merge(base,base+A,base+B) -> base+AB which is the same as base+BA. From that, I could infer a B' such that base+A+B' is my merged base+AB, but that doesn’t do me any good.

In my case, I have base+A+B and want B'' and A'' such that base+B''+A'' == base+A+B.

To anyone who made it this far: is there already an easy way to go about this? Is there something wrong with my development style that I get into these sorts of situations? In my case, I had already landed ‘initial-API’; please don’t tell me that the answer is that I always have to get the API right in the first place. Does anyone else get into this mess? (I can’t say I’ve run into this all that often, but it’s happened more than once or twice.)

I suppose if I had landed consumers 1 and 2, I would’ve just had to modify their uses of the API afterwards. So I could do that here, too. But reviews could tangle things up pretty easily — if a reviewer of consumer 1 or 2 notices the API uglinesses that I fixed for consumer 3, then landing the earlier consumers becomes dependent on landing consumer 3, which sucks. But also, none of this is really ready to land, and I’d like to iterate the API in my queue for a while with all the different consumers as test users, *without* lumping everything together into one massive patch.


20
Apr 11

Wading through history

Recently — well, actually, by now it wasn’t recently at all — I received a review request for a patch to JSD. It fixed an intermittent crash when using Firebug on a page that went into an endless stack-eating loop. A couple of people had worked on reproducing it, and the exact conditions were a little flaky, so I first tried it out myself. Kaboom! Yay!

So I imported the patch just to verify that it fixed the problem. Before compiling with it, I updated my tree to the latest version. Why? I don’t know. Just because it’s what I usually do. It seemed like a good idea at the time.

Only it wasn’t. It was a really, really dumb idea. I was changing two variables while trying to test one of them, and I got what I deserved: it stopped crashing after the patch, but when digging in to verify that it really was behaving as intended, I discovered that it no longer crashed even without the patch.

This was just before the All Hands, and although I poked at it every few days, I didn’t make any headway: the patch seemed good, but I really wanted to confirm that it fixed the crash. (There were reasons why I was a little skeptical, but it’s not really relevant here.)

Eventually, when I had some time to think about it properly, I realized the best thing to do would be to revert to the older version that crashed for me. But how to find it?

One way would be to binary search nightlies. But I happened to be on a poor network connection, and downloading nightlies was insanely slow.

Also, I thought I should be able to do better. I run with an mq extension (mq = Mercurial Queues) that commits my patch queue on any change. Get it at git://github.com/hotsphink/mqext.git (I really should switch to bitbucket, rather than pointlessly restricting my audience to people who are minimally comfortable with both git and hg.) So all I had to do was to go back to the point where I imported the patch from bugzilla.

Finding the right moment was easy: ‘hg log --mq’ showed me all the changes made to my patch queue, one of which was commented “IMPORT: bz://643360” (an autogenerated comment courtesy of mqext). That was changeset 026ac43e9114. Yay!

But that changeset is for my patch queue, not my source repo. Fortunately, mq stores ‘parent’ fields in patch files that give the source repo changeset id that a patch was applied on top of. I’ll skip a number of failed attempts to track through this, and just give my final recipe:

  1. (already described) hg log --mq to find the appropriate changeset in the patch queue repo.
  2. cd to .hg/patches and run hg cat -r changeset series. This is because you need to know the names of the patch files in order to look at them — or specifically, the name of the first patch file, because it’s the only one whose parent will still be in the source repo. All other patches’ parents will be the source repo with mq patches applied to them, and will have been stripped out of the repo due to intervening actions. Because hg (or rather, mq) is not interested in preserving history.
  3. hg cat -r changeset firstpatchname and look for the “# Parent” line, which gives the source repo changeset id.
  4. cd back to your source repo and fetch that revision however you want — update to it, or clone a repo with it, or whatever.
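The recipe above could be wrapped up roughly like this. It is a sketch under assumptions: the function names are invented, the patch header is assumed to carry a line of the form `# Parent  <hex id>`, and the first usable line of the series file is assumed to be the first patch.

```shell
# Print the changeset id from a patch's "# Parent" header line.
# Reads the patch text on stdin; prints nothing if there is no such line.
patch_parent() {
    sed -n 's/^# Parent  *\([0-9a-f][0-9a-f]*\).*/\1/p' | head -n 1
}

# Given a patch-queue changeset ($1), print the source-repo revision that the
# first patch in the series was applied on top of. Run from the source repo.
queue_base_rev() {
    _qrev=$1
    _patches=$(hg root --mq) || return 1
    # Step 2: read the series file as it was at that queue changeset, dropping
    # comments, guards and blank lines, and take the first patch name.
    _first=$( (cd "$_patches" && hg cat -r "$_qrev" series) |
        sed -e 's/[[:space:]]*#.*//' -e '/^$/d' | head -n 1)
    [ -n "$_first" ] || return 1
    # Step 3: read that patch at the same queue changeset and pull out its parent.
    (cd "$_patches" && hg cat -r "$_qrev" "$_first") | patch_parent
}
```

Something like `hg update -r "$(queue_base_rev 026ac43e9114)"` would then cover step 4, assuming a clean working directory with no patches applied.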

I’m guessing this little recipe isn’t going to be useful to very many people, but I wanted to write it out for myself. So phbbbtt!!!

 


09
Dec 10

Mercurial MQ extension extension

I love using Mercurial’s MQ extension for managing patch queues, even though I have a strong suspicion that it’s fundamentally the wrong idea. I’m only going to discuss one part of that wrongness now, though: it forgets things. Lots of things.

Much of the point of using a revision control system is to not forget anything. I should be able to freely try various lines of development, and get back any of my earlier work. Normally, that would just mean being able to revert to earlier revisions of my source tree, although even there I should really be able to revert portions of changesets. But when using additional tools like mq that manage how I got to a particular source tree, I should be able to back up to any previous state with the tool’s assistance. Fundamentally, it’s not about moving back and forth through a history of artifact versions. It’s that I should never lose any work, even if I do something that in retrospect turns out to be dumb. Or especially when I do something dumb, I should say — that’s why I’m using a revision control system instead of a dumb backup system. It’s supposed to understand source code and what perverse things developers do when writing and modifying it.

Here’s a concrete example:

  • Developer edits code
  • hg qnew my-amazing-patch
  • Developer edits code some more
  • hg qrefresh
  • Developer says “oh f#@@#!!!!”

The problem is that when the developer refreshed the patch, Mercurial forgot the original patch. It also forgot the source tree that existed when the original patch was applied. So if those further edits turned out to be a Bad Idea, well, oops!

Yes, there is a way out: mq patch queues can themselves be revision controlled. Then as long as the developer remembers to hg commit --mq after every change to the patch queue, everything is golden.

You could even argue that this is the Right Way to work. After all, you don’t expect — or even want — your revision control system to remember every character you type. You’d never be able to identify the right point in time to back up to amid the mass of older revisions. Leaving the decision to the developer as to when a state is important enough to remember just makes sense.

Except it doesn’t. The developer already decided that the state was important by running qnew or qrefresh. Why burden the poor sap with yet another decision? Especially when making that decision requires typing in another command, which means that the mental threshold for interestingness is higher, which means it’ll pretty much never happen.

See https://bitbucket.org/sfink/mqext/ for the obvious solution. That’s actually a grab bag of mq extensions, all of which should really be submitted upstream. But I haven’t bothered.

The part that’s relevant to what I’m talking about here is that I added -Q options to all of the patch queue-modifying functions I could think of. Specifying -Q will commit the change to the patch queue repository, with a commit message describing the basic change (or you can set the message with -M).

Or you can go a step further, as I did, and use the [defaults] section in your ~/.hgrc to set the -Q flag automatically for whichever commands you like. See the help message (or the README) for details on installation and usage. Update: and now it’s easier, because you can set qcommit = auto in your [mqext] section and it’ll add the -Q option to the relevant commands. Which is good, since there are more of them than you think.

If you install this, you may want to try out the ‘qshow’ command, too. It’s my favorite of the other things implemented in that extension (I alias it to just ‘show’ because my left pinky is slow.) I use it constantly to review the various patches in the queue. hg show <n> is the way I usually use it; it prints out patch #n in your queue (the numbers come from hg qseries -v, though you really ought to just put -v in your [defaults] section too. Or alias series=qseries -v as I did.)
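For reference, the settings mentioned in the last two paragraphs would look something like the following when appended to an hgrc. This is a hedged sketch: `write_mqext_config` is a made-up helper, it takes the target file explicitly so nothing is written by accident, and it does not enable the extension itself (see the README for that).

```shell
# Append the auto-commit setting and the two aliases to the hgrc file in $1.
write_mqext_config() {
    _hgrc=${1:?usage: write_mqext_config HGRC_FILE}
    cat >> "$_hgrc" <<'EOF'

[mqext]
# Add -Q to the queue-modifying commands automatically.
qcommit = auto

# Per-command alternative, if you would rather pick and choose:
# [defaults]
# qrefresh = -Q

[alias]
show = qshow
series = qseries -v
EOF
}
```

Running `write_mqext_config ~/.hgrc` once is enough; running it again would just append a second copy of the same sections.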

Feel free to use it, fork it, complain about it, or whatever. I’m still trying to figure out whether I really like it or not. It slows down qref operations, which kinda sucks. But I guess if I really cared I would turn off the default -Q for that one command, and just specify it manually. And I haven’t done that yet.

Oh right. One crucial thing I should mention: actually using any of this saved state is a dangerous affair. Why? Well, because you probably have a couple of patches in your queue applied at the time you decide to back up to an older state, and modifying applied patches is not very healthy. Especially if you reordered your series file. In fact, I would probably recommend doing these steps before (or just after, it doesn’t matter) reverting to an older revision of your patch queue:

  1. hg update -r qparent -C
  2. rm $(hg root --mq)/status

That will “unapply” all patches, forcefully. You can then qpush (or better, qgoto) the place you want in your queue. Note that shell $(…) is the modern version of backticks, in case you’re unfamiliar.

Finally, here’s a sampler of the sorts of log messages the extension extension produces:

    UPDATE: multipage-test
     js/jsd/jsd_xpc.cpp               |    1 +
     js/jsd/test/Makefile.in          |    3 +-
     js/jsd/test/browser_multipage.js |  428 +++++++++++++++++++++++++++++++++++++++
     js/src/jsapi.cpp                 |    4 +
     js/src/jscntxt.cpp               |    4 +-
     js/src/jscompartment.cpp         |    1 +
     js/src/jswrapper.cpp             |    9 +
     7 files changed, 447 insertions(+), 3 deletions(-)
    
    NEW: rename-multipage
    
    RENAME: bug615277-JM-execHook-3 -> bug615277-JM-execHook
    
    DELETE: bug-612717.diff
    
    UPDATE: better-note-dump

Or as the output of hg log --mq (which only shows the 1st line of each commit message):

    changeset:   92:e2ed45b4a8bf
    user:        Steve Fink 
    date:        Tue Dec 07 14:54:46 2010 -0800
    summary:     UPDATE: bug615277-JM-execHook
    
    changeset:   91:6e36813b7291
    user:        Steve Fink 
    date:        Tue Dec 07 14:51:45 2010 -0800
    summary:     NEW: rename-multipage
    
    changeset:   90:b66861e98c29
    user:        Steve Fink 
    date:        Tue Dec 07 14:38:15 2010 -0800
    summary:     RENAME: bug615277-JM-execHook-3 -> bug615277-JM-execHook
    
    changeset:   89:c02111e0d18d
    user:        Steve Fink 
    date:        Tue Dec 07 14:37:27 2010 -0800
    summary:     NEW: bug615277-JM-execHook-3