Mozilla — Comments Off
17
Aug 18

Type examination in gdb

Sometimes, the exact layout of objects in memory becomes very important. Some situations you may encounter:

When overlaying different types as “views” of the same memory location, perhaps via reinterpret_cast, unions, or void*-casting. You want to know where the field in one view lands in another.
When examining a struct layout’s packing, to see if there is space being wasted.
When looking at a crash at some offset like 0xc8, it’s common for that to be a NULL pointer dereference of a field of some structure, as if you did MyStruct* foo = nullptr; return foo->field;.
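To make that last case concrete, here is a minimal sketch in Python that uses ctypes as a stand-in for a C++ struct. MyStruct and its members are hypothetical, sized so that field lands at 0xc8. The point is only that the faulting address of a null dereference equals the field's offset.

```python
import ctypes

class MyStruct(ctypes.Structure):
    # Hypothetical layout: 200 bytes of other members, then the field of interest.
    _fields_ = [
        ("other_members", ctypes.c_uint8 * 200),
        ("field", ctypes.c_uint64),
    ]

# Reading foo->field with foo == nullptr touches address 0 + offsetof(field).
null_base = 0
fault_address = null_base + MyStruct.field.offset
print(hex(fault_address))  # prints 0xc8
```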

A very handy command for looking at the underlying storage of types is pahole. But I’m usually in gdb already when I want to examine types. Plus, I somehow have never gotten comfortable with the separate pahole command — possibly because it has a tendency to seg fault when I try running it. The latest gdb (8.1) also has it built-in if you run ptype/o typename.

Tom Tromey implemented a simple version of pahole inside gdb using the Python scripting APIs. I stole¹ his code, fixed some bugs, mucked with his output layout, added some more bugs, fixed some of those, and added it to my gdb startup script collection. I also abstracted out the type traversal and created an additional command to examine what is at a given offset.

`offset`

Let’s say we have a crash at the address 0x58. Suspecting a NULL pointer dereference while manipulating a js::RegExpShared object, let’s look at what is at that offset:

(gdb) offset 0x58 js::RegExpShared
Scanning byte offsets 88..95
overlap at byte 88..127 with this.tables : js::RegExpShared::JitCodeTables
overlap at byte 88..95 with this.tables.mBegin : mozilla::UniquePtr *

By default, when you examine an offset, offset will look at a native word’s worth of memory. Here, I’m running on 64-bit, so we’re looking at the 8 bytes starting at 0x58 (aka 88 decimal). (You could do offset/1 88 js::RegExpShared to only look at the single byte offset 88.)

From the above output, we can see that the field tables overlaps the offset range being inspected. tables itself is a js::RegExpShared::JitCodeTables structure, and its mBegin field occupies the relevant offsets.
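The overlap search itself is simple. Below is a rough sketch of the idea in plain Python, using ctypes structures instead of gdb's type API so it runs outside the debugger. Outer, Inner, and the overlaps helper are made up for illustration and are not the real js::RegExpShared layout or the code in gdbinit.pahole.py.

```python
import ctypes

class Inner(ctypes.Structure):
    # Hypothetical stand-in for a small vector-like member.
    _fields_ = [("mBegin", ctypes.c_void_p), ("mLength", ctypes.c_size_t)]

class Outer(ctypes.Structure):
    # Hypothetical: 88 bytes of earlier members, then the nested struct.
    _fields_ = [("header", ctypes.c_uint8 * 88), ("tables", Inner)]

def overlaps(ctype, start, width, base=0, path="this"):
    """Yield (first_byte, last_byte, path) for every field touching the range."""
    end = start + width
    # Scalars and arrays have no _fields_, so recursion stops there.
    for name, ftype in getattr(ctype, "_fields_", []):
        desc = getattr(ctype, name)
        lo = base + desc.offset
        hi = lo + desc.size
        if lo < end and start < hi:
            yield (lo, hi - 1, f"{path}.{name}")
            yield from overlaps(ftype, start, width, lo, f"{path}.{name}")

word = ctypes.sizeof(ctypes.c_void_p)  # native word size, as in the default scan
for lo, hi, path in overlaps(Outer, 0x58, word):
    print(f"overlap at byte {lo}..{hi} with {path}")
```

The half-open comparison (lo < end and start < hi) is what lets a zero-sized member show up with a "188..187" style range in the real tool without matching bytes it does not occupy.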

Let’s look at another example:

(gdb) offset 184 JSContext
Scanning byte offsets 184..191
overlap at byte 184..199 with this.kind_ : js::WriteOnceData
overlap at byte 184..187 with this.kind_.value : js::ContextKind
overlap at byte 188..187 with this.kind_.check : js::CheckUnprotected
overlap at byte 188..191 with this.kind_ : <32-bit hole> in js::WriteOnceData before field 'nwrites'

This has some weirdnesses. First, notice the “byte 188..187” range of kind_.check. That’s an empty js::CheckUnprotected struct².

Next, we seem to have collided with a hole in the JSContext structure. We can use pahole to look at the overall structure. But first, let’s switch to a simpler structure (JSContext is huge and the output would be pages long).

(gdb) offset 0 js::jit::ABIArg
Scanning byte offsets 0..7
overlap at byte 0..3 with this.kind_ : js::jit::ABIArg::Kind
overlap at byte 4..7 with this : <32-bit hole> in js::jit::ABIArg before field 'u'

Now we can get a higher-level view with pahole.

`pahole`

(gdb) pahole js::jit::ABIArg
  offset size
       0   16 : struct js::jit::ABIArg {
       0    4 :   kind_ : js::jit::ABIArg::Kind
       4    4 : --> 32 bit hole in js::jit::ABIArg <--
       8    8 :   u : struct union {...} {
   8  +0    1 :     gpr_ : js::jit::Register::Code
   8  +0    8 :     fpu_ : js::jit::FloatRegister::Code
   8  +0    4 :     offset_ : uint32_t
                  } union {...}
                } js::jit::ABIArg

This displays the full structure, with each field (or hole) displayed beside its offset and size within the overall type. Offsets of fields directly inside of the given type are given directly. Offsets within sub-structures are given as the offset of that structure within the outermost struct, plus an offset within the inner struct.
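The holes come straight from comparing each field's offset with the end of the previous field. Here is a small sketch of that bookkeeping, again with ctypes rather than gdb types. ABIArgLike only mimics the shape of the struct above, find_holes is a hypothetical helper, and the numbers assume a typical 64-bit platform where a 64-bit integer is 8-byte aligned.

```python
import ctypes

class U(ctypes.Union):
    _fields_ = [
        ("gpr_", ctypes.c_uint8),
        ("fpu_", ctypes.c_uint64),
        ("offset_", ctypes.c_uint32),
    ]

class ABIArgLike(ctypes.Structure):
    _fields_ = [("kind_", ctypes.c_uint32), ("u", U)]

def find_holes(struct):
    """Return (offset, size) for each gap between fields, plus any tail padding.

    Only meaningful for structs; union members all start at offset 0.
    """
    holes = []
    cursor = 0  # first byte not yet covered by a field
    for name, _ftype in struct._fields_:
        desc = getattr(struct, name)
        if desc.offset > cursor:
            holes.append((cursor, desc.offset - cursor))
        cursor = max(cursor, desc.offset + desc.size)
    total = ctypes.sizeof(struct)
    if total > cursor:
        holes.append((cursor, total - cursor))
    return holes

print(ctypes.sizeof(ABIArgLike), find_holes(ABIArgLike))
```

On such a platform this should report a 16-byte struct with one 4-byte hole at offset 4, matching the pahole output above.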

These outputs can get very noisy when they are deeply nested, possibly displaying a lot more data than you care about. You can clamp the depth of the tree with a /N suffix option if you wish:

(gdb) pahole/1 js::jit::ABIArg
  offset size
       0   16 : struct js::jit::ABIArg {
       0    4 :   kind_ : js::jit::ABIArg::Kind
       4    4 : --> 32 bit hole in struct js::jit::ABIArg <--
       8    8 :   u : union {...}
                } struct js::jit::ABIArg

Probably not useful with this example, since the interesting stuff is now hidden, but it makes large or deeply nested types much more readable.

Installing

If you would like to use this stuff, you can see the full directions for installing my random helper crap, or you can just grab the one file gdbinit.pahole.py and source it from your ~/.gdbinit:

source ~/path/to/gdbinit.pahole.py

Footnotes

1. "Stole" in the GPLv3 sense, that is -- tromey's original and my modified version are both released under the GPLv3 license.

2. Whether empty structs should even be displayed is questionable. What does that even mean?

Uncategorized — 6 Comments
01
Jun 17

sfink Mozilla workflow

Intro

I thought I’d write up one of those periodic posts describing my workflow. My workflow is not best for everyone. Nor is it the best possible one for me, since I’m a creature of habit and cling to comfortable tools. But it can be helpful to look at what others do, and see what you might be able to steal.

This is going to be more of a summary overview than an in-depth description or tutorial. I am happy to expand on bits you are curious about. Note that there are good docs already for the “normal” workflow at http://mozilla-version-control-tools.readthedocs.io/en/latest/hgmozilla/index.html

A number of things here use local crap that I’ve piled up over time. I’ve published a repository containing some of them. At the moment, I have it uploaded to two different places. I don’t know how long I’ll keep them in sync before giving up on one:

(mercurial) https://bitbucket.org/sfink/sfink-tools
(git) https://github.com/hotsphink/sfink-tools

Also, note that the WordPress formatting of this document isn’t very good; you’d probably be better off reading this on github, especially since I will be keeping it up to date there and not here on my blog.

Code Management

I use mercurial. I like mercurial. I used git first, for quite a while, but it just doesn’t stick in my brain.

I formerly used mq, and when I’d finally had enough of it, I tried to make my vanilla hg workflow provide as many of the same benefits as possible. I also use evolve[1], though it’s mostly just to make some things nicer.

I use phases heavily to keep track of what’s “mine”. If you’re pushing to any personal repositories, be sure to mark them non-publishing.

Pulling from upstream

I use the mozilla-unified repository. I have this in my ~/.hgrc:

[paths]
unified = https://hg.mozilla.org/mozilla-unified

so I can pull with

% hg pull unified

Read more on the unified repo. I will usually rebase on top of inbound. ./mach mercurial-setup should set you up with firefoxtree, which will cause the above pull to advance some local tags that will conveniently give you the head of the various repositories. My usual next step is

% hg rebase -d inbound

That assumes you are currently updated to the “patch stack” that you want to rebase, probably with a bookmark at its head.

What’s my state?

The biggest thing I missed from mq was an easy way to see my current “patch stack”. My main tool for this is an alias hg ls:

% hg ls
418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking weakmap.incremental

You can’t see the colors, sorry. (Or if you can, you’re looking at this document on bitbucket and the colors are random and crazy.) But the first line is orange, and is the public[2] revision that my current patch stack is based on. The remaining lines are the ancestry of my current checkout. Note the weird format: I have it display “|” so I can double-click the hash and copy it. If I were smarter, I would teach my terminal to work with the normal ‘:’ separator. Without breaking URL highlighting.

“weakmap.incremental” is green in my terminal. It’s a bookmark name. Bookmarks are my way of keeping track of multiple things I’m working on. They’re sort of feature branches, except I have a bad habit of piling up a bunch of unrelated things in my patch stack. If they start interfering with each other too much, I’ll rebase them onto the tip of my mozilla-inbound checkout and give them their own bookmark names:

% hg rebase -d inbound weakmap.incremental
% hg book -r 9b790021a607 gc.triggers
% hg rebase -d inbound gc.triggers

The implementation of hg ls in my ~/.hgrc is:

[alias]
# Will only show changesets that chain to the working copy.
ls = !if [[ -n "$1" ]]; then r="$1"; else r=.; fi; $HG log -r "parents(::$r and not public()) + ::$r and not public()" --template "{label('changeset.{phase}', '{rev}|{node|short}')} {label('tags.normal', ifeq(tags, '', '', ifeq(tags, 'tip', '', '{tags}\n    ')))}  {desc|firstline} {label('tags.normal', bookmarks)}\n"
sl = ls

(Note that I mistype hg ls as hg sl about 40% of the time. You may not be so burdened.) There are better aliases for this. I think ./mach mercurial-setup might give you hg wip or something now? But I like the terse output format of mine. (Just ignore that monstrosity of a template in the implementation.)

That works for a single stack of patches underneath a root bookmark. To see all of my stacks, I do:

% hg lsb
work                           Force-disable JIT optimization tracking
haz.specialize                 Implement a mechanism to specialize functions on the function pointers passed in
sixgill.tc                     Try marking JSCompartment as a GCThing for the hazard analysis
phase-self-time                phase diagnostics -- it really is what I said, with parallel tasks duration
GCCellPtr.TraceEdge            Implement TraceEdge for GCCellPtr
weakmap.incremental            Bug 1167452 - Incrementalize weakmap marking

‘lsb’ stands for ‘ls bookmarks’. And the above output is truncated, because it’s embarrassing how much outdated crap I have lying around. The implementation of lsb in my ~/.hgrc:

[alias]
lsb = log -r 'bookmark() and not public()' -T '{pad("{bookmarks}", 30)} {desc|firstline}\n'

Note that this displays only non-public changesets. (A plain hg bookmarks will dump out all of them… sorted alphabetically. Bleagh.) That means that when I land something, I don’t need to do anything to remove it from my set of active features. If I land the whole stack, then it’ll be public and so will disappear from hg lsb. If I land part of the stack, then the bookmarked head will still be visible. (But if I bookmarked portions of the stack, then the right ones will disappear. Phases are cool.)

Working on code

Updating, bookmarking

When starting on something new, I’ll update to ‘inbound’ (feel free to use ‘central’ if you don’t want to live dangerously. Given that you’ll have to rebase onto inbound before landing anyway, ‘central’ is probably a much smarter choice.) Then I’ll create a bookmark for the feature/fix I’m working on:

% hg pull unified
% hg update -r inbound
% hg book remove.xul

Notice the clunky name “remove.xul”. I formerly used ‘-‘ to separate words in my bookmark names, but ‘-‘ is a revset operator. It’ll still work for many things (and I think it’d work with everything if you did eg hg log -r 'bookmark("remove-xul")', but that’s too much typing). Using periods as separators, that’s just hg log -r remove.xul.

Making commits

I will start out just editing code. Once it’s in a reasonable state, or I need to switch to something else, I’ll commit normally:

% hg commit -m 'make stuff gooder'

Then while I’m being a good boy and continuing to work on the feature named in the bookmark, I’ll just keep amending that top commit:

% hg amend

hg amend is a command from the mutable-history aka evolve extension[1]. If you’re not using it, you could substitute hg commit --amend, but it will annoyingly keep asking you to update the commit message. There’s a fix, but this document is about my workflow, not yours.

But often, I will get distracted and begin working on a different feature. I could update to inbound or central and start again, but that tends to touch too many source files and slow down my rebuilds, and I have a lot of inertia, so usually I’ll just start hacking within the same bookmarked patch stack. When done or ready to work on the original (or a new) feature, I’ll make another commit.

When I want to go back to working on the original feature, I still won’t bother to clean things up, because I’m a bad and lazy person. Instead, I’ll just start making a bunch of micro-commits pertaining to various of the patches in my current stack (possibly one at a time with hg commit, or possibly picking apart my working directory changes with hg commit -i; see below). I use a naming convention in the patch descriptions of “M-” followed by a short tag for the patch each micro-commit belongs to. So after a little while, my patch stack might look like:

418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking
418172|deadbeef4dad   M-triggers
418173|deadbeef4dad   M-static
418174|deadbeef4dad   M-triggers
418175|deadbeef4dad   M-weakmap
418176|deadbeef4dad   M-triggers


What a mess, huh? Now comes the fun part. I'm a huge fan of the 'chistedit' extension[3]. The default 'histedit' will do the same thing using your text editor; I just really like the curses interface. I have an alias to make chistedit use a reasonable default for which revisions to show, which I suspect is no longer needed now that histedit has changed to default to something good. But mine is:

[alias]
che = chistedit -r 'not public() and ancestors(.)'

Now hg che will bring up a curses interface showing your patch stack. Use j/k to move the highlight around the list. Highlight one of the patches, say the first "M-triggers", and then use J/K (K in this case) to move it up or down in the list. Reshuffle the patches until you have your modification patches sitting directly underneath the main patch, eg

pick  418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
pick  418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
pick  418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
pick  418173|deadbeef4dad   M-static
pick  418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
pick  418174|deadbeef4dad   M-triggers
pick  418172|deadbeef4dad   M-triggers
pick  418176|deadbeef4dad   M-triggers 
pick  418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking
pick  418175|deadbeef4dad   M-weakmap

Now use 'r' to "roll" these patches into their parents. You should end up with something like:
pick  418116|8b3ea20f546c   Bug 1333000 - Display some additional diagnostic information for ConstraintTypeSet corruption, r=jandem 
pick  418149|44e7e11f4a71   No bug. Attempt to get error output to appear. 
pick  418150|12723a4fa5eb   Bug 1349531 - Remove non-threadsafe static buffers 
roll^ 418173|deadbeef4dad
pick  418165|9b790021a607   Bug 1367900 - Record the values and thresholds for GC triggers 
roll^ 418174|deadbeef4dad
roll^ 418172|deadbeef4dad
roll^ 418176|deadbeef4dad
pick  418171|5e12353100f6   Bug 1167452 - Incrementalize weakmap marking
roll^ 418175|deadbeef4dad

Notice the caret that shows the direction of the destination patch, and that the commit messages for the to-be-folded patches are gone. If you like giving your micro-commits good descriptions, you might want to use 'f' for "fold" instead, in which case all of your descriptions will be smushed together for your later editing pleasure.
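If the difference between the two actions is fuzzy, here is a toy model in Python of what happens to the commit messages. It is only an illustration: apply_plan is a made-up helper, it ignores the actual diffs, and the "***" separator is an assumption about how folded descriptions get joined.

```python
def apply_plan(plan):
    """plan is a list of (action, message); return the resulting commit messages."""
    result = []
    for action, message in plan:
        if action == "pick":
            result.append(message)
            continue
        if action not in ("fold", "roll"):
            raise ValueError(f"unknown action {action!r}")
        if not result:
            raise ValueError(f"{action} needs an earlier commit to merge into")
        if action == "fold":
            # fold: merge into the previous commit and keep both descriptions
            result[-1] = result[-1] + "\n***\n" + message
        # roll: merge into the previous commit and drop this description

    return result

plan = [
    ("pick", "Bug 1349531 - Remove non-threadsafe static buffers"),
    ("roll", "M-static"),
    ("pick", "Bug 1167452 - Incrementalize weakmap marking"),
    ("fold", "M-weakmap"),
]
for message in apply_plan(plan):
    print(repr(message))
```

Two commits come out either way. The rolled one keeps only the original description, while the folded one carries "M-weakmap" along for you to clean up.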
Now press 'c' to commit the changes. Whee! Use hg ls to see that everything is nice and pretty.
There is a new hg absorb command that will take your working directory changes and automatically roll them into the appropriate non-public patch. I haven't started using it yet.
(chistedit has other nice tricks. Use 'v' to see the patch. j/k now go up and down a line at a time. Space goes down a page, page up/down work. J/K now switch between patches. Oops, I just noticed I didn't update the help to include that. 'v' to return back to the patch list. Now try 'm', which will bring up an editor after you 'c' commit the changes, allowing you to edit the commit message for each patch so marked.)
From my above example, you might think I use one changeset per bug. That's very bug-dependent; many times I'll have a whole set of patches for one bug, and I'll have multiple bugs in my patch stack at one time. If you do that too, be sure to put the bug number in your commit message early to avoid getting confused[4].

Splitting out changes for multiple patches

I'm not very disciplined about keeping my changes separate, and often end up in a situation where my working directory has changes that belong to multiple patches. Mercurial handles this well. If some of the changes should be applied to the topmost patch, use
% hg amend -i
to bring up a curses interface that will allow you to select just the changes that you want to merge into that top patch. Or skip that step, and instead do a sequence of
% hg commit -i -m 'M-foo'
runs to pick apart your changes into fragments that apply to the various changesets in your stack, then do the above.
Normally, I'll use hg amend -i to select the new stuff that pertains to the top patch, hg commit -i to pick apart stuff for one new feature, and a final hg commit to commit the rest of the stuff.
% hg amend -i  # Choose the stuff that belongs to the top patch
% hg commit -i -m 'Another feature'
% hg commit -i -m 'Yet another feature'
% hg commit -m 'One more feature using the remainder of the changes'
And if you accidentally get it wrong and amend a patch with stuff that doesn't belong to it, then do
% hg uncommit -a
% hg amend -i
That will empty out the top patch, leaving the changes in your working directory, then bring up the interface to allow you to re-select just the stuff that belongs in that top patch. The remnants will be in your working directory, so proceed as usual.

Navigating through your stack

When I want to work on a patch "deeper" in the stack, I use hg update -r <rev> or hg prev to update to it, then make my changes and hg amend to include them into the changeset. If I am not at the top changeset, this will invalidate all of the other patches. My preferred way to fix this up is to use hg next --evolve to rebase the old child on top of my updated changeset and update to its new incarnation.
The usual evolve workflow you'll read elsewhere is to run hg evolve -a to automatically rebase everything that needs rebasing, but these days I almost always use hg next --evolve instead just so it does it one at a time and if a rebase runs into conflicts, it's more obvious to me which changeset is having the trouble. In fact, I made an alias
[alias]
advance = !while $HG next --evolve; do :; done
to advance as far as possible until the head changeset is reached, or a conflict occurs. YMMV.
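For anyone who would rather script this than use a shell alias, the same loop looks roughly like this in Python. This is a sketch, not something from my tools repo: the advance function, its injectable run parameter, and the max_steps safety cap are all assumptions added for illustration.

```python
import subprocess

def advance(run=subprocess.call, max_steps=1000):
    """Call `hg next --evolve` until it fails; return how many steps succeeded.

    A non-zero exit status means either the head was reached or a rebase
    hit a conflict, which is the same stopping rule as the shell alias.
    """
    steps = 0
    while steps < max_steps and run(["hg", "next", "--evolve"]) == 0:
        steps += 1
    return steps
```

Passing a fake run callable makes it easy to try the loop without touching a real repository.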

Resolving conflicts

Speaking of conflicts, all this revision control craziness doesn't come for free. Conflicts are a fact of life, and it's nice to have a good merge tool. I'm not 100% happy with it, but the merge tool I prefer is kdiff3:
[ui]
merge = kdiff3

[merge-tools]
kdiff3.executable = ~/bin/kdiff3-wrapper
kdiff3.args = --auto $base $local $other -o $output -cs SyncMode=1
kdiff3.gui = True
kdiff3.premerge = True
kdiff3.binary = False
I don't remember what all that crap is for. It was mostly an attempt to get it to label the different patches being merged correctly, but I did it in the mq days, and these days I ignore the titles anyway. I kind of wish I did know which was which. Don't use the kdiff3.executable setting, since you don't have kdiff3-wrapper[5]. The rest is probably fine.

Uploading patches to bugs

I'm an old fart, so I almost always upload patches to bugzilla and request review there instead of using MozReview. If I already have a bug, the procedure is generally
% hg bzexport -r :fitzgen 1234567 -e
In the common case where I have a patch per bug, I usually won't have put the bug number in my commit message yet, so due to this setting in my ~/.hgrc:
[bzexport]
update-patch = 1
bzexport will automatically prefix my commit message with "Bug 1234567 - ". It won't insert "r=fitzgen" or "r?fitzgen" or anything; I prefer to do that manually as a way to keep track of whether I've finished incorporating any review comments.
If I don't already have a bug, I will create it via bzexport:
% hg bzexport --new -r :fitzgen -e --title 'Crashes on Wednesdays'
Now, I must apologize, but that won't work for you. You will have to do
% hg bzexport --new -C 'Core :: GC' -r :fitzgen -e --title 'Crashes on Wednesdays'
because you don't have my fancy schmancy bzexport logic to automatically pick the component based on the files touched. Sorry about that; I'd have to do a bunch of cleanup to make that landable. And these days it'd be better to rely on the moz.build bug component info instead of crawling through history.
Other useful flags are --blocks, --depends, --cc, and --feedback. Though I'll often fill those in when the editor pops up.
Oh, by the way, if you're nervous about it automatically doing something stupid when you're first learning, run with -i aka --interactive. It will ask before doing each step. Nothing bad will happen if you ^C out in the middle of that (though it will have already done what you allowed it to do; it can't uncreate a bugzilla bug or whatever, so don't say yes until you mean it.)
If I need to upload multiple patches, I'll update to each in turn (often using hg next and hg prev, which come with evolve) and run hg bzexport for each.

Uploading again

I'm sloppy and frequently don't get things right on the first try, so I'll need to upload again. Now this is a little tricky, because you want to mark the earlier versions as obsolete. In mq times, this was pretty straightforward: your patches had names, and it could just find the bug's patch attachment with the matching name. Without names, it's harder. You might think that it would be easiest to look for a matching commit message, and you'd probably be right, but it turns out that I tend to screw up my early attempts enough that I have to change what my patches do, and that necessitates updating the commit message.
So if you are using evolve, bzexport is smart and looks backwards through the history of each patch to find its earlier versions. (When you amend a changeset, or roll/fold another changeset into it, evolve records markers saying your old patch was "succeeded" by the new version.) For the most part, this Just Works. Unless you split one patch into two. Then your bzexport will get a bit confused, and your new patches will obsolete each other in bugzilla. 🙁 My bzexport is more smarterer, and will make a valiant attempt to find an appropriate "base" changeset to use for each one. It still isn't perfect, but I have not yet encountered a situation where it gets it wrong. (Or at least not very wrong. If you fold two patches together, it'll only obsolete one of the originals, for example.) That fix should be relatively easy to land, and I "promise" to land it soon[6].
Remember to use the -r flag again when you re-upload, assuming you're still ready for it to be reviewed. You don't need the bug number (or --new option) anymore, because bzexport will grab the bug number from the commit message, but it won't automatically re-request review from the same person. You might want to just upload the patch without requesting review, after all. But usually this second invocation would look like:
% hg bzexport -r :fitzgen
(the lack of -e there means it won't even bother to bring up an editor for a new comment to go along with the updated attachment. If you want the comment, use -e. Or --comment "another attempt" if you prefer.)

Incorporating review comments

I've already covered this. Update to the appropriate patch, make your changes, hg amend, hg advance to clean up any conflicts right away, then probably hg update -r <...> to get back to where you were.

Landing

I update to the appropriate patch. Use hg amend -m to update the commit message, adding the "r=fitzgen". Or if I need to do a bunch of them, I will run hg che (or just hg chistedit), go to each relevant patch, use 'm' to change the action to 'mess' (short for "message"), 'c' to commit to this histedit action string, and edit the messages in my $EDITOR.
Now I use chistedit to shuffle my landable patches to the top of the screen (#0 is the changeset directly atop a public changeset). Do not reorder them any more than necessary. I'll update to the last changeset I want to land, and run hg push mozilla-inbound -r . to push it. (Ok, really I use an 'mi' alias, and :gps magic makes '-r .' the default for mozilla repos. So I lied, I do hg push mi.)
Next I'll usually do a final try push. I cd to the top of my source checkout, then run:
% ./mach try 
If you don't know try syntax, use https://mozilla-releng.net/trychooser/ to construct one. I've trained my brain to know various ones useful to what I work on, but you can't go too far wrong with
% ./mach try -b do -p all -u all[x64]
And this part is a lie; I actually use my hg trychooser extension which has a slick curses UI based on a years-old database of try options. That I never use anymore. I do it manually, with something like
% hg trychooser -m 'try: -b do -p all -u all[x64]'

Forking your stack

If you commit a changeset on top of a non-top patch, you will fork your stack. The usual reason is that you've updated to some changeset, made a change, and run hg commit. You now have multiple heads. hg will tell you "created new head". Which is ok, except it's not, because it's much more confusing than having a simple linear patch stack. (Or rather, a bunch of linear patch stacks, each with a bookmark at its head.)
I usually look up through my terminal output history to find the revision of the older head, then rebase it on top of the new head. But if you don't have that in your history, you can find it with the appropriate hg log command. Something like
% hg log -r 'children(.^) and not .'
changeset:   418174:b7f1d672f3cd
will give it to you directly (see also hg help revsets), assuming you haven't done anything else and made a mess. Now rebase the old child on top of your new head:
% hg rebase -b b7f1d672f3cd -d .
It will leave you updated to your new head, or rather the changeset that was formerly a head, but the other changesets will now be your descendants. hg next a bunch of times to advance through them, or use my hg advance alias to go all the way, or do it directly:
% hg update -r 'heads(.::)'
(if you're not used to the crazy revset abbreviations, you may prefer to write that as hg update -r 'heads(descendants(.))'. I'm trying not to use too many abbreviations in this doc, but typing out "descendants" makes my fingers tired.)
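If the revsets above read like line noise, this toy model in Python spells out what they compute on a tiny history graph. The graph, the changeset names, and the two helper functions are invented for illustration; real revsets run against the repository, not a dict.

```python
def descendants(children, start):
    """All changesets reachable from start via child links, including start."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return seen

def heads(children, revs):
    """Members of revs that have no child inside revs."""
    return {r for r in revs if not any(c in revs for c in children.get(r, []))}

# Hypothetical forked stack: "base" has the old child and the new commit ".".
forked = {"base": ["old1", "."], "old1": ["old2"]}

# children(.^) and not .  -> the other children of my parent
siblings = set(forked["base"]) - {"."}

# After rebasing the old branch onto ".", the history is linear again.
rebased = {"base": ["."], ".": ["old1"], "old1": ["old2"]}

# heads(descendants(.))  -> where to update to reach the top of the stack
top = heads(rebased, descendants(rebased, "."))
print(siblings, top)
```

Here siblings comes out as the old child that needs rebasing, and top is the single head you land on after the rebase.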

Workspace management

So being able to jump all over your various feature bookmarks and things is cool, but I'm working with C++ here, and I hate touching files unnecessarily because it means a slow and painful rebuild. Personally, I keep two checkouts, ~/src/mozilla and ~/src/mozilla2. If I were more disciplined about disk space, I'd probably have a few more. Most people have several more. I used to have clever schemes of updating a master repository and then having all the others pull from it, but newer hg (and my DSL, I guess) is fast enough that I now just hg pull unified manually whenever I need to. I use the default objdir, located in the top directory of my source checkout, because I like to be able to run hg commands from within the objdir. But I suspect it messes me up because hg and watchman have to deal with a whole bunch of files within the checkout area that they don't care about.

watchman

Oh yeah, watchman. It makes many operations way, way faster. Or at least it did; recently, it often slows things down before it gives up and times out. Yeah, I ought to file a bug on it. The log doesn't say much.
I can't remember how to set up watchman, sorry. It looks like I built it from source? Hm, maybe I should update, then. You need two pieces: the watchman daemon that monitors the filesystem, and the python mercurial extension that talks to the daemon to accelerate hg operations. The latter part can be enabled with
[extensions]
fsmonitor =
Maybe ./mach bootstrap sets up watchman for you these days? And ./mach mercurial-setup sets up fsmonitor? I don't know.

Debugging

debug

I have this crazy Perl script that I've hacked in various horrible ways over the years. It's awful, and awfully useful. If I'm running the JS shell, I do
% debug ./obj-js-debug/dist/bin/js somefile.js
to start up Emacs with gdb running inside it in gud mode. Or
% debug --record ./obj-js-debug/dist/bin/js somefile.js
to make an rr recording of the JS shell, then bring up Emacs with rr replay running inside it. Or
% debug --record ./jstests.py ./obj-js-debug/dist/bin/js sometest.js
to make an rr recording of a whole process tree, then find a process that crashed and bring up Emacs with rr replay on that process running inside it. Or
% debug
to bring up Emacs with rr replay running on the last rr recording I've made, again automatically picking a crashing process. If it gets it wrong, I can always do
% rr ps
# identify the process of interest
% debug --rrpid 1234
to tell it which one. Or
% ./mach test --debugger=debug some/test/file
to run the given test with hopefully the appropriate binary running under gdb inside Emacs inside the woman who swallowed a fly I don't know why. Or
% debug --js ./obj-js-debug/dist/bin/js somefile.js
to bring up Emacs running jorendb[7].

rr

I love me my rr. I have a .gdbinit.py startup file that creates a handy command for displaying the event number and tick count to keep me oriented chronologically:
(rr) now
2592/100438197
Or I can make rr display that output on every prompt:
(rr) set rrprompt on
rr-aware prompt enabled
(rr 538/267592) fin
Run till exit from #0 blah blah
(rr 545/267619)
I have a .gdbinit file with some funky commands to set hardware watchpoints on GC mark bits so I can 'continue' and 'reverse-continue' through an execution to find where the mark bits are set. And strangely dear to my heart is the 'rfin' command, which is just an easier to type alias for 'reverse-finish'. Other gdb commands:
(rr) log $thread sees the bad value  # $thread is replaced by eg T1
(rr) log also, obj is now $1         # gdb convenience vars ok
(rr) rfin
(rr) log {$2+4} bytes are required   # {any gdb expr}
(rr) n
(rr) log -dump
562/8443 T2 sees the bad value
562/8443 also, obj is now 0x7ff687749c00
346/945 7 bytes are required
(rr) log -sorted
   346/945 7 bytes are required
=> 562/8443 T2 sees the bad value
   562/8443 also, obj is now 0x7ff687749c00
(rr) log -edit  # brings up $EDITOR on your full log file
The idea is to be able to move around in time, logging various things, and then use log -sorted to display the log messages in chronological order according to the execution. (Note that when you do this, the next point in time coming up will be labeled with "=>" to show you when you are.) You might consider using this in conjunction with command as a simple way of automatically tracing the evolution of some value:
(rr) b HashTable::put
Breakpoint 1 set at HashTable::put(Entry)
(rr) comm 1
> log [$thread] in put(), table size is now {mEntries}
> cont
> end
(rr) c
Boom! You now have the value of mEntries every time put() is called. Or consider doing that with a hardware watchpoint. (But watch out for log messages in breakpoints; it will execute the log command every time you encounter the breakpoint, so if you go forwards and backwards across the breakpoint several times, you'll end up with a bunch of duplicate entries in your log. log -edit is useful for manually cleaning those up.)
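The message in that breakpoint command mixes $thread with a {mEntries} expression. Here is a rough sketch of how that kind of expansion could work, assuming $thread is swapped for a thread label, any other $name is handed to gdb as a convenience variable, and anything inside braces is evaluated as a gdb expression. The expand_message function and its evaluate callback are hypothetical; inside gdb, evaluate would presumably be something like gdb.parse_and_eval.

```python
import re

# One pass over the template: $thread, other $variables, or {expression}.
TOKEN = re.compile(r"\$thread\b|\$\w+|\{([^{}]+)\}")


def expand_message(template, thread_label, evaluate):
    """Expand $thread, $convenience variables and {expr} in a log message.

    'evaluate' maps an expression string to a value; it is injected so
    the expansion can be exercised without a running debugger.
    """

    def substitute(match):
        token = match.group(0)
        if token == "$thread":
            return thread_label
        if token.startswith("$"):
            return str(evaluate(token))      # e.g. "$1"
        return str(evaluate(match.group(1)))  # text between the braces

    return TOKEN.sub(substitute, template)


if __name__ == "__main__":
    # Stand-in values instead of a live gdb session.
    fake_values = {"$1": "0x7ff687749c00", "$2+4": 7, "mEntries": 12}
    for template in (
        "$thread sees the bad value",
        "also, obj is now $1",
        "{$2+4} bytes are required",
    ):
        print(expand_message(template, "T2", fake_values.__getitem__))
```

Doing the substitution in a single regex pass means an evaluated value that happens to contain a $ or a brace is not expanded a second time.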
Note that the default log filename is based on the process ID, and the logging will append entries across multiple rr replay runs. So if you run multiple sessions of rr replay on the same process recording, all of your log messages will be collected together. Use set logfile to switch to a different file.
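To make the append-across-sessions behavior concrete, here is a small sketch. The filename pattern rr-session-<pid>.log and the helper names are invented for the example; the real default name is whatever the script picks. The only point is that opening the file in append mode lets separate replay sessions of the same process pile their entries into one file.

```python
import os
import tempfile


def default_logfile(pid, directory=None):
    """Build a per-process log path (hypothetical naming scheme)."""
    directory = directory or tempfile.gettempdir()
    return os.path.join(directory, "rr-session-%d.log" % pid)


def append_entry(path, event, ticks, message):
    """Append one 'event/ticks message' line; earlier entries are kept."""
    with open(path, "a") as log:
        log.write("%d/%d %s\n" % (event, ticks, message))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        path = default_logfile(1234, scratch)
        # Two calls stand in for two separate replay sessions.
        append_entry(path, 562, 8443, "T2 sees the bad value")
        append_entry(path, 346, 945, "7 bytes are required")
        with open(path) as log:
            print(log.read(), end="")
```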
Finally, there's a simple pp command, where pp foo is equivalent to python print(foo).
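A command like pp takes only a few lines with gdb's Python API. The following is a sketch of one way to write it, not necessarily how mine is written; the helper pp_command_line and the class name are made up, and the gdb import is guarded so the file can also be loaded outside gdb.

```python
try:
    import gdb  # only importable when this file is sourced inside gdb
except ImportError:
    gdb = None


def pp_command_line(expression):
    """Build the gdb command that 'pp <expression>' stands for."""
    return "python print(%s)" % expression


if gdb is not None:

    class PrettyPrint(gdb.Command):
        """pp EXPR: shorthand for 'python print(EXPR)'."""

        def __init__(self):
            super().__init__("pp", gdb.COMMAND_USER)

        def invoke(self, arg, from_tty):
            # The argument is a Python expression, not a gdb expression.
            gdb.execute(pp_command_line(arg))

    PrettyPrint()
```

With this loaded, pp gdb.selected_frame().name() would behave like typing python print(gdb.selected_frame().name()).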

[1] https://www.mercurial-scm.org/wiki/EvolveExtension - install evolve by running hg clone https://bitbucket.org/marmoute/mutable-history somewhere, then adding it to your ~/.hgrc:
[extensions]
evolve = ~/lib/hg/mutable-history/hgext/evolve.py
[2] "public" is the name of a mercurial phase. It means a changeset that has been pushed to mozilla-inbound or similar. Stuff you're working on will ordinarily be in the "draft" phase until you push it.
[3] hg clone https://bitbucket.org/facebook/hg-experimental somewhere, then activate it with
[extensions]
chistedit = ~/lib/hg/hg-experimental/hgext3rd/chistedit.py
[4] When I have one patch per bug, I'll usually use hg bzexport --update to fill in the bug numbers. Especially since I normally file the bug in the first place with hg bzexport --new, so I don't even have a bug number until I do that.
[5] kdiff3-wrapper was pretty useful back in the day; kdiff3 has a bad habit of clearing the execute (chmod +x) bit when merging, so kdiff3-wrapper is a shell script that runs kdiff3 and then fixes up the bits afterwards. I don't know if it still has that issue?
[6] The quotes around "promise" translate more or less to "do not promise".
[7] jorendb is a relatively simple JS debugger that jorendorff wrote, I think to test the Debugger API. I suspect he's amused that I attempt to use it for anything practical. I'm sure the number of people for whom it is relevant is vanishingly small, but I love having it when I need it. (It's for the JS shell only. Nobody uses the JS shell for any serious scripting; why would you, when you have a web browser and Node?) (I'm Nobody.)



				
		
17
Dec 15

Animation Done Wrong (aka Fix My Code, Please!)

A blast from the past: in early 2014, we enabled Generational Garbage Collection (GGC) for Firefox desktop. But this blog post is not about GGC. It is about my victory dance when we finally pushed the button to let it go live on desktop Firefox. Please click on that link; it is, after all, what this blog post is about.
Old-timers will recognize the old TBPL interface, since replaced with TreeHerder. I grabbed a snapshot of the page after I pushed GGC live (yes, green pushes really used to be that green!), then hacked it up a bit to implement the letter fly-in. And that fly-in is what I’d like to talk about now.
At the time I wrote it, I barely knew JavaScript, and I knew even less CSS. But that was a long time ago.
Today… er, well, today I don’t know any more of either of those than I did back then.
Which is ok, since my whole goal here is to ask: what is the right way to implement that page? And specifically, would it be possible to do without any JS? Or perhaps minimal JS, just some glue between CSS animations or something?
To be specific: I am very particular about the animation I want. After the letters are done flying in, I want them to cycle through in the way shown. For example, they should be rotating around in the “O”. In general, they’re just repeatedly walking a path that is possibly discontinuous (as with any letter other than “O”). We’ll call this the marquee pattern.
Then when flying in, I want them to go to their appropriate positions within the marquee pattern. I don’t want them to fly to a starting position and only start moving in the marquee pattern once they get there. Oh noes, no no no. That would introduce a visible discontinuity. Plus which, the letters that started out close to their final position would move very slowly at first, then jerk into faster motion when the marquee began. We couldn’t have that now, could we?
I knew about CSS animations at the time I wrote this. But I couldn’t (and still can’t) see how to make use of them, at least without doing something crazy like redefining the animation every frame from JS. And in that case, why use CSS at all?
CSS can animate a smooth path between fixed points. So if I relaxed the constraint above (about the fly-in blending smoothly into the marquee pattern), I could pretty easily set up an animation to fly to the final position, then switch to a marquee animation. But how can you get the full effect? I speculated about some trick involving animating invisible elements with one animation, then having another animation to fly-in from each element’s original location to the corresponding invisible element’s marquee location, but I don’t know if that’s even possible.
You can look at the source code of the page. It’s a complete mess, combining as it does a cut and paste of the TBPL interface, plus all of jquery crammed in, and then finally my hacky code at the end of the file. Note that this was not easy code for me to write, and I did it when I got the idea at around 10pm during a JS work week, and it took me until about 4am. So don’t expect much in the way of comments or sanity or whatnot. In fact, the only comment that isn’t commenting out code appears to be the text “????!”.
The green letters have the CSS class “success”. There are lots of hardcoded coordinates, generated by hand-drawing the text in the Gimp and manually writing down the relevant coordinates. But my code doesn’t matter; the point is that there’s some messy JavaScript called every frame to calculate the desired position of every one of those letters. Surely there’s a better way? (I should note that the animation is way smoother in today’s Nightly than it was back then. Progress!)
Anyway, the exact question is: how would someone implement this effect if they actually knew what they were doing?
I’ll take my answer off the air. In the comments, to be precise. Thank you in advance.
				
								

			

				
		
05
Jun 15

Firefox directions

Some time back, I started thinking about what Firefox could do for me and its other users. Here are my thoughts, unashamedly biased and uninformed. If you don’t think something here is an awful idea, either I’ve failed or you aren’t reading closely enough.
Mozilla-specific advantages
Mozilla provides Firefox with a unique situation. We are not directly compelled to monetize Firefox or any other product. We do not need to capture users or wall them in. That doesn’t mean we don’t want to make money or gain market share — eventually, we need both, or Mozilla’s influence on the Web dies — but we have a degree of freedom that no other big player possesses.
Privacy and User Sovereignty
We can afford to give our users as much privacy as they might want. If you ask the vast majority of users about this, I suspect they’ll think it sounds like a theoretically good thing, but they don’t know what it really means, they don’t know what Firefox can do for them that other browsers don’t, and they don’t have any strong reason to care. All three of those are true of me, too.
Let’s tell them. Make an about:privacy that shows a chart of privacy features and behaviors of Firefox, Mozilla and its servers, and our major competitors. Give specific, realistic examples of information that is transmitted, what it might be used for (both probable and far-fetched). Include examples from jealous boyfriends, cautious employers, and restrictive regimes. Expose our own limitations and dirty laundry: “if this server gets hacked or someone you don’t like has privileged access, they will see that you crash a lot on youporn.com“. It won’t all fit on a page, but hopefully some impactful higher-order bits can be presented immediately, with links to go deeper. Imagine a friendly journalist who wants to write an article claiming that Firefox is the clear choice for people who care about controlling their personal data and experiences on the web: our job is to provide all the ammunition they’d need to write a convincing and well-founded article. Don’t bias or sugarcoat it — using Firefox instead of Chrome is going to protect very little from identity theft, and Google has more resources dedicated to securing their servers than we do.
If possible, include the “why”. We won’t track this data because it isn’t useful to us and we err on the side of the user. Chrome will because it’s part of their business model. Mention the positive value of lock-in to a corporation, and point out just how many sources of information Google can tap.
Update: Wait! Hold on! As a commenter pointed out, that is the exact sort of bias I just said we shouldn’t use. Google does not use Chrome to gather data as I implied. I was wrong, and made assumptions based on uninformed opinions about the motivations involved and their ramifications. Google has an incentive to limit its data collection, since not doing so would anger their users. In the end, I still feel like Mozilla is more free to side with the user than Google is, and I have to believe that now or in the future there will be significant real differences in behavior as a result, but collecting the sort of data I was implying through the browser is not one of those differences.
Anyway, back to talking about how Firefox can highlight Mozilla’s privacy advantages:
Point to Lightbeam. Make cookies visible — have them appear temporarily as floating icons when they are sent, and show them in View Source. Notify the user when a password or credit card number is sent unencrypted. Allow the user to delete and modify cookies. Or save them to external files and load them back in. Under page info, enumerate as much identity information as we can (as in, show what the server can figure out, from cookies to OS to GL capabilities.)
Gaming
I don’t know if it’s just because nobody else needs to care yet, but it seems like we have a lead on gaming in the browser. It’s an area where players would be willing to switch browsers, even if only temporarily, to get a 10% higher frame rate. Until rich web gaming starts capturing a substantial number of user hours, it doesn’t seem like the other browser manufacturers have enough of a reason to care. But if we can pull people out of the extremely proprietary and walled-off systems that are currently used for gaming and get them onto the open web, then not only do we get a market share boost but we also expand the range of things that people commonly do on our open platform. It’ll encourage remixing and collaboration and pushing the envelope, counteracting humanity’s current dull descent into stupefied consumption. The human race will develop more interconnections, develop better ways of resolving problems, and gain a richer and stronger culture for the Borg to destroy when they finally find us.
Er, sorry. Got a little carried away there. Let me just say that gaming represents more than obsessive self-indulgence. It is a powerful tool for communication and education and culture development and improved government. You’ll never truly understand in your bones how the US won its war for independence until you’ve lived it — or at least, simulated it. (Hint: it’s not because our fighters had more hit points.)
Addons
Addons are a major differentiator for Firefox. And most of them suck. Even ignoring the obvious stuff (malware, adware, etc.), for which plans are in motion to combat them, it still seems like addons aren’t providing the value they could be. People have great ideas, but sadly Chrome seems to be the main beneficiary these days. Some of that is simply due to audience size, but I don’t think that’s all of it.
I know little about addons, but I have worked on a few. At least for what I was doing, they’re a pain to write. Perhaps I always just happen to end up wanting to work on the trickiest pieces to expose nicely, but my experience has not been pleasant. How do you make a simple change and try it out? Please don’t make me start up (or restart) the browser between every change. What’s the data model of tabs and windows and things? What’s the security model? I periodically try to work on a tab management extension, but everything I do ends up getting silently ignored, probably because it doesn’t have the right privileges. I asked lots of questions at the last Summit but the answers were complicated, and incomprehensible to someone like me who is unfamiliar with how the whole frontend is put together.
And why isn’t there straightforward code that I can read and adapt? It seems like the real code that drives the browser looks rather different from what I’d need for my own addon. Why didn’t it work to take an existing addon, unpack it, modify it, and try it out? Sure, I probably did something stupid and broke it, but the browser wasn’t very good at telling me what.
That’s for complicated addons. Something else I feel is missing is super lightweight addons. Maybe Greasemonkey gives you this; I’ve barely used it. But say I’m on a page, or better yet a one-page app. I want something a little different. Maybe I want to remove a useless sidebar, or add a button that autofills in some form fields, or prevent something from getting grayed out and disabled, or iterate through an external spreadsheet to automatically fill out a form and submit it, or autologin as a particular user, or maybe just highlight every instance of a certain word. And I want this to happen whenever I visit the page. Wouldn’t it be great if I could just right-click and get an “automatic page action” menu or something? Sure, I’d have to tell it how to recognize the page, and it might or might not require writing JavaScript to actually perform the action. But if the overhead of making a simple addon could be ridiculously low, and it gave me a way of packaging it up to share with other people (or other computers of mine), it could possibly make addons much more approachable and frequently used.
It would also be an absolute disaster, in that everyone and her dog would start writing tiny addons to do things that really shouldn’t be done with addons. But so be it. Think of something easy enough to be suggested in a support document as a workaround for some site functionality gap. Even better, I’d like the browser (or, more likely, an addon-generating addon) to automatically do version control (perhaps by auto-uploading to github or another repo?), and make it easy to write self-tests and checks for whether the required page and platform functionality are still present.
Addons also don’t feel that discoverable. I can search by name, but there’s still the matter of guessing how serious (stable, maintained, high quality) an addon is. It turns my stomach to say this, but I kind of want a more social way of browsing and maintaining lists of addons. “People who are mentally disturbed in ways similar to you have left these addons enabled for long periods of time without disabling them or removing them in a fit of anger: …” Yes, this would require a degree of opt-in tracking evil, but how else can I find my true brethren and avoid the polluted mindset of godless vi-using heathens?
Hey, remember when we pissed off our addon authors by publicly shaming them with performance measurements? Could we do something similar, but only expose the real dirt after you’ve installed the addon?
Which brings me to addon blaming. It’s very hard to correctly blame a misbehaving addon, which makes me too conservative about trying out addons. I would be more willing to experiment if I had a “Why Does My Firefox Suck Right Now?” button that popped up an info box saying “because addon DrawMustachesOnCatPictures is serializing all of your image loads”. Ok, that’s probably too hard — how about just “addon X is eating your CPU”?
Why Does My Firefox Suck Right Now?
On a related note, I think a big problem is that Firefox sometimes behaves very badly and the user doesn’t know why. We really need to get better at helping people help themselves in diagnosing these problems. It feels like a shame to me when somebody loves Firefox, but they start running into some misbehavior that they can’t figure out. If we’re really lucky, they’ll try the support forums. If that doesn’t work, or they couldn’t be bothered in the first place, they come to somebody knowledgeable and ask for help. The user is willing to try all kinds of things: install diagnostic tools, email around attachments of log files, or whatever — but as far as I can tell these things are rarely useful. And they should be. We’re not very good at gathering enough data to track the problem down. A few things serve as illustrative counterexamples: restarting in safe mode is enormously helpful, and about:memory is a great tool that can pinpoint problems. Theoretically, the profiler ought to be good for diagnosing slowdowns and hangs, but I haven’t gotten much out of it in practice. (Admittedly, my own machine is Linux, and the stackwalking has never worked well enough here. But it hasn’t been a silver bullet for my friends’ Windows machines either.)
These are the sorts of situations where we are at high risk of losing users. If a PC runs Sunspider 5% faster but opening a new tab mysteriously takes 5 seconds, somebody’s going to switch browsers. Making the good case better is far less impactful than eliminating major suckage. If somebody comes to us with a problem, we should have a very well-worked out path to narrow it down to an addon or tab or swapping or networking or virus scanning or holy mother of bagels, you have *how* many tabs open?! Especially if IE and Chrome do just fine on the same computer (empty profiles or not.)
Browsing the F*ing Web
That’s what Firefox is for, right? So I have some problems there too. What’s the deal with tabs? I like opening tabs. It means I want to look at something.
I’m not fond of closing tabs. I mean, it’s fine if I’m done with whatever I was looking at. But that’s only one tab, and it’s not enough to keep other tabs from accumulating. Closing any other tab means I have to stop what I’m doing to think about whether I still want/need the tab. It’s like picking up trash. I’m willing to accept the necessity in real life, but in a computer-controlled virtual space, do I really have to?
Sadly, that means a tab explosion. Firefox is good about allowing it to happen (as in, large numbers of tabs generally work surprisingly well), but pretty crappy at dealing with the inevitable results. I know lots of people have thought hard on how to improve things here, but none of the solutions I’ve seen proposed felt exactly right.
I don’t have a solution either, but I’ll propose random things anyway:
Tabs vs bookmarks vs history is artificial. They’re all stuff you wanted at some point, some of which you want now, and some of which you’ll want again in the future. I want perfection: I want to open tabs forever without ever needing to close any, but I want the interface to only display the tabs I’m interested in right now.
Bookmarks are just tabs that I claim I might want again in the future, but I don’t want to clutter up my tab bar with right now. History additionally has all the tabs that I claim I don’t need to see again, except maybe sometime when I remember that I’ve seen something before and need it again.
Yes, I am misusing “tabs” to mean “web pages”. Sue me.
So. Let me have active tabs, collected in some number of windows, preferably viewable on the left hand side in a hierarchical organization à la Tree Style Tabs. Give me buttons on the tabs to quickly say “I don’t care at all about this anymore”, “categorize for when I want to browse by topic and find it again”, “queue this up [perhaps in a named queue] for when I am avoiding work”, and “I only want this cluttering my screen as long as these other tabs are still visible”. (Those correspond to “close tab”, “bookmark tab”, “enqueue tab”, and “reparent tab”.) Allow me to find similar tabs and inform the browser about those, too. Right-clicking on a bugzilla tab ought to give me a way to find all bugzilla tabs and close them en masse, or reparent them into a separate group. Make it easy to scan through tab groups, enqueue some, and then close the rest. I should be able to sort all the tabs by the last time I looked at them, so I can kill off the ancient ones — without losing my original sort order.
Some context: I have a lot of tabs open. Many more than fit on one screen (even using Tree Style Tabs.) Cleaning them up is a pain because of the soggy middle: the ones early in the list are things that I’ve had around for a long time and resisted closing because they’re useful or I really really want to get around to reading them. The ones late in the list are recently open and likely to be current and relevant. The stuff in the middle is mostly crap, and I could probably close a hundred in a minute or two, except the tab list keeps jumping around and when I click in the middle I keep having to wait for the unloaded pages to load just so I can kill them.
I want to throw those ancient tabs in a “to read” queue. I want to find all of my pastebin tabs and kill them off, or maybe list them out in age order so I can just kill the oldest (probably expired anyway) ones. I don’t want the “to read” queue in my active tab list most of the time, but I want to move it in (drag & drop?) when I’m in the mood. I want to temporarily group my tabs by favicon and skim through them, deleting large swathes. I want to put the knot-tying tab and origami instruction tab into a separate “to play with” folder or queue. I want to collect my set of wikipedia pages for Jim Blandy’s anime recommendations into a group and move them to a bookmark bar, which I may want to either move or copy back to the active tab list when I’m ready to look at them again. I want to kill off all the bugzilla pages except the ones where I’ve entered something into a form field. I want to skim through my active tab list with j/k keys and set the action for each one, to be performed when I hit the key to commit to the actions. I want undo. I want one of those actions, a single keystroke, to set which window the tab shows up in within the active tabs list. I want to sort the tabs by memory usage or CPU time. I want to unload individual tabs until I select them again.
I want a lot of stuff, don’t I?
Here is the place I originally intended to start talking about loaded and unloaded tabs, the perils and advantages of auto-unloading, and all that, but:
This Post
I just checked the timestamp on this post. I wrote it on August 14, 2014, and have sat on it for nearly a year. It’s been waiting for the day that I’ll finally get around to finishing it up, perhaps splitting it into several blog posts, etc. Thanks to Yoric’s Firefox re-imaginings I came back to look, and realized that what’s going to happen is that this will get old and obsolete and just die. I’d be better off posting it, rough and incomplete as it is. (And I *still* haven’t watched those anime. Where has time gone?!)
Whoa. I just looked at the preview, and this post is *long*. Sorry about that. If I were a decent human being, I would have chopped it up into digestible pieces. I guess I just don’t like you that much.
				
								

			

		
		

sfink @ Mozilla One more Blog.mozilla.com weblog than you need
