Jan 16

for-purpose instead of non-profit

I began talking with a guy in his midforties who ran an investment fund and told me about his latest capital raise. We hit it off while discussing the differences between start-ups on the East and West Coasts, and I enjoyed learning about how he evaluated new investment opportunities. Although I’d left that space a while ago, I still knew it well enough to carry a solid conversation and felt as if we were speaking the same language. Then he asked what I did.

“I run a nonprofit organization called Pencils of Promise.”

“Oh,” he replied, somewhat taken aback. “And you do that full-time?”

More than full-time, I thought, feeling a bit judged. “Yeah, I do. I used to work at Bain, but left to work on the organization full-time.”

“Wow, good for you,” he said in the same tone you’d use to address a small child, then immediately looked over my shoulder for someone new to approach…

On my subway ride home that night I began to reflect on the many times that this scenario had happened since I’d started Pencils of Promise. Conversations began on an equal footing, but the word nonprofit could stop a discussion in its tracks and strip our work of its value and true meaning. That one word could shift the conversational dynamic so that the other person was suddenly speaking down to me. As mad as I was at this guy, it suddenly hit me. I was to blame for his lackluster response. With one word, nonprofit, I had described my company as something that stood in stark opposition to the one metric that his company was being most evluated by. I had used a negative word, non, to detail our work when that inaccurately described what we did. Our primary driver was not the avoidance of profits, but the abundance of social impact…

That night I decided to start using a new phrase that more appropriately labeled the motivation behind our work. By changing the words you use to describe something, you can change how other perceive it. For too long we had allowed society to judge us with shackling expectations that weren’t supportive of scale. I knew that the only way to win the respect of our for-profit peers would be to wed our values and idealism to business acumen. Rather than thinking of ourselves as nonprofit, we would begin to refer to our work as for-purpose.

From The Promise of a Pencil by Adam Braun.

Jan 16

gecko and c++ onboarding presentation

One of the things the Firefox team has been doing recently is having onboarding sessions for new hires. This onboarding currently covers:

  • 1st day setup
  • Bugzilla
  • Building Firefox
  • Desktop Firefox Architecture / Product
  • Communication and Community
  • Javascript and the DOM
  • C++ and Gecko
  • Shipping Software
  • Telemetry
  • Org structure and career development

My first day consisted of some useful HR presentations and then I was given my laptop and a pointer to a wiki page on building Firefox.  Needless to say, it took me a while to get started!  It would have been super convenient to have an introduction to all the stuff above.

I’ve been asked to do the C++ and Gecko session three times.  All of the sessions are open to whoever wants to come, not just the new hires, and I think yesterday’s session was easily the most well-attended yet: somewhere between 10 and 20 people showed up.  Yesterday’s session was the first session where I made the slides available to attendees (should have been doing that from the start…) and it seemed equally useful to make the slides available to a broader audience as well. The Gecko and C++ Onboarding slides are up now!

This presentation is a “living” presentation; it will get updated for future sessions with feedback and as I think of things that should have been in the presentation or better ways to set things up (some diagrams would be nice…).  If you have feedback (good, bad, or ugly) on particular things in the slides or you have suggestions on what other things should be covered, please contact me!  Next time I do this I’ll try to record the presentation so folks can watch that if they prefer.

Oct 15

gecko include file statistics

I was inspired to poke at which files were most heavily #include‘d and which files contributed the most text as a result of their #include‘ing after seeing the simplicity of Libre Office’s script for doing so. I had to rewrite it in Python, as the obvious modifications to the awk script weren’t working, and I had no taste for debugging awk code. I’ve put the script up as a gist:

It’s intended to be run from a newly built objdir on Linux like so:

python includebloat.py .

The ability to pick a subdirectory of interest:

python includebloat.py dom/bindings/

was useful to me when I was testing the script, so I wasn’t groveling through several thousand files at a time.

The output lines are formatted like so:

total_size file_size num_of_includes filename

and are intended to be manipulated further via sort, etc. The script might work on Mac and Windows, but I make no promises.

The results were…interesting, if not especially helpful at suggesting modifications for future work. I won’t show the entirety of the script’s output, but here are the top twenty files by total size included (size of the file on disk multiplied by number of times it appears as a dependency), done by filtering the script’s output through sort -n -k 1 -r | head -n 20 | cut -f 1,4 -d ' ':

332478924 /usr/lib/gcc/x86_64-linux-gnu/4.9/include/avx512fintrin.h
189877260 /home/froydnj/src/gecko-dev.git/js/src/jsapi.h
161543424 /usr/include/c++/4.9/bits/stl_algo.h
141264528 /usr/include/c++/4.9/bits/random.h
113475040 /home/froydnj/src/gecko-dev.git/xpcom/glue/nsTArray.h
105880002 /usr/include/c++/4.9/bits/basic_string.h
92449760 /home/froydnj/src/gecko-dev.git/xpcom/glue/nsISupportsImpl.h
86975736 /usr/include/c++/4.9/bits/random.tcc
76991387 /usr/include/c++/4.9/type_traits
72934768 /home/froydnj/src/gecko-dev.git/mfbt/TypeTraits.h
68956018 /usr/include/c++/4.9/bits/locale_facets.h
68422130 /home/froydnj/src/gecko-dev.git/js/src/jsfriendapi.h
66917730 /usr/include/c++/4.9/limits
66625614 /home/froydnj/src/gecko-dev.git/xpcom/glue/nsCOMPtr.h
66284625 /usr/include/x86_64-linux-gnu/c++/4.9/bits/c++config.h
63730800 /home/froydnj/src/gecko-dev.git/js/public/Value.h
62968512 /usr/include/stdlib.h
57095874 /home/froydnj/src/gecko-dev.git/js/public/HashTable.h
56752164 /home/froydnj/src/gecko-dev.git/mfbt/Attributes.h
56126246 /usr/include/wchar.h

How does avx512fintrin.h get included so much? It turns out <algorithm> drags in a lot of code, despite people usually only needing min, max, or swap. In this case, <algorithm> includes <random> because std::shuffle requires std::uniform_int_distribution from <random>. This include chain is responsible for essentially all of the /usr/include/c++/4.9-related files in the above list.

If you are compiling with SSE2 enabled (as is the default on x86-64 Linux), then<random> includes <x86intrin.h> because <random> contains a SIMD Mersenne Twister implementation. And <x86intrin.h> is a clearinghouse for all sorts of x86 intrinsics, even though all we need is a few typedefs and intrinsics for SSE2 code. Minus points for GCC header cleanliness here.

What about the top twenty files by number of times included (filter the script’s output through sort -n -k 3 -r | head -n 20 | cut -f 3,4 -d ' ')?

2773 /home/froydnj/src/gecko-dev.git/mfbt/Char16.h
2268 /home/froydnj/src/gecko-dev.git/mfbt/Attributes.h
2243 /home/froydnj/src/gecko-dev.git/mfbt/Compiler.h
2234 /home/froydnj/src/gecko-dev.git/mfbt/Types.h
2204 /home/froydnj/src/gecko-dev.git/mfbt/TypeTraits.h
2132 /home/froydnj/src/gecko-dev.git/mfbt/Likely.h
2123 /home/froydnj/src/gecko-dev.git/memory/mozalloc/mozalloc.h
2108 /home/froydnj/src/gecko-dev.git/mfbt/Assertions.h
2079 /home/froydnj/src/gecko-dev.git/mfbt/MacroArgs.h
2002 /home/froydnj/src/gecko-dev.git/xpcom/base/nscore.h
1973 /usr/include/stdc-predef.h
1955 /usr/include/x86_64-linux-gnu/gnu/stubs.h
1955 /usr/include/x86_64-linux-gnu/bits/wordsize.h
1955 /usr/include/x86_64-linux-gnu/sys/cdefs.h
1955 /usr/include/x86_64-linux-gnu/gnu/stubs-64.h
1944 /usr/lib/gcc/x86_64-linux-gnu/4.9/include/stddef.h
1942 /home/froydnj/src/gecko-dev.git/mfbt/Move.h
1941 /usr/include/features.h
1921 /opt/build/froydnj/build-mc/js/src/js-config.h
1918 /usr/lib/gcc/x86_64-linux-gnu/4.9/include/stdint.h

Not a lot of surprises here. A lot of these are basic definitions for C++ and/or Gecko (<stdint.h>, mfbt/Move.h).

There don’t seem to be very many obvious wins, aside from getting GCC to clean up its header files a bit. Getting us to the point where we can use <type_traits> instead of own homegrown mfbt/TypeTraits.h would be a welcome development. Making js/src/jsapi.h less of a mega-header might help some, but brings of a burden of “did I remember to include the correct JS header files”, which probably devolves into people cutting-and-pasting complete lists, which isn’t a win. Splitting up nsISupportsImpl.h seems like it could help a little bit, though with unified compilation, I suspect we’d likely wind up including all the split-up files at once anyway.

Sep 15

compiler-enforced locked accesses

If you’ve done any amount of threaded programming, you’ve probably run across code that looked like:

  // Only accessed with the mutex held.
  uint32_t mFlags;
  bool mConnected;
  nsTArray<int32_t> mData;

  // Only called with the mutex held.
  void DoSomething();

Perhaps you’ve even gotten to debug code which inadvertently violated the locking requirements of the members.

Several months ago, I reviewed a patch by David Keeler that addressed the second half of the above example. Methods that had locking requirements looked like:

  void DoSomething(MutexAutoLock& aProofOfLock);

which ensures (at compile time!) that you can’t call the function without locking the mutex first. I thought this was a nice technique, said as much in my review, and have been looking for places to apply it ever since.

The explicitness and the requirement to constantly pass MutexAutoLock& variables around is a feature, not a bug. Doing so encourages you to limit the amount of code that needs to be executed with locks held, keeping the concurrent parts of the code and the synchronized parts of the code clearly delimited. In this respect, this technique is similar to the IO monad in Haskell. I don’t know whether the extra verbosity would make the code more difficult to read or not, especially if the techniques for guarding members suggested below were applied as well.

This coding style also came in handy a week or so ago investigating overly high CPU usage when playing YouTube videos. Our event queue for nsIRunnables did its own locking internally, and exposed its internal reentrant monitor “for power users”. This led to code like:

  ReentrantMonitorAutoEnter mon(mEvents.GetReentrantMonitor());

where the PutEvent call would do an extra round of locking (“just to be sure”), which was wasted work. Data structures like this doing their own locking internally typically isn’t a great idea, so part of the work in the above bug was to separate the locking requirements of the queue from who actually needs to do the locking. Or, in other words, we can have the class that owns the event queue do the locking, and have the event queue’s methods enforce the locking programmatically:

  MonitorAutoLock mon(mMonitor);
  mEvents.PutEvent(event, mon);

Now there’s no wasted work in PutEvent, because it already knows the appropriate locking has been done. The ability to use non-reentrant monitors—which are more efficient—was a nice bonus resulting from the separation of concerns here.

This technique can also help solve the first half of the problem we presented at the beginning of this post: ensuring members are only accessed with locks held.

template<typename T>
class Guarded;

class Guarded<uint32_t>
  Guarded() : mValue(0) {}

  uint32_t Value(MutexAutoLock& aProofOfLock)
    return mValue;

  void Assign(MutexAutoLock& aProofOfLock, uint32_t aNewValue)
    mValue = aNewValue;

  // Since accesses should only be done under the lock, and copying
  // and moving would therefore require locks, we require the user
  // to ensure those constraints are met with explicit calls to the
  // above rather than the compiler sneaking unlocked accesses in.
  Guarded(const Guarded&) = delete;

  uint32_t mValue;

The above class isn’t production quality code; it’s intended to sketch out how explicit locking requirements might work. A more robust version might require a Mutex& reference to be passed to the constructor, and member functions assert that the MutexAutoLock& parameters actually lock the specified mutex. Specializing for each of the integer types would also get tiresome, so we’d need to do a better job there. Handling types with methods could be done with something like the following, I think:

template<typename T>
class GuardedAggregate
  GuardedAggregate() : mValue() {}

  // The core idea here is that the user would write:
  // GuardedAggregrate<nsTArray> mArray;
  // and then accesses would be done via:
  // mArray.Value(lock).Method(...);
  // This means that we don't have to proxy every single one of
  // the aggregate's methods, but the locking requirements are
  // still explicit.
  class Proxy
    Proxy(MutexAutoLock& aProofOfLock, T* aValue) : mValue(aValue)

    T* operator->()
      return mValue;

    T* mValue;

  Proxy Value(MutexAutoLock& aProofOfLock)
    return Proxy(aProofOfLock, &mValue);

  T mValue;

This can also be though of as a compiler-independent, but less flexible version of clang’s Thread Safety Analysis. Folks have been asking about bringing the annotations that analysis requires into Gecko; I wonder if it might work just as well to apply this technique more liberally throughout the codebase.

Sep 15

standardizing things my way

I was reading The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age and ran across a passage that resonated:

Everybody, of course, supports standardization—in theory. But human beings (particularly, but not exclusively, famous Harvard professors practicing at famous Boston hospitals) want things to be standardized their way. The difficulty that doctors face in accepting a workplace that is not custom-designed around their personal preferences is captured in this old joke about the physician staffs of hospitals: What do you call a 99-1 vote of the medical staff? A tie.

Examples abound: coding styles, version control systems, code review systems…

Aug 15

explicit is better than implicit: c++ implicitly defined member functions

In the tradition of The Zen of Python, I’ve been thinking about pushing for explicit declarations of otherwise implicitly-defined member functions in C++, both in code that I write and in code that I review:

// Instances of this class should not be copied.
MyClass(const MyClass&) = delete;
MyClass& operator=(const MyClass&) = delete;

// We are OK with the default semantics.
OtherClass(const OtherClass&) = default;
OtherClass& operator=(const OtherClass&) = default;
OtherClass(OtherClass&&) = default;
OtherClass& operator=(OtherClass&&) = default;

[Background: C++ specifies several member functions that the compiler will implicitly define for you in any class: the default constructor, the copy/move constructor(s), and the copy/move assignment operator(s). I say “implicitly define”, as though that always happens, but there are a number of constraints on when the compiler will do this. For the purposes of the discussion below, I’ll ignore the default constructor bit and focus on the copy/move constructor and assignment operator. (I will also happily ignore all the different variants thereof that can occur, e.g. when the compiler defines MyClass(MyClass&) for you.) I think the arguments apply equally well to the default constructor case, but most classes I deal with tend to either declare their own default constructor or have several user-defined constructors anyway, which prohibit the compiler from implicitly declaring the default constructor.]

I think the argument for = delete is more obvious and less controversial, so I’ll start there.  = delete‘ing functions you don’t want used is part of the API contract of the class.  Functions that shouldn’t be used shouldn’t be exposed to the user, and = delete ensures that the compiler won’t implicitly define part of your API surface (and users thereby unknowingly violate API guarantees).  The copy constructor/assignment operator are the obvious candidates for = delete, but using = delete for the move constructor/assignment operator makes sense in some cases (e.g. RAII classes). Using = delete gives you pleasant compiler error messages, and it’s clearer than:

  MyClass(const MyClass&);
  MyClass& operator=(const MyClass&);

If you’re lucky, there might be a comment to the effect of // Deliberately not defined.  I know which code I’d prefer to read. (Using = delete also ensures you don’t accidentally use the not-defined members inside the class itself, then spend a while waiting for the linker errors to tell you about your screw-up.)

= default appears to be a little harder to argue for.  “Experienced” programmers always know which functions are provided by the compiler, right?

Understanding whether the compiler implicitly defines something requires looking at the entire class definition (including superclasses) and running a non-trivial decision algorithm. I sure don’t want each reader of the code to do that for two or four different member functions (times superclasses, too), all of which are reasonably important in understanding how a class is intended to be used.

Explicitly declaring what you intend can also avoid performance pitfalls. In reading through the C++ specification to understand when things were implicitly declared, I discovered that the same functions can also be implicitly deleted, including this great note: “When the move constructor is not implicitly declared or explicitly supplied, expressions that otherwise would have invoked the move constructor may instead invoke a copy constructor.” So, if the move constructor was implicitly declared at some point, but then was implicitly deleted through some change, expressions that were previously efficient due to moving would become somewhat less so due to copying. Isn’t C++ great?

Being explicit also avoids the possibility of meaning to define something, but getting tripped up by the finer points of the language:

template<typename T>
class MyClass
  // This does not define a copy constructor for MyClass<T>.
  template<typename U>
  MyClass(const MyClass<U>& aOther) : ... { ... }

Comments could serve to notify the reader that we’re OK with the default definition, but if I could choose between encoding something in a place solely intended for humans, or a place both humans and the compiler will understand, I know which one I’d pick.

May 15

white space as unused advertising space

I picked up Matthew Crawford’s The World Outside Your Head this weekend. The introduction, subtitled “Attention as a Cultural Problem”, opens with these words:

The idea of writing this book gained strength one day when I swiped my bank card to pay for groceries. I watched the screen intently, waiting for it to prompt me to do the next step. During the following seconds it became clear that some genius had realized that a person in this situation is a captive audience. During those intervals between swiping my card, confirming the amount, and entering my PIN, I was shown advertisements. The intervals themselves, which I had previously assumed were a mere artifact of the communication technology, now seemed to be something more deliberately calibrated. These haltings now served somebody’s interest.

I have had a similar experience: the gas station down the road from me has begun playing loud “news media” clips on the digital display of the gas pump while your car is being refueled, cleverly exploiting the driver as a captive audience. Despite this gas station being somewhat closer to my house and cheaper than the alternatives, I have not been back since I discovered this practice.

Crawford continues, describing how a recent airline trip bombarded him with advertisements in “unused” (“unmonetized”?) spaces: on the fold-down tray table in his airplane seat, the moving handrail on the escalator in the airport, on the key card (!) for his hotel room. The logic of filling up unused space reaches even to airport security:

But in the last few years, I have found I have to be careful at the far end of [going through airport security], because the bottoms of the gray trays that you place your items in for X-ray screening are now papered with advertisements, and their visual clutter makes it very easy to miss a pinky-sized flash memory stick against a picture of fanned-out L’Oréal lipstick colors…

Somehow L’Oréal has the Transportation Security Administration on its side. Who made the decision to pimp out the security trays with these advertisements? The answer, of course, is that Nobody decided on behalf of the public. Someone made a suggestion, and Nobody responded in the only way that seemed reasonable: here is an “inefficient” use of space that could instead be used to “inform” the public of “opportunities.” Justifications of this flavor are so much a part of the taken-for-granted field of public discourse that they may override our immediate experience and render it unintelligible to us. Our annoyance dissipates into vague impotence because we have no public language in which to articulate it, and we search instead for a diagnosis of ourselves: Why am I so angry? It may be time to adjust the meds.

Reading the introduction seemed especially pertinent to me in light of last week’s announcement about Suggested Tiles. The snippets in about:home featuring Mozilla properties or efforts, or even co-opting tiles on about:newtab for similar purposes feels qualitatively different than using those same tiles for advertisements from third parties bound only to Mozilla through the exchange of money. I have at least consented to the former, I think, by downloading Firefox and participating in that ecosystem, similar to how Chrome might ask you to sign into your Google account when the browser opens. The same logic is decidedly not true in the advertising case.

People’s attention is a scarce resource. We should be treating it as such in Firefox, not by “informing” them of “opportunities” from third parties unrelated to Mozilla’s mission, even if we act with the utmost concern for their privacy. Like the gas station near my house, people are not going to come to Firefox because it shows them advertisements in “inefficiently” used space. At best, they will tolerate the advertisements (maybe even taking steps to turn them off); at worst, they’ll use a different web browser and they won’t come back.

Apr 15

recent tsan improvements for firefox

One of my goals for Q2 was to make Firefox usable with “modern” (version 3.6+) clang. For reasons previously unknown, Firefox would only work with relatively old clang (version ~3.3); indeed, our wiki page for TSan recommends checking out specific SVN versions!

I’m happy to say that goal has been met.  If you like, you can read (some of) the gory details in bug 1153244.  Checking out a development version of Clang and building that should enable you to test Firefox (from mozilla-central, since that’s where the aforementioned bug has landed thus far) with TSan.  Upgrading through several major releases has its benefits: TSan appears to be somewhat more stable—tests that would previously timeout reliably now run without complaint.  I’m not sure that it’s quite reliable enough to run in automation yet, but running TSan tests in automation is definitely more feasible than it was last quarter.

With an eye towards doing that, and also with an eye towards improving individual developer experience, I’ve recently landed several patches that teach our test harnesses about TSan-specific things.  For instance, when we’re running mochitests, we only care about races in the browser itself; we don’t care about races while running any of the supporting tools (particularly xpcshell, which shares a lot of code with the browser through libxul).  So TSan should only be enabled for the actual browser executable.  We already do similar things for our ASan builds.

I also discovered that my local builds weren’t using the llvm-symbolizer tool to produce backtraces in TSan’s reports, but instead defaulting to using addr2lineaddr2line is a fine tool, but llvm-symbolizer understands debug information better (for instance, it shows you synthesized frames for inlined functions, which is a huge win) and llvm-symbolizer is somewhat faster than addr2line. In my case, it was a combination of llvm-symbolizer not being in my PATH, and not informing TSan about the location of llvm-symbolizer ($OBJDIR/dist/bin/, usually). The former was addressed by setting LLVM_SYMBOLIZER in my mozconfig (not necessary if llvm-symbolizer is on your PATH), and the latter was addressed by adding a bit of code to our test scripts.

Now, to update the wiki with better instructions!

Apr 15

on development speedbumps

Last week, I ran into a small speedbump with my development process. I normally write my patches and commit them to git with initial lines that look like:

fix IPDL thinko for never-inline method declarations; r=bent

Then, when I use git-bz to push my patches to bugzilla, it autofills the r? flag for the patch to :bent, according to the r= bit specified in the commit message. Quite convenient.

Except last week, something broke when trying to use that feature. Bugzilla would complain that :bent was an invalid user. “OK, fine,” I initially thought, “maybe :bent is unavailable.” (People often change their Bugzilla name to not include :NAME when they won’t be reachable for a period of time.) Inspection showed this to not be the case. Although the initial evaluation was that git-bz was old and crusty (it screen-scrapes Bugzilla, rather than using Bugzilla’s REST API), it turned out that Bugzilla was indeed at fault.

OK, but what to do until Bugzilla actually gets updated with the fix? Options:

  1. Cope with a little extra work: use git-bz as normal, but manually set the r? flags from Bugzilla’s web UI.
  2. Paralysis: any extra work is unacceptable, so do nothing (at least in terms of writing patches).

I admit option 2 is a bit silly, but that’s exactly what the situation felt like: there had been this unexpected obstacle in my normally smooth development process and it was simply Not Worth going back and doing things the old way.

I then realized that this situation, in reverse, is exactly the “barriers to entry” that people like Gregory Szorc talk about: every extra step that you have to do in Firefox development (or development generally) is one more place for somebody to run into trouble and decide contributing to your project isn’t worth the hassle. Make the extra steps go away! (And then don’t bring them back via bugs. *wink*)

Mar 15

tsan bug finding update

At the beginning of Q1, I set a goal to investigate races with Thread Sanitizer and to fix the “top” 10 races discovered with the tool.  Ten races seemed like a conservative number; we didn’t know how many races there were, their impact, or how difficult fixing them would be.  We also weren’t sure how much participation we could count on from area experts, and it might have turned out that I would spend a significant amount of time figuring out what was going on with the races and fixing them myself.

I’m happy to report that according to Bugzilla, nearly 30 of the races reported this quarter have been fixed.  Folks in the networking stack, JavaScript GC, JavaScript JIT, and ImageLib have been super-responsive in addressing problems.  There are even a few bugs in the afore-linked query that have been fixed by folks on the WebRTC team running TSan or similar thread-safety tools (Clang’s Thread Safety Analysis, to be specific) to detect bugs themselves, which is heartening to see.  And it’s also worth mentioning that at least one of the unfixed DOM media bugs detected by TSan has seen some significant threadsafety refactoring work in its dependent bugs.  All this attention to data races has been very encouraging.

I plan on continuing the TSan runs in Q2 along with getting TSan-on-Firefox working with more-or-less current versions of Clang.  Having to download specific Subversion revisions of the tools or precompiled Clang 3.3 (!) binaries to make TSan work is discouraging, and that could obviously work much better.