24
Feb 15

measuring power usage with power gadget and joulemeter

In the continuing evaluation of how Firefox’s energy usage might be measured and improved, I looked at two programs, Microsoft Research’s Joulemeter and Intel’s Power Gadget.

As you might expect, Joulemeter only works on Windows. Joulemeter is advertised as “a software tool that estimates the power consumption of your computer.” Estimates for the power usage of individual components (CPU/monitor/disk/“base”) are provided while you’re running the tool. (No, I’m not sure what “base” is, either. Perhaps things like wifi?) A calibration step is required before you can measure anything. I’m not entirely sure what the calibration step does, but since you’re required to be running on battery, I presume that it gathers statistics about how your battery drains and then apportions that drain among the individual components. Desktop computers can use a WattsUp power meter in lieu of running off battery. Per-application statistics are also available, though they are only estimates of power draw based on the application’s CPU usage. CSV logfiles can be captured for later analysis, with samples taken every second or so.

Power Gadget is cross-platform, and despite some dire comments on the download page above, I’ve had no trouble running it on Windows 7 and OS X Yosemite (though I do have older CPUs in both of those machines). It works by sampling energy counters maintained by the processor itself to estimate energy usage. As a side benefit, it also keeps track of the frequency and temperature of your CPU. While the default mode of operation is to draw pretty graphs detailing this information in a window, Power Gadget can also log detailed statistics to a CSV file of your choice, taking samples every ~100ms. The CSV file also logs the power consumption of all “packages” (i.e. CPU sockets) on your system.

I like Power Gadget more than Joulemeter: Power Gadget is cross-platform, captures more detailed statistics, and seems a little more straightforward in explaining how power usage is measured.

Roberto Vitillo and Joel Maher wrote a tool called energia that compares energy usage between different browsers on pre-selected sets of pages; Power Gadget is one of the tools that can be used for gathering energy statistics. I think this sort of tool is the primary use case of Power Gadget in diagnosing power problems: it helps you see whether you might be using too much power, but it doesn’t provide insight into why you’re using that power. Taking logs along with running a sampling-based stack profiler and then attempting to correlate the two might assist in providing insight, but it’s not obvious to me that stacks of where you’re spending CPU time are necessarily correlated with power usage. One might have turned on discrete graphics in a laptop, or high-resolution timers on Windows, for instance, but that wouldn’t necessarily be reflected in a CPU profile. Perhaps sampling something different (if that’s possible) would correlate better.


20
Feb 15

finding races in Firefox with ThreadSanitizer

We use a fair number of automated tools for memory errors (AddressSanitizer/LeakSanitizer for use-after-free and buffer overflows; custom leak checking on refcounted objects; Valgrind tests, plus Julian Seward’s periodic runs of mochitests under Valgrind), but we do very little in terms of checking for data races between threads. As more and more components of the browser use threads in earnest (parts of the JavaScript engine, including the GC; graphics; networking; Web features like indexedDB, workers, and WebRTC; I have probably left out some others), preventing and/or fixing data races becomes more and more important as a way of ensuring both correctness and stability. One of my goals this quarter is running mochitests and reftests with ThreadSanitizer (TSan), reporting any races that it finds, and either fixing some of them myself or convincing other people to fix them.

What is a data race? Informally, data races occur when two different threads operate on the same location in memory without any synchronization between the threads. So if you do:

*ptr = 1;

in one thread and:

if (*ptr == 1) {
...
}

in another thread (without locks or similar), that’s a data race. It’s not guaranteed that the second thread will see the value written by the first thread. Code written with that assumption can (and usually does) work as expected, but it can also go badly wrong. When things do go badly wrong, it can be extremely frustrating to find the actual problem, because the symptoms don’t typically show up at the point where the data race happened. And since the bug depends on caches and timing, the problem doesn’t always reproduce on a developer’s machine, either. You can see one of Chromium’s experiences with a data race producing a common crash; it took them nearly three months to find a fix. TSan told them exactly where to look. Google has written blog posts about their experiences using TSan with Chromium and more generally with other software they develop, including Go and their own internal tools.
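For contrast, here is a minimal sketch of one race-free way to write the same pattern, using `std::atomic` with release/acquire ordering. The function and variable names are made up for illustration; this is not code from Firefox or Chromium.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: one thread publishes a value, another waits for it.
// Returns the payload as seen by the reader thread.
int PublishAndObserve() {
  int payload = 0;             // plain data, protected by the flag below
  std::atomic<int> ready{0};   // plays the role of `*ptr` in the text
  int observed = -1;

  std::thread writer([&] {
    payload = 42;
    // Release store: everything written before this is visible to a
    // thread that later does an acquire load and sees 1.
    ready.store(1, std::memory_order_release);
  });

  std::thread reader([&] {
    // Acquire load: wait until the writer's store becomes visible.
    while (ready.load(std::memory_order_acquire) != 1) {
      std::this_thread::yield();
    }
    observed = payload;  // safe: ordered after the writer's store
  });

  writer.join();
  reader.join();
  return observed;  // join() makes the reader's write visible here
}
```

If `ready` were a plain `int`, both the flag accesses and the read of `payload` would be data races, and TSan would be expected to flag them.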

When faced with evidence of a data race, people sometimes say, “well, int/long/pointer accesses are atomic, so it’s OK.” This is decidedly not true, at least not in the sense that writes to memory locations of such types are immediately visible to all threads. People sometimes try using volatile to fix the problem; that doesn’t work either (volatile says nothing about concurrent accesses between threads, or visibility of operations to other threads). Even if you think your data race is benign (that is, it can’t cause problems), you might be surprised at what can go wrong with benign data races. Hans Boehm, who led the effort to define the C++ memory model, has written a paper describing why there is no such thing as a benign data race at the C/C++ language level. Data races are always real bugs and are explicitly undefined behavior according to the C++ standard.
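A related illustration of why "int accesses are atomic" is not a safe assumption: incrementing a shared counter is a read-modify-write, so a plain `int` (volatile or not) incremented from several threads is a data race. The sketch below uses `std::atomic<int>` instead; `CountWithAtomics` is a hypothetical helper written for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one shared counter.
// With std::atomic<int>, each increment is an indivisible operation.
int CountWithAtomics(int numThreads, int incrementsPerThread) {
  std::atomic<int> counter{0};
  std::vector<std::thread> threads;
  for (int t = 0; t < numThreads; ++t) {
    threads.emplace_back([&counter, incrementsPerThread] {
      for (int i = 0; i < incrementsPerThread; ++i) {
        // Relaxed ordering is enough here because only the final total
        // matters and it is read after all threads are joined.
        counter.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }
  for (std::thread& thread : threads) {
    thread.join();
  }
  return counter.load();
}
```

Replacing `std::atomic<int>` with `int` or `volatile int` here would make the behavior undefined, and in practice increments can be lost.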

I’m not the first person to try using TSan on Firefox inside Mozilla; Christian Holler started filing bugs about races detected by TSan over a year ago. So far, I’ve filed about 30 bugs from running “interesting” groups of mochitests: mostly mochitests under dom/ and layout/. That doesn’t sound like that many, and there are a couple of reasons for that. One is that the same races tend to get reported over and over again. I’ve applied some local fixes to silence some of the really obnoxious races, but you still have to sort through the same races to find the interesting ones. (I should probably spend quality time setting up suppression files, but I’ve avoided doing that thus far for fear of inadvertently silencing other races that I haven’t seen/reported before.) Test runs are also slow (about 10x slower than a non-TSan run), and it’s fairly common for the browser to simply hang during testing. I’ve been able to fiddle around with the mochitest harness to increase timeouts or the number of timeouts permitted, but that makes test runs go even slower, as tests that do time out take longer and longer to do so.

Fortunately, people have been responsive to the bugs I’ve filed: Honza Bambas, Valentin Gosu, and Patrick McManus on the networking side of things; Terrence Cole, Jon Coppeard, Andrew McCreight, and Shu-yu Guo on the JavaScript side of things; and Milan Sreckovic on the graphics side of things have all been fixing bugs and/or helping analyze what’s going wrong as they’ve been filed.

If you want to try out TSan locally, there’s an excellent introduction to using TSan on MDN.  I strongly recommend following the instructions for building your own, local clang; my experiences have shown that the released versions of clang don’t always work (at least with Firefox) when trying to use TSan. If you find new bugs in your favorite piece of code, please file a bug, CC me, and make it block bug 929478.


17
Feb 15

multiple return values in C++

I’d like to think that I know a fair amount about C++, but I keep discovering new things on a weekly or daily basis.  One of my recent sources of new information is the presentations from CppCon 2014.  And the most recent presentation I’ve looked at is Herb Sutter’s Back to the Basics: Essentials of Modern C++ Style.

In the presentation, Herb mentions a feature of tuple that enables returning multiple values from a function.  Of course, one can already return a pair<T1, T2> of values, but accessing the fields of a pair is suboptimal and not very readable:

pair<...> p = f(...);
if (p.second) {
  // do something with p.first
}

The designers of tuple must have been listening, because they provided the function std::tie, which lets you destructure a tuple:

typename container::iterator position;
bool inserted;  // true if the insertion took place, false if the key already existed
std::tie(position, inserted) = mMap.insert(...);

It’s not quite as convenient as destructuring multiple values in other languages, since you need to declare the variables prior to std::tie’ing them, but at least you can assign them sensible names. And since pair implicitly converts to tuple, you can use tie with functions in the standard library that return pairs, like the insertion functions of associative containers.
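Here is a small, self-contained sketch of both uses of std::tie: unpacking a tuple returned by your own function, and unpacking the pair returned by std::map::insert. `DivMod` and `InsertIfAbsent` are hypothetical helpers written for this example, not anything from the presentation or from Firefox.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical function returning two values at once: quotient and remainder.
std::tuple<int, int> DivMod(int dividend, int divisor) {
  return std::make_tuple(dividend / divisor, dividend % divisor);
}

// Hypothetical helper: inserts (key, value) and reports whether the
// insertion actually happened. std::map::insert returns
// pair<iterator, bool>, where the bool is true only for a new key.
bool InsertIfAbsent(std::map<std::string, int>& map,
                    const std::string& key, int value) {
  std::map<std::string, int>::iterator position;
  bool inserted;
  // The pair is assigned element-wise into the tied references.
  std::tie(position, inserted) = map.insert(std::make_pair(key, value));
  return inserted;
}

// Usage of the tuple-returning function with named results.
int RemainderOf(int dividend, int divisor) {
  int quotient;
  int remainder;
  std::tie(quotient, remainder) = DivMod(dividend, divisor);
  return remainder;
}
```

If you only care about one of the values, std::ignore can stand in for the other, e.g. `std::tie(std::ignore, inserted) = ...`.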

Sadly, we’re somewhat limited in our ability to use shiny new concepts from the standard library because of our C++ standard library situation on Android (we use stlport there, and it doesn’t feature useful things like <tuple>, <function>, or <thread_local>). We could, of course, polyfill some of these (and other) headers, and indeed we’ve done some of that in MFBT already. But using our own implementations limits our ability to share code with other projects, and it also takes time to produce the polyfills and make them appropriately high quality. I’ve seen several people complain about this, and I think it’s something I’d like to fix in the next several months.