One of my Q1 goals is to investigate what can be done for cross-platform power profiling, in service of the Firefox Platform’s Project Candle. Roberto Vitillo already did a bit of exploration in this space last year. Ideally, what would come out of my research would be some sort of event counter that we could hook into the Gecko Profiler. Failing that, it would at least be good to document what is possible for power profiling on each of our supported platforms.
The only power profiling tool I was at all familiar with was powertop. I thought I’d start by figuring out if there was any rough equivalent on OS X. Turns out there is; it’s called powermetrics. You can simply run:
$ sudo powermetrics
at a Terminal prompt and you will periodically (every five seconds by default) see a large amount of information concerning power usage printed. The default data sources that powermetrics draws from includes things like detailed C-state usage for all available processors. For my purposes, I’m just interested in what applications are waking up most frequently. powermetrics supports limiting its data sources with the –samplers switch:
$ sudo powermetrics --samplers tasks
will just display interesting information on all currently running tasks in the system. You might think of it as a more detailed version of Activity Monitor’s Energy tab, and indeed, you can convince powermetrics to output the “Energy Impact” number found in Activity Monitor:
$ sudo powermetrics --samplers tasks --show-process-energy
The numbers from powermetrics
don’t match up exactly with the numbers from Activity Monitor; I’m guessing that the impact of kernel_task (seen in powermetrics) is somehow spread among applications calling into the kernel by Activity Monitor. Or Activity Monitor is aggregating energy usage over a longer time period than every five seconds.
Knowing that your application is waking up rather often is helpful (Firefox is waking up ~once a second while I’m writing this), but even more helpful is where the wakeups are coming from. The easiest way I know of for finding that information is to use dtrace:
$ sudo dtrace -n 'mach_kernel::wakeup { @[ustack()] = count(); }'
It’s worth breaking down what’s going on in the above.
- The
mach_kernel::wakeup
bit selects a probe point in dtrace parlance.mach_kernel
is the “module name” andwakeup
is the “probe name”. You can see a complete list of probes by runningsudo dtrace -l
. - The bits between the braces are the actions to run when the probe point is hit. In our example, we’re counting unique stack traces when wakeups occur;
ustack
is short for “user stack”, i.e. the stack of the userspace program executing.
(dtrace is a pretty snazzy tool; it’s worth checking out Brendan Gregg’s DTrace Tools or FreeBSD’s DTrace tutorial if you’re at all interested by the above script.)
Running the above on my system for several seconds gives stack traces like the following:
libsystem_kernel.dylib`mach_msg_trap+0xa CoreFoundation`CFRunLoopWakeUp+0xab XUL`nsAppShell::ScheduleNativeEventCallback()+0x3c XUL`non-virtual thunk to nsBaseAppShell::OnDispatchedEvent(nsIThreadInternal*)+0x26 XUL`nsThread::PutEvent(nsIRunnable*, nsThread::nsNestedEventTarget*)+0x95 XUL`nsTimerImpl::PostTimerEvent(already_AddRefed)+0x229 XUL`TimerThread::Run()+0x283 XUL`nsThread::ProcessNextEvent(bool, bool*)+0x3d8 XUL`NS_ProcessNextEvent(nsIThread*, bool)+0x35 XUL`mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)+0x11b XUL`MessageLoop::Run()+0x4d XUL`nsThread::ThreadFunc(void*)+0x15b libnss3.dylib`_pt_root+0xda libsystem_pthread.dylib`_pthread_body+0x83 libsystem_pthread.dylib`_pthread_body libsystem_pthread.dylib`thread_start+0xd 90 libsystem_kernel.dylib`__psynch_cvsignal+0xa libnss3.dylib`pt_PostNotifies+0xb8 libnss3.dylib`PR_Unlock+0x4e XUL`nsTimerImpl::InitCommon(unsigned int, unsigned int)+0x1f7 XUL`mozilla::PreciseRefreshDriverTimer::ScheduleNextTick(mozilla::TimeStamp)+0x215 XUL`mozilla::RefreshDriverTimer::Tick()+0x35 XUL`nsTimerImpl::Fire()+0x3e5 XUL`nsTimerEvent::Run()+0xf9 XUL`nsThread::ProcessNextEvent(bool, bool*)+0x3d8 XUL`NS_ProcessPendingEvents(nsIThread*, unsigned int)+0x51 XUL`nsBaseAppShell::NativeEventCallback()+0x77 XUL`nsAppShell::ProcessGeckoEvents(void*)+0x132 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__+0x11 CoreFoundation`__CFRunLoopDoSources0+0x10d CoreFoundation`__CFRunLoopRun+0x39f CoreFoundation`CFRunLoopRunSpecific+0x128 HIToolbox`RunCurrentEventLoopInMode+0xeb HIToolbox`ReceiveNextEventCommon+0x1af HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter+0x47 AppKit`_DPSNextEvent+0x3c4 154 libsystem_kernel.dylib`mach_msg_trap+0xa CoreGraphics`_CGSEventAppUnresponsiveStatus+0x79 CoreGraphics`CGSEventAppUnresponsiveStatus+0x38 CoreGraphics`CGSEventIsAppUnresponsive+0x13 Activity Monitor`0x000000010e918bd1+0xa76 Activity Monitor`0x000000010e9038c8+0x141 libsysmon.dylib`sysmon_table_apply+0x28 Activity Monitor`0x000000010e9032df+0x231 Activity Monitor`0x000000010e90a156+0x192 libdispatch.dylib`_dispatch_client_callout+0x8 libdispatch.dylib`_dispatch_barrier_sync_f_invoke+0x39 Activity Monitor`0x000000010e90a09b+0x91 libsysmon.dylib`__sysmon_request_execute_block_invoke_2+0xe0 libxpc.dylib`_xpc_connection_call_event_handler+0x3a libxpc.dylib`_xpc_connection_mach_event+0x76d libdispatch.dylib`_dispatch_client_callout4+0x9 libdispatch.dylib`_dispatch_mach_msg_invoke+0x1bd libdispatch.dylib`_dispatch_queue_drain+0x23b libdispatch.dylib`_dispatch_mach_invoke+0xe8 libdispatch.dylib`_dispatch_queue_drain+0x23b 176 libsystem_kernel.dylib`mach_msg_trap+0xa CoreFoundation`CFRunLoopWakeUp+0xab XUL`nsAppShell::ScheduleNativeEventCallback()+0x3c XUL`non-virtual thunk to nsBaseAppShell::OnDispatchedEvent(nsIThreadInternal*)+0x26 XUL`nsThread::PutEvent(nsIRunnable*, nsThread::nsNestedEventTarget*)+0x95 XUL`nsTimerImpl::PostTimerEvent(already_AddRefed)+0x229 XUL`TimerThread::Run()+0x283 XUL`nsThread::ProcessNextEvent(bool, bool*)+0x3d8 XUL`NS_ProcessNextEvent(nsIThread*, bool)+0x35 XUL`mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)+0xac XUL`MessageLoop::Run()+0x4d XUL`nsThread::ThreadFunc(void*)+0x15b libnss3.dylib`_pt_root+0xda libsystem_pthread.dylib`_pthread_body+0x83 libsystem_pthread.dylib`_pthread_body libsystem_pthread.dylib`thread_start+0xd 177 libsystem_kernel.dylib`__psynch_cvbroad+0xa libnss3.dylib`PR_Interrupt+0x37 XUL`mozilla::HangMonitor::NotifyActivity(mozilla::HangMonitor::ActivityType)+0xe0 XUL`nsThread::ProcessNextEvent(bool, bool*)+0x3ce XUL`NS_ProcessPendingEvents(nsIThread*, unsigned int)+0x51 XUL`nsBaseAppShell::NativeEventCallback()+0x77 XUL`nsAppShell::ProcessGeckoEvents(void*)+0x132 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__+0x11 CoreFoundation`__CFRunLoopDoSources0+0x10d CoreFoundation`__CFRunLoopRun+0x39f CoreFoundation`CFRunLoopRunSpecific+0x128 HIToolbox`RunCurrentEventLoopInMode+0xeb HIToolbox`ReceiveNextEventCommon+0x1af HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter+0x47 AppKit`_DPSNextEvent+0x3c4 AppKit`-[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]+0xc2 XUL`-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]+0x56 AppKit`-[NSApplication run]+0x252 XUL`nsAppShell::Run()+0xfd XUL`nsAppStartup::Run()+0x29 201 libsystem_kernel.dylib`mach_msg_overwrite_trap+0xa CoreGraphics`CGXRunOneServicesPass+0x282 CoreGraphics`CGXServer+0x347 WindowServer`DYLD-STUB$$CGXServer libdyld.dylib`start+0x1 WindowServer`0x2 219
You can see that timers going off in Firefox are a major source of wakeups (those are the stacks with TimerThread::Run()
in them).
Even with stacks causing wakeups, it’s not always obvious what sequence of events leads to those wakeups; this is especially true in the e10s era, where a particular stack ending in an IPC message send might only give you half of the picture. Figuring that puzzle out will help solve problems like frequently waking up when animating images with CSS.
We’ve discussed how the profiler could be used to look at power usage in Toronto for the purposes of b2g. We discuss that a possible approach we could take would be to manually instrument around the gecko main event loop. Particularly when we enter the blocking state because no events are available. The downside is this instrumentation would miss a lot of different types of wake ups (IPC, intra-event wake up) but the up side is it would be portable to all our platforms.
With this approach we can get a nice portable feature set and use more robust platform/vendor specific tools where they are available.
Probably good to explore all the options/APIs first.
We also have the Intel power stuff that we added to the sampling. This tells us the power state of the CPU when we take a sample.