Or, how I learned to stop worrying and love HTML5*
Shortly after I started at Mozilla in April, I was assigned my first project — developing the replacement for AMO’s statistics pages for add-on developers. The current page was in need of both a visual and functional refresh. One of the great things about working on AMO is that we can make certain expectations as to what browser our registered users and developers are running — namely, a recent version of a modern browser. This was exciting because it meant the new stats pages could use next-generation web features to deliver a faster and more feature-rich experience to our users. I’m excited to say that, though they’re not ready to replace the old pages quite yet, the new stats pages are fit for public preview. Take a look here!
One of the biggest challenges on the stats pages is the management of data. We store our add-on performance statistics in what is essentially a CSV format: a set of values for each metric and date. Since performance data is historical data, we would wind serving overlapping sets of data for each pageview. I realized that if a user visiting the page today comes back tomorrow, there’s only one day of new data they haven’t yet seen. The solution?
When a user visits a page for the first time, they download the data for the default time frame (which, right now, is the past 30 days). The next time they come back, the data is fetched from
localStorage instead, and the page renders much, much faster. How does this work? When the user requests a certain time range of data, the page looks at the range of time for which it already has data. If the request is satisfied by the local data, we can skip an AJAX call. Subsequently, if any of the time range requested is not available locally, the server is asked only for the portion that is missing. This allows for the user to be able to use the stats pages frequently with very little additional server load.
Having a fast persistent layer for raw data is all well and good, but analysis and visualization are what make the stats pages useful. For that I turned to another fantastic new web technology —
<canvas>. We settled on using the exceptionally nifty Highcharts charting library, which utilizes
<canvas> and inline SVG (with graceful fallbacks) for its rendering.
The charts, as well as all the other data-driven UI, each have their own pageview-length cache layers to store the generated objects they use internally, such as chart configuration objects and individual pages of data. If a particular object hasn’t already been generated, we request the data (going to the server if necessary) and then store the fully generated object for the length of the pageview. This allows for fast switching of data views within the page with minimal lag. The caches use a client-side cache object. that generates and accepts callbacks, to allow asynchronous actions such as AJAX requests to occur. I’ve posted the code for the object here for those interested.
While the charts use the time-series data from the server fairly directly, a number of UI elements are presenting aggregate statistics on the raw data in the form of sums averages and ranked lists. Instead of making another request for data the page essentially has, I decided to compute aggregations locally. Realizing iterative computation is a good way to send the browser on a one-way trip to Beachball Town, I found an excellent opportunity to use Web Workers to perform those tasks in the background. The raw data is sliced, time-wise, into the needed chunk, which is passed off to a Web Worker thread. The result? The computationally expensive job of iteratively manipulating the data is off the main thread, where it refrains from putting a chokehold on the browser, and has the added benefit of running much faster in a dedicated thread.
In the initial implementation, I found that, at times, more than 10 workers were being spawned at a time to perform background tasks. I can count on one hand how many people I know who own a computer with more than 10 CPU cores, and I am not in that group. It wasn’t really any more efficient, and the CPU usage was hairy to boot. To address this, I wrote a simple Web Worker pool that queues jobs and distributes them to a limited number of workers. Much better.
Lastly, one of the goals for the new stats pages is the ability to deep-link or bookmark a particular view of the stats. On browsers that support it, the URL for the current view is being updated using the new
history.pushState(). Of course, we fall back to
location.hash when the fancier API isn’t available.
The process of building the new stats pages led me to this realization: A lot of the demos of HTML5 and its associated trappings spend great effort trying to visually dazzle, and it’s way too easy to write it all off as nothing but eye candy. What people forget is that a lot of the new features in modern web browsers have practical ways to make the development process cleaner, the user experience faster, and the end-result more sophisticated. Though truth be told, prettier pages aren’t so bad either.
* and CSS3, and next-generation JS APIs.