What factors affect Firefox usage?

A few months ago, we provided some initial analysis of how Firefox usage varies by day of the week (e.g., Wednesday vs. Saturday). We’ve since attempted to improve upon that analysis by accounting for more of the relevant variables in our regression equation, which allows us to isolate individual effects. For example, there is typically a significant drop-off in usage on holidays. In this case, we want to tease out the single effect of a day being a holiday, while also not allowing holidays to influence the other effects that we’re trying to measure (e.g., day of the week). Moreover, our numbers fluctuate in the days immediately following a Firefox release. This is in no way related to an actual change in usage by Firefox users; it’s attributable to the interaction of our updates with our security ping process (described here), but we still have to be cognizant of this factor while trying to understand our active daily user numbers.

Given these observable variables, our regression equation looks like:

Active Daily Users = α + β₁·dayofweek + β₂·month + β₃·holiday + β₄·daysafterupdate + ε
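To make the shape of this equation concrete, here is a minimal sketch of how such a regression could be set up. It is not the code behind our numbers: the data is synthetic, the coefficient values are invented, the "daysafterupdate" term is simplified to a single indicator for the three days after a hypothetical release, and the helper names (`design_row`, `fit_ols`, `solve_linear_system`) are ours for illustration. Day of week and month are entered as dummy variables with one baseline level dropped from each.

```python
import random

def design_row(day_of_week, month, is_holiday, is_post_update):
    """Build one row: intercept, 6 day dummies, 11 month dummies, 2 indicators."""
    row = [1.0]                                                      # alpha
    row += [1.0 if day_of_week == d else 0.0 for d in range(1, 7)]   # day 0 is the baseline
    row += [1.0 if month == m else 0.0 for m in range(2, 13)]        # month 1 is the baseline
    row += [1.0 if is_holiday else 0.0, 1.0 if is_post_update else 0.0]
    return row

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic year of data (all values below are assumptions, in millions of users).
random.seed(0)
n_days = 364
holidays = set(random.sample(range(n_days), 10))                       # 10 made-up holidays
post_update_days = {r + k for r in (50, 150, 250) for k in (1, 2, 3)}  # made-up releases
rows = [
    design_row(i % 7, min(i // 30, 11) + 1, i in holidays, i in post_update_days)
    for i in range(n_days)
]

true_beta = [0.0] * 20
true_beta[0] = 30.0    # constant
true_beta[6] = -5.0    # effect of day-of-week level 6
true_beta[18] = -3.0   # holiday effect
true_beta[19] = 1.5    # post-update effect
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.1) for row in rows]

beta_hat = fit_ols(rows, y)

# R-squared: share of the variance in y explained by the fitted values.
fitted = [sum(b * x for b, x in zip(beta_hat, row)) for row in rows]
mean_y = sum(y) / len(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1.0 - ss_res / ss_tot

print(f"holiday effect: {beta_hat[18]:.2f}, R-squared: {r_squared:.3f}")
```

Because the holiday indicator sits in the same equation as the day-of-week dummies, its coefficient is estimated with day of week held fixed, which is the isolation of effects described above.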

Are there any variables that we’re forgetting about on the right-hand side of that equation? In other words, is there anything you’d expect to affect the number of active daily Firefox users that we also need to be thinking about (or that could be affecting the four variables we’re already including)?

The output for our regression equation is below. It uses our active daily user numbers from 2007 (number of instances = 327) and the R-squared is 0.95. The percentages are the regression coefficients relative to the constant (34.3 million).
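As a small illustration of how a percentage "relative to the constant" is derived, the sketch below divides each coefficient by the constant. Only the 34.3 million constant comes from our results; the two coefficient values and the function name `relative_effects` are hypothetical and chosen purely to keep the arithmetic easy to follow.

```python
def relative_effects(constant, coefficients):
    """Express each regression coefficient as a percentage of the constant."""
    return {name: 100.0 * value / constant for name, value in coefficients.items()}

CONSTANT = 34.3e6  # constant term reported in the post (active daily users)

# Hypothetical coefficients, NOT the fitted values from our regression.
hypothetical_coefficients = {
    "saturday": -3.43e6,
    "holiday": -1.715e6,
}

effects = relative_effects(CONSTANT, hypothetical_coefficients)
for name, pct in effects.items():
    print(f"{name}: {pct:+.1f}% relative to the constant")
```

Read this way, a coefficient of minus 3.43 million on a day-of-week dummy would mean roughly 10% fewer active daily users than the baseline day, all else equal.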

We’d like to continue improving upon this analysis in the future; if you have any feedback, we’re all ears.

7 responses

  1. yacoubean wrote on :

    What about external factors, like other browser releases? It would be interesting to see if there is any effect when a new version of IE comes out, or maybe when a major security hole is found in another browser.

  2. Ville wrote on :

    Are weekend holidays marked as holidays too? Wouldn’t it be clearer to only account for holidays during the week (Mon-Fri)? After all, there’s little difference between a holiday Sunday and a regular Sunday compared with a holiday Monday and a regular Monday.

    As a different matter, have you (meaning the Blog of Metrics) tried (or considered) compiling data using the RSS feed redirects coming from localized FX 2.0 builds? You could harvest a wealth of usage data from those data points…

  3. chofmann wrote on :

    I guess once you have established some stable benchmarks for this, what you will be watching for might be “events” that drive additional internet use.

    For example:

    Traditionally, the Monday after Thanksgiving is a big online shopping day…

    Cyber Monday Online Retail Spending Hits Record $733 Million, up 21 Percent Versus Last Year – http://money.cnn.com/news/newsfeeds/articles/prnewswire/AQTU21027112007-1.htm

    This Cyber Monday took us to a new record of 48,585,836 active daily Firefox users, a 79% increase over Cyber Monday 2006 (2006-11-27, 27,065,737 active Firefox users on that day).

    Big news events also drive people to the web, and we should expect to see bumps for things like large adjustments in the stock market, unexpected political changes, conflicts and wars. It will be tougher to smooth the daily averages for this, but they should be observable in the raw data. I’d also suspect that the type of holiday would show some variation.

    Over the past few years we have seen active daily use on Christmas Day come in both above and below the norm. On the face of it that’s a bit hard to explain, but it would be interesting to track it against Christmas PC and laptop sales, to see if higher sales and new laptops under the Christmas tree showed up as additional downloads and daily use in some years, and then lagged in other years. The arrangement of holidays around weekends, making for extended time off, might also have an effect beyond just the holiday in some years. With the large number of global holidays we should see “holiday” effects for a large number of weeks during the year.

    I’d say a monthly time frame is a much better set of data to work with in most cases where we want to track trends in Firefox use and growth.

  4. AndyEd wrote on :

    Impressive model! That said, it’s pretty easy to account for variance with day of week… the challenge is going deeper. I’d suggest covariate approaches to remove day of week variance and home in on remaining factors.

  5. Alex Polvi wrote on :

    @Ville, we have not looked at the RSS feed data… however, great idea!

  6. kkovash wrote on :

    @yacoubean: external factors are a great idea. They might be few and far between, but we should still be including them.

    @chofmann: thanks for the comments. With regard to a “monthly time frame”, we are controlling for month (I actually used month fixed effects, which gives the same effect as the equation outlines).

    @Andy: agreed. I think the question for us is, what are those remaining factors?

  7. Benjamin Chuang wrote on :

    It would have been interesting to me, as a novice numbers-person, to see in the chart the old values next to the new values, so you could demonstrate the differences in the improved method of analysis.