A few months ago, we provided some initial analysis of how Firefox usage varies by day of the week (e.g., Wednesday vs. Saturday). We’ve since tried to improve on that analysis by adding the other factors we know about to our regression equation, so that each effect can be isolated from the rest. For example, there is typically a significant drop-off in usage on holidays. Here we want to tease out the single effect of a day being a holiday, without letting holidays distort the other effects we’re trying to measure (e.g., day of the week). Moreover, our numbers fluctuate in the days immediately following a Firefox release. This is in no way related to an actual change in usage by Firefox users; it’s attributable to the interaction of our updates with our security ping process (described here). We still have to account for this factor when trying to understand our active daily user numbers.
Given these observable variables, our regression equation looks like:
Active Daily Users = α + β₁·dayofweek + β₂·month + β₃·holiday + β₄·daysafterupdate + ε
Are there any variables that we’re forgetting about on the right-hand side of that equation? In other words, is there anything you’d expect to affect the number of active daily Firefox users that we also need to be thinking about (or that could be affecting the four variables we’re already including)?
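For readers who want to experiment with a model of this shape, here is a minimal sketch in Python of how such a regression could be fit with ordinary least squares. It is not the code we used: the data is synthetic, the holiday dates, release cadence, and effect sizes are made up, and the helper `dummies` is a hypothetical name introduced only for illustration. The sketch assumes that day of week and month enter as categorical (0/1 dummy) variables and that holiday and days-after-update are simple indicators.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT real Firefox numbers): 327 consecutive days.
n_days = 327
day = np.arange(n_days)
day_of_week = day % 7        # categorical, 0..6 (5 and 6 treated as the weekend)
month = day // 30            # rough 30-day "months", categorical
holiday = np.isin(day, [0, 50, 120, 200, 300]).astype(float)  # hypothetical holidays
days_after_update = ((day % 45) < 3).astype(float)            # hypothetical release cadence


def dummies(codes):
    """One 0/1 column per level, dropping the first level as the baseline."""
    levels = np.unique(codes)
    return (codes[:, None] == levels[1:]).astype(float)


# Design matrix: intercept, weekday dummies, month dummies, holiday, post-release.
X = np.column_stack([
    np.ones(n_days),
    dummies(day_of_week),
    dummies(month),
    holiday,
    days_after_update,
])

# Made-up "true" effects, in millions of users, used only to simulate y.
true_beta = np.zeros(X.shape[1])
true_beta[0] = 34.3      # constant
true_beta[5:7] = -5.0    # weekend days (day_of_week 5 and 6)
true_beta[-2] = -5.0     # holiday
true_beta[-1] = 1.5      # days right after a release
y = X @ true_beta + rng.normal(scale=0.3, size=n_days)

# Ordinary least squares fit and R-squared.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
r_squared = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)

print(f"constant: {beta_hat[0]:.2f}M, holiday: {beta_hat[-2]:.2f}M, R^2: {r_squared:.3f}")
```

Because one weekday and one month are dropped as baselines, the constant is the expected value for a non-holiday, non-release day in the baseline weekday and month, and every other coefficient is a shift relative to that.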
The output of our regression is below. It uses our active daily user numbers from 2007 (number of observations = 327), and the R-squared is 0.95. The percentages express each regression coefficient relative to the constant (34.3 million).
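To make the "relative to the constant" arithmetic concrete, here is a small sketch in Python. Only the 34.3 million constant comes from our results; the coefficient values and the function name `percent_of_constant` are hypothetical and chosen purely to illustrate the conversion.

```python
CONSTANT = 34_300_000  # the regression constant reported in this post


def percent_of_constant(coefficient, constant=CONSTANT):
    """Express a regression coefficient as a percentage of the constant."""
    return 100.0 * coefficient / constant


# Hypothetical coefficients (in users), for illustration only.
example_coefficients = {
    "example_weekend_day": -3_430_000,
    "example_holiday": -6_860_000,
}

for name, coefficient in example_coefficients.items():
    print(f"{name}: {percent_of_constant(coefficient):+.1f}% of the constant")
```

Read this way, a coefficient of minus one tenth of the constant means that the factor is associated with roughly 10% fewer active daily users than the baseline, holding the other variables fixed.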
We’d like to continue improving upon this analysis in the future; if you have any feedback, we’re all ears.