Test Pilot New Tab Study Results

[Cross-posted at Mozilla User Research]

While some browsers present rich information on a newly opened tab, the new tab page in Firefox is intentionally left blank.

The decision to leave new tab pages in Firefox blank was driven, in part, by a suspicion that too much information in the new tab may distract users from their intended destination. To test whether this suspicion is true, and to learn more about user behavior after opening a new tab, Test Pilot recently released the New Tab Study and will soon release a multivariate test on the new tab page. Test Pilot is a platform for collecting structured user feedback through Firefox. It currently has about 3 million users, and all of its studies are opt-in. You can help us better understand how people use their web browser and the Internet, so that we can build better products, by participating in studies. The Test Pilot add-on is available here. The study ran for 5 days and, in all, we collected 256,282 valid submissions.
Results of the study show that on average each user daily:
  • opens 11 new blank tabs
  • loads 7 pages
  • visits 2 unique domains
  • visits 2 pages in a new tab before they leave or close it

Below are details on how users load a page in a new tab, their intentions when opening a new tab, and the time they spend on new tabs.

How do users load a page in new tabs?

We detected 11 different methods of loading a web page in a blank tab. Actions in the URL bar include pressing ENTER on the keyboard, clicking the Go button on the right side of the bar, clicking a page suggestion in the dropdown menu, and pressing ENTER on a dropdown suggestion. The same 4 actions can be performed in the search bar. Users can also load a previously saved page from the bookmark bar in the toolbar, or from Bookmarks/History in the menu bar.


  • The URL bar is most used when navigating to new websites.
  • The Search bar is also popular, though users rarely use its dropdown to look up old search terms.
  • The Bookmark toolbar is used more often than the bookmark menu button.
  • The History Menu button is seldom used.

We can also classify all methods of loading web pages as either keyboard-based or mouse-based. Generally speaking, users have a slight preference for the mouse.


Why do users open new tabs?

1.    Are they looking for a specific URL?

13.95% of new tabs (13,941,404) are opened while the clipboard contains text starting with “http” or “www”, which is very likely a URL string. The number is surprisingly high, although it may reflect earlier copy actions rather than an intent to paste and load a specific URL.
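The study's heuristic amounts to a small check like the one below (illustrative only; the function name and normalization are our assumptions, not the study's actual instrumentation):

```python
def looks_like_url(clipboard_text: str) -> bool:
    """Heuristic from the study: clipboard text beginning with
    "http" or "www" is treated as a likely URL string."""
    text = clipboard_text.strip().lower()
    return text.startswith("http") or text.startswith("www")
```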

2.    Users browse a limited set of domains, and only a small proportion of domains attract most visits

If we represent each user as a single point in a plot where the x-axis is the number of page loads and the y-axis is the number of unique domains visited, we get the following graph. The dashed line (diagonal) shows what would happen if users visited a different domain on every page load. When users are not very active (up to a few hundred page loads), the number of unique domains grows linearly. However, as users browse more, the number of distinct domains tends to saturate.

Globally, we checked the visit frequencies of all domains and found that only 17.38% of domains (out of 461,133 unique domains) account for 80% of total page loads (8,291,541 page loads in all). This is consistent with the famous “80-20” law of long-tail phenomena.

At the individual level, we are interested in whether a single user's browsing also follows the 80-20 law. For each individual, the domains accounting for 80% of their total page visits are defined as “main domains”. A user conforms to the 80-20 law if the ratio of their main domains to their distinct domains is around 20%. According to the following figure, active users browse more web pages every day, but the proportion of primary sites they go to decreases. This suggests that as users visit more pages, they return to the same sites more frequently. The result supports, to some extent, the case for a speed-dial new tab page.
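The per-user “main domains” ratio can be sketched as follows (a minimal illustration; the function name and input format are assumptions, not the study's code):

```python
from collections import Counter

def main_domain_ratio(visits, coverage=0.8):
    """Given one domain entry per page load, return the fraction of
    distinct domains ("main domains") needed to cover `coverage` of
    all page loads.  A ratio near 0.2 matches the 80-20 pattern."""
    counts = Counter(visits)
    total = sum(counts.values())
    cumulative = 0
    main = 0
    for _, n in counts.most_common():  # most-visited domains first
        cumulative += n
        main += 1
        if cumulative >= coverage * total:
            break
    return main / len(counts)
```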


Time Spent on New Tabs

According to the study results, users open on average 2 pages in a new tab before they leave or close it. They load the first web page 6 seconds (median) after opening a new tab, and stay on the tab for 1 minute (median) once they start browsing. The distributions of both reaction times have broad tails, so the means are much higher than the medians: users load the first web page 45 seconds (mean) after opening a new tab, and stay on the tab for 7 minutes (mean) once they start browsing, since outliers and noise can inflate the mean considerably.
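A tiny illustration of why the means run so far above the medians under heavy tails (the numbers below are made up for illustration, not study data):

```python
import statistics

# Illustrative page-load delays in seconds; one slow outlier
# drags the mean far above the median, as in heavy-tailed data.
load_times = [3, 4, 5, 6, 6, 7, 8, 10, 15, 600]
median = statistics.median(load_times)  # robust to the outlier
mean = statistics.mean(load_times)      # inflated by the outlier
```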
Meanwhile, how users open a new tab distinguishes mouse-based users from keyboard-based users. Tabs invoked by the “Plus Button” or a “Double Click on the Tab Bar” represent mouse-based users, while tabs invoked by “Command+T” represent keyboard-based users. The results show that keyboard-based users act slightly faster than mouse-based ones, and stay on the same new tab a bit longer.

This study is a preliminary step toward redesigning the new tab page in Firefox. We identified patterns in how users use new tabs, including how they load a new page, the breadth of domains they visit, and the timing of different actions. In the upcoming New Tab Multivariate Test, we will compare several designs of the new tab page and answer further research questions, including whether too much information in the new tab distracts users from their original target.

Join Mozilla Metrics

The Mozilla Metrics team is expanding to meet the growing data related opportunities and challenges faced by Mozilla and the web as a whole. In addition to open positions for a visualization expert and a metrics software engineer, we are also looking for a data analyst to focus on user experience. The UX data analyst will gather structured user insights and then leverage these insights to inspire and inform the design of our products.

Please reach out to us if you (or someone you know) have a passion for data and building a better internet.

The Mozilla community itself is also growing – so if data isn’t your thing, be sure to check out the other career listings as well!

Investigating Users’ Willingness to Recommend Firefox

Market research has shown that the Mozilla mission is a powerful attractor for Firefox users.  Furthermore, additional research has shown that recommendation is a strong method to promote the adoption of Firefox.

These observations lead to the following question: how does one’s willingness to recommend Firefox relate to their knowledge that Firefox is made by Mozilla, a mission-driven non-profit?  As an initial hypothesis, we posited that one’s willingness to recommend Firefox would be positively related to their knowledge of Firefox as a product of a mission-driven non-profit.

Using the beta survey interface, we asked Firefox 3.6 users the following questions:

  1. Did you know that by using Firefox, you are supporting a mission-driven non-profit organization?
  2. How likely are you to recommend Firefox to a friend or colleague? (0-10)

The response scale for question 2 was then used to calculate a Net Promoter Score (NPS), a marketing metric that gauges users’ willingness to recommend a product or service.  A person who responds at 6 or lower is considered a “detractor,” whereas one who says “9” or “10” is considered a “promoter.”  All others are considered “neutral.”

We calculate an NPS by subtracting the proportion of detractors from the proportion of promoters.  Thus, scores range from -1 to 1, and higher is better. This metric lets us investigate the relationship between knowing that Firefox is made by a mission-driven non-profit and one’s willingness to recommend Firefox to others.
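The calculation can be sketched as follows (a minimal sketch; the function name is ours):

```python
def net_promoter_score(responses):
    """Compute NPS from 0-10 willingness-to-recommend responses:
    the proportion of promoters (9-10) minus the proportion of
    detractors (0-6).  Scores therefore range from -1 to 1."""
    n = len(responses)
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return (promoters - detractors) / n
```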

As a survey experiment, we also reversed the presentation of the questions, meaning that for some of the time, we asked respondents to give their willingness to recommend before they indicated whether they knew Firefox was made by a mission-driven non-profit.  We did this in order to determine if simply informing users of this fact was enough to induce a “knowledge” effect.

Figure 1a shows that at every level of response, there are more users who say they did not know that Firefox was produced by a mission-driven non-profit than those who say they did.  In particular, the number of “neutrals” (which can be interpreted as the 5s, since it is the midpoint) is greater in the “without knowledge” group than in the “with knowledge” group.  These data lend some credibility to the idea that knowing Firefox is made by a mission-driven non-profit relates to willingness to recommend, and that more people are unaware of this fact than are aware of it.

Figure 1b shows the initial results of the question ordering experiment.  Asking users to indicate knowledge first appears to reduce the number of 8s, 9s, and 10s.   A potential explanation of this effect is that users can tell we are trying to induce positive feelings towards Firefox by presenting this information first.  We interpret these data to indicate that simply telling users that Firefox is made by a mission-driven non-profit is not enough to boost their willingness to promote.

Figure 2 demonstrates the relationship of willingness to recommend by knowledge group.  This effect is quite pronounced.  The NPS of these groups (“Yes, I did know” versus “No, I didn’t know”) are different from each other, where those with knowledge are much more likely to say that they are willing to recommend Firefox to others.

These results support our initial hypothesis: one’s willingness to recommend Firefox is positively related to one’s knowledge that Firefox is made by a mission-driven non-profit.   Note that this is not a causal relationship; from this data, we cannot say that knowledge directly boosts one’s willingness to recommend Firefox.  No statistical tests of inference have been performed.  However, this survey study strongly indicates that this relationship bears further investigation.

Using the New Days Last Ping Metric to Look at Firefox 4 Downloads

As with many software companies, we are keenly interested in gauging active installations and understanding how people use our products. However, as a non-profit organization with a strong interest in promoting privacy, we also recognize there’s a fine line between this activity and tracking people in unwelcome ways. Mozilla has developed a new mechanism that further enhances privacy, while still meeting its objective to create usage metrics.

1. Measuring Usage Activity

1.1 Our Current Approach: Blocklist Cookies

Mozilla has a mechanism for maintaining installed add-ons called a “blocklist” [https://wiki.mozilla.org/Blocklisting]. This involves a scheduled request to retrieve an updated blocklist file from Mozilla servers.  The request is currently performed no more than once per 24 hours by several Gecko-powered applications maintained by Mozilla (e.g., Firefox Desktop and Mobile). Because the request only happens once per day, we can study the pattern and volume of requests to understand how many active installations of a product there are on a particular day. While requests do not collect or use any personal user information, they currently use a cookie to study how many unique active installations we see during a given time period.

1.2 From FF4 Onwards: Days Since Last Ping

It’s been our experience that using cookies to track unique usage is somewhat unreliable, and cookies raise privacy issues for our users. Cookies can be cleared by the user and sometimes even corrupted by proxies. For these reasons, the Mozilla Metrics team filed bug 616835 [1] to implement a new method for tracking unique installations without any need for a cookie or any other form of identifier. This not only gives us better usage metrics, but also strengthens user privacy by removing the old cookie entirely.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=616835

Design and Implementation of Days Since Last Ping

Each time a request is made for the blocklist data, it includes a new parameter indicating how many days it has been since the last request. There is very little possibility of deriving a fingerprint from this parameter: it is a low number of bits, it changes on every request, and users will not maintain outlier values unless they consistently have a pattern of extremely occasional usage (i.e., months between uses of the application). Note that if Firefox is left open unattended for 2 weeks, the days-last-ping value will be 1 every day, even though the user may never have been at their computer.

Computing Active Installations

For each day in the desired time period, we add up all the requests whose value indicates either that this is the first-ever request to the blocklist (a new installation), or that the last request was made before the time period we are analyzing. This means that on the first day, we count all requests with a valid parameter (i.e., between 1 and max_valid). On the second day of the time period, we count all requests with a parameter between 2 and max_valid. After iterating through each day of the time period, we sum all the counts together to get the number of unique active installations in that time period.


Consider the date range 04 March – 07 March. We proceed as follows:

1. For March 4th, add the ‘new’ count and the number of days last ping (dlp) ==n, n>=1.
2. For March 5th, add the ‘new’ count and the number of dlp==n, n>=2. We ignore dlp==1 because those installations would have made the blocklist check with some value of dlp on March 4th.
3. Similarly for March 6th and March 7th, add ‘new’ and counts of dlp==n, n>=3 and n>=4 respectively.
4. Add all the counts in (1)-(3) to get number of unique active installations in the above period.
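The steps above can be sketched in code (illustrative only; the data layout and the max_valid cutoff are our assumptions):

```python
def unique_active_installations(daily_counts, max_valid=180):
    """Count unique active installations over a date range using
    days-since-last-ping (dlp).  `daily_counts` has one entry per
    day in the period: a dict with a 'new' count and a 'dlp'
    mapping {dlp_value: request_count}.  On day i (0-based) we
    count new installations plus requests whose dlp shows the
    previous ping fell before the period (dlp > i)."""
    total = 0
    for i, day in enumerate(daily_counts):
        total += day.get("new", 0)
        total += sum(n for dlp, n in day["dlp"].items()
                     if i < dlp <= max_valid)
    return total
```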

With this metric we can also compute the number of new installations being added on a daily basis. Also, we would like to confirm that we have a high proportion of profiles with days last ping equal to 1. This is not the same as a ‘daily user’ but it is encouraging to see a lot of installations using Firefox for two consecutive days.


What we can’t compute is the number of installations using Firefox for exactly ‘k’ days in a week, or retention patterns.

2. Visualizing the  Behavior of the Days Last Ping Metric

Firefox 4 was released on 22nd March, though beta and release candidates were available before that. This is a great opportunity to visualize the dynamics of a metric for a new product.

2.1 New Installations

We are eager to see what proportion of our daily ‘blocklist pings’ come from new installations. Figure 1 displays the proportion of blocklist pings that come from new installations. The heartening observation is the positive slope of the red smoother curve. The peaks are not day-of-week effects but correspond to release dates; it is difficult to comment on day-of-week effects here because of the significant events in the time period. Nevertheless, the new-profile percentage appears to be around 3.5-4.5% on a daily basis, with a slight increasing trend towards the end of the period.

2.2 ‘Daily’ Usage

Figure 2 displays the proportion of blocklist updates that come from installations with days last ping equal to 1, meaning the installation was active today and the previous day. The proportion varies from 72% to 86%, with a mean hovering around 82%. The red smoother indicates little change. The 1st, 3rd, and 5th weeks share a similar pattern: a low at the beginning of the week (a dlp of 1 on Monday means the profile used Firefox on Sunday), a peak towards the middle of the week, and a dip as the weekend approaches. The 4th week was the week of the release; understandably, it looks very different. In both Figures 1 and 2, week 2 also looks different, probably because of the RC release. I would like to say there is no weekly effect, and indeed the shape is the same (except for the two exceptional weeks), but the highs and lows differ.

2.3 ‘Recent’ Usage

Counting days last ping values up to 7 captures more than 90% of users. Figure 3 shows the proportion of blocklist pings that come from installations that last contacted us 2 to 7 days earlier. The troughs and peaks align opposite to those in the dlp==1 display (Figure 2). There does not seem to be an increasing trend; the mean is around 12-14%, peaking on Thursdays. Why Thursdays? Because the bulk comes from dlp==3. On average (across days of the week), 95.3% of blocklist pings have dlp<=3 and 98.9% have dlp<=7. Figure 4 displays the mean cumulative percentage for days last ping values between 2 and 7, by day of week. The key observation is that the curve changes little, meaning there is little interaction involved here.

2.4 ‘Infrequent’ Usage

Finally, in Figure 5, we see the dynamics of the days last ping distribution for a new product: we plot the density of days last ping values greater than 14 for every day. Each row is 7 days, so we can fix a day of week by moving along columns.

First, we see the distribution shifting out to the tails. On one hand this is expected, since more time has become available for installations to return after a long period of inactivity. The maximum proportion (see Figure 6) also decreases steadily over time, from 0.1% to about 0.03% on 3rd April, dramatically so a few days after the FF4 launch. However, in both Figures 5 and 6, we see the peak starting to rise again, which means more users are returning to FF4 after 14+ days of inactivity. Whether this pattern stabilizes is something we can watch over the next few months.

2.5 Daily Actives vs. Weekly Actives

Surprisingly, if we compare unique actives over a week with the mean daily actives in that week, the numbers are relatively stable; for the 6 weeks, the ratios are 0.692, 0.593, 0.666, 0.994, 0.685, and 0.672.

In future, we can look at rolling 7-day periods and other window lengths (e.g., 14 days, monthly), and week-on-week growth in unique actives.

These are reassuring results, and we are all very eager to monitor the progress as FF4’s adoption increases.


Thanks to Daniel Einspanjer for the introduction to the cookie usage and the background on the metrics ping.

Firefox 4: Across the World

After a marathon Firefox 4 download (15.85 MM; see http://blog.mozilla.org/blog/2011/03/25/the-first-48-hours-of-mozilla-firefox-4/), we prepared a location intensity graph. Figure 1 displays cities by frequency of download (independent of how many downloads, though the two are strongly related). Blue is infrequent (bottom 10%); bright yellow is frequent (top 10%).

Figure 1. Location intensity of cities. Blue is rare, bright yellow is high.

Though the above picture is heartwarming, it is at the same time disappointing: there are swaths of the world pitch black. The location information is constrained by the accuracy of the geo-location service, but that is also related to internet penetration. With Internet access rapidly becoming (in the not too distant future) a public utility, it is sad that the ‘have-nots’ have one more item on the list.

Based on reader input, I’ve attached a large (2248 × 1268, 1.1 MB) PNG file. Click here to download.

Browsing Sessions III: Do Users Overestimate How Long They Browse?

In our last post, we found that the number of installed extensions was a good discriminant of heavier users. In this short follow-up, we’ll delve into the survey data associated with the Beta Interface study.  Here is a snapshot of some of the research we’ve been conducting.

Users overshoot their estimated browsing time

The graph above shows that users tend to overestimate how long they use Firefox. Those who typically use the browser less have a more accurate assessment of their browsing time, but for users who state a longer daily browsing time, actual browser usage falls short of their own estimate.

First, a note about the methodology behind this graphic. We estimate average daily browsing time by aggregating the session lengths of Test Pilot users over the course of the study. We have previously defined a browser session as a continuous period of user activity in the browser, where successive events are separated by no more than 30 minutes. We subset to users who state they only use Firefox, to avoid the problem of a different primary browser.

We thought of a few possible explanations as to why, for heavier users, the measured time is lower than the stated time. Those users might, for instance, be online and using their computers quite a bit during the day, but have integrated their online workflow with their offline one. Software engineers are a good example of this – we might expect a programmer to be working on a computer all day, leaving the browser open and using it every once in a while.  So there may be a perception of constant browser usage.  This certainly rings true from the experience of the Metrics team – we’re on our computers almost all day, with Firefox open, even while working.  This is, however, only speculative at this point, since we don’t have data about when users are on their computers.

There are still some obvious methodological issues with this approach: a user might, for instance, use Firefox on a work computer (with test pilot installed), and a different one for home use, which could account for the difference. As such, we hope to include a survey question asking “How much time a day do you spend on this computer?” in the next version of the study.  At that point, we can update this research.

Mozilla Open Data Competition – Announcing The Winners!

[Note: cross-posted on Mozilla Labs]

Back in November, Mozilla Labs and the Metrics Team together launched the first Mozilla Open Data Visualization Competition. While we set out to discover creative visual answers to the open question, “How do people use Firefox,” we really didn’t know what level of participation to expect from the Mozilla and data analysis communities. In fact, we were overwhelmed by both the number and quality of submissions – so much so that we had to give ourselves an extra few days to thoroughly review them all!

In all, we received 32 high-caliber submissions. The visualizations took a number of forms, from tools to easily query the data to interactive web applications. They also covered a broad range of important topics, from plugin memory consumption to user web activities. You can find all 32 submissions here; entrants, if you haven’t already, be sure to check out the page, as our panel of judges has left feedback on each and every submission.

Needless to say, we want to thank all the participants – your work has made our initial open data competition an overwhelming success and many of your insights will directly help the Firefox team develop a better web browser. In thanks, we’ll be sending this awesome Firefox t-shirt to each entrant:

We also want to thank our 3 partner judges: David Smith, Revolution Analytics; Andrew Vande Moere, Information Aesthetics; and Brian Suda, author of A Practical Guide to Designing with Data. The success of the competition was largely due to your help in publicizing the event and thoroughly evaluating the entries.

And now… let’s get to the winners!

Grand Prize

Survey Participants vs. All Users – Contributed by: James Fiedler

While deciding amongst the 32 entries was difficult, the focus on a single, very relevant and important question distinguished this entry. James focused on contrasting survey participants with all users (critical as we often use survey data for segmentation), then set up a simple and helpful environment for the user to explore and discover interesting conclusions of their own. This submission is exactly the type of work we were hoping for: an elegant visualization that presents data around an important and complex question in a clear and easy-to-understand way. James will receive a $300 Amazon gift card for his excellent work.


Test Pilot Explorer – Contributed by: Lon Riesberg

One of the more creative entries, Lon created a custom “explorer” that essentially “plays back” time-ordered events as animated plots and includes filters to customize what data is shown. This explorer really shows how you can “see” user behavior on a mass scale, and while we had some quibbles about some of the details of the visualization itself, we found it to be a powerful and enjoyable data exploration tool. Lon will receive a set of all 4 Edward Tufte books for his work.

Firefox Usage by Age – Contributed by: Tom Haynes (University of Michigan)

Tom’s entry also focused on one particular element of the data. His execution sets this submission apart, as his visualization doesn’t try to encompass everything, but tells a clear, specific story around how Firefox usage times vary across age groups. Tom will receive a set of all 4 Edward Tufte books for his work.

Honorable Mention

Given the number of worthy submissions, we decided to hand out 5 Honorable Mention Awards in addition to the original 3 prizes. For varying reasons, we thought these entries were particularly valuable and each team will receive Tufte’s latest book, Beautiful Evidence, in recognition of their great work. Good Job!

Firefox browser – Event Sequences – Contributed by: Benoît Pointet

Firefox 4 beta UI Component Use vs. User Expertise – Contributed by: Nicolas Garcia Belmonte, Maria Luz Caballero

Browser Usage Over the Course of a Day – Contributed by: Christian Kreibich

Bookmark a Lot, Browse a Lot – Contributed by: Eugene Tjoa

Firefox Plugin Memory Consumption – Contributed by: Diederik van Liere and David Eaves

Again, thanks to all the participants, judges, and everyone else who helped make this first open data competition such a success! Participants should receive an email within the week with details on how to receive the prizes and t-shirts.

And keep refining those data hacking skills – there will be more open data competitions in the near future!

Browsing Sessions II: Extensions, Time of Day, Number of Sessions, and Session Length

In our last post we delved into the rudimentary dynamics of the “browser session,” defined as a continuous period of user activity in the browser, where successive events are separated by no more than 30 minutes.

In this short post we’ll discuss another way of cutting the data. Below is the plot. For reference, each crossbar contains the 1st and 3rd quartile, along with the median.

A few insights regarding the plot:

  • Users with more extensions have longer and more varied session times than those with fewer.
  • Extensions in general do a better job of discriminating user behavior than the number of sessions.
  • These trends tend to hold over the course of the day, with only minor fluctuations.

Browsing Sessions

We recently provided some simple insights we’ve gleaned from how people use private browsing. In this post we’ll take a higher view, and examine behavior regarding when people generally use their browser.

The tl;dr version: users who have more “sessions” (defined below) tend to browse longer, more diversely, and over a broader swath of the day than more casual users.


Before we begin, the unit of analysis is the “browser session.” Here is our working definition: a browser session is a continuous period of user activity in the browser, where successive events are separated by no more than 30 minutes.

Despite its rudimentary nature, this definition of a session is still fairly common in the web analytics literature.
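That working definition is easy to express in code (a sketch; the function name and input format are assumptions):

```python
from datetime import datetime, timedelta

def split_sessions(event_times, gap_minutes=30):
    """Split event timestamps into browser sessions: a new session
    starts whenever successive events are more than `gap_minutes`
    apart (the working definition used in these posts)."""
    sessions = []
    gap = timedelta(minutes=gap_minutes)
    for t in sorted(event_times):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)   # continues the current session
        else:
            sessions.append([t])     # gap exceeded: new session
    return sessions
```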

The median browser session, median number of sessions

As the graph indicates, the median session is only about 30 minutes long, with a very long tail. The first quartile is about 9 minutes long, while the third is about an hour.

The median number of sessions per user, on the other hand, is about 2 a day. Approximately 25% of users actively use the browser only once a day, while the 75th percentile has around 3 sessions a day.

More sessions ↔ longer sessions

Those users with a larger number of sessions (say, 20 over the week-long study) tend to spend about 10 minutes more per session than those with around 10 sessions.

More sessions ↔ more varied session lengths

Users with more sessions also tend to have much larger variation in the lengths of their sessions, which suggests that for more frequent users, the use case of the browser is in general much more diverse.

More sessions ↔ wider range of use over the day

More frequent users tend to use the browser over a wider swath of the day as well. This is fairly intuitive – more and longer sessions should span a larger part of the day. It is striking, however, how large the range is for users with many sessions. This might be a consequence of the sample bias inherent in the Beta population. Most of our Test Pilot users are tech-savvy young men, so the wide range in which they browse is a little more understandable.

As you can tell there is a lot we can do with just analyzing sessions. We’ll be rolling out more simple insights like these soon – stick around.

Mozilla Open Data Competition – 10 Days Left!

[Note: cross-posted on Mozilla Labs]

Hello Data Hackers!

We just wanted to remind everyone that the submission deadline for the first Mozilla Open Data Visualization Competition is just 10 days away! Submit your entries by December 17th for a chance at a $300 Amazon gift card and a set of all 4 Edward Tufte books!

We’ve already received some great entries, and our panel of expert judges (Kevin Fox and Jinghua Zhang, Mozilla Labs; Hamilton Ulmer, Chris Jung and Blake Cutler, Mozilla Metrics) along with our partner judges (David Smith, Revolution Analytics; Andrew Vande Moere, Information Aesthetics) look forward to seeing the rest of the submissions!

Remember to visit the Official Competition Page for all the information you need, including how to download the data and enter the competition.

Good Luck!