In late 2018 Mozilla conducted an experiment to collect browser Telemetry data with Prio, a privacy-preserving data collection system developed by Stanford Professor Dan Boneh and PhD candidate Henry Corrigan-Gibbs. That experiment was a success: it allowed us to validate that our Prio data collections were correct, efficient, and integrated well with our analysis pipeline. Today, we want to let you know about our next steps in testing data collection with Prio.
As part of Content Blocking, Firefox will soon include default protections against tracking. Our protections are built on top of a blocklist of known trackers. We expect trackers to react to our protections, and in some cases attempt to work around them. We can monitor how our blocklists are applied in Firefox to detect these workarounds.
However, directly monitoring how our blocklists are applied would require data that we feel is too sensitive to collect from release versions of Firefox. That’s why Prio is so important: it allows us to understand how our blocklists are applied across a large number of users, without giving us the ability to determine how they are applied in any individual user’s browser or on any individual page visit.
To support this, we’ve developed Firefox Origin Telemetry, which is built on top of Prio. We will use Firefox Origin Telemetry to collect counts of the number of sites on which each blocklist rule was active, as well as counts of the number of sites on which the rules were inactive due to one of our compatibility exemptions. By monitoring these statistics over time, we can determine how trackers react to our new protections and discover abuse.
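To make the idea of aggregate-only counts more concrete, here is a minimal sketch of additive secret sharing, the basic building block that Prio relies on. It is an illustration only, not Firefox's implementation: the rule names, the modulus, and all function names are hypothetical, and it assumes each report is a 0/1 vector marking which rules were active. It also leaves out the validity proofs and encryption that Prio adds on top. Each report is split into two random-looking shares, one per server, and only the sum of the two servers' totals reveals the per-rule counts.

```python
import secrets

# Hypothetical modulus for illustration; a real deployment uses the
# finite field chosen by its Prio implementation.
MODULUS = 2**61 - 1

# Hypothetical blocklist rule identifiers, not taken from any real list.
RULES = ["tracker-a.example", "tracker-b.example", "tracker-c.example"]


def split_report(bits):
    """Split a 0/1 vector into two additive shares modulo MODULUS.

    Each share on its own is uniformly random and says nothing about
    the report; the two shares sum to the original vector.
    """
    share_a = [secrets.randbelow(MODULUS) for _ in bits]
    share_b = [(bit - a) % MODULUS for bit, a in zip(bits, share_a)]
    return share_a, share_b


def accumulate(total, share):
    """Add one share into a server's running total, element by element."""
    return [(t + s) % MODULUS for t, s in zip(total, share)]


def aggregate(reports):
    """Simulate two servers that each only ever see their own shares."""
    total_a = [0] * len(RULES)
    total_b = [0] * len(RULES)
    for bits in reports:
        share_a, share_b = split_report(bits)
        total_a = accumulate(total_a, share_a)  # seen only by server A
        total_b = accumulate(total_b, share_b)  # seen only by server B
    # Only the combined totals reveal the per-rule counts.
    return [(a + b) % MODULUS for a, b in zip(total_a, total_b)]


if __name__ == "__main__":
    # Three hypothetical reports; a 1 means the rule was active.
    reports = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
    for rule, count in zip(RULES, aggregate(reports)):
        print(rule, count)
```

In this sketch the privacy guarantee holds only if the two totals are kept by parties that do not share them with each other before aggregation, which is why who operates each server matters.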
In the next phase of testing, we need to validate that Firefox Origin Telemetry works at scale. To provide effective privacy, Prio requires that two independent parties each process a separate portion of the data, a requirement that we will not satisfy during this test. As in our initial test, we will run both data collection servers ourselves to complete end-to-end testing prior to involving a second party. That’s why we are running this test only in our pre-release channels, which we know are used by a smaller audience that has chosen to help us test development versions of Firefox. We’ve ensured that the data we’re collecting falls within our data collection policies for pre-release versions of Firefox, and we’ve chosen to limit the collection to 1% of Firefox Nightly users, as this is all that’s necessary to validate the API.
We expect to start this test during our Nightly 69 development cycle. Collecting this data in a production environment will require an independent third party to run one of the servers. We will provide further updates once we have such a partner in place.