Zeroing in on DNT:1

Tom Lowenthal

In DC, sixty representatives from diverse groups sat together for three days this week and continued the hard work of defining a Do Not Track standard we can all live with. With contributors from the major web browser makers, many different industries, the privacy community, academia, and both the EU and US policy communities, this open process continues to be a meeting of the minds where everyone has a voice. We made great progress and it was fantastic to have so many smart people coming to consensus decisions at our fourth in-person meeting of the W3C Tracking Protection Working Group. After three days, we have two proposals with lots in common.

Tech Specs

We’re close to having a complete technical design for Do Not Track. We’re still working on a few details, but the major technical hurdles have been crossed, and very few points of disagreement remain. Here are the headlines:

  • We know how a site tells its users that it follows DNT: by publishing that information at a well-known URI.
  • We’ve mapped out most of the JavaScript API that allows sites and users to talk about opt-ins and opt-outs.
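As a rough illustration of the first headline, a client could derive the well-known location from any page URL and send its preference as a request header. This is a sketch under stated assumptions, not the working group's design: the path `/.well-known/dnt` and both helper names are hypothetical placeholders, and only the `DNT: 1` header value is taken from the name of the proposal.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: an illustrative path only; the post does not name the actual URI.
WELL_KNOWN_PATH = "/.well-known/dnt"


def tracking_status_url(page_url: str) -> str:
    """Return the well-known URI on the same origin as page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, drop the page's own path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


def dnt_request_headers(enabled: bool) -> dict:
    """Return the extra request headers for a user's Do Not Track choice."""
    # With no preference expressed, no header is sent at all.
    return {"DNT": "1"} if enabled else {}


if __name__ == "__main__":
    print(tracking_status_url("https://example.com/shop/item?id=7"))
    print(dnt_request_headers(True))
```

Putting the resource at a fixed location on the site's own origin means a browser or auditor can look it up without parsing the page.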

First Parties

A first party is web content that users have meaningful interaction with. There can be multiple first parties on a page. For example, if you visit this page, you have meaningful interaction with Mozilla: we’re the place you’re trying to talk to. If we put a Facebook “Like” button on our page, Facebook would be a third party unless you chose to interact with them by clicking the “Like” button. If you did, Facebook would become a first party along with Mozilla. This approach is much better than definitions which expect only one first party per page and require all first-party webservers to have the same domain name. Best of all, it fits easily with the way that we actually use the web.
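The interaction-based definition can be sketched as a small classification function. This is only an illustration of the idea described above: the function name, its parameters, and the notion of a recorded set of interactions are assumptions made for the example, not part of either proposal.

```python
def classify_parties(visited_site, embedded_parties, interacted_with):
    """Label every party on a page as "first" or "third".

    visited_site: the party the user navigated to (always a first party).
    embedded_parties: other parties that have content on the page.
    interacted_with: embedded parties the user chose to interact with,
        for example by clicking a widget.
    """
    labels = {visited_site: "first"}
    for party in embedded_parties:
        if party == visited_site:
            continue  # the visited site stays a first party
        labels[party] = "first" if party in interacted_with else "third"
    return labels


if __name__ == "__main__":
    # Before any click, the embedded widget's owner is a third party.
    print(classify_parties("mozilla.org", ["facebook.com"], set()))
    # After the user clicks the widget, both are first parties.
    print(classify_parties("mozilla.org", ["facebook.com"], {"facebook.com"}))
```

Note that nothing here compares domain names: a party's status depends on what the user did, which is what allows several first parties on one page.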

First parties don’t have to do much to honor a user’s privacy request. If you go to Amazon, we assume that you’re trying to interact with them. Do Not Track will not get in the way of you seeing personalised shopping suggestions or having things shipped to your address. To keep this much latitude to use data, Amazon just has to do two things:

  1. They have to respond and promise to follow the W3C Recommendation for Do Not Track, and
  2. they mustn’t share your data with other people, or mix it with data from other sites.

First parties can outsource data processing if they want. If a site outsources analytics, that’s fine. They just have to make sure their analytics company keeps data about their users separate: no mixing data from lots of sites.
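One way to picture the "keep data separate" requirement for an outsourced analytics provider is a store partitioned by client site, with no operation that reads across partitions. The class and method names below are hypothetical, and this is a minimal sketch of the idea rather than anything the working group specified.

```python
from collections import defaultdict


class SiloedAnalytics:
    """Toy analytics store that keeps each client site's data in its own silo."""

    def __init__(self):
        # One independent list of events per client site.
        self._events_by_site = defaultdict(list)

    def record(self, site, user_id, event):
        """Store an event in the silo belonging to the site it came from."""
        self._events_by_site[site].append((user_id, event))

    def events_for(self, site):
        """Return a copy of one site's events; there is no cross-site query."""
        return list(self._events_by_site.get(site, []))


if __name__ == "__main__":
    store = SiloedAnalytics()
    store.record("shop.example", "u1", "view")
    store.record("news.example", "u1", "click")
    print(store.events_for("shop.example"))
```

The design point is the absence of any method that joins records for the same user across sites, which is the mixing the rule forbids.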

Third Parties

Third parties are a remarkably simple concept: anyone who isn’t a first party or a user.

Anyone, first party or third, can use data without restriction as long as they make sure that it can’t be linked to a particular user. We discussed what you might need to do to make sure that data really can’t be linked back to someone. Some of the approaches were based on k-anonymity or estimates of uniqueness based on characteristics of users who do not have Do Not Track enabled.
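To make the k-anonymity idea concrete: a data set is k-anonymous with respect to a set of attributes if every combination of those attributes that appears in it is shared by at least k records. The check below is a minimal sketch of that textbook definition. The function name, the record layout, and the choice of attributes are assumptions for illustration, and the post does not say which threshold or attributes the group had in mind.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    records: list of dicts, one per logged request or user.
    quasi_identifiers: keys whose combined values could single someone out.
    k: minimum group size required.
    """
    groups = Counter(
        tuple(record[key] for key in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


if __name__ == "__main__":
    logs = [
        {"browser": "Firefox", "region": "EU"},
        {"browser": "Firefox", "region": "EU"},
        {"browser": "Chrome", "region": "US"},
    ]
    # The lone Chrome/US record forms a group of one, so k=2 fails.
    print(is_k_anonymous(logs, ["browser", "region"], 2))
```

A failing check points at the groups that are too small; generalising or dropping attributes until the check passes is the usual remedy.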

It’s also fine to keep server logs for a brief time before they’re rotated out and processed, but this period needs to be short, and logs mustn’t be used for anything else during this period.

We agreed there are some things like security and fraud control which are so important that even businesses that have no interaction with users need to be able to do them. The web is the platform and we should be careful when tinkering with the engines that power it. We don’t want implementation of Do Not Track to harm the web, just make it safer.

Differences

Our two current proposals differ as to how “large” a party is. One proposal thinks of a party based on corporate ownership; the other makes decisions based on user expectations and branding. For many websites, this distinction makes no difference. However, for companies with many different unrelated brands this choice determines whether we think it’s more important to avoid costly implementations and restrictions for companies which currently share all their data between brands, or to avoid surprising users who have no idea how far their data can flow.

The group is split as to whether third parties can continue to use unique identifiers to enable those critical uses that are still allowed. For example, can third parties use unique identifiers to fight fraud and bill for ad impressions? How about for frequency capping to eliminate showing the same ad multiple times, even if that means knowing sites a user has visited where the ad displayed? We have more work to do here: how can we best support privacy without breaking business? We spent several hours talking through different possible approaches, from the purely technical to the purely administrative and everything in between.
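To see why identifiers are at the heart of this question, consider how frequency capping is commonly built: the ad server counts impressions per (identifier, ad) pair. The sketch below is a generic illustration with hypothetical names, not a description of any particular company's system or of either proposal.

```python
from collections import Counter


class FrequencyCapper:
    """Toy frequency cap keyed on a per-user identifier."""

    def __init__(self, max_impressions):
        self.max_impressions = max_impressions
        # Impression counts keyed by (user identifier, ad identifier).
        self._seen = Counter()

    def should_show(self, user_id, ad_id):
        """Return True and count the impression unless the cap is reached."""
        key = (user_id, ad_id)
        if self._seen[key] >= self.max_impressions:
            return False
        self._seen[key] += 1
        return True


if __name__ == "__main__":
    capper = FrequencyCapper(max_impressions=2)
    for _ in range(3):
        print(capper.should_show("cookie-123", "ad-42"))
```

The counter only works because `user_id` is stable across requests, and that same stability is what lets the server link a user's visits together. That tension is the open question the group spent its time on.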

At the end of the day

It’s incredibly exciting to be forging a path forward that everyone can live with. We still have work to do, and these remaining differences are not minor. But we may reach consensus by taking up the two issues where we differ at the same time and addressing them together. With all this rough consensus, we may have to start looking at some running code. Stay tuned!