When Testing Meets Localization

Andrea’s recent post talked about the importance of this fundraising campaign being global, and I wanted to follow up with some notes on how our A/B testing strategy and the work of our localization volunteers have overlapped.

To start with, I want to note that a huge chunk of this fundraising appeal is only possible thanks to the Mozilla L10n (localization) community. Without their incredible support, we would be a long way short of our current fundraising total. These contributors are donating their time and energy to support Mozilla in a way that has real and measurable impact.

Because they are giving up their free time to do things that no paid contributor working on this fundraising campaign is able to do, we never want to rush them or make demands about when translations need to be done. And we don’t want to waste the L10n community’s time by asking for more translations than we can actually use.

At the same time, we’re running a rapid-fire A/B testing strategy for the design and content of our fundraising appeal as seen by English-speaking audiences. The team are writing copy, animating images, and building systems that may only be live for a day before we replace them with something that performs better. Much of the work we do here will be discarded, but we know that’s OK, because together we’re creating the combination of content and systems with the greatest impact we can achieve.

These two approaches are at odds with each other: minimizing content creation tasks versus mass discarding of content. And we’re trying our best to balance the needs of these two approaches this year.

The way we’ve worked is to test extensively against the EN (English) versions of our Snippet appeal and donation form, and whenever we make a significant breakthrough in conversion, roll those features out to the equivalent localized versions. We know this isn’t the optimized version of localized optimization (that’s a fun sentence!), but it’s had a real, positive impact.
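
As a rough illustration of what “a significant breakthrough in conversion” can mean in practice, here’s a minimal sketch in Python. This is not our actual tooling, and the numbers and variant setup are invented; it simply compares a challenger snippet against the current control with a basic two-proportion z-test before you’d consider rolling the winner out to localized versions:

    from math import sqrt

    def compare_variants(control_shown, control_donations, test_shown, test_donations):
        """Return (relative lift, z-score) for a challenger snippet vs the control."""
        p_control = control_donations / control_shown
        p_test = test_donations / test_shown
        p_pooled = (control_donations + test_donations) / (control_shown + test_shown)
        se = sqrt(p_pooled * (1 - p_pooled) * (1 / control_shown + 1 / test_shown))
        lift = (p_test - p_control) / p_control
        z = (p_test - p_control) / se
        return lift, z

    # Invented numbers: 100,000 impressions per snippet variant.
    lift, z = compare_variants(100_000, 230, 100_000, 290)
    print(f"lift: {lift:.1%}, z: {z:.2f}")  # only roll out to localized snippets if the win is convincing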

Here is a graph of income from our French supporters:

[Screenshot: graph of donation income from French supporters]

The first bump in the graph above shows what happened when we first localized our default snippet, based on our best-performing EN text, like this example:

[Screenshot: localized default snippet]

And the second bump on the graph is from when we rolled out the snippet design that worked so well for the EN audience, like this example:

[Screenshot: localized version of the high-performing EN snippet design]

The thing is, we know that the best way to ask for donations from an English-speaking audience isn’t the best way to ask a German-speaking audience. It’s not even the same across English audiences. As an English English speaker, I’ve been slowly acclimatized to all the Zs that my American English-speaking colleagues like to use (in words like localisation). But there are bigger differences we barely have time to think through in the thick of the campaign, let alone build and test. For example, in the UK most fundraising revolves around regular giving (recurring monthly gifts), but that’s not the case in the US. And this End of Year campaign is tied to the major giving days in the US, which are linked to the tax year and drive those cultural norms. That’s not the case in the UK, though we do have a lot of fundraising appeals that link up to Christmas.

For every test we ran this year against our ‘Generic English’ audience, the ‘optimum’ solution for a local audience is likely to be different. Red buttons work for some, not for others. ‘Donate Now’ was a good headline in English, but it comes across as shouting in Indonesian.

To truly optimize for every audience, the number of testing variants quickly escalates beyond what we could manage with a team our size. So everything we do in a campaign like this is a judgement call about where we can have impact given our resources. That’s true of any testing strategy, but localization and testing have a multiplying effect on the number of variants of any given piece of content.
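
To make that multiplying effect concrete, here’s a back-of-the-envelope sketch; all of the counts below are invented rather than our real test matrix:

    # Hypothetical counts, just to show how fast the matrix grows.
    headlines = 4
    button_colours = 3
    snippet_designs = 5
    locales = 30

    en_only_cells = headlines * button_colours * snippet_designs
    per_locale_cells = en_only_cells * locales
    print(en_only_cells, per_locale_cells)  # 60 test cells for EN alone, 1800 if every locale is optimized independently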

While we’re not at “localized optimization” yet, we have had a real impact on our global fundraising this year, mostly thanks to the L10n community, with a little multiplying bump from some of our testing results.

Maybe one day we can find a way to run more localized testing experiments. I’d love to bring the L10n communities more directly into the campaign process and enable them to run their own design and content tests. I think that would be the most exciting and Mozilla-like way to scale this process. It has lots of challenges and could get really confusing. But I think it’s exciting.

But until then, I think we’re striking a nice balance of testing and localization. If you’re working on a similar fundraising appeal, I think this is a valid approach to take.

1 response

  1. Rob Jaworski wrote:

    Interesting post, especially to someone like me who works in software [localisation|localization].

    What I like most is your admission that you will, and do, throw away a lot of the content that the team creates. That’s the nature of what you are doing: creating, trying it out, and if it doesn’t work, moving on. But on the other side of the coin, you are not asking volunteer translators to create work products that will also be thrown away. For people who are giving their time and skills freely, that’s important to know.

    Now, I wonder what would happen if you were to take a non-English stance on things. How about creating copy (and images and other assets?) for a target locale such as France, trying it out there, and if it pops, localizing it to other locales? What that would mean is that you may create a bunch of content to throw away that’s in French, or whatever language you decide to be your source, but it shouldn’t add to the cost of creation (unless it’s in addition to the English source you are creating already).

    Of course, it might not be worth the logistics cost, especially if your creative staff is entirely English speaking. And this would also add to the escalation of testing variants that you mention.

    Good work, keep it up!
    Rob Jaworski
    San Jose, California