The Future of Sync

Intro

There’s a new Sync back-end! The past year or so has brought a lot of changes, and some of those changes broke things. Our group reorganized, we moved from IRC to Matrix, and a few other things caught us off guard and needed to be addressed. None of those should be excuses for why we kinda stopped keeping you up to date about Sync. We did write a lot about what we were going to do, but we forgot to share it outside of Mozilla. Again, not an excuse, but just letting you know why we felt like we had talked about all of this, even though we absolutely had not.

So, allow me to introduce you to the four-person “Services Engineering” team whose job it is to keep a bunch of back-end services running, including Push Notifications, the Sync back-end, and a few other miscellaneous services.

For now, let’s focus on Sync.

Current Situation

Sync probably didn’t do what you thought it did.

Sync’s job is to make sure that the bookmarks, passwords, history, extensions and other bits you want to synchronize get from one copy of Firefox to your other copies of Firefox. Those different copies of Firefox could be different profiles, or be on different devices. Not all of your copies of Firefox may be online or accessible all the time, though, so what Sync has to do is keep a temporary, encrypted copy on some backend servers, which it can use to coordinate later. Since it’s encrypted, Mozilla can’t read that data; we just know it belongs to you.

A side effect is that adding a new instance of Firefox (by installing and signing in on a new device, or uninstalling and reinstalling on the same device, or creating a new Firefox profile you then sign in to) just adds another copy of Firefox to Sync’s list of things to synchronize. It might be a bit confusing, but this is true even if you only had one copy of Firefox. If you “lost” a copy of Firefox because you uninstalled it, or your computer’s disk crashed, or your dog buried your phone in the backyard, then when you reinstalled Firefox, you added another copy of Firefox to your account. Sync would then synchronize your data to that new copy. Sync would just never get an update from the “old” copy of Firefox you lost; it would try to rebuild your data from the temporary echoes of the encrypted data that was still on our servers.

That’s great for short-term things, but kinda terrible if you, say, shut down Firefox while you go on walkabout, only to come back months later to a bad hard drive. You reinstall, try to set up Sync, and find that, due to an unexpected Sync server crash, we wound up losing your data echoes.

That was part of the problem. If we lost a server, we’d basically tell all the copies of Firefox that were using that server, “Whoops, go talk to this new server” and your copy of Firefox would then re-upload what it had. Sometimes this might result in you losing a line of history, sometimes you’d get a duplicate bookmark, but generally, Sync would tend to recover OK and you’d be none the wiser. If that happened when there were no other active copies of Firefox for your account, however, all bets were off and you’d probably lose everything, since there were no other copies of your data anywhere.
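To make that failure mode concrete, here is a toy model of the behavior described above. It is only a sketch: `Client`, `EchoServer`, `sync`, and `replace_lost_server` are hypothetical names invented for this illustration, records are plain strings rather than encrypted blobs, and the real merge logic is far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """One copy of Firefox holding its own local records (hypothetical model)."""
    name: str
    records: set[str] = field(default_factory=set)
    online: bool = True


@dataclass
class EchoServer:
    """A storage node holding the only server-side 'echo' of a user's records."""
    records: set[str] = field(default_factory=set)


def sync(client: Client, server: EchoServer) -> None:
    """Naive two-way merge: upload what the client has, then download the rest."""
    if not client.online:
        return
    server.records |= client.records
    client.records |= server.records


def replace_lost_server(clients: list[Client]) -> EchoServer:
    """Simulate losing a node: the replacement starts empty and is refilled
    only by the clients that are still around to re-upload."""
    new_server = EchoServer()
    for client in clients:
        sync(client, new_server)
    return new_server
```

If one surviving client is passed to `replace_lost_server`, a newly added client gets everything back on its first `sync`. If the list is empty (the only copy of Firefox died with its hard drive), the new server stays empty and there is nothing left to restore.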

A New Hope Service

A lot of folks expected Sync to be a backup service. The good news is, now it is a backup service. Sync is more reliable now. We use a distributed database to store your data securely, so we no longer lose databases (or your data echoes). There’s a lot of benefit for us as well. We were able to rewrite the service in Rust, a more efficient programming language that lets us run on fewer machines.

Of course, there are a few challenges we face when standing up a service like this.

Sync needs to run with new versions of Firefox, as well as older ones. In some cases, very old ones, which had some interesting “quirks”. It needs to continue to be at least as secure as before while hopefully giving devs a chance to fix some of the existing weirdness as well as add new features. Oh, and switching folks to the new service should be as transparent as possible.

It’s a long, complicated list of requirements.

How we got here

First off, we had to decide a few things, like what data store we were going to use. We picked Google Cloud’s Spanner database for its own pile of reasons, some technical, some non-technical. Spanner provides a SQL-like database, which means that we don’t have to radically change existing MySQL-based code. It also means that we can provide some level of abstraction for those who want to self-host, without radically altering internal data structures. In addition, Spanner provides us an overall cost savings in running our servers, and it should be able to handle what we need to do.
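As a rough illustration of what “some level of abstraction” can look like, here is a minimal sketch of a storage interface with one SQL implementation behind it. Everything in it is an assumption made for the example: `StorageBackend`, `SqlBackend`, `store_bookmark`, and the table layout are invented names, SQLite stands in for a self-hosted SQL database, and none of this mirrors the actual syncstorage-rs code.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class StorageBackend(ABC):
    """Hypothetical interface that the request-handling code would depend on."""

    @abstractmethod
    def put(self, user_id: str, collection: str, item_id: str, payload: str) -> None:
        ...

    @abstractmethod
    def get(self, user_id: str, collection: str, item_id: str) -> Optional[str]:
        ...


class SqlBackend(StorageBackend):
    """Plain-SQL implementation; SQLite is used here only as a stand-in."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "user_id TEXT, collection TEXT, item_id TEXT, payload TEXT, "
            "PRIMARY KEY (user_id, collection, item_id))"
        )

    def put(self, user_id: str, collection: str, item_id: str, payload: str) -> None:
        # INSERT OR REPLACE is SQLite syntax; another database would use its own upsert.
        self.conn.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?)",
            (user_id, collection, item_id, payload),
        )
        self.conn.commit()

    def get(self, user_id: str, collection: str, item_id: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT payload FROM items "
            "WHERE user_id = ? AND collection = ? AND item_id = ?",
            (user_id, collection, item_id),
        ).fetchone()
        return row[0] if row else None


def store_bookmark(backend: StorageBackend, user_id: str, item_id: str, payload: str) -> None:
    """Caller code sees only the interface, never the database behind it."""
    backend.put(user_id, "bookmarks", item_id, payload)
```

The point of the sketch is that `store_bookmark` would not need to change if a second class implemented the same two methods on top of a different SQL-like store.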

We then picked Rust as our development platform and Actix as the web base because we had pretty good experience with moving other Python projects to them. It’s not been magically easy, and there have been plenty of pain points we’ve hit, but by and large we’re confident in the code and it’s proven to be easy enough to work with. Rust has also allowed us to reduce the number of servers we have to run in order to provide the service at the scale we need to offer it, which also helps us reduce costs.

For folks interested in following our progress, we’re working with the syncstorage-rs repo on GitHub. We are also tracking a bunch of the other issues at the services engineering repo.

Because Rust is ever evolving, often massively useful features roll out on different schedules. For instance, we HEAVILY use the async/await code, which landed in late December of 2019, and is taking a bit to percolate through all the libraries. As those libraries update, we’re going to need to rebuild bits of our server to take advantage of them.

How you can help

Right now, all we can ask is some patience, and possibly help with some of our Good First Bugs. Google released a “stand-alone” Spanner emulator that may help you work with our new Sync server if you want to play with that part, or you can help us work on the traditional, stand-alone MySQL side. That should let you start experimenting with the server and help us find bugs and issues.

To be honest, our initial focus was more on the Spanner integration work than the stand-alone SQL side. We have a number of existing unit tests that exercise both halves and there are a few of us who are very vocal about making sure we support stand-alone SQL databases, but we can use your help testing in more “real world” environments.

For now, folks interested in running the old Python 2.7 syncserver still can while we continue to improve stand-alone support inside of syncstorage-rs.

Some folks who run stand-alone servers are well aware that Python 2.7 officially reached “end of life”, meaning no further updates or support are coming from the Python developers. However, we have a bit of leeway here. The PyPy group has said that they plan on offering some support for Python 2.7 for a while longer. Unfortunately, the libraries that we use continue to progress or get abandoned for Python 3. We’re trying to lock down versions as much as possible, but it’s not sustainable.

We finally have Rust-based sync storage working with our durable back end, running and hosting users. Our goal now is to focus on the “stand-alone” version, and we’re making fairly good progress.

I’m sorry that things have been too quiet here. While we’ve been putting together lots of internal documents explaining how we’re going to do this move, we’ve not shared them publicly. Hopefully we can clean them up and do that.

We’re excited to offer a new version of Sync and look forward to telling you more about what’s coming up. Stay tuned!

19 responses

  1. Jigar wrote on :

    Profiles for sync. It has been a long-standing issue for me. For example, I need to sync my passwords to both my work and home computers, but not addons. Addons I want on all my home computers. Currently, either you have the same addons on all computers or you don’t. If I disable addon sync, I can’t get the same addons on all my home devices.

    1. JR Conlin wrote on :

      Sync operates by syncing data to all computers tied to a given profile. It is not really designed to let you pick and choose what to sync to which computer. (for instance, if you disable syncing add-ons, this will remove all sync data about your add-ons from our servers, which will prevent it from syncing to any other machine.)

      One way you may be able to solve this is by using Profiles (start Firefox with “firefox -p -no-remote”), which will let you run multiple copies of Firefox with completely different Profiles. This is what I do in order to keep my personal work separate from my professional work, and it sounds a bit like that’s what you’re doing now. It’s true, however, that these profiles are different and can’t communicate with each other. As far as we’re concerned, they’re separate people and they get treated as securely and privately as anyone else.

      For what it’s worth, I “seeded” my bookmarks from one profile to the new one by using the Import/Export function. https://support.mozilla.org/en-US/kb/export-firefox-bookmarks-to-backup-or-transfer I’ll also add that doing selective sync is tricky, at best, and gets increasingly complicated the more devices you add.

  2. andy wrote on :

    Will making new boolean prefs and prepending services.sync.prefs.sync. to non-sync’d preferences still add them to what is sync’d?

    ex: say I wanted to sync dom.push.enabled
    so I created a new boolean called services.sync.prefs.sync.dom.push.enabled and set it to true. In older Sync, that would start syncing (both) preferences.

    1. Lina wrote on :

      Hi! Yep, that’s still how it works, with one change: in recent-ish versions of Firefox (68+), new `services.sync.prefs.sync.*` control prefs aren’t synced by default, so your other devices won’t apply the preferences at first.

      You’ll either need to manually set the `services.sync.prefs.sync.dom.push.enabled` control pref on all your devices, or flip the `services.sync.prefs.dangerously_allow_arbitrary` pref on your devices one time to sync the new control prefs. Check out this SUMO article for the details: https://support.mozilla.org/en-US/kb/sync-custom-preferences

      I hope that helps!
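The naming rule discussed in this thread (prepend `services.sync.prefs.sync.` to the preference you want synced, then set that control pref to true) can be sketched as a small helper. The function names below are hypothetical and exist only for this illustration; the `user_pref(...)` line follows the usual `user.js` syntax, and the linked SUMO article remains the authoritative description of how the control prefs behave.

```python
# Prefix taken from the thread above; the helper names are made up for this sketch.
CONTROL_PREFIX = "services.sync.prefs.sync."


def control_pref_for(pref_name: str) -> str:
    """Return the name of the control pref that opts pref_name into syncing."""
    return CONTROL_PREFIX + pref_name


def user_js_line(pref_name: str) -> str:
    """Render a user.js line that sets the control pref for pref_name to true."""
    return f'user_pref("{control_pref_for(pref_name)}", true);'
```

For the example in the question, `control_pref_for("dom.push.enabled")` gives the same control pref name that the reply mentions, which would then need to be set on each device (or synced once via the `dangerously_allow_arbitrary` route described above).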

  3. Martin wrote on :

    Is the new Sync backend already active?

    1. JR Conlin wrote on :

      Yep.

  4. Jon wrote on :

    This is great news, although I’ve tried to keep my profile backed up on my own, I did always assume sync would give me a backup.

    Sync has been working quite well for me with one small problem: syncing themes. The option is on, and they do sync themes I have installed, but whenever the next device starts syncing, the theme resets back to default and I have to manually enable the new theme.

    I’ve worked around this for a while by bookmarking the theme whenever I select a new one, and then when I use the next device and the theme goes back to default, I select the bookmark, click ‘Enable’ and then I’m good to go. I’ve had this problem for years and I’ve never seen anyone else complain about it. Is my account broken?

    1. JR Conlin wrote on :

      Unfortunately, I work more on the backend parts than the front, so I’m not fully familiar with which part of sync handles the themes.

      You could try asking in the #firefox_sync channel on chat.mozilla.org

  5. Mekin wrote on :

    I hope this new service solves the sleep-mode state’s sync problem: https://support.mozilla.org/tr/questions/1288007

    1. JR Conlin wrote on :

      Sadly, this is more about the backend that sync is talking to than the bits in the browser. I didn’t see a bug in Bugzilla around it, so the Sync team may not be aware it’s a problem. Have you filed one yet?

  6. Shane wrote on :

    Thanks for the update. Does this mean you guys are gonna be reevaluating the storage.sync quotas? By that I mean the quotas discussed here: https://blog.mozilla.org/addons/2020/07/09/changes-to-storage-sync-in-firefox-79/

    I’ve been hoping for a rollback on these quotas since then because they broke my addon. It turns the new tab page into a page with a big text input field where you can enter notes, save them to pages, and so on. It’s not like I’m writing manifestos on my new tab page or anything, but the quotas are so low that you can’t store more than a paragraph or so per page. I had to switch to storage.local, which is okay, at least my notes save permanently, but not having them synced between devices is a really big downgrade at least for me personally.

    If we can’t get better quotas, would you consider implementing a feature to allow users to easily sync to a private database on Google Drive or something along those lines? Like provide a link/oauth to a privately hosted sync db or something. I’m not sure what’s really practical or feasible, just hoping you’re looking into possible methods for addons to continue syncing larger packs of data as they did before those quotas. Thanks!

    1. JR Conlin wrote on :

      The quotas exist for our storage management. Running this service costs us a surprising amount of money, and while we are fortunate that we can offer it without cost to customers, at some point, we have to limit things. In addition, I’d warn add-on authors to be mindful of how much storage they may be causing their users to consume. I don’t know if your add-on is responsible, but there are a few users who have several GB of just tombstone records, which prevents us from moving their records to the new system. (We’re looking into various solutions for this, but determining the validity of tombstone records can be trickier than you’d first imagine.)

      There are several options for syncing data between devices aside from just using Sync, such as WebRTC, WebDAV, or even using WebPush (and exchange URLs if the data is larger than the max data size).

      1. Shane wrote on :

        Well, someone would have to manually create hundreds or thousands of pages and enter a lot of text into them to actually hit the 100KB limit. It’s the per-value limit that puts an artificial and really low limit on how much text the script can save. It does a fine job of cleaning the storage because it only uses generic keys for each page of notes. My understanding is that as long as the extension ID remains the same there should never be any orphaned data, though I’m no expert.

        I do know there are other ways users can achieve a similar effect, but they are way more difficult to implement, and would either not work automatically, out of the box, or they would require users to hand their data to a server controlled by me. And I don’t generate any revenues from the addon so I can’t afford to pay for a webserver anyway.

        The only thing I can think of that wouldn’t require Mozilla’s intervention or require private hosting is using the Google Drive API to store application-specific data in a private folder on the user’s Google Drive account. But it would require users to give their username and password for oauth, and it would be subject to the Google Drive API’s own quotas, which put a limit on the number of requests, not on the amount of data.

        The nature of the app sort of precludes this, because the reason it’s special is that it automatically updates the database when the user types, deletes a page, etc. It’s sort of data-centric in that it renders the html of the page based on the data present in the db. The user doesn’t have to do any kind of special interaction to sync it and may not even be aware that it was syncing. So any time you type it is sending multiple write and read requests, and every time you open the new tab page it’s sending a write request. Probably a single user would cap out my daily quota. Although they each have their own Google Drive account with however much storage they have, the addon can have only one API key, which is tied to my specific project. Everyone’s requests would be eating up the same quota for the same project.
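The difference between the total quota and the per-value limit raised in this thread can be shown with a small check. This is a sketch under stated assumptions: the two constants are assumed values for the storage.sync limits and should be verified against the add-ons blog post linked earlier, the size estimate (key bytes plus JSON-serialized value bytes) is only an approximation, and `item_size` and `quota_problems` are hypothetical helper names.

```python
import json

# Assumed limits; verify against the storage.sync post linked in the thread.
MAX_TOTAL_BYTES = 102_400  # the "100KB" total mentioned above
MAX_ITEM_BYTES = 8_192     # assumed per-value limit


def item_size(key: str, value: object) -> int:
    """Approximate stored size: key bytes plus JSON-serialized value bytes."""
    return len(key.encode("utf-8")) + len(json.dumps(value).encode("utf-8"))


def quota_problems(items: dict[str, object]) -> list[str]:
    """List which limits a set of items would exceed, per item first, then total."""
    problems = [
        f"{key}: item too large"
        for key, value in items.items()
        if item_size(key, value) > MAX_ITEM_BYTES
    ]
    total = sum(item_size(key, value) for key, value in items.items())
    if total > MAX_TOTAL_BYTES:
        problems.append("total quota exceeded")
    return problems
```

Under these assumed numbers, a single long page of notes trips the per-item check long before a whole collection of short pages comes anywhere near the total, which is the situation the comment describes.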

  7. Andrey Vorobyev wrote on :

    Good morning.

    > Like what data store were we going to use.
    > We picked Google Cloud’s Spanner database for its own pile of reasons

    Will your services running on Google Cloud work in Crimea?

    1. JR Conlin wrote on :

      I believe so, yes. As I understand, there are only a few countries where data service is not allowed (e.g. North Korea), and some countries or regions may block access for their own reasons. That’s one of the reasons we’re working hard on providing the stand-alone server as well.

  8. John-Pierre wrote on :

    Does this mean that the labels you can assign to the bookmarks on Firefox desktop will also be synced to Firefox mobile? It’s a major drawback that these are not being synced, because they are great for saving bookmarks in ‘virtual folders’.

    And will it mean that there is no separate list for mobile vs desktop bookmarks. It doesn’t really make sense to me that they are saved in different lists. They are all bookmarks in a single sync profile after all.

    Thanks
    JP

    1. JR Conlin wrote on :

      As I understand, mobile had a legacy of different teams working on their implementation of the bookmark merge engine. It’s a bit complicated, but Sync is more like a delivery van picking up and dropping off crates of “stuff”. Bookmarks are just one of the crates of “stuff”. It’s up to each device to determine how to unpack and use the data.

      One of the more recent efforts is to fix a fair bit of that by rolling out a more unified architecture. I believe the Application-Services group is working on that (specifically the “Dog Ear” effort).

      1. John-Pierre wrote on :

        Thank you for your reply. Did some searching and found this feature request. Hopefully it will be implemented.

        Another suggestion now there is a complete backend with all bookmarks (no idea where else to post it) is to have web access to the bookmarks, similar to how Delicious from 2003 and tagpacker work(ed). That way you can share the bookmarks inside a tag with friends.

        Thanks
        JP

        1. JR Conlin wrote on :

          I’m not so sure… Sync is end-to-end encrypted, meaning that the server cannot (and really does not want to) read any of your info. I mean, you’re absolutely free to export your bookmarks to a file and publish them how you see fit, but trying to sift out “Well, this bookmark is public” vs. “This bookmark contains personal info I may be unaware is being disclosed” is not something anyone who works on back-end systems ever wants to deal with. Privacy scales remarkably well in that respect.

          Plus, competing against pinboard doesn’t have the greatest record.