Introducing the Mozilla Location Service

Hanno Schlichting

The Mozilla Location Service is an experimental pilot project to provide geolocation lookups based on publicly observable cell tower and WiFi access point information. Currently in its early stages, it already provides basic service coverage of select locations thanks to our early adopters and contributors.

A world map showing areas with location data. Map data provided by mapbox / OpenStreetMap.

While many commercial services exist in this space, there’s currently no large public service to provide this crucial part of any mobile ecosystem. Mobile phones with a weak GPS signal and laptops without GPS hardware can use this service to quickly identify their approximate location. Even though the underlying data is based on publicly accessible signals, geolocation data is by its very nature personal and privacy sensitive. Mozilla is committed to improving the privacy aspects for all participants of this service offering.

If you want to help us build our service, you can install our dedicated Android app, MozStumbler, and enjoy competing against others on our leaderboard, or choose to contribute anonymously. The service is evolving rapidly, so expect to see a more full-featured experience soon. For an overview of the current experience, head over to the blog of Soledad Penadés, who wrote a far better introduction than we did.

We welcome any ideas or concerns about this project and would love to hear any feedback or experiences you might have. Please contact us on our dedicated mailing list, or come talk to us in our IRC room, #geo, on Mozilla's IRC server.

For more information please follow the links on our project page.

Hanno Schlichting, on behalf of the geolocation and cloud services teams

Heka 0.3 released

Rob Miller

Those of us here on Mozilla Services' Heka team were very pleased by the positive response and interest generated by our initial announcement of the project. And we're even more pleased that some of you out there have decided to help out, contributing doc tweaks, bug fixes, and, in some cases, completely new plugins back to the Heka core. All the activity has kept us inspired, and we've landed a huge number of fixes and improvements ourselves since then. We're happy to be rolling these out in a new Heka 0.3 release.

A full list of what’s new in this release can be found in the changelog, but here are some of the bigger features:

  • ElasticSearch output: We had just decided that we wanted to write Heka message data out to ElasticSearch (so we could search through our data using a Kibana dashboard) when we received a pull request from Tanguy Leroux providing exactly that. The screenshot below shows a Kibana dashboard displaying a histogram of the 10 (anonymized) Firefox Sync users who received the most 503 HTTP response codes over a specific period of time, extracted by Heka from our load balancer log files.
  • Restartable plugins: It is now possible to mark any Heka input, filter, or output plugin as restartable, so that it will reinitialize itself and start over when it encounters an error. This is especially useful for plugins that require persistent connections to external services, since it allows them to reconnect. You can also configure the retries to back off exponentially up to a user-defined cap, or to add some timing jitter so that several reconnection attempts don't all happen simultaneously (see the sketch after this list).
  • Resume-from-location log file parsing: When shutting down, LogfileInput will note where it stopped parsing a log file, and will try to pick up from the same location when it restarts.
  • Nagios output: If you use Nagios for monitoring, you can now use the NagiosOutput plugin to generate notifications triggered by Heka messages. Combine this with the ability to do arbitrary data processing in Heka’s dynamic Lua filters, and it becomes very easy to set up ad-hoc notifications for specific targeted events.
  • Improved text parsing: We’ve moved the regular expression match group capturing functionality out of the router and into a decoder, so it won’t slow down routing of messages that don’t use capture groups. We also managed to add some timezone-shifting functionality, for cases where a non-UTC time zone is used but not specified in the timestamps.
  • HTTP input: Thanks to an initial effort by David Delassus, we’ve now got an HttpInput plugin that will make HTTP requests and turn the resulting response bodies into Heka messages. You’ll need a custom Lua filter to parse the results and extract useful data, at least until the helpful decoders that we have under development are ready to take over that job for you.
  • CloudWatch input & output: We've added plugins to get data out of and into Amazon's CloudWatch metrics service. They're not in the Heka core, but they live in the Mozilla Services repository of custom Heka plugins and are available in the released binaries.
  • New mailing list: There’s a new, dedicated Heka mailing list for announcements about changes to configuration options, Heka behavior, and anything else that might impact running Heka servers. Anyone interested in Heka should check it out!
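
Heka itself is written in Go, but the backoff-and-jitter behavior described in the restartable-plugins item above is easy to sketch. Here is a minimal, illustrative Python version; the function names and defaults are ours, not Heka's actual configuration:

import random
import time

def backoff_delays(base=1.0, cap=60.0, jitter=0.1):
    """Yield retry delays that double up to a cap, each spread by
    random jitter so simultaneous clients don't retry in lockstep."""
    delay = base
    while True:
        yield delay * (1 + random.uniform(-jitter, jitter))
        delay = min(delay * 2, cap)

def run_restartable(connect, process):
    """Keep a 'plugin' running, reconnecting with backoff on errors."""
    for delay in backoff_delays():
        try:
            connection = connect()
            process(connection)  # returns only on a clean shutdown
            return
        except Exception as exc:
            print("plugin error: %s; restarting in %.1fs" % (exc, delay))
            time.sleep(delay)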

As you can see, that’s a lot of progress. Big thanks to the Heka team and everyone who sent in patches, bug reports, and suggestions – keep them coming!

Heka is improving rapidly, but it’s still best suited for early adopters at this point. If you’re interested in rolling your sleeves up and digging in, please feel free to check out the binaries, the source code, and the documentation. And don’t forget to join the mailing list, and to drop in to the #heka channel on irc.mozilla.org to ask questions or share your experiences.

Introducing Heka

Rob Miller

We here on the Mozilla Services team are happy to announce our first beta release (v0.2b1) of Heka, a tool for high-performance data gathering, analysis, monitoring, and reporting. Heka's main component is hekad, a lightweight daemon program that can run on nearly any host machine and does the following:

  • Gathers data through reading and parsing log files, monitoring server health, and/or accepting client network connections using any of a wide variety of protocols (syslog, statsd, http, heka, etc.).
  • Converts the acquired data into a standardized internal representation with a consistent metadata envelope, to support effective handling and processing by the rest of the Heka system (see the sketch after this list).
  • Evaluates message contents and metadata against a set of routing rules and determines all of the processing filters and external endpoints to which a message should be delivered.
  • Processes message contents in-flight, to perform aggregation, sliding-window event processing and monitoring, extraction of structured data from unstructured data (e.g. parsing log file output text to generate numeric stats data and/or more processing-friendly data structures), and generation of new messages as reporting output.
  • Delivers any received or internally generated message data to an external location. Data might be written to a database, a time series db, a file system, or a network service, including an upstream hekad instance for further processing and/or aggregation.
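
To make the “standardized internal representation” concrete, here is a rough Python sketch of what such a message envelope looks like conceptually. The field names are illustrative approximations, not Heka's exact schema:

from dataclasses import dataclass, field

@dataclass
class Message:
    """Sketch of a message with a consistent metadata envelope."""
    timestamp: float          # when the event happened
    type: str                 # message type, consulted by routing rules
    logger: str               # which input produced the message
    severity: int = 7         # syslog-style severity level
    hostname: str = ""        # machine the data came from
    payload: str = ""         # the raw data, e.g. a single log line
    fields: dict = field(default_factory=dict)  # extracted structured data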

Heka is written in Go, which has proven well-suited to building a data pipeline that is both flexible and fast; initial testing shows a single hekad instance is capable of receiving and routing over 10 gigabits per second of message data. We’ve also borrowed and extended some great ideas from Logstash and have built Heka as a plugin-based system. Developers can build custom Input, Decoder, Filter (i.e. data-processing), and Output plugins to extend functionality quickly and easily.

All four of the plugin types can be implemented in Go, but managing those plugins requires editing the config file and restarting the server, and, if you're introducing new plugins, even recompiling the hekad binary. Heka provides another option, however: “Sandboxed Filters,” written in Lua instead of Go, which can be added to and removed from a running Heka instance without editing the config or restarting the server. Heka also provides Lua APIs that Sandboxed Filters can use to manage a circular buffer of time-series data and to generate ad-hoc graph reports (such as the following example) that show up on Heka's reporting dashboard:

An example graph report generated by a Sandboxed Filter, as rendered on Heka's reporting dashboard.
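
Heka exposes its circular-buffer API to Sandboxed Filters in Lua; purely to illustrate the underlying idea, here is a small Python sketch of a fixed-size ring of time buckets (a simplified design of our own, not Heka's actual API):

class CircularTimeSeries:
    """A fixed number of per-interval buckets; advancing time
    overwrites the oldest buckets, so memory use stays constant."""
    def __init__(self, rows, seconds_per_row):
        self.rows = rows
        self.seconds_per_row = seconds_per_row
        self.buckets = [0.0] * rows
        self.newest = None  # absolute index of the newest bucket

    def add(self, timestamp, value):
        bucket = int(timestamp // self.seconds_per_row)
        if self.newest is None:
            self.newest = bucket
        if bucket > self.newest:
            # Zero every bucket we roll past, evicting the oldest data.
            start = max(self.newest + 1, bucket - self.rows + 1)
            for b in range(start, bucket + 1):
                self.buckets[b % self.rows] = 0.0
            self.newest = bucket
        if bucket > self.newest - self.rows:  # drop data older than the window
            self.buckets[bucket % self.rows] += value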

Heka is a new technology. We're running it in production in a few places inside Mozilla, but it's still a bit rough around the edges. Like everything Mozilla produces, however, it's open source, so we're releasing early and often to make it available to interested developers (contributors / pull requests welcome!) and early adopters. To learn more, check out the binaries, the source code, and the documentation, or come find us in the #heka channel on irc.mozilla.org.

Implementing cross-origin resource sharing (CORS) for Cornice

Alexis Metaireau

This is the first technical article on the Mozilla Services blog; expect to read more technical content here in the future. Until now we have been publishing on our respective personal blogs, but we'll try to publish new articles here from now on.

For security reasons, browsers don't allow cross-domain requests by default. In other words, if you have a page served from the domain lolnet.org, it will not be possible for it to get data from notmyidea.org.

Well, it is possible, using tricks and techniques like JSONP, but those don't work all the time (see the section below). I remember writing simple proxies on my own server just to be able to query other people's APIs.

Thankfully, there is a nicer way to do this, namely “Cross-Origin Resource Sharing”, or CORS.

You want an ice cream? Go ask your dad first.

If you want to use CORS, you need the API you're querying to support it on the server side.

The HTTP server needs to answer OPTIONS requests with the appropriate response headers.

OPTIONS is sent as what the authors of the spec call a “preflight request”: just before making a request to the API, the User-Agent (the browser, most of the time) asks the resource for permission with an OPTIONS call.

The server answers and tells the User-Agent what is available and what isn't:

A diagram of the CORS preflight flow: the permission request and response (1a and 1b), followed by the actual request (2).

  • 1a. The User-Agent, rather than making the call directly, asks the server (the API) for permission to make the request. It does so with the following headers:
    • Access-Control-Request-Headers: contains the headers the User-Agent wants to send.
    • Access-Control-Request-Method: contains the method the User-Agent wants to use.
  • 1b. The API answers with what is authorized:
    • Access-Control-Allow-Origin: the origin that's accepted. Can be * or the domain name.
    • Access-Control-Allow-Methods: a list of allowed methods. This can be cached. Note that the request asks permission for one method, but the server should return the full list of accepted methods.
    • Access-Control-Allow-Headers: a list of allowed headers, for all of the methods, since this can be cached as well.
  • 2. The User-Agent can then make the “normal” request.

So, if you want to access the /icecream resource and do a PUT there, you'll have the following flow:

> OPTIONS /icecream
> Origin: notmyidea.org
> Access-Control-Request-Method: PUT
< 200 OK
< Access-Control-Allow-Origin: notmyidea.org
< Access-Control-Allow-Methods: PUT,GET,DELETE

You can see that the request carries an Origin header as well as an Access-Control-Request-Method header. We're asking whether we have the right, as notmyidea.org, to make a PUT request on /icecream.

And the server tells us that we can do that, as well as GET and DELETE.
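
The browser normally sends the preflight for you, but replaying it by hand is a handy way to debug a server's CORS setup. Here is a quick sketch using the Python requests library; the API host is just a placeholder:

import requests

# Replay the preflight a browser would send before a cross-origin PUT.
response = requests.options(
    "https://api.example.org/icecream",  # placeholder endpoint
    headers={
        "Origin": "https://notmyidea.org",
        "Access-Control-Request-Method": "PUT",
    },
)

print(response.status_code)
print(response.headers.get("Access-Control-Allow-Origin"))
print(response.headers.get("Access-Control-Allow-Methods"))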

I won't cover all the details of the CORS specification here, but bear in mind that with CORS you can control the authorized methods, headers, and origins, as well as whether the client is allowed to send authentication information.

A word about security

CORS is not an answer for every cross-domain call you want to make, because you need to control the service you want to call. For instance, if you're building a feed reader and want to fetch feeds from different domains, you can be pretty sure those servers will not implement CORS, so you'll need to write a proxy yourself.

Secondly, if misunderstood, CORS can be insecure and cause problems. Because the rules apply when a client wants to make a request to a server, you need to be extra careful about whom you authorize.

An incorrectly secured CORS server can be accessed by a malicious client very easily, bypassing network security. For instance, suppose you host a server on an intranet that is only reachable from behind a VPN, but it accepts every cross-origin call. An attacker can inject JavaScript into the browser of a user who has access to your protected server and make calls to your service, which is probably not what you want.

How is this different from JSONP?

You may know the JSONP technique. JSONP allows cross-origin requests, but only for a particular use case, and it has some drawbacks (for instance, it's not possible to do DELETEs or PUTs with JSONP).

JSONP exploits the fact that it is possible to get information from another domain when you are asking for JavaScript code, using the <script> element.

Exploiting the open policy for <script> elements, some pages use them to retrieve JavaScript code that operates on dynamically generated JSON-formatted data from other origins. This usage pattern is known as JSONP. Requests for JSONP retrieve not JSON, but arbitrary JavaScript code. They are evaluated by the JavaScript interpreter, not parsed by a JSON parser.

Using CORS in Cornice

Okay, things are hopefully clearer about CORS; let's see how we implemented it on the server side.

Cornice is a toolkit that lets you define resources in Python and takes care of the heavy lifting for you, so I wanted it to take care of CORS support as well.

In Cornice, you define a service like this:

from cornice import Service

foobar = Service(name="foobar", path="/foobar")

# and then you do something with it
@foobar.get()
def get_foobar(request):
    # do something with the request, then return a response body
    return "some foobar for you"

To add CORS support to this resource, you can use the cors_origins parameter, like this:

foobar = Service(name='foobar', path='/foobar', cors_origins=('*',))

Ta-da! You have enabled CORS for your service. Be aware, though, that you're authorizing anyone to query your server, which may not be what you want.

Of course, you can specify a list of origins you trust instead; you don't need to stick with *, which means “authorize everyone”.

Headers

You can define the headers you want to expose for the service:

foobar = Service(name='foobar', path='/foobar', cors_origins=('*',))

@foobar.get(cors_headers=('X-My-Header', 'Content-Type'))
def get_foobars_please(request):
    return "some foobar for you"

I did some testing and it wasn't working in Chrome, because I wasn't handling the headers the right way (the missing one was Content-Type, which Chrome was asking for). With my first version of the implementation, service implementers needed to explicitly list all the headers that should be exposed. While this improves security, it can be frustrating during development.

So I introduced an expose_all_headers flag, which is set to True by default if the service supports CORS.
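
If you prefer the stricter behavior, you can turn the flag off and keep whitelisting headers explicitly. A sketch (double-check the exact parameter name against the Cornice documentation):

foobar = Service(name='foobar', path='/foobar', cors_origins=('*',),
                 cors_expose_all_headers=False)

@foobar.get(cors_headers=('X-My-Header', 'Content-Type'))
def get_foobars_please(request):
    return "some foobar for you"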

Cookies / Credentials

By default, the requests you make to your API endpoint don't include credential information, for security reasons. If you really want that, you need to enable it using the cors_credentials parameter. You can activate it on a per-service basis or on a per-method basis.
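
A sketch of both options using the cors_credentials parameter described above. Note that when you allow credentials, you should also pin a specific trusted origin rather than using *:

# Per service: every method on this service allows credentials.
foobar = Service(name='foobar', path='/foobar',
                 cors_origins=('https://trusted.notmyidea.org',),
                 cors_credentials=True)

# Per method: only this GET view allows credentials.
@foobar.get(cors_credentials=True)
def get_foobar(request):
    return "some foobar for you"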

Caching

When you make a preflight request, the information returned by the server can be cached by the User-Agent, so the preflight doesn't have to be redone before each actual call.

The caching period is defined by the server, using the Access-Control-Max-Age header. You can configure this timing using the cors_max_age parameter.
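
For instance, to let User-Agents cache the preflight response for one hour:

foobar = Service(name='foobar', path='/foobar', cors_origins=('*',),
                 cors_max_age=3600)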

Simplifying the API

We have cors_headers, cors_enabled, cors_origins, cors_credentials, cors_max_age, cors_expose_all_headers … a fair number of parameters. If you want a specific CORS policy for your services, passing all of these around can get a bit tedious.

I introduced another way to pass the CORS policy, so you can do something like this:

policy = dict(enabled=False,
              headers=('X-My-Header', 'Content-Type'),
              origins=('*.notmyidea.org',),
              credentials=True,
              max_age=42)

foobar = Service(name='foobar', path='/foobar', cors_policy=policy)

Comparison with other implementations

I was curious to look at other implementations of CORS, in Django for instance, and I found a gist about it.

Basically, it adds a middleware that adds the “right” headers to the response, depending on the request.

While this approach works, it doesn't implement the specification completely, and it forces you to enable support for all the resources at once.

One could imagine a nicer way to implement this, by specifying in your settings which resources should be exposed via CORS and which shouldn't. In my opinion, though, CORS support should be handled at the service-definition level, except for the list of authorized hosts; otherwise, you don't know exactly what's going on when you look at the definition of the service.

Resources

There are a number of good resources that can be useful to you if you want to either understand how CORS works, or if you want to implement it yourself.

Of course, the W3C specification is the best resource to rely on. It isn't hard to read, so you may want to go through it, especially the “resource processing model” section.

Finally, you may want to have a look at the actual implementation in Cornice.

Retiring Firefox Home

mconnor

From the early days, Mozilla has been focused on empowering users across platforms and devices.  We released Firefox Home as an experiment in bringing a part of the Firefox experience to iOS, focusing on Firefox Sync.  This project provided valuable insight and experience with the platform, but we have decided to remove Firefox Home from the Apple App Store and focus our resources on other projects.

For those interested in continuing to use or improve the iOS Sync client that Firefox Home is built on, we have made the source available on GitHub, free of Mozilla trademarks and ready for independent development.  As with all Mozilla projects, we ask developers to be aware of the Mozilla trademark policy.

We remain committed to providing compelling user experiences across as many platforms and devices as possible and will continue to explore the best ways to provide great experiences to iOS users.

– mconnor, on behalf of the Firefox and Services teams

Add-on Sync Coming to Firefox

Gregory Szorc

We strive to make your online experience better, and we have a new feature in the latest Firefox Beta that we think you'll love: add-on sync.

Add-on sync does what its name implies: it synchronizes add-ons between profiles connected with Firefox Sync. Specifically, it will install, uninstall, enable, and disable add-ons across your devices as you do.

If you are a new Sync user, add-on sync will be enabled by default. However, since Mozilla cares about your privacy and we don’t want to do anything without your explicit permission, existing Sync users will need to manually opt in to the feature. This can be done through the Sync tab in Firefox’s Preferences window. Explicit instructions are available at https://support.mozilla.org/en-US/kb/how-do-i-enable-add-sync.

Once you have add-on sync enabled, you don't need to do anything special to get your add-ons to sync. As you use your browser, Sync runs in the background, collecting the current state of your add-ons and sending it to the Sync server; as you use your other devices, Sync applies their changes to your local Firefox. You won't see any pop-ups indicating it is running. And since some add-ons require a restart for changes to take effect, you may not see your new add-ons until you restart your browser. Our studies show that over 99% of Firefox users restart their browser at least daily, so you shouldn't have to wait too long. If you are curious, you can open the Add-on Manager (about:addons) to see what changes will occur on the next restart.

There are a number of challenges involved with synchronizing add-ons. Because of this, the scope of the initial add-on sync feature has intentionally been limited. For this initial release, an add-on will be synchronized only if all of the following criteria are met:

  • It is an extension or theme
  • It is installed from https://addons.mozilla.org/
  • It is publicly listed on the add-ons site
  • It is installed into the current profile by the user

For now, add-ons are only synchronized between identical application types: changes to a desktop browser will only affect other desktop browsers, and changes to a mobile browser will only affect other mobile browsers. Greater functionality between desktop and mobile will come in the future.

Security and privacy were important concerns in designing add-on sync. As with all your Firefox Sync data, add-on data is encrypted in your browser before being transmitted to the Sync server, so people in the cloud can't tell what add-ons you have installed, even if they wanted to know.

Add-on sync is a feature in progress, and we are working on further updates to include in later Firefox releases. You can learn more about add-on sync, including background on some key design decisions, at https://wiki.mozilla.org/Services/Sync/Addon_Sync. If you would like to get involved, instructions for reaching us can be found at https://wiki.mozilla.org/Services/Sync#Get_Involved.

We hope add-on sync makes managing your online experience with Firefox a little easier. Happy syncing!

Focusing on the integrated Firefox Sync experience

igarcia

Today we are announcing the end of official support for the Firefox Sync add-on, so that we can focus our resources on improving and supporting the Sync experience for users of the latest versions of Firefox. This means that the add-on will no longer be available at addons.mozilla.org, and the only way to get Firefox Sync features will be to upgrade to the latest version of Firefox, where they are built into the browser.

A lot has happened since Mozilla launched Weave back in 2007: we launched Weave as an add-on, rebranded it as Firefox Sync, moved Firefox to Rapid Release, and eventually integrated Firefox Sync into the browser. Our goal has always been to offer the best service so you can seamlessly share your Firefox experience across devices.

Since the launch of Firefox 4, the browser has come bundled with all the goodness of Firefox Sync. Because of the continued support for Firefox 3.6, we have also continued to support the add-on. Maintaining both the add-on and the integrated Sync causes a great deal of engineering and testing headaches, and we also see a lot of user unhappiness, since the add-on does not support some of the newer, highly desired features we've added to Sync.

Keep in mind that the add-on will still be usable for now, but we cannot guarantee its reliability in the coming months and for the foreseeable future. We ask everyone still using the Sync add-on to move to a newer version of Firefox to enjoy the best online and Sync experience.

— mconnor & ibai, on behalf of the Services and SUMO teams

Introducing Project Sagrada

mconnor

Project Sagrada is a Services project to build a solid and stable platform for developing new and highly scalable services and applications that support Mozilla's mission. The initial goal is to make it easy for internal developers to build reliable, high-performance backend services, easily deploy them to test or production, and have their deployed services be robust and easy to manage. Based on the current product direction for the known services, this set of capabilities is both the hardest to get right at scale and the most important shared aspect of all the services.

The first phase of the project will focus on adding capabilities (abuse detection, metrics, crypto, and more) and tooling (testing support, build automation, and packaging/dependency tracking) to the current python services-core framework. Later phases will provide better support for building and deploying web-based applications, increasing amounts of “self-serve” capabilities around deployment and provisioning, and options for isolating experimental/external apps.

By creating a core set of stable and reusable code libraries, and, where appropriate, web APIs alongside the programmatic ones, the technology choices (i.e. which key-value store, which auth technology, which abuse detection system, etc.) are abstracted away, allowing developers to focus on the code that is unique to their app. We believe this is the best and fastest path to a solid, scalable, and highly capable base platform that can be used both as a framework and as individual components.

How We Get There

Our efforts to date have focused on building out and refining capabilities as they arose, on a project-by-project basis. While still maintaining a focus on the products we support and the products we are working towards launching, our priorities in Q4 and beyond will shift towards building generic support for applications that teams across Mozilla will develop.

We recognize that this is a highly ambitious project that can fairly be described as open-ended. Even so, there is much to be said for how far along that path we can get in finite time.

The structure of a service generally follows the same pattern: a set of API entrypoints, some authentication, controller logic, storage, and a response. Then there are the additional questions of deployment, monitoring, and maintenance. All of these things combine to make writing an app from scratch a challenge, especially if you want it to scale out of the box.

What if we could make that 20% easier? That would be a big win, both internally and externally. We are probably there today: use our base application, and generating your entrypoints is pretty simple. Sure, you still have to write all your controller logic, figure out your storage story, and so on, but 20% is a pretty good start.

What if we could make it 40% easier? Simple key-value storage APIs and a strong metrics component might do that. And so on and so on.

Why not use one of the many available frameworks out there? We could probably pick up a quick win building on one of them, and we haven't ruled out the possibility. However, most frameworks are built to support HTML generation, and while they're capable of creating the appropriate responses, they pull in a lot of additional functionality that we don't need or use; if that gets in the way of our ability to scale, it's a nonstarter. Making it easy for us to get to 30% is a great idea, as long as it doesn't make it much harder to get to 50%.

The ultimate goal is 100%. That goal is impossible, as any computer programmer can tell you, and trying to achieve it produces the infinite horizon that makes this work seem so vast. But setting it as the goal misses the point: getting a user 75% of the way to a stable, scalable backend application that does what they need would be an enormous game-changer in the application creation space.

The Roadmap

We’re still in the bootstrapping phase of the project, so we’re building some of the obvious pieces, and gathering feedback on the rest. Please see the wiki for the current set of features we’re looking to add to the platform.

— mconnor & telliott, on behalf of the Services team

Enabling Quotas for Firefox Sync

mconnor

This generally won't affect users, but we wanted to be open about a change that will soon be deployed to our production servers. Essentially, we're going to define a per-user disk quota for Firefox Sync, with an initial limit of 25 MB of data per user. For the vast majority of users, this will be completely invisible, since the typical user uses far less. (I'm a very heavy browser user, and I'm at around eight megabytes of usage.) We've attempted to answer some of the expected questions below, but anything else can go in the comments.

Why implement quotas?

We want to keep Firefox Sync as a free service for everyone. In order to make this viable, we have to take steps to control costs and prevent abuse. Implementing a reasonable default cap on disk usage is an obvious step in that direction. We believe the initial default will be sufficient for nearly all users.

How much of my data will I be able to sync?

In general, as much as you’re already syncing. Based on our current usage statistics, this quota will be enough for more than 99.9% of our users. The average user uses about 1 MB of data, with an average of 83 bookmarks, 11 passwords, and 1500 history items. Even around the 90th percentile of users, the disk usage is under 6 MB, with around 700 bookmarks, 12000 history items, and 100 passwords.

Can I get more space?

While the specifics have yet to be determined, one of the things we're working on is a means to allow specific users to have a higher quota. In the meantime, it is already possible to request a quota increase via the Mozilla Services Account web page. We don't yet have a timeline for if/when these requests will be fulfilled, but we're using them to help understand why users need more quota, which will inform future designs.

Alternatively, for those more technically inclined, our server code is open source and can be installed on a server of your choice. For more information, please see the setup instructions.

What happens if I run out of space?

When a user passes the “warning” threshold (we expect to set this to 20 MB), the server will pass this information back to the client, so users get warned before they are unable to write to the server. If a user ignores these warnings and actually hits the cap, they will be unable to write any additional data to the server, and they will see Sync errors informing them of the problem. Firefox has UI that controls which types of data to sync and shows how much space each type is using, so users will be able to disable individual sync engines and free up space if necessary.
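
To make the two thresholds concrete, here is a simplified sketch of the per-write decision; it is illustrative only, with our own names, and the numbers mirror the limits mentioned above:

QUOTA_MB = 25   # hard cap: writes are rejected beyond this
WARN_MB = 20    # soft threshold: the client warns the user

def check_write(current_usage_mb, incoming_mb):
    """Sketch of the per-user quota decision made on each write."""
    if current_usage_mb + incoming_mb > QUOTA_MB:
        return "rejected"   # the client surfaces a Sync error
    if current_usage_mb + incoming_mb > WARN_MB:
        return "warn"       # the write succeeds, the user is warned
    return "ok"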

— mconnor, on behalf of the Services team

Firefox Sync on your Android

ragavan

If you've just installed Firefox 4 on your Android phone, be sure to set up its must-have feature: Sync. Sync enables you to bring your bookmarks, passwords, history, and even open tabs from all of your other devices. Once Sync is set up, you will no longer have to type in long URLs or remember passwords; your Firefox will have all that and more ready for you, right at your fingertips.

If you are new to Sync, you will first need to set it up on your desktop computer running Firefox. After you have successfully set up Sync on your desktop, you can then proceed to set up Sync on your Android phone.

Firefox for Mobile Sync setup

Watch this video that takes you through the complete setup process, including setting up your phone.

To set up Sync on your computer and mobile, follow these easy instructions.

You’re done! Sync will continue working silently in the background and give you instant access to all of your Firefox data. With Sync, you can get up and go and have your data with you, right in your pocket.

If you need any help setting up Sync, head over to the Firefox Support site or ask a question on the forums.

Happy syncing!


The Firefox Sync team