Mozilla IT

Mozilla IT & Operations

Mozilla DB news, Friday March 16th

  • While adding a custom field to Bugzilla to track the newest SeaMonkey version, the script ran into a lock wait timeout and aborted. Some of the data needed to be manually inserted to finish adding the custom field.
  • We then needed to add database grants so our metrics team could access the new fields.
  • Added access for the Autoland staging server.
  • We added the DBAs to what gets paged for our new backup server.
  • This seemed to be the week that a few machines started having disk issues, though all of them were one-offs (as opposed to having to set expire_logs_days). I did run into a fascinating issue where binary logs for a machine were 7G even though the maximum size was supposed to be 1G.
  • This was also the week that some cron jobs did not get run, because we “sprung ahead”. Monday was a fun day, but luckily everything was easy to fix. Lesson learned: do NOT run anything via cron in the hour that Daylight Saving Time skips or repeats. If your server is set to a time zone that observes DST, a job scheduled in the skipped hour (0200 to 0259 in U.S. time zones) runs zero times in March, and a job scheduled in the repeated hour (0100 to 0159 in U.S. time zones) can run twice in the fall.
  • The mozillians.org team wanted some data about group names so they could optimize their searching, so we gave them a data export.
  • We removed some company-sensitive comments from a Bugzilla bug.
  • Due to machines being moved around from the old data center to the new one, we had a new location for the developers to pick up their nightly exports of the support.mozilla.com database.
  • Did you know I co-host a weekly podcast about MySQL? It’s called OurSQL Cast. You can find it on Feedburner and iTunes. Episode 83 is up, called “The NewSQL World”, and we interview Ori Herrnstadt, the CTO of Akiban.
  • We got several new database nodes kickstarted in our new data center.
  • We are preparing to upgrade MySQL on Bugzilla’s staging server, which will happen on Sunday.
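
The Daylight Saving Time lesson in the cron item above can be checked with a little date arithmetic. This is only a sketch: it assumes the U.S. rules in effect in 2012 (clocks jump from 0200 to 0300 on the second Sunday in March and fall back from 0200 to 0100 on the first Sunday in November), the function names are made up for illustration, and a particular cron implementation may treat these hours differently.

```python
from datetime import date, timedelta


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    days_to_sunday = (6 - first.weekday()) % 7  # weekday(): Monday=0 ... Sunday=6
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def wall_clock_occurrences(day: date, hour: int) -> int:
    """How many times a wall-clock hour occurs on that day (assumed U.S. rules)."""
    spring_forward = nth_sunday(day.year, 3, 2)
    fall_back = nth_sunday(day.year, 11, 1)
    if day == spring_forward and hour == 2:
        return 0  # 0200-0259 is skipped
    if day == fall_back and hour == 1:
        return 2  # 0100-0159 happens twice
    return 1


if __name__ == "__main__":
    print(wall_clock_occurrences(date(2012, 3, 11), 2))  # 0
    print(wall_clock_occurrences(date(2012, 11, 4), 1))  # 2
    print(wall_clock_occurrences(date(2012, 3, 12), 2))  # 1
```

Scheduling jobs outside the affected hours, or running the server clock in UTC, sidesteps both cases.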

IT Desktop Newsletter – March 2012 Edition

We have a new team member!

We are still growing! We would like to extend a warm welcome to Ryan Watson as the latest member to join the Desktop Support Team! He will be based in the London office, and supporting the needs in his surrounding area.

Ryan’s favorite part about being a Mozillian is, “being around great people to learn from and having an open community environment… and of course the free snacks :D .” He also says, “I like to spend my free time either in the gym or with friends nerding out.” If he could travel anywhere in the world, he said, “I would like to check out Japan. I have been fortunate enough to have traveled pretty well in my life, but Japan is still very high on my to-do list. It seems like a very interesting and different place.”

Some Email Client Options

If you prefer to use a local email client we suggest Thunderbird. With Thunderbird you can sync up your email and folders so that you can work offline when you do not have an internet connection.

Thunderbird also has great add-ons, like Lightning. Lightning allows you to manage your calendar within Thunderbird. For further assistance setting up Thunderbird with Lightning please see:
Setting up Thunderbird with Lightning.

If you prefer a web-based email client then we suggest using Zimbra: https://mail.mozilla.com. Don’t forget that within Zimbra we have a list of office conference rooms so that you can conveniently book a room.

  • NOTE: After adding a conference room to the meeting invitation, make sure that you click the [Send] button. If you only use the [Save] button, then Zimbra will not reserve the room on your behalf.

Distribution Lists in Zimbra

Have you ever needed to see the full list of members that are included in a particular distribution list? Here’s how:
  1. In Zimbra mail click the “Address Book” tab.
  2. Change the drop-down search menu from “Contacts” to “Global Address List.”
  3. Type the distribution list name or a part of the name in the search field (e.g. typing London will show a list of employees based out of London, or at least those interested in London office news).
  4. Click the [Search] button.

Employee Purchase Program

Take advantage of our Employee Purchase Program with CDW!

  • For the U.S., go to http://www.cdw.com/epp, enter the EPP code (CF327EFB), and register for an EPP E-Account.
  • For Canada, go to http://www.cdw.ca/epp, enter the EPP code (A9D722B2), and register for an EPP E-Account.

Need computer equipment?

We are here to help! Have you seen the power adapters with pink duct tape and wondered why the flashy decor? These are placed in conference rooms for your convenience during meetings. We prefer these power bricks to stay in the conference room, so if you find that you need to replace the one at your desk, then please contact us. We are happy to get you your own power adapter!

Do you deal with some sensitive information on your computer? You may want a Privacy Screen to place over your external monitor. Submit a bug for IT Desktop or send a quick IRC message and we can take care of you.

Is your mouse worn down and ready for retirement like this one? Then we can get you a new one!

PGDay NYC, Austin and DC!

The talks for PGDay NYC 2012 have been announced. The full lineup of talks for the one-day conference on Monday, April 2, 2012, is available at http://pgday.nycpug.org/schedule/.

There is a $50 discount on tickets using the code PGMEETUP (http://pgdaynyc2012.eventbrite.com/?discount=PGMEETUP) from now through March 16th.

PGDay NYC is a one day PostgreSQL conference in NYC featuring both well-known PostgreSQL community members as speakers and local users sharing their practical experience of using PostgreSQL as a key part of their infrastructure. PGDay NYC 2012 is part of the “PG Corridor Days” series of one-day conferences to help promote PostgreSQL usage in their locales.

The other conferences are PGDay Austin on Wednesday, March 28 (http://www.postgresql.org/about/event/1379/) and PGDay DC on Friday, March 30 (http://pgday.bwpug.org/). These are all non-profit events organized by local users.

MySQL Community Dinner at Pedro’s

Once again, this year there will be a community dinner at Pedro’s. Pythian is organizing this pay-your-own-way, informally fun way to meet and re-connect with colleagues and friends, old and new. It is open to everyone, and is right after the opening reception Percona is throwing (that’s from 4:30 to 6:30).

So on Tuesday, April 10th, meet at 6:30 pm in the lobby of the Hyatt to walk over (about 1 mile), or meet at 7 pm at Pedro’s – 3935 Freedom Circle, Santa Clara, CA. If you want to come, please RSVP by leaving a comment on the Pythian post, so they have an accurate headcount.

Seeking Articles About HTML5 and Python

Because I wrote a book and am somewhat visible in the community, I often get requests to write articles, and many of those requests are not completely appropriate for me. For example, today I got an e-mail from Lukas Rakowski of Software Press, who makes PHP Solutions Magazine. Basically, they’re looking for articles about HTML5 and Python – neither of which are subjects I could author an article on (other than “I know HTML5 is all sorts of awesome” and “I’ve been meaning to script in Python more”).

However, I know that there are many Mozillians that can write and can write about HTML5 and/or Python. To that end, here is the message I received:

We are creating a new version of popular magazine PHP Solutions, which is available online.
The magazine is about programming languages and software creation technologies.
Articles published in the our magazine provide ready-made solutions,
which the reader can use in their work.
We are looking for specialist in web development and programing, who will be
interested in writing some articles in this subject.
May you be interested in sharing your knowledge with us?
We currently seek for HTML5 and/or Python specialists.
Please send me more information about yourself
and your experience in this subject.

If you are interested, please contact Lukas at lukasz dot rakowski at software.com.pl – I have no further information about this; I don’t know what benefits/compensation are involved, how frequently the new magazine will come out, etc. I am just passing along the message.

Mozilla DB news, Week of Fri Mar 9th

This week I am in Mountain View to help with some physical data center moves. It is nice to get to see where the magic happens, and our new data center is pretty awesome. Our data center technicians, Derek Moore and Erica Muxlow, are really doing a stellar job, as are their interns.

At the office:
- Visited the San Francisco office. What a great view!
- A dependent subquery that had been running fine for weeks started acting up, causing 1000% more load than normal on a machine that hosted many sites, including the Firefox Flicks $10,000 Contest website and Tinderbox Push LoadLog [whoops, Freudian slip!]. We changed it to a join and added an appropriate index, and the query now runs much faster and is not causing any load.
- Created a user and database for the development site of Mozilla Labs
- Pointed our webdev team to some newer exports for some data they were looking for
- Enabled MySQL access for another buildbot host
- Cleaned up some older binary logs that were hanging around after a binlog name change in January, also set expire-logs-days for the same server
- Converted some data from latin1 to utf8 on a staging machine and debugged a cron job that was duplicating data
- Helped assemble 32 new SSD drives for new database blades (and thanks to the rest of the systems team for installing the drives and kickstarting the machines!)
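
To illustrate the subquery-to-join change mentioned above, here is a minimal sketch using SQLite from Python so that it runs anywhere. The `pushes` and `logs` tables, their columns, and the index name are hypothetical (the real query is not shown in this post), and SQLite's planner is not MySQL's, so this only demonstrates that the two forms return the same rows, not the load difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pushes (id INTEGER PRIMARY KEY, tree TEXT);
    CREATE TABLE logs (id INTEGER PRIMARY KEY, push_id INTEGER, status TEXT);
    INSERT INTO pushes VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO logs VALUES (1, 1, 'failed'), (2, 2, 'ok'),
                            (3, 3, 'failed'), (4, 3, 'failed');
    -- Index on the filter column plus the join column.
    CREATE INDEX idx_logs_status_push ON logs (status, push_id);
    """
)

# Subquery form: older MySQL optimizers may re-run the inner SELECT per outer row.
SUBQUERY_SQL = """
    SELECT p.id, p.tree FROM pushes p
    WHERE p.id IN (SELECT l.push_id FROM logs l WHERE l.status = 'failed')
    ORDER BY p.id
"""

# Join form: same rows; DISTINCT drops duplicates when several logs match a push.
JOIN_SQL = """
    SELECT DISTINCT p.id, p.tree FROM pushes p
    JOIN logs l ON l.push_id = p.id
    WHERE l.status = 'failed'
    ORDER BY p.id
"""

subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
join_rows = conn.execute(JOIN_SQL).fetchall()
print(subquery_rows == join_rows)  # True
```

On MySQL itself, running EXPLAIN on both forms is the way to confirm that the rewritten query actually uses the new index.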

At the data center:
- Racked 16 blades, including scanning in and updating the inventory database with serial numbers and asset tags
- For 6 of those 16, had to physically dig out the toe tags that had the serial numbers on them (due to a manufacturing problem I’m sure)
- Attached rails to several new servers

Global Load Balancing at Mozilla

What is Global Load Balancing?

Global Load Balancing is the act of directing traffic to multiple locations over a wide area. It differs from normal load balancing in that the nodes you’re directing traffic to are not “local”… you can’t count on the link between the load balancer and the node being LAN-speed.

The main purpose of spreading your servers out is to avoid localized failures: datacenter outages, natural disasters, etc. Properly designed, there aren’t many external factors that can take down a site hosted in multiple datacenters. This is where global load balancing comes into play.

In order to make this work, your load balancer itself obviously ought to be available in multiple datacenters: you don’t have much redundancy if your global load balancer is in only one location! But this immediately creates another instance of the same problem… how do you route traffic to your global load balancers?

Luckily, the DNS protocol handles this type of thing for you: you specify multiple nameservers, and the client will (or is supposed to) query one of them, and fail over if it doesn’t work. Consequently, most global load balancers are DNS services.

This is very different from most “normal” load balancers, which are usually either proxies (layer 7) or NAT/IP-mangling devices (layer 4). In some scenarios either solution (LB or GLB) will work… but typically, you will have a global load balancer that directs traffic to multiple normal load balancers, ideally in different datacenters. The normal load balancers will in turn direct traffic to a local cluster of servers in the same datacenter.
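
As a rough sketch of the two-layer idea, a DNS-based global load balancer boils down to deciding which datacenters' load balancer addresses to hand out for a query. Everything here is hypothetical: the `Datacenter` class, the names, the documentation-range IP addresses, and the choice to return all addresses when every health check fails are assumptions for illustration, not how any particular product behaves.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Datacenter:
    name: str
    lb_address: str  # address of that datacenter's normal load balancer
    healthy: bool    # result of the GLB's most recent health check


def dns_answer(datacenters: List[Datacenter]) -> List[str]:
    """Addresses a DNS-based GLB might return for one query."""
    healthy = [dc.lb_address for dc in datacenters if dc.healthy]
    # Assumption: if every check fails, answer with all addresses rather than none.
    return healthy or [dc.lb_address for dc in datacenters]


datacenters = [
    Datacenter("dc-a", "192.0.2.10", healthy=True),
    Datacenter("dc-b", "198.51.100.10", healthy=False),
]
print(dns_answer(datacenters))  # ['192.0.2.10']
```

The normal load balancer behind each returned address then spreads requests across the local cluster.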

Mozilla’s Current GLB Solutions

At Mozilla we’ve experimented with many different solutions for global load balancing. Several solutions have “stuck” over the years… and once they stick, they tend to hang around for quite a while. Consequently, as of March 1 we actually have a total of 5 GLB solutions actively in use, with a 6th under consideration!

  • Netscaler GSLB
    • Built-in functionality of some old Citrix Netscaler appliances that we are phasing out.
    • Removed this week! This handled various web traffic… notably, blog.mozilla.org used this.
      • Most things became non-GLB properties, because only one node/location was actually functional anyway. :)
  • geodns
    • This manages releases.mozilla.org.
    • It’s somewhat strongly embedded because it handles which mirrors are actively in use, depending on which ones are up-to-date. Switching means rewriting this logic on top of another platform.
    • Discussions are ongoing as to how we can replace this, or at least move it out of SJC1 during our datacenter migration.
  • Zeus GLB
    • This manages a few websites… notably, bugzilla.mozilla.org uses this to determine which datacenter is “active” and which is passive.
    • This is actually an end-of-life product. Its direct replacement is Zeus Multi-Site-Manager, but an upgrade is non-trivial in our case, and we’ve ultimately decided to migrate away from this entirely.
      • Migrations are largely moving to Cedexis, or becoming non-GLB services.
  • 3crowd CrowdDirector
    • This manages 2 things currently: releases-rsync.mozilla.org and irc.mozilla.org.
    • This is a 3rd party service – we delegate certain names over to them, and then use their interface/software to set rules on when it should return which records.
  • Cedexis Openmix
    • This manages most of our multi-hosted websites now.
    • This is also a 3rd party service, similar to 3crowd. The main differences are:
      • 3crowd gets delegations, via NS records… Openmix gets CNAME records. This makes 3crowd a bit more complicated but also a bit more flexible.
      • Openmix is more complicated and more flexible in terms of how routing decisions are actually made: you define your own script. Consequently it can make decisions based on a wide variety of criteria.
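
To give a feel for what a scripted routing decision can look like, here is a small sketch. It is not the Openmix API: the function name, the `availability` and `response_ms` fields, the threshold, and the fallback are all made up for illustration.

```python
from typing import Dict


def choose_target(stats: Dict[str, Dict[str, float]],
                  min_availability: float = 90.0,
                  fallback: str = "dc-a") -> str:
    """Pick the fastest target among those that look available enough."""
    usable = {
        name: s for name, s in stats.items()
        if s["availability"] >= min_availability
    }
    if not usable:
        # Nothing passes the availability bar, so use a fixed default.
        return fallback
    return min(usable, key=lambda name: usable[name]["response_ms"])


stats = {
    "dc-a": {"availability": 99.0, "response_ms": 120.0},
    "dc-b": {"availability": 97.5, "response_ms": 80.0},
    "dc-c": {"availability": 40.0, "response_ms": 30.0},
}
print(choose_target(stats))  # dc-b
```

Because the decision is ordinary code, other criteria (geography, cost, time of day) can be folded in the same way.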

Other Solutions Considered

Other solutions considered-and-rejected or actively under consideration:

  • Zeus Multi-Site-Manager
    • Replacement for Zeus GLB… integrated with normal Zeus Traffic Manager appliance.
    • This was rejected as too complicated: by tying together GLB and normal load balancing, it actually became rather confusing trying to maintain a large-ish installation. Ultimately it was easier to keep the two layers separate.
  • Dynect Managed DNS
    • This is another 3rd-party service. We actually used this for all of our DNS management in the past. As we grew it became financially infeasible, and we brought it all in-house. We are now considering sending certain things back to them, specifically for the GLB functionality.
    • The main benefit of Dynect vs Cedexis or 3crowd is that they are a full DNS management service. Neither Cedexis nor 3crowd are quite as feature-complete in cases where you want to host a full domain on them… they’re more aligned with handling certain individual records.

Moving Forward

As you can see, we’ve tried a lot of things on this front. Each system has its own benefits and drawbacks… however, there is a lot of overlap in functionality. We’re in the process of consolidating down to fewer systems. Specifically,

  • Netscaler GSLB is eliminated as of this week!
  • Zeus GLB will be eliminated, likely in Q2/Q3. Migrations will be to Cedexis, non-GLB services, and/or possibly 3crowd.
  • geodns may be eliminated, if we can effectively replace it with Dynect, 3crowd, or Cedexis. Time frame on this is undetermined.

This will leave us with 3 GLB services: Cedexis, 3crowd, and (presuming it passes trials) Dynect. Not quite ideal, but each has its unique strong points that we’re not quite willing to give up just yet. It’s certainly an improvement in any case. In time perhaps we can condense even further…

Percona Live Early Bird Pricing Ends Next Week!

Don’t miss your chance to get the early bird pricing for Percona Live: MySQL Conference & Expo! Early bird pricing ends Monday, March 12th, so you have a week left to get maximum savings. If you want even more savings, register with code PL-pod for 10% off. For example, the early bird pricing for the 2 days of the conference sessions is $595, or if you want all 3 days, the pricing is $795. With 10% off, your price comes to about $535 for 2 days, or about $715 for all 3 days.

I will be doing a tutorial on MySQL Security, including White-hat Google Hacking with MySQL, on Tuesday at 9 am. There is a wonderful lineup of speakers and topics, so be sure to register while you can get the lowest price possible.

Also, Mozilla will be having a booth in the DotOrg Pavilion, so if you have questions about any of our products, like Firefox or Thunderbird, or newer products like Firefox on Mobile or Boot2Gecko, the new mobile hardware platform, stop on by during the conference!

DB Friday, March 2nd edition

Next week I will be in Mountain View to help with the data center move we are doing. My responsibilities will mostly be remote-style work anyway, but it will be nice to see some of my team members in person. So as I close out this week and reflect about what got done in the Mozilla DB world, I also wanted to make clear why I write these. These posts help answer the questions “What does a DBA do?” and “What does Mozilla use databases for?”

This week in Mozilla databases:

  • A database with some tables with a latin1 charset and some tables with a utf8 charset was converted so that all the tables used utf8. This involved exporting using mysqldump --default-character-set=utf8, dropping the table and re-creating it with a utf8 charset, and importing the data.
  • The default configuration for MySQL databases at Mozilla has been changed to set the default character set to utf8, and existing configurations have been changed to set the default charset to utf8. We use puppet to set up and maintain configuration files.
  • A plan to upgrade our Bugzilla databases has been made, approved, and is in progress. And before anyone asks – yes, it is fully my intention to work on query optimization so that future versions of Bugzilla will have awesome queries; however, there is plenty of internal work to do at Mozilla first!
  • Files were generated with the general log and prepared with pt-log-player for benchmarking a system with SSDs.
  • In preparation for our data center move, we promoted a machine in another data center to be the master for our download.mozilla.org service.
  • We worked with our Infrastructure Security team gathering data on a security issue.
  • Created a new slave for a development web environment
  • Currently our physical backups are cold backups where we shut down our backup instance of MySQL and copy files. We have started to work on implementing Xtrabackup for these cases. This is a more long-term project, as backups work fine for now, but every step on the way to using Xtrabackup is a good one. We also use mysqldump for logical backups.
  • Created new databases for the ci.mozilla.org service
  • Added the custom field “cf_blocking_fennec10” to the Bugzilla database
  • Updated data and schema for the development and staging databases for CaseConductor, a litmus replacement for the QA team
    Last week’s theme was:
  • There were some tweaks to be made with scripts that use the backups for ETL and updating data for development environments so they would function properly.
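
For the latin1-to-utf8 conversion in the first item above, it helps to see what changes at the byte level. This sketch uses Python's codecs rather than MySQL and a made-up sample string; it shows the correct re-encoding and the classic mistake of reading bytes that are already utf8 as if they were latin1, which double-encodes them.

```python
text = "café"

latin1_bytes = text.encode("latin-1")  # b'caf\xe9': one byte for é
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9': two bytes for é

# Correct conversion: decode with the charset the data is really in, then re-encode.
converted = latin1_bytes.decode("latin-1").encode("utf-8")
print(converted == utf8_bytes)  # True

# Mistake: treating utf8 bytes as latin1 and converting again.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")
print(double_encoded.decode("utf-8"))  # cafÃ©
```

Spot-checking a few rows with non-ASCII characters after the import is a cheap way to catch the second case.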

What does pt-show-grants look like?

The OurSQL Podcast did an episode on some of the lesser-known but very useful tools in the Percona Toolkit. pt-show-grants is one of those tools that I use pretty frequently. While the manual page has an explanation of all the features and a few examples, you don’t really see the output, and often you decide whether or not to use a tool based on what it gives you as output.

So here is a small example of an actual command I ran today using pt-show-grants. I wanted to find the grants for a particular user. To do that without pt-show-grants, I’d have to log in to MySQL and run

mysql> SELECT host FROM mysql.user WHERE user='aus4_dev';

And then use that host information in a SHOW GRANTS statement:

mysql> SHOW GRANTS FOR aus4_dev@HOST;

But I would have to do this for each HOST – if there were 2 hosts, I’d have to run the SHOW GRANTS command twice.
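
In other words, the manual route is one SHOW GRANTS statement per row returned from mysql.user. A tiny sketch of that loop follows; the helper names are hypothetical, the host list stands in for the result of the SELECT above, and names containing quotes or backslashes are rejected rather than escaped to keep the example short.

```python
from typing import Iterable, List


def _quote(name: str) -> str:
    """Wrap a user or host name in single quotes (no escaping attempted)."""
    if "'" in name or "\\" in name:
        raise ValueError("unsupported character in name: %r" % name)
    return "'" + name + "'"


def show_grants_statements(user: str, hosts: Iterable[str]) -> List[str]:
    """Build one SHOW GRANTS statement per host for the given username."""
    return [
        "SHOW GRANTS FOR %s@%s;" % (_quote(user), _quote(host))
        for host in hosts
    ]


# Hosts as they might come back from the mysql.user query.
for statement in show_grants_statements("aus4_dev", ["10.0.0.1", "10.0.0.2"]):
    print(statement)
```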

Happily, pt-show-grants has an option called --only, which will show you all user@host combinations for the username you specify. I have login information stored in a .my.cnf on this particular dev machine, and except for the password and host, this is an exact copy/paste of what I typed and the output:

[scabral@dev1.db ~]$ bin/pt-show-grants --only aus4_dev
-- Grants dumped by pt-show-grants
-- Dumped from server Localhost via UNIX socket, MySQL 5.1.52-log at 2012-03-01 08:52:01
-- Grants for 'aus4_dev'@'10.0.0.1'
GRANT USAGE ON *.* TO 'aus4_dev'@'10.0.0.1' IDENTIFIED BY PASSWORD '*1234567890ABCDEF1234567890ABCDEF12345678';
GRANT ALL PRIVILEGES ON `aus4_dev`.* TO 'aus4_dev'@'10.0.0.1';
-- Grants for 'aus4_dev'@'10.0.0.2'
GRANT USAGE ON *.* TO 'aus4_dev'@'10.0.0.2' IDENTIFIED BY PASSWORD '*1234567890ABCDEF1234567890ABCDEF12345678';
GRANT ALL PRIVILEGES ON `aus4_dev`.* TO 'aus4_dev'@'10.0.0.2';

By default, if I did not put in the --only, it would show me all the users that I was allowed to see. There is also an --ignore option, so if you want to show all users except a particular username, you can do that as well.

Being able to find all user@host users and their grants given a particular username is very handy and eliminates the need to go into the database to find the hostnames.