Sheriffing@Mozilla – Sheriffing and Backouts

April 3rd, 2017 by cbook

Hi,

Keeping the code trees [1] green (meaning free of build or test failures,
regressions, and minimizing intermittent test failures) is the daily
goal of sheriffing.

To reach this goal, we sometimes have to back out (revert) changes made
by developers. While this is part of our job, we don’t do it lightly or
without reason.

Backouts happen mostly for:
-> Bustage (i.e. Firefox no longer
successfully builds)
-> Test failures caused by a specific change
-> Issues reported by the community, like startup crashes or severe
regressions (these backouts often lead to new nightly builds being
created as well)
-> Performance regressions or memory leaks
-> Issues that block merges, such as merge conflicts (for example, during a mozilla-inbound to mozilla-central merge)

For our primary integration repositories (where our developers land most
of their changes), our workflow depends on which repository the problem
is on.

Mozilla-Inbound

-> Close Mozilla-Inbound if needed (preventing
developers from landing any further changes until the problem is
resolved)

-> Try to notify the responsible developer so that they
are aware of the problem caused by their patch

-> If possible, we
accept follow-up patches to fix the problem. This allows us to fail
forward and avoid running extra jobs that require more CPU time and
therefore increase costs.

-> If we don’t get a response from the developer within a short
timeframe (around 5 minutes), we back out the change and comment in the
bug with a reason for the backout (for example, including a link to the
failure log) and a needinfo to the assignee, to make sure the bug doesn’t get lost.

Autoland

-> Changesets that cause problems are backed out immediately –
no follow-ups as described above are possible (only the sheriffs can push manually to
autoland)
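
The two workflows above can be summarized as a small decision function. This is only an illustrative sketch of the rules described in this post, not a real sheriffing tool: the function name, the parameters, and the `Action` enum are made up for the example.

```python
from enum import Enum


class Action(Enum):
    ACCEPT_FOLLOW_UP = "accept a follow-up patch"
    BACK_OUT = "back out the change"


def sheriff_action(repository: str, developer_responded: bool,
                   follow_up_possible: bool) -> Action:
    """Pick the next step for a changeset that broke the tree (hypothetical helper)."""
    # On autoland only sheriffs can push manually, so failing forward is not an option.
    if repository == "autoland":
        return Action.BACK_OUT
    # On mozilla-inbound a timely response plus a viable fix lets us fail forward.
    if repository == "mozilla-inbound" and developer_responded and follow_up_possible:
        return Action.ACCEPT_FOLLOW_UP
    # No response within the short timeframe, or no viable fix: back out.
    return Action.BACK_OUT


if __name__ == "__main__":
    print(sheriff_action("mozilla-inbound", True, True).value)
    print(sheriff_action("autoland", True, True).value)
```

In practice there is of course human judgment involved (for example, whether the tree needs to be closed first), which a sketch like this leaves out.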

In any case, backouts are never meant to be personal and it’s part of
our job to try our best to keep our trees open for developers. We also
try to provide as much information as possible in the bug for why we
backed out a change.

Of course, we also make mistakes, and we may sometimes back out
changesets that were innocent (for example, when it’s not 100% clear
what caused the problem), but we try our best.

If you have feedback or ideas on how we can make things better, let me know.

Cheers,
– Tomcat

 

[1] Trees: The tree contains the source code as well as the code required to build each project on supported platforms (Linux, Windows, macOS, etc) and tests for various areas. Sheriffs take care of Firefox Code Trees like mozilla-central, mozilla-inbound, autoland, mozilla-aurora, mozilla-beta and mozilla-esr45/52 – our primary tool is treeherder and can be found here

Sheriffing @ Mozilla – Sheriffing a Community Task!

February 27th, 2017 by cbook
Hi,
I was recently asked whether volunteers can help with sheriffing!
And the answer is very simple: of course you can, and you are very welcome!
As in every part of Mozilla, volunteers are very important. Our team is a mix of full-time employees and volunteers.
What is needed to join as a Community Sheriff:
I think there are basically 3 things you need in order to participate as a Community Sheriff:
-> Communication skills and teamwork – sheriffing means a lot of communication: with the other sheriff teams, with developers, and with teams like Taskcluster and Release Engineering.
-> Background knowledge of how Bugzilla works (commenting in bugs, resolving bugs, setting flags, etc.)
-> The ability to see context and relationships between failures (like the relation of a set of failures to a checkin) in order to identify the changeset that caused the regression.
All our tools are publicly accessible, and you don’t need any specific access rights to get started.
Our main tool is Treeherder (https://treeherder.mozilla.org), and the first task a Community Sheriff could do is to watch Treeherder and star failures.
We have described this task here: https://mzl.la/2l2T7NJ
That would help us a lot!
If you are curious what a day in sheriffing looks like, then maybe https://blog.mozilla.org/tomcat/2015/07/03/a-day-in-sheriffing/ can help 🙂
Please let us know if you are interested in becoming a Sheriff! You can find us on irc.mozilla.org in the #sheriffs channel!
Cheers,
-Tomcat

Sheriffing @ Mozilla – checkin-needed

January 26th, 2017 by cbook
Hi,
Working as a Sheriff @ Mozilla is much more than just monitoring our trees and doing things like backouts. In 2017 I want to blog more about what we do, and here is:
Part 1 – Checkin-needed
A lot of checkins land every day on the Mozilla repositories. Some are great new features and improvements, and some are fixes for existing bugs.
While a lot of checkins are done by the developers themselves, sheriffs are also involved.
We not only monitor the Mozilla repositories (aka the tree), we also do checkins for people who don’t have the appropriate level of permission to check in changes (for example, new community members).
In the past, developers also used checkin-needed to reduce load on our build systems through fewer pushes, but with a more robust build system this isn’t relevant anymore.
So checkin-needed is more and more important for developers without the access level needed to do commits and, as mentioned, for new community members who have, for example, just finished their first patch.
To request a checkin, people set the checkin-needed keyword in Bugzilla, or use the [checkin-needed-beta] or [checkin-needed-aurora] whiteboard entry for the patches in a bug.
For me personally, checkin-needed is a very important task because you sometimes check in a patch from someone who has just started to contribute to Mozilla. You are one of the first contacts for the new contributor, and you help them get the patch landed. That’s also a good opportunity to say “thanks for contributing to Mozilla” to the new community member, which is great motivation and recognition!
How we work:

We have a wiki page with a bug query and some basic information for the Sheriff on Duty [1]. We use this query to get an overview of which bugs need checkins.
For bugs with an attached patch that is not on MozReview, we check that the checkin-needed request:
    -> Has proper review before doing anything else
    -> Has a successful try run to avoid any bustage on checkin
and then land the patch on mozilla-inbound.
For bugs with patches in MozReview, we use the autoland tool to do the checkins.
However, we still check that the bug has review and check the try run.
This is the preferred way of handling checkin-needed requests, since autoland is an automated system.
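
As a rough illustration, the checks described above boil down to something like the following. This is a hypothetical sketch only: `CheckinRequest`, its fields, and `landing_route` are names invented for this example and do not correspond to a real Bugzilla or MozReview API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckinRequest:
    """Hypothetical summary of what a sheriff looks at in a checkin-needed bug."""
    has_review: bool      # the patch has proper review
    try_run_green: bool   # a successful try run is linked in the bug
    on_mozreview: bool    # the patch lives in MozReview


def landing_route(request: CheckinRequest) -> Optional[str]:
    """Return where the patch would land, or None if it is not ready yet."""
    # Review is checked before doing anything else.
    if not request.has_review:
        return None
    # A failing or missing try run risks bustage on checkin.
    if not request.try_run_green:
        return None
    # MozReview patches go through the autoland tool, everything else
    # is landed manually on mozilla-inbound.
    return "autoland" if request.on_mozreview else "mozilla-inbound"


if __name__ == "__main__":
    print(landing_route(CheckinRequest(True, True, on_mozreview=True)))
    print(landing_route(CheckinRequest(True, False, on_mozreview=False)))
```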
How you can help:
  -> Use MozReview with Autoland – it helps us do more checkins in less time thanks to the automated tasks. Please make sure that there are no open issues in MozReview when you request checkin-needed. In fact, you can land the patches yourself with Autoland. In the future, checkin-needed will only be allowed on security bugs.
  -> Make sure that you have a passing try run. It is a waste of the sheriffs’ time to look at a checkin-needed request only to find failures in the try run.
  -> When you have multiple patches that need to land in a specific order, please add a comment in the bug with the correct order.
  -> When there are dependencies on other bugs, please state this in the bug.
We try to do checkin-needed checks and checkins several times a day, depending on sheriff workload, so we cannot guarantee a turnaround time, but we try to do our best.
If you have feedback, suggestions, or ideas on how to do this task better, let us know anytime!
Also, like every part of the Mozilla project, we depend on community members like you! So if you are interested in becoming a Community Sheriff, let me know!
Cheers,
– Tomcat
[1] https://wiki.mozilla.org/Sheriffing/How:To:Landing_checkin-needed_patches

Please take part in the Sheriff Survey!

June 8th, 2016 by cbook

Hi,

When we moved to the “inbound” model of tree management, the Tree Sheriffs became a crucial part of our engineering infrastructure. The primary responsibility of the Sheriffs is and will always be to aid developers to easily, quickly, and seamlessly land their code in the proper location(s) and ensure that code does not break our automated tests.

But of course there is always room for improvement and for ideas on how we can make things better. In order to get a picture from our Community (YOU!) of how things went and how we can improve our day-to-day work, we created a Survey!

You can find the Survey here:

Thanks for taking part in this survey!

You can also find some of us in London during the Mozilla All-Hands if you want to talk to us directly!

Cheers,

– Tomcat

Sheriff Newsletter for January 2016

February 5th, 2016 by cbook
Hi,
To give a little insight into our work and make it more visible to our Community, we decided to create a monthly report of what’s going on in the Sheriffs Team.
If you have questions or feedback, just let us know!
In case you don’t know who the sheriffs are, or to check if there are current issues on the tree, see:
https://public.etherpad-mozilla.org/p/sheriffing-notes
Topics of this month!
1. How-To article of the month
2. Get involved
3. Statistics for January
4. Orange Factor
5. Contact
1. How-To article of the month and notable things!
-> In the Sheriff Newsletter we mentioned the “Orange Factor”, but what is it? It is simply the ratio of oranges (test failures) to test runs. The ideal value is, of course, zero.

Practically, this is virtually impossible for a code base of any substantial size, so it is a matter of policy as to what is an acceptable orange factor.

It is worth noting that the overall orange factor indicates nothing about the severity of the oranges. [4]

The main site where you can check out the “Orange Factor” is https://brasstacks.mozilla.com/orangefactor/ and some interesting information is available at https://wiki.mozilla.org/Auto-tools/Projects/OrangeFactor
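
To make the ratio concrete, here is a minimal sketch of the calculation as described above. It only illustrates the definition; it is not how the OrangeFactor site itself is implemented, and the function name and the sample numbers are made up.

```python
def orange_factor(oranges: int, test_runs: int) -> float:
    """Ratio of oranges (test failures) to test runs, per the definition above."""
    if test_runs <= 0:
        raise ValueError("test_runs must be positive")
    return oranges / test_runs


if __name__ == "__main__":
    # Made-up numbers, purely for illustration: 30 oranges over 10 test runs.
    print(orange_factor(30, 10))
    # The ideal case: no oranges at all.
    print(orange_factor(0, 10))
```

Note that the value says nothing about how severe each orange is, only how many there were relative to the number of runs.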
-> As you might be aware, Firefox OS has moved into Tier 3 support [5] – this means that there is no longer any Sheriff support for the b2g-inbound tree.

With the move to Tier 3, the b2g tests have also moved to Tier 3, and these tests are “hidden” by default on Treeherder. To view these test results on Treeherder, for example for mozilla-central, you need to click on the “show/hide excluded jobs” checkbox in the tree view.

2. Get involved!
Are you interested in helping out by becoming a Community Sheriff? Let us know!
3. Statistics
Intermittent bugs filed in January [1]: 667
Of those, closed: 107 [2]
For Tree Closing times and reasons see:
http://futurama.theautomatedtester.co.uk/
4. Orange Factor
Current Orangefactor [3]: 12.92
5. How to contact us
There are a lot of ways to contact us. The fastest one is to contact
the sheriff on duty (the one with the |sheriffduty tag on their nick
🙂) or to email sheriffs @ mozilla dot org.
Cheers,

– Tomcat on behalf of the Sheriffs
[1] http://mzl.la/1NOD4Zz
[2] http://mzl.la/1mas9TS
[3] https://brasstacks.mozilla.com/orangefactor/?display=OrangeFactor&endday=2016-02-01&startday=2016-01-01&tree=trunk
[4] http://people.mozilla.org/~mcote/war-on-orange/war-on-orange-paper-testistanbul.pdf
[5] https://groups.google.com/d/msg/mozilla.dev.platform/gF-kiJV21ro/qJRk1B-KAAAJ

7 Years at Mozilla!

July 3rd, 2015 by cbook

Hi,

Since last month I have been at Mozilla for 7 years as a full-time employee \o/

Of course I have been around longer, because I started as a Community Member in QA years before. And it’s been a long way from my first steps in QA to my current role as Code Sheriff @ Mozilla.

I never actively planned to join the Mozilla Community, it just happened 🙂 Back in 2001 I worked at a German email provider as a 2nd level support engineer, and as part of my job (and also to support customers) we used different email programs to find out how to set them up and so on.

Some friends who were already involved in open source (some Linux fans) pointed me to this Mozilla program (at that time M1 or so), and I liked the idea of this “Nightly”. Having a brand new program every day was something really cool, and so my way into the Community started without me even knowing that I was now part of it.

So, after years with Mozilla, I finally filed my first bug and was scared like hell (all these new fields in a non-native language), not really knowing what I had signed up for when I clicked this “submit” button in Bugzilla 🙂 (I was not even sure if I was NOW supposed to fix the bug 🙂)

And now I file dozens of bugs every day while on sheriff duty or doing other tasks 🙂

I learned a lot over the last years, and I still love being part of Mozilla. It’s the best place to work for me! So on to the next years at Mozilla!

– Tomcat

a day in sheriffing

July 3rd, 2015 by cbook

Hi,

since I have talked with a lot of people about sheriffing and what we do, here is what a typical day looks like for me:

We care about the code trees (test failures etc.)

I usually start the day by checking the trees we are responsible for, looking for test failures in Treeherder. This first gives me an overview of the current status and also makes sure that everything is OK for the Asian and European communities, which are online at that time.

This task is ongoing until the end of my duty shift. From time to time this means we have to do backouts for code/test regressions.
Besides this I handle checkin-needed requests, uplifts, and other tasks, and of course I am always available for questions on IRC 🙂

I was also thinking about some parts of my day-to-day experience:

Backouts and Tree Closures:

Backing out code for test failures, bustage, etc. (and managing the related tree closures) is one important task of sheriffs. It’s always a mixed feeling to back out someone’s work (and no one wants to cause bustage), but it’s important to ensure the quality of our products.

Try Server!!!

Tree closures due to backouts can have the side effect that others are blocked from checking in. So if you are in doubt whether your patch compiles or could cause test regressions, please consider a try run. This helps a lot to keep tree closures for code issues to a minimum.

And last but not least, sheriffing is a Community task! So if you want to be part of the Sheriff Team as a Community Sheriff, please send me a mail at tomcat at mozilla dot com

Thanks!

– Tomcat

Results of the Sheriff Survey

April 1st, 2015 by cbook

Hi,

we closed our Sheriff Survey on Monday, and I wanted to share some highlights from the results. Thanks for taking part in the survey!

1. Communication with the Sheriffs

We got very good and positive feedback about the interaction and communication with the Sheriffs. We know that backouts are never a positive thing, and we sheriffs always assume the best intentions – nobody _wants_ to cause bustage, but it happens.

We also noticed a lot of comments from checkin-needed requesters, along with the hope that at some point we will have an autolander system (one that lands patches automatically). Work is being done on this, for example in https://bugzilla.mozilla.org/show_bug.cgi?id=1128039

 

2. Trychooser and other Feedback

We got comments about trychooser and how it could be improved. That feedback is very valuable, and we will pass it on to the Releng folks. For all feedback and suggestions, we are looking through the survey to see what we can improve and implement. For example, one result is now https://bugzilla.mozilla.org/show_bug.cgi?id=1145836 🙂

 

3. Getting Involved!

Several Community Members expressed interest in helping out with sheriffing! That’s really great, and we will follow up soon. Also, it’s not too late to get involved. Just drop me or the sheriffs list (sheriffs@mozilla.org) a note!

 

4. You can reach us at any time!

While the survey is closed now, you can still contact us anytime with feedback or questions, or if you want to get involved! Just drop us a note at sheriffs@mozilla.org or ping the Sheriff on duty (normally the one with the |sheriffduty tag in #developers on irc.mozilla.org).

Thanks!

 

– Tomcat

First overview from the sheriff survey!

March 24th, 2015 by cbook

Hi,

thanks for all the replies we got for the Sheriff Survey! If you haven’t taken part yet, it’s still online and you can still take part!

While we will close the survey in a few days and I will of course provide a comprehensive overview then, I felt I could already give a quick overview of what we have got so far.

One big takeaway is how important checkin-needed requests are and how many people depend on them. We are very sorry if there are delays in picking up checkin-needed requests, but since it’s a human task, it depends on how much is going on with the trees.

But work is being done on Autoland; see https://wiki.mozilla.org/Auto-tools/Projects/Autoland 🙂

Also, to follow up on 2 concrete things (which you may or may not know):

Question: How do I know why the tree is closed (when we have a tree closure) on Treeherder?

Answer: Just hover over the repo name in Treeherder (for example, mozilla-inbound) or click on the info button right next to the repo name.

Question: When I land something on, say, mozilla-inbound, it’s a mess to manually copy and paste the hg changeset URL into the bug.

Answer: We have a tool called mcmerge. It’s right next to every push in the drop-down arrow action menu and, unlike what the name suggests, it’s not just for marking merges. During the survey we found out that the name is misleading, so we are trying to find a new name – https://bugzilla.mozilla.org/show_bug.cgi?id=1145836

Thanks,

 

– Tomcat

Please take part in the Sheriff Survey

March 17th, 2015 by cbook

Hi,

When we moved to the “inbound” model of tree management, the Tree Sheriffs became a crucial part of our engineering infrastructure. The primary responsibility of the Sheriffs is and will always be to aid developers to easily, quickly, and seamlessly land their code in the proper location(s) and ensure that code does not break our automated tests. In the service of this objective, the Sheriffs work closely with the larger engineering organization to create and enforce landing policies that increase productivity while maintaining an efficient and robust automated testing system. Beyond the policy role, they have also become shepherds of automation quality by monitoring intermittent failures, performing uplifts and merges, and identifying poorly performing automation machines. This role has proven successful, and so a formal module for the Tree Sheriffs in the larger context of the Activities Module was created.

But of course there is always room for improvement and for ideas on how we can make things better. In order to get a picture from our Community of how things went and how we can improve our day-to-day work, we created the Sheriff Survey here -> http://goo.gl/forms/kRXZDtSjSj
Thanks for taking part!

– The Mozilla Tree Sheriffs!