Hi,
Keeping the code trees [1] green (meaning free of build or test failures,
regressions, and minimizing intermittent test failures) is the daily
goal of sheriffing.
In order to reach this goal, this means we sometimes have to back out (revert)
changes made by developers. While this is a part of our job, we don’t do
it easily or without reason.
Backouts happen mostly for:
-> Bustage (i.e. Firefox no longer
successfully builds)
-> Test failures caused by a specific change
-> Issues reported by the community, like startup crashes or severe
regressions (these backouts often lead to new nightly builds being
created as well)
-> Performance regressions or memory leaks
-> Issues that block merges like merge-conflicts (like for a mozilla-inbound to mozilla-central merge)
For our primary integration repositories (where our developers land most
their changes), our workflow depends on which repository the problem is
on.
Mozilla-Inbound
-> Close Mozilla-Inbound if needed (preventing
developers from landing any further changes until the problem is
resolved)
-> Try to notify the responsible developer so that they
are aware of the problem caused by their patch
-> If possible, we
accept follow-up patches to fix the problem. This allows us to fail
forward and avoid running extra jobs that require more CPU time and
therefore increase costs.
-> If we don’t get response from the developer within a short
timeframe like 5 minutes, we back out the change and comment in the
bug with a reason for the backout (for example, including a link to the
failure log) and a needinfo to the assigne, to make sure the bug don’t get lost.
Autoland
-> Changesets that cause problems are backed out immediately –
no follow-ups as described above are possible (only the sheriffs can push manually to
autoland)
In any case, backouts are never meant to be personal and it’s part of
our job to try our best to keep our trees open for developers. We also
try to provide as much information as possible in the bug for why we
backed out a change.
Of course, we also make mistakes and it could be that we backed out
changesets that were innocent (like in a case where its not 100% clear
what caused the problem), but we try our best.
If you feedback or ideas how we can make things better, let me know.
Cheers,
– Tomcat
[1] Trees: The tree contains the source code as well as the code required to build each project on supported platforms (Linux, Windows, macOS, etc) and tests for various areas. Sheriffs take care of Firefox Code Trees like mozilla-central, mozilla-inbound, autoland, mozilla-aurora, mozilla-beta and mozilla-esr45/52 – our primary tool is treeherder and can be found here