Categories: Data Engineering

This Week in Glean: Reducing Release Friction

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)

One thing I feel less confident about is the build and release process behind the software components I have been working on recently.  That’s why I was excited to take on prototyping a “Better Build”: it was a chance to get a better understanding of how the build and release process works and hopefully make it a little better along the way.  What is a “Better Build”?  Well, that’s what we have been calling the investigation into how to reduce the overall pain of releasing our Rust-based components to consumers on Android, iOS, and desktop platforms.

Getting changes out the door from a Rust component like Glean all the way into Firefox for iOS is non-trivial right now and requires multiple steps in multiple repositories, each of which has its own procedures and ownership.  Glean in Firefox for iOS currently ships via the Application Services iOS megazord, mostly because that allows us to compile most of the Rust code together and so reduce the size impact on the app.  That means that if we need to ship a bug fix in Glean on iOS, we need to:

  • Create a PR that fixes the bug in the Glean repo, get it reviewed, and land it.  This requires a Glean team member’s review to land.
  • Cut a release of Glean with the update, which also requires a Glean team member’s review.
  • Open a PR in the Application Services repository, updating Glean (which is pulled in as a git submodule), get it reviewed, and land it.  This requires an Application Services team member for review, so now we have another team pulled into the mix.
  • Now we need a new release of the appservices megazord, which means a new release must be cut for the Application Services repo, again requiring the involvement of 1-2 members of that team.
  • Not done yet: now we go to Firefox for iOS, where we can finally update the dependency on Glean to get the fix!  This PR will require product team review since it is in their repo.
  • Oh wait…  there were other breaking changes in Application Services that we found as a side effect of shipping this update that we have to fix…   *sigh*

That’s a process that can take multiple days to accomplish and requires the involvement of multiple members of multiple teams.  Even then, we can run into unexpected hiccups that slow the process down, like breaking changes in the other components that we have bundled together.  Getting things into Fenix isn’t much easier, especially because there is yet another repository and release process involved with Android Components in the mix.

This creates a situation where we hold back on the frequency of releases and try to bundle as many fixes and changes as possible to reduce the number of times we have to subject ourselves to the release process.  This same situation makes errors and bugs harder to find because, once they have been introduced into a component, it may be days or weeks before they show up.  Once the errors do show up, we hope that they register as test failures and get caught before going live, but sometimes we only see the results in crash reports or in data analysis.  It is then not a simple task to work out where to look when there is a Glean release that’s in an Application Services release that’s in an Android Components release that’s in a Fenix release…  all of which have different versions.

It might be easier if each of our components were a stand-alone dependency of the consuming application, but our Rust components want and need to call each other.  So there is some interdependence between them which requires us to build them together if we want to take the best advantage of calling things in other crates in Rust.  Building things together also helps to minimize the size impact of the library on consuming applications, which is especially important for mobile.

So how was I going to make any of this part of a “Better Build”?  The first thing I needed to do was to create a new git repository that combined Application Services, Glean, Nimbus, and Uniffi.  There were a couple of different ways to accomplish this, and I chose to go with git submodules as that seemed to be the simplest path to getting everything in one place so I could start trying to build things together.  The first thing that complicated this approach was that Application Services already pulls in Glean and Nimbus as submodules, so I spent some time hacking around removing those so that all I had was the versions in the submodules I had added.  Reflecting on this later, I probably should have just worked off of a fork of Application Services since it basically already had everything I needed in it, just lacking all the things in the Android and iOS builds.

Git submodules didn’t seem to make things too terribly difficult to update, and updating them should be possible to automate as part of a build script.  I do foresee each component repository needing something like a release branch that always tracks the latest release, so that we don’t have to go in and update the tag that the submodule in the Better Build repo points at.  The idea is that the combined repo wouldn’t need to know about the releases or release schedule of the submodules; that responsibility is pushed to each submodule’s original repo, which advertises releases in a standardized way, such as a release branch.  This would allow us to have a regular release schedule for the Better Build that could in turn be picked up by automation in downstream consumers.

Now that I had everything in one place, the next step was to build the Rusty parts together so there was something to link the Android and iOS builds to.  Some of the platform bindings of the components we build have Kotlin and Swift stuff that needs to be packaged on top of the Rust stuff, or at least needs to be repackaged in a format suitable for consumers on the platform.  Let me just say right here, Cargo made this very easy for me to figure out.  It took only a little while to set up the build dependencies.  Since each project already had its own “root” workspace Cargo.toml, I learned that I couldn’t nest workspaces.  Not to fear, I just needed to exclude those directories from my root workspace Cargo.toml and it just worked.  Finally, a few patch directives were needed to ensure that everyone was using the correct local copies of things like viaduct, Uniffi, and Glean.  After a few tweaks, I was able to build all the Rust components in under 2 minutes from cargo build to done.
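
To make the workspace and patch setup a bit more concrete, here is a rough sketch of what such a root Cargo.toml could look like.  This is not the actual manifest from the prototype: the directory names, the “megazord” member crate, the paths inside each submodule, and the choice of which crates get patched are all assumptions made for illustration.

```toml
# Hypothetical root Cargo.toml for the combined repo (illustrative only).
[workspace]
# Assumed: a single crate in this repo that depends on the components.
members = ["megazord"]
# Each submodule has its own workspace root, and Cargo does not support
# nested workspaces, so those directories are excluded from this one.
exclude = [
    "application-services",
    "glean",
    "uniffi-rs",
]

# Point every crate that depends on these at the local submodule copies.
# Crate names and paths below are assumptions about the repo layouts.
[patch.crates-io]
glean-core = { path = "glean/glean-core" }
uniffi = { path = "uniffi-rs/uniffi" }

# A crate that is pulled in as a git dependency is patched by its source URL.
[patch."https://github.com/mozilla/application-services"]
viaduct = { path = "application-services/components/viaduct" }
```

One thing to keep in mind with patch directives is that the local copy’s version still has to satisfy the version requirement of whatever depends on it, otherwise Cargo will not use the patch.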

Armed with these newly built libs, I next set off to tackle an Android build using Gradle.  I had the most prior art to see how to do this, so I figured it wouldn’t be too terrible.  Instead, it was here that I ran into a bit of a brick wall.  My first approach was to try and build everything as subprojects of the new repo, but unfortunately there were a lot of references to rootProject that meant “the root project I think I am in, not this new root project”, and so I found myself changing more and more build.gradle files embedded in the components.

After struggling with this for a day or so, I switched to trying a composite build of the Android bits of all the components.  This allowed me to at least build, once I had everything set up right.  It was also at this point that I realized that having the embedded submodules for Nimbus and Glean inside of Application Services was causing me some problems, and so I ended up dropping Nimbus from the base Better Build repo and just using the one pulled into Application Services.  Once I had done this, the Gradle composite build was just a matter of including the Glean build and the Application Services build in the settings.gradle file.  Along with a simple build.gradle file, this let me build a JAR file which appeared to have all the right things in it and was approximately the size I would expect when combining everything.  I was now definitely at the end of my Gradle knowledge, and I wasn’t sure how to set up the publishing to get the AAR file that would be consumed by downstream applications.

I was starting to run out of time in my timebox, so I decided to tinker around with the iOS side of things and see how unforgiving Xcode might be.  Part of the challenge here was that Nimbus didn’t really have iOS bindings yet, whereas we have already shown that this can be done with Application Services and Glean via the iOS megazord, so I started by trying to get Xcode to generate the Uniffi bindings in Swift for Nimbus.  Knowing that a build phase was probably the best bet, I started by writing a script that would invoke uniffi-bindgen with the proper flags to give me the Swift bindings, and then added the output file.  But no matter what I tried, I couldn’t get Xcode to invoke Cargo within a build phase script execution to run uniffi-bindgen.  Since I was now out of time in my investigation, I couldn’t dig any deeper into this, and I hope that it’s just some configuration problem in my local environment.

I took some time to consolidate and share my notes about what I had learned, and I did learn a lot, especially about Cargo and Gradle.  At least I now know that learning more about Gradle would be useful, but I was still disappointed that I couldn’t make it a little further along and try to answer more of the questions about automation, which is ultimately the real key to solving the pain I mentioned earlier.  I was hoping to have a couple of prototype GitHub Actions that I could demo, but I didn’t quite get there without being able to generate the proper artifacts.

The final lesson I learned was that this was definitely outside of my comfort zone.  And you know what?  That was okay.  I identified an area of my knowledge that I wanted to and could improve.  While it was a little scary to dive into something that was both important to the project and the team and something that I wasn’t really sure I could do, there were a lot of people who helped me by answering the questions I had.