
How does the Glean SDK send gzipped pings

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)

Last week’s blog post: This Week in Glean: mozregression telemetry (part 2) by William Lachance.

All “This Week in Glean” blog posts are listed in the TWiG index (and on the Mozilla Data blog).

In the Glean SDK, when a ping is submitted it gets internally persisted to disk and then queued for upload. The actual upload may happen later on, depending on factors such as the availability of an Internet connection or throttling. To save users’ bandwidth and reduce the costs to move bytes within our pipeline, we recently introduced gzip compression for outgoing pings.

This article goes through some details of our upload system and what it took to enable ping compression.

How does ping uploading work?

Within the Glean SDK, the glean-core Rust component does not provide any specific implementation to perform the upload of pings. This means that either the language bindings (e.g. Glean APIs for Android in Kotlin) or the product itself (e.g. Fenix) have to provide a way to transport data from the client to the telemetry endpoint.

Before our recent changes (by Beatriz Rizental and Jan-Erik) to the ping upload system, the language bindings needed to understand the format with which pings were persisted to disk in order to read and finally upload them. This is not the case anymore: glean-core will provide language bindings with the headers and the data (ping payload!) of the request they need to upload.
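Conceptually, a language binding now only needs to ask glean-core for the next upload task and perform the HTTP request it is handed. The following sketch illustrates that contract; the type and function names here (`UploadRequest`, `UploadTask`, `get_upload_task`) are illustrative stand-ins, not the exact glean-core API:

```rust
use std::collections::VecDeque;

// Illustrative shape of an upload request handed to a language binding
// (not the exact glean-core API).
struct UploadRequest {
    path: String,
    body: Vec<u8>,
    headers: Vec<(String, String)>,
}

enum UploadTask {
    Upload(UploadRequest),
    Done,
}

// Toy stand-in for glean-core's internal ping queue.
struct Core {
    queue: VecDeque<UploadRequest>,
}

impl Core {
    fn get_upload_task(&mut self) -> UploadTask {
        match self.queue.pop_front() {
            Some(req) => UploadTask::Upload(req),
            None => UploadTask::Done,
        }
    }
}

fn main() {
    let mut core = Core {
        queue: VecDeque::from(vec![UploadRequest {
            path: "/submit/org/app/baseline/1/some-doc-id".into(),
            body: b"{}".to_vec(),
            headers: vec![("Content-Type".into(), "application/json; charset=utf-8".into())],
        }]),
    };

    // The binding just loops, asking the core for work; it never needs
    // to know how pings are persisted on disk.
    let mut uploaded = 0;
    while let UploadTask::Upload(req) = core.get_upload_task() {
        // A real binding would perform an HTTP POST here using req.headers.
        let _ = &req.headers;
        println!("POST {} ({} bytes)", req.path, req.body.len());
        uploaded += 1;
    }
    assert_eq!(uploaded, 1);
}
```

The point of the design is visible in the loop: the binding receives ready-made headers and body bytes, so all ping-format knowledge stays in the Rust core.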

The new upload API empowers the SDK to provide a single place in which to compress the payload to be uploaded: glean-core, right before serving upload requests to the language bindings.

gzipping: the implementation details

The implementation of the function to compress the payload is trivial, thanks to the `flate2` Rust crate:

/// Attempt to gzip the provided ping content.
fn gzip_content(path: &str, content: &[u8]) -> Option<Vec<u8>> {
    let mut gzipper = GzEncoder::new(Vec::new(), Compression::default());

    // Attempt to add the content to the gzipper.
    if let Err(e) = gzipper.write_all(content) {
        log::error!("Failed to write to the gzipper: {} - {:?}", path, e);
        return None;
    }

    // Attempt to finish the gzipping process.
    gzipper.finish().ok()
}
And an even simpler way to use it to compress the body of outgoing requests:

pub fn new(document_id: &str, path: &str, body: JsonValue) -> Self {
    let original_as_string = body.to_string();
    let gzipped_content = Self::gzip_content(path, original_as_string.as_bytes());
    let add_gzip_header = gzipped_content.is_some();
    let body = gzipped_content.unwrap_or_else(|| original_as_string.into_bytes());
    let body_len = body.len();

    Self {
        document_id: document_id.into(),
        path: path.into(),
        headers: Self::create_request_headers(add_gzip_header, body_len),
        body,
    }
}

What’s next?

The new upload mechanism and its compression improvement are currently only available in the Android and iOS Glean SDK language bindings. Our next step (currently in progress!) is to bring the newer APIs to the Python bindings as well, moving the complexity of handling the upload process into the shared Rust core.

In the future, the new upload mechanism will additionally provide a flexible constraint-based scheduler (e.g. “send at most 10 pings per hour”), along with pre-defined rules for products to use.
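To give an idea of what a constraint like “at most 10 pings per hour” could mean in code, here is a toy fixed-window rate limiter. This is purely hypothetical; the actual scheduler design is still in progress and may look nothing like this:

```rust
use std::time::{Duration, Instant};

// Toy fixed-window rate limiter illustrating a "10 pings per hour"
// style constraint. Purely hypothetical, not the Glean scheduler.
struct RateLimiter {
    window: Duration,
    max_count: u32,
    window_start: Instant,
    count: u32,
}

impl RateLimiter {
    fn new(window: Duration, max_count: u32) -> Self {
        Self { window, max_count, window_start: Instant::now(), count: 0 }
    }

    /// Returns true if one more upload is allowed right now.
    fn allow(&mut self) -> bool {
        let now = Instant::now();
        if now.duration_since(self.window_start) >= self.window {
            // A new window has started: reset the counter.
            self.window_start = now;
            self.count = 0;
        }
        if self.count < self.max_count {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut limiter = RateLimiter::new(Duration::from_secs(60 * 60), 10);
    // Only the first 10 of 15 attempts pass within the same window.
    let allowed = (0..15).filter(|_| limiter.allow()).count();
    assert_eq!(allowed, 10);
    println!("allowed {} of 15", allowed);
}
```

A real scheduler would combine several such constraints with product-specific rules, but the core idea is the same: the upload loop asks a policy object whether the next request may go out now or must wait.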