{"id":196,"date":"2020-05-26T09:13:34","date_gmt":"2020-05-26T09:13:34","guid":{"rendered":"https:\/\/blog.mozilla.org\/data\/?p=196"},"modified":"2020-05-26T09:13:34","modified_gmt":"2020-05-26T09:13:34","slug":"how-does-the-glean-sdk-send-gzipped-pings","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/data\/2020\/05\/26\/how-does-the-glean-sdk-send-gzipped-pings\/","title":{"rendered":"How does the Glean SDK send gzipped pings"},"content":{"rendered":"<p>(\u201cThis Week in Glean\u201d is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)<\/p>\n<p>Last week\u2019s blog post: <a href=\"https:\/\/blog.mozilla.org\/data\/2020\/05\/08\/this-week-in-glean-mozregression-telemetry-part-2\/\">This Week in Glean: mozregression telemetry (part 2)<\/a> by William Lachance.<\/p>\n<p>All \u201cThis Week in Glean\u201d blog posts are listed in the <a href=\"https:\/\/mozilla.github.io\/glean\/book\/appendix\/twig.html\">TWiG index<\/a> (and on the <a href=\"https:\/\/blog.mozilla.org\/data\/category\/glean\/\">Mozilla Data blog<\/a>).<\/p>\n<p>In the Glean SDK, when a ping is <a href=\"https:\/\/mozilla.github.io\/glean\/book\/appendix\/glossary.html#submission\">submitted<\/a>, it gets internally persisted to disk and then queued for upload. The actual upload may happen later on, depending on factors such as the availability of an Internet connection or throttling. 
To save users\u2019 bandwidth and reduce the cost of moving bytes within our pipeline, we recently introduced <a href=\"https:\/\/en.wikipedia.org\/wiki\/Gzip\">gzip<\/a> compression for outgoing pings.<\/p>\n<p>This article goes through some details of our upload system and what it took to enable ping compression.<\/p>\n<p><strong>How does ping uploading work?<\/strong><\/p>\n<p>Within the Glean SDK, the glean-core Rust component does not provide any specific implementation to perform the upload of <a href=\"https:\/\/mozilla.github.io\/glean\/book\/user\/pings\/index.html\">pings<\/a>. This means that either the language bindings (e.g. Glean APIs for Android in Kotlin) or the product itself (e.g. Fenix) has to provide a way to transport data from the client to the telemetry endpoint.<\/p>\n<p>Before our <a href=\"https:\/\/github.com\/mozilla\/glean\/pull\/872\">recent changes<\/a> (by Beatriz Rizental and Jan-Erik) to the ping upload system, the language bindings needed to understand the format in which pings were persisted to disk in order to read and finally upload them. This is not the case anymore: glean-core will provide language bindings with the headers and the data (ping payload!) 
of the request they need to upload.<\/p>\n<p>The new upload API empowers the SDK to provide a single place in which to compress the payload to be uploaded: glean-core, right before serving upload requests to the language bindings.<\/p>\n<p><strong>gzipping: the implementation details<\/strong><\/p>\n<p>The <a href=\"https:\/\/searchfox.org\/glean\/rev\/c8bb6ebf7a4b7dba43c53ec95326dd0efe062e64\/glean-core\/src\/upload\/request.rs#64\">implementation<\/a> of the function to compress the payload is trivial, thanks to the `flate2` Rust crate:<\/p>\n<blockquote><p><code>\/\/\/ Attempt to gzip the provided ping content.<br \/>\nfn gzip_content(path: &amp;str, content: &amp;[u8]) -&gt; Option&lt;Vec&lt;u8&gt;&gt; {<br \/>\nlet mut gzipper = GzEncoder::new(Vec::new(), Compression::default());<br \/>\n\/\/ Attempt to add the content to the gzipper.<br \/>\nif let Err(e) = gzipper.write_all(content) {<br \/>\nlog::error!(\"Failed to write to the gzipper: {} - {:?}\", path, e);<br \/>\nreturn None;<br \/>\n}<br \/>\ngzipper.finish().ok()<br \/>\n}<\/code><\/p><\/blockquote>\n<p>And an <a href=\"https:\/\/searchfox.org\/glean\/rev\/c8bb6ebf7a4b7dba43c53ec95326dd0efe062e64\/glean-core\/src\/upload\/request.rs#40-42,48\">even simpler way<\/a> to use it to compress the body of outgoing requests:<\/p>\n<blockquote><p><code>pub fn new(document_id: &amp;str, path: &amp;str, body: JsonValue) -&gt; Self {<br \/>\nlet original_as_string = body.to_string();<br \/>\nlet gzipped_content = Self::gzip_content(path, original_as_string.as_bytes());<br \/>\nlet add_gzip_header = gzipped_content.is_some();<br \/>\nlet body = gzipped_content.unwrap_or_else(|| original_as_string.into_bytes());<br \/>\nlet body_len = body.len();<br \/>\nSelf {<br \/>\ndocument_id: document_id.into(),<br \/>\npath: path.into(),<br \/>\nbody,<br \/>\nheaders: Self::create_request_headers(add_gzip_header, body_len),<br \/>\n}<br \/>\n}<\/code><\/p><\/blockquote>\n<p><strong>What\u2019s next?<\/strong><\/p>\n<p>The 
new upload mechanism and its compression improvement are currently available only for the iOS and Android Glean SDK language bindings. Our next step (currently <a href=\"https:\/\/github.com\/mozilla\/glean\/pull\/886\">in progress<\/a>!) is to add the newer APIs to the Python bindings as well, moving the complexity of handling the upload process to the shared Rust core.<\/p>\n<p>In the future, the new upload mechanism will also provide a flexible constraint-based scheduler (e.g. \u201csend at most 10 pings per hour\u201d) alongside pre-defined rules for products to use.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(\u201cThis Week in Glean\u201d is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release &hellip; <a class=\"go\" href=\"https:\/\/blog.mozilla.org\/data\/2020\/05\/26\/how-does-the-glean-sdk-send-gzipped-pings\/\">Read more<\/a><\/p>\n","protected":false},"author":1538,"featured_media":197,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[448297],"coauthors":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/posts\/196"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/users\/1538"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/comments?post=196"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/posts\/196\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/media\/197"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/media?parent=196"}],"wp:term":[{"taxonomy":"c
ategory","embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/categories?post=196"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/tags?post=196"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/coauthors?post=196"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}