{"id":176,"date":"2020-02-28T18:58:22","date_gmt":"2020-02-28T18:58:22","guid":{"rendered":"https:\/\/blog.mozilla.org\/data\/?p=176"},"modified":"2020-03-30T19:00:42","modified_gmt":"2020-03-30T19:00:42","slug":"this-week-in-glean-mozregression-telemetry-part-1","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/data\/2020\/02\/28\/this-week-in-glean-mozregression-telemetry-part-1\/","title":{"rendered":"This Week in Glean: mozregression telemetry (part 1)"},"content":{"rendered":"<p><em>(\u201cThis Week in Glean\u201d is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find <a href=\"https:\/\/mozilla.github.io\/glean\/book\/appendix\/twig.html\">an index of all TWiG posts online.<\/a>)<\/em><\/p>\n<p><em>This is a special guest post by non-Glean-team member William Lachance!<\/em><\/p>\n<p>As I <a href=\"https:\/\/wlach.github.io\/blog\/2019\/09\/mozregression-update-python-3-edition\/\">mentioned last time<\/a> I talked about <a href=\"https:\/\/mozilla.github.io\/mozregression\/\">mozregression<\/a>, I have been thinking about adding some telemetry to the system to better understand how this tool is used and to justify some part of Mozilla spending some cycles maintaining and improving it (assuming my intuition that this tool is heavily used is confirmed).<\/p>\n<p>Coincidentally, the Telemetry client team has been working on a new library called <a href=\"https:\/\/mozilla.github.io\/glean\/book\/index.html\">Glean<\/a> for measuring these types of things in a principled way, and it even has Python bindings! 
Using this has the potential to save a lot of work: not only does Glean provide a framework for submitting data, but our backend systems are also automatically set up to process data submitted via Glean into <a href=\"https:\/\/cloud.google.com\/bigquery\">BigQuery<\/a> tables, which can then easily be queried using tools like <a href=\"https:\/\/docs.telemetry.mozilla.org\/tools\/stmo.html\">sql.telemetry.mozilla.org<\/a>.<\/p>\n<p>I thought it might be useful to go through some of what I\u2019ve been exploring, in case others at Mozilla are interested in instrumenting their pet internal tools or projects. If this effort is successful, I\u2019ll distill these notes into a tutorial in the Glean documentation.<\/p>\n<h2 id=\"initial-steps-defining-pings-and-metrics\">Initial steps: defining pings and metrics<\/h2>\n<p>The initial step in setting up a Glean project of any type is to explicitly define the types of pings and metrics. You can think of a \u201cping\u201d as a small bucket of data submitted by a piece of software in the field. A \u201cmetric\u201d is something we\u2019re measuring and including in a ping.<\/p>\n<p>Most of the Glean documentation focuses on browser-based use-cases where we might want to sample lots of different things on an ongoing basis, but for mozregression our needs are considerably simpler: we just want to know when someone <em>has<\/em> used it along with a small number of non-personally identifiable characteristics of their usage, e.g. the mozregression version number and the name of the application they are bisecting.<\/p>\n<p>Glean has <a href=\"https:\/\/mozilla.github.io\/glean\/book\/user\/pings\/events.html\">the concept of event pings<\/a>, but it seems like those are there more for a fine-grained view of what\u2019s going on during an application\u2019s use. So let\u2019s define a new ping just for ourselves, giving it the unimaginative name \u201cusage\u201d. 
This goes in a file called <code>pings.yaml<\/code>:<\/p>\n<div class=\"brush: yaml\">\n<pre><code>---\r\n$schema: moz:\/\/mozilla.org\/schemas\/glean\/pings\/1-0-0\r\n\r\nusage:\r\n  description: &gt;\r\n    A ping to record usage of mozregression\r\n  include_client_id: true\r\n  notification_emails:\r\n    - wlachance@mozilla.com\r\n  bugs:\r\n    - http:\/\/bugzilla.mozilla.org\/123456789\/\r\n  data_reviews:\r\n    - http:\/\/example.com\/path\/to\/data-review<\/code><\/pre>\n<\/div>\n<p>We also need to define a list of things we want to measure. To start with, let\u2019s just test with one piece of sample information: the app we\u2019re bisecting (e.g. \u201cFirefox\u201d or \u201cGecko View Example\u201d). This goes in a file called <code>metrics.yaml<\/code>:<\/p>\n<div class=\"brush: yaml\">\n<pre><code>---\r\n$schema: moz:\/\/mozilla.org\/schemas\/glean\/metrics\/1-0-0\r\n\r\nusage:\r\n  app:\r\n    type: string\r\n    description: &gt;\r\n      The name of the app being bisected\r\n    notification_emails: \r\n      - wlachance@mozilla.com\r\n    bugs: \r\n      - https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1581647\r\n    data_reviews: \r\n      - http:\/\/example.com\/path\/to\/data-review\r\n    expires: never\r\n    send_in_pings:\r\n      - usage<\/code><\/pre>\n<\/div>\n<p>The <code>data_reviews<\/code> sections in both of the above are obviously bogus: we will need to get an actual data review before landing and using this code, to make sure that we conform to Mozilla\u2019s <a href=\"https:\/\/wiki.mozilla.org\/Firefox\/Data_Collection\">data collection policies<\/a>.<\/p>\n<h2 id=\"testing-it-out\">Testing it out<\/h2>\n<p>In the meantime, we can test our setup with the <a href=\"https:\/\/docs.telemetry.mozilla.org\/concepts\/glean\/debug_ping_view.html\">Glean debug pings viewer<\/a> by setting a special tag (<code>mozregression-test-tag<\/code>) on our output. 
Here\u2019s a small Python script which does just that:<\/p>\n<div class=\"brush: py\">\n<pre><code>from pathlib import Path\r\nfrom glean import Glean, Configuration\r\nfrom glean import (load_metrics,\r\n                   load_pings)\r\n\r\n# Directory under the user's home where Glean keeps its local data\r\nmozregression_path = Path.home() \/ '.mozilla2' \/ 'mozregression'\r\n\r\n# The ping tag routes our test pings to the Glean debug pings viewer\r\nGlean.initialize(\r\n    application_id=\"mozregression\",\r\n    application_version=\"0.1.1\",\r\n    upload_enabled=True,\r\n    configuration=Configuration(\r\n      ping_tag=\"mozregression-test-tag\"\r\n    ),\r\n    data_dir=mozregression_path \/ \"data\"\r\n)\r\nGlean.set_upload_enabled(True)\r\n\r\n# Load the ping and metric definitions from the YAML files above\r\npings = load_pings(\"pings.yaml\")\r\nmetrics = load_metrics(\"metrics.yaml\")\r\n\r\n# Record a sample value for the app metric, then submit the usage ping\r\nmetrics.usage.app.set(\"reality\")\r\npings.usage.submit()<\/code><\/pre>\n<\/div>\n<p>Running this script on my laptop, I see that a respectable JSON payload was delivered to and processed by our servers:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/wlach.github.io\/files\/2020\/02\/glean-debug-ping-viewer.png\" \/><\/p>\n<p>As you can see, we\u2019re successfully processing the \u201cversion\u201d number of mozregression, some characteristics of the machine sending the information (my MacBook in this case), and our single metric. We also have a client id, which should tell us roughly how many distinct installations of mozregression are sending pings. This should be more than sufficient for an initial \u201cmozregression usage dashboard\u201d.<\/p>\n<h2 id=\"next-steps\">Next steps<\/h2>\n<p>There are a bunch of things I still need to work through before landing this inside mozregression itself. Notably, the Glean Python bindings are Python 3-only, so we\u2019ll need to <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1426766\">port the mozregression GUI to Python 3<\/a> before we can start measuring usage there. 
But I\u2019m excited at how quickly this work is coming together: stay tuned for part 2 in a few weeks.<\/p>\n<p>(( This is a syndicated copy of <a href=\"https:\/\/wlach.github.io\/blog\/2020\/02\/this-week-in-glean-special-guest-post-mozregression-telemetry-part-1\/\">the original post<\/a>. ))<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(\u201cThis Week in Glean\u201d is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release &hellip; <a class=\"go\" href=\"https:\/\/blog.mozilla.org\/data\/2020\/02\/28\/this-week-in-glean-mozregression-telemetry-part-1\/\">Read more<\/a><\/p>\n","protected":false},"author":1528,"featured_media":168,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[315988,448297],"tags":[448297,288565],"coauthors":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/posts\/176"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/users\/1528"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/comments?post=176"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/posts\/176\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/media\/168"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/media?parent=176"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/categories?post=176"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/tags?post=176"},{"taxonomy":"author","embeddable":true,"
href":"https:\/\/blog.mozilla.org\/data\/wp-json\/wp\/v2\/coauthors?post=176"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}