This Week in Glean: glean-core to Wasm experiment

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)

All “This Week in Glean” blog posts are listed in the TWiG index.

In the past week Alessio, Mike, Hamilton and I got together for the Glean.js workweek. Our purpose was to build a proof-of-concept of a Glean SDK that works in JavaScript environments. You can expect a TWiG in the next few weeks about the outcome of that. Today I am going to talk about something that I tried out in preparation for that week: attempting to compile glean-core to Wasm.

A quick primer

glean-core

glean-core is the heart of the Glean SDK, where most of the logic and functionality of Glean lives. It is written in Rust and communicates with the language bindings in C#, Java, Swift or Python through an FFI layer. For a comprehensive overview of the Glean SDK’s architecture, please refer to Jan-Erik’s great blog post and talk on the subject.

wasm

From the WebAssembly website:

“WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications.”

Or, from Lin Clark’s “A cartoon intro to WebAssembly”:

“WebAssembly is a way of taking code written in programming languages other than JavaScript and running that code in the browser.”

Why did I decide to do this?

On the Glean team we make an effort to move as much of the logic as possible to glean-core, so that we avoid duplicating code across the language bindings and can guarantee standardized behaviour across all platforms.

Since that is the case, it seemed counterintuitive to me that, when we set out to build a version of Glean for the web, we wouldn’t rely on the same glean-core as all our other language bindings. The hypothesis was: let’s make JavaScript just another language binding, by making our Rust core compile to a target that runs in the browser.

Rust is well known for making an effort to have a great Rust to Wasm experience, and the Rust and WebAssembly working group has built awesome tools that make the boilerplate for such projects much leaner.

First try: compile glean-core “as is” to Wasm

Since this was my first attempt at doing anything with Wasm, I started by following MDN’s guide “Compiling from Rust to WebAssembly”, but instead of using their example “Hello, World!” Rust project, I used glean-core.

From that guide I learned about wasm-pack, a tool that deals with the complexities of compiling a Rust crate to Wasm, and wasm-bindgen, a tool that exposes, among many other things, the #[wasm_bindgen] attribute. When added to a function, this attribute makes that function accessible from JavaScript.

The first thing that became obvious was that compiling glean-core directly to Wasm would be much harder than I expected. Passing complex types across the Rust and JavaScript boundary has many limitations, and I was not able to add the #[wasm_bindgen] attribute to trait objects or to structs that contain trait objects or lifetime annotations. I needed a simpler API surface to make the connection between Rust and JavaScript. Fortunately, I had that in hand: glean-ffi.

Our FFI crate exposes functions that rely on a global Glean singleton and have relatively simple signatures. These functions are the ones accessed by our language bindings through a C FFI. This layer hides most of the complex Rust structures from its consumers.
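
To make that shape concrete, here is a minimal sketch of the pattern, assuming nothing about the real glean-ffi code: the Glean struct and both function names below are hypothetical stand-ins. It only illustrates the idea of a global singleton sitting behind functions that take and return primitive types.

```rust
use std::sync::{Mutex, OnceLock};

// Hypothetical stand-in for the real Glean object; not glean-core's actual type.
struct Glean {
    upload_enabled: bool,
}

// One process-wide instance, set once at initialization.
static GLEAN: OnceLock<Mutex<Glean>> = OnceLock::new();

/// Initializes the global instance. Returns 1 on success and 0 if it was
/// already initialized. Only primitive types cross the boundary.
pub extern "C" fn example_initialize(upload_enabled: u8) -> u8 {
    let glean = Glean {
        upload_enabled: upload_enabled != 0,
    };
    GLEAN.set(Mutex::new(glean)).is_ok() as u8
}

/// Reads state back through the singleton; callers never hold a `Glean`.
/// Returns 0 when the singleton has not been initialized yet.
pub extern "C" fn example_is_upload_enabled() -> u8 {
    match GLEAN.get() {
        Some(glean) => glean.lock().unwrap().upload_enabled as u8,
        None => 0,
    }
}

fn main() {
    println!("before init: {}", example_is_upload_enabled());
    println!("first init: {}", example_initialize(1));
    println!("second init: {}", example_initialize(1));
    println!("after init: {}", example_is_upload_enabled());
}
```

Because callers only ever see integers, a layer like this is much easier to bind from another language than the structs and traits behind it.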

Perfect! I proceeded to add the #[wasm_bindgen] attribute to one of our entrypoint functions: glean_initialize. This uncovered a limitation I didn’t know about: you can’t add this attribute to functions that are unsafe, which unfortunately this one is.

My assumption that I would be able to just expose the API of glean-ffi to JavaScript by compiling it to Wasm, without making any changes to it, was not holding up. I would have to go through some refactoring to make that work. But so far I hadn’t even gotten to the actual compilation step: the error I was getting was a syntax error. I wanted to go through compilation and see if that completed before diving into any refactoring work, so I removed the #[wasm_bindgen] attribute for now and made a new attempt at compiling.

Now I got a new error. Progress! If you clone the Glean repository, install wasm-pack, and run wasm-pack build inside the glean-core/ffi/ folder right now, you are bound to get this same error and here is one important excerpt of it:

<...>

fatal error: 'sys/types.h' file not found
cargo:warning=#include <sys/types.h>
cargo:warning=         ^~~~~~~~~~~~~
cargo:warning=1 error generated.
exit code: 1

--- stderr

error occurred: Command "clang" "-Os" "-ffunction-sections" "-fdata-sections" "-fPIC" "--target=wasm32-unknown-unknown" "-Wall" "-Wextra" "-DMDB_IDL_LOGN=16" "-o" "<...>/target/wasm32-unknown-unknown/release/build/lmdb-rkv-sys-5e7282bb8d9ba64e/out/mdb.o" "-c" "<...>/.cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rkv-sys-0.11.0/lmdb/libraries/liblmdb/mdb.c" with args "clang" did not execute successfully (status code exit code: 1)

One of glean-core’s dependencies is rkv, a storage crate we use for persisting metrics before they are collected and sent in pings. This crate depends on LMDB, which is written in C, thus the clang error.

I do not have extensive experience in writing C/C++ programs, so this was not familiar to me. I figured out that the file this error points to as “not found”, <sys/types.h>, is a header file that should be part of libc. This compiles just fine when trying to compile for our usual targets, so I had a hunch that maybe I just didn’t have the proper libc files for compiling to Wasm targets.

Internet searching pointed me to wasi-libc, a libc for WebAssembly programs. Promising! With this, I retried compiling glean-ffi to Wasm. I just needed to run the build command with added flags:

CFLAGS="--sysroot=/path/to/the/newly/built/wasi-libc/sysroot" wasm-pack build

This didn’t work immediately. The error messages told me to add some extra flags to the command, which I did without thinking much about it. The final command is:

CFLAGS="--sysroot=/path/to/wasi-sdk/clone/share/wasi-sysroot -D_WASI_EMULATED_MMAN -D_WASI_EMULATED_SIGNAL" wasm-pack build

I would advise the reader not to get too excited just yet. This command still doesn’t work. It returns yet another set of errors and warnings, mostly related to “usage of undeclared identifiers” or “implicit declaration of functions”. Most of the offending identifiers started with the pthread_ prefix, which reminded me of something I had read in the README of wasi-sdk, a toolkit for compiling C programs to WebAssembly that includes wasi-libc:

“Specifically, WASI does not yet have an API for creating and managing threads yet, and WASI libc does not yet have pthread support”.

That was it. I was done with trying to approach the problem of compiling glean-core to Wasm “as is” and I decided to try another way. I could try to abstract away our usage of rkv so that depending on it didn’t block compilation to Wasm, but that is such a big refactoring task that I considered it a blocker for this experiment.
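
For a rough idea of what “abstracting away rkv” could mean, here is a small sketch. It assumes a design I did not actually build: the Storage trait, the MemoryStorage backend and the increment_counter function are all hypothetical and do not exist in glean-core. The point is only that metric logic written against a trait does not care whether the backend is LMDB or something with no C dependencies.

```rust
use std::collections::HashMap;

// Hypothetical storage interface; glean-core does not define this trait.
pub trait Storage {
    fn put(&mut self, key: &str, value: u64);
    fn get(&self, key: &str) -> Option<u64>;
}

// In-memory backend with no C dependencies, so nothing here needs libc.
#[derive(Default)]
pub struct MemoryStorage {
    values: HashMap<String, u64>,
}

impl Storage for MemoryStorage {
    fn put(&mut self, key: &str, value: u64) {
        self.values.insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<u64> {
        self.values.get(key).copied()
    }
}

// Metric logic written against the trait, not against a concrete backend.
pub fn increment_counter(storage: &mut dyn Storage, name: &str) -> u64 {
    let next = storage.get(name).unwrap_or(0) + 1;
    storage.put(name, next);
    next
}

fn main() {
    let mut storage = MemoryStorage::default();
    increment_counter(&mut storage, "clicks");
    increment_counter(&mut storage, "clicks");
    println!("{:?}", storage.get("clicks"));
}
```

A real version would also need persistence on the web side and a second backend wrapping rkv for native targets, which is where most of the refactoring cost would come from.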

Second try: take a part of glean-core and compile that to Wasm

After learning that it would require way too much refactoring of glean-core and glean-ffi to get them to compile to Wasm, I decided to try a different approach: take a small, self-contained part of glean-core and compile just that to Wasm.

Earlier this year I had a small taste of trying to rewrite part of glean-core in JavaScript for the distribution simulators that we added to The Glean Book. To make the simulators work I essentially had to reimplement the histogram code and part of the distribution metrics code in JavaScript.

The histogram code is very self-contained, so it was a perfect candidate to single out for this experiment. I did just that and was able to get it to compile without errors fairly quickly as a standalone thing (you can check out the histogram code on the glean-to-wasm repo vs. the histogram code on the Glean repo).

After getting this to work I created three accumulation functions that would mimic how each one of the distribution metric types works. These functions would then be exposed to JavaScript. The resulting API looks like this:

#[wasm_bindgen]
pub fn accumulate_samples_custom_distribution(
    range_min: u32,
    range_max: u32,
    bucket_count: usize,
    histogram_type: i32,
    samples: Vec<u64>,
) -> String

#[wasm_bindgen]
pub fn accumulate_samples_timing_distribution(
    time_unit: i32,
    samples: Vec<u64>
) -> String

#[wasm_bindgen]
pub fn accumulate_samples_memory_distribution(
    memory_unit: i32,
    samples: Vec<u64>
) -> String

Each one of these functions creates a histogram, accumulates the given samples into this histogram and returns the resulting histogram as a JSON-encoded string. I tried getting them to return HashMap<u64,u64> at first, but that return type is not supported.
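
As an illustration of that “samples in, JSON string out” shape, here is a simplified sketch that runs without any external crates. It is not the code from the experiment: the Histogram struct, the fixed-width bucketing and the accumulate_samples_example function are assumptions made for this example, and Glean’s real bucketing algorithms are different. The #[wasm_bindgen] attribute is left out so that the snippet compiles on its own.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-in for the histogram code: fixed-width
// buckets keyed by their lower bound. Glean's real bucketing differs.
struct Histogram {
    bucket_width: u64,
    counts: BTreeMap<u64, u64>,
}

impl Histogram {
    fn new(bucket_width: u64) -> Self {
        Histogram {
            // Guard against a zero width, which would divide by zero below.
            bucket_width: bucket_width.max(1),
            counts: BTreeMap::new(),
        }
    }

    fn accumulate(&mut self, sample: u64) {
        // Round the sample down to the lower bound of its bucket.
        let bucket = sample / self.bucket_width * self.bucket_width;
        *self.counts.entry(bucket).or_insert(0) += 1;
    }

    // Hand-rolled JSON so the sketch needs no external crates.
    fn to_json(&self) -> String {
        let entries: Vec<String> = self
            .counts
            .iter()
            .map(|(bucket, count)| format!("\"{}\":{}", bucket, count))
            .collect();
        format!("{{{}}}", entries.join(","))
    }
}

// Same shape as the exported functions: plain inputs in, a String out.
pub fn accumulate_samples_example(bucket_width: u64, samples: Vec<u64>) -> String {
    let mut histogram = Histogram::new(bucket_width);
    for sample in samples {
        histogram.accumulate(sample);
    }
    histogram.to_json()
}

fn main() {
    // Prints {"0":2,"10":1,"20":2}
    println!("{}", accumulate_samples_example(10, vec![1, 5, 12, 27, 29]));
}
```

Returning a string keeps the boundary simple: the map never has to cross into JavaScript as a Rust type, and the caller can rebuild it with JSON.parse.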

For this I was still following MDN’s guide “Compiling from Rust to WebAssembly”, which I can’t recommend enough. After I got my Rust code to compile to Wasm, it was fairly straightforward to call the functions imported from the Wasm module inside my JavaScript code.

Here is a little taste of what that looked like:

import("glean-wasm").then(Glean => {
    const data = JSON.parse(
        Glean.accumulate_samples_memory_distribution(
            unit, // A Number value between 0 - 3
            values // A BigUint64Array with the sample values
        )
    )
    // <Do something with data>
})

The only hiccup I ran into was that I needed to change my code to use the BigInt number type instead of JavaScript’s default Number type. That is necessary because, in Rust, my functions expect u64 values and BigInt is the JavaScript type that maps to u64.
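
The reason Number can’t simply stand in for u64 is precision: a JavaScript Number is an IEEE 754 double, the same representation as Rust’s f64, so it cannot hold every 64-bit integer exactly. The sketch below shows this from the Rust side. The survives_as_number helper is a hypothetical name made up for this example.

```rust
// JavaScript's Number is an IEEE 754 double, which Rust calls f64.
// Round-tripping a u64 through f64 shows which values survive intact.
// The comparison is done in u128 so that a value rounded up to 2^64
// is not clamped back to u64::MAX and mistaken for a match.
fn survives_as_number(value: u64) -> bool {
    (value as f64) as u128 == value as u128
}

fn main() {
    // Same value as JavaScript's Number.MAX_SAFE_INTEGER.
    let max_safe: u64 = (1 << 53) - 1;
    println!("2^53 - 1 survives: {}", survives_as_number(max_safe));
    println!("2^53 + 1 survives: {}", survives_as_number((1 << 53) + 1));
    println!("u64::MAX survives: {}", survives_as_number(u64::MAX));
}
```

Only the first value makes the round trip unchanged, which is why a u64 parameter has to be represented by BigInt on the JavaScript side.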

This code can be checked out at: https://github.com/brizental/glean-wasm-experiment

And there is a demo of it working in: https://glean-wasm.herokuapp.com/

Final considerations

This was a very fun experiment, but does it validate my initial hypothesis:

Should we compile glean-core to Wasm and have JavaScript be just another language binding?

We definitely can do that. Even though I did not finish my first try, we could get there if we abstracted away every dependency that can’t be compiled to Wasm, refactored the unsafe functions out and cleared whatever other roadblocks turned up along the way. I believe the effort that would take is not worth it, though. It would take us much less time to rewrite glean-core’s code in JavaScript. Spoiler alert for our upcoming TWiG about the Glean.js workweek: in just a week we were able to get a functioning prototype of that.

Our requirements for Glean on the web are different from our requirements for a native version of Glean. Different enough that the burden of maintaining two versions of glean-core, one in Rust and another in JavaScript, is probably smaller than the amount of work and hacks it would take to build a single version that serves both platforms.

Another issue is compatibility. Wasm is very well supported, but there are environments that still don’t support it. It would be suboptimal if we went through the trouble of changing glean-core so that it compiles to Wasm and then still had to make a JavaScript-only version for compatibility reasons.

My conclusion is that although we can compile glean-core to Wasm, it doesn’t mean that we should do that. The advantages of having a single source of truth for the Glean SDK are very enticing, but at the moment it would be more practical to rewrite something specific for the web.

Data@Mozilla
