End-to-end-encrypt WebRTC in all browsers!

February 21, 2024

Contributed by Jan-Ivar Bruaroey

Hi all, we haven’t blogged since the WebRTC API was standardized back in January of 2021! But now in 2024, folks are asking us what’s up with all the new APIs. We’re coming off an exciting second period of innovation and experimentation in WebRTC, so we’re brushing off the old blogging muscles to cover what’s new, in what promises to be a new series of posts.

First, don’t worry. WebRTC 1.0 has been tremendously successful and has not fundamentally changed. Instead, its success has led to developers wanting to use it for a variety of things. This demand has driven continued collaboration in the W3C to standardize new functionality on top of the 1.0 baseline. Each new API exposes exciting new functionality, and each one deserves its own blog post, so let’s start right away with one of them:

Today’s topic is End-to-End Encryption (E2EE) in WebRTC.

To be clear up front, WebRTC is already encrypted peer-to-peer using DTLS, so there’s no need for additional APIs unless you want to protect data against any middle-boxes your service might employ as WebRTC end-points. That said…

Did you know all major browsers have an API to encrypt WebRTC calls end-to-end? They do! The caveat is the API shape differs slightly between browsers right now. But don’t despair! We’re here to cover that gap. If you’ve been with us for a while you know we’ve been here before, and this is how the sausage is made.

A worker-first API

Chromium experimented early with shipping APIs for this. Unfortunately the early APIs exposed the media pipeline on main-thread, subjecting it to risks of jank given the nature of the JavaScript event model. The Working Group learned from these experiments, and settled instead on a “worker-first” API, which means an API that is simpler to use in a worker than from main-thread.

This standards-track API is RTCRtpScriptTransform, and lets you transform encoded frames before sending, and back again on reception. The most obvious use-case is E2EE, but it can also be used for other things like adding metadata.

See RTCRtpScriptTransform in action below, XOR-ing data on send and receive (runs in all browsers):

Click Start! and share your camera, and you’ll see two videos: what’s sent and what’s received by a peer
Uncheck ✅ descramble to skip XOR on receive for some spectacle (the garbage illustrates that the receiver was “decrypting” with XOR before!)

Replace XOR in a real application with crypto of course, using keys secret to your application. Disclaimer: This is not zero-knowledge E2EE, since you hold the keys. Mozilla wanted the stronger SFrameTransform API, but that’s in limbo, so until something stronger is standardized, this is what’s available. It protects end-users mostly against middle-boxes, not against the content service provider (you).

How it works

Click on the “JavaScript” tab above to see the example code. Sender-side we add a transform like this:
sender.transform = new RTCRtpScriptTransform(worker, {side: "send"});

…and receiver-side it looks the same, since it’s XOR-ing (the side parameter is for the ✅ checkbox):
receiver.transform = new RTCRtpScriptTransform(worker, {side: "receive"});

And that’s (almost) it! Scroll down to see the worker in action. On the worker-side, things get exposed here:
onrtctransform = async ({transformer: {readable, writable, options}}) => {

…and the worker connects the readable to the writable through a custom (XOR) TransformStream:
await readable.pipeThrough(new TransformStream({transform})).pipeTo(writable);
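For readers without the fiddle in front of them, here is a minimal sketch of what such a transform function could look like. The name xorTransform and the 0xff mask are assumptions for illustration, and the real example may differ (for instance in which bytes it leaves untouched).

```javascript
/* Hypothetical XOR transform: flips every bit of the encoded payload.
   Applying it twice restores the original bytes, which is why the same
   function can serve on both the send and the receive side. */
function xorTransform(frame, controller) {
  const bytes = new Uint8Array(frame.data);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] ^= 0xff; /* arbitrary mask chosen for this sketch */
  }
  frame.data = bytes.buffer; /* assign back so the change is picked up */
  controller.enqueue(frame); /* pass the frame on down the pipe */
}

/* In the worker it would be plugged in roughly like this:
   onrtctransform = async ({transformer: {readable, writable}}) => {
     await readable.pipeThrough(new TransformStream({transform: xorTransform})).pipeTo(writable);
   };
*/
```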

And that’s it!

The rest is standard worker stuff. A "descramble" message is passed to the worker to flip the switch, and a clever worker-in-a-string trick is used to keep everything in one fiddle (hence the use of /* */ code comments over //). The same worker can be reused for any number of transforms, which scales well.
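The worker-in-a-string trick can be sketched roughly as follows. The helper name workerUrlFromString and the {descramble: boolean} message shape are assumptions for illustration; the fiddle may structure its worker and messages differently.

```javascript
/* The worker body lives in a string so page and worker fit in one file. */
const workerSource = `
  let descramble = true; /* flipped by messages from the page */
  onmessage = ({data}) => {
    if ("descramble" in data) descramble = data.descramble;
  };
`;

/* Wrap the source in a Blob and hand back a URL a Worker can load. */
function workerUrlFromString(source) {
  const blob = new Blob([source], {type: "text/javascript"});
  return URL.createObjectURL(blob);
}

/* Browser usage (not executed here):
   const worker = new Worker(workerUrlFromString(workerSource));
   worker.postMessage({descramble: false});
*/
```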

This remains an imperfect API, however. Applications need to know which bits to leave alone so as not to upset the codec-specific packetizer, and some codecs are tougher to deal with than others (looking at you, H.264). Which bits are they? Trial and error is required.
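One common way to leave some bits alone is to skip a few leading bytes of each frame. The sketch below assumes that approach; the CLEAR_BYTES table and its numbers are placeholders for illustration only, and the right values for a given codec are exactly the part that takes trial and error.

```javascript
/* Placeholder values: leading bytes left untouched per frame type.
   These numbers are NOT authoritative; they depend on the codec. */
const CLEAR_BYTES = {key: 10, delta: 3, audio: 1};

function xorPayload(frame) {
  /* Encoded video frames expose a type such as "key" or "delta";
     encoded audio frames have no type, so fall back to "audio". */
  const skip = CLEAR_BYTES[frame.type ?? "audio"] ?? 0;
  const bytes = new Uint8Array(frame.data);
  for (let i = skip; i < bytes.length; i++) {
    bytes[i] ^= 0xff; /* scramble everything after the protected prefix */
  }
  frame.data = bytes.buffer;
  return frame;
}
```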

These changes are also not negotiated with the peer, something that is still being ironed out in the Working Group. But as of right now, this works in all browsers for any app that wants it, so that’s something!

The shim

The standards-track API is implemented natively in Safari and Firefox. It might not be obvious in the example, since JSFiddle hides plug-ins well, but it relies on a small shim in Chrome. Also, two minor workarounds, highlighted by the comment /* needed by chrome shim: */, cover things that were too hard to shim (like onrtctransform in a worker). Some of these are being fixed in Chromium already, so they may not be needed for long. Implementations should converge soon, which means this shim will hopefully be short-lived.
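If you are curious which path a given browser would take, a feature check along these lines may help. This is only a sketch and not how the shim itself works; pickTransformApi is a hypothetical helper, and createEncodedStreams is, to my understanding, the earlier non-standard Chromium method on RTCRtpSender.

```javascript
/* Hypothetical helper: reports which encoded-transform API looks available.
   Pass in the global object (window in a page, self in a worker). */
function pickTransformApi(globalObj) {
  if (typeof globalObj.RTCRtpScriptTransform === "function") {
    return "script-transform"; /* standards-track API */
  }
  const proto = globalObj.RTCRtpSender && globalObj.RTCRtpSender.prototype;
  if (proto && "createEncodedStreams" in proto) {
    return "encoded-streams"; /* older Chromium-only API; a shim is needed */
  }
  return "none";
}

/* Browser usage (not executed here): pickTransformApi(window) */
```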

What about main-thread use cases?

There’s a WebRTC Samples example called Video Analyzer that dumps some metrics on the page. As of this writing, it is still written the old way, blocking video frame delivery on main-thread even though frames aren’t being modified. Jank may not be an issue in such a simple demo, but it still encourages developers to follow a pattern likely to scale poorly, and it also won’t work in Firefox or Safari. The pull-request webrtc/samples#1646 updates it to use the standards-track API with the shim mentioned earlier, which may end up in adapter.js. Try it here in all browsers:

Click Start!, share your camera, and then press Call

The updated example shows video-frame delivery didn’t need to block on main-thread in the first place. Instead, simple worker messages post to main-thread to update the visible counters, which seems cleaner and more efficient.
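The idea can be sketched as a pass-through transform that only counts and reports. The helper name makeCountingTransform, the {frames, bytes} message shape, and the reporting interval are assumptions for illustration, not the code from the pull request.

```javascript
/* Hypothetical pass-through transform: frames are never modified or delayed;
   the worker just posts running totals every `every` frames. */
function makeCountingTransform(post, every = 30) {
  let frames = 0;
  let bytes = 0;
  return (frame, controller) => {
    frames++;
    bytes += frame.data.byteLength;
    if (frames % every == 0) post({frames, bytes});
    controller.enqueue(frame); /* forward the frame untouched */
  };
}

/* In the worker (not executed here):
   onrtctransform = async ({transformer: {readable, writable}}) => {
     const transform = makeCountingTransform(stats => postMessage(stats));
     await readable.pipeThrough(new TransformStream({transform})).pipeTo(writable);
   };
   On main-thread, worker.onmessage would then update the visible counters.
*/
```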

Any application that needs the encoded frames on main-thread can still get them by using transferable streams. The burden of transfer is merely reversed.
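A minimal sketch of that hand-off, assuming the browser supports transferring streams via postMessage: the worker passes both streams in the transfer list, and main-thread then owns the pipe. The helper name forwardStreams and the message shape are hypothetical.

```javascript
/* Hypothetical helper: ships both streams to another context.
   `post` stands in for the worker's postMessage(message, transferList). */
function forwardStreams({readable, writable}, post) {
  /* Listing the streams as transferables moves them rather than copying. */
  post({readable, writable}, [readable, writable]);
}

/* In the worker (not executed here):
   onrtctransform = ({transformer}) =>
     forwardStreams(transformer, (msg, transfer) => postMessage(msg, transfer));

   On main-thread (not executed here):
   worker.onmessage = async ({data: {readable, writable}}) => {
     await readable.pipeThrough(new TransformStream()).pipeTo(writable);
   };
*/
```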

Hopefully this blog post helps demystify the use of workers with this API! The need for a shim should diminish soon. In the meantime, I hope you’re able to put this API to good use, and feel free to ask any questions. There’s no comments section here, but you can reach me on Twitter.