When I first heard about web workers using structured clone, I was nervous. The more I look into it, the more I think the whole idea of structured clone — regardless of what it’s used for — is problematic in and of itself.
Implicit copying is rarely what you want
When data is mutable, it needs to be managed by the programmer who created it, because they know what they’re doing with it. When the language or API implicitly copies the data, the programmer has no control over it. Granted, structured clone is only used in a few published places in HTML5, but it would be preferable to have explicit ways to construct immutable data, and only be able to send immutable data between workers. (Or ways to safely transfer ownership of mutable data, but that’s irrelevant to the question of structured clone.)
This raises the question of how to express immutable data in JavaScript. That’s something that Brendan has recently blogged about, and it’s worth adding to the language. But structured clone strikes me as a hack around the problem that we don’t currently have convenient ways of creating structured, immutable data.
Structured clone ignores huge swathes of JavaScript data
Structured clone is only defined on a handful of built-in JavaScript and DOM object types. JavaScript objects are part of a deeply intertwined, deeply mutable object graph, and structured clone simply ignores most of that graph. An operation that uses structured clone will let you pass any Object instance, regardless of what invariants it’s set up to maintain based on its prototype chain, its getters or setters, its connectedness to the object graph… but structured clone will blithely disregard much of that structure.
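To make that concrete, here is a minimal sketch of what gets lost. It assumes a runtime that exposes the global structuredClone function (modern browsers, Node.js 17 and later); Account is a hypothetical class invented purely for illustration.

```javascript
// Assumption: global structuredClone is available (modern browsers, Node.js 17+).
// Account is a hypothetical class whose invariants live in its prototype
// (the deposit check) and in an accessor (dollars is derived from cents).
class Account {
  constructor(cents) {
    this.cents = cents;
    // Own, enumerable accessor: recomputed on every read.
    Object.defineProperty(this, "dollars", {
      get() { return this.cents / 100; },
      enumerable: true,
    });
  }
  deposit(cents) {
    if (cents <= 0) throw new RangeError("deposit must be positive");
    this.cents += cents;
  }
}

const original = new Account(250);
const copy = structuredClone(original);

const report = {
  originalIsAccount: original instanceof Account,
  copyIsAccount: copy instanceof Account,            // prototype chain is dropped
  copyHasDeposit: typeof copy.deposit === "function", // so the method is gone
  copyDollarsIsAccessor:
    typeof Object.getOwnPropertyDescriptor(copy, "dollars").get === "function",
};

// The getter was read once during cloning and stored as a plain value,
// so it no longer tracks cents on the copy.
copy.cents = 999;
report.copyDollarsAfterMutation = copy.dollars;

console.log(report);
```

The copy is a plain object that happens to have the same enumerable fields: no deposit method, no instanceof relationship, and a dollars value that was snapshotted at clone time.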
Again, if we simply had some simple, immutable data structures like tuples and records, these would be totally reasonable things to share between workers.
Automatically traversing mutable data structures is a code smell
There’s a famous paper by Henry Baker that specifically argues that cloning mutable data structures rarely has a “one size fits all” solution, and that mutable data can’t be usefully traversed automatically by general-purpose libraries. I have a sense that whenever some API is automatically, deeply traversing mutable data structures, it’s unlikely to be doing the right thing.
Structured clone is not future-proof
Structured clone is simply defined on a grab-bag of built-in datatypes, and the rest are treated as plain old objects. This means it’s going to behave very strangely on new data types that get introduced in future versions of ECMAScript, such as maps and sets, or in user libraries.
Alternatives?
A more adaptable approach might be for ECMAScript to specify “transmittable” data structures. As we add immutable data structures, they could be defined to be transmittable, and we could even specify custom internal properties of certain classes of mutable objects with transmission-safe semantics such as ownership transfer.
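For a feel of what ownership transfer looks like in practice, here is a small sketch using a mechanism the web platform already has for ArrayBuffer. This is not the “transmittable” protocol proposed above, only an existing analogue of it; it assumes a runtime with the global structuredClone function and its transfer option (modern browsers, Node.js 17 and later).

```javascript
// Assumption: global structuredClone with the { transfer } option is available
// (modern browsers, Node.js 17+). This illustrates ownership transfer for one
// built-in type only; it is not a general "transmittable" mechanism.
const buffer = new ArrayBuffer(8);
new Uint8Array(buffer)[0] = 42;

// The bytes move to the result; the source buffer is detached rather than copied.
const moved = structuredClone(buffer, { transfer: [buffer] });

const sourceLength = buffer.byteLength;      // detached buffers report length 0
const movedLength = moved.byteLength;
const firstByte = new Uint8Array(moved)[0];

console.log({ sourceLength, movedLength, firstByte });
```

After the call there is exactly one usable owner of the data, so no two parties can mutate it concurrently. That is the kind of transmission-safe semantics the paragraph above has in mind, here hard-wired for a single type instead of specified by the language.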
Doing these kinds of things well, in a way that’s simple, clear and predictable, deserves built-in language support.
I’m not entirely convinced I agree.
1. I’m not sure there is any implicit copying going on. All APIs that use structured cloning are pretty explicit. Things like Worker.postMessage and IDBObjectStore.put quite clearly create a new copy.
2. I’m not sure we can afford to wait. It’ll take a long time before immutable types are
A) Specced
B) Implemented
C) Used widely enough that we know they fit what we need for the APIs where we use structured clones.
3. Doesn’t JSON.stringify exhibit the same problem? Was this something that was debated in TC39 during the standardization of the JSON object?
At first glance, I shared your concern. On a second look, I realized that the natural path of evolution will be that as soon as immutable data types exist, they’ll just be transferred.
Worst case, implementors will implement postMessage this way because it’s what makes sense and the spec is going to follow.
Best case, the spec adds a line right now to consider immutable data types.
“Granted, structured clone is only used in a few published places in HTML5, but it would be preferable to have explicit ways to construct immutable data, and only be able to send immutable data between workers.”
=> postMessage was invented at a time when there was no way to create standardized immutable complex structures.
Serializing to JSON was probably the first one, and it already makes plenty of implicit choices ([[Prototype]] when parsing, removing non-enumerable properties…).
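A quick sketch of those two implicit choices, using a JSON round trip. Point is a hypothetical class made up for the example.

```javascript
// Point is a hypothetical class used only for illustration.
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  norm() { return Math.hypot(this.x, this.y); }
}

const p = new Point(3, 4);
// A non-enumerable own property: JSON.stringify skips it silently.
Object.defineProperty(p, "secret", { value: "hidden", enumerable: false });

const text = JSON.stringify(p);
const roundTripped = JSON.parse(text);

const jsonReport = {
  text,
  hasSecret: "secret" in roundTripped,
  isPoint: roundTripped instanceof Point,
  // JSON.parse always builds plain objects.
  protoIsObject: Object.getPrototypeOf(roundTripped) === Object.prototype,
};

console.log(jsonReport);
```

Neither choice is announced anywhere in the call: the non-enumerable property is dropped on the way out and the prototype is replaced with Object.prototype on the way back in.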
Object.freeze (actually, freeze isn’t enough: getters/setters should be removed too) followed, but it was too late.
Choices had to be made before these constructs existed.
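To illustrate why freeze alone isn’t enough, here is a minimal sketch. The object and its property names are invented for the example; the note about nested objects goes slightly beyond the getter/setter point above, but it is standard Object.freeze behavior.

```javascript
let counter = 0;

// Hypothetical object: frozen, yet still observably changing.
const frozen = Object.freeze({
  nested: { value: 1 },
  // Accessors survive freezing and can return something different each time.
  get ticks() { return ++counter; },
});

const first = frozen.ticks;
const second = frozen.ticks;

// Object.freeze is shallow: objects reachable from a frozen one stay mutable.
frozen.nested.value = 2;

const freezeReport = {
  isFrozen: Object.isFrozen(frozen),
  first,
  second,
  nestedValue: frozen.nested.value,
};

console.log(freezeReport);
```

Object.isFrozen reports true, yet two reads of the same property disagree and the nested object has changed, so a frozen object is not automatically safe to treat as immutable data.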