Lazy value lowering

This isn't a fully fleshed-out idea, but it's coming up in #369 and so I thought it might be nice to split out here.  I'm tentatively excited about it, though (in that "this is probably what we should've done in the first place... sigh" sort of way).

## Motivation

When lowering a dynamically-sized value (`list` or `string`) or lowering more than `MAX_FLAT_PARAMS`/`MAX_FLAT_RESULTS` flat values, the Canonical ABI specifies calling a wasm-defined allocation function (indicated per import/export via the `realloc` immediate on `canon lift`/`lower`, but usually there's just one, exported with the name `cabi_realloc`) to get the linear memory to lower the values into.  This works decently well, but sometimes it's constraining, leading to [various](https://github.com/WebAssembly/component-model/issues/148#issuecomment-1378072049) [ideas](https://github.com/WebAssembly/component-model/issues/369#issuecomment-2247088561) for further customizations, with more ideas brewing.  But the root problem seems to be the inversion of control: the adapter decides when `realloc` is called, which limits wasm's control over the order and manner of allocation.

## Idea

In all the places where the CABI currently says to call `realloc` and store the resulting `i32` pointer, instead we could say that the CABI stores a fresh `i32` index that refers to the abstract value-to-be-lowered.  The `i32` length (of `list`s and `string`s) is stored eagerly as usual.  Control flow is then handed to wasm, passing it this shallowly-lowered tuple of params/results.  Wasm can now look at the `i32` length values to see how much to allocate and then acquire that memory however it wants; if there are multiple values-to-be-lowered, wasm can do smart things like batching allocations.

Once the destination address is selected, wasm then calls a new built-in to lower the value, e.g.:
* `canon value.lower $t: [validx:i32 dstp:i32] -> []`

For a `list<list<u8>>` of N elements, the first call to `value.lower` will generate N more `validx`s, one for each `list<u8>` element, and the calling wasm again gets to control the allocation and lowering of each of those (and so on, recursively through the type structure).
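To make the flow above concrete, here's a rough Python toy model of lowering a `list<list<u8>>`.  Everything in it (`LazyTable`, `shallow_lower`, `value_lower`, `guest_receive`, and the 8-byte `(validx, length)` element layout) is a made-up illustration under my own assumptions, not proposed spec text:

```python
# Toy model only: all names and the memory layout here are hypothetical.

class LazyTable:
    """Per-call table mapping a validx to a value that has not been lowered yet."""
    def __init__(self):
        self.values = []

    def add(self, value):
        self.values.append(value)
        return len(self.values) - 1


def shallow_lower(table, value):
    """Host side: hand wasm a fresh validx plus the eagerly computed length."""
    return table.add(value), len(value)


def value_lower(table, memory, validx, dstp):
    """Model of `value.lower`: write the value at dstp; nested lists stay lazy."""
    value = table.values[validx]
    if isinstance(value, (bytes, bytearray)):
        # list<u8>: copy the bytes straight into linear memory.
        memory[dstp:dstp + len(value)] = value
    else:
        # list<list<u8>>: store one (validx, length) i32 pair per element.
        for i, elem in enumerate(value):
            idx, n = shallow_lower(table, elem)
            memory[dstp + 8 * i:dstp + 8 * i + 4] = idx.to_bytes(4, "little")
            memory[dstp + 8 * i + 4:dstp + 8 * i + 8] = n.to_bytes(4, "little")


def guest_receive(table, memory, validx, length, base):
    """Guest side: lower the outer list, then pack every inner list back to back."""
    value_lower(table, memory, validx, base)
    cursor = base + 8 * length          # inner payloads go right after the pairs
    result = []
    for i in range(length):
        idx = int.from_bytes(memory[base + 8 * i:base + 8 * i + 4], "little")
        n = int.from_bytes(memory[base + 8 * i + 4:base + 8 * i + 8], "little")
        value_lower(table, memory, idx, cursor)
        result.append(bytes(memory[cursor:cursor + n]))
        cursor += n
    return result


table = LazyTable()
memory = bytearray(64)
validx, length = shallow_lower(table, [b"ab", b"cde"])
lowered = guest_receive(table, memory, validx, length, 0)
```

The point of the sketch is that the guest, not the adapter, picks every destination address, so it can place all the inner lists in one contiguous region instead of getting one `realloc` call per element.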

We could also allow multiple partial reads of the same `validx` so that a single logical `list` value could be read into multiple non-contiguous segments (think `readv()`).  Once fully lowered, a `validx` will trap if lowered again.
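A minimal sketch of what partial reads could look like, again as a Python toy model.  The `PendingList` entry, the `value_lower_partial` signature (with an explicit `max_len`), and the "return the number of bytes copied" convention are all assumptions of mine; the real built-in's shape is open:

```python
class PendingList:
    """Hypothetical table entry: a list<u8> plus a cursor of how much was lowered."""
    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.done = False


def value_lower_partial(entry, memory, dstp, max_len):
    """Copy up to max_len of the remaining bytes to dstp; return how many were copied."""
    if entry.done:
        raise RuntimeError("trap: value already fully lowered")
    chunk = entry.data[entry.pos:entry.pos + max_len]
    memory[dstp:dstp + len(chunk)] = chunk
    entry.pos += len(chunk)
    if entry.pos == len(entry.data):
        entry.done = True               # any further lowering of this validx traps
    return len(chunk)


# readv()-style use: one logical list scattered into two separate segments.
memory = bytearray(32)
entry = PendingList(b"hello world")
for dstp, seg_len in [(0, 5), (16, 6)]:
    value_lower_partial(entry, memory, dstp, seg_len)
```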

`string` is the problem child as usual.  If the lifting and lowering sides agree on `string-encoding`, the lifted number of code units can be passed directly and tagged as being precise.  Otherwise, an "approximate length" can be spec-defined (derived from the lifted and lowered `string-encoding`s and the lifted number of code units) and passed instead, and the wasm code can use repeated partial reads (or perhaps a worst-case allocation and a single read).
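For the worst-case-allocation variant, one possible shape of the length hint is sketched below.  To be clear, the "approximate length" isn't defined anywhere yet; `length_hint` is a hypothetical helper, it returns an upper bound rather than an estimate, and it only covers the `utf8`/`utf16` pair (not `latin1+utf16`):

```python
def length_hint(src_encoding, dst_encoding, src_code_units):
    """Return (dst_code_units, is_precise) for a string about to be lowered."""
    if src_encoding == dst_encoding:
        return src_code_units, True
    if (src_encoding, dst_encoding) == ("utf16", "utf8"):
        # One UTF-16 code unit needs at most 3 UTF-8 bytes
        # (a surrogate pair is 2 code units for 4 bytes).
        return 3 * src_code_units, False
    if (src_encoding, dst_encoding) == ("utf8", "utf16"):
        # Every UTF-8 byte yields at most one UTF-16 code unit.
        return src_code_units, False
    raise ValueError("encoding pair not covered by this sketch")


text = "héllo 😀"
utf8_units = len(text.encode("utf-8"))            # UTF-8 code units are bytes
utf16_units = len(text.encode("utf-16-le")) // 2  # UTF-16 code units are 2 bytes each
hint_to_utf16, precise = length_hint("utf8", "utf16", utf8_units)
hint_to_utf8, _ = length_hint("utf16", "utf8", utf16_units)
```

With a bound like this, wasm can allocate once and do a single read; a tighter estimate would save memory at the cost of sometimes needing repeated partial reads.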

At the end of the call (for lifted export functions, when wasm returns; for lowered functions, when wasm calls some `canon finish-call: [] -> []` built-in that triggers the callee's `post-return`), the temporary table of values-to-be-lowered is reset, dropping any unlowered values (which optimizes the case of unneeded values).
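The table lifetime described above could be modelled roughly like this.  `CallScope` and its methods are hypothetical names for illustration, and returning the number of dropped values from `finish_call` is only there to make the behavior observable:

```python
class CallScope:
    """Hypothetical per-call table of values-to-be-lowered."""
    def __init__(self):
        self.pending = {}
        self.next_idx = 0

    def add(self, value):
        idx = self.next_idx
        self.pending[idx] = value
        self.next_idx += 1
        return idx

    def take(self, validx):
        """Remove and return a pending value; a missing index models a trap."""
        if validx not in self.pending:
            raise RuntimeError("trap: unknown or already-lowered validx")
        return self.pending.pop(validx)

    def finish_call(self):
        """Reset the table, dropping whatever wasm never asked to lower."""
        dropped = len(self.pending)
        self.pending.clear()
        self.next_idx = 0
        return dropped


scope = CallScope()
wanted = scope.add(b"needed")
unwanted = scope.add(b"x" * 1024)   # wasm never lowers this one
payload = scope.take(wanted)
dropped = scope.finish_call()       # the 1 KiB value is dropped without being copied
```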

## Compatibility

To avoid breaking existing Preview 2 component binaries, we could require components to opt into this new ABI via a new `lazy` `canonopt`.  In the future, when we make our final breaking binary changes leading up to 1.0, if we think `lazy` is the Best Thing Ever, there should be a simple automatic conversion from non-`lazy` into `lazy` components (generating little type-directed wrapper core functions that contain the `cabi_realloc`+`value.lower` calls), so that we can kill `realloc` and make `lazy` the only option (or not, if having `realloc` is convenient).
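As a sketch of what one of those generated wrappers would do for a function taking a single `list<u8>` parameter (modelled in Python rather than core wasm; `make_eager_wrapper` and the toy helpers are hypothetical, only the `cabi_realloc` argument order of old pointer, old size, alignment, new size is taken from the existing ABI):

```python
def make_eager_wrapper(core_func, cabi_realloc, value_lower):
    """Wrap a non-lazy core function so it can be called with a lazy (validx, length)."""
    def wrapper(validx, length):
        # list<u8>: alignment 1 and one byte per element.
        ptr = cabi_realloc(0, 0, 1, length)
        value_lower(validx, ptr)
        # The original function sees the usual eager (ptr, len) pair.
        return core_func(ptr, length)
    return wrapper


# Toy environment to exercise the wrapper.
memory = bytearray(16)
pending = {0: b"abc"}
next_free = [4]

def toy_realloc(old_ptr, old_size, align, new_size):
    ptr = next_free[0]
    next_free[0] += new_size
    return ptr

def toy_value_lower(validx, dstp):
    data = pending.pop(validx)
    memory[dstp:dstp + len(data)] = data

def original_core_func(ptr, length):
    return bytes(memory[ptr:ptr + length])

wrapped = make_eager_wrapper(original_core_func, toy_realloc, toy_value_lower)
result = wrapped(0, 3)
```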

## Even better: value forwarding

We could also allow *lifting* a value-to-be-lowered directly from its `i32` index (using a tag bit to indicate whether the `i32` is a pointer or index).  This would give us the ability to cheaply "forward" a value through an intermediate component, the need for which I've seen come up on a number of occasions, especially when one component lightly wraps another.
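Here's a tiny Python model of the tag-bit idea.  Which bit gets used is not decided anywhere; the high bit (`INDEX_TAG`) is purely my assumption for the sketch, as are the `lift_list` name and the dict-based table:

```python
INDEX_TAG = 1 << 31   # assumption: the high bit marks "this i32 is a validx"


def lift_list(table, memory, word, length):
    """Lift a list<u8> from either a linear-memory pointer or a lazy-value index."""
    if word & INDEX_TAG:
        # Forwarding path: hand the pending value on without copying it
        # through the intermediate component's linear memory.
        return table[word & ~INDEX_TAG]
    # Ordinary path: read the bytes out of linear memory.
    return bytes(memory[word:word + length])


table = {0: b"payload"}
memory = bytearray(b"in-memory value!")
forwarded = lift_list(table, memory, INDEX_TAG | 0, 7)
copied = lift_list(table, memory, 3, 6)
```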

(There are some delicate lifetime issues to work out here regarding return values and `post-return`, but I think with the explicit `finish-call`, it can all work out?)

## Relation to caller-supplied buffer types

This is an idea that I think is complementary to the new caller-supplied buffer types proposed in #369; see [this comment](https://github.com/WebAssembly/component-model/issues/369#issuecomment-2251444752) and preceding comments for how the buffer types address a subtly different (and more specialized/advanced) use case.

(Apologies in advance for no replies over the next 2 weeks; I'll be on vacation.)
