Stream: git-wasmtime

Topic: wasmtime / issue #3153 Implement the memory64 proposal in...


view this post on Zulip Wasmtime GitHub notifications bot (Aug 05 2021 at 22:16):

bjorn3 commented on issue #3153:

Heaps are always precisely sized. This means that every call to
memory.grow will incur a memcpy of memory from the old heap to the
new. We probably want to at least look into mremap on Linux and
otherwise try to implement schemes where dynamic heaps have some
reserved pages to grow into to help amortize the cost of
memory.grow.

I believe mmap has a flag that prevents overwriting already-mapped memory. Combining this with MAP_FIXED would allow growing the heap without moving it, at least some of the time.
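
(For illustration only: a minimal sketch of that idea on Linux, assuming the flag in question is MAP_FIXED_NOREPLACE and using the libc crate; this is not how Wasmtime's memory allocator actually works.)

use libc::{MAP_ANONYMOUS, MAP_FAILED, MAP_FIXED_NOREPLACE, MAP_PRIVATE, PROT_READ, PROT_WRITE};

// Sketch: ask the kernel for the pages immediately after the current heap,
// refusing to clobber anything already mapped there. On failure the caller
// would fall back to mapping a new region and copying (the memcpy path).
unsafe fn try_grow_in_place(base: *mut u8, old_len: usize, extra: usize) -> bool {
    let hint = base.add(old_len) as *mut libc::c_void;
    let ret = libc::mmap(
        hint,
        extra,
        PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
        -1,
        0,
    );
    if ret == MAP_FAILED {
        return false;
    }
    if ret != hint {
        // Older kernels ignore MAP_FIXED_NOREPLACE and treat the address as a
        // hint, so the mapping may land elsewhere; undo it and report failure.
        libc::munmap(ret, extra);
        return false;
    }
    true
}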

view this post on Zulip Wasmtime GitHub notifications bot (Aug 05 2021 at 22:28):

github-actions[bot] commented on issue #3153:

Subscribe to Label Action

cc @fitzgen, @peterhuene

This issue or pull request has been labeled: "cranelift", "cranelift:meta", "cranelift:wasm", "fuzzing", "wasmtime:c-api"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 15:28):

alexcrichton commented on issue #3153:

Hm ok so I may need some assistance with that error. Currently we have:

enum InstructionData {
    HeapData(Opcode /* u16 */, Value /* u32 */, ir::Heap /* u32 */, ir::immediates::Uimm32 /* u32 */),
    // ...
}

This results in a 16-byte enum size in total because the discriminant can be packed into the space preceding Opcode. This PR changes the Uimm32 there to Uimm64, which inflates the size of the enum to 24 bytes:

struct Before(u8, u16, u32, u32, u32);
// .. vs ..
struct After(u8, u16, u32, u32, u64);

and there's a test asserting that this enum is 16 bytes in size.
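
(As a self-contained illustration of those two layouts, assuming repr(Rust) packs the fields optimally, which it does in practice:)

use std::mem::size_of;

// Stand-in layouts for the variant above: the u8 models the enum discriminant,
// followed by the Opcode (u16), Value (u32), Heap (u32), and the immediate.
struct Before(u8, u16, u32, u32, u32);
struct After(u8, u16, u32, u32, u64);

fn main() {
    assert_eq!(size_of::<Before>(), 16); // 15 bytes of fields, padded to 4-byte alignment
    assert_eq!(size_of::<After>(), 24);  // 19 bytes of fields, and the u64 raises alignment to 8
}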

I'm having trouble figuring out how best to shrink this back to 16 bytes. Some things I've tried:

Do others have alternative ideas that could be implemented, or guidance on which of the above paths is the way to go here?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 16:48):

bjorn3 commented on issue #3153:

load, store, etc. also use 32-bit offsets even though they can already be used with 64-bit pointers. In cg_clif, if the offset overflows the Offset32, I emit an iadd_imm ptr, offset before loading or storing. The same could be done for heap_addr to keep a 32-bit offset.
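
(A rough sketch of that pattern with a cranelift-frontend FunctionBuilder; the helper name and the use of an I64 load are illustrative, not cg_clif's actual code.)

use cranelift_codegen::ir::{types, InstBuilder, MemFlags, Value};
use cranelift_frontend::FunctionBuilder;

// If the constant offset fits in the instruction's signed 32-bit immediate,
// fold it into the load; otherwise materialize `ptr + offset` first.
fn emit_load_with_offset(builder: &mut FunctionBuilder, ptr: Value, offset: i64) -> Value {
    match i32::try_from(offset) {
        Ok(off32) => builder.ins().load(types::I64, MemFlags::new(), ptr, off32),
        Err(_) => {
            let addr = builder.ins().iadd_imm(ptr, offset);
            builder.ins().load(types::I64, MemFlags::new(), addr, 0)
        }
    }
}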

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 17:12):

cfallin commented on issue #3153:

I may be misunderstanding the use of that immediate, but it looks to me like the field is actually used to denote the size of the access, not of the underlying heap (link), on the heap_addr instruction (which appears to be the only instruction that uses that format). Given that individual memory accesses definitely don't access 4GiB-sized values, I think 32 bits is still OK here?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 17:23):

alexcrichton commented on issue #3153:

Heh, unfortunately I think you've fallen into the same trap I've fallen into a number of times. The heap_addr instruction does indeed document its immediate as an access size, but that's not technically how the wasm backend uses it. When translating, assuming there's a guard page, we pass the wasm memory instruction's static offset as the "access size". We don't actually access all of those bytes, but for the purposes of validation it's sufficient.

Put another way: because wasm memory offsets can be 64 bits wide (with memory64), and because we use that offset as the "access size", that's the motivation for increasing the bit-width of the access size.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 17:42):

cfallin commented on issue #3153:

Hmm, I see; unexpected but I guess that usage makes sense.

IMHO the best option here is probably a hybrid approach during translation: if the offset fits in 32 bits (which will be the case the overwhelming majority of the time, hopefully), do as we presently do; if larger, then emit an explicit add instruction to compute the addr+offset, as @bjorn3 suggests above. The reason I'm hesitant to just accept the 64-bit immediate and corresponding InstructionData growth is that I think the latter will have a surprisingly large impact on compile-time performance; allocating the space only when needed (via another instruction) should be more efficient, even though it's less theoretically clean.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 17:47):

alexcrichton commented on issue #3153:

I considered doing the iadd_imm trick, but the issue is that overflow needs to be detected (otherwise the 65-bit effective offset isn't calculated correctly). That would involve some sort of trap-if-overflow check, which means every memory access would have two branches instead of one as with this PR. I was hoping to avoid that because I'm assuming the perf hit is already quite large and making it even larger could be worse.

I think it would work for the time being, but I wanted to explore possibilities for updating the cranelift instruction. I think it makes sense not to inflate 16 bytes to 24 bytes, but I wasn't having much luck brainstorming ways to keep it at 16 bytes (unless we want to change heap indices to u8 and disallow more than 256 heaps, which I'm not sure why we'd want anyway).

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 18:00):

cfallin commented on issue #3153:

It would only be two checks for offsets >4GiB, no? I'd expect that most Wasm producers use offsets for things like structs and it would be very strange to have an offset that large, so in my thinking at least, it makes sense to treat this as a "needed for correctness but can be slow" corner. I may be misunderstanding the issue though?

Re: heap count, I'd be hesitant about limiting ourselves to 256 heaps; we might bump our heads on that in a future multi-module / component world (e.g. a heap per Wasm component, with 300 components in a very large dependency graph -- at least, that's not totally implausible).

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 18:03):

alexcrichton commented on issue #3153:

Hm, nah, that's a good point. We already incur a real bounds check on wasm32 for 2GiB+ offsets, so having an extra bounds check on wasm64 for 4GiB+ offsets makes sense. I'll go ahead and implement that, doing an addition with an overflow check if the offset doesn't fit in a u32.
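
(For concreteness, one way such an overflow-checked addition could look with the Cranelift APIs of that era; the helper name and the choice of trap code are assumptions, not necessarily what the PR emits.)

use cranelift_codegen::ir::{condcodes::IntCC, types, InstBuilder, TrapCode, Value};
use cranelift_frontend::FunctionBuilder;

// Fold a static offset that doesn't fit in a u32 into the 64-bit index with an
// explicit wraparound check, trapping if `index + offset` overflows 64 bits.
fn add_offset_checked(builder: &mut FunctionBuilder, index: Value, offset: u64) -> Value {
    let offset_val = builder.ins().iconst(types::I64, offset as i64);
    let sum = builder.ins().iadd(index, offset_val);
    // An unsigned sum that wrapped around is smaller than either operand.
    let wrapped = builder.ins().icmp(IntCC::UnsignedLessThan, sum, index);
    builder.ins().trapnz(wrapped, TrapCode::HeapOutOfBounds);
    sum
}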

view this post on Zulip Wasmtime GitHub notifications bot (Aug 06 2021 at 18:45):

alexcrichton commented on issue #3153:

For reviewers, extra care when reviewing cranelift/wasm/src/code_translator.rs is likely in order. That's some tricky code that's pretty meticulously set up. It has large effects on performance and is the linchpin of wasm heap correctness, so I just want to triple-check I've got it right there.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 11 2021 at 21:54):

alexcrichton commented on issue #3153:

@peterhuene oh I realize now I tagged you here without much context. Would you be ok reviewing the Wasmtime-specific pieces of this? If not no worries, I'm sure I can find another unwilling victim candidate.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 11 2021 at 21:56):

peterhuene commented on issue #3153:

:+1: I'll review it now.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 12 2021 at 14:40):

alexcrichton commented on issue #3153:

Thanks @peterhuene!

