wasmtime / Issue #1453 Rejiggering the CodeSinks for Spid... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / Issue #1453 Rejiggering the CodeSinks for Spid...

Wasmtime GitHub notifications bot (Apr 01 2020 at 10:46):

Cranelift is used in Spidermonkey, as part of an integration called Baldrdash. In Baldrdash, we're currently barely using the RelocSink capabilities, because the information it provides us is incomplete:

when tracking call relocations, we'd need to get the source loc information. This is useful for stack frame iteration, e.g. displaying the wasm bytecode offset in stack traces, when an error occurs.

indirect calls don't have relocations, but Spidermonkey needs to track the call site during stack frame iteration, again (the return address is used as a key for a mapping from call sites to call metadata, including the wasm bytecode offset).

stack maps in Spidermonkey are keyed by the address of the instruction following the instruction to which the stack map relates. This is because when we want to get a stack map for a call/software interrupt, we're usually in the middle of stack frame iteration, and we only have access to the return address for calls, or the address of the next value of PC for software interrupts. Currently, the StackMapSink doesn't offer the information of the next instruction's address, so we have to do something very manual for this (iterate over every instruction, if it's a safepoint remember it for the next instruction, if there was a remembered safepoint add the stackmap with the instruction+size offset to the list of stackmaps). An example may help understanding here:
; Cranelift IR
   safepoint
   call $0; for instance
   iadd
; Machine code
   ; safepoint generates nothing
   call {reloc}
   iadd ; effectively the return address of the above call, so the address we have access to when looking up a stack map for the call during stack frame iteration.
For the new backend integration, we won't be able to do this anymore, for several reasons:

in general, the new backend will be much more likely to generate several machine instructions for a single Cranelift IR opcode, which means we can't use the InstOffsetIter correctly anymore, and rely on this to give us precise offset information.

the new backend doesn't implement encinfo, it doesn't have a notion of Diversions (regalloc does the right thing now!), etc.

My proposal is the following:

Pass the current SourceLoc to MemoryCodeSink::reloc_external, so it addresses item 1.

Add a method RelocSink::track_call_site (or create a new sink for this), that is called with: IR opcode, source location, address of next instruction (= return address). This will be sufficient to address item 2, and could be generally useful for all kinds of calls. At the limit, we don't even need to pass the SourceLoc to reloc_external as proposed in the previous bullet point, since the RelocSink impl could take care of realizing it's notified with both a direct call site here and an external reloc for this call site, but it feels less ergonomic an API to use.

Among wasmtime + cranelift crates and Spidermonkey, all users are NullStackmapSink, except for Spidermonkey. So I'd propose that we do something very ad-hoc here, by adding the address of the next instruction following the instruction to which the stack map applies. This would address item 3 in the above requirements.

Alternatives I've considered:

for all the bullet points, add a generic InstSink which would be called for each instruction with the IR opcode, machine function-start relative offset, machine instruction size, source location. It feels overly generic for the problem we have at hand, but then we have a very general mechanism that might help for more use cases. I'm not sure it'd deal well with the 1 -> N mapping between IR opcode and machine instructions (in the new backend), unless the client had some knowledge of the machine instructions (which doesn't seem very ergonomic), or we passed the machine encoding (which we don't have in the new backend).

for 3, spidermonkey could remember the stackmap, and when inserting it in our own Spidermonkey relocations, pattern-match the machine instruction's encoding + size. It means one pattern-matching for each possible machine instruction after the stack map (currently, this means 3: software trap, call, call indirect), times the number of different architectures (currently 1, x86_64). That seems awful.

I will implement my proposal locally, but I'd be happy to read what people think of it!

Wasmtime GitHub notifications bot (Apr 01 2020 at 10:46):

bnjbvr labeled Issue #1453:

Cranelift is used in Spidermonkey, as part of an integration called Baldrdash. In Baldrdash, we're currently barely using the RelocSink capabilities, because the information it provides us is incomplete:

when tracking call relocations, we'd need to get the source loc information. This is useful for stack frame iteration, e.g. displaying the wasm bytecode offset in stack traces, when an error occurs.

indirect calls don't have relocations, but Spidermonkey needs to track the call site during stack frame iteration, again (the return address is used as a key for a mapping from call sites to call metadata, including the wasm bytecode offset).

stack maps in Spidermonkey are keyed by the address of the instruction following the instruction to which the stack map relates. This is because when we want to get a stack map for a call/software interrupt, we're usually in the middle of stack frame iteration, and we only have access to the return address for calls, or the address of the next value of PC for software interrupts. Currently, the StackMapSink doesn't offer the information of the next instruction's address, so we have to do something very manual for this (iterate over every instruction, if it's a safepoint remember it for the next instruction, if there was a remembered safepoint add the stackmap with the instruction+size offset to the list of stackmaps). An example may help understanding here:
; Cranelift IR
   safepoint
   call $0; for instance
   iadd
; Machine code
   ; safepoint generates nothing
   call {reloc}
   iadd ; effectively the return address of the above call, so the address we have access to when looking up a stack map for the call during stack frame iteration.
For the new backend integration, we won't be able to do this anymore, for several reasons:

in general, the new backend will be much more likely to generate several machine instructions for a single Cranelift IR opcode, which means we can't use the InstOffsetIter correctly anymore, and rely on this to give us precise offset information.

the new backend doesn't implement encinfo, it doesn't have a notion of Diversions (regalloc does the right thing now!), etc.

My proposal is the following:

Pass the current SourceLoc to MemoryCodeSink::reloc_external, so it addresses item 1.

Add a method RelocSink::track_call_site (or create a new sink for this), that is called with: IR opcode, source location, address of next instruction (= return address). This will be sufficient to address item 2, and could be generally useful for all kinds of calls. At the limit, we don't even need to pass the SourceLoc to reloc_external as proposed in the previous bullet point, since the RelocSink impl could take care of realizing it's notified with both a direct call site here and an external reloc for this call site, but it feels less ergonomic an API to use.

Among wasmtime + cranelift crates and Spidermonkey, all users are NullStackmapSink, except for Spidermonkey. So I'd propose that we do something very ad-hoc here, by adding the address of the next instruction following the instruction to which the stack map applies. This would address item 3 in the above requirements.

Alternatives I've considered:

for all the bullet points, add a generic InstSink which would be called for each instruction with the IR opcode, machine function-start relative offset, machine instruction size, source location. It feels overly generic for the problem we have at hand, but then we have a very general mechanism that might help for more use cases. I'm not sure it'd deal well with the 1 -> N mapping between IR opcode and machine instructions (in the new backend), unless the client had some knowledge of the machine instructions (which doesn't seem very ergonomic), or we passed the machine encoding (which we don't have in the new backend).

for 3, spidermonkey could remember the stackmap, and when inserting it in our own Spidermonkey relocations, pattern-match the machine instruction's encoding + size. It means one pattern-matching for each possible machine instruction after the stack map (currently, this means 3: software trap, call, call indirect), times the number of different architectures (currently 1, x86_64). That seems awful.

I will implement my proposal locally, but I'd be happy to read what people think of it!

Wasmtime GitHub notifications bot (Apr 01 2020 at 10:46):

github-actions[bot] commented on Issue #1453:

Subscribe to Label Action

This issue or pull request has been labeled: "cranelift"

<details> <summary>Users Subscribed to "cranelift"</summary>

@bnjbvr

</details>

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.

Wasmtime GitHub notifications bot (Apr 01 2020 at 18:16):

pchickey commented on Issue #1453:

Sounds good! Thanks for writing it up.

Last updated: Apr 17 2025 at 03:17 UTC