wasmtime / PR #2143 Relocation of DW_OP_{bra, skip} instr... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #2143 Relocation of DW_OP_{bra, skip} instr...

Wasmtime GitHub notifications bot (Aug 19 2020 at 08:30):

ggreif opened PR #2143 from main to main:

Wasmtime GitHub notifications bot (Aug 19 2020 at 08:31):

ggreif edited PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 09:13):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 09:44):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 10:40):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 10:58):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 13:19):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 19 2020 at 13:19):

ggreif created PR Review Comment:

target is informational only, never used

Wasmtime GitHub notifications bot (Aug 19 2020 at 13:22):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 13:51):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 19 2020 at 13:51):

ggreif created PR Review Comment:

This is hard-coded to be little-endian for now.

Wasmtime GitHub notifications bot (Aug 19 2020 at 16:02):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Wasmtime GitHub notifications bot (Aug 19 2020 at 21:49):

ggreif edited PR #2143 from main to main:

Please don't look just yet.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 20 2020 at 17:29):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 20 2020 at 17:30):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 20 2020 at 17:30):

ggreif created PR Review Comment:

This I'll submit separately.

Wasmtime GitHub notifications bot (Aug 20 2020 at 17:31):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 20 2020 at 17:31):

ggreif created PR Review Comment:

Dito, refactoring.

Wasmtime GitHub notifications bot (Aug 20 2020 at 22:18):

yurydelendik created PR Review Comment:

Let's make jump_arcs as field of CompiledExpression (from_label will have empty map)

Wasmtime GitHub notifications bot (Aug 20 2020 at 22:18):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:31):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:34):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:35):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:35):

ggreif created PR Review Comment:

Can we avoid this clone?

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:37):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:37):

ggreif created PR Review Comment:

Due to different macro expansions, sometimes this becomes a dead write. Can we suppress the warning in this particular cases?

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:37):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:38):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:38):

ggreif created PR Review Comment:

TODO: Go through all DW_OP_* and check if they are mentioned here.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:50):

ggreif updated PR #2143 from main to main:

Please don't look just yet.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:50):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 10:50):

ggreif created PR Review Comment:

See #2154.

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:08):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record the LandingPad's translated positions into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:08):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:12):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:16):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:35):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:47):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 11:53):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 12:55):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:05):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new ExpressionPart constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:07):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:07):

ggreif created PR Review Comment:

We now mention all possibilities, but bail out on some, below. We are strictly doing a better job now than before.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:07):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:07):

ggreif created PR Review Comment:

Done, thanks for the tip, it simplified matters considerably!

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:19):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:19):

ggreif created PR Review Comment:

Actually these might be pass-through, as they would push a Wasm bytecode address, which can be DW_OP_deref-ed. Will leave them here until we have a real use case though. It is hard to predict heap addresses at DWARF generation time.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:19):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:21):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is done.
Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:24):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:27):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:34):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:34):

yurydelendik created PR Review Comment:

it is hard code to read without comment, perhaps let found = old_to_new.iter().find(|(_, new)| new == code_buf.len()); if let Some((old, new)) = found.cloned() { // update old_to_new ...

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:45):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:45):

ggreif created PR Review Comment:

I tried to have in &mut old_to_new but the borrow checker didn't like it. Modifying an iterated-over container is a bad idea, I suppose :-)

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:46):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:46):

ggreif created PR Review Comment:

It causes CI failures, too.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:52):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 13:52):

yurydelendik created PR Review Comment:

I recommend removing unread_bytes as a local and add it as a "parameter" to the push! macro

Wasmtime GitHub notifications bot (Aug 21 2020 at 14:10):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 21 2020 at 14:13):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 14:13):

ggreif created PR Review Comment:

Added better comment, since I cannot locate any encoding hint in the context that would let me decide cleanly. For that we would need a gimli-style writer. Another refactor to tackle sometime.

Wasmtime GitHub notifications bot (Aug 21 2020 at 14:13):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 21 2020 at 14:18):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 21 2020 at 14:18):

yurydelendik created PR Review Comment:

indeed. If I understand the logic, you are searching for new == code.buf.len(). Means you need to use find old_to_new.iter().find(|(_, new)| new == code_buf.len()).cloned(). Then you can modify old_to_new based on found (if) old. No?

Wasmtime GitHub notifications bot (Aug 21 2020 at 20:00):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 00:19):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 00:19):

ggreif created PR Review Comment:

As suggested (elsewhere?) this could be a find followed by inserts on success.

Wasmtime GitHub notifications bot (Aug 22 2020 at 00:20):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 22 2020 at 00:21):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 00:21):

ggreif created PR Review Comment:

remove

Wasmtime GitHub notifications bot (Aug 22 2020 at 00:26):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 08:14):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 09:46):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 09:47):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 09:47):

ggreif created PR Review Comment:

Done.

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:19):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation.

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:20):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:20):

ggreif created PR Review Comment:

Done!

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:21):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:21):

ggreif created PR Review Comment:

Done.

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:22):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:22):

ggreif created PR Review Comment:

This is a nice use case for https://doc.rust-lang.org/std/primitive.bool.html#method.then

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:34):

ggreif has marked PR #2143 as ready for review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:35):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 10:36):

ggreif requested yurydelendik for a review on PR #2143.

Wasmtime GitHub notifications bot (Aug 22 2020 at 11:06):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 11:06):

ggreif created PR Review Comment:

Leaving here the _nightly_-depending diff

$ git diff
diff --git a/crates/debug/src/lib.rs b/crates/debug/src/lib.rs
index 259c86975..901b171de 100644
--- a/crates/debug/src/lib.rs
+++ b/crates/debug/src/lib.rs
@@ -1,6 +1,7 @@
 //! Debug utils for WebAssembly using Cranelift.

 #![allow(clippy::cast_ptr_alignment)]
+#![feature(bool_to_option)]

 use anyhow::{bail, ensure, Error};
 use object::{RelocationEncoding, RelocationKind};
diff --git a/crates/debug/src/transform/expression.rs b/crates/debug/src/transform/expression.rs
index ceee274f3..002bacd33 100644
--- a/crates/debug/src/transform/expression.rs
+++ b/crates/debug/src/transform/expression.rs
@@ -340,10 +340,10 @@ impl CompiledExpression {
                         for part in &self.parts {
                             match part {
                                 CompiledExpressionPart::Code(c) => {
-                                    if let Some((old, new)) = old_to_new
-                                        .iter()
-                                        .find(|(_, new)| *new == &code_buf.len())
-                                        .map(|(old, new)| (*old, *new))
+                                    if let Some((old, new)) =
+                                        old_to_new.iter().find_map(|(old, new)| {
+                                            (*new == code_buf.len()).then(|| (*old, *new))
+                                        })
                                     {
                                         // when `new` comes from the preceding LandingPad
                                         for i in 1..c.len() {

for posterity :-)

Wasmtime GitHub notifications bot (Aug 22 2020 at 11:16):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 12:02):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 12:08):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 12:08):

ggreif created PR Review Comment:

Removed.

Wasmtime GitHub notifications bot (Aug 22 2020 at 13:47):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 13:47):

yurydelendik created PR Review Comment:

use to_le_bytes and copy_from_slice ?

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:01):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:01):

ggreif created PR Review Comment:

Does that allow me to position in the array?

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:04):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:04):

ggreif created PR Review Comment:

N/m, I have it: https://doc.rust-lang.org/book/ch04-03-slices.html

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:07):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:31):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Example:
                  DW_AT_location    (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
                  DW_AT_location    (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:32):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently?

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:42):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:42):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 22 2020 at 16:42):

ggreif created PR Review Comment:

Done.

Wasmtime GitHub notifications bot (Aug 24 2020 at 21:25):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 24 2020 at 21:25):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 24 2020 at 21:25):

yurydelendik created PR Review Comment:

is combined == to result of parts.pop()? so you don't have to clone

Wasmtime GitHub notifications bot (Aug 24 2020 at 21:26):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 10:36):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 10:36):

ggreif created PR Review Comment:

What I'd really like to do is:
            if let (CompiledExpressionPart::Code(cc2), Some(CompiledExpressionPart::Code(cc1))) =
                (&part, parts.last())
            {
        cc1.extend_from_slice(cc2);
            } else {
                parts.push(CompiledExpressionPart::LandingPad {
                    original_pos: (expr.0.len().into_u64() - $unread) as usize,
                });
                parts.push(part)
            }
But for that Code needs to carry a mutable vector.
I tried various other ways of getting rid of the .clone() but I guess my Rust-fu is not strong enough for that. I'd love to see your idea written out though. Indeed the .pop() removes the Code which has cc1 in it, and then it gets appended by cc2 and. pushed back as a fresh Code.

Wasmtime GitHub notifications bot (Aug 25 2020 at 10:39):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 25 2020 at 13:26):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 13:32):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 13:32):

yurydelendik created PR Review Comment:

You are not wrong above, just use parts.last_mut() instead of .last().

Wasmtime GitHub notifications bot (Aug 25 2020 at 15:04):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 15:04):

ggreif created PR Review Comment:

Success!

Wasmtime GitHub notifications bot (Aug 25 2020 at 15:05):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 25 2020 at 15:12):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 25 2020 at 20:17):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 25 2020 at 22:11):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 22:11):

yurydelendik created PR Review Comment:

I think this and above test cases are incorrect. Can you verify that?

Wasmtime GitHub notifications bot (Aug 25 2020 at 23:50):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 25 2020 at 23:50):

ggreif created PR Review Comment:

The parsing tests pass
test transform::expression::tests::test_debug_parse_expressions ... ok
The tests themselves are not intended to make sense, if that is what you mean. The example in the PR description is an actual location expression with its translation.

Wasmtime GitHub notifications bot (Aug 26 2020 at 00:22):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 00:22):

ggreif created PR Review Comment:

The original_pos: 5 appearing twice is probably which is confusing. It confused me too. But actually it makes sense. The jump goes from 5 to 7. The LandingPad before Code just orients it and tells that there are potential jump targets in Code (pre-translation) pos 6: DW_OP_lit0, pos 7: DW_OP_stack_value.

Wasmtime GitHub notifications bot (Aug 26 2020 at 00:23):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 00:26):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 00:28):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:13):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:43):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at all relevant positions:

immediately after a Jump

at actual Jump targets

before Code chunks when the jump target goes properly into an inner bytecode position.

LandingPads are identified by the bytecode instruction offset _before_ translation.

Additionally we now parse the Jump instructions. Upon writeing we record every LandingPad's translated position into a relocation map and using that we are able to recompute the target distance for every jump-like instruction.

Since we don't know in advance if we will need LandingPads, we speculatively add all, and clean up after parsing is concluded. At this point the map from jump sources to destinations is complete.

Caveat: Currently we make no attempt to gracefully degrade when jump targets point into a multibyte operation (i.e. an illegal jump target).

Progress:

relocation of branch edges works

open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:48):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTarget tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get corresponding JumpTarget token.

We bail in two situations:

frame_ is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:49):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTarget tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get corresponding JumpTarget token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:49):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:50):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:51):

ggreif edited PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 16:52):

ggreif requested yurydelendik for a review on PR #2143.

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:23):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:23):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:23):

yurydelendik created PR Review Comment:

can we revert this change?

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:23):

yurydelendik created PR Review Comment:

    // Looks hacky but it is fast; does not need to be really exact.

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:23):

yurydelendik created PR Review Comment:

        // Create somewhat unique hash data -- using part of

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:23):

yurydelendik created PR Review Comment:

        // Internal hash_data test (theoretically can fail intermittently).

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:39):

yurydelendik updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:39):

yurydelendik deleted PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:39):

yurydelendik deleted PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 17:39):

yurydelendik deleted PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:35):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:35):

ggreif created PR Review Comment:

                parts.push(CompiledExpressionPart::Code(code_chunk));
                code_chunk = Vec::new();

like this?

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:35):

ggreif deleted PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:36):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:36):

ggreif created PR Review Comment:

                parts.push(CompiledExpressionPart::Code(code_chunk));
                code_chunk = Vec::new();

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:37):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:39):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:40):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:40):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:44):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:44):

yurydelendik created PR Review Comment:

Can push!( stay? I was thinking about code_chunk.clone() and code_chunk = Vec::new() stuff.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:46):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:51):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:51):

ggreif created PR Review Comment:

I hope to keep push!( (see above), but no idea how to fix the warning without artificially reading the code_chunk after the last macro invocation.

Wasmtime GitHub notifications bot (Aug 26 2020 at 18:51):

ggreif edited PR Review Comment.

Wasmtime GitHub notifications bot (Aug 26 2020 at 19:04):

yurydelendik submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 19:04):

yurydelendik created PR Review Comment:

I see. Yeah, it just needs if !code_chunk.is_empty() { without code_chunk = Vec::new()

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:17):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:18):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:18):

ggreif created PR Review Comment:

I did something that works. But I feel dense and am not sure if this is what you meant. Please feel free to fix.

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:20):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:22):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:22):

ggreif created PR Review Comment:

            // Discarding out-of-bounds jumps (also some of falsely detected ops)

Wasmtime GitHub notifications bot (Aug 26 2020 at 20:22):

ggreif updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 21:41):

yurydelendik updated PR #2143 from main to main:

The fundamental problem is that the target distance of jump-like operations may change in the DWARF expression translation process. Intervening DW_OP_deref will expand to about 10 bytes, for example.

So the jumps must be relocated. We approach this task by inserting artificial LandingPad markers (new CompiledExpressionParts constructors) into the parsed vector at actual Jump targets.

LandingPads are identified by JumpTargetMarker tokens which are generated on the fly.

Additionally we now parse the Jump instructions. These also get their corresponding JumpTargetMarker token.

We bail in two situations:

frame_base is too complicated (i.e. itself contains Jump)

some jump distance in the original expression is fishy.

Open problem: the need_deref flag. What when different branches want to set the flag differently? Probably not a biggie as long as all control-flow paths end e.g. with DW_OP_stack_value (or without).

Example:
DW_AT_location  (DW_OP_WASM_location 0x0 +1, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +3, DW_OP_plus_uconst 0x5, DW_OP_deref, DW_OP_stack_value)
gets transformed into:
DW_AT_location  (DW_OP_breg6 RBP-36, DW_OP_deref, DW_OP_dup, DW_OP_lit1, DW_OP_and, DW_OP_bra +5, DW_OP_lit1, DW_OP_shr, DW_OP_skip +19, DW_OP_plus_uconst 0x5, DW_OP_breg6 RBP-56, DW_OP_deref, DW_OP_consts +152, DW_OP_plus, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)

Wasmtime GitHub notifications bot (Aug 26 2020 at 21:47):

ggreif submitted PR Review.

Wasmtime GitHub notifications bot (Aug 26 2020 at 21:47):

ggreif created PR Review Comment:

I see, thanks!

Wasmtime GitHub notifications bot (Aug 26 2020 at 22:19):

yurydelendik merged PR #2143.

Last updated: Apr 18 2025 at 02:29 UTC