Stream: git-wasmtime

Topic: wasmtime / PR #8627 Use WASM function names in compiled o...


view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:16):

lpereira opened PR #8627 from lpereira:wasm-func-names-in-objs to bytecodealliance:main:

Instead of generating symbol names in the format
"wasm[$MODULE_ID]::function[$FUNCTION_INDEX]", generate (if possible) something more readable, such as "wasm[$MODULE_ID]::$FUNCTION_NAME". This helps when debugging or profiling the generated code.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:16):

lpereira requested alexcrichton for a review on PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:16):

lpereira requested wasmtime-core-reviewers for a review on PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:21):

bjorn3 commented on PR #8627:

Is it guaranteed that the names section only contains unique names?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:23):

alexcrichton commented on PR #8627:

Thanks! This makes sense to me and I've often thought this would be nice as well. A hesitation I've had in the past though is that I'm a bit wary about exposing user-provided data to native tools like perf and objdump and such. The name section has completely arbitrary strings and I while I can't say for certainty that native tooling behaves well in the face of arbitrary strings I suspect that there might be a security issue waiting to happen along the lines of "if you run ldd over a binary it can run arbitrary code" or something like that.

(a good example being @bjorn3's suggestion where the name section has no guarantees of uniqueness and I'm not sure what tooling would do with same-named-symbols)

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:37):

cfallin commented on PR #8627:

I wonder if we could define suitable "cleanup" for names: (i) rewrite duplicates with numeric suffixes, (ii) replace or remove any characters not in the usual set ([A-Za-z0-9-_] -- maybe spaces too? can tools handle those?), (iii) cap the length at something reasonable like 32 or 64 characters?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:46):

lpereira commented on PR #8627:

Could also add the function_index for the cases where uniqueness isn't guaranteed. Alright, I'll cook up a patch with the suggestions.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 17:47):

alexcrichton commented on PR #8627:

Yeah I think that'd work well (albeit bit a bit more involved in the implementation). While I can't say for certainty that spaces are not supported my guess is that tools would likely eventually choke on non-C-looking symbols so I'd be tempted to leave them out.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 18:02):

fitzgen commented on PR #8627:

How about

?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 18:03):

fitzgen edited a comment on PR #8627:

How about

?

This avoids needing to either add numbers into the sanitized name and then worrying about removing those numbers when capping the length, or needing to account for the length of the numbers that might be added after sanitizing and capping the name.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 18:41):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 18:46):

lpereira commented on PR #8627:

Alright, pushed some changes that:

I think this should be sufficient, although I'm not really sold on how good the string filtering code is. I wish there was a String::translate() that would work similarly to String::retain(), but replace the character with what the callback function returns. I'm a newbie in Rust, though, so maybe there's a way I couldn't figure out?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 18:50):

lpereira edited a comment on PR #8627:

Alright, pushed some changes that:

I think this should be sufficient, although I'm not really sold on how good the string filtering code is (in fact, it's pretty bad, as it always goes back to the beginning of the string to find the next match; I'll force-push a new thing soon). I wish there was a String::translate() that would work similarly to String::retain(), but replace the character with what the callback function returns. I'm a newbie in Rust, though, so maybe there's a way I couldn't figure out?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:05):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:06):

lpereira edited a comment on PR #8627:

Alright, pushed some changes that:

I think this should be sufficient. Can anyone please take a look again?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:07):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:08):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:08):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:08):

cfallin created PR review comment:

I think you can do something better with iterators, and also we'll want to avoid allocating a string in the common case -- perhaps something like:

let name = if name.chars().any(|c| bad_char(c)) {
  Cow::Owned(name.chars().map(|c| if bad_char(c) { '?' } else { c }).collect::<String>())
} else {
  Cow::Borrowed(&name[..])
}

Thoughts?

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:09):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:16):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:18):

lpereira submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 19:18):

lpereira created PR review comment:

Yup, right after I pushed this I found about the map() method. Let me use Cow too, as this should reduce quite a bit of writes in the happy path.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:28):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:29):

lpereira submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:29):

lpereira created PR review comment:

Done, thanks!

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:30):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:37):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:37):

cfallin created PR review comment:

.take(N) in the iterator chain should allow the truncation to happen inline (and stop iteration early if so, since the whole chain is lazy)

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:47):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:47):

lpereira submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:47):

lpereira created PR review comment:

Done.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:59):

cfallin submitted PR review:

Looks reasonable to me, thanks for the iterations!

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:59):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:59):

lpereira submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:59):

lpereira created PR review comment:

Actually, no, I'm not sure how to do this; I want to truncate the string after trim_matches('?') removed repeated occurrences of '?' in the cleaned string, not before.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 20:59):

cfallin has enabled auto merge for PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 21:00):

cfallin has disabled auto merge for PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 21:00):

cfallin commented on PR #8627:

Ah, force-pushed after my review; could you do subsequent updates as additional commits? Let me know when stable and I can re-review.

view this post on Zulip Wasmtime GitHub notifications bot (May 15 2024 at 23:17):

lpereira commented on PR #8627:

It's stable now. (I'm not using the take() method for the reason I stated above, so I reverted to the previous version.)

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 05:11):

jameysharp submitted PR review:

This is going to be such a big usability improvement for people using tools like perf! I'm excited.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 05:11):

jameysharp submitted PR review:

This is going to be such a big usability improvement for people using tools like perf! I'm excited.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 05:11):

jameysharp created PR review comment:

Maybe you wanted is_ascii_alphanumeric here instead, rather than allowing all Unicode alpha/numeric codepoints?

        let bad_char = |c: char| !c.is_ascii_alphanumeric() && !r"<>[]_-:@$".contains(c);

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 05:11):

jameysharp created PR review comment:

I guess the name should still be truncated even if there are no bad characters, right?

            Cow::Borrowed(&name[..Self::MAX_SYMBOL_LEN])

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 05:11):

jameysharp created PR review comment:

You mention that you intended to turn any run of multiple ? into a single ?. Note that trim_matches doesn't do that. Instead, it deletes all copies of the given pattern from both ends of the string. That's why it can return a borrowed &str instead of a heap-allocated String: it doesn't change the middle of the string, so it can just adjust the start pointer and the length.

I think this should do more or less what you want (though I'm typing it in GitHub's web editor without testing it so who knows). It deletes all ? at the beginning of the string, then compresses any run of multiple ? into a single ?. It can leave a single ? at the end of the string but I couldn't think of a simple way to fix that and I don't think it matters. Also it uses take like Chris suggested.

            let mut last_char = '?';
            let name = name
                .chars()
                .map(|c| if bad_char(c) { '?' } else { c })
                .filter(|c| { let keep = last_char != c; last_char = c; keep })
                .take(Self::MAX_SYMBOL_LEN)
                .collect::<String>();

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 05:33):

jameysharp commented on PR #8627:

Ah, the CI test failures are because tests/disas.rs is specifically looking for symbol names which contain both wasm and function in their names, and you've shortened the latter to func. This is near the beginning of disas_elf.

I don't like that this test is relying on this implementation detail of Wasmtime, but I suppose it's fine. So you can either change this PR to continue to use function, or change disas.rs to search for func.

Either way, you'll also need to run WASMTIME_TEST_BLESS=1 cargo test --test disas to update the tests to reflect this change. Some of them do have function names, and those now show up in the expected test output, which is a fantastic bonus.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:25):

lpereira created PR review comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:25):

lpereira submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:26):

lpereira submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:26):

lpereira created PR review comment:

Yup, had this change locally and forgot to push. Good catch!

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:29):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:39):

lpereira updated PR #8627.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 16:40):

lpereira commented on PR #8627:

Alright, thanks for the review! This should be the last round.

(@jameysharp, that filter wouldn't work -- words like "keep", which are fine, would be transformed to "kep" -- but I tweaked it to consider that case.)

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 17:42):

jameysharp commented on PR #8627:

(@jameysharp, that filter wouldn't work -- words like "keep", which are fine, would be transformed to "kep" -- but I tweaked it to consider that case.)

Oh, you're absolutely right. :sweat_smile:

Looks good to me, let's ship it!

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2024 at 17:58):

jameysharp merged PR #8627.


Last updated: Dec 23 2024 at 12:05 UTC