Custom hashers · wasmtime · Zulip Chat Archive

Would another option be to change the HashMap implementation used here? It doesn't seem to me like this would be a DoS attack vector, so perhaps a fast HashMap that'd be less robust against those would work here? @Alex Crichton, thoughts?

Alex Crichton (Nov 28 2023 at 14:50):

Ah yeah we likely don't need DoS protection for most of our usage of hash maps, so rather than having embedders specify custom hashers it should be fine to hardcode many hash maps to a faster hasher.

That being said, I'd probably first go the route of trying to reduce the usage of APIs that trigger hashing. That's not always possible, though: for example, if you have a single Linker and instantiate lots of small modules through it, there's not much we can do to reduce the hashing any more.
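(Editor's sketch of what "hardcoding a hash map to a faster hasher" can look like using only the standard library. `FastHasher`, `FastHashMap` and `build_exports` are hypothetical names for illustration. The multiply-rotate mixing is written in the spirit of rustc-hash's FxHasher but is a stand-in, not that crate's code and not what Wasmtime does. Like any hasher of this kind it gives up DoS resistance.)

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Illustrative multiply-rotate hasher (hypothetical stand-in for a crate
/// such as rustc-hash). Fast, but NOT resistant to hash-flooding attacks.
#[derive(Default)]
struct FastHasher {
    state: u64,
}

/// Odd 64-bit mixing constant; the exact value is an assumption of this sketch.
const SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

impl FastHasher {
    /// Mix one 64-bit word into the running state.
    fn add(&mut self, word: u64) {
        self.state = (self.state.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for FastHasher {
    /// Generic path: consume the input 8 bytes at a time, zero-padding the tail.
    fn write(&mut self, bytes: &[u8]) {
        for chunk in bytes.chunks(8) {
            let mut buf = [0u8; 8];
            buf[..chunk.len()].copy_from_slice(chunk);
            self.add(u64::from_le_bytes(buf));
        }
    }

    /// Integer fast paths skip the byte-slice loop entirely.
    fn write_u32(&mut self, n: u32) {
        self.add(n as u64);
    }
    fn write_u64(&mut self, n: u64) {
        self.add(n);
    }
    fn write_usize(&mut self, n: usize) {
        self.add(n as u64);
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// A `HashMap` with the hasher fixed at the type level, so callers never
/// pay for the default SipHash-based `RandomState`.
type FastHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FastHasher>>;

/// Hypothetical example data: export name -> index.
fn build_exports() -> FastHashMap<String, u32> {
    let mut exports = FastHashMap::default();
    exports.insert("add".to_string(), 0);
    exports.insert("mul".to_string(), 1);
    exports
}
```

Because the hasher lives in the type alias, the map is built with `FastHashMap::default()` rather than `HashMap::new()`; everything else (`insert`, `get`, iteration) is unchanged. With a real crate the alias would simply point at that crate's `BuildHasher` instead.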

Josh Groves (Nov 28 2023 at 15:22):

Yeah I currently use separate Linkers per module but still spend a significant amount of time hashing during the initial inserts. I can try to submit a PR for some of the spots I notice, would you prefer any specific fast hasher (rustc-hash, ahash, something else)?

Till Schneidereit (Nov 28 2023 at 16:48):

we have ahash in our audits, so that'd probably be good to use. Though I think rustc-hash should also just work in that regard

Lann Martin (Nov 28 2023 at 16:50):

~~Isn't ahash the default Rust hasher now (via being hashbrown's default)?~~ Ah no I see libstd opts out of ahash.

Alex Crichton (Nov 28 2023 at 17:08):

Alex Crichton (Nov 28 2023 at 17:09):

Josh Groves (Nov 28 2023 at 17:10):

Yeah I also want to try that soon, although I might have the opposite problem of having a large hashmap to search afterwards. I think a faster hasher would be beneficial either way

Alex Crichton (Nov 28 2023 at 17:12):

Alex Crichton (Nov 28 2023 at 17:13):

Also, are you doing repeated instantiations of the same module? Or instantiations of new modules primarily?

Alex Crichton (Nov 28 2023 at 17:13):

I ask b/c InstancePre can help with the former, repeated instantiations of the same module, and skips hash maps entirely

Josh Groves (Nov 28 2023 at 17:19):

That's also a good idea, it's mostly new modules right now but I think I could probably reuse many of them. I need to spend some more time understanding the implications of reusing modules and linkers, especially the trade-offs (e.g., inlining different constants into a module from the host at compilation time with separate modules, vs. reusing one module but using different environments or something)

Josh Groves (Nov 28 2023 at 17:21):

FWIW if it helps with context, my use case is analogous to Excel/Google Sheets spreadsheets where each formula becomes its own tiny wasm module (using its own linker) in my current design

Josh Groves (Nov 28 2023 at 17:22):

So it's easy to end up with tens of thousands-ish modules if they're all completely unique, but a lot of the time the formulas are the exact same but just need slightly different environments passed to them

Alex Crichton (Nov 28 2023 at 17:23):

ah ok makes sense, well in any case switching hashers definitely makes sense, and any other optimizations you'd like to apply to Linker or the compilation process in general I think would make sense too

Alex Crichton (Nov 28 2023 at 17:23):

most of Wasmtime's optimizations have been around runtime and InstancePre-to-Instance so this part isn't quite so well optimized

scottmcm (Nov 28 2023 at 18:41):

How much is it specifically that one? It looks like ImportKey is two indexes into an interned strings list? If so, that's a great candidate for some type-specific hashing logic. Or maybe even making a struct InternKey(u32); instead of using raw usize to shrink ImportKey from 16 bytes down to 8 on 64-bit, which would save a bunch of hashing work.

(This is totally a peanut-gallery comment because zulip put this thread in a digest and I've been looking recently at hashtables where the key is two integers. Feel free to ignore if I'm off-base.)

Optimize `Entity::eq` by scottmcm · Pull Request #10519 · bevyengine/bevy

(This is my first PR here, so I've probably missed some things. Please let me know what else I should do to help you as a reviewer!) Objective Due to rust-lang/rust#117800, the derive'd PartialEq:...

Save an instruction in `EntityHasher` by scottmcm · Pull Request #10648 · bevyengine/bevy

Objective Keep essentially the same structure of EntityHasher from #9903, but rephrase the multiplication slightly to save an instruction. cc @superdump Discord thread: https://discord.com/channels...
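(Editor's sketch of the `InternKey(u32)` idea from the message above. It assumes ImportKey really is two indexes into an interned strings list, as the message guesses. The struct and field names `WideImportKey`, `CompactImportKey`, `module` and `name` are hypothetical, not Wasmtime's actual definitions. It shows the size difference and a type-specific `Hash` impl that feeds the hasher a single 64-bit word.)

```rust
use std::hash::{Hash, Hasher};
use std::mem::size_of;

/// Hypothetical 32-bit index into an interned-strings list.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
struct InternKey(u32);

/// "Wide" layout: two raw `usize` indexes (16 bytes on 64-bit targets).
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct WideImportKey {
    module: usize,
    name: usize,
}

/// "Compact" layout: two 32-bit keys (8 bytes).
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
struct CompactImportKey {
    module: InternKey,
    name: InternKey,
}

impl CompactImportKey {
    /// Pack both indexes into one u64: module in the high half, name in the low half.
    fn packed(self) -> u64 {
        ((self.module.0 as u64) << 32) | (self.name.0 as u64)
    }
}

/// Type-specific hashing: one `write_u64` call instead of one write per field.
/// Equal keys pack to equal words, so this stays consistent with `Eq`.
impl Hash for CompactImportKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.packed());
    }
}

/// Returns (wide size, compact size) in bytes for the current target.
fn key_sizes() -> (usize, usize) {
    (size_of::<WideImportKey>(), size_of::<CompactImportKey>())
}
```

Whether this pays off depends on how hot the key actually is, which the thread leaves open; the single-word hash helps most when paired with a hasher that has a cheap `write_u64` path.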

Josh Groves (Nov 28 2023 at 18:48):

Unfortunately I don’t have great stats right now, the sampling profiler in VTune makes it difficult to get detailed statistics for sub-millisecond code without setting up artificial microbenchmarks. I just noticed SipHash show up in a few hot paths for wasmtime so I figured it might be worth raising here. That PartialEq optimization looks neat!

scottmcm (Nov 28 2023 at 18:54):

Entity in bevy shows up everywhere, so it's worth some uglification for even tiny speedups. I don't know nearly enough about wasmtime to know if ImportKey is anywhere close to as critical.

Poor codegen for derived `==` on simple 2-field struct · Issue #117800 · rust-lang/rust

Given a basic struct like this, #[derive(Copy, Clone, PartialEq, Eq)] pub struct Entity { g: u32, i: u32 } The generated == is suboptimal: #[no_mangle] pub fn derived_eq(x: &Entity, y: &Entity) -> ...

Josh Groves (Jan 26 2024 at 16:10):

Use `rustc-hash` for module exports by grovesNL · Pull Request #7828 · bytecodealliance/wasmtime

From a Zulip discussion a few months ago (https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Custom.20hashers) it was mentioned that custom hashers like rustc-hash should b...

Stream: wasmtime

Topic: Custom hashers

Josh Groves (Nov 28 2023 at 04:03):

Till Schneidereit (Nov 28 2023 at 13:39):
