tnachen commented on issue #5261:
@fitzgen
abrown commented on issue #5261:
Glancing at this, is there a test out there that checks the size of
SigData
? I have found that kind of test helpful in other parts of this project.
jameysharp commented on issue #5261:
I don't see an existing test for the size of
SigData
, no. I'm not entirely convinced we're doing the right thing with the existing tests—the fact that the most recent rustc upgrade broke one of the tests by doing a better job of laying out one of our structs makes me think maybe that's not exactly what we should be testing. But I don't have a better idea at the moment, and it's reassuring to know that at least we'll find out if the sizes change.
abrown commented on issue #5261:
... and it's reassuring to know that at least we'll find out if the sizes change.
Yeah, exactly. If such a test is too brittle across OSes or compiler versions for some reason then it doesn't make sense. But a general reassurance has been nice with, e.g., instruction data.
tnachen commented on issue #5261:
Thanks @jameysharp for the review! Updated based on comments, let me know if a test makes sense here (though as mentioned seems like the results will differ)
jameysharp commented on issue #5261:
I think it would be useful to have a test for
SigData
's size, but at the same time, I'm not expecting you to add it unless you want to.If you do want to write that test, I think we should replace the one use of
usize
(thestack_ret_arg
field) with a type whose size doesn't depend on the target's pointer size. That way the test doesn't depend on which target we're compiling for. Then it should only change when the Rust compiler changes its ABI, which is infrequent enough that we can deal with it.Given the other changes in this PR, I think giving that field the type
Option<u16>
ought to work fine, since it's an index into the arguments. As a bonus, that's a substantial improvement in size:
- On a 64-bit target,
Option<usize>
is 16 bytes, whileOption<u16>
is only 4 bytes.- Because
SigData
currently requires 8-byte alignment, it has 7 bytes of padding after the 1-bytecall_conv
field, so those 4 bytes can fit in the existing padding.- And that means this change would shrink
SigData
by 16 bytes, on top of the 8 bytes you've already cut with this PR.None of that is necessary for merging this PR. If you'd like to make these additional changes in this PR, feel free. If you want me to merge this PR as-is, let me know and I'm happy to do that (after the remaining
usize::from
change). These additional changes would make a good follow-up PR too.
tnachen commented on issue #5261:
@jameysharp sounds good I'll make these changes! seems like it's simple enough. In the mean time just ran the benchmark ->
Running `target/release/sightglass-cli benchmark -e /tmp/main.so -e /tmp/smaller-sig-data.so --stop-after compilation -- benchmarks/default.suite`
compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
Δ = 460930934.38 ± 351410455.73 (confidence = 99%)
smaller-sig-data.so is 1.01x to 1.07x faster than main.so!
[11432848277 12881522406.46 16050235396] main.so
[11050769166 12420591472.08 15404617502] smaller-sig-data.socompilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
No difference in performance.
[524369235 620830497.53 829897789] main.so
[492571419 606893937.78 868345984] smaller-sig-data.socompilation :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[324283951 399934581.50 681479969] main.so
[306842933 394721779.72 550360275] smaller-sig-data.so
fitzgen edited a comment on issue #5261:
@jameysharp sounds good I'll make these changes! seems like it's simple enough. In the mean time just ran the benchmark ->
Running `target/release/sightglass-cli benchmark -e /tmp/main.so -e /tmp/smaller-sig-data.so --stop-after compilation -- benchmarks/default.suite` compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm Δ = 460930934.38 ± 351410455.73 (confidence = 99%) smaller-sig-data.so is 1.01x to 1.07x faster than main.so! [11432848277 12881522406.46 16050235396] main.so [11050769166 12420591472.08 15404617502] smaller-sig-data.so compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [524369235 620830497.53 829897789] main.so [492571419 606893937.78 868345984] smaller-sig-data.so compilation :: cycles :: benchmarks/bz2/benchmark.wasm No difference in performance. [324283951 399934581.50 681479969] main.so [306842933 394721779.72 550360275] smaller-sig-data.so
fitzgen commented on issue #5261:
just ran the benchmark
Great results! 1-7% faster compilation times for
spidermonkey.wasm
is nothing to sneeze at!And this is before the further shrinking with the
stack_ret_arg
field, correct?
jameysharp commented on issue #5261:
The newest commit ("Change ret arg length to u16") looks good! CI says you need to run
cargo fmt
, but this is otherwise ready to merge.I think I did my math wrong before: the original version of the PR should have reduced the size of
SigData
by 4 bytes (2u32
-2u16
), not 8. This version replaces 2usize
with 2u16
, which means now we have 4u16
-aligned fields, so nothing fits into the 7 existing bytes of padding. Oh well.If you want to add a test for this now, the following should work, assuming I did my math right this time.
SigData
is currently 56 bytes (according to ddbug) and I believe you've reduced it to 40 bytes.#[test] fn sig_data_size() { // The size of `SigData` is performance sensitive, so make sure // we don't regress it unintentionally. assert_eq!(std::mem::size_of::<SigData>(), 40); }
I copied this from the test for
InstructionData
incranelift/codegen/src/ir/instructions.rs
, but you can find many examples withgit grep 'assert.*size_of::'
.
tnachen commented on issue #5261:
@jameysharp oh awesome! Thanks for the example, not sure what else I should add so I just copied what you have into the test code.
Ran the benchmark and got this ->
Running
target/release/sightglass-cli benchmark -e /tmp/main.so -e /tmp/smaller-sig-data.so --stop-after compilation -- benchmarks/default.suite
compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
Δ = 108117568.68 ± 43286306.96 (confidence = 99%)
smaller-sig-data.so is 1.12x to 1.27x faster than main.so!
[499997365 671467058.89 1397732830] main.so
[483578441 563349490.21 708900294] smaller-sig-data.socompilation :: cycles :: benchmarks/bz2/benchmark.wasm
Δ = 44610760.31 ± 31993506.80 (confidence = 99%)
smaller-sig-data.so is 1.03x to 1.20x faster than main.so!
[304879457 426284282.91 661512654] main.so
[296608459 381673522.60 829079445] smaller-sig-data.socompilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[10560613802 11918560571.18 15092693358] main.so
[9949435804 11698203046.57 14045075735] smaller-sig-data.so
fitzgen edited a comment on issue #5261:
@jameysharp oh awesome! Thanks for the example, not sure what else I should add so I just copied what you have into the test code.
Ran the benchmark and got this ->
Running `target/release/sightglass-cli benchmark -e /tmp/main.so -e /tmp/smaller-sig-data.so --stop-after compilation -- benchmarks/default.suite` compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 108117568.68 ± 43286306.96 (confidence = 99%) smaller-sig-data.so is 1.12x to 1.27x faster than main.so! [499997365 671467058.89 1397732830] main.so [483578441 563349490.21 708900294] smaller-sig-data.so compilation :: cycles :: benchmarks/bz2/benchmark.wasm Δ = 44610760.31 ± 31993506.80 (confidence = 99%) smaller-sig-data.so is 1.03x to 1.20x faster than main.so! [304879457 426284282.91 661512654] main.so [296608459 381673522.60 829079445] smaller-sig-data.so compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [10560613802 11918560571.18 15092693358] main.so [9949435804 11698203046.57 14045075735] smaller-sig-data.so
Last updated: Nov 22 2024 at 17:03 UTC