abrown opened issue #6293:
@alexcrichton sent me a [coverage report] for the ISLE-generated lowerings for x64. I noticed by searching for "sinkable_load" that various operations (FP?) are not using the case where the load is sunk. In the Wasmtime meeting, @alexcrichton pointed out that the fact that we don't know whether the address is aligned could be preventing load-sinking here, but recall ([here]) that VEX- and EVEX-encoded instructions do not have the alignment restriction. Since @alexcrichton has been adding VEX-encoded versions of many instructions, I would think that we should be able to load-sink in that case (i.e., "can load sink MINUS alignment restriction" AND "have AVX").
[coverage report]: https://storage.googleapis.com/oss-fuzz-coverage/wasmtime/reports/20230425/linux/src/wasmtime/target/debug/build/cranelift-codegen-d3be7d5f89248fcc/out/isle_x64.rs.html
[here]: https://github.com/bytecodealliance/wasmtime/issues/4767#issuecomment-1432149408I don't know what to do with this issue exactly, but @fitzgen thought I should record it as one. As mentioned in the Wasmtime meeting, it would be great to have better ways to find coverage gaps. This is at least one x64 lowering gap that I noticed (there are likely others) so here's an issue!
abrown labeled issue #6293:
@alexcrichton sent me a [coverage report] for the ISLE-generated lowerings for x64. I noticed by searching for "sinkable_load" that various operations (FP?) are not using the case where the load is sunk. In the Wasmtime meeting, @alexcrichton pointed out that the fact that we don't know whether the address is aligned could be preventing load-sinking here, but recall ([here]) that VEX- and EVEX-encoded instructions do not have the alignment restriction. Since @alexcrichton has been adding VEX-encoded versions of many instructions, I would think that we should be able to load-sink in that case (i.e., "can load sink MINUS alignment restriction" AND "have AVX").
[coverage report]: https://storage.googleapis.com/oss-fuzz-coverage/wasmtime/reports/20230425/linux/src/wasmtime/target/debug/build/cranelift-codegen-d3be7d5f89248fcc/out/isle_x64.rs.html
[here]: https://github.com/bytecodealliance/wasmtime/issues/4767#issuecomment-1432149408I don't know what to do with this issue exactly, but @fitzgen thought I should record it as one. As mentioned in the Wasmtime meeting, it would be great to have better ways to find coverage gaps. This is at least one x64 lowering gap that I noticed (there are likely others) so here's an issue!
abrown labeled issue #6293:
@alexcrichton sent me a [coverage report] for the ISLE-generated lowerings for x64. I noticed by searching for "sinkable_load" that various operations (FP?) are not using the case where the load is sunk. In the Wasmtime meeting, @alexcrichton pointed out that the fact that we don't know whether the address is aligned could be preventing load-sinking here, but recall ([here]) that VEX- and EVEX-encoded instructions do not have the alignment restriction. Since @alexcrichton has been adding VEX-encoded versions of many instructions, I would think that we should be able to load-sink in that case (i.e., "can load sink MINUS alignment restriction" AND "have AVX").
[coverage report]: https://storage.googleapis.com/oss-fuzz-coverage/wasmtime/reports/20230425/linux/src/wasmtime/target/debug/build/cranelift-codegen-d3be7d5f89248fcc/out/isle_x64.rs.html
[here]: https://github.com/bytecodealliance/wasmtime/issues/4767#issuecomment-1432149408I don't know what to do with this issue exactly, but @fitzgen thought I should record it as one. As mentioned in the Wasmtime meeting, it would be great to have better ways to find coverage gaps. This is at least one x64 lowering gap that I noticed (there are likely others) so here's an issue!
abrown commented on issue #6293:
cc: @cfallin, @jameysharp, @elliottt
abrown commented on issue #6293:
Here's another dimension to this issue: even when the sinkable loads are used, they are used very infrequently. Why is that?
E.g., here:
line count 12895 7.98k let v38 = C::unpack_value_array_2(ctx, v37); 12896 7.98k let v58 = &C::sinkable_load(ctx, v38.0); 12897 7.98k if let Some(v59) = v58 { 12898 2 let v1045 = constructor_put_in_xmm(ctx, v38.1); 12899 2 let v1056 = &constructor_sink_load_to_xmm_mem(ctx, v59); 12900 2 let v1081 = constructor_x64_mulss(ctx, v1045, v1056); 12901 2 let v1082 = constructor_output_xmm(ctx, v1081); 12902 2 // Rule at src/isa/x64/lower.isle line 2189. 12903 2 return Some(v1082); 12904 7.97k }
The sinkable load is only used a very low percentage of times.
alexcrichton commented on issue #6293:
As for the frequency issue, I think it may be because it's just so rare to generate the precise shape of load-then-op. That's a pretty wishy-washy excuse though and I wouldn't be confident in it.
As for lack of sinkable loads in some blocks this is one example which is
f32x4.add
doesn't do anything with sinkable loads. The conditions for this are:
- AVX is enabled
- SIMD is enabled
- SIMD is used with
fadd
plus a loadEven with all that in place it looks like
f32x4.add
alone was only translated 101 times meaning that if we go from the previous 2/7980 hit rate then we basically need to drastically increase thef32x4.add
frequence to get it to hit the sinkable load path.One aspect here I think is that SIMD is conditionally enabled, whereas
f32.add
is always enabled, which may end up causing simd to be a bit rarer.
Last updated: Dec 23 2024 at 12:05 UTC