songtianlei opened issue #10852:
.clif Test Case

    test optimize
    set opt_level=none
    set preserve_frame_pointers=true
    set enable_multi_ret_implicit_sret=true
    target s390x

    function %main() -> i64, i64, i16x8 fast {
        ss0 = explicit_slot 32

    block0:
        v1 = iconst.i64 0x0011_0022_0033_0044
        stack_store v1, ss0
        stack_store v1, ss0+8
        stack_store v1, ss0+16
        stack_store v1, ss0+24
        v80 = stack_addr.i64 ss0
        v81 = load.i64 big v80
        v82 = sload8x8 big v80
        return v80, v81, v82
    }

    ; print: %main()

Description

When I specify the load and sload8x8 instructions as big-endian

    v81 = load.i64 big v80
    v82 = sload8x8 big v80

the execution results are as follows.

    %main() -> [274928428480, 4785220636311620, 0x00440000003300000022000000110000]

Then I changed them to little-endian, and the results are as follows.

    v81 = load.i64 little v80
    v82 = sload8x8 little v80

    %main() -> [274928428480, 4899972470242545920, 0x00440000003300000022000000110000]

The value of v81 changed, which is expected. But the value of v82 stayed the same, indicating that little/big-endian has no effect on sload8x8.

Environment

qemu-s390x to emulate execution on an x86 machine.
songtianlei added the bug label to Issue #10852.
songtianlei added the cranelift label to Issue #10852.
alexcrichton added the cranelift:area:s390x label to Issue #10852.
alexcrichton commented on issue #10852:
cc @uweigand
uweigand commented on issue #10852:
There should be no effect - an array (or vector) of 8 single-byte elements has the same in-memory representation on both big-endian and little-endian systems.
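The point about single-byte elements can be checked with a small Python model of the reported values. This is only a sketch, not Cranelift's implementation: the helper names `sload8x8` and `format_i16x8` are hypothetical, and the printed vector format (lane 0 in the lowest 16 bits) is an assumption inferred from the output quoted in the issue.

```python
# Bytes left in the s390x stack slot after a native (big-endian)
# stack_store of 0x0011_0022_0033_0044.
memory = (0x0011_0022_0033_0044).to_bytes(8, "big")

# load.i64 reads all 8 bytes as one integer, so the byte order matters.
load_big = int.from_bytes(memory, "big")
load_little = int.from_bytes(memory, "little")


def sload8x8(data):
    """Sign-extend each of the first 8 bytes to a 16-bit lane, in memory order.

    Hypothetical model: each element is a single byte, so there are no
    bytes within an element that an endianness flag could reorder.
    """
    return [b - 256 if b >= 128 else b for b in data[:8]]


def format_i16x8(lanes):
    """Render lanes as one 128-bit hex value (assumed: lane 0 in the low bits)."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & 0xFFFF) << (16 * i)
    return f"0x{value:032x}"


print(load_big)                        # 4785220636311620
print(load_little)                     # 4899972470242545920
print(format_i16x8(sload8x8(memory)))  # 0x00440000003300000022000000110000
```

The two load.i64 values match the big and little results reported above, while the vector value has only one possible outcome because the model never consults a byte order when widening single bytes.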
alexcrichton closed issue #10852.
alexcrichton commented on issue #10852:
Oops yes indeed! I'll close this as "working as intended", but @songtianlei if you have follow-up questions please feel free to post them here too.
songtianlei commented on issue #10852:
@alexcrichton @uweigand
I also noticed that sload behaves differently on x86 and s390x when using little or big endian.

.clif Test Case

    test optimize
    set opt_level=none
    set preserve_frame_pointers=true
    set enable_multi_ret_implicit_sret=true

    function %main() -> i64 fast {
        ss0 = explicit_slot 32

    block0:
        v1 = iconst.i64 0x0011_0022_0033_0044
        stack_store v1, ss0
        stack_store v1, ss0+8
        stack_store v1, ss0+16
        stack_store v1, ss0+24
        v80 = stack_addr.i64 ss0
        v81 = sload32 little v80
        return v81
    }

    ; print: %main()

The sload instruction is set to little-endian and run on both x86 and s390x.

    v81 = sload32 little v80

The result:

    [x86 ]%main() -> 3342404
    [s390x]%main() -> 570429696

Then I changed it to big-endian.

    v81 = sload32 big v80

    [x86 ]%main() -> 3342404
    [s390x]%main() -> 1114146

Why does the result change on s390x but stay the same on x86?
bjorn3 commented on issue #10852:
stack_store instructions always use the native endianness, so s390x and x86_64 have different data on the stack in your example. Using a fixed endianness for the load therefore still results in different values.
songtianlei commented on issue #10852:
I know the stack data is different. I'm not asking why s390x and x86 give different results. My point is: when I switch between little and big endian, sload gives different results on s390x, but on x86, both cases produce the same result.
bjorn3 commented on issue #10852:
Right. I think the reason is that the s390x backend is the only one that supports endianness flags. It is the only architecture Cranelift supports where the wasm and native endianness are different. For all other architectures both Wasmtime and rustc_codegen_cranelift agree on which endianness to use, so probably nobody bothered to actually add big-endian support.
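The numbers reported for sload32 are consistent with this explanation, as the following Python sketch suggests. It is a model under stated assumptions rather than a description of either backend: `sload32`, `run`, and the `honors_flag` parameter are hypothetical names, and the x86 behavior (ignoring the flag and using native order) is taken from the tentative explanation above, not confirmed from the backend source.

```python
VALUE = 0x0011_0022_0033_0044


def sload32(memory, flag, native, honors_flag):
    """Load 4 bytes and sign-extend them to a 64-bit result.

    honors_flag is a hypothetical switch: True models a backend that
    respects the little/big flag, False models one that always uses
    the native byte order.
    """
    order = flag if honors_flag else native
    return int.from_bytes(memory[:4], order, signed=True)


def run(native, honors_flag):
    # stack_store always writes in the target's native byte order.
    memory = VALUE.to_bytes(8, native)
    return {
        flag: sload32(memory, flag, native, honors_flag)
        for flag in ("little", "big")
    }


s390x = run(native="big", honors_flag=True)
x86 = run(native="little", honors_flag=False)

print(s390x)  # {'little': 570429696, 'big': 1114146}
print(x86)    # {'little': 3342404, 'big': 3342404}
```

Under these assumptions the model reproduces all four values from the report: the s390x result changes with the flag, and the x86 result does not.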
songtianlei commented on issue #10852:
@bjorn3 Thank you very much for your detailed answer.
bjorn3 commented on issue #10852:
Opened https://github.com/bytecodealliance/wasmtime/issues/10861 to track either implementing or emitting an error for the big-endian flag on little-endian targets.
Last updated: Dec 06 2025 at 06:05 UTC