abrown edited Issue #2432:
@jlb6740, I am opening this issue to track a failure I noticed with x64 SIMD: https://github.com/bytecodealliance/wasmtime/pull/2428/checks?check_run_id=1421136203#step:7:1582
jlb6740 commented on Issue #2432:
FYI, I am trying to see if I can pinpoint the patch that caused the issue. Basically running:
cargo test -p wasmtime-cli --features experimental_x64 wast::Cranelift::spec::simd::simd_
In a loop like so:
COUNTER=0 && while [ $COUNTER -lt 1000 ] && ./run_test.sh && COUNTER=$((COUNTER+1)) && echo -e "\n\nIteration: $COUNTER\n\n" ; do :; done | tee run_test.out
where the cargo test command is in ./run_test.sh.
It seems the failure usually occurs within 100 iterations but can take as long as 200-300.
abrown commented on Issue #2432:
That's some progress!
julian-seward1 commented on Issue #2432:
@jlb6740 Can you extract from that, the actual non-shell-script program invokation that ./run_test.sh` does, that is failing, and then run that on valgrind, to see if something is reading uninitialised memory?
jlb6740 commented on Issue #2432:
I've been attempting to both (1) isolate the patch where the issue starts and (2) simplify the reproducer but both are proving to be more difficult than anticipated. Some observations ..
W.r.t pinpointing the patch that introduced the issue, I am finding it impossible to reproduce this patch using the script above before https://github.com/bytecodealliance/wasmtime/pull/2365 . However I am not thinking this patch introduces this bug since it was seen as earlier as October here: https://github.com/bytecodealliance/wasmtime/runs/1323686030
W.r.t. simplifying the reproducer I am finding the error almost always occurs with the same tests involving negation or abs .. test such as:
(assert_return (invoke "add-neg" (v128.const f64x2 1.125 1.125)
(v128.const f64x2 0.125 0.125))
(v128.const f64x2 -1.0 -1.0))or
(assert_return (invoke "f64x2.abs" (v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))
(v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))But when I try to isolate just these tests and run, the problem is not reproduced. It is hard for me to tell if the issue is with something in the lowering or something in the wast testing path.
I did run valgrind against a277cf for example and there is a conspicuous error:
==4490== Conditional jump or move depends on uninitialised value(s)
==4490== at 0x48A056: core::iter::traits::iterator::Iterator::all::check::{{closure}} (iterator.rs:2114)
==4490== by 0x484EBD: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}} (mod.rs:1413)
==4490== by 0x46FECB: core::iter::traits::iterator::Iterator::try_fold (iterator.rs:1888)
==4490== by 0x484945: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold (mod.rs:1420)
==4490== by 0x4842C0: core::iter::traits::iterator::Iterator::all (iterator.rs:2117)
==4490== by 0x4791DD: wasmtime_wast::wast::v128_matches (wast.rs:471)
==4490== by 0x478B77: wasmtime_wast::wast::val_matches (wast.rs:411)
==4490== by 0x47456B: wasmtime_wast::wast::WastContext::assert_return (wast.rs:185)
==4490== by 0x476693: wasmtime_wast::wast::WastContext::run_directive (wast.rs:272)
==4490== by 0x47530B: wasmtime_wast::wast::WastContext::run_buffer (wast.rs:229)
==4490== by 0x478201: wasmtime_wast::wast::WastContext::run_file (wast.rs:353)
==4490== by 0x333EF2: all::wast::run_wast (wast.rs:46)
==4490== Uninitialised value was created by a stack allocation
==4490== at 0x56B530: <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::get_unchecked (zip.rs:291)Which I think actually involves these lines here: https://github.com/bytecodealliance/wasmtime/blob/main/crates/wast/src/wast.rs#L507-L514 in head.
It is not obvious to me though how to correct this yet. I think this bug should be understood and remove this bug as a next step.
julian-seward1 commented on Issue #2432:
Just to be on the safe side .. is this with valgrind 3.16.0 or later? That has the lowest false-positive rate so far; I would recommend against any investigation based on output from a version before 3.16.0.
cfallin commented on Issue #2432:
So I was nerdsniped by this and chased it for a while -- with valgrind 3.16.1 (Fedora 33) I'm seeing the uninit value in the
values_vec
through which the trampoline passes the wasm func's return value (crates/wasmtime/src/func.rs) -- insert a println there and the uninit value is used right away, problem isn't in wast checks. The value's origin (valgrind--track-origins=yes
) is:==473121== Uninitialised value was created by a stack allocation ==473121== at 0xEFC9F0: wasmtime_runtime::traphandlers::setup_unix_sigaltstack::TLS::__init (local.rs:157)
So what I can gather from that is that we're reading uninitialized stack somewhere. I spent a while staring at the generated trampoline functions and am not seeing anything obviously wrong, but is it possible that maybe we're calling the wrong trampoline (too many/too few args or returns)?
Another oddity I noticed: there is no
WasmTy
impl foru128
. Most vector code seems to work fine without this, though perhaps this is A Problem anyway (@alexcrichton any thoughts on this, plus @jlb6740's note above that the uninit behavior seems to go away prior to #2365?).
julian-seward1 commented on Issue #2432:
insert a println there and the uninit value is used right away
That reduces the chance that it's a false positive. The clincher would however be if you can demonstrate that multiple runs print different values at that point.
alexcrichton commented on Issue #2432:
To confirm @jlb6740, are you testing recent master? #2365 had a use-after-free but if modules/stores were deallocated in the wrong order (later fixed though). That might explain why just after that commit valgrind finds issues.
As for
WasmTy for u128
, the reason we haven't implemented that yet is mainly that there's no native ABI for Rust to call a cranelift-generated function that takesu128
right now. Or at least no one's said 'yep this works let us add it'. Once that's possible we can addWasmTy for u128
, however.
jlb6740 commented on Issue #2432:
@julian-seward1 I have valgrind-3.15.0 installed. I will update to a 3.16 or higher. @alexcrichton the valgrind run is based on commit a277cf which was the last time I could readily reproduce. I've tried patches though since @cfallin disabled the tests and could reproduce up until then. Let me rerun the latest valgrind on the latest build with these tests enabled.
jlb6740 edited a comment on Issue #2432:
I've been attempting to both (1) isolate the patch where the issue starts and (2) simplify the reproducer but both are proving to be more difficult than anticipated. Some observations ..
W.r.t pinpointing the patch that introduced the issue, I am finding it impossible to reproduce this issue using the script above before https://github.com/bytecodealliance/wasmtime/pull/2365 . However I am not thinking this patch introduces this bug since it was seen as earlier as October here: https://github.com/bytecodealliance/wasmtime/runs/1323686030
W.r.t. simplifying the reproducer I am finding the error almost always occurs with the same tests involving negation or abs .. test such as:
(assert_return (invoke "add-neg" (v128.const f64x2 1.125 1.125)
(v128.const f64x2 0.125 0.125))
(v128.const f64x2 -1.0 -1.0))or
(assert_return (invoke "f64x2.abs" (v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))
(v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))But when I try to isolate just these tests and run, the problem is not reproduced. It is hard for me to tell if the issue is with something in the lowering or something in the wast testing path.
I did run valgrind against a277cf for example and there is a conspicuous error:
==4490== Conditional jump or move depends on uninitialised value(s)
==4490== at 0x48A056: core::iter::traits::iterator::Iterator::all::check::{{closure}} (iterator.rs:2114)
==4490== by 0x484EBD: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}} (mod.rs:1413)
==4490== by 0x46FECB: core::iter::traits::iterator::Iterator::try_fold (iterator.rs:1888)
==4490== by 0x484945: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold (mod.rs:1420)
==4490== by 0x4842C0: core::iter::traits::iterator::Iterator::all (iterator.rs:2117)
==4490== by 0x4791DD: wasmtime_wast::wast::v128_matches (wast.rs:471)
==4490== by 0x478B77: wasmtime_wast::wast::val_matches (wast.rs:411)
==4490== by 0x47456B: wasmtime_wast::wast::WastContext::assert_return (wast.rs:185)
==4490== by 0x476693: wasmtime_wast::wast::WastContext::run_directive (wast.rs:272)
==4490== by 0x47530B: wasmtime_wast::wast::WastContext::run_buffer (wast.rs:229)
==4490== by 0x478201: wasmtime_wast::wast::WastContext::run_file (wast.rs:353)
==4490== by 0x333EF2: all::wast::run_wast (wast.rs:46)
==4490== Uninitialised value was created by a stack allocation
==4490== at 0x56B530: <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::get_unchecked (zip.rs:291)Which I think actually involves these lines here: https://github.com/bytecodealliance/wasmtime/blob/main/crates/wast/src/wast.rs#L507-L514 in head.
It is not obvious to me though how to correct this yet. I think this bug should be understood and remove this bug as a next step.
jlb6740 edited a comment on Issue #2432:
I've been attempting to both (1) isolate the patch where the issue starts and (2) simplify the reproducer but both are proving to be more difficult than anticipated. Some observations ..
W.r.t pinpointing the patch that introduced the issue, I am finding it impossible to reproduce this issue using the script above before https://github.com/bytecodealliance/wasmtime/pull/2365 . However I am not thinking this patch introduces this bug since it was seen as earlier as October here: https://github.com/bytecodealliance/wasmtime/runs/1323686030
W.r.t. simplifying the reproducer I am finding the error almost always occurs with the same tests involving negation or abs .. test such as:
(assert_return (invoke "add-neg" (v128.const f64x2 1.125 1.125)
(v128.const f64x2 0.125 0.125))
(v128.const f64x2 -1.0 -1.0))or
(assert_return (invoke "f64x2.abs" (v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))
(v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))But when I try to isolate just these tests and run, the problem is not reproduced. It is hard for me to tell if the issue is with something in the lowering or something in the wast testing path.
I did run valgrind against a277cf for example and there is a conspicuous error:
==4490== Conditional jump or move depends on uninitialised value(s) ==4490== at 0x48A056: core::iter::traits::iterator::Iterator::all::check::{{closure}} (iterator.rs:2114) ==4490== by 0x484EBD: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}} (mod.rs:1413) ==4490== by 0x46FECB: core::iter::traits::iterator::Iterator::try_fold (iterator.rs:1888) ==4490== by 0x484945: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold (mod.rs:1420) ==4490== by 0x4842C0: core::iter::traits::iterator::Iterator::all (iterator.rs:2117) ==4490== by 0x4791DD: wasmtime_wast::wast::v128_matches (wast.rs:471) ==4490== by 0x478B77: wasmtime_wast::wast::val_matches (wast.rs:411) ==4490== by 0x47456B: wasmtime_wast::wast::WastContext::assert_return (wast.rs:185) ==4490== by 0x476693: wasmtime_wast::wast::WastContext::run_directive (wast.rs:272) ==4490== by 0x47530B: wasmtime_wast::wast::WastContext::run_buffer (wast.rs:229) ==4490== by 0x478201: wasmtime_wast::wast::WastContext::run_file (wast.rs:353) ==4490== by 0x333EF2: all::wast::run_wast (wast.rs:46) ==4490== Uninitialised value was created by a stack allocation ==4490== at 0x56B530: <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::get_unchecked (zip.rs:291)
Which I think actually involves these lines here: https://github.com/bytecodealliance/wasmtime/blob/main/crates/wast/src/wast.rs#L507-L514 in head.
It is not obvious to me though how to correct this yet. I think this bug should be understood and remove this bug as a next step.
jlb6740 edited a comment on Issue #2432:
I've been attempting to both (1) isolate the patch where the issue starts and (2) simplify the reproducer but both are proving to be more difficult than anticipated. Some observations ..
W.r.t pinpointing the patch that introduced the issue, I am finding it impossible to reproduce this issue using the script above before https://github.com/bytecodealliance/wasmtime/pull/2365 . However I am not thinking this patch introduces this bug since it was seen as earlier as October here: https://github.com/bytecodealliance/wasmtime/runs/1323686030
W.r.t. simplifying the reproducer I am finding the error almost always occurs with the same tests involving negation or abs .. test such as:
(assert_return (invoke "add-neg" (v128.const f64x2 1.125 1.125)
(v128.const f64x2 0.125 0.125))
(v128.const f64x2 -1.0 -1.0))or
(assert_return (invoke "f64x2.abs" (v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))
(v128.const f64x2 0x1.fffffffffffffp+1023 0x1.fffffffffffffp+1023))But when I try to isolate just these tests and run, the problem is not reproduced. It is hard for me to tell if the issue is with something in the lowering or something in the wast testing path.
I did run valgrind against a277cf for example and there is a conspicuous error:
==4490== Conditional jump or move depends on uninitialised value(s) ==4490== at 0x48A056: core::iter::traits::iterator::Iterator::all::check::{{closure}} (iterator.rs:2114) ==4490== by 0x484EBD: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}} (mod.rs:1413) ==4490== by 0x46FECB: core::iter::traits::iterator::Iterator::try_fold (iterator.rs:1888) ==4490== by 0x484945: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold (mod.rs:1420) ==4490== by 0x4842C0: core::iter::traits::iterator::Iterator::all (iterator.rs:2117) ==4490== by 0x4791DD: wasmtime_wast::wast::v128_matches (wast.rs:471) ==4490== by 0x478B77: wasmtime_wast::wast::val_matches (wast.rs:411) ==4490== by 0x47456B: wasmtime_wast::wast::WastContext::assert_return (wast.rs:185) ==4490== by 0x476693: wasmtime_wast::wast::WastContext::run_directive (wast.rs:272) ==4490== by 0x47530B: wasmtime_wast::wast::WastContext::run_buffer (wast.rs:229) ==4490== by 0x478201: wasmtime_wast::wast::WastContext::run_file (wast.rs:353) ==4490== by 0x333EF2: all::wast::run_wast (wast.rs:46) ==4490== Uninitialised value was created by a stack allocation ==4490== at 0x56B530: <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::get_unchecked (zip.rs:291)
Which I think actually involves these lines here: https://github.com/bytecodealliance/wasmtime/blob/main/crates/wast/src/wast.rs#L507-L514 in head.
It is not obvious to me though how to correct this yet. I think this bug should be understood and remove as a next step.
alexcrichton commented on Issue #2432:
Ah ok in that case I'm not entirely sure what would be causing this unfortunately :(
jlb6740 commented on Issue #2432:
@alexcrichton yes, just needs more investigation. I've confirmed that use of unitialized values still occurs with the latest build with valgrind 3.16.1
==22915== Conditional jump or move depends on uninitialised value(s) ==22915== at 0x50D676: core::iter::traits::iterator::Iterator::all::check::{{closure}} (iterator.rs:2114) ==22915== by 0x52297D: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}} (mod.rs:1413) ==22915== by 0x50B1EB: core::iter::traits::iterator::Iterator::try_fold (iterator.rs:1888) ==22915== by 0x522405: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold (mod.rs:1420) ==22915== by 0x51E660: core::iter::traits::iterator::Iterator::all (iterator.rs:2117) ==22915== by 0x4FEAB7: wasmtime_wast::wast::v128_matches (wast.rs:507) ==22915== by 0x4FE497: wasmtime_wast::wast::val_matches (wast.rs:451) ==22915== by 0x4FC757: wasmtime_wast::wast::WastContext::assert_return (wast.rs:186) ==22915== by 0x518A7F: wasmtime_wast::wast::WastContext::run_directive (wast.rs:287) ==22915== by 0x4FD713: wasmtime_wast::wast::WastContext::run_buffer (wast.rs:236) ==22915== by 0x4FDB01: wasmtime_wast::wast::WastContext::run_file (wast.rs:368) ==22915== by 0x404274: all::wast::run_wast (wast.rs:48) ==22915== Uninitialised value was created by a stack allocation ==22915== at 0x8244E0: <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::get_unchecked (zip.rs:291) ==22915==
jlb6740 edited a comment on Issue #2432:
@alexcrichton yes, just needs more investigation. I've confirmed that use of unitialized values still occurs with the latest build (e09b9400) with valgrind 3.16.1
==22915== Conditional jump or move depends on uninitialised value(s) ==22915== at 0x50D676: core::iter::traits::iterator::Iterator::all::check::{{closure}} (iterator.rs:2114) ==22915== by 0x52297D: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}} (mod.rs:1413) ==22915== by 0x50B1EB: core::iter::traits::iterator::Iterator::try_fold (iterator.rs:1888) ==22915== by 0x522405: <core::iter::adapters::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold (mod.rs:1420) ==22915== by 0x51E660: core::iter::traits::iterator::Iterator::all (iterator.rs:2117) ==22915== by 0x4FEAB7: wasmtime_wast::wast::v128_matches (wast.rs:507) ==22915== by 0x4FE497: wasmtime_wast::wast::val_matches (wast.rs:451) ==22915== by 0x4FC757: wasmtime_wast::wast::WastContext::assert_return (wast.rs:186) ==22915== by 0x518A7F: wasmtime_wast::wast::WastContext::run_directive (wast.rs:287) ==22915== by 0x4FD713: wasmtime_wast::wast::WastContext::run_buffer (wast.rs:236) ==22915== by 0x4FDB01: wasmtime_wast::wast::WastContext::run_file (wast.rs:368) ==22915== by 0x404274: all::wast::run_wast (wast.rs:48) ==22915== Uninitialised value was created by a stack allocation ==22915== at 0x8244E0: <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::get_unchecked (zip.rs:291) ==22915==
jlb6740 commented on Issue #2432:
@abrown @cfallin Perhaps we can close this now? Please re-open if someone disagrees.
jlb6740 closed Issue #2432:
@jlb6740, I am opening this issue to track a failure I noticed with x64 SIMD: https://github.com/bytecodealliance/wasmtime/pull/2428/checks?check_run_id=1421136203#step:7:1582
Last updated: Nov 22 2024 at 17:03 UTC