Hi,
I have C++ code, run with wasmtime, that parses a JSON string. My problem is that the code runs well natively on both small and large JSON inputs, but when compiled to wasm it fails on large strings only. No error is printed. I suspect a memory limitation, but I don't know how to change it. Could you help me please?
Is it possible to share a wasm module that reproduces this behavior? For example, are you running the wasmtime CLI? Are you using the Rust embedding API? Is the guest wasm module in C++? (Details like these can help folks dig in, but a reproduction would be most useful.)
I'm running the .wasm file with the Rust API, but I tried the wasmtime CLI first. Both work on small JSON but fail on large inputs. It's due neither to my C++ code nor to the big JSON file, because when I build and run the C++ code natively, everything is fine. The problem appears when I switch to wasm.
I will ask if it is possible to share a wasm module.
Here is the example:
json-parsing.zip
You can run the C++ code on the big JSON file via "make cpp".
I tried two different JSON libraries in C++ (nlohmann and simdjson). You can test them with:
It is indeed a memory limitation. I just wrote C++ code with a vector growing over iterations and the same thing occurred. What are wasm's memory limits? How can I increase them?
Can you share the actual wasm module you're running? I don't have emscripten installed myself, so I compiled the example with wasi-sdk after some modifications, and it works fine for the small/big examples you provided. One thing you can try is passing --trap-on-grow-failure to the CLI, which will cause a trap to happen on OOM rather than returning -1. That can turn a failure that may be ignored into a loud failure.
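To illustrate the "failure that may be ignored" case: when a memory grow request returns -1, an allocator such as malloc typically reports that to the caller as a null pointer, and it is up to the guest code to check it. A minimal sketch, assuming a hypothetical helper `checked_malloc` that is not part of the example project:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper (not from the original example): allocate `size` bytes
// and report a failure on stderr instead of letting it pass silently.
void* checked_malloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p == nullptr && size != 0) {
        // Make the out-of-memory condition visible to whoever runs the module.
        std::fprintf(stderr, "allocation of %zu bytes failed\n", size);
    }
    return p;
}
```

The caller still has to handle the null pointer; the only point of the sketch is that the failure becomes visible in the output.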
--trap-on-grow-failure does not change anything. I will try with wasi-sdk! What modification did you make?
I changed a few things like removing usage of <filesystem> in json.hpp and removing some try/catch, and otherwise just fixing a few compile errors, mostly differences in runtimes I think between emscripten/wasi-sdk
Hm, running wasmtime run ./test.wasm < input doesn't seem to do anything with any input with the module you gave me. Could you perhaps open an issue on Wasmtime with detailed steps/reproduction/outputs/etc. to help further debug this?
I tried to use wasi-sdk but it fails to compile.
export WASI_SDK_PATH=/home/celine/Téléchargements/wasi-sdk-20.0
${WASI_SDK_PATH}/bin/clang++ --sysroot=wasi-sdk-20.0/share/wasi-sysroot -o test.wasm -Wl,--export-all -Wl,--no-entry -Iinclude/ -isystem /usr/include/ -Wl,--export=test_nlohmann src/main.cpp -v
I get the error "fatal error: 'iostream' file not found". I don't understand why, because I include the system headers too.
iostream is a C++ header; it's in $sysroot/include/c++/v1
You need to specify the function name to invoke
wasmtime test.wasm --invoke test_simdjson < input_small.json
ok I can reproduce the failure, I see that something is calling an exit explicitly with a 1 argument, so something is explicitly exiting
I didn't pass --sysroot to wasi-sdk myself, I let it find it itself
can you build the original wasm file with debug information with emscripten?
modifying Wasmtime I can get:
Caused by:
    0: failed to invoke `test_simdjson`
    1: error while executing at wasm backtrace:
        0: 0x5acee - <unknown>!<wasm function 2012>
        1: 0x5acbb - <unknown>!<wasm function 2005>
        2: 0x7af39 - <unknown>!<wasm function 3327>
        3: 0x7af4d - <unknown>!<wasm function 3329>
        4: 0x7af55 - <unknown>!<wasm function 3330>
        5: 0x1591e - <unknown>!<wasm function 295>
        6: 0x152d9 - <unknown>!<wasm function 284>
        7: 0x14f5b - <unknown>!<wasm function 283>
        8: 0x14e98 - <unknown>!<wasm function 281>
        9: 0x11722 - <unknown>!<wasm function 166>
        10: 0x11213 - <unknown>!<wasm function 154>
        11: 0x10f02 - <unknown>!<wasm function 151>
    2: Exited with i32 exit status 1
but that's not too helpful without debug information
test.wasm Here is the wasm compiled with the -g option.
I recompiled the example with wasi-sdk, keeping only the simdjson library (removing the nlohmann one). Same problem: when I run it with wasmtime it works on the small JSON file but fails on the big file.
ah yeah this looks like OOM?
stdin length: 3544203
Try to parse3544203
Error: failed to run main module `/Users/alex/Downloads/test(1).wasm`
Caused by:
    0: failed to invoke `test_simdjson`
    1: error while executing at wasm backtrace:
        0: 0x6dfb2 - _Exit
            at /build/emscripten-buVz5q/emscripten-3.1.5~dfsg/system/lib/libc/musl/src/exit/_Exit.c:7:2
        1: 0x6df7f - abort
            at /build/emscripten-buVz5q/emscripten-3.1.5~dfsg/system/lib/standalone/standalone.c:33:3
        2: 0x8e789 - operator new(unsigned long)
            at /build/emscripten-buVz5q/emscripten-3.1.5~dfsg/system/lib/libcxx/src/new.cpp:84:13
        3: 0x8e79d - operator new[](unsigned long)
            at /build/emscripten-buVz5q/emscripten-3.1.5~dfsg/system/lib/libcxx/src/new.cpp:116:12
        4: 0x8e7a5 - operator new[](unsigned long, std::nothrow_t const&)
            at /build/emscripten-buVz5q/emscripten-3.1.5~dfsg/system/lib/libcxx/src/new.cpp:128:13
        5: 0x1a134 - simdjson::dom::document::allocate(unsigned long)
            at /home/celine/Téléchargements/json-parsing/include/simdjson.h:8430:21
        6: 0x19988 - simdjson::dom::parser::ensure_capacity(simdjson::dom::document&, unsigned long)
            at /home/celine/Téléchargements/json-parsing/include/simdjson.h:7627:87
        7: 0x19570 - simdjson::dom::parser::parse_into_document(simdjson::dom::document&, unsigned char const*, unsigned long, bool) &
            at /home/celine/Téléchargements/json-parsing/include/simdjson.h:7514:23
        8: 0x1941f - simdjson::dom::parser::parse(unsigned char const*, unsigned long, bool) &
            at /home/celine/Téléchargements/json-parsing/include/simdjson.h:7546:10
        9: 0x150df - simdjson::dom::parser::parse(char const*, unsigned long, bool) &
            at /home/celine/Téléchargements/json-parsing/include/simdjson.h:7550:10
        10: 0x14ad4 - simdjson::dom::parser::parse(std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > const&) &
            at /home/celine/Téléchargements/json-parsing/include/simdjson.h:7553:10
        11: 0x1476f - test_simdjson
            at /home/celine/Téléchargements/json-parsing/src/main.cpp:46:45
    2: Exited with i32 exit status 1
I think that is a memory limitation from wasm. I added this C++ function:
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

void stress() {
    // Read one line (the JSON document) from stdin.
    std::string jsonString;
    std::getline(std::cin, jsonString);
    // Create a vector that grows by one copy of the input per iteration.
    std::vector<std::string> dynamicVector;
    for (size_t i = 0; i < 100; i++) {
        dynamicVector.push_back(jsonString);
        std::cout << "i " << i << " - length: " << jsonString.length() * (i + 1) << std::endl;
    }
}
The wasm module exits with error 1 after a few iterations, while the native C++ program continues.
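To see how close a loop like this gets to the limit, one option is to print the current linear memory size on each iteration. A minimal sketch, assuming a clang-based toolchain (emscripten or wasi-sdk); the helper names are hypothetical, and the builtin is only available when targeting wasm, so the native build falls back to 0:

```cpp
#include <cstddef>

// Size of one WebAssembly linear-memory page in bytes.
constexpr std::size_t kWasmPageSize = 65536;

// Hypothetical helper: current size of linear memory 0, in pages.
// The clang builtin only exists for wasm targets; native builds return 0.
std::size_t linear_memory_pages() {
#if defined(__wasm__)
    return __builtin_wasm_memory_size(0);
#else
    return 0;  // no wasm linear memory in a native build
#endif
}

// Same value expressed in bytes.
std::size_t linear_memory_bytes() {
    return linear_memory_pages() * kWasmPageSize;
}
```

Printing `linear_memory_pages()` next to the iteration counter would show whether the memory stops growing just before the exit.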
Alex Crichton said:
ah yeah this looks like OOM?
Yes! But I don't know how to allow more memory as explained above :)
the memory emscripten is generating is 256 pages large initially and additionally cannot grow beyond the 256 page limit, so my guess is that you're exceeding 256 wasm pages here and then it's hitting OOM
you'll need to see how to remove emscripten's upper bound on the memory, and I don't know how to do that
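For scale, a wasm page is 64 KiB, so a 256-page cap is 16 MiB. A rough sketch of the arithmetic, using the 3544203-byte input from the log above; the helper names are hypothetical, and the estimate ignores allocator overhead, the stack, and static data, so the real headroom is smaller:

```cpp
#include <cstddef>

// Size of one WebAssembly linear-memory page in bytes.
constexpr std::size_t kWasmPageSize = 65536;

// Convert a page count to bytes.
constexpr std::size_t pages_to_bytes(std::size_t pages) {
    return pages * kWasmPageSize;
}

// Upper bound on how many whole `len`-byte buffers fit in `pages` pages.
// Ignores everything else that lives in linear memory.
constexpr std::size_t copies_that_fit(std::size_t pages, std::size_t len) {
    return len == 0 ? 0 : pages_to_bytes(pages) / len;
}

static_assert(pages_to_bytes(256) == 16777216, "256 pages is 16 MiB");
```

With these numbers, `copies_that_fit(256, 3544203)` is 4, which is in line with a growing vector of copies failing after only a few iterations.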
(I also don't know why --trap-on-grow-failure didn't work)
Are memory settings fixed at the compile step? Does it depend on the compiler?
yes
Ok and do you know how to do that with wasi-sdk?
wasm-ld has a --max-memory option, so maybe something like clang -Wl,--max-memory=1073741824 ... ?
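The value 1073741824 is 1 GiB, i.e. 16384 pages of 64 KiB. As far as I know, wasm-ld expects the maximum to be a multiple of the page size, so a small helper can round a desired size up and format the argument. This is only a sketch; `round_up_to_page` and `max_memory_flag` are hypothetical names, not part of any toolchain:

```cpp
#include <cstddef>
#include <string>

// Size of one WebAssembly linear-memory page in bytes.
constexpr std::size_t kWasmPageSize = 65536;

// Round a byte count up to the next multiple of the wasm page size.
constexpr std::size_t round_up_to_page(std::size_t bytes) {
    return (bytes + kWasmPageSize - 1) / kWasmPageSize * kWasmPageSize;
}

// Hypothetical helper: build the clang argument that forwards
// --max-memory to wasm-ld for the requested size.
std::string max_memory_flag(std::size_t bytes) {
    return "-Wl,--max-memory=" + std::to_string(round_up_to_page(bytes));
}
```

For Emscripten builds the analogous knobs are, to my knowledge, settings such as ALLOW_MEMORY_GROWTH, INITIAL_MEMORY and MAXIMUM_MEMORY; that is worth verifying against the Emscripten documentation for the version in use.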
Last updated: Dec 23 2024 at 12:05 UTC