ettec opened issue #10084:
Hi wasmtime,
In our project we are seeing the time taken to deserialize modules go from single digit milliseconds upto sometimes around a minute, I've created a simple test project to demonstrate the behaviour we are seeing in our codebase and wondered if you could let me know if the behaviour seen is expected or its a bug, here's the test project:
https://github.com/ettec/wasm-load-tests-wasmtime
After cloning the above, run :
/generateWorkflowTestsFiles.sh 100
in the root directory.To see the behaviour, you can run the test
Test_LoadAllWorkflows_DeserializeFromFile_ParallelWithLock
, you will notice that the deserialize time (printed in the console) starts to slow sign significantly towards the end of the test even though it has loaded less modules than is loaded by theTest_LoadAllWorkflows_DeserializeFromFile_SingleThreaded
test for example.I am at a bit of a loss to explain why loading modules on a single thread versus loading on multiple threads (with a lock around the DeserializeFromFile call so it is only accessed in a single threaded fashion, not necessary afaik, but this was added 'just in case') should result in such different deserialization times and why the deserialisation time degrades so noticeably. Any insight much appreciated.
ettec commented on issue #10084:
....possibly related, just observed a single threaded case that exhibits the same behaviour:
Update the following test like so and I'm seeing it slowdown:
`func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) {
//deserializeSingleThreaded(t, 100)
//deserializeSingleThreaded(t, 100)deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20)
}`
Switched back to the following and it works without any issue despite the number of workflows loaded being the same in both cases (200):
`func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) {
deserializeSingleThreaded(t, 100)
deserializeSingleThreaded(t, 100)/* deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20)*/
}`
ettec edited a comment on issue #10084:
....possibly related, just observed a single threaded case that exhibits the same behaviour:
Update the following test like so and I'm seeing it slowdown:
`
func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) {
//deserializeSingleThreaded(t, 100)
//deserializeSingleThreaded(t, 100)deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20)
}
`Switched back to the following and it works without any issue despite the number of workflows loaded being the same in both cases (200):
`
func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) {
deserializeSingleThreaded(t, 100)
deserializeSingleThreaded(t, 100)/* deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20)*/
}
`
alexcrichton commented on issue #10084:
Thanks for the report! Before digging too much into this could you test out modifying Wasmtime's config here to call
cfg.SetNativeUnwindInfo(false)
? That's a gut hunch I'd have where the native host can exhibit quadratic behavior when loading/unloading modules, but if you're not integrating with debuggers or anything like that then native unwind info is unnecessary.
ettec edited a comment on issue #10084:
....possibly related, just observed a single threaded case that exhibits the same behaviour:
Update the following test like so and I'm seeing it slowdown:
func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) { //deserializeSingleThreaded(t, 100) //deserializeSingleThreaded(t, 100) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) }
Switched back to the following and it works without any issue despite the number of workflows loaded being the same in both cases (200):
func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) { deserializeSingleThreaded(t, 100) deserializeSingleThreaded(t, 100) /* deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20)*/ }
ettec edited a comment on issue #10084:
....possibly related, just observed a single threaded case that exhibits the same behaviour:
Update the following test like so and I'm seeing it slowdown:
func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) { //deserializeSingleThreaded(t, 100) //deserializeSingleThreaded(t, 100) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) }
Switched back to the following and it works without any issue despite the number of modules loaded being the same in both cases (200):
func Test_LoadAllWorkflows_DeserializeFromFile_SingleThreaded(t *testing.T) { deserializeSingleThreaded(t, 100) deserializeSingleThreaded(t, 100) /* deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20) deserializeSingleThreaded(t, 20)*/ }
ettec closed issue #10084:
Hi wasmtime,
In our project we are seeing the time taken to deserialize modules go from single digit milliseconds upto sometimes around a minute, I've created a simple test project to demonstrate the behaviour we are seeing in our codebase and wondered if you could let me know if the behaviour seen is expected or its a bug, here's the test project:
https://github.com/ettec/wasm-load-tests-wasmtime
After cloning the above, run :
/generateWorkflowTestsFiles.sh 100
in the root directory.To see the behaviour, you can run the test
Test_LoadAllWorkflows_DeserializeFromFile_ParallelWithLock
, you will notice that the deserialize time (printed in the console) starts to slow sign significantly towards the end of the test even though it has loaded less modules than is loaded by theTest_LoadAllWorkflows_DeserializeFromFile_SingleThreaded
test for example.I am at a bit of a loss to explain why loading modules on a single thread versus loading on multiple threads (with a lock around the DeserializeFromFile call so it is only accessed in a single threaded fashion, not necessary afaik, but this was added 'just in case') should result in such different deserialization times and why the deserialisation time degrades so noticeably. Any insight much appreciated.
ettec commented on issue #10084:
Hi Alex, thanks for the quick response. That does indeed fix the problem, big relief as getting this working is critical for us, thanks again for your quick help.
Last updated: Jan 24 2025 at 00:11 UTC