Stream: general

Topic: wasmtime-go


view this post on Zulip Alex Vidal (May 07 2020 at 18:09):

Anyone here using wasmtime-go?

view this post on Zulip Alex Vidal (May 09 2020 at 03:34):

Getting an interesting issue where, having done nothing but create a store, module, and wasi instance and then starting an http server, I reliably get a segfault in 2 minutes. I'm guessing the 2 minutes is a GC pause. If I actually use an instance of the module every so often I don't get the segfault.

view this post on Zulip Alex Vidal (May 09 2020 at 03:39):

Ok yeah, so the runtime forces a GC cycle every 2 minutes.

view this post on Zulip Alex Crichton (May 09 2020 at 03:52):

@Alex Vidal oh dear, do you have code I could peek at? I see CI for the repo still segfaults on macOS, but I unfortunately haven't been able to track down as to why

view this post on Zulip Alex Crichton (May 09 2020 at 03:52):

Also, have you tried the master branch of the git repo?

view this post on Zulip Alex Vidal (May 09 2020 at 03:53):

I haven't. I'm running 0.16.0 atm. I'll see if I can get a minimal reproduction.

view this post on Zulip Alex Vidal (May 09 2020 at 03:54):

Interesting...v0.16.0 isn't on the master timeline at all?

view this post on Zulip Alex Vidal (May 09 2020 at 03:55):

Oh, but it's ahead of master anyway

view this post on Zulip Alex Crichton (May 09 2020 at 03:57):

ah yeah it's a separate commit b/c it has binaries

view this post on Zulip Alex Vidal (May 09 2020 at 03:58):

roger. let me see if i can reproduce without all my bullshit

view this post on Zulip Alex Vidal (May 09 2020 at 04:04):

really annoying to iterate on this when it takes 2 minutes each time you want to reduce your test case

view this post on Zulip Alex Vidal (May 09 2020 at 04:06):

ah yeah there it goes

view this post on Zulip Alex Vidal (May 09 2020 at 04:08):

about to upload a gist, going to see if i can repro with an inline wat

view this post on Zulip Alex Vidal (May 09 2020 at 04:15):

@Alex Crichton https://gist.github.com/avidal/4710a963fbc8a80feae84663473ea971

GitHub Gist: instantly share code, notes, and snippets.

view this post on Zulip Alex Vidal (May 09 2020 at 04:16):

Running another test without the wasi instance to see if it makes a difference. Wondering if a runtime.KeepAlive is missing in one of the wasi funcs?

view this post on Zulip Alex Vidal (May 09 2020 at 04:18):

Remove the wasi bits and it doesn't crash

view this post on Zulip Alex Crichton (May 09 2020 at 04:19):

oh I think I see what's wrong

view this post on Zulip Alex Vidal (May 09 2020 at 04:19):

ok, i can upload a segfault if you want it

view this post on Zulip Alex Crichton (May 09 2020 at 04:19):

nah that's ok

view this post on Zulip Alex Crichton (May 09 2020 at 04:20):

this is an ownership issue I forgot about

view this post on Zulip Alex Crichton (May 09 2020 at 04:20):

and didn't see it b/c I didn't write wasi tests :(

view this post on Zulip Alex Vidal (May 09 2020 at 04:20):

I noticed :)

view this post on Zulip Alex Vidal (May 09 2020 at 04:20):

But it follows the rust api very well so it wasn't difficult to figure out

view this post on Zulip Alex Vidal (May 09 2020 at 04:23):

Is it the wasi instance that needs a keep alive? I see that if you link the wasi instance in a linker, the linker calls keepalive on the wasi instance, but constructing the instance via NewWasiInstance doesn't

view this post on Zulip Alex Vidal (May 09 2020 at 04:24):

(I know very little about cgo or generally about ffi stuff so this part is a bit over my head)

view this post on Zulip Alex Crichton (May 09 2020 at 04:24):

ok should be fixed in https://github.com/bytecodealliance/wasmtime-go/commit/e23af2e4825f3d9c67bddc329e16c954f75620e9

The `wasi_instance_new` API takes ownership of the config passed into it, so no need to manually free it later!

view this post on Zulip Alex Vidal (May 09 2020 at 04:24):

Very nice, will test it out right now

view this post on Zulip Alex Vidal (May 09 2020 at 04:26):

Oh, I'll wait for the tag and then get it instead of futzing with the compile step

view this post on Zulip Alex Vidal (May 10 2020 at 01:54):

@Alex Crichton when you get a chance could you push a v0.17.0 tag?

view this post on Zulip Alex Crichton (May 11 2020 at 14:17):

@Alex Vidal I've published 0.16.1 now!

view this post on Zulip Alex Vidal (May 11 2020 at 14:53):

sweet, thanks!

view this post on Zulip Alex Vidal (May 12 2020 at 04:02):

Left my program running for 5ish hours and no segfaults :) Thanks again Alex

view this post on Zulip Alex Crichton (May 12 2020 at 04:32):

Nice, thanks again for the report!

view this post on Zulip Alex Vidal (May 13 2020 at 15:39):

Submitted a PR to fix a segfault in wasi...but I also realized I need to figure out how to make this work reliably on macOS since I'll be using it at work. I personally don't have a macbook, but my coworkers do..

view this post on Zulip Alex Crichton (May 13 2020 at 16:40):

@Alex Vidal yeah I've been trying to investigate it on and off as to what's going wrong, but I can't figure anything out yet :(

view this post on Zulip Alex Crichton (May 13 2020 at 16:40):

I can reliably reproduce the issue on macos and can't reproduce at all on linux

view this post on Zulip Alex Crichton (May 13 2020 at 16:40):

just have no idea where to start with debugging

view this post on Zulip Alex Vidal (May 13 2020 at 16:41):

does it reproduce in a docker container with a macos host?

view this post on Zulip Alex Vidal (May 13 2020 at 16:42):

otherwise yeah i guess you have a non-zero chance of hitting it on every gc cycle which is a minimum of 2 minutes

view this post on Zulip Alex Crichton (May 13 2020 at 16:43):

you can probably reproduce faster with -tags debug which forces a gc trigger way more often

view this post on Zulip Alex Crichton (May 13 2020 at 16:44):

(that's how I'm reproducing in tests)

view this post on Zulip Alex Crichton (May 13 2020 at 16:44):

and I'm not sure if docker would work b/c I don't think you virtualize macos, right?

view this post on Zulip Alex Vidal (May 13 2020 at 16:57):

yep, i ended up changing the actions file on my fork to run three passes with debug enabled to get it to repro more often

view this post on Zulip Alex Vidal (May 13 2020 at 16:57):

also, i'm not positive how docker for mac works. i know docker runs in a lightweight vm but past that...:shrug:

view this post on Zulip Alex Vidal (May 13 2020 at 18:13):

considering renting a mac from macincloud.com to try this out

view this post on Zulip Alex Vidal (May 13 2020 at 18:33):

I adjusted the workflow to only run mac, skip the download steps (inlined the dependencies), and run the debug test build a bunch of times to see if i can find a pattern in the failures...but nothing so far. https://github.com/avidal/wasmtime-go/pull/1

GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.

view this post on Zulip bjorn3 (May 13 2020 at 18:39):

Can you run it in lldb? lldb -o run -o bt --batch -- executable arg1 arg2 should run it and show a backtrace when it crashes.

view this post on Zulip Alex Vidal (May 13 2020 at 18:39):

I would but no mac on my end. Not sure what Alex has tried

view this post on Zulip Alex Vidal (May 13 2020 at 18:39):

oh maybe i can do that in the action though

view this post on Zulip bjorn3 (May 13 2020 at 18:40):

yes, that was what I meant.

view this post on Zulip Alex Crichton (May 13 2020 at 18:47):

FWIW I'm working to minimize this, so far I've got this program which will segfault:

package main

// #cgo CFLAGS:-I${SRCDIR}/build/include
// #cgo LDFLAGS:-lwasmtime -lm -ldl -L${SRCDIR}/build/macos-x86_64
// #include <shims.h>
import "C"
import "runtime"

func main() {
        engine := C.wasm_engine_new()
        store := C.wasm_store_new(engine)
        var tys C.wasm_valtype_vec_t
        ty := C.wasm_functype_new(&tys, &tys)
        f := C.c_func_new_with_env(store, ty, 0, 0)
        C.wasmtime_func_call(f, nil, 0, nil, 0, nil)
        runtime.GC()
}

view this post on Zulip Alex Vidal (May 13 2020 at 18:48):

oh very nice

view this post on Zulip Alex Vidal (May 13 2020 at 19:30):

fwiw, that minimal example segfaults for me on C.wasmtime_func_call

view this post on Zulip Alex Vidal (May 13 2020 at 19:30):

(on linux though)

view this post on Zulip Alex Vidal (May 13 2020 at 19:30):

well kind of, i didn't make it exactly the same so i should just shut up

view this post on Zulip Alex Crichton (May 13 2020 at 19:35):

@Alex Vidal I opened up https://github.com/bytecodealliance/wasmtime-go/issues/10 as a dedicated issue for this

CI has been segfaulting for some time on macOS and I've recently tried to reduce this. I can pretty reliably reproduce this locally as well. So far I've managed to shrink this to: package m...

view this post on Zulip Alex Vidal (May 13 2020 at 19:42):

very nice, thanks. hopefully we can get some help!

view this post on Zulip Alex Vidal (May 14 2020 at 16:03):

I'm going to try to run the minimal example in your ticket on MacOS but building and executing it in a docker container. I'm going to be shipping a wasmtime-go based local development proxy to my team in the next couple of weeks and it'd be nice to know I have a MacOS workaround..

view this post on Zulip Alex Vidal (May 14 2020 at 17:29):

@Alex Crichton looks like it was a regression in Go 1.14. I can't reproduce in 1.13.10. I setup a matrix of Go versions in my fork of wasmtime-go to confirm: https://github.com/avidal/wasmtime-go/pull/2

GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.

view this post on Zulip Alex Crichton (May 14 2020 at 17:31):

nice find!

view this post on Zulip Alex Vidal (May 14 2020 at 17:31):

I had a coworker run the test suite on MacOS Catalina (with the debug tag to trigger the GC often) using Go 1.13.10 about a hundred times and he never saw a segfault.

view this post on Zulip Alex Crichton (May 14 2020 at 17:31):

Perhaps we could update wasmtime-go CI and also add a note to the README?

view this post on Zulip Alex Vidal (May 14 2020 at 17:32):

I can open a PR for that. Add go 1.13.10 to CI yeah?

view this post on Zulip Alex Crichton (May 14 2020 at 17:32):

sounds good to me yeah

view this post on Zulip Alex Vidal (May 14 2020 at 17:42):

opened. not sure how much detail you want in the README

view this post on Zulip Alex Vidal (May 14 2020 at 18:26):

your comment in the README is a lot better :)

view this post on Zulip Alex Crichton (May 14 2020 at 18:34):

oh no worries just tweaking things :)

view this post on Zulip Alex Vidal (May 14 2020 at 19:13):

This is (probably) unrelated, but if you run the test suite using go 1.14 with -race, it'll enable the checkptr feature which says that func.go#319 has unsafe pointer arithmetic (reproducible on linux)

view this post on Zulip Alex Vidal (May 15 2020 at 17:00):

just wanted to say thanks to you @Alex Crichton and the rest of the wasmtime team. My local proxy can trivially handle over 100 rps sustained. It takes the incoming http request, executes a wasm program which modifies the request and calls back to Go, which sends the request to another service and returns the response back into wasm which then sends it back into Go and back to the user.

view this post on Zulip Alex Vidal (May 15 2020 at 17:01):

Effectively the Go proxy is a runtime for middleware implemented (currently) in Rust, compiled to wasm.

view this post on Zulip Alex Vidal (May 15 2020 at 17:01):

100 rps with a 90th percentile latency of 11ms

view this post on Zulip Alex Vidal (May 15 2020 at 17:02):

(once I put a mutex around instantiating the module that is)

view this post on Zulip Alex Vidal (May 15 2020 at 17:02):

oh and that's with creating a new instance of the module for every single request which is how the middleware api is designed

view this post on Zulip Alex Vidal (May 17 2020 at 03:38):

If I have Rust code that imports a function whose "extern C" decl has an arg for a u32, and I link that function via wasmtime-go, I have to provide a function that accepts an int32. What if, on the Rust side, I call that function with the max u32? That won't fit into the int32 on the Go side.

view this post on Zulip Alex Vidal (May 17 2020 at 03:57):

Ok. I messed with this a bit more. Looks like I can just cast the param to uint32 in Go, which makes sense.

view this post on Zulip Alex Vidal (Jun 04 2020 at 21:58):

Can you summarize the impact of the 0.17.0 change @Alex Crichton? The wasmtime change indicates module can be sent across threads. Do you still need to lock the store to a single thread at a time?

view this post on Zulip Alex Vidal (Jun 04 2020 at 21:59):

It looks like it's a change that could eventually unlock multiple instances on multiple threads, but we're not there yet, correct?

view this post on Zulip Alex Crichton (Jun 04 2020 at 21:59):

@Alex Vidal that's been fixed now! The API hasn't been updated yet but a Module is tied to an Engine now, not a Store

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:00):

and Engine is thread-safe (as before)

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:00):

Module is now thread safe as well

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:00):

(I probably need to update the free code too...)

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:00):

modules can be shared across threads, but instances cannot

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:00):

instances will likely always be locked to one thread

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:01):

Ok, so given a triple (module, store, engine) I can create (and use) N instances on N threads?

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:02):

(and yeah, i don't mean to move an instance across threads)

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:03):

I ended up rearchitecting to having a pool of instances that each have their own context (module, store, linker) and go back into the pool once they're done. But it sounds like I could instead go back to my original design where I have a single (module, store, linker) and create as many new instances as I want that are used on different threads?

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:03):

you can have a single engine/module, but you still need a store-per-thread

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:04):

and store represents the lifetime of an instance as well (in terms of memory allocations), so you likely want a store to be relatively short-lived

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:04):

Ah, I didn't know that. I wasn't disposing the store.

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:05):

note that this could all use better documentation as well

view this post on Zulip Alex Crichton (Jun 04 2020 at 22:05):

which is something I would like to get to!

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:06):

Ok so. I can construct a module given an engine. And hold onto that module (I don't need to keep the engine around). And then I can construct a store, wasi instance, linker, and then an actual wasm instance on each request, using the same module

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:06):

(didn't realize that about the store, probably because the module depended on it when i started, so i thought it was more of a general wasmtime context)

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:08):

Just looked a bit more. I didn't realize your "That's been fixed!" comment referred to something past 0.17.0.

view this post on Zulip Yury Delendik (Jun 04 2020 at 22:12):

You may need to keep engine around to create stores though

view this post on Zulip Yury Delendik (Jun 04 2020 at 22:14):

it is uncharted territory when two modules from different engines with different settings are used (might work though :shrug: )

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:19):

Yeah. It's just a bit confusing because the c-api change from your PR didn't change the signature for wasm_module_new, it still takes a store, it just grabs the engine from the store and uses that instead. So I suppose that PR was internal (from the module c-api perspective).

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:20):

But I suppose, for my purposes, the store that I supply isn't actually important? I can just create one to use to compile the module and then throw it away after?

view this post on Zulip Yury Delendik (Jun 04 2020 at 22:20):

wasm_module_new will not change we are just API implementors, we can change wasmtime_module_new and I did, but last minute HostRef refactoring changed it back

view this post on Zulip Yury Delendik (Jun 04 2020 at 22:22):

wasm-c-api requires usage of wasm_module_share wasm_module_obtain when transferred between stores (hence threads)

view this post on Zulip Yury Delendik (Jun 04 2020 at 22:23):

see e.g. https://github.com/WebAssembly/wasm-c-api/blob/master/example/threads.c#L32

Wasm C API prototype. Contribute to WebAssembly/wasm-c-api development by creating an account on GitHub.

view this post on Zulip Yury Delendik (Jun 04 2020 at 22:24):

but we can do whatever we want with rust api :)

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:24):

Roger. Getting pretty far out of my domain but let me know if there's something I can help with in the Go library @Alex Crichton

view this post on Zulip Alex Vidal (Jun 04 2020 at 22:25):

Thanks for the example, that makes it a bit more clear!

view this post on Zulip Alex Vidal (Jun 13 2020 at 03:23):

nice work on reducing the macos bug report @Alex Crichton !


Last updated: Dec 23 2024 at 13:07 UTC