Stream: git-wasmtime

Topic: wasmtime / issue #5035 Implement loading modules from pip...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 09 2022 at 14:47):

cr0sh commented on issue #5035:

Piping tests now run on unix platforms only(as the - substitiution applies only on unix too). Double buffering issue is resolved.
What do you think about implicitly allowing access to /dev/tty on wasmtime - as a special case?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 23 2022 at 14:37):

cr0sh commented on issue #5035:

I simplified the strategy to detect piped files(directly invokes is_fifo on Unix) and fixed tests incorrectly set up.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 25 2022 at 21:01):

jameysharp commented on issue #5035:

Here's the strategy I'd suggest:

Add a new method to wasmtime::Module that accepts both precompiled binaries and wasm/wat files, and uses mmap. I'd call it from_trusted_file but I'm open to suggestions. It should probably be unsafe, like Module::deserialize_file, for the same reasons.

In the CLI, if allow_precompiled is true, then call Module::from_trusted_file; otherwise, call Module::from_file just like it does today. Don't inspect the bytes of the file at all in the CLI, but put that logic in the wasmtime crate instead. I think that addresses @sunfishcode's comment on keeping the CLI as a simple demonstration of how to use the wasmtime public API, and improves on the current state where there's this magic ELF-header check in the CLI.

In your motivating use case (curl | wasmtime), users shouldn't be passing --allow-precompiled, so the CLI would call Module::from_file, resulting in a single fs::read that consumes the piped input exactly once, which solves the original bug you encountered.

This new method should read the file using MmapVec::from_file. Only the OS knows exactly which file types it's capable of mapping into memory, so this shouldn't check whether the file is a FIFO or anything along those lines; just let the syscall fail if someone uses --allow-precompiled with a pipe.

Once you have an MmapVec, you can check whether it begins with b"\x7fELF". If so, pass it to SerializedModule::from_mmap; otherwise, pass it to Module::new. Neither function copies the bytes it's given, addressing @bjorn3's concern.

I think the only downside to this plan is that the error message from the CLI isn't as friendly if someone tries to run a precompiled artifact without passing --allow-precompiled. We could make Module::build_artifacts return a more helpful error if the bytes it receives start with an ELF header, instead of just reporting that parsing failed.

One advantage to this approach is that wasmtime --allow-precompiled becomes potentially faster for .wat and .wasm files, because it'll use mmap for those cases too.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 26 2022 at 14:54):

cr0sh commented on issue #5035:

@jameysharp Thanks for your detailed guidance! I didn't think fixing this would not be hard this way and admittedly I couldn't do some sort of refactoring the project(and even adding a new public API for it) never saw before, tempting myself to bring somewhat hackish way. I'll try to reimplement this way.

I'd call it from_trusted_file but I'm open to suggestions.

This looks good.

I think the only downside to this plan is that the error message from the CLI isn't as friendly if someone tries to run a precompiled artifact without passing --allow-precompiled. We could make Module::build_artifacts return a more helpful error if the bytes it receives start with an ELF header, instead of just reporting that parsing failed.

I'm not sure that I can bring an excellent solution for this; just will do my best.

To recap I leave a tabular summary for your suggestion:

no --allow-precompiled <br /> (Module::from_file) --allow-precompiled <br /> (Module::from_trusted_file)
precompiled regular: opens OK and fails on some inner procedure but should be rejected via an additional check? TODO: better error handling <br /> fifo: same regular: opens OK w/ mmap <br /> fifo: mmap fails (intentionally left broken)
.wasm/.wat regular: opens OK <br /> fifo: opens OK (major change on this PR) regular: opens OK w/mmap <br /> fifo: same

view this post on Zulip Wasmtime GitHub notifications bot (Oct 26 2022 at 16:19):

jameysharp commented on issue #5035:

should be rejected via an additional check

Right, the key here is that a file that begins with the 4-byte ELF magic is not valid for .wat or .wasm format and will be rejected by those parsers. None of the characters a .wat file can start with (which I think are whitespace, (, or ;) include \x7f. And the binary .wasm format has its own different 4-byte magic (\0asm).

So from_file already rejects precompiled modules, but it does it by reporting a generic parse failure, which is not super friendly. We can be nicer to users by specifically checking for this case. But if you have any trouble with that, I don't think that has to be in this PR.

The existing check that the CLI does for the ELF magic serves two purposes: one is to provide a helpful error message, and the other is to decide whether to use deserialize_file or from_file. So one way to look at the approach I've outlined is that it does the same things, but in a different order.

I couldn't do some sort of refactoring the project(and even adding a new public API for it) never saw before, tempting myself to bring somewhat hackish way.

Absolutely! I had to stare at the existing code for a long time, and think about a bunch of options that don't work, to come up with this suggestion. So of course I don't expect somebody new to the project to think of it.

I think the project will be better off with this additional API. In my opinion, it would be an improvement even if it didn't help with reading from pipes: the CLI shouldn't need to know about ELF magic. But I also think having support for reading from pipes is a great idea, so I like that this fixes multiple issues at the same time.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 29 2022 at 16:11):

cr0sh commented on issue #5035:

Rebased and implemented the @jameysharp 's guidance. (I'll move to a new pull request as the topic has changed, so please leave this PR not reviewed)


Last updated: Oct 23 2024 at 20:03 UTC