cr0sh commented on issue #5035:
Piping tests now run on unix platforms only(as the
-
substitiution applies only on unix too). Double buffering issue is resolved.
What do you think about implicitly allowing access to /dev/tty onwasmtime -
as a special case?
cr0sh commented on issue #5035:
I simplified the strategy to detect piped files(directly invokes
is_fifo
on Unix) and fixed tests incorrectly set up.
jameysharp commented on issue #5035:
Here's the strategy I'd suggest:
Add a new method to
wasmtime::Module
that accepts both precompiled binaries and wasm/wat files, and usesmmap
. I'd call itfrom_trusted_file
but I'm open to suggestions. It should probably beunsafe
, likeModule::deserialize_file
, for the same reasons.In the CLI, if
allow_precompiled
is true, then callModule::from_trusted_file
; otherwise, callModule::from_file
just like it does today. Don't inspect the bytes of the file at all in the CLI, but put that logic in thewasmtime
crate instead. I think that addresses @sunfishcode's comment on keeping the CLI as a simple demonstration of how to use thewasmtime
public API, and improves on the current state where there's this magic ELF-header check in the CLI.In your motivating use case (
curl | wasmtime
), users shouldn't be passing--allow-precompiled
, so the CLI would callModule::from_file
, resulting in a singlefs::read
that consumes the piped input exactly once, which solves the original bug you encountered.This new method should read the file using
MmapVec::from_file
. Only the OS knows exactly which file types it's capable of mapping into memory, so this shouldn't check whether the file is a FIFO or anything along those lines; just let the syscall fail if someone uses--allow-precompiled
with a pipe.Once you have an
MmapVec
, you can check whether it begins withb"\x7fELF"
. If so, pass it toSerializedModule::from_mmap
; otherwise, pass it toModule::new
. Neither function copies the bytes it's given, addressing @bjorn3's concern.I think the only downside to this plan is that the error message from the CLI isn't as friendly if someone tries to run a precompiled artifact without passing
--allow-precompiled
. We could makeModule::build_artifacts
return a more helpful error if the bytes it receives start with an ELF header, instead of just reporting that parsing failed.One advantage to this approach is that
wasmtime --allow-precompiled
becomes potentially faster for .wat and .wasm files, because it'll use mmap for those cases too.
cr0sh commented on issue #5035:
@jameysharp Thanks for your detailed guidance! I didn't think fixing this would not be hard this way and admittedly I couldn't do some sort of refactoring the project(and even adding a new public API for it) never saw before, tempting myself to bring somewhat hackish way. I'll try to reimplement this way.
I'd call it from_trusted_file but I'm open to suggestions.
This looks good.
I think the only downside to this plan is that the error message from the CLI isn't as friendly if someone tries to run a precompiled artifact without passing --allow-precompiled. We could make Module::build_artifacts return a more helpful error if the bytes it receives start with an ELF header, instead of just reporting that parsing failed.
I'm not sure that I can bring an excellent solution for this; just will do my best.
To recap I leave a tabular summary for your suggestion:
no --allow-precompiled
<br /> (Module::from_file
)--allow-precompiled
<br /> (Module::from_trusted_file
)precompiled regular: opens OK and fails on some inner procedure but should be rejected via an additional check? TODO: better error handling <br /> fifo: same regular: opens OK w/ mmap <br /> fifo: mmap fails (intentionally left broken) .wasm
/.wat
regular: opens OK <br /> fifo: opens OK (major change on this PR) regular: opens OK w/mmap <br /> fifo: same
jameysharp commented on issue #5035:
should be rejected via an additional check
Right, the key here is that a file that begins with the 4-byte ELF magic is not valid for .wat or .wasm format and will be rejected by those parsers. None of the characters a .wat file can start with (which I think are whitespace,
(
, or;
) include\x7f
. And the binary .wasm format has its own different 4-byte magic (\0asm
).So
from_file
already rejects precompiled modules, but it does it by reporting a generic parse failure, which is not super friendly. We can be nicer to users by specifically checking for this case. But if you have any trouble with that, I don't think that has to be in this PR.The existing check that the CLI does for the ELF magic serves two purposes: one is to provide a helpful error message, and the other is to decide whether to use
deserialize_file
orfrom_file
. So one way to look at the approach I've outlined is that it does the same things, but in a different order.I couldn't do some sort of refactoring the project(and even adding a new public API for it) never saw before, tempting myself to bring somewhat hackish way.
Absolutely! I had to stare at the existing code for a long time, and think about a bunch of options that don't work, to come up with this suggestion. So of course I don't expect somebody new to the project to think of it.
I think the project will be better off with this additional API. In my opinion, it would be an improvement even if it didn't help with reading from pipes: the CLI shouldn't need to know about ELF magic. But I also think having support for reading from pipes is a great idea, so I like that this fixes multiple issues at the same time.
cr0sh commented on issue #5035:
Rebased and implemented the @jameysharp 's guidance. (I'll move to a new pull request as the topic has changed, so please leave this PR not reviewed)
Last updated: Jan 24 2025 at 00:11 UTC