I'm implementing the custom import section proposal in wasm-tools this week and just have a few questions about the implementation.
The first and biggest question is: how should I update the binary parser to handle these groups of imports? Right now we have:
pub type ImportSectionReader<'a> = SectionLimited<'a, Import<'a>>;
But now the entries in the import section can contain multiple imports. So naively I would think I need to change Import<'a> to Imports<'a> (and declare that Imports type). But that would presumably cascade through the entire wasm-tools library in a pretty unpleasant way, no?
In the main spec (and SpiderMonkey) I was able to have the imports be a flat list in the module AST. Is that possible here? (Sorry for the noob question, I am not that familiar with the overall architecture of wasm-tools and how all the parts connect)
Or do we want to plumb that information all the way through so that text encodings like (import "foo" (item "bar" type) (item "baz" type)) map to the new binary encoding and vice versa?
Also, this skip stuff from issue 188 really is a huge pain here, unless I'm missing something.
Yeah unfortunately wasm-tools is probably going to be more difficult than spidermonkey here because wasmparser tries to reflect (roughly) the AST of the module as opposed to specializing on just one embedding (e.g. "just read all imports" vs "give me a parsed view of what all the imports look like")
Given that I think, yes, this'll result in s/Import/Imports/. We can add helper methods though to assist with "just ignore the binary encoding and give me all the imports", for example a method that auto-flat_map's the result or something like that. Basically if you find yourself having to update a lot of code I'm happy to help out and come up with a way to reduce the boilerplate.
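To make the "auto-flat_map" idea concrete, here is a rough sketch with stand-in types (`Import`, `Imports`, and a `u32` in place of a real `TypeRef` are all illustrative, not the actual wasm-tools API) of how a flattening helper could hide the grouped encoding from callers that just want all the imports:

```rust
// Stand-in for wasmparser's per-import record.
#[derive(Clone, Debug, PartialEq)]
pub struct Import<'a> {
    pub module: &'a str,
    pub name: &'a str,
    pub ty: u32, // stand-in for a real type reference
}

// Stand-in for an import-section entry under the proposal: either the
// classic one-module-one-name entry or a compact group.
#[derive(Clone, Debug)]
pub enum Imports<'a> {
    Single(Import<'a>),
    Compact { module: &'a str, items: Vec<(&'a str, u32)> },
}

impl<'a> Imports<'a> {
    /// Ignore the binary structure and yield every import as a plain
    /// `Import`, duplicating the shared module name as needed.
    pub fn flatten(&self) -> Vec<Import<'a>> {
        match self {
            Imports::Single(i) => vec![i.clone()],
            Imports::Compact { module, items } => items
                .iter()
                .map(|&(name, ty)| Import { module: *module, name, ty })
                .collect(),
        }
    }
}
```

Callers that don't care about the grouping can then `flat_map` over entries with `flatten()` and keep working with plain `Import`s.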
For skip, yeah :(. At this point I feel that it's sort of just a fundamental design property (flaw?) of how wasmparser works and there's not really any way around it, unfortunately
Well, hopefully maybe I'm doing this right
image.png
From a possibly naive point of view though I would agree and say that we do want to plumb the binary structure through wasmparser so, for example, wasmprinter could print the (item ...) form instead of the multi-(import ...) form
yeah that looks reasonable to me w.r.t. skip
that one's definitely on us to fix that at some point
IIRC the only "solution" I could think of to skip was completely inverting control flow to instead of "pull things from an iterator" it's "wasmparser pushes things into a trait you provide"
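A toy illustration of that inversion (all names here are made up, not a real wasmparser interface): instead of the caller pulling items from an iterator, the parser drives a visitor trait and handles skipping itself when the visitor declines a section:

```rust
// Hypothetical push-based interface: the parser calls into a trait the
// caller provides, instead of the caller pulling from an iterator.
trait SectionVisitor {
    /// Return `false` to skip this section's contents entirely.
    fn on_section(&mut self, id: u8) -> bool;
    fn on_import(&mut self, module: &str, name: &str);
}

// Toy "parser" driving a visitor over a pre-decoded module: a list of
// (section id, imports) pairs stands in for real binary sections.
fn parse(sections: &[(u8, Vec<(&str, &str)>)], v: &mut impl SectionVisitor) {
    for (id, imports) in sections {
        if !v.on_section(*id) {
            continue; // skipping is the parser's job, not the caller's
        }
        for (m, n) in imports {
            v.on_import(m, n);
        }
    }
}

// Example visitor that only looks at section id 2 (the import section).
struct Collector {
    seen: Vec<String>,
}

impl SectionVisitor for Collector {
    fn on_section(&mut self, id: u8) -> bool {
        id == 2
    }
    fn on_import(&mut self, module: &str, name: &str) {
        self.seen.push(format!("{module}.{name}"));
    }
}
```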
Hopping back here for an open-ended question, I'm trying to figure out how to write the reencoder stuff to map from wasmparser::Imports to wasm_encoder::core::Imports
in a very copy-pastey sort of way, I have set up the types like so:
/// A single item within a compact import group.
#[derive(Clone, Debug)]
pub struct ImportCompact<'a> {
    /// The name of the imported item.
    pub item: &'a str,
    /// The type of the imported item.
    pub ty: EntityType,
}
/// A group of imports sharing a single module name (compact form 1).
#[derive(Clone, Debug)]
pub struct ImportGroupCompact1<'a> {
    /// The module these items are imported from.
    pub module: &'a str,
    /// The items imported from `module`.
    pub items: Cow<'a, [ImportCompact<'a>]>,
}
But a) I have no idea what the Cow is doing for me, and b) I have no idea how to map over to that
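On the Cow question: `Cow<'a, [T]>` lets the same field hold either a borrowed slice (zero-copy, e.g. when the items can be handed out straight from parsed data) or an owned `Vec` (e.g. when a reencoder has to build the list itself). A minimal illustration of the two states:

```rust
use std::borrow::Cow;

// A `Cow` is an enum over "borrowed" and "owned" forms of the same data;
// readers of the field don't care which one they got.
fn describe(items: &Cow<'_, [u32]>) -> &'static str {
    match items {
        Cow::Borrowed(_) => "borrowed",
        Cow::Owned(_) => "owned",
    }
}
```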
ah, I see some other code in here just collects everything into a vec?
(in reencode.rs)
maybe I'm getting it
image.png
that all looks reasonable to me yeah :+1:
Ok, I think I really do need some broad guidance for the text parser. At the moment the accepted formats are:
(import "foo" "bar" (global i32))
(import "foo" (item "bar" (global i32)) (item "baz" (global i32)))
(import "foo" (item "bar") (item "baz") (global i32))
When it comes to impl Parse for Imports, I really hardly know where to begin. I have to first distinguish between the single-item format and the compact formats, but then also if I don't see a type in the first item, I need to start buffering up names until the very end where we expect to see a type before the closing parentheses.
Roughly that sounds about right to me, the results need to be buffered up no matter what as they're getting inserted into the AST. I can see how the parsing of the first item will be a bit wonky though, I think that you'll need to basically parse a full item before you go down one of the 3 parsing paths and then handle the first thing that was parsed a bit specially
So for example in Parse<'a> for Import<'a> before parsing field you could peek to see if there's a string or a parenthesis, and that'll disambiguate "single" vs "compact", and if it's a paren you'd then parse (item ...) in a loop. That'd then get joined together into a similar enum that wasmparser has
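The disambiguation step can be sketched with a toy token stream (the `Token` enum and `classify` function below are stand-ins for the real `wast` parser and its `peek` machinery): after `(import "foo"`, a string token means the single form and a left paren means the compact `(item ...)` form.

```rust
// Toy stand-ins for wast tokens; the real parser would use
// `parser.peek::<&str>()` / `parser.peek::<LParen>()` instead.
#[derive(Clone, Debug, PartialEq)]
enum Token<'a> {
    Str(&'a str),
    LParen,
}

#[derive(Debug, PartialEq)]
enum ImportForm {
    Single,
    Compact,
}

// Peek at the token following the module name to pick a parsing path.
fn classify(tokens: &[Token]) -> Option<ImportForm> {
    match tokens.first()? {
        Token::Str(_) => Some(ImportForm::Single),
        Token::LParen => Some(ImportForm::Compact),
    }
}
```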
Sure, seems reasonable
How do I peek for a paren? I assume peek::<something>
parser.peek::<LParen>()?
And for parsing items, should I just make a type that has Option<ItemSig> or something?
hm, so before I answer that, question for you, where does the $foo go in the (item ...) form?
e.g. (import "foo" "bar" (global $foo i32)) is the "single" form
It doesn't
I originally thought it could go in the (item) but Rossberg said no
if you use this syntax you just don't get to use identifiers
shrug
I'm not thrilled about it but it might make for a good stage 4 bikeshed
I think you'll skip ItemSig and use ItemKind directly in that case, since the sig is about containing an identifier
so you'll want an enum of Single(ItemSig), Compact1(Vec<(&'a str, ItemKind)>), Compact2(ItemKind, Vec<&'a str>)
ish
Single would probably also have &'a str for the field
this'll cascade through the text resolver where you'll need to account for the compact forms introducing multiple items in the sense of adding to index spaces, but they'll all use None for their $foo name
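Spelling out that enum as a sketch (with placeholder `ItemSig`/`ItemKind` types, not the real wast ones), plus a small helper for the index-space point above, since each compact entry introduces as many items as it has names:

```rust
// Placeholder types; the real wast `ItemSig`/`ItemKind` carry the
// actual parsed type information.
#[derive(Debug)]
struct ItemSig<'a> {
    field: &'a str,
}
#[derive(Debug, Clone)]
struct ItemKind;

#[derive(Debug)]
enum Imports<'a> {
    /// `(import "foo" "bar" (global i32))`
    Single(&'a str, ItemSig<'a>),
    /// `(import "foo" (item "bar" ty) (item "baz" ty))`
    Compact1(Vec<(&'a str, ItemKind)>),
    /// `(import "foo" (item "bar") (item "baz") ty)`
    Compact2(ItemKind, Vec<&'a str>),
}

impl<'a> Imports<'a> {
    /// Number of items this entry adds to the index spaces; the
    /// resolver needs this even though the compact forms carry no
    /// `$foo` identifiers.
    fn count(&self) -> usize {
        match self {
            Imports::Single(..) => 1,
            Imports::Compact1(items) => items.len(),
            Imports::Compact2(_, names) => names.len(),
        }
    }
}
```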
How do I conditionally parse either an (item "foo") or an ItemSig though?
Hm ok actually, let me revise. You probably want to reuse ItemSig since it has the nitty-gritty of parsing types. You'll want to verify after-the-fact though that id is None.
Also, one change you'll need to make (or file an issue for) is that wasmprinter will need special logic to not print $foo names and instead print (@name ...) annotations
For (item ...) vs ItemSig, you can use parser.peek2::<kw::item>()?
and if that's false you do an ItemSig
wasmprinter will need special logic to not print $foo names and instead print (@name ...) annotations
I'm not following, what will the output need to look like?
to clarify, I'm imagining a name section in a binary which indeed has a name for a compact-imported-thing and ideally wasmprinter would still reflect that somehow so the name section could get faithfully represented. In lieu of $foo the other option is (@name "...") so what I'd naively expect is (item "foo" (@name "bar"))
Although this could also be an open issue and for now names on the items could just get dropped
I see, I think I will have to open an issue about that
it's probably best to just drop names yeah and file an issue
What is this text "resolver" thing?
That'll be in this folder (crates/wast/src/core/resolve/*.rs) which figures out indices for $foo and such
You'll end up needing to modify this for example
Thanks for the continued help here...for the iter_item_sigs_mut, would you recommend implementing it similarly to how you did the iterator for the wasmparser types?
idiomatically, yeah that's what I'd recommend, but code-ease-wise it's probably easier to do something like fn try_for_each_sig_mut(&mut self, f: impl FnMut(&mut ItemSig<'a>) -> Result<()>) -> Result<()> { ... }
whichever you feel most comfortable with I think is reasonable
To be honest I'm not really sure how to structure either one. Is this thing going to take mutable ownership of the Imports object? Or does it need to clone something? (Surely it can't do that if it needs to mutate things)
yeah no cloning here, and the method would likely be &mut self on Imports yeah
I'm just wondering how that works if I need to return some kind of MutItemSigsIterator
the easiest actually is probably fn sigs_mut(&mut self) -> Vec<&mut ItemSig<'a>>
and if we want to speed it up we can deal with that later (and it might well never be necessary)
hm that is appealing
I'll give that a try
one part about that to be aware of is that it's not suitable for "iterate over all import items introduced" because for the type-mentioned-once-names-mentioned-many-times compact form there's just one ItemSig and you can't push that onto a vector multiple times
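A sketch of that `sigs_mut` shape (placeholder types again, not the real wast AST), which also shows the caveat: the names-many-type-once form contributes exactly one sig no matter how many names it covers:

```rust
// Placeholder for the real wast `ItemSig`.
#[derive(Debug, PartialEq)]
struct ItemSig {
    name: String,
}

enum Imports {
    Single(ItemSig),
    Compact1(Vec<ItemSig>),
    // One shared sig covering many names.
    Compact2(ItemSig, Vec<String>),
}

impl Imports {
    /// Collect mutable references to every distinct `ItemSig`. Note
    /// `Compact2` yields one sig even when it introduces many items,
    /// so this is NOT a way to iterate "all items introduced".
    fn sigs_mut(&mut self) -> Vec<&mut ItemSig> {
        match self {
            Imports::Single(s) => vec![s],
            Imports::Compact1(sigs) => sigs.iter_mut().collect(),
            Imports::Compact2(s, _) => vec![s],
        }
    }
}
```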
ah right
well now I'm trying to think how that even works with the resolver or whatever
when I implemented this in spidermonkey I didn't actually have all the imports share a type, I would just go back and parse the common type bytes over and over again
seemingly that is not how it would work here
yeah for the text format at least I think we'll want to maintain the original AST to faithfully translate it to binary
for the resolver I think you'll want a plain old match on the imports and handle each case differently
but my hope is that many other locations, like "expansion" can be "just gimme the sigs" since they don't care about indices and it's more about AST structure there
More Rust noob questions: is it fundamentally wrong for me to iterate over &Import when I am synthesizing these Import objects on the fly?
image.png
if I change the iterator implementation back to this:
impl<'a> IntoIterator for Imports<'a> {
    type Item = Import<'a>;
    type IntoIter = Box<dyn Iterator<Item = Self::Item> + 'a>;
then that fixes the errors shown there, but then various uses of said iterator no longer seem to work well
image.png
Of course, even this doesn't work because that would mean returning Vec<Import<'a>>, not Vec<&Import<'a>> :rolling_eyes:
ugh! and doing all that doesn't work because we push the names into other data structures
image.png
I think I need to step back and figure out the right shape for all this
Like what types should we even have in the text AST? Should I even have a unified Import type any more when that literally doesn’t appear in many places?
Hm ok yeah I agree you probably don't want &Import or Import here since that doesn't exist for the compact forms. Some options I can think of are:
&ItemSig primarily, which does exist in the compact forms
The former sounds better if I can swing it
I guess that's all I can think of heh
and sigs_mut does seem to be good in several places
nice yeah
things like wast/src/component/binary.rs though I dunno
that might be a case where I should just have a big match on the text AST
one option here, since this is semi-similar to resolve, is to have something like fn ids(&self) -> Vec<...>
where that expands to a vec-the-length-of-items-introduced and that has the m.id and such, which would all be None for compact imports but Some for the single import case
that way the binary and resolve pass would do something like for (id, name) in import.ids() { ... }
maybe something like:
impl Imports<'a> {
    pub fn all<'b>(&'b self) -> Vec<(&'b Option<Id<'a>>, &'b Option<NameAnnotationThing>, &'b ItemSig<'a>)>;
}
just so I'm clear, the resolve pass resolves identifiers to indices, and the binary pass converts the text AST to binary?
right yeah, although the binary pass is also dealing with indices sort of -- it has to build up the name section and it additionally does a thing with dwarf where it needs to know the number of imported functions
I think I'll start with some matches and see where that takes me
Alex Crichton said:
maybe something like:
impl Imports<'a> { pub fn all<'b>(&'b self) -> Vec<(&'b Option<Id<'a>>, &'b Option<NameAnnotationThing>, &'b ItemSig<'a>)>; }
This could be reasonable but I just feel like it's surely going to need the name info in the mix somewhere and who even knows
sounds reasonable to me yeah
Surely this is not what you have to do just to check if you have a feature?
match (
    single_item_name,
    discriminator,
    #[cfg(feature = "features")]
    reader.features().compact_imports(),
    #[cfg(not(feature = "features"))]
    false,
) {
But when I run cargo check --no-default-features, reader.features() doesn't exist
you'll want to use reader.compact_imports(), there's some subtle handling there but the per-feature-methods are reexported on the BinaryReader itself
ah I see
What is the strategy for features in the text parser? Is there one?
none, it just accepts everything and produces what it's given
how in the world do I see where this is coming from? it seems like the text parse actually worked, but now I have a mystery error with no backtrace, and --verbose does absolutely nothing
$ cargo run -- parse test.wat -o test.wasm
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.06s
Running `target/debug/wasm-tools parse test.wat -o test.wasm`
error: unexpected end-of-file (at offset 0x3f)
huh that's odd!
the source for that command is here and you might need to add some printfs around there
that looks like a wasmparser-style error but I'm not entirely sure where wasmparser would be activated in the text-to-binary transform...
there's no way I can, like, use a real debugger?
the surface area of wasm-tools is enormous
oh sure you can use gdb and set breakpoints and such too
the usual rust debugging instructions for vs code seemingly don't work at all, so that's nice
back to printfs for me
surely you are not just putting printfs everywhere whenever you see a bug like this? surely you have some kind of stack trace at least?
depends on the error, most errors I personally debug have a pretty clear source. I'm not sure where this is coming from so I'd resort to printfs for this since I don't know off the top of my head where it's coming from
oh...it's doing weird things because for some reason wasm-tools parse will run the resulting wasm through the binary parser without features enabled
why in the world? can I not do that please?
or can I turn features on somehow?
the parse command doesn't seem to accept features in any way I can tell
do you have the line the error is being generated from?
I think it's literally just that the new compact import encoding, produced by the new text format, is being parsed as if it was not the compact import encoding, and doing Mysterious Things (reading too far in the resulting binary)
which seems baffling
I just want it to produce the binary, not produce the binary and then explode because it produced a binary with new features
this is why I was asking about features in the text format...it would be ok if the text parsing failed on the new syntax, or if the text parsing succeeded but it _didn't_ do this broken validation step, but it seems we have the worst of both worlds and I'm confused
Do you have a branch/repro I can poke at? I can try to help hunt this down and see if this is uncovering a preexisting bug in wasm-tools or something that can be improved
I'll push what I have + a test file
looks like you want to comment out this line; running dump on the produced *.wasm then yields:
0x0 | 00 61 73 6d | version 1 (Module)
| 01 00 00 00
0x8 | 01 09 | type section
0xa | 02 | 2 count
--- rec group 0 (implicit) ---
0xb | 60 01 7f 00 | [type 0] SubType { is_final: true, supertype_idx: None, composite_type: CompositeType { inner: Func(FuncType { params: [I32], results: [] }), shared: false, descriptor_idx: None, describes_idx: None } }
--- rec group 1 (implicit) ---
0xf | 60 01 7e 00 | [type 1] SubType { is_final: true, supertype_idx: None, composite_type: CompositeType { inner: Func(FuncType { params: [I64], results: [] }), shared: false, descriptor_idx: None, describes_idx: None } }
0x13 | 02 2a | import section
0x15 | 04 | 4 count
0x16 | 03 66 6f 6f | import [func 0] Import { module: "foo", name: "bar", ty: Func(0) }
| 03 62 61 72
| 00 00
0x20 | 03 66 6f 6f | import [func 1] Import { module: "foo", name: "baz", ty: Func(0) }
| 03 62 61 7a
| 00 00
--- import group (common module) ---
0x2a | 03 66 6f 6f | module: "foo", count: 2
| 00 7f 02
0x31 | 04 62 65 65 | import [func 2] ImportItemCompact { name: "beep", ty: Func(1) }
| 70 00 01
0x38 | 04 62 6f 6f | import [func 3] ImportItemCompact { name: "boop", ty: Func(1) }
| 70 00 01
error: unexpected end-of-file (at offset 0x3f)
ah, it was actually an incorrect count in the import section :upside_down:
@Alex Crichton Is there a timeline for the next release? I'm hoping to present the proposal in CG next week and would like to have our SpiderMonkey implementation landed by then.
No release cadence, just done on an as-needed basis
I'll try to get around to pushing the buttons today
Thanks!
ok a new release has been made
Last updated: Jan 10 2026 at 20:04 UTC