Starting a new topic for general chatting, as a possible starting point for beginners and newcomers in the Cranelift stream, and so we can get a link to it and put it in README etc.
Has anyone successfully cross-compiled wasmtime / Cranelift with cargo --target ...
? I'm trying to build an aarch64 binary and run it with qemu-aarch64, to avoid slow CPU on my physical aarch64 machine (RPi4). I'm running into an issue where some part of the build system is invoking native (x86) cc
. I've never cross-compiled with the Cargo toolchain in general and I'm curious if this is supposed to be supported...
What crate invokes cc?
cc @Alex Crichton: see cross compilation question above ^
@Chris Fallin in theory it should work just fine with the right configuration, but compiling for aarch64 won't succeed for wasmtime at least because it has x86/x86_64-specific things and will liekly fail to link/resolve/etc
cranelift however should be comipleable to aarch64 (I think?), but can you gist the error you're seeing?
Ah, I just discovered I needed some config in ~/.cargo/config
(as per japaric/rust-cross), namely
[target.aarch64-unknown-linux-gnu] linker = "aarch64-linux-gnu-gcc"
I'm now just seeing a missing symbol __chkstk
in wasmtime_jit::link::link_module
, which I suppose is some of the platform-specific glue you mentioned. How difficult do you think it would be to stub things out? (I just want to get far enough to try a "hello world" and shake out all my unimplemented!()
's in my new arm64 backend...)
@Chris Fallin to get things working at all you can use stubs like in https://github.com/bytecodealliance/wasmtime/pull/943
which will likely merge soon anyway
and yeah there you need to configure the linker for cargo to pass to rustc to use
Excellent, will take a look -- thanks very much!
is "v0 -> v1" a regmove? I can't find where that syntax is printed
hard to search for -> in rust code :P
its what happens when v0
becomes an alias for v1
I see, I guess those are skipped when iterating over insts in a block?
With for inst in f.layout.block_insts(bb) {
or maybe there is something which replaces the aliases with the actual value
@Joey Gouly Right. They're not instructions, they're just aliases.
see https://docs.rs/cranelift-codegen/0.59.0/cranelift_codegen/ir/dfg/struct.DataFlowGraph.html#method.change_to_alias and related methods
is there a reason they need to stay around, rather than the instructions being updated?
I think the main reason is that this design avoids the need for def-to-use linkage -- otherwise one needs to find all uses to do the rewrite.
ah good point. @Chris Fallin I think some of the new backend code is wrong then, because it is not following the aliases
@Joey Gouly I see, details welcome if you've found a bug :-)
You can call DataFlowGraph::resolve_aliases
on a value if you want to look through aliases
@Joey Gouly ah, I've found at least one case in LowerCtx
where it doesn't resolve aliases. I'll audit for this, but specifics of what you ran into would be useful as well, thanks!
The one I found was in the edge-blocks phi parts
@Chris Fallin the best way to do this would be to replace all value_regs[] lookup with a function that calls DataFlowGraph::resolve_aliases
Yup, on it -- thanks for finding this!
@Chris Fallin DataFlowGraph::value_def()
already resolves aliases, so you did it eagerly in a few places, not a big deal though
Does clif-util wasm
translate the entire wasm module to CLIF and then compile it? Does it translate per-function or?
edit: I think it's eager, I found src/wasm.rs
.
@Dan Gohman why are the aliases printed per block?
@Joey Gouly The aliases don't have an inherent location, so they could go anywhere.
Ok, that's what I expected. Just surprising to see them in random blocks
It's not always at the top of a block, although that's common because the SSA algorithm oftgen emits aliases for block parameters.
For aliases of instruction values, we print them immediately after the instructions whose values they're aliasing
Ah I see that now. I was mostly seeing them at the start of the block, due to large numbers of block arguments
Why does rust code compiled to wasm, have vmctx params in the clif?
The vmctx
is necessary to get for example the location of the heap
Ah, for example wasmtime will setup something to pass there?
Yes
With fn0 = u0:827 sig0
, is that u0:827 actually an entity reference to another function? I thought clif functions were insulated from all other functions
u0:827
is a number that doesn't have any meaning during compilation. After compilation the user of cranelift-codegen (for example cranelift-module, or wasmtime) interprets it to determine what the target of a relocations is.
Aha, so the user has to keep track of which functions CLIF functions are which numbers
Yes, except when using cranelift-module, in which case cranelift-module does it for you.
Can anyone online give me a quick review of a refactoring PR? No functionality changed... https://github.com/bytecodealliance/wasmtime/pull/1246
Hello. Could someone review/merge my PRs (https://github.com/bytecodealliance/wasmtime/pull/1382, https://github.com/bytecodealliance/wasmtime/pull/1534, https://github.com/bytecodealliance/wasmtime/pull/1510)?
I'll take a look at #1382
What are valid values for b8/16/32/64
types? According to Cranelift IR docs, they are either all zeros or all ones, but when I tried bconst.b8 true
I got 1
, not 0xFF
.
On the other hand, bnot
for b32
negates all bits and icmp
for vector types yields masks filled with zeroes or ones on x86, as boolean vector type.
Several larger boolean types are also defined, primarily to be used as SIMD element types. They can be stored in memory, and are represented as either all zero bits or all one bits.
So, I read that as: the SIMD types do have a defined representation but the scalar types may not
i.e. you have to use bint
to get an integer value out of a b8
but if you extracted a lane of a b8x16
it would be either 0x00
or 0xff
Hm, if you create a boolean vector with vconst
it's using 0
/1
: https://github.com/bytecodealliance/wasmtime/blob/master/cranelift/filetests/filetests/isa/x86/simd-vconst-rodata.clif#L13-L19
:slight_smile: then something is not right: perhaps the documentation should change or perhaps vconst.BxN
should change... since I don't do a lot with the BxN
types I haven't noticed any issues
And, thinking about it more, it is the documentation that is correct: it should match the Wasm SIMD spec which says
The comparison operations all compare two vectors lane-wise, and produce a mask vector with the same number of lanes as the input interpretation where the bits in each lane are 0 for false and all ones for true.
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#comparisons
This should be a relatively simple fix but feel free to open an issue if it wasn't the main point of what you were getting at... your question was more about scalar types?
No, I was interested in vector types, as I was looking into #1187.
There's a vselect
instruction in Cranelift that currently has no encoding and it could be encoded as blend
on x86. We could then use vselect
instead of bitselect
directly in wasm code translator or replace it in preopt.
I'll make fix for vconst
Sounds good; just a heads up that at least PBLENDW should be coming soon-ish: https://github.com/bytecodealliance/wasmtime/pull/1765/commits/66216fc1f5f133ebffdba42bf075da6b5cc3c634
does anyone know why cranelift got merged into wasmtime? Just curious really.
@Carlo Kok I wrote a comment explaining the reasoning here: https://github.com/bytecodealliance/wasmtime/issues/1185#issuecomment-591050388
thanks! note that I wasn't complaining. Just curious
good read. Thanks
note that I wasn't complaining. Just curious
Understood, yes, but thank you for emphasizing it! :heart:
is there a list of supported triples ?
So feature wise, things I currently use in LLVM that cranelift doesn't have yet: Exception handling, COMDAT; So I'd say cranelift already has a fairly impressive set of features.
x86_64 is the most complete target. aarch64 comes next. x86 is a little worse. risc-v and arm32 are just stubs.
oh arm32 isn't done at all? IT's a bit hard to see from the codegen repo
yes, there are literally no instruction encodings: https://github.com/bytecodealliance/wasmtime/blob/c91a9313b532067bb53eea44a182bb0c5bd2ebf4/cranelift/codegen/meta/src/isa/arm32/mod.rs#L73-L77
risc-v at least has some: https://github.com/bytecodealliance/wasmtime/blob/master/cranelift/codegen/meta/src/isa/riscv/encodings.rs
So feature wise, things I currently use in LLVM that cranelift doesn't have yet: arm(32), Exception handling, COMDAT; So I'd say cranelift already has a fairly impressive set of features.
updated :)
cranelift is currently transitioning to a new framework for backends. the aarch64 backend has been created using that new framework. the x86_64 backend is currently using the old framework. a new x86_64 backend is being developed, but it is still in a relatively early stage.
dwarf unwinding support would be useful for https://github.com/bjorn3/rustc_codegen_cranelift/ too. there is already an issue for it: https://github.com/bytecodealliance/wasmtime/issues/1677
I read that yes (new framework) but i do know x86-32 works too (old backend though)
yeah. Just unwinding has some concerns though: win32 is vastly different than win64; and there's exception handling itself too
Exception handling and unwinding isn't that different though. At least not on the targets I saw.
Cranelift can already generate SEH and .eh_frame tables. They are just treated as not requiring any cleanup.
ah cool
Wasmtime uses them to unwind back to the caller when the wasm code traps or a called rust function panics.
And this works with win32 seh too?
That is impressive
I don't know if it works on win32 SEH. As far as I know Wasmtime doesn't work on 32bit yet.
Ah
I want to tackle COMDAT myself once it's supported in the object code
would a feature reqeust for SEH and arm32 be welcome? so there's some place to track and gauge interest?
hi! I think there's already an issue open for supporting arm32: https://github.com/bytecodealliance/wasmtime/issues/1173
thanx! Missed that one
a boolean vector's elements are meant to be all 0 or all 1 right?
@Joey Gouly they at least encode false or true (i.e., are not vectors of false/true):
/// Boolean types: `B1`, `B8`, `B16`, `B32`, `B64`, and `B128`. These all encode 'true' or 'false'. The
/// larger types use redundant bits.
(from codegen/src/ir/types.rs)
I'm not sure whether a true
B128 is meant to be 0....001 or 1...111 though
I can't seem to find anything in the old x86 backend that would imply either way... continuing to look
Sorry what I meant was that each element was 0x00... or 0xff...
Ah, OK, it seems that it must be numeric 0
or 1
, because we use a simple zero-extension to go from e.g. B8
to I8
and the latter must be 0 or 1
Look at https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/simple_preopt.rs#L956 for example
Ah, well that is interesting indeed
@Benjamin Bouvier or @Dan Gohman do you know more?
I suppose it has to be 0 / -1 (ff...) given the masking above -- but then that means we have to be careful with bint
of the result of that must be 0 / 1
I seem to recall that true lanes are all ones, while false lanes are all zeroes, i don't know if it's guaranteed though
@Andrew Brown you might know ^?
Yeah, I think what Benjamin said is correct: the vector booleans have to be all 0s or all 1s... There was an issue about this, let me dig it up.
Here it is: https://github.com/bytecodealliance/wasmtime/pull/1769. This fixed a bug where boolean vector lanes were not all 0s or all 1s. The Cranelift IR documentation describing this is here: https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/docs/ir.md#boolean-types.
Thanks for the links Andrew!
@Chris Fallin looks like aarch64's Bint lowering is wrong?
Yes, given the above, it should mask down to the LSB
I'll put together a patch unless you want to include this in yours
Also icmp / fcmp -- they use cset
which yields 0/1. If dest type is > b1
then we need a negation
Vector comparisons don't use cset
Two separate issues I guess -- there's the question of vector lane values (your immediate question) but then there's also the representation of scalar b8
/ b16
/ ...
For the latter, we should also use the 0 / -1 representation
Oh silly me, I was thinking you meant bNxM
when you said >b1
Yup no worries! I'll patch the scalar case
Thanks! And the original reason I went looking into this was a mistake on my side anyway, but Im happy we found this issue due to it :)
@Alex Crichton Not sure where this message has gone, but: yesterday i had to pin the version of rustc nightly because staticvec doesn't compile in rustc nightly anymore, and this was breaking the build.
opened a PR for static vec, but it's not sufficient as staticvec's CI relies on miri, thus the latest rustc nightly version that supports miri, which is currently 2 weeks old... https://github.com/slightlyoutofphase/staticvec/pull/39
Bah, I'll work on updating our CI to be green soon
:+1:
@Alex Crichton since you're around, can i bother you with more CI questions?
Sure yeah what's up?
I'm trying to mark the cranelift filetests as should_panic with the experimental_x64 feature, but it doesn't seem to be taken into account when i run the tests, unfortunately.
What's the marker look like?
i've added #[cfg_attr(feature = "experimental_x64", should_panic)
above the test in cranelift/tests/filetests.rs, but it does nothing.
I'm running the tests with cargo --features experimental_x64 --all --exclude lightbeam
(+ peepmatic exclusions)
Is it a #[test] function?
yes it is
And that crate has that feature in cargo.toml?
yes
Hm weird... Wanna gist your patch and I'll take a look?
Sure, thanks!
https://paste.mozilla.org/ryemqRPh
Hm and you're not getting compile.failures?
Nope
If so then cranelift/cargo.toml may not have the feature?
I'll dig a bit more in in a sec
it is there, though: https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/Cargo.toml#L51
ok, so new fact: if i run the tests from wasmtime root dir, it doesn't happen. If i run from the wasmtime/cranelift directory, the compile error happens, and the test isn't run if i remove the body addends
aha
if you do cargo test -p cranelift-tools --feature ...
that doesn't actually do what you would expect
it's basically a bug in Cargo
it applies the features to w/e is in the current directory, not the target of -p
aha! and if i run in wasmtime with --all, it'll work? let me try this!
you'll need to do --manifest-path cranelift/Cargo.toml
or change the cwd
nah it won't workwith --all
either
there's a bit more info here about the unstable feature to fix this
oh well, thanks it works with this!
i'll split the CI steps then
@Alex Crichton well, does it mean that we should be able to run the tests with experimental_x64 in CI? i'm a bit lost regarding how to do this, then
I'd probably limit this to a nightly rustc which executes Cargo with the relevant -Z
flag
so that way we don't have to bend over backwards for this
great idea!
but it causes new compile time errors that i don't quite understand :confused:
https://paste.mozilla.org/BtuHrkxb
hm interesting
so what's happening in cargo is that the behavior of the feature resolver is different, where features are less eagerly unified
I think that means the wiggle crate depends on syn
but doesn't activate the right set of features
er, no
hm I'll have to dig into that later
that may actually be a bug in cargo? unsure
oops, sorry to cause more work
heh no worries
@Benjamin Bouvier btw it looks like wasmtime CI is green, what was the failure you were seeing again?
@Alex Crichton oh yeah i added a commit that pinned the version of rustc nightly, let me find that
https://github.com/bytecodealliance/wasmtime/commit/2c1d3704653f094a93848a87f79361bca511392b was the commit, but i can see there was another CI failure
https://github.com/bytecodealliance/wasmtime/runs/918219496?check_suite_focus=true was one of the CI failures with staticvec
hm did you bump a staticvec version?
b/c CI for the main branch came back green a few hour sago
no, i did pin the version of rustc nightly running in wasmtime (first link above)
hm ok well I can try to help with a PR
but it sounds like local changes may have necessitated a pin
I don't know what to debug if main CI isn't failing
see also https://github.com/slightlyoutofphase/staticvec/pull/39, but it seems that it requires a miri in rustc nightly to work
it makes sense that it'd break, but I don't know why main CI isn't failing in tha tcase
Reverting 2c1d3704653f094a93848a87f79361bca511392b should make the CI fail
oooh wait I see
you landed that on main
sorry I see now
no worries
I thought this was just in your PR
i did it a bit preemptively because it was blocking a PR landing in spidermonkey
ah ok, in any case that's all I would have done, we can't do much until staticvec updates
nah no worries
just got mixed up what was landed where
@Benjamin Bouvier I think it'd be good to do these kinds of things in their own PR, or at least mention the pinning in the PR they're included in
@Till Schneidereit yeah, i mentioned it here on zulip with a mention to Alex, but this message disappeared in the meanwhile, not sure what happened...
i'll put a message in the PR next time
ah, yeah, the vanished message is very strange. But documenting things like this in a PR is good practice anyway, I think. Though really I'd prefer it to be its own PR, and an issue opened about reverting it at the same time
Right. There's also a speed vs process tradeoff here: opening a new PR would have taken two more CI runs (one for the PR, one to rebase the other PR on top of the first one). And, this is not the first time this is happening after the merger of Wasmtime and Cranelift repositories (i can recall at least two other instances, last being last week, so third time this is happening to me), and because of time zones, people working in europe are more likely to hit the "unrelated CI failures because of a rustc nightly change" before anyone else...
@Benjamin Bouvier ok I opened a Cargo issue for the wiggle problem you're seeing
in the meantime you can use doctest = false
in crates/wiggle/macro/Cargo.toml
and still use -Zfeatures=all
i think
@Alex Crichton Thanks a lot! I can run all the tests! (with a few annotations to skip some)
that being said, if i try to run a particular test by adding its name to the CLI command, i run into a similar compile error as above, again
it's probably a later problem, though, as long as we can run something in the meanwhile
Just a heads up: it seems the test_x64 CI has an intermittent failure, not sure yet what causes it, and I won't be around or have time today to investigate it too much, unfortunately.
I observed something similar in #2060 but thought it was due to rebasing. It's a bit hard to figure out what is going wrong because all the should_panic
tests emit backtraces so there are a lot errors in the logs--is there a way to mute those somehow?
@Chris Fallin, @Alex Crichton: I think #2071 looks like a good change if we overlook the intermittent CI issues--with Benjamin out, should we merge it or look into the CI issues first?
seems fine by me, but I don't know much about the components in play here
It seems this is good to merge, but I also think we should address the CI intermittent failure somehow; slippery slope to "manually ignore" tests
i managed to catch the crash in rr, that i packed so one can reproduce it on a similar CPU (machine cpuinfo is there: https://benj.me/pub/cpuinfo)
Here's the packed trace directory, to be unzipped and then run with rr replay $trace_dir
: https://benj.me/pub/rr-trace.zip
We should probably disable the x64 CI if it's too buggy, though
/me disappears
I see someone restarted the CI and this time the new backend step passed; I'll merge #2071 since I can use Benjamin's changes and we can talk more about troubleshooting or disabling the new backend step if it keeps happening (any objections?)
Fwiw, Cranelift has been shipped in Firefox and will be used for wasm compilation on aarch64 devices! Thanks to all who contributed making this possible! cc @Joey Gouly @Anton Kirilov etc.
(Also: these two days are company-wide PTO for Mozillians, so Mozillians' activity might be low!)
@Benjamin Bouvier awesome!
:rocket: :confetti: :tada: :ship: :tada: :confetti: :rocket:
@fitzgen (he/him) Hi, I think changing from String -> anyhow broke some of the usefulness of clif-utils errors
previously when it failed, it printed out what it matched / what it missed, but I dont seem to get that now
oh? which ones? anyhow
has lots of ability to annotate errors with extra context, and its possible I accidentally forgot to translate some string concatenation into annotated context
for example: cargo run -p cranelift-tools -- test cranelift/filetests/filetests/vcode/aarch64/reftypes.clif
Joey Gouly said:
previously when it failed, it printed out what it matched / what it missed, but I dont seem to get that now
for filetests? its possible that I messed up the if <condition> { print() }
logic when translating CLI flags, which should be unrelated to anyhow, although it was the same PR. some of them used flag_print
and others used flag_verbose
, and I might have messed some of them up
if you delete one of the nextln
checks, the output you get is: FAIL reftypes.clif: compile
let me look into it
thanks!
Fix over here: https://github.com/bytecodealliance/wasmtime/pull/2207
Thanks!
@fitzgen (he/him) Hi, I was wondering in general what the plans for peepomatic were? Is the aim to turn it on by default?
@Joey Gouly
The most immediate plan is to demonstrate better codegen while not sacrificing compilation time (very much, since peepmatic is currently a little slower than the hand-written simple preopt pass) by creating peephole passes from superoptimizations synthesized by Souper. See the recent work to add clif-util souper-harvest
and clif-util souper-to-peepmatic
commands.
After that is successful, we can start talking about when it makes sense to turn on custom Peepmatic passes for which Cranelift embeddings. For example, it likely makes sense for Lucet before it makes sense for SpiderMonkey.
Eventually I would like to replace the hand written simple preopt with the peepmatic version. There are a few open questions here. First is the question of compile times: peepmatic is currently a little slower than the hand written pass. There's certainly more room for improvement here. But it is also easier to add new optimizations via peepmatic DSL than via open coding matches on InstructionData
, and peepmatic is designed such that its performance scales as the number of optimizations in a peephole pass grows. When do we cross the line from "not worth it" to "worth it" when balancing compile times and code quality? I'm not sure. That's kind of why I've been declining to add anything to the peepmatic version of simple-preopt that isn't already in the hand-written version. At least this way we are comparing apples to apples, even if we are leaving peepmatic's biggest strengths on the floor.
This sort of ties into having a better benchmark suite as well, so we can have concrete numbers to work with. A benchmark suite that doesn't require a Web embedding. I want to start work on this soon because it is so hard to evaluate any decisions we make related to this stuff without it.
More things I'd like Peepmatic to do eventually:
vcode peephole optimizations (it is already generic over the IR it is optimizing in anticipation of this, but we need a simple reaching defs analysis so we can determine whether two uses of some virtual register are the same value or not)
clif -> vcode lowering
perhaps even instruction selection (the go compiler uses their equivalent DSL for instruction selection and I find it very nice, but other folks might have other feelings on this topic)
perhaps port LLVM's inst combine pass over to peepmatic
@fitzgen (he/him) thanks for the explanation!
hi! i was wondering if there is documentation somewhere for cranelift_frontend's FuncInstBuilder ? it seems to be where the logic for instructions is, but docs.rs doesn't seem to have generated documentation for it (indeed, searching a function used in the example on https://docs.rs/cranelift-frontend/0.69.0/cranelift_frontend/index.html such as iconst
doesn't yield any results)
(FuncInstBuilder being what is returned by cranelift_frontend::FunctionBuilder::ins()
)
FuncInstBuilder implements the InstBuilder trait: https://docs.rs/cranelift-codegen/0.66.0/cranelift_codegen/ir/trait.InstBuilder.html
@carado
oh cool, thanks!
Hello!
I am getting (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION) when trying to Linkage::Import
#[no_mangle]
pub extern "C" fn ello_world() -> i64 {
println!("Ello World!");
0
}
Nvm I fixed it. Is there a way to do overloaded functions in module.symobl() as the docs say that the one last created is accepted
Last updated: Dec 23 2024 at 12:05 UTC