Stream: git-wasmtime

Topic: wasmtime / Issue #2681 newBE: value_labels_ranges is very...


Wasmtime GitHub notifications bot (Feb 23 2021 at 21:05):

bjorn3 opened Issue #2681:

It literally took more time than the actual compilation on one profile.

![image](https://user-images.githubusercontent.com/17426603/108908024-257fd680-7623-11eb-92ff-405eb4f6fe27.png)

Wasmtime GitHub notifications bot (Feb 23 2021 at 21:24):

cfallin commented on Issue #2681:

It would be interesting to know more about this workload: why was the label-location dataflow analysis particularly slow in this case? Was there a higher than usual density of labels? Many basic blocks?

Considering alternative approaches (against the baseline "regalloc tells us everything directly" approach which regalloc.rs does not currently support):

I actually kind of favor the latter, all other things being equal. As mentioned here (and in the same spirit as the "simpler GC without stackmaps" proposal), I'd like to bias toward better factorization of complexity and less reliance on complex analyses and maintenance of metadata; the "post-hoc analysis" is partway there (the core compiler pipeline only sees blackbox value-label instructions) but this would be further so. Thoughts?

Wasmtime GitHub notifications bot (Feb 23 2021 at 21:25):

cfallin edited a comment on Issue #2681:

It would be interesting to know more about this workload: why was the label-location dataflow analysis particularly slow in this case? Was there a higher than usual density of labels? Many basic blocks?

Considering alternative approaches (against the baseline "regalloc tells us everything directly" approach which regalloc.rs does not currently support):

I actually kind of favor the latter, all other things being equal. As mentioned here (and in the same spirit as the "simpler GC without stackmaps" proposal #2459), I'd like to bias toward better factorization of complexity and less reliance on complex analyses and maintenance of metadata; the "post-hoc analysis" is partway there (the core compiler pipeline only sees blackbox value-label instructions) but this would be further so. Thoughts?

Wasmtime GitHub notifications bot (Feb 23 2021 at 21:36):

bjorn3 commented on Issue #2681:

This particular workload is compiling simple-raytracer with all of its dependencies.

I would prefer not allocating a stack slot for every value, for three reasons. First, I don't think it is acceptable to regress the already poor debug-mode performance of Rust even more. Second, I don't want the choice of whether or not to generate debuginfo to influence the generated code. GCC also doesn't let it influence the generated code. This has the advantage that enabling debuginfo doesn't change the behaviour of a program in case of UB or miscompilations, thus making them easier to debug. Finally, value debuginfo may be useful for on-stack replacement in case of a tiered JIT. Regressing performance in this case is unacceptable.

Wasmtime GitHub notifications bot (Feb 23 2021 at 21:43):

cfallin commented on Issue #2681:

Yeah, perhaps not, though I'd be interested to measure how large the regression would be.

The "right" answer here, I think, is to rely on regalloc.rs to provide us location info per vreg per program-point. Unfortunately without that we're forced to do an analysis of some sort to recover the info.

It's possible the analysis data structures could be improved: I notice that a lot of time is spent in cloning HashMaps for example; perhaps a delta-based scheme with some structure sharing could be designed. If someone has the time to experiment with this (I unfortunately don't at the moment, first priority is completeness for the switchover) I'd be happy to discuss or review...
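
As a rough illustration of the delta-based idea, here is a minimal sketch rather than the actual analysis code. The names `DeltaMap`, `Label`, and `Loc` are hypothetical, and the real label and location types differ. Each state records only its own changes and shares everything else with a reference-counted parent, so forking a state does not copy the whole map:

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-ins for the analysis's label and location types.
type Label = u32;
type Loc = u32;

/// One layer of state: local changes on top of a shared parent.
/// A `None` value records that the label was removed in this layer.
#[derive(Default)]
struct DeltaMap {
    parent: Option<Rc<DeltaMap>>,
    delta: HashMap<Label, Option<Loc>>,
}

impl DeltaMap {
    /// Start a new state that shares all existing entries with `parent`.
    /// Only a reference count is bumped; no entries are copied.
    fn fork(parent: &Rc<DeltaMap>) -> DeltaMap {
        DeltaMap {
            parent: Some(Rc::clone(parent)),
            delta: HashMap::new(),
        }
    }

    /// Record that `label` now lives at `loc` in this layer.
    fn set(&mut self, label: Label, loc: Loc) {
        self.delta.insert(label, Some(loc));
    }

    /// Record that `label` has no location in this layer.
    fn remove(&mut self, label: Label) {
        self.delta.insert(label, None);
    }

    /// Look up `label`, checking the nearest layer first and then
    /// walking up the parent chain.
    fn get(&self, label: Label) -> Option<Loc> {
        let mut layer = self;
        loop {
            if let Some(entry) = layer.delta.get(&label) {
                return *entry;
            }
            match &layer.parent {
                Some(p) => layer = &**p,
                None => return None,
            }
        }
    }
}
```

The trade-off in this sketch is that lookups get slower as the parent chain grows, so a real design would probably need to flatten layers periodically, for example at join points where states are merged.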

Wasmtime GitHub notifications bot (Feb 23 2021 at 22:39):

froydnj commented on Issue #2681:

FWIW, GCC does (or at least used to) go with the second approach suggested above of giving everything a fixed stack slot.


Last updated: Jan 24 2025 at 00:11 UTC