nmattia opened PR #9580 from nmattia:nm-out-dir-relative
to bytecodealliance:main
:
Fixes #9553
When generating code,
cranelift-isle
records the path to the source files. These paths were sometimes absolute, meaning it kept a trace of the system it was run on.These changes here add helper functions in
Files
to try and strip the value of$OUT_DIR
(if set) from the file names.This also removes a log line (which included the full path of files) from
cranelift-codegen-meta
.
nmattia requested wasmtime-compiler-reviewers for a review on PR #9580.
nmattia requested cfallin for a review on PR #9580.
nmattia commented on PR #9580:
@cfallin here are some changes related to our discussion here.
This also removes a log line (which included the full path of files) from cranelift-codegen-meta.
About this :wait_one_second: I'm happy to remove it/submit it separately/make it relative.
@fitzgen you mentioned that, in the past, files were not printed with absolute paths. Is it worth investing some time to ensure this doesn't happen again in the future?
nmattia edited a comment on PR #9580:
@cfallin here are some changes related to our discussion here.
This also removes a log line (which included the full path of files) from cranelift-codegen-meta.
About this :wait_one_second: I'm happy to remove it/submit it separately/make it relative.
@fitzgen you mentioned that, in the past, files were not printed with absolute paths. Is it worth investing some time to ensure this doesn't happen again in the future?
Note: I originally changed the code to strip
CARGO_MANIFEST_DIR
, but it looks like the files were actually not in there; instead they were inOUT_DIR
. This might be an artifact of the build system used in my project; let me know if this makes sense.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
I wonder if the build script that invokes cranelift-isle can read this instead. Cranelift-isle can technically be used to build arbitrary code.
Files::from_paths
could get a set of paths to make make input paths relative to, or alternatively a set of paths against which to resolve relative paths and then have the build script only pass the relative paths for individual isle files instead.
nmattia submitted PR review.
nmattia created PR review comment:
Sorry, not too familiar with the code/cranelift. Are you suggesting that we should make the paths relative here instead?
bjorn3 created PR review comment:
Yes. There is already a
make_isle_source_path_relative
function which gets used, but that only affects isle files that are part of the source rather than those that get generated by the build script itself. You did still have to separately pass a (set of) prefix(s) to join with the relative path to get a path that can be passed tofs::read
, but this prefix can be ignored when printing the source locations in the generated file.
bjorn3 submitted PR review.
nmattia submitted PR review.
nmattia created PR review comment:
only affects isle files that are part of the source rather than those that get generated by the build script itself.
Unrelated to this thread, but that explains why I had to use
OUT_DIR
instead ofCARGO_MANIFEST_DIR
!Yes.
That makes sense. I'll try to find some time later to update the PR.
github-actions[bot] commented on PR #9580:
Subscribe to Label Action
cc @cfallin, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:meta", "isle"Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
nmattia submitted PR review.
nmattia created PR review comment:
sorry, didn't have much time to look into this!
After playing with this for a bit, I think it might make sense to update the existing function (
make_isle_source_path_relative
) to work with paths that are not subpaths of the current directory. Borrowing some code from pathdiff, here's what this would look like:diff --git a/cranelift/codegen/build.rs b/cranelift/codegen/build.rs index 2cbf2c9e5..cfe6d37d7 100644 --- a/cranelift/codegen/build.rs +++ b/cranelift/codegen/build.rs @@ -140,10 +140,11 @@ fn make_isle_source_path_relative( cur_dir: &std::path::Path, filename: &std::path::Path, ) -> std::path::PathBuf { - if let Ok(suffix) = filename.strip_prefix(&cur_dir) { - suffix.to_path_buf() - } else { - filename.to_path_buf() + + match diff_paths(filename, cur_dir) { + Some(relative) => relative, + None => filename.to_path_buf(), + } } @@ -253,3 +254,48 @@ fn rustfmt(code: &str) -> std::io::Result<String> { Ok(String::from_utf8(data).expect("rustfmt always writes utf-8 to stdout")) } + +pub fn diff_paths<P, B>(path: P, base: B) -> Option<std::path::PathBuf> +where + P: AsRef<std::path::Path>, + B: AsRef<std::path::Path>, +{ + let path = path.as_ref(); + let base = base.as_ref(); + + if path.is_absolute() != base.is_absolute() { + if path.is_absolute() { + Some(std::path::PathBuf::from(path)) + } else { + None + } + } else { + let mut ita = path.components(); + let mut itb = base.components(); + let mut comps: Vec<std::path::Component> = vec![]; + loop { + match (ita.next(), itb.next()) { + (None, None) => break, + (Some(a), None) => { + comps.push(a); + comps.extend(ita.by_ref()); + break; + } + (None, _) => comps.push(std::path::Component::ParentDir), + (Some(a), Some(b)) if comps.is_empty() && a == b => (), + (Some(a), Some(b)) if b == std::path::Component::CurDir => comps.push(a), + (Some(_), Some(b)) if b == std::path::Component::ParentDir => return None, + (Some(a), Some(_)) => { + comps.push(std::path::Component::ParentDir); + for _ in itb { + comps.push(std::path::Component::ParentDir); + } + comps.push(a); + comps.extend(ita.by_ref()); + break; + } + } + } + Some(comps.iter().map(|c| c.as_os_str()).collect()) + } +}
Can confirm that this also fixes the issue (no more non-determinism), plus the fix is a bit more self contained. Thoughts?
nmattia submitted PR review.
nmattia created PR review comment:
@bjorn3 what do you think?
nmattia submitted PR review.
nmattia created PR review comment:
@cfallin or @bjorn3 do you think it makes sense to extend
make_isle_source_path_relative
as suggested above?
cfallin submitted PR review.
cfallin created PR review comment:
Hmm -- I'll admit I'm having a somewhat visceral reaction to this much complexity in a
build.rs
script. Yes, it works, and it is conceptually well-defined, but it's just... a lot of logic that seems like it shouldn't be necessary.Popping up a few levels in the "why are we doing this" hierarchy, another thought occurs to me: we would also have deterministic output from the ISLE compiler if we omitted the comments with filenames in them; those comments are mainly meant for times when humans are trying to debug the generated code; we already have an environment variable to place the ISLE generated code in a non-build-outputs location for a human to be able to browse it (
ISLE_SOURCE_DIR
); perhaps we could add another environment variableISLE_COMMENTS
such that if set, it adds the comments, and we omit the comments otherwise?We would need to add an option to CodegenOptions in the ISLE compiler (call it, say,
include_comments
), then set that option appropriately in thisbuild.rs
, and follow it incodegen.rs
-- and let's omit all comments except the "this is machine generated" one at the top if unset. How does that sound?
nmattia submitted PR review.
nmattia created PR review comment:
@cfallin sounds good to me! Will see when I have free cycles to look into this; let me know if you beat me to it.
Last updated: Jan 24 2025 at 00:11 UTC