Stream: git-wasmtime

Topic: wasmtime / PR #9580 Make source links relative to OUT_DIR


view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:26):

nmattia opened PR #9580 from nmattia:nm-out-dir-relative to bytecodealliance:main:

Fixes #9553

When generating code, cranelift-isle records the path to the source files. These paths were sometimes absolute, meaning it kept a trace of the system it was run on.

These changes here add helper functions in Files to try and strip the value of $OUT_DIR (if set) from the file names.

This also removes a log line (which included the full path of files) from cranelift-codegen-meta.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:26):

nmattia requested wasmtime-compiler-reviewers for a review on PR #9580.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:26):

nmattia requested cfallin for a review on PR #9580.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:28):

nmattia commented on PR #9580:

@cfallin here are some changes related to our discussion here.

This also removes a log line (which included the full path of files) from cranelift-codegen-meta.

About this :wait_one_second: I'm happy to remove it/submit it separately/make it relative.

@fitzgen you mentioned that, in the past, files were not printed with absolute paths. Is it worth investing some time to ensure this doesn't happen again in the future?

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:29):

nmattia edited a comment on PR #9580:

@cfallin here are some changes related to our discussion here.

This also removes a log line (which included the full path of files) from cranelift-codegen-meta.

About this :wait_one_second: I'm happy to remove it/submit it separately/make it relative.

@fitzgen you mentioned that, in the past, files were not printed with absolute paths. Is it worth investing some time to ensure this doesn't happen again in the future?

Note: I originally changed the code to strip CARGO_MANIFEST_DIR, but it looks like the files were actually not in there; instead they were in OUT_DIR. This might be an artifact of the build system used in my project; let me know if this makes sense.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:53):

bjorn3 submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 11:53):

bjorn3 created PR review comment:

I wonder if the build script that invokes cranelift-isle can read this instead. Cranelift-isle can technically be used to build arbitrary code. Files::from_paths could get a set of paths to make make input paths relative to, or alternatively a set of paths against which to resolve relative paths and then have the build script only pass the relative paths for individual isle files instead.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 12:28):

nmattia submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 12:28):

nmattia created PR review comment:

Sorry, not too familiar with the code/cranelift. Are you suggesting that we should make the paths relative here instead?

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 12:32):

bjorn3 created PR review comment:

Yes. There is already a make_isle_source_path_relative function which gets used, but that only affects isle files that are part of the source rather than those that get generated by the build script itself. You did still have to separately pass a (set of) prefix(s) to join with the relative path to get a path that can be passed to fs::read, but this prefix can be ignored when printing the source locations in the generated file.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 12:32):

bjorn3 submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 12:48):

nmattia submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 12:48):

nmattia created PR review comment:

only affects isle files that are part of the source rather than those that get generated by the build script itself.

Unrelated to this thread, but that explains why I had to use OUT_DIR instead of CARGO_MANIFEST_DIR !

Yes.

That makes sense. I'll try to find some time later to update the PR.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 07 2024 at 14:44):

github-actions[bot] commented on PR #9580:

Subscribe to Label Action

cc @cfallin, @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Jan 09 2025 at 14:50):

nmattia submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 09 2025 at 14:50):

nmattia created PR review comment:

sorry, didn't have much time to look into this!

After playing with this for a bit, I think it might make sense to update the existing function (make_isle_source_path_relative) to work with paths that are not subpaths of the current directory. Borrowing some code from pathdiff, here's what this would look like:

diff --git a/cranelift/codegen/build.rs b/cranelift/codegen/build.rs
index 2cbf2c9e5..cfe6d37d7 100644
--- a/cranelift/codegen/build.rs
+++ b/cranelift/codegen/build.rs
@@ -140,10 +140,11 @@ fn make_isle_source_path_relative(
     cur_dir: &std::path::Path,
     filename: &std::path::Path,
 ) -> std::path::PathBuf {
-    if let Ok(suffix) = filename.strip_prefix(&cur_dir) {
-        suffix.to_path_buf()
-    } else {
-        filename.to_path_buf()
+
+    match diff_paths(filename, cur_dir) {
+        Some(relative) => relative,
+        None => filename.to_path_buf(),
+
     }
 }

@@ -253,3 +254,48 @@ fn rustfmt(code: &str) -> std::io::Result<String> {

     Ok(String::from_utf8(data).expect("rustfmt always writes utf-8 to stdout"))
 }
+
+pub fn diff_paths<P, B>(path: P, base: B) -> Option<std::path::PathBuf>
+where
+    P: AsRef<std::path::Path>,
+    B: AsRef<std::path::Path>,
+{
+    let path = path.as_ref();
+    let base = base.as_ref();
+
+    if path.is_absolute() != base.is_absolute() {
+        if path.is_absolute() {
+            Some(std::path::PathBuf::from(path))
+        } else {
+            None
+        }
+    } else {
+        let mut ita = path.components();
+        let mut itb = base.components();
+        let mut comps: Vec<std::path::Component> = vec![];
+        loop {
+            match (ita.next(), itb.next()) {
+                (None, None) => break,
+                (Some(a), None) => {
+                    comps.push(a);
+                    comps.extend(ita.by_ref());
+                    break;
+                }
+                (None, _) => comps.push(std::path::Component::ParentDir),
+                (Some(a), Some(b)) if comps.is_empty() && a == b => (),
+                (Some(a), Some(b)) if b == std::path::Component::CurDir => comps.push(a),
+                (Some(_), Some(b)) if b == std::path::Component::ParentDir => return None,
+                (Some(a), Some(_)) => {
+                    comps.push(std::path::Component::ParentDir);
+                    for _ in itb {
+                        comps.push(std::path::Component::ParentDir);
+                    }
+                    comps.push(a);
+                    comps.extend(ita.by_ref());
+                    break;
+                }
+            }
+        }
+        Some(comps.iter().map(|c| c.as_os_str()).collect())
+    }
+}

Can confirm that this also fixes the issue (no more non-determinism), plus the fix is a bit more self contained. Thoughts?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 14 2025 at 14:08):

nmattia submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 14 2025 at 14:08):

nmattia created PR review comment:

@bjorn3 what do you think?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 21 2025 at 10:32):

nmattia submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 21 2025 at 10:32):

nmattia created PR review comment:

@cfallin or @bjorn3 do you think it makes sense to extend make_isle_source_path_relative as suggested above?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 23 2025 at 02:38):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 23 2025 at 02:38):

cfallin created PR review comment:

Hmm -- I'll admit I'm having a somewhat visceral reaction to this much complexity in a build.rs script. Yes, it works, and it is conceptually well-defined, but it's just... a lot of logic that seems like it shouldn't be necessary.

Popping up a few levels in the "why are we doing this" hierarchy, another thought occurs to me: we would also have deterministic output from the ISLE compiler if we omitted the comments with filenames in them; those comments are mainly meant for times when humans are trying to debug the generated code; we already have an environment variable to place the ISLE generated code in a non-build-outputs location for a human to be able to browse it (ISLE_SOURCE_DIR); perhaps we could add another environment variable ISLE_COMMENTS such that if set, it adds the comments, and we omit the comments otherwise?

We would need to add an option to CodegenOptions in the ISLE compiler (call it, say, include_comments), then set that option appropriately in this build.rs, and follow it in codegen.rs -- and let's omit all comments except the "this is machine generated" one at the top if unset. How does that sound?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 23 2025 at 10:38):

nmattia submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 23 2025 at 10:38):

nmattia created PR review comment:

@cfallin sounds good to me! Will see when I have free cycles to look into this; let me know if you beat me to it.


Last updated: Jan 24 2025 at 00:11 UTC