marmistrz commented on Issue #1199:
You're right. Unfortunately, I can't reproduce the error locally, the tests always succeed. I'm running the tests on Windows 10, they succeed both when run through msys64 ssh and windowed cmd (as administrator). I'm using
cargo test --features test_programs --package test-programs
but the full CI command succeeds too.Could you try and see if you can reproduce it?
peterhuene commented on Issue #1199:
I'll see if I can reproduce it.
peterhuene commented on Issue #1199:
I'm unable to reproduce the failure locally as well. I tried rerunning the GitHub actions jobs, but that apparently is broken (seems to checkout the wasmtime repo and then try to checkout your HEAD, but it obviously isn't in the history :shrug:♂).
The expect message of
removing a subdirectory
only appears inpath_link
which should still be ignored, so I'm not sure how we saw that in the log.To restart CI, let's rebase this branch off of master and push it up.
marmistrz commented on Issue #1199:
I rebased upon latest master
marmistrz edited a comment on Issue #1199:
I rebased upon latest master. The message is coming from
path_link_nonstrict
.
kubkon commented on Issue #1199:
Oh,
strip_trailing_slashes_and_concatenate
is there to correctly account for the case whereold_path
is a file, andnew_path
contains a trailing slash. On POSIX (and in WASI), as far as I remember, ifnew_path
exists on the host but without the trailing slash, we expect in this case anEEXIST
error code. On Windows, this is not respected in the same way, hence why the additional check.
kubkon commented on Issue #1199:
As far as mapping of error codes is concerned, yeah, MSDN is not very instructive. When I was figuring out
path_symlink
andpath_rename
, it was mostly the case of trial and error unfortunately.
kubkon commented on Issue #1199:
Also, thanks for the link! I'll need to investigate MacOS a bit further I think!
marmistrz commented on Issue #1199:
I tried to integrate those two tests into one and it appears that creating dangling symlinks doesn't fail when it should. I'll need to see why.
The previous version of the branch is available here: https://github.com/marmistrz/wasmtime/tree/path_link_old
github-actions[bot] commented on Issue #1199:
Subscribe to Label Action
This issue or pull request has been labeled: "w", "a", "s", "i"
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
github-actions[bot] commented on Issue #1199:
Subscribe to Label Action
This issue or pull request has been labeled: "w", "a", "s", "i"
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
marmistrz commented on Issue #1199:
There's good news! I managed to implement
path_link
for Windows so that it passes the originalpath_link
test case (with minor test case modifications only, most notably explicitfd_close
).Bad news: I'm still getting this weird unreproducible failure on Windows CI. I force pushed to restart the CI, but I don't expect wonders here. (restarting the CI through the web UI results in a checkout failure for some reason)
Do you have any ideas why? @peterhuene, you asked me to rebase upon latest master the previous time, did you have anything specific in mind?
I'm currently basing on code introduced in #1284, which needs to be merged first.
marmistrz commented on Issue #1199:
@kubkon can you please explain what's the mapping
ERROR_ACCESS_DENIED
->EEXIST
? When can such a situation occur?
marmistrz commented on Issue #1199:
ping, do you have an idea how to debug this mysterious CI failure?
marmistrz edited a comment on Issue #1199:
ping, do you have an idea how to try to reproduce this mysterious CI failure?
marmistrz edited a comment on Issue #1199:
ping, do you have an idea how to reproduce this mysterious CI failure?
peterhuene commented on Issue #1199:
Perhaps the test can print out the directory contents before attempting to delete it so we can gain some visibility into the problem (assuming the "not empty" error is correct)? I can't think of a good reason why this would be failing in CI and not locally.
marmistrz commented on Issue #1199:
Heh, good ol' debuggin by println :) I'm going to try.
@peterhuene What was the thing you had in mind in https://github.com/bytecodealliance/wasmtime/pull/1199#issuecomment-594798453? (if you still remember) Was it just _let's try and restart the CI and see if it helps_ sort of thing?
peterhuene commented on Issue #1199:
It was a suggestion to just see if was a transient problem or not. I didn't have a fix in mind that the rebase itself would have brought in, just something we could use to change the HEAD for the PR branch to kick off CI again (I've had problems doing so directly via the GitHub UI).
github-actions[bot] commented on Issue #1199:
Subscribe to Label Action
This issue or pull request has been labeled: "wasi", "wasi:impl", "wasi:tests"
<details> <summary>Users Subscribed to "wasi"</summary>
- @kubkon
</details>
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
marmistrz commented on Issue #1199:
Good news! I have managed to isolate the cause of the mysterious CI failures.
It appears that after calling
fd_fdstat_get
on a file the file will not properly be removed (internally,path_unlink_file
usesfs::remove_file
). This would manifest itself in two ways: (1)path_remove_directory
would fail due to the directory (2) if we skipped that part of the test, an attempt to create another file with the same name would fail due, due to permission having been deniedSupport for
fd_fdstat_get
on Windows was introduced in https://github.com/bytecodealliance/wasmtime/pull/557. @peterhuene, since the call was implemented by you, could you take this over from me at this point? I also suggest we disable the checks usingfd_fdstat_get
until it's figured out.The change is mostly ready. I'm still not sure if the
ERROR_ACCESS_DENIED
conversion is still needed, but this is basically discussed in #1359. Should it be the only blocker, I'm fine with removing it for now and possibly adding it back if we decide it's needed in #1359.
peterhuene commented on Issue #1199:
I don't particularly see how the problem lies with
fd_fdstat_get
. I double checked and it's not inadvertently leaking any handles.I do see how the test was missing calls to
fd_close
before the attempts to unlink the files and directories, which is required on Windows.If we uncomment the
fd_fdstat_get
calls with the rest of the changes, the test will still fail? I can't explain that, but I'll look into it.
marmistrz commented on Issue #1199:
If we uncomment the following lines: https://github.com/bytecodealliance/wasmtime/pull/1199/files#diff-6242b29603dfbc39ebbd9f3e58dde615R98-R102
the tests will fail in the CI (but not locally). I have absolutely no idea why this could happen.
marmistrz edited a comment on Issue #1199:
If we uncomment the following lines: https://github.com/bytecodealliance/wasmtime/pull/1199/files#diff-6242b29603dfbc39ebbd9f3e58dde615R98-R102
the tests will fail in the CI (but not locally). I have absolutely no idea why this could happen, I've checked the source too.
kubkon commented on Issue #1199:
@peterhuene @marmistrz did you manage to work out what the problem is? If not, what do you think the next steps should be both wrt the problem and this PR?
peterhuene commented on Issue #1199:
I haven't yet determined the problem and I have no educated guesses since the commented out lines should have absolutely no impact on handle reference counts that might keep things from being deleted properly.
@marmistrz when you get a chance, can you uncomment out the problematic lines and push it up for CI to fail again? I'd like to investigate the log just a little more.
marmistrz commented on Issue #1199:
I thought about merging this PR as-is and letting @peterhuene figure out what's wrong with fdstat in a subsequent PR. But for now, I'll add a comment uncommenting the problematic lines.
peterhuene commented on Issue #1199:
So it looks like it is passing Windows CI with the uncommented lines. Perhaps the other fix to properly close the handles before unlinking was what actually resolved the issue?
I think we can leave it uncommented and remove the preceding comments about the lines being a problem. I'll go ahead with a review.
marmistrz commented on Issue #1199:
So it looks like it is passing Windows CI with the uncommented lines. Perhaps the other fix to properly close the handles before unlinking was what actually resolved the issue?
Absolutely no idea. Perhaps some cleanup I did after removing the fdstats interfered. Anyway, I'm happy that this mysterious failure is gone. I will push out the review changes in several steps.
I'm still unsure about point 2 in non-strictness:
2\. `path_link` will return `EACCES` instead of `EPERM` when trying to create a link to a subdirectory. This violates the POSIX spec. We could manually check if the source path is a directory in case of `ERROR_ACCESS_DENIED` but this would cost us an extra syscall.Should I correct this in the error mappings or do we prefer to avoid the extra syscall?
kubkon commented on Issue #1199:
So it looks like it is passing Windows CI with the uncommented lines. Perhaps the other fix to properly close the handles before unlinking was what actually resolved the issue?
Absolutely no idea. Perhaps some cleanup I did after removing the fdstats interfered. Anyway, I'm happy that this mysterious failure is gone. I will push out the review changes in several steps.
I'm still unsure about point 2 in non-strictness:
2\. `path_link` will return `EACCES` instead of `EPERM` when trying to create a link to a subdirectory. This violates the POSIX spec. We could manually check if the source path is a directory in case of `ERROR_ACCESS_DENIED` but this would cost us an extra syscall.
Should I correct this in the error mappings or do we prefer to avoid the extra syscall?
I think we should be fine either way. We should be fine with an additional syscall since we already do that in
path::symlink
anyway.
marmistrz commented on Issue #1199:
So it looks like it is passing Windows CI with the uncommented lines. Perhaps the other fix to properly close the handles before unlinking was what actually resolved the issue?
Absolutely no idea. Perhaps some cleanup I did after removing the fdstats interfered. Anyway, I'm happy that this mysterious failure is gone. I will push out the review changes in several steps.
I'm still unsure about point 2 in non-strictness:
2\. `path_link` will return `EACCES` instead of `EPERM` when trying to create a link to a subdirectory. This violates the POSIX spec. We could manually check if the source path is a directory in case of `ERROR_ACCESS_DENIED` but this would cost us an extra syscall.
Should I correct this in the error mappings or do we prefer to avoid the extra syscall?
I think we should be fine either way. We should be fine with an additional syscall since we already do that in
path::symlink
anyway.@peterhuene and what do you think?
peterhuene commented on Issue #1199:
I'm :+1: on what you now have.
marmistrz commented on Issue #1199:
Awesome! Then, unless anything comes up, I'm merging the PR tomorrow evening CET.
Last updated: Dec 23 2024 at 12:05 UTC