Hello,
I was browsing the code here https://github.com/rust-lang/rust/blob/master/library/std/src/sys/wasi/time.rs and I noticed that:
pub fn actually_monotonic() -> bool {
true
}
But then again wamr for example implements it as:
uint64
os_time_get_boot_microsecond()
{
struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
return 0;
}
return ((uint64) ts.tv_sec) * 1000 * 1000 + ((uint64)ts.tv_nsec) / 1000;
}
As this bugfix links there are bugs in various places where monotonic clocks aren't actually monotonic :confused:. Some stackoverflow issues suggest the use of CLOCK_MONOTONIC_RAW
instead.
Is this actually safe or do we need to do some workarounds for wasm-wasi too (or the VMs)?
For example the wasm
one seems to not be actually monotonic, returning false in that method.
If there are any workarounds necessary they should be in the wasi runtime imho and not in libstd. A non-monotonic monotonic clock is a bug. The only reason libstd has this workaround is because there are several cases where the os is buggy.
I know that is correct in theory, however the plain wasm
one is returning false for actually_monotonic
. Maybe it's too hard to actually offer this guarantee (especially since we rely on VMs not making mistakes)
I didn't check the wasi part tho, I just assumed it forwards the call
Plain wasm doesn't have Instant::now
at all. It panics.
One argument to not rely on wasi impls for this is that they update at various speeds so once they have a bug, it kind of recreates what the commit here has fixed: https://github.com/rust-lang/rust/pull/56988
Basically WASI impls bugs in the wild that have released and now need to be worked around.
So maybe it's safer to return false and add a path like windows and others have?
I don't know how to stop it from expanding links :grinning:
The current design incorporates some amount of idealism :-). In the long run, WASI will be a better platform if we just spec its monotonic clock to be reliably monotonic, so that we don't need workarounds in every language runtime that compiles to it.
If wamr is trusting the OS clock_gettime
, and the OS clock_gettime
doesn't work, we should file a bug against wamr.
That would be a good first step.
I still think that once a VM is in use, deployed in an environment where updates are rare (e.g. IOT cases),
then we will likely have to go back to this and re-consider treating it in a more realistic way like we do with Instant on the other systems :smile:
It's really hard to actually test this (VM impls providing actually monotonic times) in a conformance test.
There could even be cases where WAMR itself doesn't have technically a bug but it's deployed on one of the buggy situations from above see e.g windows.
It's bugs all the way down :grinning:.
Then again, duration of 0 is likely better than a panic from the program's perspective (even the current Instant::elapsed
kind of does that for instants created with now()
. Maybe the original panic on elapsed
was a bit too harsh of a behavior.
What's the issue on Windows?
Not exactly sure, but windows has configured in the PR above: pub fn actually_monotonic()
to return false
and is treated as such
Ah, thanks.
I'll need to do more investigation here.
I'm also hoping that as the ecosystem grows, it'll gain momentum, and vendors will have to think in terms of making sure their platforms are compatible with the ecosystem, rather than assuming they can compel the ecosystem to work around their bugs.
For example, when someone puts a wasm VM on an IOT device today, and the clock goes backward, it may be tempting to say "ah, well that's just what happens on that device, fix the wasm". But in the future, when there are lots of wasm programs, suddenly "oh, it's a lot of work to fix all the wasm programs, I guess we'll have to fix the OS"
I guess it depends on how we want to push for this:
a) VM implementers have to do what time::Instant does basically
b) We adapt the rust stdlib with a path that's already taken for many other platforms :smile:.
They aren't mutually exclusive. So doing b) doesn't stop us to push people for actually fixing bugs.
I think C++ stdlib likely suffers from the same issue, except it doesn't panic so it's likely not as observable as for rust stdlib.
Windows isn't niche for example and this PR is from 2019 so it will take some time to fix I suppose.
That's true. Part of my understanding here was on the assumption that there are relatively few platforms where this is a problem. I need to learn more about this, and Windows in particular.
Maybe the next rust edition (2023) can give us a std::time::Instant::elapsed()
that never panics for Instant::now()
:smile:
Last updated: Nov 22 2024 at 16:03 UTC