kubkon edited PR #1329 from fd_handles
to master
:
This PR adds a custom
FdSet
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdSet::new
initialises the internal data structures,
and most notably, it preallocates anFdSet::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdSet::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdSet::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdSet::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdSet
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdSet
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
kubkon edited PR #1329 from fd_handles
to master
:
This PR adds a custom
FdPool
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdPool::new
initialises the internal data structures,
and most notably, it preallocates anFdPool::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdPool::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdPool::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdPool::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdPool
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdPool
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
I think we can probably avoid a literal list-of-indices here perhaps? We could have a
Vec
of available indices, as well as a "next index to hand out" stored asOption<Fd>
. Allocation would pop from theVec
or otherwise.take()
the next index. If the next index was taken we store the result of.next()
back in there.Later during
deallocate
we'd just push the entry onto theVec
.I think that may simplify a bit here and make it a bit more lightweight in terms of overhead?
alexcrichton created PR Review Comment:
Instead of
.replace
I think this could beself.stdin = Some(...)
, right?
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Could this file reexport from the current snapshot to avoid duplication?
kubkon submitted PR Review.
kubkon created PR Review Comment:
So basically, you suggest to skip the preallocation step, did I get it right? If so, this is what I originally had, just thought we could preallocate a couple upfront as means of optimisation but I guess that might have been somewhat premature ;-)
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Right yeah I think the preallocation can largely be skipped. We can always continue to optimize it more though if necessary!
kubkon submitted PR Review.
kubkon created PR Review Comment:
Excellent question! I thought about this myself, and in principle, I'd love if we could. The problem is with type aliasing between the two snapshots. The compiler gets confused which
wasi::__wasi_fd_t as Fd
to use. Do you think there's a way to overcome this? Would perhaps type aliasing work here?
kubkon submitted PR Review.
kubkon created PR Review Comment:
Agreed! Let me roll it back then!
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Hm I'm not sure I understand the compiler error, wanna push up a branch I can poke at?
kubkon updated PR #1329 from fd_handles
to master
:
This PR adds a custom
FdPool
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdPool::new
initialises the internal data structures,
and most notably, it preallocates anFdPool::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdPool::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdPool::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdPool::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdPool
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdPool
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
kubkon submitted PR Review.
kubkon created PR Review Comment:
Right, I now remembered what the issue was. Since
__wasi_fd_t
is a type alias tou32
in both snapshots, we obviously cannot define two implementations of the form// this one is in snapshot1, in wasi.rs impl crate::fdpool::Fd for __wasi_fd_t { // ... }and
// this one is in snapshot0, in old/snapshot0/wasi.rs impl crate::fdpool::Fd for __wasi_fd_t { // ... }I guess given that there is virtually no ABI difference between the two types across both snapshots, we could simply ignore implementing it in snapshot0. Would that be acceptable? I mean given that soon-ish we'd want to use
wiggle
to completely rethink how we handle multiple snapshots with as little code duplication as possible, this seems fair to me.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Yeah I think that'll be alright!
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
This could use
?
I think if you wanted to be super nifty
kubkon updated PR #1329 from fd_handles
to master
:
This PR adds a custom
FdPool
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdPool::new
initialises the internal data structures,
and most notably, it preallocates anFdPool::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdPool::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdPool::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdPool::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdPool
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdPool
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
kubkon submitted PR Review.
kubkon created PR Review Comment:
Of course, well spotted, thanks!
kubkon updated PR #1329 from fd_handles
to master
:
This PR adds a custom
FdPool
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdPool::new
initialises the internal data structures,
and most notably, it preallocates anFdPool::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdPool::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdPool::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdPool::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdPool
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdPool
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
kubkon requested alexcrichton, pchickey, and sunfishcode for a review on PR #1329.
alexcrichton submitted PR Review.
sunfishcode submitted PR Review.
sunfishcode submitted PR Review.
sunfishcode created PR Review Comment:
This can use a slice literal,
&[ ... ]
, instead of a temporaryvec!
.
sunfishcode created PR Review Comment:
In place of the
claimed
check (which I propose removing above), this can insteadassert
thatfd
is not greater thannext_alloc
, anddebug_assert
that it's not inavailable
(debug_assert
because it could be slow in some cases).
sunfishcode created PR Review Comment:
As above, this can use a slice literal instead of a temporary
vec!
.
sunfishcode created PR Review Comment:
For tidiness, we should do the
entries.remove
before doing thefds.deallocate
, so that we remove things in reverse order from how we add them, so that we maintain an invariant that at any time afd
is inentries
, it's allocated infds
too.With that, we can remove the
claimed
field ofFdPool
, and assume that removal always succeeds.
kubkon submitted PR Review.
kubkon created PR Review Comment:
The reason I didn't use slice literal here is due to the dereferencing rules. In
PendingEntry::File
arm we useOsHandle::from
which requiresstd::fs::File
, and with slice literal we provide a ref instead.
kubkon submitted PR Review.
kubkon created PR Review Comment:
See above.
kubkon submitted PR Review.
kubkon created PR Review Comment:
Hmm, I'm fine with asserts. However, the reason I've designed
deallocate
returnbool
was to mimic the behaviour ofHashSet::remove
which returns abool
to signal if removal worked or not (i.e., if any element was actually removed in the end). In this case, I guess you're right that if we try removing from theclaimed
HashSet
and it actually fails, then something went horribly wrong as it shouldn't ever happen.
kubkon submitted PR Review.
kubkon created PR Review Comment:
Ah, good call, thanks!
sunfishcode submitted PR Review.
sunfishcode created PR Review Comment:
Ah, makes sense!
kubkon created PR Review Comment:
Oh, after re-reading your comment again, checking whether
fd
is not greater thannext_alloc
will require anif
anyway sincenext_alloc
is wrapped in anOption
and it will not always be guaranteed to beSome
(in particular, when we exceed all possible allocs for instance, e.g.,std::u32::MAX
).
kubkon submitted PR Review.
kubkon submitted PR Review.
kubkon created PR Review Comment:
Given the above, unless you can think of a cleaner way of doing it, I guess I'd be in favour of the original approach. What do you think?
kubkon updated PR #1329 from fd_handles
to master
:
This PR adds a custom
FdPool
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdPool::new
initialises the internal data structures,
and most notably, it preallocates anFdPool::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdPool::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdPool::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdPool::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdPool
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdPool
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
kubkon created PR Review Comment:
Done in 647cc1f. I'm still a little bit unclear how to address
FdPool::deallocate
though. Would you mind giving it another look after re-reading my comments please?
kubkon submitted PR Review.
kubkon created PR Review Comment:
Perhaps we could
assert!
onself.claimed.remove
instead? This would ensure wepanic!
if the client code ever provides an illegalfd
value?
kubkon submitted PR Review.
kubkon updated PR #1329 from fd_handles
to master
:
This PR adds a custom
FdPool
container which is intended
for use inwasi-common
to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handlesfd = fd.checked_add(1).ok_or(...)?;and to make it possible to reuse unused/reclaimed handles
which currently is not done.The struct offers 3 methods to manage its functionality:
FdPool::new
initialises the internal data structures,
and most notably, it preallocates anFdPool::BATCH_SIZE
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describingallocate
method.)
FdPool::allocate
pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the nextFdPool::BATCH_SIZE
worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then returnNone
for the client
to handle in some way (e.g., throwing an error such asWasiError::EMFILE
or whatnot).
FdPool::deallocate
returns the already allocated handle back to
the pool for further reuse.When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
O(1)~*
performance for both (if my maths is right, and it may very
well not be, so please verify!).In order to keep
FdPool
fairly generic, I've made sure not to hard-code
it for the current type system generated bywig
(i.e.,wasi::__wasi_fd_t
representing WASI handle), but rather, any type which wants to be managed
byFdPool
needs to conform toFd
trait. This trait is quite simple as
it only requires a couple of rudimentary traits (althoughstd:#️⃣:Hash
is quite a powerful assumption here!), and a custom methodFd::next(&self) -> Option<Self>;which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simplyu32::checked_add(1)
.
Whenwiggle
makes it way into thewasi-common
, I'd imagine it being
similar tofn next(&self) -> Option<Self> { self.0.checked_add(1).map(Self::from) }Anyhow, I'd be happy to learn your thoughts about this design!
kubkon requested alexcrichton, and pchickey for a review on PR #1329.
kubkon requested alexcrichton, pchickey, and sunfishcode for a review on PR #1329.
alexcrichton submitted PR Review.
sunfishcode submitted PR Review.
kubkon merged PR #1329.
Last updated: Jan 24 2025 at 00:11 UTC