liutao-liu requested wasmtime-core-reviewers for a review on PR #8303.
liutao-liu opened PR #8303 from liutao-liu:opt_read_write
to bytecodealliance:main
:
- Issue#7973
Compared to native(C/C++),
fd-read
can only reach 20% of native. I will divide the performance hotspots into the following two categories:,
- Wasi implementation of
fd-read
(approximately 30% of the hotspots), such as the cost of redundant memory copies, the cost of inserting and deletingHostInputStream
intoWasiView
, and the cost of accessing guest memory(merged PR);;- Wasm program call host function(approximately 40% of the hotspots): I am not very familiar with this mechanism yet, and currently I cannot understand the reasons behind the hotspots here;
This PR is mainly optimized for the first type of performance hotspot:
- Avoid unnecessary copying;
- Avoid inserting and deleting
HostInputStream
intoWasiView
;The perfermance hotspots of
fd_write
is the samefd_read
;
liutao-liu requested alexcrichton for a review on PR #8303.
liutao-liu edited PR #8303:
- Issue #7973
Compared to native(C/C++),
fd-read
can only reach 20% of native. I will divide the performance hotspots into the following two categories:,
- Wasi implementation of
fd-read
(approximately 30% of the hotspots), such as the cost of redundant memory copies, the cost of inserting and deletingHostInputStream
intoWasiView
, and the cost of accessing guest memory(merged PR);- Wasm program call host function(approximately 40% of the hotspots): I am not very familiar with this mechanism yet, and currently I cannot understand the reasons behind the hotspots here;
This PR is mainly optimized for the first type of performance hotspot:
- Avoid unnecessary copying;
- Avoid inserting and deleting
HostInputStream
intoWasiView
;The perfermance hotspots of
fd_write
is the samefd_read
:
https://github.com/bytecodealliance/wasmtime/blob/86dcae460b5437703570e52253cc1b7d5bad8597/crates/wasi/src/preview1.rs#L1682
liutao-liu updated PR #8303.
liutao-liu updated PR #8303.
liutao-liu updated PR #8303.
alexcrichton submitted PR review:
Thanks for the PR!
I've left some notes below, but I'll additionally point out that this is specifically ignoring the
blocking_mode
now. Previously that was "respected" to a certain degree, but now it's being ignored. I believe that's ok because that's what the previous implementation in wasi-common did anyway as OSes tend to ignore nonblocking on files. That being said the current state of the code is a bit confusing where theblocking_mode
is sometimes used and sometimes isn't.I mostly wanted to point this out in case anyone runs into this in the future. I'll follow up with a PR after this one to clean this up. You don't have to clean it up here yourself @liutao-liu
alexcrichton submitted PR review:
Thanks for the PR!
I've left some notes below, but I'll additionally point out that this is specifically ignoring the
blocking_mode
now. Previously that was "respected" to a certain degree, but now it's being ignored. I believe that's ok because that's what the previous implementation in wasi-common did anyway as OSes tend to ignore nonblocking on files. That being said the current state of the code is a bit confusing where theblocking_mode
is sometimes used and sometimes isn't.I mostly wanted to point this out in case anyone runs into this in the future. I'll follow up with a PR after this one to clean this up. You don't have to clean it up here yourself @liutao-liu
alexcrichton created PR review comment:
This
.expect("...")
will need to be handled (e.g. we can't panic here). TheNone
happens here with shared memories at the core wasm level, meaning that this'll need to use thecopy_from_slice
method. (e.g. read data into a buffer on the host and then copy the contents of the buffer to the guest)
alexcrichton created PR review comment:
It's ok to omit this line, this isn't actually doing anything. As-written this is copying data from the guest to the host, truncating it, and then deallocating it.
alexcrichton created PR review comment:
Also given the inner loops below the outer loop can be removed too.
alexcrichton created PR review comment:
This limitation is only part of the
Host*Stream
traits, so it's ok to remove thisNOTE
as well as the.min(4096)
here too.
liutao-liu submitted PR review.
liutao-liu created PR review comment:
got
liutao-liu submitted PR review.
liutao-liu created PR review comment:
got
liutao-liu updated PR #8303.
liutao-liu updated PR #8303.
liutao-liu submitted PR review.
liutao-liu created PR review comment:
got
liutao-liu updated PR #8303.
liutao-liu commented on PR #8303:
@alexcrichton Your review comments have been completed,this pr can be merged.
alexcrichton submitted PR review.
alexcrichton merged PR #8303.
Last updated: Nov 22 2024 at 16:03 UTC