alexcrichton opened issue #4309:
Currently support for utf16+latin1 is not implemented in Wasmtime, but we'll need to finish this and test it before the component model is considered done.
In general I'd expect that this would use the `encoding_rs` crate for the internal details of latin1 to avoid open-coding that in Wasmtime itself.
Lowering
Lowering a string into wasm is currently unimplemented. I think that doing this requires implementing the `store_string_to_latin1_or_utf16` function from the canonical ABI explainer. My current understanding is that even if we could implement something more optimal in Rust, we can't, because the semantics of lowering are already specified. I believe the pseudo-code there does most of the fiddly bits, but some small helpers in `encoding_rs` are probably going to be required.
Lifting
Calculating the byte length and actually getting the string are both unimplemented. I think that we're free to use `encoding_rs` here however we see fit. The `decode_latin1` function will probably be useful here.
Other notes
I am personally unfamiliar with `latin1` as an encoding. I don't know if an arbitrary list of bytes is guaranteed to be valid latin1 or not (the infallibility of `decode_latin1` seems odd to me).
Using `encoding_rs` may be a better option for the utf16 decoding we currently do (and maybe even utf8, since `encoding_rs` can probably do simd things that the standard library can't). If someone's intrepid it might be interesting to benchmark this and see if it's beneficial to use `encoding_rs` for almost everything.
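To make the lifting and lowering notes above a little more concrete, here is a minimal std-only Rust sketch. It is not Wasmtime's implementation and it does not reproduce the explainer's `store_string_to_latin1_or_utf16` pseudo-code (it ignores how linear memory is allocated and how the chosen encoding is signalled to the guest). The names `Lowered`, `lift_latin1`, `lift_utf16`, and `lower` are hypothetical, and `lift_latin1` is only a stand-in for what a helper such as `decode_latin1` would do. It assumes "latin1" means the direct mapping of each byte to the code point of the same value (U+0000 through U+00FF). Under that assumption every byte sequence is valid, which would explain why decoding has no error path.

```rust
/// Outcome of lowering a Rust string in this sketch (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Lowered {
    /// One byte per character; every char was in U+0000..=U+00FF.
    Latin1(Vec<u8>),
    /// UTF-16 code units, used when some char does not fit in one byte.
    Utf16(Vec<u16>),
}

/// Lifting latin1: byte `b` becomes the code point with the same value,
/// so any input decodes and no `Result` is needed.
pub fn lift_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Lifting utf16 can fail (for example on an unpaired surrogate),
/// which is the contrast with the latin1 case.
pub fn lift_utf16(units: &[u16]) -> Result<String, std::string::FromUtf16Error> {
    String::from_utf16(units)
}

/// Lowering: pick latin1 when every char fits in one byte, else utf16.
pub fn lower(s: &str) -> Lowered {
    if s.chars().all(|c| u32::from(c) <= 0xFF) {
        // Safe to truncate: each char was just checked to be <= 0xFF.
        Lowered::Latin1(s.chars().map(|c| c as u8).collect())
    } else {
        Lowered::Utf16(s.encode_utf16().collect())
    }
}
```

With these helpers, `lower("café")` would pick the one-byte-per-character form, while a string containing any character above U+00FF would fall back to utf16. A real implementation would presumably want to avoid scanning the string twice, which is the kind of place where `encoding_rs` helpers could come in.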
alexcrichton labeled issue #4309.
alexcrichton closed issue #4309.
Last updated: Jan 24 2025 at 00:11 UTC