mokhaled2992 edited issue #5005:
I'm getting a panic, and the stack trace points at the line shown below. This happens very early, when I initialize my JIT and try to declare some functions. Any hints on how to solve this?
unsafe fn write_plt_entry_bytes(plt_ptr: *mut [u8; 16], got_ptr: NonNull<AtomicPtr<u8>>) {
    ...................
    plt_val[2..6].copy_from_slice(&i32::to_ne_bytes(i32::try_from(what - at).unwrap()));
    ...................
}
More info:
- I'm running on x86_64 and Ubuntu 20.
- My JIT only declares some external Rust functions and creates a wrapper function that calls them.
- I previously ran into another memory execution permissions problem and was pointed to the feature flag selinux-fix, which solved that problem (see https://github.com/bytecodealliance/wasmtime/issues/4980#issue-1391090347).
- Disabling is_pic obviously solves this panic. Is that a bug?
bjorn3 commented on issue #5005:
When is_pic is enabled, it requires all code to be within 2GB of each other, as a 32-bit signed pc-relative offset is used for getting the address of called local functions. There is no code to guarantee this, which leads to panics like this one. I've known about this issue for a long time, but haven't had the time to think of a good solution.
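To make the 2GB limit concrete, here is a minimal sketch of how a pc-relative PLT stub could be built. It is not the cranelift-jit code: plt_entry is a hypothetical helper, the addresses are passed in as plain integers, and the int3 padding is a choice made for this sketch. The only hard fact it relies on is the x86-64 encoding of `jmp qword ptr [rip + disp32]` (bytes FF 25 followed by a little-endian 32-bit displacement measured from the end of the instruction). Where the code quoted in the issue unwraps the failed i32 conversion and panics, this version returns None.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of cranelift-jit): builds a 16-byte stub
/// that jumps through the GOT slot at `got_addr`. Returns `None` when the
/// slot is further away than a signed 32-bit displacement can express.
fn plt_entry(plt_addr: i64, got_addr: i64) -> Option<[u8; 16]> {
    // `jmp qword ptr [rip + disp32]` is encoded as FF 25 <disp32>.
    // rip points at the end of this 6-byte instruction when it executes.
    let next_insn = plt_addr.checked_add(6)?;
    let distance = got_addr.checked_sub(next_insn)?;

    // This is the step that fails once the GOT and the PLT are more than
    // about 2GB apart.
    let disp = i32::try_from(distance).ok()?;

    // Padding with int3 (0xCC) is an assumption made for this sketch.
    let mut entry = [0xcc_u8; 16];
    entry[0] = 0xff;
    entry[1] = 0x25;
    entry[2..6].copy_from_slice(&disp.to_le_bytes());
    Some(entry)
}
```

With a GOT slot at 0x2000 and a stub at 0x1000 the displacement is 0xFFA and the call succeeds; with the two 4GB apart the helper returns None, which corresponds to the situation that triggers the unwrap panic in the issue.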
bjorn3 commented on issue #5005:
Pretty much.
mokhaled2992 commented on issue #5005:
@bjorn3 I have been trying to wrap my head around this for a while now. So this is happening because the GOT and PLT entries for a given function get allocated at addresses too far from each other?
mokhaled2992 edited a comment on issue #5005:
@bjorn3 I'm trying to understand why a PLT entry is always created for each function declaration in the module in PIC mode. I have some questions:
- Where exactly in the code does the call instruction get translated to the target's call instruction/sequence? I saw entries for Inst::CallKnown and Inst::CallUnknown in the isa/x64/inst/emit.rs file, but I can't connect the dots back to the IR call instructions.
- Are there any external or even local function calls that emit PLT relocations and go through the PLT at call sites? I saw that X86CallPLTRel4 is only generated for the special instruction ElfTlsGetAddr?
bjorn3 edited a comment on issue #5005:
call lowers to CallKnown, and call_indirect lowers to CallUnknown.
> Are there any external or even local function calls that emit PLT relocations and go through the PLT at call sites? I saw that X86CallPLTRel4 is only generated for this special instruction ElfTlsGetAddr?
For the PLT: no, unless hotswapping is enabled. For the GOT: yes, CallKnown turns into a GOT lookup followed by an indirect call in some cases. It would likely be possible to stop maintaining the PLT when hotswapping is disabled, but the GOT has the same issue.
mokhaled2992 edited a comment on issue #5005:
If I got it right, for hotswapping we call declare_function once and can then (re)define that function through this sequence: (prepare_for_function_redefine) + define_function + finalize_definitions + get_finalized_function, which compiles/loads the (new) definition in a new piece of memory and returns the address to it. I think it does not use or clean the memory of the old definition. My question is: why do we need the PLT indirection if we already get the address of the newly created definition?
bjorn3 commented on issue #5005:
define_function replaces the entry in the GOT, causing all calls to the old function to be redirected to the new definition, even when made by other compiled code. If the GOT didn't exist, we would have to change all executable memory back to read+write, relocate it again, and change it back. This is much slower and isn't thread safe. The PLT indirection is for cases where a direct GOT access is not possible, like for function pointers. The PLT is basically a trampoline jumping to the address in the corresponding GOT entry.
> I think it does not use or clean the memory of the old definition.
Indeed. That it doesn't deallocate the old definition is something I would like to see fixed, but that will require a more complex scheme for allocating memory to put the executable code in and a way to check that the function is no longer in use by any thread.
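The GOT redirection described above can be modelled in plain Rust without any JIT. The sketch below is an illustration only: GotEntry, old_impl and new_impl are hypothetical names, and an ordinary Rust fn pointer stands in for the address of compiled code. It shows why swapping one atomic slot is enough to redirect every caller, with no need to patch or re-relocate the callers themselves.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Signature shared by every definition stored in the slot (an assumption
/// of this sketch; a real GOT slot is untyped).
type Func = fn(i32) -> i32;

/// Hypothetical stand-in for one GOT slot: an atomically replaceable
/// pointer to the current definition of a function.
struct GotEntry(AtomicPtr<()>);

impl GotEntry {
    fn new(f: Func) -> Self {
        GotEntry(AtomicPtr::new(f as *const () as *mut ()))
    }

    /// Models redefinition: only the slot changes, callers are untouched.
    fn redefine(&self, f: Func) {
        self.0.store(f as *const () as *mut (), Ordering::SeqCst);
    }

    /// Models a call site in compiled code: load the slot, then make an
    /// indirect call through the loaded address.
    fn call(&self, arg: i32) -> i32 {
        let raw = self.0.load(Ordering::SeqCst);
        // SAFETY: the slot only ever holds pointers created from `Func`
        // values in `new` and `redefine`.
        let f: Func = unsafe { std::mem::transmute::<*mut (), Func>(raw) };
        f(arg)
    }
}

// Two example definitions of the "same" function.
fn old_impl(x: i32) -> i32 {
    x + 1
}

fn new_impl(x: i32) -> i32 {
    x * 2
}
```

A caller that holds a GotEntry created with old_impl gets old_impl's result until redefine(new_impl) is called, and new_impl's result afterwards. Note that nothing here frees the old definition, which matches the limitation discussed in this comment.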
mokhaled2992 commented on issue #5005:
Oh, I believe I got it now: this is done through get_address, which returns the PLT entry for hotswapped mode instead of the actual function pointer. This hotswapping is probably meant for the non-top-level functions, so if you have X calling Y, you can get a finalized definition of X and you can hotswap Y, but you can't get PLT entries to top-level functions, right?
bjorn3 commented on issue #5005:
> Oh I believe I got it now, this is done through get_address that returns the PLT entry for hotswapped mode instead of the actual function pointer
Indeed.
> This hotswapping is probably meant for the non top level functions, so If you have X calling Y, you can get a finalized definition to X you can hotswap Y, but you can't get PLT entries to top level functions right?
get_finalized_function returns a PLT entry too in hotswap mode, I believe.
mokhaled2992 commented on issue #5005:
I think it just returns the pointer to the allocated memory where the function was loaded unless I missed some other indirection :)
https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/jit/src/backend.rs#L379
bjorn3 commented on issue #5005:
Indeed.
mokhaled2992 commented on issue #5005:
Thanks a lot @bjorn3 for all the support :heart:.
Feel free to close the issue if you have something that duplicates the original problem. I saw your suggestion in https://github.com/bytecodealliance/wasmtime/issues/4986, which could kill two birds with one stone as far as I understood.
bjorn3 commented on issue #5005:
I don't believe I had opened an issue for this already, so keeping it open is fine I think.
Dimchikkk commented on issue #5005:
> Indeed. That it doesn't deallocate the old definition is something I would like to see fixed, but that will require a more complex scheme for allocating memory to put the executable code in and a way to check that the function is no longer in use by any thread.
Hi @bjorn3, sorry for hijacking the thread, but I'm wondering if you could share more details on how you would start implementing deallocation of swapped functions. Is it a sort of garbage collector that tracks all definitions, so that if a definition was swapped for a new one and the old definition is not used by any thread, its memory is deallocated?
bjorn3 commented on issue #5005:
My idea has been for the user of cranelift-jit to tell it which functions are no longer necessary. Then cranelift-jit needs to store which parts of each page are still used by necessary functions, and when no used function is still part of a page, the page can be deallocated. Alternatively, at the cost of memory usage, each function can be stored in separate pages, which would allow immediately deallocating all of its pages when a function is marked as unused.
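The first idea (tracking which pages still hold needed functions) could be sketched roughly as below. This is not an existing cranelift-jit API: PageTracker, add_function, mark_unused and the 4096-byte PAGE_SIZE are all hypothetical, the tracker only counts functions per page, and it does not address the separate problem of checking that no thread is still executing the code.

```rust
use std::collections::HashMap;

/// Assumed page size for this sketch.
const PAGE_SIZE: usize = 4096;

/// Hypothetical bookkeeping: for each page index, how many still-needed
/// functions have at least one byte on that page.
#[derive(Default)]
struct PageTracker {
    live: HashMap<usize, usize>,
}

impl PageTracker {
    /// Page indices covered by the byte range [addr, addr + len).
    fn pages(addr: usize, len: usize) -> std::ops::RangeInclusive<usize> {
        assert!(len > 0, "a function occupies at least one byte");
        (addr / PAGE_SIZE)..=((addr + len - 1) / PAGE_SIZE)
    }

    /// Records a newly defined function.
    fn add_function(&mut self, addr: usize, len: usize) {
        for page in Self::pages(addr, len) {
            *self.live.entry(page).or_insert(0) += 1;
        }
    }

    /// Called when the user says a function is no longer needed. Returns
    /// the pages that no longer hold any needed function and could be freed.
    fn mark_unused(&mut self, addr: usize, len: usize) -> Vec<usize> {
        let mut freed = Vec::new();
        for page in Self::pages(addr, len) {
            let remaining = match self.live.get_mut(&page) {
                Some(count) => {
                    *count -= 1;
                    *count
                }
                None => continue, // page was never tracked; nothing to do
            };
            if remaining == 0 {
                self.live.remove(&page);
                freed.push(page);
            }
        }
        freed
    }
}
```

The alternative mentioned in the comment, one function per set of pages, would make every count either 0 or 1, so mark_unused would always return all of the function's pages at the price of wasting the unused tail of each last page.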
Last updated: Dec 23 2024 at 12:05 UTC