afonso360 commented on issue #4616:
#4546 introduces a
KnownSymbol
mechanism, which sounds like it would be useful here.
uweigand commented on issue #4616:
#4546 introduces a
KnownSymbol
mechanism, which sounds like it would be useful here.Looking at this, I'm wondering why the whole mechanism of allowing names to be replaced externally is even necessary here. I can see where this is useful for libcalls, in case you want to substitute another library. But for something like
_GLOBAL_OFFSET_TABLE_
, that name is defined by the ABI and hard-coded into the linker, there is no way to override this anyway.So maybe it would be simpler to just add a feature to allow the backend to simply hard-code a symbol name?
afonso360 commented on issue #4616:
bjorn3 mentioned that we may need it in the JIT to perform TLS lookups
uweigand commented on issue #4616:
bjorn3 mentioned that we may need it in the JIT to perform TLS lookups
I'm still not sure I see it - at least in the case of
_GLOBAL_OFFSET_TABLE_
, this is not a "real" symbol either way. When creating an object file used by the system linker, the name has to be exactly this string as specified by the ABI, otherwise the required special processing will not trigger in the linker. If we re-implement linker functionality in the JIT, the JIT will simply have to perform that special processing itself, which should not involve any symbol name at all. There is no scenario I can envision where it makes sense to replace a reference to_GLOBAL_OFFSET_TABLE_
with a reference to some symbol with another name.
bjorn3 commented on issue #4616:
In the JIT no object files are created and it is not possible to use the libc tls implementation. My idea has been to intercept the libc function call to get the tls address and implement it in the jit, entirely avoiding libc in the process (except for a single native tls access to get a list of all jit tls values for the current thread)
bjorn3 edited a comment on issue #4616:
In the JIT no object files are created and it is not possible to use the libc tls implementation. My idea has been to intercept the libc function call to get the tls address and implement it in the jit, entirely avoiding libc in the process (except for a single native tls access to get a list of all jit tls values for the current thread)
the JIT will simply have to perform that special processing itself, which should not involve any symbol name at all.
Unless the jit consumer does the implementation.
bjorn3 commented on issue #4616:
Reading the pr description it doesn't seem like the way I was planning to implement it on x86_64 will work on s390x, as on s390x the generated code already accesses the tls register, while on x86_64 this is left for the libcall that can be replaced.
afonso360 commented on issue #4616:
Reading the pr description it doesn't seem like the way I was planning to implement it on x86_64 will work on s390x, as on s390x the generated code already accesses the tls register, while on x86_64 this is left for the libcall that can be replaced.
I think this also applies to the COFF TLS model, we never perform any libcall's, its all relocations. And if we want to implement it in the JIT we are also going to need to handle that separately in the jit linker right?
afonso360 edited a comment on issue #4616:
Reading the pr description it doesn't seem like the way I was planning to implement it on x86_64 will work on s390x, as on s390x the generated code already accesses the tls register, while on x86_64 this is left for the libcall that can be replaced.
I think this also applies to the COFF TLS model, we never perform any libcall's, its all relocations. And if we want to implement it in the JIT we are also going to need to handle that separately in the JIT linker right?
bjorn3 commented on issue #4616:
I think this also applies to the COFF TLS model, we never perform any libcall's, its all relocations.
Even on Windows it would be possible to use ELF tls in the JIT.
And if we want to implement it in the JIT we are also going to need to handle that separately in the JIT linker right?
At least a prototype can be done outside of the jit in cg_clif itself.
uweigand commented on issue #4616:
Reading the pr description it doesn't seem like the way I was planning to implement it on x86_64 will work on s390x, as on s390x the generated code already accesses the tls register, while on x86_64 this is left for the libcall that can be replaced.
So the way GD TLS works on s390x is that you 1) have a constant pool entry that contains a relocation which the linker will resolve to the GOT offset of a pair of GOT slots holding the TLS module index and offset for the target symbol; 2) load this GOT offset and pass it to
__tls_get_offset
, which will load the module index and offset from the GOT, and return the TLS section offset of the requested symbol, and 3) add the value of the current thread pointer to that TLS offset.There is a linker optimization in play, for the case where you don't need the full general-dynamic support, that will do two things: 1) change the relocation on the constant pool entry to directly place the TLS section offset of the target symbol there (instead of the GOT offset of the module/offset pair), and replace the call to
__tls_get_offset
with a no-op. This will lead the generated code to effectively, instead, load the TLS section offset from the constant pool and add the current thread pointer, with no libcall involved.W.r.t. implementing TLS in the JIT, I guess that depends to a large extent on what exactly should be supported. But the fact that we add the thread pointer in itself doesn't change much - if nothing better is possible, you can always implement
__tls_get_offset
in terms of your__tls_get_addr
implementation and just subtract the current thread pointer before returning.This will require overloading
__tls_get_offset
(and__tls_get_addr
) - but those are (and should remain) normal libcalls that can already be overloaded by the JIT.This discussion is about
_GLOBAL_OFFSET_TABLE_
, which is needed here as parameter to__tls_get_offset
. (Of course there can also be other uses of_GLOBAL_OFFSET_TABLE_
unrelated to TLS.) This has to be handled by the JIT itself, because whatever value is used as_GLOBAL_OFFSET_TABLE_
must match the GOT start that is also used to resolve GOT-relative relocations. I don't see any way to implement_GLOBAL_OFFSET_TABLE_
by just redirecting to some other symbol provided outside the JIT.
bjorn3 commented on issue #4616:
This discussion is about _GLOBAL_OFFSET_TABLE_, which is needed here as parameter to __tls_get_offset. (Of course there can also be other uses of _GLOBAL_OFFSET_TABLE_ unrelated to TLS.) This has to be handled by the JIT itself, because whatever value is used as _GLOBAL_OFFSET_TABLE_ must match the GOT start that is also used to resolve GOT-relative relocations. I don't see any way to implement _GLOBAL_OFFSET_TABLE_ by just redirecting to some other symbol provided outside the JIT.
For the JIT case it would probably have a value of 0, or whatever arbitrary value the JIT wants as it isn't actually necessary there.
uweigand commented on issue #4616:
For the JIT case it would probably have a value of 0, or whatever arbitrary value the JIT wants as it isn't actually necessary there.
Well, the TLS relocation resolves to an _offset relative to the GOT_, so the value GLOBAL_OFFSET_TABLE must match how these offsets were computed. It's probably true you can choose any arbitrary value in the JIT, including zero, as long as it's used consistently.
However, this just reinforces my point that _the JIT will resolve this_, even if to always 0. It will never be replaced by some other symbol, therefore having a replaceable name is not useful.
uweigand commented on issue #4616:
I've now updated the patch to include the
KnownSymbol
mechanism by @afonso360. As discussed above, I've not included any "lookup" function to provide names from the outside for the well-known symbols. Instead, they only have internal names for display/debug purposes (just like libcalls), and the well-known external name is hard-coded in the object backend. (The JIT backend wouldn't need any symbol name, it can just implement the semantics directly.)@bjorn3 does this look reasonable to you?
uweigand commented on issue #4616:
Hi @cfallin @bjorn3 @afonso360 as far as I can see, this version no longer has any open issues and should be good to go. (It will conflict with https://github.com/bytecodealliance/wasmtime/pull/4546 as it pulls it parts of that change, but that should be easy to resolve once it's in.) Is this now OK, or are there any further changes needed?
cfallin commented on issue #4616:
(It will conflict with https://github.com/bytecodealliance/wasmtime/pull/4546 as it pulls it parts of that change, but that should be easy to resolve once it's in.)
@uweigand it's not totally clear to me, do you mean you want to merge this PR first and then resolve the conflicts in #4546, or wait for #4546 then resolve conflicts here?
uweigand commented on issue #4616:
(It will conflict with #4546 as it pulls it parts of that change, but that should be easy to resolve once it's in.)
@uweigand it's not totally clear to me, do you mean you want to merge this PR first and then resolve the conflicts in #4546, or wait for #4546 then resolve conflicts here?
I think we should merge this PR first. I copied the implementation of
KnownSymbol
initially from #4546, but then added a few changes to address issues that @bjorn3 had pointed out. So the implementation here should be the correct one.
Last updated: Nov 22 2024 at 16:03 UTC