Are there any tricks to declaring functions that are internal to the total linkage unit, but external to the current module? In particular, I'm trying to define a reference to a function that I'm going to declare in a different object file, but link statically.
It looks like cranelift is emitting full dynamic library calls / relocation info no matter what I do (I've tried declaring with Linkage::Import, and Linkage::Preemptible with a fake body), which gets me a linker warning on Linux and a bus error on MacOS.
Since the MacOS error is slightly more illuminating, I'll say that it pops up in dyld:
(lldb) bt
* thread #1, stop reason = EXC_BAD_ACCESS (code=2, address=0x100003f50)
* frame #0: 0x00000001b096eb7c dyld`dyld4::fixupPage64(void*, mwl_info_hdr const*, dyld_chained_starts_in_segment const*, unsigned int, bool) + 132
frame #1: 0x00000001b096e924 dyld`dyld4::dyld_map_with_linking_np(mwl_region const*, unsigned int, mwl_info_hdr const*, unsigned int) + 528
frame #2: 0x00000001b096adc8 dyld`dyld4::setUpPageInLinkingRegions(dyld4::RuntimeState&, dyld4::Loader const*, unsigned long, unsigned short, unsigned short, bool, dyld3::Array<dyld4::PageInLinkingRange> const&, dyld3::Array<void const*> const&) + 832
frame #3: 0x00000001b096a718 dyld`invocation function for block in dyld4::Loader::setUpPageInLinking(Diagnostics&, dyld4::RuntimeState&, unsigned long, unsigned long long, dyld3::Array<void const*> const&) const + 380
frame #4: 0x00000001b096a48c dyld`dyld4::Loader::setUpPageInLinking(Diagnostics&, dyld4::RuntimeState&, unsigned long, unsigned long long, dyld3::Array<void const*> const&) const + 536
frame #5: 0x00000001b096af10 dyld`dyld4::Loader::applyFixupsGeneric(Diagnostics&, dyld4::RuntimeState&, unsigned long long, dyld3::Array<void const*> const&, dyld3::Array<void const*> const&, bool, dyld3::Array<dyld4::Loader::MissingFlatLazySymbol> const&) const + 152
frame #6: 0x00000001b09709ec dyld`dyld4::JustInTimeLoader::applyFixups(Diagnostics&, dyld4::RuntimeState&, dyld4::DyldCacheDataConstLazyScopedWriter&, bool) const + 680
frame #7: 0x00000001b09548d0 dyld`dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 2212
frame #8: 0x00000001b0953dc4 dyld`start + 2404
The fault address (0x100003f50) is where I'd expect the call to be:
(lldb) disassemble -n gogogo
test`gogogo:
0x100003f40 <+0>: stp x29, x30, [sp, #-0x10]!
0x100003f44 <+4>: mov x29, sp
0x100003f48 <+8>: ldr x0, #0x8 ; <+16>
0x100003f4c <+12>: b 0x100003f58 ; <+24>
0x100003f50 <+16>: udf #0x3f91
0x100003f54 <+20>: .long 0x00280000 ; unknown opcode
0x100003f58 <+24>: mov x1, #0x4
0x100003f5c <+28>: ldr x3, #0x8 ; <+36>
0x100003f60 <+32>: b 0x100003f6c ; <+44>
0x100003f64 <+36>: udf #0x3e80
0x100003f68 <+40>: udf #0x0
0x100003f6c <+44>: blr x3
0x100003f70 <+48>: ldp x29, x30, [sp], #0x10
0x100003f74 <+52>: ret
I'm going to assume that the bus error is a write to read-only memory. (Although I'm curious what it would've written.) Also, for reference, the warning I get on Linux is:
/usr/bin/ld: output.o: warning: relocation in read-only section `.text'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
Although the resultant binary does run.
This is a "simplified" version of what I'm running: https://github.com/acw/cranelift-hmmm
(makedo.sh will get you direct to lldb)
Try setting the is_pic flag when creating the target isa.
Are there any tricks to declaring functions that are internal to the total linkage unit, but external to the current module?
Hidden visibility is not yet supported, but for PIE executables that shouldn't matter with respect to relocations: in a PIE executable every symbol is handled as if it had protected visibility, which prevents other DSOs from overriding it. The problem here is that non-position-independent code can't be linked into a position-independent executable, so you need to tell cranelift to build position-independent code using is_pic.
bjorn3 said:
Try setting the is_pic flag when creating the target isa.
Interesting. That seems to fix the linker warning on Linux, but doesn't fix the bus error on macOS.
Can you show the output of objdump -dr output.o on macOS?
(The arguments may be slightly different on macOS. Basically I want a disassembly (-d) with relocations (-r).)
bjorn3 said:
Can you show the output of objdump -dr output.o on macOS?
Oh, interesting:
% objdump -dr output.o
output.o: file format mach-o arm64
Disassembly of section __TEXT,__text:
0000000000000020 <_gogogo>:
20: fd 7b bf a9 stp x29, x30, [sp, #-16]!
24: fd 03 00 91 mov x29, sp
28: 40 00 00 58 ldr x0, 0x30 <_gogogo+0x10>
2c: 03 00 00 14 b 0x38 <_gogogo+0x18>
30: 00 00 00 00 udf #0
0000000000000030: ARM64_RELOC_UNSIGNED _variable-name-x
34: 00 00 00 00 udf #0
38: 81 00 80 d2 mov x1, #4
3c: 43 00 00 58 ldr x3, 0x44 <_gogogo+0x24>
40: 03 00 00 14 b 0x4c <_gogogo+0x2c>
44: 00 00 00 00 udf #0
0000000000000044: ARM64_RELOC_UNSIGNED _print
48: 00 00 00 00 udf #0
4c: 60 00 3f d6 blr x3
50: fd 7b c1 a8 ldp x29, x30, [sp], #16
54: c0 03 5f d6 ret
Looks like the failure is actually due to a relocation from the string I'm declaring? Hmmm.
I think that is just because it is the first relocation. Neither should use an absolute ARM64_RELOC_UNSIGNED relocation (which has to be resolved by the dynamic linker), but instead use a pc relative relocation that gets resolved during linking.
Could you please double check that you used the is_pic change on macOS too? If you did indeed use it, I don't understand why absolute relocations are used in the text segment.
Apart from is_pic, my cranelift based backend for rustc doesn't set any flags or anything that could affect what kind of relocations are used, yet it works just fine on macOS.
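For anyone reading the objdump listing above: the two ldr instructions are literal loads, each reading an 8-byte slot embedded in __text, and those slots are where the ARM64_RELOC_UNSIGNED entries sit. The sketch below is not Cranelift code. It is a hand-written decoder (the names LdrLiteral and decode_ldr_literal are made up for illustration) that assumes the standard A64 encoding of the 64-bit LDR (literal) instruction and checks the arithmetic against the bytes from the listing.

```rust
/// Decoded fields of an AArch64 64-bit `LDR Xt, <label>` (literal load).
/// Hypothetical helper type, for illustration only.
#[derive(Debug, PartialEq)]
struct LdrLiteral {
    /// Destination register number (the `t` in `Xt`).
    rt: u8,
    /// Address of the 8-byte literal the instruction reads.
    target: u64,
}

/// Decode the instruction word `word` located at address `pc`.
/// Returns `None` if the word is not a 64-bit LDR (literal).
fn decode_ldr_literal(word: u32, pc: u64) -> Option<LdrLiteral> {
    // 64-bit LDR (literal) has 0b0101_1000 in bits 31..24.
    if word >> 24 != 0x58 {
        return None;
    }
    // imm19 lives in bits 23..5; it is a signed word offset from `pc`.
    let imm19 = (word >> 5) & 0x7ffff;
    // Sign-extend the 19-bit field, then scale by the 4-byte word size.
    let offset = i64::from(((imm19 << 13) as i32) >> 13) * 4;
    Some(LdrLiteral {
        rt: (word & 0x1f) as u8,
        target: pc.wrapping_add(offset as u64),
    })
}

fn main() {
    // Bytes copied from the listing: `40 00 00 58` at 0x28, `43 00 00 58` at 0x3c.
    let first = u32::from_le_bytes([0x40, 0x00, 0x00, 0x58]);
    let second = u32::from_le_bytes([0x43, 0x00, 0x00, 0x58]);
    println!("{:x?}", decode_ldr_literal(first, 0x28));
    println!("{:x?}", decode_ldr_literal(second, 0x3c));
}
```

The decoded literal addresses are 0x30 and 0x44, matching the offsets of the two relocations in the listing. Since those slots hold absolute 64-bit addresses inside the text section, the loader has to patch a text page at startup, which would be consistent with the fault inside dyld's fixupPage64.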
bjorn3 said:
Could you please double check that you used the is_pic change on macOS too? If you did indeed use it, I don't understand why absolute relocations are used in the text segment.
Yup, it's set. I ran .set("is_pic", "true") on the builder, and .is_pic() on the flags returns true afterwards. It appears to be a macOS thing specifically, as the Linux build is PC-relative in both cases:
0000000000000000 <gogogo>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # b <gogogo+0xb>
7: R_X86_64_GOTPCREL variable-name-x-0x4
b: be 04 00 00 00 mov $0x4,%esi
10: 48 8b 0d 00 00 00 00 mov 0x0(%rip),%rcx # 17 <gogogo+0x17>
13: R_X86_64_GOTPCREL print-0x4
17: ff d1 call *%rcx
19: 48 89 ec mov %rbp,%rsp
1c: 5d pop %rbp
1d: c3 ret
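To make the -0x4 in the R_X86_64_GOTPCREL entries concrete, here is a small worked example of the arithmetic a static linker performs for such a relocation, assuming the usual psABI formula (GOT entry address + addend - place). The function names and the addresses in main are invented for illustration; only the offsets 0x7 and 0xb come from the listing above.

```rust
/// 32-bit value the linker stores at `place` for an R_X86_64_GOTPCREL
/// relocation: (address of the symbol's GOT entry) + addend - place.
fn gotpcrel_field(got_entry: u64, place: u64, addend: i64) -> i32 {
    let value = got_entry as i64 + addend - place as i64;
    i32::try_from(value).expect("GOT entry must be within +/-2 GiB of the code")
}

/// Address a RIP-relative operand refers to at run time: the CPU adds the
/// displacement to the address of the *next* instruction.
fn rip_relative_target(next_insn: u64, disp: i32) -> u64 {
    next_insn.wrapping_add(i64::from(disp) as u64)
}

fn main() {
    // Hypothetical final layout chosen by the linker.
    let text_base: u64 = 0x1000; // where gogogo ends up
    let got_entry: u64 = 0x3ff0; // GOT slot for the symbol

    // From the listing: the displacement field starts at offset 0x7 and the
    // instruction ends at offset 0xb, four bytes later. That gap is why the
    // addend is -4.
    let place = text_base + 0x7;
    let next_insn = text_base + 0xb;

    let disp = gotpcrel_field(got_entry, place, -4);
    assert_eq!(rip_relative_target(next_insn, disp), got_entry);
    println!("disp = {:#x}", disp);
}
```

Everything here is resolved at link time and the text section is never patched by the dynamic loader, which is the behaviour the AArch64 output was missing.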
Full source code that I'm using, if anyone sees anything odd. (Apologies for it being a bit long.)
Looks like the AArch64 backend doesn't support emitting pc-relative GOT relocations when trying to get the address of an external symbol.
@Adam Wick Opened https://github.com/bytecodealliance/wasmtime/issues/5544.
:wave: Hey @Adam Wick ,
I've been working on adding pc relative GOT relocations, could you try your code with this branch of cranelift and let me know if it works?
I tested it on AArch64 Linux and it seems to work, not sure if we need anything else for MacOS.
Hmm, just noticed I had only implemented the ELF relocations and not the MachO ones. When I added them, it now crashes in the object crate.
It looks like we are in luck and the upstream object crate already supports this! I was able to compile it for MachO and dump it with llvm-objdump after updating it. Let me know if it works for you.
Afonso Bordado said:
It looks like we are in luck and the upstream object crate already supports this! I was able to compile it for MachO and dump it with llvm-objdump after updating it. Let me know if it works for you.
Hey, just going to confirm in thread (just in case someone finds this via search) that Afonso's fix in PR 5550 resolves this issue.
Adam Wick has marked this topic as resolved.
Last updated: Dec 23 2024 at 12:05 UTC