Description
Apologies for the mouthful of an issue title. Here's some example Rust for the issue at hand:
pub struct Ex {
    d: u32,
}

pub fn call(x: &Ex) {
    call_with_flags(x, 0)
}

pub fn call_with_flags(x: &Ex, flags: u32) {
    println!("d: {}, flags: {}", x.d, flags)
}
Compiling this with rustc -O -C inline-threshold=0 results in the following on x86-64 Linux (the use of -C inline-threshold=0 is a bit artificial, but I've seen the same sort of issue when disassembling std, which presumably doesn't modify inline-threshold):
0000000000000000 <plt::call::h4fbe4633d0f48375>:
   0:   31 f6                   xor    %esi,%esi
   2:   e9 00 00 00 00          jmpq   7 <plt::call::h4fbe4633d0f48375+0x7>
                        3: R_X86_64_PLT32       plt::call_with_flags::he191eb6ceb0f1f21-0x4
That R_X86_64_PLT32 relocation is going to get resolved to a call into the PLT, which introduces a small amount of overhead on every call, in addition to taking up unneeded space with the PLT entry and the function pointer in the GOT. It would be better to use an R_X86_64_PC32 relocation there, which will get turned into a direct jump at link time. glibc uses this technique to great effect so that all intra-libc calls (except for a few things such as malloc) don't go through the PLT.
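For illustration, a source-level approximation of the glibc approach might look like the sketch below: the shared body moves into a private function, and the public functions become thin wrappers around it. This assumes rustc gives non-exported (private) functions internal linkage or hidden visibility, so that intra-crate calls to them need no PLT relocation; that assumption is worth confirming by disassembling the output for the crate type in question. The names call_with_flags_impl and format_line are hypothetical and not part of the original example.

```rust
pub struct Ex {
    d: u32,
}

// Hypothetical private helper that builds the output line. It exists only
// to keep the sketch easy to check; it is not part of the original example.
fn format_line(x: &Ex, flags: u32) -> String {
    format!("d: {}, flags: {}", x.d, flags)
}

// Hypothetical private implementation. Because it is not `pub`, it is not
// part of the crate's exported interface. Assumption: rustc therefore does
// not emit it as a preemptible symbol, so calls to it can be direct.
fn call_with_flags_impl(x: &Ex, flags: u32) {
    println!("{}", format_line(x, flags))
}

// Public entry point: forwards to the private implementation instead of
// calling the exported `call_with_flags` symbol.
pub fn call(x: &Ex) {
    call_with_flags_impl(x, 0)
}

// Public entry point kept for external callers; also a thin wrapper.
pub fn call_with_flags(x: &Ex, flags: u32) {
    call_with_flags_impl(x, flags)
}
```

This only sidesteps the problem by hand, one function at a time, and it costs an extra wrapper per public function; it does not change how the compiler treats calls between public functions.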
Folks might want the ability to override public functions of a crate via LD_PRELOAD or similar, but doing so seems a little tricky with the current name mangling scheme. Perhaps a -C option could be added?