-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Rework c_variadic
#141980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Rework c_variadic
#141980
Conversation
This PR modifies cc @jieyouxu Some changes occurred in compiler/rustc_codegen_ssa |
cb90479
to
23a983d
Compare
compiler/rustc_abi/src/layout/ty.rs
Outdated
@@ -175,6 +175,9 @@ pub trait TyAbiInterface<'a, C>: Sized + std::fmt::Debug { | |||
fn is_tuple(this: TyAndLayout<'a, Self>) -> bool; | |||
fn is_unit(this: TyAndLayout<'a, Self>) -> bool; | |||
fn is_transparent(this: TyAndLayout<'a, Self>) -> bool; | |||
/// Returns `true` if the type is always passed indirectly. Currently only | |||
/// used for `VaList`s. | |||
fn is_pass_indirectly(this: TyAndLayout<'a, Self>) -> bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only see this as having an effect on x86-64 and aarch64, but doesn't the argument-passing algorithm already enforce this by making sure that VaList gets passed via Memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is only needed by the code in rustc_target::callconv
to ensure that VaList
gets PassMode::Indirect { on_stack: false, .. }
. repr_options_of_def
sets the PASS_INDIRECTLY
ReprFlag
for VaList
when it needs to be passed indirectly for non-rustic ABIs, however there is no way to directly access ReprFlag
s from the calling convention code. There is already a method is_transparent
on TyAbiInterface
to expose the IS_TRANSPARENT
ReprFlag
, so I used a similar pattern to expose PASS_INDIRECTLY
to the calling convention code. Without this method, the System V x86-64 ABI would pass VaList
with PassMode::Indirect { on_stack: true, .. }
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just to finish the thought: the difference is important (only) in an FFI context: a foreign extern "C" { fn(ap: VaList); }
would expect ap
to be passed as PassMode::Indirect { on_stack: false, .. }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, fascinating.
compiler/rustc_middle/src/ty/mod.rs
Outdated
if self.is_lang_item(did.to_def_id(), LangItem::VaList) | ||
&& !flags.contains(ReprFlags::IS_TRANSPARENT) | ||
{ | ||
flags.insert(ReprFlags::PASS_INDIRECTLY); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So randomly in the middle of repr_options_of_def
we decide what the ABI for VaList will be? That's a disaster waiting to happen IMO, nobody will suspect ABI-relevant code here...
Also, in the discussion the argument was that this is necessary when va_list is an array -- but I can't see that reflected here at all. Given that the actual source of the issue is array decay, shouldn't we just fix the extern "C"
logic so that it passes all arrays by-ptr (there different ways to do that, none of them pretty until #119183 are resolved), and then that'll do the right thing for array-based va_list
by itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where the (very-ABI-affecting) IS_TRANSPARENT
repr flag is set (based on the function attributes), so it seemed like the best place to put it. Having it as a repr flag seemed like the best way to communicate it to the code in rustc_target
(the only place that needs to be aware of it), but it would be easy enough to do it another way if this way is sub-optimal. You're correct about this not actually checking whether it is an array type: this was the easiest way to differentiate between the two types of array list when I was prototyping this, but it should be replaced with a more precise check. Passing all arrays by reference seems to have been discussed on Zulip and in the parent issue; I think it would be a footgun as C allows the user to observe modifications the callee makes to the array in the caller (unlike va_list
, where the C standard says that using the va_list
after passing it to a function is UB).
I think @folkertdev has suggested before something along the lines of a #[rustc_always_pass_indirectly]
attribute which could be applied to VaList
type as needed: would that be preferable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think if we do go for this (which I am not convinced we should), then it should be its own attribute, not magically connected to the lang item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more discussions on Zulip, I am convinced this is worth the experiment -- after being adjusted as proposed above: having a separate attribute. The compiler shouldn't tie rustc_always_pass_indirectly
to varargs in any way.
That should also make it much easier to write tests just for this attribute. Ideally that's a separate PR from actually using it for varargs -- maye splitting it up is too much work (separate commits would still be good); but at the very least, we should have separate tests.
@beetrees when I run this PR locally, on x86_64, (btw I landed a PR a while ago that makes it easy to cross-compile that test if you have the relevant linker installed and runner defined in [target.powerpc-unknown-linux-gnu]
runner = "/home/folkertdev/c/qemu/build/qemu-ppc -L /usr/powerpc-linux-gnu"
[target.s390x-unknown-linux-gnu]
runner = "qemu-s390x -L /usr/s390x-linux-gnu" I also tried this //@ assembly-output: emit-asm
//@ compile-flags: -O
#![feature(c_variadic)]
#![crate_type = "lib"]
use std::ffi::VaList;
// CHECK-LABEL: by_value:
#[unsafe(no_mangle)]
extern "C" fn by_value(mut x: VaList) -> u32 {
unsafe { x.arg::<u32>() }
}
// CHECK-LABEL: by_reference:
#[unsafe(no_mangle)]
extern "C" fn by_reference(x: &mut VaList) -> u32 {
unsafe { x.arg::<u32>() }
}
// CHECK-LABEL: dummy:
// CHECK: xlkj
#[unsafe(no_mangle)]
extern "C" fn dummy() {} When I diff Am I missing something, or is this broken? Over in #t-compiler > array-to-pointer decay npopov also mentions that |
I've just ran
I ran the assembly test you supplied locally and got this assembly: Test assembly .file "temp.9a048525d681a25f-cgu.0"
.section .text.by_reference,"ax",@progbits
.globl by_reference
.p2align 4
.type by_reference,@function
by_reference:
.cfi_startproc
movl (%rdi), %ecx
cmpq $40, %rcx
ja .LBB0_2
movq %rcx, %rax
addq 16(%rdi), %rax
addl $8, %ecx
movl %ecx, (%rdi)
movl (%rax), %eax
retq
.LBB0_2:
movq 8(%rdi), %rax
leaq 8(%rax), %rcx
movq %rcx, 8(%rdi)
movl (%rax), %eax
retq
.Lfunc_end0:
.size by_reference, .Lfunc_end0-by_reference
.cfi_endproc
.section .text.dummy,"ax",@progbits
.globl dummy
.p2align 4
.type dummy,@function
dummy:
.cfi_startproc
retq
.Lfunc_end1:
.size dummy, .Lfunc_end1-dummy
.cfi_endproc
.globl by_value
.type by_value,@function
.set by_value, by_reference
.ident "rustc version 1.89.0-dev"
.section ".note.GNU-stack","",@progbits LLVM appears to have merged the functions as they had identical assembly.
|
I am somewhat uncomfortable relying on that, and would personally prefer a slightly less slick API that requires fewer hacks. This feature is too niche to justify so much hackery, IMO. But maybe I am overly paranoid. If we do end up going for it, please add walls of comments everywhere that explain this, or else we can be sure some poor compiler contributor will spend hours digging through this in a few years... |
@beetrees hmm, I did rebase, maybe that messed something up? Can you rebase this branch to something more recent? |
23a983d
to
57bc996
Compare
I've rebased, and the two tests still pass locally. |
For me too, I must have messed something up earlier. Thanks for checking! |
57bc996
to
6347a21
Compare
6347a21
to
6e1e74f
Compare
I've reworked this to use a |
This comment has been minimized.
This comment has been minimized.
6e1e74f
to
26f7025
Compare
26f7025
to
79a6396
Compare
|
Currently blocked on #143397. |
This comment has been minimized.
This comment has been minimized.
79a6396
to
f262f05
Compare
This comment has been minimized.
This comment has been minimized.
f262f05
to
5feffd1
Compare
On some platforms, the C
va_list
type is actually a single-element array of a struct (on other platforms it is just a pointer). In C, arrays passed as function arguments expirience array-to-pointer decay, which means that C will pass a pointer to the array in the caller instead of the array itself, and modifications to the array in the callee will be visible to the caller (this does not match Rust by-value semantics). However, forva_list
, the C standard explicitly states that it is undefined behaviour to use ava_list
after it has been passed by value to a function (in Rust parlance, theva_list
is moved, not copied). This matches Rust's pass-by-value semantics, meaning that when the Cva_list
type is a single-element array of a struct, the ABI will match C as long as the Rust type is always be passed indirectly.In the old implementation, this ABI was achieved by having two separate types:
VaList
was the type that needed to be used when passing aVaList
as a function parameter, whereasVaListImpl
was the actualva_list
type that was correct everywhere else. This however is quite confusing, as there are lots of footguns: it is easy to cause bugs by mixing them up (e.g. the C functionvoid foo(va_list va)
was equivalent to the Rustfn foo(va: VaList)
whereas the C functionvoid bar(va_list* va)
was equivalent to the Rustfn foo(va: *mut VaListImpl)
, notfn foo(va: *mut VaList)
as might be expected); also converting fromVaListImpl
toVaList
withas_va_list()
had platform specific behaviour: on single-element array of a struct platforms it would return aVaList
referencing the originalVaListImpl
, whereas on other platforms it would return a cioy,In this PR, there is now just a single
VaList
type (renamed fromVaListImpl
) which represents the Cva_list
type and will just work in all positions. Instead of having a separate type just to make the ABI work, the first commit adds a#[rustc_pass_indirectly_in_non_rustic_abis]
attribute, which when applied to a struct will force the struct to be passed indirectly by non-Rustic calling conventions. The second commit then implements theVaList
rework, making use of the new attribute on all platforms where the Cva_list
type is a single-element array of a struct.Cleanup of the
VaList
API and implementation is also included in the second commit: since it was decided it was OK to experiment with Rust requiring that not callingva_end
is not undefined behaviour (#141524 (comment)), I've removed thewith_copy
method as it was redundant to theClone
impl (theDrop
impl ofVaList
is a no-op asva_end
is a no-op on all known platforms). I've also removed the compiler support for the unused Rustva_start
andva_end
intrinsics; the backendva_start
andva_end
calls for theVaList
created by calling a varargs function are generated directly byrustc_codegen_ssa
and are unaffected.Previous discussion: #141524 and t-compiler > c_variadic API and ABI
Tracking issue: #44930
r? @joshtriplett