Skip to content

fix some bitfield size calculate bug #3215

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from
Closed

Conversation

qinghon
Copy link
Contributor

@qinghon qinghon commented May 31, 2025

Now, we can perceive non-starting bitfields and non-ending bitfields

Also, I modified the calculation method of align: only calculate alignment for bitfield from structure using bitfield_width
The reason is that the code now no longer relies on #[repr(C)] to generate alignments, but always contains all bytes
The problem that is currently known is that the test case bitfield-linux-32.hpp was dropped #[repr(packed(4))]. I don't know the reason, hope someone can help me

qinghon added 3 commits June 1, 2025 00:44
Now, we can perceive non-starting bitfields and non-ending bitfields
@qinghon
Copy link
Contributor Author

qinghon commented May 31, 2025

Issue known to be repaired:
#1377
#3105
#743
#981
#1314

@qinghon
Copy link
Contributor Author

qinghon commented Jun 3, 2025

r? @emilio

@rustbot
Copy link
Collaborator

rustbot commented Jun 3, 2025

Error: The feature assign is not enabled in this repository.
To enable it add its section in the triagebot.toml in the root of the repository.

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Generally it looks good but I suspect this is specific to 64-bit architectures? Or am I missing something? In any case some more comments would be good. Thanks

@qinghon qinghon requested a review from emilio June 9, 2025 11:09
qinghon added 3 commits June 10, 2025 10:33
For 32bit structures compiled across platforms, align_of is not accurate whether it is set to 8 or 4.
add bitfield pack offset bug
@qinghon
Copy link
Contributor Author

qinghon commented Jun 10, 2025

fix and add #3104 case

this change add depended clang report offset

@qinghon
Copy link
Contributor Author

qinghon commented Jun 20, 2025

@emilio Can you take some time to review this PR ?

@wdouglass
Copy link

Fixes #3238

@nikvoid
Copy link

nikvoid commented Jul 3, 2025

Looks like some cases are still broken.

struct appHand_supportedAppProtocolRes {
    appHand_responseCodeType ResponseCode; // enum with few variants, offset = 0x0
    uint8_t SchemaID; // offset = 0x1 (why?, isn't enum i32?)
    unsigned int SchemaID_isUsed:1; // offset = 0x2 ?, size of struct = 4
};

Generated with bindgen 0.72:

#[repr(C)]
pub struct appHand_supportedAppProtocolRes {
    pub ResponseCode: appHand_responseCodeType::Type,
    pub SchemaID: u8,
    pub _bitfield_align_1: [u8; 0], 
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 1usize]>, // offset = 0x5 ?
    pub __bindgen_padding_0: u16, // offset = 0x6, size of struct = 8
}

Generated with this PR:

#[repr(C)]
pub struct appHand_supportedAppProtocolRes {
    pub ResponseCode: appHand_responseCodeType::Type,
    pub SchemaID: u8,
    pub _bitfield_align_1: [u8; 0],
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 3usize]>, // offset = 0x5, still
}

Bitfield storage alignment is invalid

EDIT: Wrong original offsets
EDIT2: The actual problem is in size of enum, so this is a bit unrelated

@qinghon
Copy link
Contributor Author

qinghon commented Jul 3, 2025

@nikvoid

Bitfield storage alignment is invalid

Yes

In the C compiler, SchemaID_isUsed and SchemaID will be packaged into an int and accessed in 4 byte alignment.

However, in bindgen, it is difficult to ensure this requirement under the current structure, which will cause some performance losses, but it is better than the wrong offset (the purpose of this PR is to fix the wrong offset/size, but at present, it does not destroy the alignment of the entire structure)

@nikvoid
Copy link

nikvoid commented Jul 4, 2025

This actually was a thing with different compilers used from bindgen and C sides... GCC fitted enum to 1 byte, but clang did not.

Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good to me with some of the bits clarified / documented, and comments reworded.

}
// depended clang report field offset, manual calculates sometime not true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot make sense of this comment, can you reword?

I think you are saying something like:

// Use the clang-reported offset, since the manual calculation may be off.

Or so? Or the other way around?

@@ -643,8 +653,15 @@ where
// NB: The `bitfield_width` here is completely, absolutely
// intentional. Alignment of the allocation unit is based on the
// maximum bitfield width, not (directly) on the bitfields' types'
// alignment.
unit_align = cmp::max(unit_align, bitfield_width);
// alignment. this is only available in bitfields start of structure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a libclang limitation?

if bitfields_bit_begin_off == 0 {
unit_align = cmp::max(unit_align, bitfield_width);
} else {
// if bitfields is not start of structure, due to the presence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

If the bitfield is not the first member of the struct, we can't use max alignment. Default to u8.

What is backward embedding? Never heard that term. You mean something like the optimization where you put fields in the padding of the base class or something?

.offset()
.map_or(unit_size_in_bits, |off| off - bitfields_bit_begin_off)
} else if let Some(parent_layout_) = parent_layout {
// if the bitfield is an end of structure, use bitfields start to the structure end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is this special-case necessary? Seems a bit unfortunate...

@emilio
Copy link
Contributor

emilio commented Jul 11, 2025

Also could you squash before I hit the merge button? A lot of the commits don't have particularly descriptive commit messages (like cargo fmt)

@@ -1,4 +1,4 @@
// bindgen-flags: -- --target=i586-unknown-linux
// bindgen-flags: --no-layout-tests -- --target=i586-unknown-linux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I don't understand why this is needed. Aren't we regressing this test-case? Why is that fine? Can we file a follow-up issue to address it at least?

Or is the issue I guess that the layout doesn't match on Linux64 which is where we build the code?

let bitfield_align = bitfield_layout.align;
// bitfield offset of struct/union
let bitfield_offset = bitfield.offset().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'm not sure we can guarantee we have this, right? This panics with this patch:

template <typename T> struct foo {
  T member;
  bool b : 8;
};

next_field
.offset()
.map_or(unit_size_in_bits, |off| off - bitfields_bit_begin_off)
} else if let Some(parent_layout_) = parent_layout {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to remove parent_layout doesn't cause any layout test failures at a glance. So I think it's fine to remove.

@emilio
Copy link
Contributor

emilio commented Jul 11, 2025

Ok so I dug into what's going on. The key difference between how bindgen tracks bitfields vs. LLVM is that we track the offset within the bitfield, but llvm tracks the global offset within the struct.

#3247 fixes that, which fixes all the regressions, and it is significantly simpler. @qinghon lmk if you have any feedback on that. I kept credit since the test-cases are yours and you spent a lot of effort on this.

Thanks!

@emilio
Copy link
Contributor

emilio commented Jul 11, 2025

Superseded by #3247. But feel free to comment there or send more patches if something is not addressed. Thanks!

@emilio emilio closed this Jul 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants