Skip to content

Conversation

iv461
Copy link

@iv461 iv461 commented Sep 15, 2025

Description

Fixes #2951

Is this user-facing behavior change?

When users publish custom message types using type adapters, they must now publish by unique_ptr if the custom message type overloads operator new. This is to prevent new/delete mismatch.

Did you use Generative AI?

No

Additional Information

Tested, and correctly fails to compile if I try to publish a pcl::PointCloud by value AND I'm pulling the dependency on PCL via pcl_ros.

According to cppreference, there are 22 overloads of operator new. I only implement detecting the class-specific ones 15 and 17. I do not implement the array-operators (16 and 18) since the detection becomes ambiguous and causes compile error.
I do not think it is reasonable to guard against overloaded global operator new/delete, i.e. overloads 1-10.

Also, overloads 11-14 and 19-22 accept arbitrary arguments and are therefore likely to be undetectable.

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although I think this is a right thing to do to avoid undefined behavior by new/delete mismatch when the data object is copied by global allocator, this possibly breaks the application not to be able to compile once this fix is merged. again, i think this is also good to let the user application know the correct approach to publish the data in this particular case. but i would like to have other opinions from other developers.

@fujitatomoya
Copy link
Collaborator

@Mergifyio rebase

Copy link
Contributor

mergify bot commented Sep 17, 2025

rebase

✅ Branch has been successfully rebased

@iv461
Copy link
Author

iv461 commented Sep 17, 2025

Yes, this is undefined behavior, more speficially it might cause heap corruption and with this random crashes.
Therefore, I think it is reasonable to aim to fix all packages that have this new/delete mismatch bug and now would fail to compile.

But you are right, the problem is that there might be many places that need to be fixed, i.e. many packages that would fail to compile.

To fix the compilation, generally it suffices to replace publish(my_message) with publish(std::make_unique<MyMessage>(my_message)). Since rclcpp would copy the message anyway, this also has no performance impact.

Two ideas:

  • Bump pcl_ros to use C++17 on rolling to prevent pcl::PointCloud from using custom allocators.
  • Make an exception for the pcl::PointCloud type, (i.e. allow it to be published by value) if we can prove that calling the Eigen aligned free function does not cause heap corruption when freeing memory allocated from the regular global new operator. But unfortunately, it looks like it would cause heap corruption.

By the way, I looked at the CI and it failed again due to an apparently unrelated issue in rclcpp_action.

Comment on lines 580 to 581
"When publishing by value (i.e. when calling publish(const T& msg)), the published message"
"type must not have an overloaded operator new. In this case, please use the"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion. Please double-check the line length.

Suggested change
"When publishing by value (i.e. when calling publish(const T& msg)), the published message"
"type must not have an overloaded operator new. In this case, please use the"
"When publishing by value (i.e. when calling publish(const T& msg)), the published message "
"type must not have an overloaded operator new. In this case, please use the "

Copy link
Author

@iv461 iv461 Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used now ament_uncrustify --reformat --linelength 100 for code formatting, which formatter do you use ?

@alsora
Copy link
Collaborator

alsora commented Oct 3, 2025

We discussed this during the client library working group.
Overall we are in favor of this.

If possible it would be better to make it a static warning rather than an assertion, to have a deprecation cycle.
If not possible, we think it would be ok to merge as is.

Ivo Ivanov added 11 commits October 7, 2025 21:42
This reverts commit f1917a35a5037220be2f537031e3ef5b65b5be3e.

Signed-off-by: Ivo Ivanov <[email protected]>
Signed-off-by: Ivo Ivanov <[email protected]>
…nstead of an static_assert that would cause a compile error.

Signed-off-by: Ivo Ivanov <[email protected]>
…late specialization. This forces the compiler to evaluate the condition later, not as early as in the lexial stage

Signed-off-by: Ivo Ivanov <[email protected]>
Signed-off-by: Ivo Ivanov <[email protected]>
…ept GCC: We need two patterns to avoid struct redeclaration error

Signed-off-by: Ivo Ivanov <[email protected]>
Signed-off-by: Ivo Ivanov <[email protected]>
…g: The warning simply does not trigger when called from the rclcpp publish method for some reason

Signed-off-by: Ivo Ivanov <[email protected]>
@iv461
Copy link
Author

iv461 commented Oct 7, 2025

Thanks for the update. I would also be fine with a deprecation warning instead of a compile error. However, I was unable to to implement a deprecation warning. I tried an approach in df1e782 which generally works, but for some strange reason I can't get it to work inside rclcpp. Therefore, I'll leave it as it is, with the static_assert.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential new/delete mismatch when publishing a custom type that has overloaded new/delete operators (e.g. pcl::PointCloud) using type adapters
5 participants