Skip to content

Conversation

@mergify
Copy link
Contributor

@mergify mergify bot commented Oct 9, 2025

Description

This PR fixes a lock order inversion bug detected by Thread Sanitizer between GraphListener::shutdown_mutex_ and Context::on_shutdown_callbacks_mutex_, by storing and instantiating the GraphListener inside of the Context instead of NodeGraph. With this change, we avoid locking the context weak pointer and inverting the order of lock acquisition on startup and shutdown.

Usage of is_shutdown_->load() was also replaced with the existing is_shutdown() method in GraphListener for consistency.

Fixes # (2946)

Is this user-facing behavior change?

no

Did you use Generative AI?

no

Additional Information

This code appears to have remained unchanged for a long time, and I was able to reproduce this tsan error on iron, jazzy and rolling with this example. Tsan identified a lock order inversion as well as one potential data race in other parts of the code. With this change, the lock order inversion warning is gone, no additional tsan warnings were introduced, and all unit tests passed locally when these changes were built against both jazzy and rolling.


This is an automatic backport of pull request #2952 done by Mergify.

@mergify mergify bot added the conflicts label Oct 9, 2025
@mergify
Copy link
Contributor Author

mergify bot commented Oct 9, 2025

Cherry-pick of 715a983 has failed:

On branch mergify/bp/jazzy/pr-2952
Your branch is up to date with 'origin/jazzy'.

You are currently cherry-picking commit 715a983.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   rclcpp/include/rclcpp/context.hpp
	modified:   rclcpp/include/rclcpp/graph_listener.hpp
	modified:   rclcpp/include/rclcpp/node_interfaces/node_graph.hpp
	modified:   rclcpp/src/rclcpp/graph_listener.cpp
	modified:   rclcpp/src/rclcpp/node_interfaces/node_graph.cpp
	modified:   rclcpp/test/rclcpp/test_graph_listener.cpp

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   rclcpp/src/rclcpp/context.cpp

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skyegalaxy do you mind to fix the conflicts ?

@skyegalaxy skyegalaxy force-pushed the mergify/bp/jazzy/pr-2952 branch from 0759325 to 7106305 Compare October 9, 2025 16:00
)

* factor shutdown hook registration into its own method

Signed-off-by: Skyler Medeiros <[email protected]>

* run ament_uncrustify

Signed-off-by: Skyler Medeiros <[email protected]>

* consistent use of is_shutdown in graph-listener

Signed-off-by: Skyler Medeiros <[email protected]>

* this doesn't actually need to be a separate function

Signed-off-by: Skyler Medeiros <[email protected]>

* whitespace

Signed-off-by: Skyler Medeiros <[email protected]>

* fix typo

Signed-off-by: Skyler Medeiros <[email protected]>

* move graph listener into the context

Signed-off-by: Skyler Medeiros <[email protected]>

* make cpplint happy

Signed-off-by: Skyler Medeiros <[email protected]>

* Update rclcpp/src/rclcpp/context.cpp

Signed-off-by: Skyler Medeiros <[email protected]>

---------

Signed-off-by: Skyler Medeiros <[email protected]>
@skyegalaxy skyegalaxy force-pushed the mergify/bp/jazzy/pr-2952 branch from 7106305 to 4303518 Compare October 9, 2025 16:12
@skyegalaxy
Copy link
Member

Pulls: #2966
Gist: https://gist.githubusercontent.com/skyegalaxy/9c026cefc94545bea772327e79dc1ba3/raw/42c219db391d86a0ea45150cc4a6a6d418a8aca0/jazzy-pr-2952.repos
BUILD args: --packages-above-and-dependencies rclcpp
TEST args: --packages-above rclcpp
ROS Distro: jazzy
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/17270/

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@skyegalaxy
Copy link
Member

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@skyegalaxy skyegalaxy requested a review from ahcorde October 10, 2025 16:07
ahcorde
ahcorde previously approved these changes Oct 14, 2025
@jmachowinski
Copy link
Collaborator

This is ABI breaking and may not be backported

@ahcorde ahcorde dismissed their stale review October 14, 2025 16:46

ABI break

@skyegalaxy skyegalaxy closed this Oct 14, 2025
Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmachowinski thanks for letting me know that 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants