Skip to content

Remap thread id to avoid bitmap contention #223

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from
Closed

Conversation

zhengyu123
Copy link
Contributor

@zhengyu123 zhengyu123 commented Jun 10, 2025

What does this PR do?:

An application starts a few threads in short period of window, the newly created threads are likely to have consecutive or close thread ids. When ThreadFilter maps them to the bits on bitmap, there are high chances that they may be mapped to a single cache line, that results cache line contention when updating the bitmap concurrently.

The PR purposes to remap thread id by reversing lower 2 bytes of thread id, so that consecutive/close thread ids map to different cache lines.

The reason for only reversing lower 2 bytes, is that, thread id is remapped within the same bitmap.

Motivation:

Reduce cache line contention, improve performance.

Additional Notes:

How to test the change?:

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.
  • JIRA: PROF-12002

Unsure? Have a question? Request a review!

@zhengyu123 zhengyu123 marked this pull request as draft June 10, 2025 19:13
Copy link

github-actions bot commented Jun 10, 2025

🔧 Report generated by pr-comment-cppcheck

CppCheck Report

Errors (2)

Warnings (4)

Style Violations (295)

Copy link

github-actions bot commented Jun 10, 2025

🔧 Report generated by pr-comment-scanbuild

@r1viollet
Copy link
Collaborator

Would the contention not remain if you only swap lower bits ?
I also thought of swapping all bits, however that would lead to other issues (fragmentation in the radix tree). Happy to talk this through. I'll bench it to see what the results have to say.

@r1viollet
Copy link
Collaborator

I think the proposal here is still performing better:

Operations/second/thread: 769,128

Versus this one:

Operations/second/thread: 576,013

As a first step can we deliver the bench in a separate PR to get it running in CI ? This way, we could get more stable results. My laptop has a lot of noise.

@r1viollet
Copy link
Collaborator

Also a different bench would be interesting.

@zhengyu123
Copy link
Contributor Author

I think the proposal here is still performing better:

Operations/second/thread: 769,128

Versus this one:

Operations/second/thread: 576,013

As a first step can we deliver the bench in a separate PR to get it running in CI ? This way, we could get more stable results. My laptop has a lot of noise.

As discussed, I updated PR to reverse lower 32 bits that will further more distant 2 consecutive number to ensure they are not mapped to the same cache line.

@zhengyu123 zhengyu123 marked this pull request as ready for review June 13, 2025 14:09
Copy link
Collaborator

@jbachorik jbachorik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do the benchmarks look with the latest changes?
Otherwise, LGTM!

@zhengyu123
Copy link
Contributor Author

JFR recordings from Erwan's benchmark:

Basline:
before

After patch:
after

Copy link
Collaborator

@r1viollet r1viollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is faster than main, so LGTM
I would still need to check the #224 PR to understand the impact of JNI

@jbachorik
Copy link
Collaborator

Is this superseded by #229 ?
If yes, we should close this one.

@zhengyu123 zhengyu123 closed this Jun 25, 2025
@zhengyu123 zhengyu123 deleted the zgu/rehash-tid branch June 25, 2025 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants