Skip to content

Create a new monitor for node-level write load #131560

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

DiannaHohensee
Copy link
Contributor

@DiannaHohensee DiannaHohensee commented Jul 18, 2025

Plugs a basic NodeUsageStatsForThreadPoolsMonitor
into the code and sets it up with access to some
components it will need to implement ES-11992.

Relates ES-11991

Plugs a basic NodeUsageStatsForThreadPoolsMonitor
into the code and sets it up with access to
information it will need to implement ES-11992.

Relates ES-11991
@DiannaHohensee DiannaHohensee self-assigned this Jul 18, 2025
@DiannaHohensee DiannaHohensee requested a review from a team as a code owner July 18, 2025 21:04
@DiannaHohensee DiannaHohensee added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0 labels Jul 18, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@DiannaHohensee DiannaHohensee requested a review from mhl-b July 18, 2025 21:17
Copy link
Contributor

@mhl-b mhl-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nit naming

*
* TODO (ES-11992): implement
*/
public class NodeUsageStatsForThreadPoolsMonitor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be more specific, for example WriteLoadMonitor. I think it should monitor write-load in general, which can be a set of different metrics - write-thread-pool utilization, index/shard write load.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks. I've renamed it WriteLoadConstraintMonitor, to follow the existing naming pattern we have.

NodeUsageStatsForThreadPools has been used generically for the thread pool usage, even though we only care about write load right now, because we'll presumably have search load in future. But I expect any monitoring added for search in future should go into a separate monitor class for maintainability reasons.

Copy link

❌ Author of the following commits did not sign a Contributor Agreement:
9bcf8ab

Please, read and sign the above mentioned agreement if you want to contribute to this project

@DiannaHohensee DiannaHohensee added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jul 21, 2025
*
* TODO (ES-11992): implement
*/
public class WriteLoadConstraintMonitor {
Copy link
Contributor

@mhl-b mhl-b Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can follow disk allocator naming too, WriteLoadThresholdSettings and WriteLoadThresholdMonitor. I dont see reason for Constraint or Threshold wording here, it's implicit in the name of Monitor. If we monitor something that means there is a value and threshold for this value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to convey the notion of resource constrained shard allocation decisions.

Monitor by itself doesn't imply thresholds or constraints. It says we're watching for something. In the case of WriteLoadMonitor, we're monitoring write load, but we aren't saying for what.

A significant part of this is simply consistency in naming. For example, you can find all the DiskThreshold* files with the same prefix. Similarly here, WriteLoadConstraint*. If we settle on something other than the WriteLoadConstraint* prefix, I'm inclined to put that in a separate PR to rename the other file as well. The new heap decider logic also follows a prefix pattern -- they in fact did a followup PR for renaming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went ahead with pushing this so as not to block ES-11992.

@DiannaHohensee DiannaHohensee merged commit 79bdc32 into elastic:main Jul 22, 2025
32 of 33 checks passed
@DiannaHohensee DiannaHohensee deleted the 2025/07/18/node-thread-pool-monitor branch July 22, 2025 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >non-issue Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants