@ftomassetti ftomassetti commented Aug 17, 2025

Here we provide the code to observe nodes. This is a prerequisite for implementing Delta, and the code has been extracted from the branch where Delta is being implemented.

  • If a Node that was a root stops being a root, then its partition observers should be removed
  • When a node is added to a tree it should update its observer (perhaps we could add a flag called dirty so that it knows it should update the cache at the next access)
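A minimal sketch of how the two points above could fit together, assuming a lazily resolved cache guarded by a `dirty` flag. `CachingNode`, `ChangeListener` and `touch` are hypothetical names used only for illustration; they are not the classes in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener type, not the LionWeb Java API.
interface ChangeListener {
    void changed(CachingNode node, String what);
}

// Hypothetical node that caches the observer of its root and refreshes it lazily.
class CachingNode {
    private CachingNode parent;
    private final List<CachingNode> children = new ArrayList<>();
    private ChangeListener ownObserver;    // only meaningful while this node is a root
    private ChangeListener cachedObserver; // effective observer, resolved on demand
    private boolean dirty = true;          // true when cachedObserver may be stale

    void registerObserver(ChangeListener listener) {
        if (parent != null) {
            throw new IllegalStateException("observers can only be registered on a root");
        }
        ownObserver = listener;
        markDirty();
    }

    void addChild(CachingNode child) {
        if (child.parent != null) {
            child.parent.children.remove(child);
        }
        child.ownObserver = null; // a former root loses its own observer
        child.parent = this;
        children.add(child);
        child.markDirty();        // the cache is refreshed at the next access
    }

    // Invalidates the cache of this node and of its whole subtree.
    private void markDirty() {
        dirty = true;
        for (CachingNode c : children) {
            c.markDirty();
        }
    }

    // Resolves the effective observer, walking up only when the cache is stale.
    ChangeListener observer() {
        if (dirty) {
            cachedObserver = (parent == null) ? ownObserver : parent.observer();
            dirty = false;
        }
        return cachedObserver;
    }

    // Stand-in for any mutation that should be reported.
    void touch(String what) {
        ChangeListener listener = observer();
        if (listener != null) {
            listener.changed(this, what);
        }
    }
}
```

In this sketch the cost of invalidation is proportional to the size of the attached subtree, while each later notification costs one flag check and one null check.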

Given that this change is very impactful, we started measuring coverage and set a rule of at least 60% coverage for each file tested, so we added a lot of tests and comments as needed.

@ftomassetti ftomassetti force-pushed the feature/node-observer branch 2 times, most recently from f788ab2 to 8ccfdea Compare August 17, 2025 08:01
@enikao enikao left a comment

In delta protocol, we consciously decided to only have observers on partitions.
Main reason: If they're on nodes, each observer (or a shared superclass) has to keep track of all newly added / removed nodes, and keep attaching / removing itself from them. It also gets more tricky if we move a node from one parent to another, and have observers on both -- who gets the message?

Implementation-wise this is very simple in C#: We walk up the parents until we find the root (or the first IPartitionInstance, we have that there). If we find one, we check if there's an observer registered, and do the same thing as here.
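For comparison, a rough Java sketch of the walk-up strategy described above. `SketchNode`, `SketchObserver` and `setName` are hypothetical names chosen for the example; neither the C# implementation nor this PR is claimed to look like this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical observer type, not the LionWeb Java API.
interface SketchObserver {
    void propertyChanged(SketchNode node, String property, Object oldValue, Object newValue);
}

// Hypothetical node: observers live only on the root (the partition).
class SketchNode {
    private SketchNode parent;
    private final List<SketchNode> children = new ArrayList<>();
    private SketchObserver partitionObserver; // consulted only when this node is the root
    private String name;

    void addChild(SketchNode child) {
        if (child.parent != null) {
            child.parent.children.remove(child);
        }
        child.parent = this;
        children.add(child);
    }

    // Registration is allowed only on a root, i.e. a partition.
    void registerPartitionObserver(SketchObserver observer) {
        if (parent != null) {
            throw new IllegalStateException("observers can only be registered on a partition");
        }
        this.partitionObserver = observer;
    }

    // Walks up the parents until the root is found.
    SketchNode root() {
        SketchNode current = this;
        while (current.parent != null) {
            current = current.parent;
        }
        return current;
    }

    void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        // Every mutation pays for the walk to the root, even with no observer registered.
        SketchObserver observer = root().partitionObserver;
        if (observer != null) {
            observer.propertyChanged(this, "name", oldName, newName);
        }
    }
}
```

With this shape a moved node needs no bookkeeping: the next mutation simply finds whatever root it now belongs to.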

Of course, this implementation can be used to satisfy delta spec. But I think we should only have two pretty different strategies for such a similar task if we have a good reason to do so.

@ftomassetti

Thank you, I had not thought about such an approach. Let me think about it and get back to you.

@ftomassetti

One thing I would not like about having the observer on the partition is this: I want to be sure that when there are no observers the performance impact is as small as it can be. Currently we only check whether the node's own observer field is null. If we moved the observer to the partition, we would need to traverse all the ancestors to check whether the observer field on the root is null. So even when no observer is registered we would have a performance impact. I would like to reflect on how to minimize it.

@enikao

enikao commented Aug 18, 2025

Before doing any optimization on this I'd measure the actual impact.
If we see some, we could store an internal reference (or null) from each node to the observer -- without having an individual node observer API. This way, at least only the internal implementation has to care, not every observer.
We could even store only a boolean indicating whether there's any observer.
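One way to read this suggestion, sketched under the assumption that the reference is pushed down eagerly whenever an observer is registered or a subtree is attached. `ObservedNode`, `PartitionListener` and `isObserved` are hypothetical names, not part of LionWeb Java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener type, not the LionWeb Java API.
interface PartitionListener {
    void childAdded(ObservedNode parent, ObservedNode child);
}

// Hypothetical node holding an internal copy of the partition's observer.
class ObservedNode {
    private ObservedNode parent;
    private final List<ObservedNode> children = new ArrayList<>();
    // Internal only: there is no public per-node registration method.
    private PartitionListener effectiveObserver;

    // Public API, usable only on a partition (root); pass null to unregister.
    void setPartitionObserver(PartitionListener listener) {
        if (parent != null) {
            throw new IllegalStateException("not a partition");
        }
        propagate(listener);
    }

    void addChild(ObservedNode child) {
        if (child.parent != null) {
            child.parent.children.remove(child);
        }
        child.parent = this;
        children.add(child);
        child.propagate(effectiveObserver); // the attached subtree inherits the reference
        if (effectiveObserver != null) {    // fast path: a single null check
            effectiveObserver.childAdded(this, child);
        }
    }

    // Copies the reference into this node and all of its descendants.
    private void propagate(PartitionListener listener) {
        effectiveObserver = listener;
        for (ObservedNode c : children) {
            c.propagate(listener);
        }
    }

    // The boolean variant would store only this answer instead of the reference.
    boolean isObserved() {
        return effectiveObserver != null;
    }
}
```

The trade-off in this sketch is one extra field per node and a subtree traversal on registration or attachment, in exchange for a mutation path that never walks the ancestors.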

@ftomassetti

Before doing any optimization on this I'd measure the actual impact.

Normally I would postpone optimizations, but for years I have been fighting a lot of performance issues, and at this time we are using LionWeb Java for projects where we cannot afford worse performance. For this reason I want to be careful and ensure we do not merge anything that would make performance significantly worse when observers are not used. A performance hit may be acceptable for those who need to use observers, but others should not experience any change.

The problem with measuring the impact is that it is easy to do "in the wild", by using the library for a while on different use cases, but I find it difficult to reproduce "in the lab". For this reason I tend to err on the side of caution.

If we see some, we could store an internal reference (or null) from each node to the observer -- without having an individual node observer API.

So that would mean having a field with the pointer to the observer, with the only difference that the public methods to register/unregister observers would work only for partitions, right?

We could even store only a boolean if there's any observer.

That would also be reasonable.

github-actions bot commented Aug 24, 2025

Code Coverage

Scope                              Coverage  Change
Overall Project                    65.11%    -1.07%   🍏
Files changed                      76.87%             🍏

File                               Coverage  Change
AbstractNode.java                  100%               🍏
ClassifierInstance.java            100%               🍏
ReferenceValue.java                100%               🍏
CompositePartitionObserver.java    100%               🍏
Classifier.java                    97.26%    -2.74%   🍏
ProxyNode.java                     96.24%             🍏
Node.java                          93.55%             🍏
DynamicNode.java                   88.54%             🍏
M3Node.java                        83.37%    -11.09%  🍏
DynamicClassifierInstance.java     69.58%    -13.11%  🍏
AbstractClassifierInstance.java    53.25%    -22.17%  🍏

@ftomassetti ftomassetti force-pushed the feature/node-observer branch from cf52bbf to 3c980f1 Compare August 30, 2025 07:00
@ftomassetti ftomassetti marked this pull request as ready for review August 30, 2025 09:28