Skip to content

Conversation

Danil-Grigorev
Copy link
Member

Motivation

Exploring alternatives for performing reflection on multiple watches, and storing them in a shared cache. Purely a POC at the moment, inspired by #1687

Solution

since the question has showed up a few times.

ultimately this is awkward because the hard generic use in ObjectRef<K>
and in Store<K> which ties them to a particular resource.

It feels like it shouldn't have to be this way because the Store::get
takes an ObjectRef, which feels dynamically typed, but actually isnt.

Tried out a couple of potentials and had to just resort to the dumb thing instead.

Maybe there's some other things we can do for dynamics. Maybe worth exploring in an issue?

Signed-off-by: clux <[email protected]>
@Danil-Grigorev Danil-Grigorev force-pushed the native-multi-reflector branch 3 times, most recently from 450df09 to d7906eb Compare February 10, 2025 16:15
Copy link

codecov bot commented Feb 10, 2025

Codecov Report

Attention: Patch coverage is 21.73913% with 54 lines in your changes missing coverage. Please review.

Project coverage is 75.3%. Comparing base (0bcc625) to head (f860c39).
Report is 45 commits behind head on main.

Files with missing lines Patch % Lines
kube-runtime/src/reflector/dispatcher.rs 0.0% 22 Missing ⚠️
kube-runtime/src/reflector/multi_dispatcher.rs 0.0% 15 Missing ⚠️
kube-runtime/src/reflector/mod.rs 0.0% 7 Missing ⚠️
kube-core/src/dynamic.rs 0.0% 3 Missing ⚠️
kube-core/src/metadata.rs 0.0% 3 Missing ⚠️
kube-core/src/resource.rs 0.0% 2 Missing ⚠️
kube-runtime/src/reflector/store.rs 88.3% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1692     +/-   ##
=======================================
- Coverage   75.8%   75.3%   -0.5%     
=======================================
  Files         84      85      +1     
  Lines       7630    7811    +181     
=======================================
+ Hits        5779    5877     +98     
- Misses      1851    1934     +83     
Files with missing lines Coverage Δ
kube-core/src/gvk.rs 75.5% <ø> (ø)
kube-core/src/resource.rs 58.9% <0.0%> (-1.3%) ⬇️
kube-runtime/src/reflector/store.rs 95.8% <88.3%> (-1.1%) ⬇️
kube-core/src/dynamic.rs 72.6% <0.0%> (-4.5%) ⬇️
kube-core/src/metadata.rs 62.2% <0.0%> (-5.4%) ⬇️
kube-runtime/src/reflector/mod.rs 92.1% <0.0%> (-7.9%) ⬇️
kube-runtime/src/reflector/multi_dispatcher.rs 0.0% <0.0%> (ø)
kube-runtime/src/reflector/dispatcher.rs 84.7% <0.0%> (-11.5%) ⬇️

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Danil-Grigorev Danil-Grigorev force-pushed the native-multi-reflector branch 5 times, most recently from cb67680 to f26a217 Compare February 13, 2025 01:03
Comment on lines 160 to 163
} = DynamicList::deserialize(d)?;
let mut resources = vec![];
for o in items.iter_mut() {
o.types = Some(types.clone().singular());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the ping @clux, just wanted to ask about this change. DynamicObject lists are affected by kubernetes/kubernetes#80609, and this performs correction.

k8s-openapi already does this for all generated objects during individual object deserialization, so the issue appears only with unstructured resource list.

Is this something which can be done in this instance? Issue may not go away for a while.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..pooosssibly. It's hard to tell from this change alone; we could preserve the kind like this with unit tests and care, but in theory overriding it to fix a bug is possible like this. I'm struggling a bit to visualise the direct use case for doing this conversion. Don't we already have the type information of the kind we are doing dynamic api calls for as an ApiResource?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is passed directly, then largely it is available, but it is on dev to store it and pass it around. Internals don't do that, and returned objects in the list as Vec<DynamicObject> always have empty type meta as the ListMeta goes out of scope, and the only clue is stored there.

The story is different in static types, as it is custom serialization which fixes that: https://github.com/Arnavion/k8s-openapi/blob/30a268f669650450fd4391d3785a078dbbf1d1ee/src/v1_32/api/core/v1/pod.rs#L153-L154

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now removed, for context: #1692 (comment)

Copy link
Member

@clux clux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some quick comments

Comment on lines 55 to 69
/// Construct a new `TypeMeta` for the object from the list `TypeMeta`.
///
/// ```
/// # use k8s_openapi::api::core::v1::Pod;
/// # use kube_core::TypeMeta;
///
/// let mut type_meta = TypeMeta::resource::<Pod>();
/// type_meta.kind = "PodList".to_string();
/// assert_eq!(type_meta.clone().singular().kind, "Pod");
/// assert_eq!(type_meta.clone().singular().api_version, "v1");
/// ```
pub fn singular(self) -> Self {
Self {
kind: self.kind.strip_suffix("List").unwrap_or(&self.kind).to_string(),
..self
}
}
Copy link
Member

@clux clux Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this fn is nice by itself. but it relies on context that is not clear from the name;

  • it's not clear that it works on a List type
  • it's not clear it's a noop on a non-list type

maybe it's better to have this fn named as fn singularize_list to resolve this.
feel free to do this as it's own pr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Comment on lines 292 to 293
fn gvk(res: &Self::Resource) -> GroupVersionKind {
res.type_meta().try_into().unwrap_or_default()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have derived Default on GroupVersionKind for this afaikt, but this is not particularly useful. I would rather not derive this Default (because it's not useful in the default state), and have this fn return an Option

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed.

Comment on lines 17 to 20
pub trait CacheWriter<K> {
/// Applies a single watcher event to the store
fn apply_watcher_event(&mut self, event: &watcher::Event<K>);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kind of makes sense to traitify at some point. But I am not sure we are ready to do this yet. There are uncertainties around how it operates (e.g. the delete event issue).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now removed. Only one potential change remains - apply_watcher_event with Arc<K> to remove the need to clone underlying object all the time.

}

///
pub trait TypedResource: Resource + Sized {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this trait differ from one of our existing ones here? How does this help out with solving the issue? I see a lot of trait magic here, and it's non-trivial for me to try to decipher this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a problem with anything else available. DynamicObject is a special case of resource, and unlike any typed resource it has types field we can use.

Unfortunately for anything generated by k8s-openapi or derive CustomResourceDefinition types, TypeMeta is available from constant kind, group and version. There is no type field we can use, so data, even if returned from API server, is lost.

This ✨ magic ✨ makes it possible to have GVK of a resource available as if in a blanket implementation, and it is used for identification of resources in dynamic watchers cache, for routing of events.

This dynamic cache can later be used as a source to establish dynamic watches on statically typed resources, if used as a stream which filters resources by GVK and serializes it to the expected type. This is not a part of this implementation yet, thought.

Such cache, if passed around one or multiple controllers, may serve as an up-to-date state of all watched objects for read operations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JIC: I removed this change as after some debugging it appears that:

  • List returns no TypeMeta on the resource. It is the only place where this trait and additional serialization step is needed.
  • When using streamingList watch option, enabled by default starting from k/k 1.32 no list requests are sent to k8s API server, so it is not needed.

This implementation is therefore focused on 1.32 clusters with streamingList setting enabled for watchers.

Comment on lines 160 to 163
} = DynamicList::deserialize(d)?;
let mut resources = vec![];
for o in items.iter_mut() {
o.types = Some(types.clone().singular());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..pooosssibly. It's hard to tell from this change alone; we could preserve the kind like this with unit tests and care, but in theory overriding it to fix a bug is possible like this. I'm struggling a bit to visualise the direct use case for doing this conversion. Don't we already have the type information of the kind we are doing dynamic api calls for as an ApiResource?

impl<K> TypedResource for K
where
K: Resource,
(K, K::DynamicType): TypedResourceImpl<Resource = K>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol. yeah, this is a bit too much magic imo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now removed, for context: #1692 (comment)

@Danil-Grigorev Danil-Grigorev force-pushed the native-multi-reflector branch 4 times, most recently from fa82100 to c01f5fe Compare February 26, 2025 12:30
@Danil-Grigorev Danil-Grigorev changed the title Native multi reflector Broadcast shared reflector Feb 26, 2025
Signed-off-by: Danil-Grigorev <[email protected]>
Signed-off-by: Danil-Grigorev <[email protected]>
Signed-off-by: Danil-Grigorev <[email protected]>
Signed-off-by: Danil-Grigorev <[email protected]>
@Danil-Grigorev Danil-Grigorev force-pushed the native-multi-reflector branch from ceac3c1 to 137aa41 Compare March 3, 2025 21:48
@Danil-Grigorev Danil-Grigorev force-pushed the native-multi-reflector branch from 137aa41 to 8770710 Compare March 3, 2025 21:52
Signed-off-by: Danil-Grigorev <[email protected]>
Signed-off-by: Danil-Grigorev <[email protected]>
Signed-off-by: Danil-Grigorev <[email protected]>
@Danil-Grigorev
Copy link
Member Author

Danil-Grigorev commented Apr 4, 2025

As an update here - this change is non breaking since it builds on streams, and so it was simply implemented downstream on stream based controllers (works fine and allows to build granular cache). I'm thinking to close this PR, and look into other areas to improve once I would be able to.

  1. Mainly the gap here is currently in inability to uniquely identify Init, initApply and InitDone events per stream, which would allow to re-populate or remove specific objects from cache in case when only one stream from multiple is restarted or is closed.

  2. Other issue is that events from source stream(s) are consumed immediately, so if the downstream controllers are registered/polled later, some events maybe missed.

And most importantly, this setup works perfectly (except 1 and 2) with watchlist semantic, which is enabled by default in k/k 1.32 and is not affected by missing type meta issue on listed resources (as a LIST vs WATCH used internally in event propagation). But this is still a couple versions away from MSRV.

Signed-off-by: Danil-Grigorev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants