-
Notifications
You must be signed in to change notification settings - Fork 2
FCP-1434: Explicit wildcard support for snapshot cache #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
FCP-1434: Explicit wildcard support for snapshot cache #29
Conversation
…added. Clean up tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nil check above prevents sending empty responses when filtering returns no matching
resources for non-wildcard subscriptions, preserving the xDS "watch stays open" behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should likely be changed to "if non-wildcard" rather than "if has resources"
| @@ -598,10 +608,10 @@ func TestAvertPanicForWatchOnNonExistentSnapshot(t *testing.T) { | |||
| // Create watch. | |||
| req := &cache.Request{ | |||
| Node: &core.Node{Id: "test"}, | |||
| ResourceNames: []string{"rtds"}, | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed test bug: resource name in request didn't match snapshot ("rtds" vs "one-second").
New filtering logic correctly returns nil for non-existent resources, exposing the mismatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure why these tests needed this mock struct. The test can be accomplished without, though I was hesitant to make an out-of-scope change
valerian-roche
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe overall the code might need a bit more refactor on the respond vs. createResponse functions, as well as within createDeltaResponse at this stage
The alternative would be to fully redesign as the linear cache but might be out of scope here
| @@ -54,6 +72,47 @@ func NewSnapshot(version string, resources map[resource.Type][]types.Resource) ( | |||
| return &out, nil | |||
| } | |||
|
|
|||
| // NewSnapshotWithExplicitWildcard creates a Snapshot where each resource | |||
| // explicitly specifies whether it should be returned as part of wildcard requests. | |||
| func NewSnapshotWithExplicitWildcard( | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: given the functional implication, I'm wondering if we should rename this "default" instead
Wildcard is the technical name within the xDS specification, but it does not necessarily mean much in the context of users
| @@ -54,6 +72,47 @@ func NewSnapshot(version string, resources map[resource.Type][]types.Resource) ( | |||
| return &out, nil | |||
| } | |||
|
|
|||
| // NewSnapshotWithExplicitWildcard creates a Snapshot where each resource | |||
| // explicitly specifies whether it should be returned as part of wildcard requests. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you specify that it is mainly for OdCDS currently?
| out.Resources[index] = NewResourcesWithTTL(version, rawResources) | ||
|
|
||
| // Only store wildcard map entry if there are resources of this type | ||
| // An empty set means no resources are wildcard for this type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the functional interpretation of this? Is it that the type uses all resources for wildcard or none?
|
|
||
| // Only store wildcard map entry if there are resources of this type | ||
| // An empty set means no resources are wildcard for this type | ||
| if len(snapshotResources) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check needed? If no resources exist it's fine to set an empty set
| // IsResourceWildcard checks if a specific resource should be returned for wildcard requests. | ||
| // If wildcardMap is nil, all resources are considered wildcard-eligible (backward compatibility). | ||
| // If wildcardMap is not nil, only resources explicitly marked as wildcard are eligible. | ||
| func (s *Snapshot) IsResourceWildcard(typeURL resource.Type, resourceName string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: maybe ShouldReturnOnWildcard or WatchedByDefault
| @@ -195,7 +248,7 @@ func (cache *snapshotCache) sendHeartbeats(ctx context.Context, node string) { | |||
| for id, watch := range info.watches { | |||
| // Respond with the current version regardless of whether the version has changed. | |||
| version := snapshot.GetVersion(watch.Request.GetTypeUrl()) | |||
| resources := snapshot.GetResourcesAndTTL(watch.Request.GetTypeUrl()) | |||
| resources := getResourcesAndTTLForSubscription(snapshot, watch.Request.GetTypeUrl(), watch.subscription.SubscribedResources(), watch.subscription.IsWildcard()) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: why not just pass in the watch or subscription directly?
| // for instance if the subscription is newly wildcard. | ||
| for r := range snapshot.GetResources(request.GetTypeUrl()) { | ||
| for r := range snapshot.GetWildcardResources(request.GetTypeUrl()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must also assess non-wildcard resources as above.
Likely would better be rewritten as always checking "subscribed" and also checking wildcard when applicable (instead of if/else)
Also shows there might be a missing test
| @@ -485,6 +538,10 @@ func (cache *snapshotCache) cancelWatch(nodeID string, watchID int64) func() { | |||
| // Respond to a watch with the snapshot value. The value channel should have capacity not to block. | |||
| // TODO(kuat) do not respond always, see issue https://github.com/envoyproxy/go-control-plane/issues/46 | |||
| func (cache *snapshotCache) respond(ctx context.Context, watch ResponseWatch, resources map[string]types.ResourceWithTTL, version string, heartbeat bool) error { | |||
| if resources == nil { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not exit on initial query for wildcard, even if nothing is to be returned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should likely be changed to "if non-wildcard" rather than "if has resources"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wildcard support should be added here
Part 1 of implementing explicit wildcard support for xds-control-plane.
Context
We leverage the Snapshot Cache in go-control-plane, where delta for a node ID snapshot is proliferated to Envoys with that node ID when there is a change. This is acceptable in the current state where Fabric Proxy declares the destinations (clusters) it needs at start time in YAML.
The introduction of ODCDS makes the current state unpalatable, even though it is functional. The issue is that we do not want a mass proliferation of new clusters to all Envoys with node ID X when one pod with that node ID X requests a resource. Not solving this could create significant load on our control plane, especially for Fabric Proxy users with hundreds or thousands of destinations.
To solve this, we should add support for explicit wildcard support for control plane. By default, a snapshot resource will be considered wildcard and proliferated to all Envoys with the relevant node ID. By contrast, we will allow certain resources to not be wildcard, allowing ODCDS clusters to only be proliferated on demand.
Implementation details
This PR aims to make changes in a way that are fully backward compatible for Delta, SoTW, and legacyWildcardEnabled.
Exported functions
NewSnapshotWithExplicitWildcard: creates a Snapshot where each resource explicitly specifies whether it should be returned as part of wildcard requests.This allows the snapshot to contain wildcard metadata for every resource. E.g.
The following are public accessors for snapshots to specifically get resources marked as wildcard:
GetWildcardResourcesGetWildcardResourcesAndTTL