(feat) initial types and interfaces for pluggable data layer #1154
Conversation
Hi @elevran. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
✅ Deploy Preview for gateway-api-inference-extension ready!
pkg/epp/datalayer/datasource.go
Outdated
// TODO: the following is useful for a data source that operates on
// endpoints and might not be relevant for "global/system" collectors which
// might not need the concept of an endpoint. This can be split, if needed,
// to a separate interface in the future.

// AddEndpoint adds an endpoint to collect from.
AddEndpoint(ep Endpoint) error

// RemoveEndpoint removes an endpoint from collection.
RemoveEndpoint(ep Endpoint) error
I see you wrote it yourself in the comments; it might make sense to have a DataSource interface without these functions, and an additional EndpointDataSource interface that embeds DataSource and defines these functions.
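The suggested split might be sketched roughly as below. All names besides DataSource, Endpoint, AddEndpoint and RemoveEndpoint are hypothetical, and the fake implementation exists only to show that the embedding composes as intended; this is not the PR's actual code.

```go
package main

import "fmt"

// Endpoint is reduced to a trivial placeholder for this sketch.
type Endpoint interface {
	Name() string
}

// DataSource keeps only the behavior every source needs (endpoint-agnostic).
type DataSource interface {
	Name() string
}

// EndpointDataSource embeds DataSource and adds the per-endpoint methods,
// as suggested in the review comment above.
type EndpointDataSource interface {
	DataSource
	AddEndpoint(ep Endpoint) error
	RemoveEndpoint(ep Endpoint) error
}

// fakeEndpoint and fakeSource are minimal, illustrative implementations.
type fakeEndpoint struct{ name string }

func (e fakeEndpoint) Name() string { return e.name }

type fakeSource struct {
	endpoints map[string]Endpoint
}

func newFakeSource() *fakeSource {
	return &fakeSource{endpoints: map[string]Endpoint{}}
}

func (s *fakeSource) Name() string { return "fake" }

func (s *fakeSource) AddEndpoint(ep Endpoint) error {
	s.endpoints[ep.Name()] = ep
	return nil
}

func (s *fakeSource) RemoveEndpoint(ep Endpoint) error {
	delete(s.endpoints, ep.Name())
	return nil
}

func main() {
	// compile-time check that the embedding is satisfied
	var src EndpointDataSource = newFakeSource()
	_ = src.AddEndpoint(fakeEndpoint{name: "pod-a"})
	fmt.Println(src.Name())
}
```

A source that never deals with endpoints would then implement only DataSource, while per-endpoint scrapers implement EndpointDataSource.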
Yes, that's a possibility.
As part of #1023 we agreed to push the non-endpoint data source out until we have concrete requirements, so I think we can keep it as is for now (i.e., we don't have enough to go on at this time to rationalize what is and isn't shared between the per-endpoint and non-endpoint data sources).
I'm fine with explicit, tight scope for now. And only expanding as needed.
The future will give us more information about how we want to expand our interface, so if we can postpone decisions to make more informed ones, that is fantastic.
pkg/epp/datalayer/datasource.go
Outdated
// ExpectedType defines the type expected by the extractor. It must match
// the DataSource.OutputType() the extractor registers for.
ExpectedType() reflect.Type
is this also used internally only?
I don't think so - it is needed by the DataSource to validate that the extractor can process its output.
I prefer to validate that a data source and an extractor are compatible at registration time, so at least one of them needs to have the type as part of its interface.
But if you have an idea on how to implement the type checks differently, I'd be happy to remove it from here as well.
Prefer to validate that data source and extractor are compatible at registration time, so at least one needs to have the type as part of the interface.

Agreed, we should have some 'pre-flight checks' to make sure everything is buttoned up before marking EPP ready.
I see that this function exists below, and have commented there
pkg/epp/datalayer/datasource.go
Outdated
var (
	// DefaultDataSources is the system default data source registry.
	DefaultDataSources = DataSourceRegistry{}
)
I thought this is initialized on bootstrap and remains static. In such a case we don't need locking, and it might be implemented the same way as the plugins registry (just as a map[string]DataSource).
We need to guarantee that concurrent additions are also safe, or that they are not expected/allowed.
I'm fine with making the assumption that data sources are all added before reads happen and are added from the same (e.g., main) go routine.
Am fine with making the assumption that data sources are all added before reads happen and are added from the same (e.g., main) go routine.
Yeah, I think that is a safe assumption. Personally, I am cautious to allow any dynamic reconfiguration for our pluggable systems. That would allow for some pretty wild and hard-to-defend behavior imo
Personally, I am cautious to allow any dynamic reconfiguration for our pluggable systems. That would allow for some pretty wild and hard-to-defend behavior imo
Does it make sense to make it explicit (e.g., introduce an explicit "freeze data sources" call so that they can't be changed afterwards)?
changed here to use sync.Map as well
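A minimal sketch of what a sync.Map-backed registry could look like (illustrative names and methods; the actual DataSourceRegistry in the PR may differ):

```go
package main

import (
	"fmt"
	"sync"
)

// DataSource is reduced to its name for this sketch.
type DataSource interface {
	Name() string
}

// DataSourceRegistry maps source names to sources. The zero value is ready
// to use, and sync.Map makes concurrent registration and lookup safe even
// if the "register only at bootstrap" assumption is ever relaxed.
type DataSourceRegistry struct {
	sources sync.Map // name -> DataSource
}

// Register adds a source, failing on duplicate names.
func (r *DataSourceRegistry) Register(ds DataSource) error {
	if _, loaded := r.sources.LoadOrStore(ds.Name(), ds); loaded {
		return fmt.Errorf("data source %q already registered", ds.Name())
	}
	return nil
}

// Get returns the named source, if present.
func (r *DataSourceRegistry) Get(name string) (DataSource, bool) {
	v, ok := r.sources.Load(name)
	if !ok {
		return nil, false
	}
	return v.(DataSource), true
}

// metricsSource is a toy source used for illustration.
type metricsSource struct{}

func (metricsSource) Name() string { return "metrics" }

func main() {
	var reg DataSourceRegistry
	_ = reg.Register(metricsSource{})
	if ds, ok := reg.Get("metrics"); ok {
		fmt.Println(ds.Name())
	}
}
```

LoadOrStore gives the duplicate check and the insert in one atomic step, which is what makes concurrent registration safe without an explicit mutex.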
pkg/epp/datalayer/metrics.go
Outdated
)

// Addressable supports getting an IP address and a namespaced name.
type Addressable interface {
why Addressable in metrics file?
Wanted to keep it closest to where it is used (the metrics client).
It seems too small to have its own file, but I'm open to suggestions on a better fitting location.
Another option is to move it next to PodInfo (which implements Addressable)
/ok-to-test
Left some comments!
I have some higher level questions around the extractor model and how it's intended to be used. If our v1 of the pluggable data layer is scoped specifically to endpoints, I'm also wondering why we aren't feeding into the datasource representation of the endpoints.
Nothing hard-blocking, but probably better to discuss earlier rather than later.
Thanks Etai!
// DataSource is an interface required from all datalayer data collection
// sources.
type DataSource interface {
This implies that each datasource will have its own list of endpoints that all need to be kept up to date, rather than feeding into a centralized list of endpoints, just with its custom data included.
OOC, why did we choose this path?
There are two possible design options. I'm fine with either, so if there's a strong preference one way or another, I'm happy to change it.

- Data sources keep references to endpoints (e.g., in the case of /metrics, the source needs the IP address and a way to set metrics on the endpoint itself). This is the rationale I had in mind for Add/RemoveEndpoint.
- Endpoints keep references (directly or indirectly) to data sources (see bullets below).

I was thinking of the first option as more flexible (e.g., a data source that records the node name for each endpoint does not need to be continuously called for the endpoint... the data source would not need to create a go routine for this endpoint at all), but engineering-wise both work.

If we prefer a model where there is explicitly one go routine per endpoint (possible downsides: head-of-line blocking by "slow/failing" data sources impacts collection by others, all data sources are applicable to all endpoints, etc.), the design would change as follows:

- The data store owns the go routines and calls start/stop when endpoints are added and removed. There is a "collection context/state" stored in the datastore per endpoint, and there should not be a start/stop RefreshLoop method defined on endpoints - they're "pure data structures".
- On every "tick", the go routine code calls all data sources to CollectFor(ep). Internally the data source can still call multiple extractors, so the main change is the removal of Start/Stop methods and the addition of CollectFor to the data source.

WDYT? The second option (a go routine per endpoint calling all data sources, passing each the endpoint) is possibly closer to how it is done in GIE today, so we can continue along that path.
@kfswain @nirrozenbaum this is a schematic snippet of what the second option may look like:
type Collector interface {
	Name() string
	CollectFor(Endpoint) error
	AddExtractor(Extractor)
}

type Extractor interface {
	Name() string
	ExpectedType() reflect.Type
	Extract(interface{}, Endpoint)
}

// when the data store is notified of a new Pod, it starts a go routine for
// it (there is a scraping context that saves a reference to the Endpoint,
// cancellation contexts, etc.)
func (dlc *CollectionState) CollectionCycle() {
	ep := dlc.Endpoint()
	for _, c := range datalayer.Collectors() { // or we can store collectors on the collection state directly
		if err := c.CollectFor(ep); err != nil {
			// handle failures - such as exponential backoff, disabling collectors after some time...
		}
		// if successful, collector has called extractors with ep for updating
	}
}
Let me know if this seems more "natural/aligned".
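The schematic above can be fleshed out into a small runnable sketch, with a fake counting collector standing in for a real /metrics scraper. CollectionState, CollectionCycle and Collector follow the snippet in the discussion; the fake types and error-aggregation detail are illustrative assumptions.

```go
package main

import "fmt"

// Endpoint is a trivial placeholder for this sketch.
type Endpoint interface {
	Name() string
}

// Collector is called once per endpoint per tick, per the second option.
type Collector interface {
	Name() string
	CollectFor(ep Endpoint) error
}

// CollectionState is the per-endpoint scraping context owned by the data
// store; collectors are stored directly on it here for simplicity.
type CollectionState struct {
	endpoint   Endpoint
	collectors []Collector
}

// CollectionCycle runs one "tick": every collector is invoked with the
// endpoint; a failure is recorded but does not block the other collectors,
// limiting the head-of-line blocking concern raised above.
func (cs *CollectionState) CollectionCycle() []error {
	var errs []error
	for _, c := range cs.collectors {
		if err := c.CollectFor(cs.endpoint); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", c.Name(), err))
		}
	}
	return errs
}

type fakeEndpoint struct{ name string }

func (e fakeEndpoint) Name() string { return e.name }

// countingCollector just records how many times it was driven.
type countingCollector struct {
	name  string
	calls int
}

func (c *countingCollector) Name() string { return c.name }
func (c *countingCollector) CollectFor(ep Endpoint) error {
	c.calls++
	return nil
}

func main() {
	col := &countingCollector{name: "metrics"}
	cs := &CollectionState{endpoint: fakeEndpoint{name: "pod-a"}, collectors: []Collector{col}}
	cs.CollectionCycle()
	fmt.Println(col.calls) // one call per collector per cycle
}
```

In the real design the data store would own a go routine that invokes CollectionCycle on a timer and cancels it when the Pod is removed.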
Changed so DataSource is no longer informed of and tracks endpoints.
I'm with you here. The second option, from my perspective, separates concerns more cleanly:
- Let the datastore focus on the actual collection mechanism. It does potentially create head of line blocking, agreed. But this seems acceptable with proper timeouts/backoff.
- the Pod controller remains the single source of truth wrt endpoints and just needs to update the datastore endpoint list
- Custom endpoints just focus on what data they care about, and how to get it
this LGTM.
/approve
I'm leaning towards the second option outlined in the discussion below this comment, where Endpoints are not updated into the DataSource but are instead managed exclusively in the data store, which calls collectors with the endpoint as a parameter (i.e., the data store manages the collection go routines).
Signed-off-by: Etai Lev Ran <[email protected]>
- DataSource is no longer tracking Endpoints.
- Removed unused interfaces.
- Changes to exported scope.
- Move Addressable to podinfo.go.

Signed-off-by: Etai Lev Ran <[email protected]>
Sorry for the delay @elevran - I checked this PR a few days ago and it didn't look like you had responded to all comments. Either I looked just before you commented or I'm going blind :)
/lgtm
Thanks!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: elevran, kfswain, nirrozenbaum. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
This PR includes the types and interfaces needed to support a pluggable data source.
It is the first PR of several aimed at refactoring the current pkg/epp/backend into the design described in #1023.
Specifically:
- PodMetrics is extended to include PodInfo, Metrics and AttributeMap (an extensible set of attributes, to be used by pluggable data sources).
- DataSource encapsulates the scraping loop, with Extractors collecting specific items from the raw object being scraped (e.g., /metrics). A specific implementation, matching the current backend/metrics, will be added under datalayer/metrics.
- The Endpoint interface currently includes management of the refresh loop. This will be moved under control of the DataSource in a later PR. Similarly, MetricClient and Factory might become redundant.

Next PRs:
- Replace backend.Pod => datalayer.PodInfo across datastore, tests, etc. This will also introduce mocks for tests.