Description
Hi! Thanks again for a really useful library.
I'm working on moving centrally managed schemas into the application that actually owns them, and I've stumbled on an issue with the register_schema method.
We're using Avro encoding for the keys as well as the values in our Kafka messages, and in this instance the key schema is just a simple string schema. All schemas owned by the application share the same key schema.
When testing schema creation from the application using kafkit, I noticed that only the key schema for the first subject is actually created in the schema registry. The others get short-circuited in the register_schema method here:
kafkit/kafkit/registry/sansio.py
Lines 369 to 372 in 1603628
# look in cache first
try:
    schema_id = self.schema_cache[schema]
    return schema_id
This becomes a problem when other clients rely on the key schema being registered under their own subjects in the schema registry as well: they crash when they can't find a key schema for those subjects.
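To make the behaviour concrete, here is a minimal sketch of the problem using a toy stand-in. `ToyRegistry`, `server_subjects`, and the subject names are hypothetical and are not kafkit's API; the only thing copied from kafkit is the "look in cache first" pattern, where the cache is keyed by the schema alone.

```python
import json


class ToyRegistry:
    """Hypothetical stand-in for a registry client; not kafkit's real API."""

    def __init__(self):
        self.schema_cache = {}     # serialized schema -> schema ID
        self.server_subjects = {}  # subject -> schema ID, as the "server" sees it
        self._next_id = 1

    def register_schema(self, schema, subject):
        key = json.dumps(schema, sort_keys=True)
        # Look in cache first: on a hit we return before the subject is used.
        try:
            return self.schema_cache[key]
        except KeyError:
            pass
        # Cache miss: pretend to register the schema under the subject.
        schema_id = self._next_id
        self._next_id += 1
        self.server_subjects[subject] = schema_id
        self.schema_cache[key] = schema_id
        return schema_id


key_schema = {"type": "string"}
registry = ToyRegistry()
first_id = registry.register_schema(key_schema, "topic-a-key")
second_id = registry.register_schema(key_schema, "topic-b-key")

# Only the first subject reached the "server"; the second call was a cache hit.
registered_subjects = sorted(registry.server_subjects)
print(registered_subjects)  # ['topic-a-key']
```

Both calls return the same ID, which is fine for serialization, but the second subject never gets the schema attached to it.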
Update: A quick workaround is to reset the registry's schema cache manually on each iteration of the loop over the schemas I need to make sure exist in the Confluent schema registry, like so:
registry._schema_cache = SchemaCache()
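For context, this is roughly what the loop looks like, sketched against a toy stand-in rather than kafkit itself. `ToyRegistry`, its attributes, and the subject names are hypothetical; the toy also assumes the server hands out the same ID for an identical schema regardless of subject. With kafkit, the reset line would be the `SchemaCache()` assignment shown above.

```python
class ToyRegistry:
    """Hypothetical stand-in for a registry client; not kafkit's real API."""

    def __init__(self):
        self._schema_cache = {}    # schema string -> schema ID (client side)
        self._server_ids = {}      # schema string -> schema ID (server side)
        self.server_subjects = {}  # subject -> schema ID (server side)

    def register_schema(self, schema, subject):
        # Client-side short circuit, keyed by schema only.
        if schema in self._schema_cache:
            return self._schema_cache[schema]
        # Assumption: identical schemas share one ID across subjects.
        schema_id = self._server_ids.setdefault(schema, len(self._server_ids) + 1)
        self.server_subjects[subject] = schema_id
        self._schema_cache[schema] = schema_id
        return schema_id


KEY_SCHEMA = '{"type": "string"}'
registry = ToyRegistry()

for subject in ("topic-a-key", "topic-b-key", "topic-c-key"):
    # Workaround: start each iteration with an empty cache so the
    # registration request is always sent for this subject.
    registry._schema_cache = {}
    registry.register_schema(KEY_SCHEMA, subject)

subjects_after_workaround = dict(registry.server_subjects)
print(subjects_after_workaround)
```

The cost is one extra request per subject, and the workaround reaches into a private attribute, so it may break if the internals change.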
I'm happy to provide a PR with a solution once I know a bit more about the rationale and use case for short-circuiting schema registration.