kafkit fails to register schema under new subject if schema_id already exists in cache #4

@ulrikjohansson

Hi! Thanks again for a really useful library.

I'm working on moving centrally managed schemas into the application that actually owns them, and I've stumbled on an issue with the register_schema method.

We're using Avro encoding for the keys as well as the values in our Kafka messages, and in this instance the key schema is just a simple string schema. All schemas owned by the application share the same key schema.

I noticed that when testing schema creation from the application using kafkit, only the key schema for the first subject is actually created in the schema registry. The others get short-circuited in the register_schema method here:

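    # look in cache first
    try:
        schema_id = self.schema_cache[schema]
        return schema_id
    except KeyError:
        # assumed: the excerpt ends before this handler; on a cache miss
        # the method goes on to send the request to the registry
        pass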

This becomes a problem when other clients rely on the key schema being registered under the other subjects as well in the schema registry; they now crash when they can't find a key schema for their subjects.
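To make the behaviour concrete, here is a minimal sketch that reproduces it without kafkit or a running registry. `ToyRegistry`, its attributes, and the subject names are hypothetical stand-ins I made up for illustration; the only thing taken from kafkit is the "look in cache first" pattern quoted above, and the schema is a plain string here so it can be used as a dict key.

```python
class ToyRegistry:
    """Hypothetical in-memory stand-in for a registry client (not kafkit)."""

    def __init__(self):
        self.schema_cache = {}  # client side: schema -> schema_id
        self.subjects = {}      # "server" side: subject -> schema_id
        self._ids = {}          # "server" side: schema -> schema_id

    def register_schema(self, schema, subject):
        # look in cache first; note the lookup ignores the subject
        try:
            return self.schema_cache[schema]
        except KeyError:
            pass
        # cache miss: register the schema under the subject on the "server"
        schema_id = self._ids.setdefault(schema, len(self._ids) + 1)
        self.subjects[subject] = schema_id
        self.schema_cache[schema] = schema_id
        return schema_id


KEY_SCHEMA = '"string"'  # simple Avro string schema shared by all subjects

registry = ToyRegistry()
ids = [
    registry.register_schema(KEY_SCHEMA, subject)
    for subject in ("orders-key", "payments-key")  # made-up subject names
]
registered_subjects = sorted(registry.subjects)
```

Both calls return the same ID, but only the first subject ends up in `registry.subjects`: the second call returns from the cache before anything is sent to the "server".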

Update: A quick workaround for this problem is to reset the registry client's schema cache manually on each iteration when I loop through the schemas that need to exist in the Confluent Schema Registry, like so:

    registry._schema_cache = SchemaCache()

I'm happy to provide a PR with a solution once I know a bit more about the rationale and use case for short-circuiting schema registration.
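For discussion, one possible direction is to make the cache lookup subject-aware, so that a cached ID only short-circuits a registration for a subject that has already been seen. The sketch below is a toy under that assumption, not kafkit's implementation and not a proposed patch; `SubjectAwareToyRegistry` and all of its attributes are hypothetical names.

```python
class SubjectAwareToyRegistry:
    """Hypothetical in-memory stand-in whose cache is keyed by (subject, schema)."""

    def __init__(self):
        self.schema_cache = {}  # client side: (subject, schema) -> schema_id
        self.subjects = {}      # "server" side: subject -> schema_id
        self._ids = {}          # "server" side: schema -> schema_id

    def register_schema(self, schema, subject):
        key = (subject, schema)
        # only skip the request if this exact subject/schema pair was registered
        if key in self.schema_cache:
            return self.schema_cache[key]
        # the "server" reuses the ID of an identical schema across subjects
        schema_id = self._ids.setdefault(schema, len(self._ids) + 1)
        self.subjects[subject] = schema_id
        self.schema_cache[key] = schema_id
        return schema_id


KEY_SCHEMA = '"string"'

fixed = SubjectAwareToyRegistry()
fixed_ids = [
    fixed.register_schema(KEY_SCHEMA, subject)
    for subject in ("orders-key", "payments-key")  # made-up subject names
]
fixed_subjects = sorted(fixed.subjects)
```

With this keying, both subjects are registered and still share one schema ID, while a repeated call for the same subject and schema is served from the cache.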
