Open
Description
Which new version of Apache Kafka should we support?
Kafka 4 has been released. This is the first release that operates entirely without ZooKeeper and runs in KRaft mode by default.
KRaft has been officially available since the Kafka 3.9 release.
This means a new Kafka role must be introduced to replace the external ZooKeeper.
Kafka 4 also introduces a new consumer group protocol designed to improve rebalance performance and reduce downtime and latency. The minimum Java versions were raised to 11 (clients and Kafka Streams) and 17 (brokers, Connect and tools).
Release notes: https://archive.apache.org/dist/kafka/4.0.0/RELEASE_NOTES.html
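To make the new role more concrete, here is a minimal Python sketch that renders the core settings of a dedicated KRaft controller with a static quorum. The property keys are standard Kafka settings; the function name, the host names, the port and the plaintext listener are assumptions for illustration and are not taken from the operator.

```python
def render_controller_properties(node_id, voters, port=9093):
    """Render server.properties content for a dedicated KRaft controller.

    voters maps the node id of every controller in the quorum to its host name.
    All names and the port are hypothetical examples.
    """
    # Static quorum definition in the form "<id>@<host>:<port>,..."
    quorum = ",".join(f"{nid}@{host}:{port}" for nid, host in sorted(voters.items()))
    props = {
        "process.roles": "controller",
        "node.id": str(node_id),
        "controller.quorum.voters": quorum,
        "controller.listener.names": "CONTROLLER",
        "listeners": f"CONTROLLER://:{port}",
        # Plaintext only keeps the sketch short; a real deployment would use TLS.
        "listener.security.protocol.map": "CONTROLLER:PLAINTEXT",
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    # Hypothetical three-node controller quorum.
    hosts = {0: "controller-0", 1: "controller-1", 2: "controller-2"}
    print(render_controller_properties(0, hosts))
```

Brokers would use `process.roles=broker` and point at the same quorum instead of at a ZooKeeper connection string.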
Docker image
Current Status
- Support deploying Kafka with KRaft instead of ZooKeeper for consensus building #690
- Documentation: Update for kafka 4.x #876
- Update logging for Kafka 4.x #872
Next
- Kafka Demos: Update to Kafka 4.x demos#232
  - demo docs need to be rewritten to showcase the usage of Kafka but without `kcat`
- Replace `kcat` with Kafka client scripts
  - affects the Kafka operator, tests and demo documentation.
  - also implement this: Kafka: operator should create client.properties for CLI tools in the product container issues#768
- GracefulShutdown improvements: Currently a PreStop sleep hook is used in the controller to give brokers more time to offload when the cluster is shut down. The sleep hook is a beta feature until Kubernetes 1.34 and must be replaced, since we do not want to rely on beta features. This is timeboxed (4h): check whether e.g. autodetection of the Kubernetes version or an endpoint to query available features is possible, and whether we switch from the PreStop hook to a different implementation.
- Improve `AntiAffinities` controller / broker to ensure they are on different nodes?
  - @razvan: Currently the anti-affinity rules ensure that brokers are spread out as much as possible. Same for controllers. To also separate controllers from brokers, taints and tolerations are probably the better mechanism because they allow nodes to be provisioned accordingly. For example, broker nodes could require more resources than controllers.
- Liveness / Readiness (controller): Currently a TCP probe; improve it (e.g. check whether the quorum has been joined?)
  - @razvan: An alternative to the TCP probe would either have to use a lightweight process like `kcat` or an HTTP endpoint.
    - `kcat` doesn't support KRaft controllers
    - the Kafka REST proxy cannot be used because of the license restrictions
- Improve `PDB`s for broker (currently 1) or controller (currently 1)?
  - @razvan: Leave as is for now.
- 3.7.2 no dynamic quorum (bad for scaling) https://developers.redhat.com/articles/2024/11/27/dynamic-kafka-controller-quorum; documented here, do we want to suppress / warn within the operator?
- Discovery (currently just `host:port` combinations exposed for brokers, no other connection details (TLS))
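For the GracefulShutdown item above, autodetecting the Kubernetes version could look roughly like the following Python sketch. It assumes the operator has already fetched the JSON body of the API server's `/version` endpoint, which reports `major` and `minor` as strings. The function names and the constant are hypothetical; the 1.34 threshold is taken from the item itself.

```python
import re

# Per the item above, the PreStop sleep hook is a beta feature until Kubernetes 1.34.
SLEEP_HOOK_STABLE_SINCE = (1, 34)


def parse_version_info(info):
    """Extract (major, minor) from the JSON body of the /version endpoint.

    Some distributions append a suffix such as "+" to the minor field,
    so only the leading digits are used.
    """
    major = int(re.match(r"\d+", info["major"]).group())
    minor = int(re.match(r"\d+", info["minor"]).group())
    return major, minor


def sleep_hook_is_stable(info):
    """Return True if the cluster version no longer treats the sleep hook as beta."""
    return parse_version_info(info) >= SLEEP_HOOK_STABLE_SINCE


if __name__ == "__main__":
    # Example payloads shaped like /version responses (values are made up).
    print(sleep_hook_is_stable({"major": "1", "minor": "33+"}))
    print(sleep_hook_is_stable({"major": "1", "minor": "34"}))
```

A version check like this is only one option; querying the API server for the feature directly may turn out to be more robust and is part of what the timebox should answer.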
Next 2
The following issues are only partially (or not at all) implemented and tested.
- KRaft controller authorization (OPA)
- KRaft controller down-scaling and/or shutdown
- Kerberized KRaft controllers
- ZooKeeper to KRaft migration (manual guide, semi-automated, fully automated)
- https://strimzi.io/blog/2024/03/21/kraft-migration/
- https://docs.confluent.io/platform/current/installation/migrate-zk-kraft.html
- https://kafka.apache.org/documentation.html#upgrade
- Try out a manual migration from ZooKeeper to KRaft so that we have an answer for it
Status
Development: In Progress