In this article, we’ll look into how producers work internally and how a producer selects the partition when sending messages.
Kafka Series:
- Apache Kafka: Quick Start
- Kafka Connect: Quick Start
- Setup Kafka Cluster in Local/Docker
Here we have 3 Kafka brokers running on ports 29091, 29092, and 29093.
Sample Topic: kafka-prod
Producer:
A producer is an application that acts as a source of the data stream. In other publish/subscribe systems, the same component is known as a publisher or writer.
A producer sends each message/record to a specific topic (the topic is named when the record is created, or on the command line when using the console producer).
By default, the producer does not care which topic-partition a message is written to and spreads messages evenly across all partitions of the topic. It picks the partition from a hash of the record’s key, or in round-robin fashion if the record has no key. (Kafka clients from version 2.4 onward replace plain round-robin for keyless records with a sticky strategy that fills a batch for one partition before moving to the next.)
Producer defaults:
· Partition strategy for a message/record without a key: round-robin
· Partition strategy for a message/record with a key: records with the same key go to the same partition
· The producer picks the partition to which the record is sent
· The key defaults to null if not specified
Kafka uses the key to determine the target partition. The default strategy is to choose a partition based on a hash of the key, or to use a round-robin algorithm if the key is null.
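The strategy described above can be sketched in a few lines of Python. This is only a toy model, not Kafka's actual partitioner: the class name `SketchPartitioner` is hypothetical, and `zlib.crc32` is a stand-in for the murmur2 hash that Kafka's Java client applies to the serialized key bytes, so the partition numbers it produces will not match a real cluster.

```python
import itertools
import zlib


class SketchPartitioner:
    """Toy model of key-based partition selection (not Kafka's real code)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... sequence used for keyless records
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key):
        """Return the partition index for a record with the given key."""
        if key is None:
            # No key: hand out partitions in round-robin order
            return next(self._round_robin)
        # Key present: hash the key bytes and map the hash onto a partition.
        # ASSUMPTION: crc32 is used here only for illustration; Kafka uses murmur2.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


if __name__ == "__main__":
    partitioner = SketchPartitioner(num_partitions=3)
    for key in ["user-1", "user-2", "user-1", None, None, None]:
        print(key, "->", partitioner.partition(key))
```

Because the partition is derived only from the key and the partition count, the same key keeps landing on the same partition as long as the number of partitions does not change.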
· Kafka works with key-value pairs, but so far we have sent records with values only (i.e., we sent key-value pairs whose keys are null).
· To send full key-value pairs from the command line, we need to set two properties:
- parse.key: false by default; if set to true, every message must include a key.
- key.separator: the string that separates the key from the value on each input line (a tab by default).
The image above depicts how messages with the same key are placed in the same partition.
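To make the effect of parse.key and key.separator concrete, here is a rough Python sketch of how one input line could be split into a key and a value, for example when the console producer is started with `--property parse.key=true --property key.separator=:`. The function name `parse_line` and the use of `ValueError` are assumptions made for illustration; the real tool is written in Java and reports a missing key with its own exception.

```python
def parse_line(line, parse_key=False, key_separator="\t"):
    """Split one console input line into (key, value).

    A sketch of the behaviour described above, not Kafka's own code.
    """
    if not parse_key:
        # parse.key=false: the whole line is the value and the key stays null
        return None, line
    # Split on the first occurrence of the separator only
    key, separator, value = line.partition(key_separator)
    if not separator:
        # parse.key=true makes the key mandatory
        raise ValueError(f"No key found on line: {line!r}")
    return key, value


if __name__ == "__main__":
    print(parse_line("hello"))
    print(parse_line("user-1:hello", parse_key=True, key_separator=":"))
```

With parsing enabled, a line such as `user-1:hello` becomes the key `user-1` and the value `hello`, so every line typed with the key `user-1` is routed to the same partition.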