Introduction
Have you wondered how data is read from Kafka topics? You can think of topics as loosely analogous to the tables of a database that store the data. Kafka producers publish data to topics, and Kafka consumers read data from topics by subscribing to them. So before looking at Kafka consumers, we will first discuss what topics are.
In this article, we will discuss what Kafka topics, Kafka consumers, and consumer groups are, along with the advantages and disadvantages of Kafka consumers.
In Apache Kafka, a topic is a named stream of data. A single topic can be split into multiple partitions. Each partition is an ordered sequence of records, similar to an array, and every record in a partition is identified by an offset number. Kafka producers publish data to topics, and Kafka consumers read it.
Here is an example for understanding the topics and partitions in Kafka:
In the above image, there is one topic whose data is stored in three partitions (p0, p1, and p2), each with its own offset numbers. Kafka can hold any number of topics. A topic can be compared to a database table that stores data, but Kafka topics have no concept of constraints. We will discuss brokers briefly in the next subsection.
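To make the idea of partitions and offsets concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not Kafka's client API: the Topic class, its append method, and the topic name "orders" are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy in-memory stand-in for a topic; not a Kafka client API."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # One append-only list per partition (p0, p1, p2, ...).
        self.partitions = [[] for _ in range(self.num_partitions)]

    def append(self, partition: int, value: str) -> int:
        """Append a record to one partition and return its offset."""
        log = self.partitions[partition]
        log.append(value)
        return len(log) - 1  # offsets start at 0 and grow by 1 per record


topic = Topic("orders", num_partitions=3)
first = topic.append(0, "a")   # first record in p0
second = topic.append(0, "b")  # second record in p0
other = topic.append(2, "c")   # first record in p2
print(first, second, other)    # prints: 0 1 0
```

Note that the record written to p2 gets offset 0 even though p0 already holds two records: in this model, as in Kafka, offsets are counted separately for each partition.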
Brokers in Kafka
But then the question arises: where are these topics stored? This is the role of the broker. A broker is a single Kafka server, and a Kafka cluster is composed of multiple brokers. A broker is a container that stores the topics with their multiple partitions. The below image can be used to understand it more:
Now that we have seen what topics are, we can move on to the main subject of our article: Kafka consumers.
What is Kafka Consumer?
As the name suggests, Kafka consumers consume something: they read the data from Kafka topics. A consumer reads the data in a partition sequentially, in offset order.
Consumers read the data from the brokers, since the topics and their partitions are stored there. Here is an example of how the consumers read the data from the broker:
In the above image, there are four consumers (C1, C2, C3, and C4). C1 and C2 read the data from Broker 1, which contains Partition 0 of Topic A. C3 and C4 read the data from Broker 2, which contains Partition 1 of Topic A.
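The sequential read described above can be sketched as follows. This is a simplified model under stated assumptions, not Kafka's consumer API: PartitionReader, its poll method, the brokers dictionary, and the record values are hypothetical, and the layout simply mirrors the text (Broker 1 holds Partition 0 of Topic A, Broker 2 holds Partition 1).

```python
class PartitionReader:
    """Toy consumer for a single partition; not Kafka's consumer API."""

    def __init__(self, log):
        self.log = log      # the partition's records, in offset order
        self.position = 0   # offset of the next record to read

    def poll(self, max_records=2):
        """Return up to max_records (offset, value) pairs, in offset order."""
        end = min(self.position + max_records, len(self.log))
        batch = [(off, self.log[off]) for off in range(self.position, end)]
        self.position = end  # the next poll continues where this one stopped
        return batch


# Hypothetical layout: each broker stores one partition of Topic A.
brokers = {
    "broker-1": {("topic-a", 0): ["m0", "m1", "m2"]},
    "broker-2": {("topic-a", 1): ["n0"]},
}

reader = PartitionReader(brokers["broker-1"][("topic-a", 0)])
first_batch = reader.poll()
second_batch = reader.poll()
third_batch = reader.poll()
print(first_batch)   # [(0, 'm0'), (1, 'm1')]
print(second_batch)  # [(2, 'm2')]
print(third_batch)   # []
```

The reader never skips or reorders records: each call picks up at the stored position, and an empty batch simply means it has caught up with the partition.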
Advantages of Kafka Consumer
Consumers can read data as it arrives, which supports real-time data pipelines.
A large number of messages can be handled, which makes the application more scalable.
Reading a data stream with a consumer is easy and simple.
Latency is low, and because Kafka decouples message production from consumption, a consumer can read a message at any time.
Consumer groups handle cases where there are more consumers than partitions.
Disadvantages of Kafka Consumer
Compressing and decompressing the data stream adds processing overhead, which can reduce Kafka's performance.
Throughput can also drop along with performance.
The reading of the data stream cannot be monitored completely, as Kafka does not include a complete set of monitoring tools.
Kafka Consumer Group
As the name suggests, a Kafka consumer group is a group of multiple consumers. There can be multiple groups at the same time, each with a different number of consumers.
Within a group, each consumer reads from partitions assigned exclusively to it, so no two consumers in the same group read the same partition. Here is a diagram to understand how consumers read the data from the partitions:
In the above diagram, there are three consumers in Group 1 and two consumers in Group 2. In Group 1, consumers C1, C2, and C3 read data from partitions 1, 2, and 3. In Group 2, consumers C4 and C5 read data from partitions 4 and 5. There is no conflict, and all the consumers read the data successfully.
But a problem can arise: what happens if there are more consumers than partitions? In that case, some of the consumers are placed in an inactive state. Apache Kafka implements two components, GroupCoordinator and ConsumerCoordinator, which automatically decide which consumers should be inactive.
Here is an example to understand this problem better:
In the above diagram, there are three consumers in Group 1 and two consumers in Group 2. In Group 1, consumers C1, C2, and C3 read data from partitions 1, 2, and 3. Group 2 has consumers C4 and C5 but only partition 4 to read from. C4 reads partition 4, so there is no partition left for C5, and it is placed in an inactive state. C5 will return to an active state if another consumer in its group stops reading its partition.
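The situation with more consumers than partitions can be sketched with a small assignment function. This is a simplified round-robin model written for illustration only: the assign and idle functions are hypothetical, and Kafka's real assignment strategies and coordinator logic are more elaborate than this.

```python
def assign(partitions, consumers):
    """Round-robin toy assignment: each partition goes to exactly one consumer.

    Returns {consumer: [partitions]}. A consumer mapped to an empty list
    received no partition and is idle.
    """
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def idle(assignment):
    """List the consumers that received no partition."""
    return [c for c, parts in assignment.items() if not parts]


group_1 = assign(["p1", "p2", "p3"], ["C1", "C2", "C3"])
group_2 = assign(["p4"], ["C4", "C5"])
print(idle(group_1))  # []
print(idle(group_2))  # ['C5']

# C4 leaves Group 2: running the assignment again hands partition 4 to C5.
group_2_after = assign(["p4"], ["C5"])
print(group_2_after)  # {'C5': ['p4']}
```

In Group 1 every consumer owns one partition, so nobody is idle. In Group 2 there is only one partition for two consumers, so C5 waits until C4 leaves and the partitions are assigned again.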
Kafka consumers are used to consume, or read, the data from Kafka topics. They read the data from each partition in a sequential manner.
What are the Kafka Topics?
A Kafka topic is a named stream of data. A single topic can have multiple partitions, and each partition can be a different size.
What is the broker in Kafka?
Brokers are where Kafka stores the topics that consumers read from. A broker is a single Kafka server, and a Kafka cluster is composed of multiple brokers. A broker is a container that stores the topics with their multiple partitions.
What is Kafka Consumer Group, and how is data reading handled?
A Kafka consumer group is a group of multiple consumers. There can be multiple groups at the same time, each with a different number of consumers. Two components, GroupCoordinator and ConsumerCoordinator, automatically manage reading within a group, including which consumers are active and which are inactive.
Conclusion
Kafka consumers are used to consume, or read, the data from Kafka topics. In this article, we discussed the concept of topics and partitions, and then what a Kafka consumer and a Kafka consumer group are. We also discussed the advantages and disadvantages of Kafka consumers.