Apache Kafka


Install Kafka:

  • Download the Kafka tar file.
  • Extract it to a location, say /usr/local/kafka_2.11-0.8.2.2
  • Set environment variables in .bashrc:

###Kafka
export KAFKA_HOME=/usr/local/kafka_2.11-0.8.2.2
export PATH=$PATH:$KAFKA_HOME/bin
###
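To make the new variables take effect in the current shell session, reload .bashrc:

>source ~/.bashrc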

With Kafka, we can create multiple types of clusters, such as the following:
  • A single node – single broker cluster
  • A single node – multiple broker cluster
  • Multiple nodes – multiple broker cluster

A single node – single broker cluster

• Starting the ZooKeeper server:
>bin/zookeeper-server-start.sh config/zookeeper.properties

• Starting the Kafka broker:
>bin/kafka-server-start.sh config/server.properties

• Creating a Kafka topic:
>kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic kafkatopic

• Getting the list of topics:
>kafka-topics.sh --list --zookeeper localhost:2181

• Starting a console-based producer:
>kafka-console-producer.sh --broker-list localhost:9092 --topic kafkatopic

Type a few messages:
Welcome to Kafka DS
This is single node single broker cluster
Just started !! Jai Ganesh

• Starting a command-line consumer client:
>kafka-console-consumer.sh --zookeeper localhost:2181 --topic kafkatopic --from-beginning

Output:
Welcome to Kafka DS
This is single node single broker cluster
Just started !! Jai Ganesh


A single node – multiple broker cluster
• Starting the ZooKeeper server:
>bin/zookeeper-server-start.sh config/zookeeper.properties

• Starting the Kafka brokers:
To set up multiple brokers on a single node, a separate server property file is required for each broker. Each property file must define unique values for the following properties: broker.id, port, and log.dir (sample files are sketched after the start commands below).

>bin/kafka-server-start.sh config/server-1.properties
>bin/kafka-server-start.sh config/server-2.properties
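As an illustration, the two property files might differ only in these three settings (the values below are placeholders, with ports chosen to match the broker list used later):

###server-1.properties
broker.id=1
port=9093
log.dir=/tmp/kafka-logs-1
###

###server-2.properties
broker.id=2
port=9094
log.dir=/tmp/kafka-logs-2
###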

• Creating a Kafka topic using the command line:
>kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 4 --topic replicated-kafkatopic

Note: The replication factor should not exceed the number of available brokers; otherwise topic creation fails with an exception like the following:
kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 2
        at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:70)
        at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:171)
        at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:93)
        at kafka.admin.TopicCommand$.main(TopicCommand.scala:55)
        at kafka.admin.TopicCommand.main(TopicCommand.scala)
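The partition assignment of the new topic can be verified with the describe option, which prints the leader, the replicas, and the in-sync replica set for each partition:

>kafka-topics.sh --describe --zookeeper localhost:2181 --topic replicated-kafkatopic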

• Starting a producer to send messages:
>kafka-console-producer.sh --broker-list localhost:9093,localhost:9094 --topic replicated-kafkatopic

If we have a requirement to run multiple producers connecting to different combinations of brokers, we need to specify the broker list for each producer, as in the sketch below.
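For instance, two producers could each be pointed at a different broker of the same cluster (using the placeholder ports configured above):

>kafka-console-producer.sh --broker-list localhost:9093 --topic replicated-kafkatopic
>kafka-console-producer.sh --broker-list localhost:9094 --topic replicated-kafkatopic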

• Starting a consumer to consume messages:
>kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic replicated-kafkatopic

Multiple nodes – multiple broker cluster

We should install Kafka on each node of the cluster, and all the brokers from the different nodes need to connect to the same ZooKeeper. Then, on every machine, follow the same steps used above for starting multiple brokers on a single machine, keeping broker.id unique across the entire cluster. A minimal sketch of the shared configuration follows.
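As a minimal sketch, assuming the shared ZooKeeper runs on a host named zk-host (a placeholder), every broker's server.properties points at the same ZooKeeper while its broker.id stays unique cluster-wide:

###server.properties (on each node)
# must be unique across the whole cluster
broker.id=3
port=9092
# zk-host is a placeholder for the shared ZooKeeper host
zookeeper.connect=zk-host:2181
###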
