Multinode Kafka

22 Nov 2017

I've been learning Kafka, and figured I'd leave some installation notes here.

Ensure you have installed the JDK, zookeeper, and kafka as described in Installing Zookeeper and Kafka, and that you've got multinode zookeeper working as described in Multinode Zookeeper.

New machine:

new-machine$ cd zookeeper-3.4.11/
new-machine$ ./bin/zkServer.sh start

Original machine:

orig-machine$ cd zookeeper-3.4.11/
orig-machine$ ./bin/zkServer.sh start
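
If you want a quick sanity check that the ensemble is healthy before moving on, zkServer.sh status on each machine should report one node as the leader and the other as a follower:

orig-machine$ ./bin/zkServer.sh status
new-machine$ ./bin/zkServer.sh status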

(I can see how being able to start and stop zookeeper nodes via some orchestration mechanism might be nice. Maybe even something as simple as dssh.)

Next I set up my original Kafka node to point at my multinode zookeeper, and to be one of two Kafka nodes.

Original node:

orig-machine$ cd kafka_2.11-1.0.0/
orig-machine$ vim config/server.properties

I see that we already have broker.id=0, so let's leave that alone and, for our second node, set broker.id to 1.

The only other thing we should need to change is zookeeper.connect to point to both of our zookeeper nodes:

zookeeper.connect=10.150.60.150:2181,10.150.60.102:2181
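
Pulling that together, the relevant lines in config/server.properties on the original node should end up looking something like this:

broker.id=0
zookeeper.connect=10.150.60.150:2181,10.150.60.102:2181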

Now let's copy Kafka from our original node to our new node:

Original node:

orig-machine$ cd ..
orig-machine$ rsync -av kafka_2.11-1.0.0 mwood@gopher:

New node:

new-machine$ cd kafka_2.11-1.0.0/
new-machine$ vim config/server.properties 

Be sure this broker.id is different from the one we used on the original server!

broker.id=1

Oh, and because we copied our data over too, it looks like we need to change broker.id in logs/meta.properties as well; apparently Kafka records the broker id in the data directory and refuses to start if it doesn't match server.properties. Let's go do that:

new-machine$ vim logs/meta.properties 
broker.id=1
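
Alternatively, if you don't care about the copied topic data, I suspect you could just wipe the new node's data directory and let Kafka recreate meta.properties on first start using the broker.id from server.properties:

new-machine$ rm -rf logs/*

Editing the one line seemed less drastic, though.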

We should be able to leave everything else alone.

Now start Kafka on each node.

Original node:

orig-machine$ cd kafka_2.11-1.0.0/
orig-machine$ ./bin/kafka-server-start.sh ./config/server.properties

New node:

new-machine$ ./bin/kafka-server-start.sh ./config/server.properties
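
That ties up a terminal on each machine; if you'd rather run the brokers in the background, kafka-server-start.sh also takes a -daemon flag, something like:

new-machine$ ./bin/kafka-server-start.sh -daemon ./config/server.properties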

Now let's see if we can still put data in and read it back.

Original node:

orig-machine$ cd kafka_2.11-1.0.0/

Let's make our topic first:

orig-machine$ ./bin/kafka-topics.sh \
    --create \
    --zookeeper 10.150.60.150:2181,10.150.60.102:2181 \
    --replication-factor 2 \
    --partitions 1 \
    --topic multinodetest
Created topic "multinodetest".
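
If you want to confirm that both brokers actually got a replica, describing the topic should list both broker ids under Replicas and Isr:

orig-machine$ ./bin/kafka-topics.sh \
    --describe \
    --zookeeper 10.150.60.150:2181,10.150.60.102:2181 \
    --topic multinodetest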

Now let's insert an item into our topic:

orig-machine$ ./bin/kafka-console-producer.sh --broker-list 10.150.60.150:9092,10.150.60.102:9092 --topic multinodetest
>This is a test. Do not adjust your set.

New node:

For fun, let's read the items back from the new node.

new-machine$ cd kafka_2.11-1.0.0/
new-machine$ ./bin/kafka-console-consumer.sh \
    --zookeeper 10.150.60.150:2181,10.150.60.102:2181 \
    --topic multinodetest \
    --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release.
Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
This is a test. Do not adjust your set.

Yup, that worked!
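
As the warning above suggests, the newer consumer talks to the Kafka brokers directly via --bootstrap-server instead of going through zookeeper; the equivalent command should be something like:

new-machine$ ./bin/kafka-console-consumer.sh \
    --bootstrap-server 10.150.60.150:9092,10.150.60.102:9092 \
    --topic multinodetest \
    --from-beginning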

I wonder what happens if I stop the new node and leave the original node running?

Wow, things got really upset. This is getting out of scope for a blog entry on how to get multinode Kafka working. But this might be a good topic for another day.