Effective Kafka Setup on Raspberry Pi 4: A Comprehensive Guide
Introduction to Kafka on Raspberry Pi
This guide serves as an introductory tutorial for configuring Apache Kafka on a Raspberry Pi 4 and publishing messages with Python.
Apache Kafka, originally developed at LinkedIn and later donated to the Apache Software Foundation, is a distributed event streaming platform used to build real-time data pipelines between systems.
Understanding Kafka’s Architecture
At its core, Kafka functions as a cluster of logically grouped servers, referred to as brokers. A Kafka cluster may consist of a single broker or multiple brokers working together. Unlike relational databases that store data in tables, Kafka utilizes topics to which messages are published.
A topic acts like an append-only message log, processing messages in a first-in, first-out manner. Each incoming message is assigned an offset, a sequential ID that marks its position within its partition of the topic. Topics are divided into partitions that can be spread, and replicated, across several brokers, which improves fault tolerance; if one broker fails, the remaining brokers can continue serving the topic.
To coordinate the brokers, Kafka relies on a service named Zookeeper. On startup, each broker registers itself with Zookeeper, and one broker is elected as the controller. If the controller fails, Zookeeper helps elect a replacement. A topic's partitions are distributed across the brokers, allowing multiple brokers to process different partitions simultaneously.
Installing Apache Kafka
Apache Kafka is developed in Java, so the first step is to verify if Java is installed:
java --version
If Java is not installed, you can install it using the following command:
sudo apt-get install default-jdk -y
Next, download Apache Kafka from its official website. For this guide, Kafka version 3.6.0 is used, but feel free to select another version if necessary. If your setup is headless, you can download Kafka using wget:
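For example, for the 3.6.0 release used here (older releases are hosted on the Apache archive, so adjust the URL if you pick a different version or mirror):
wget https://archive.apache.org/dist/kafka/3.6.0/kafka_2.12-3.6.0.tgz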
After downloading, extract the contents:
sudo tar -xzf kafka_2.12-3.6.0.tgz
This creates a directory called kafka_2.12-3.6.0. You may rename and relocate this directory for convenience:
sudo mv kafka_2.12-3.6.0 /usr/local/kafka
Configuring Apache Kafka
To start, navigate to the config directory in your Kafka installation. Open the server.properties file for configuration.
You can leave the broker.id as is for now, but it’s helpful to enable automatic topic creation. With this setting, Kafka creates a topic on first use rather than failing to deliver messages sent to a topic that doesn’t exist yet.
Next, update the Socket Server Settings: Kafka listens on port 9092 by default, which you can keep; just fill in your Raspberry Pi’s own IP address. For the Zookeeper connection string at the end of the file, use the same address with Zookeeper’s default port, 2181.
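As a reference, the relevant server.properties entries might look like this, assuming the Raspberry Pi’s address is 192.168.108.1 (the IP used throughout this guide; substitute your own):
broker.id=0
auto.create.topics.enable=true
listeners=PLAINTEXT://192.168.108.1:9092
zookeeper.connect=192.168.108.1:2181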
Important Note: It's advisable to run an odd number of Zookeeper servers so that leader elections always produce a clear majority, avoiding 50/50 splits in the vote.
Once you’ve saved server.properties, update the zookeeper.properties file by adding a 4lw.commands.whitelist entry at the bottom. This whitelists specific four-letter-word commands, such as ruok, which are disabled by default for security reasons.
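For example, to allow the ruok health check used later in this guide, add:
4lw.commands.whitelist=ruok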
Launching Zookeeper and Apache Kafka
In the Kafka installation, locate the bin directory containing shell scripts to interact with Kafka and Zookeeper. Start Zookeeper first, as Kafka needs to register its brokers with it:
sudo /usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties
Alternatively, set permissions on the Kafka install directory to avoid using sudo for subsequent commands:
sudo chown -R yourusername:yourgroup /usr/local/kafka
sudo chmod -R 755 /usr/local/kafka
Now, create a directory for the log output and start Zookeeper in the background:
mkdir -p /tmp/zookeeper
/usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties > /tmp/zookeeper/zookeeper-output.log 2>&1 &
To start Kafka, use a similar command:
mkdir -p /tmp/kafka
/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties > /tmp/kafka/kafka-server.log 2>&1 &
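To confirm the broker came up cleanly, tail the log file from the command above and look for a line indicating the Kafka server has started:
tail -f /tmp/kafka/kafka-server.log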
You can verify that Zookeeper is running by sending it the ruok command; a healthy server replies with imok:
echo ruok | nc [host] [port]
Or use a simple Python script to check the connection.
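Here is a minimal sketch of such a check, assuming the Zookeeper host and port used in this guide (192.168.108.1 and 2181; adjust to your setup):
import socket

# Send Zookeeper's four-letter "ruok" command over a raw TCP socket.
# This requires 4lw.commands.whitelist=ruok in zookeeper.properties (set above).
with socket.create_connection(('192.168.108.1', 2181), timeout=5) as sock:
    sock.sendall(b'ruok')
    response = sock.recv(16)  # Zookeeper replies "imok" and closes the connection

print('Zookeeper is healthy' if response == b'imok' else f'Unexpected reply: {response!r}')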
Producers and Consumers
With Kafka running, we can now produce our first message. In Kafka, entities publishing messages to topics are termed Producers, while those processing messages are called Consumers.
Thanks to the automatic topic creation setting we configured, we can write to a topic without creating it first. Use the following command to produce a message:
/usr/local/kafka/bin/kafka-console-producer.sh --bootstrap-server 192.168.108.1:9092 --topic btc-price
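The producer opens an interactive prompt; each line you type is published as one message to the topic. For example (illustrative values for btc-price):
> 42000.00
> 42150.50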
To start consuming those messages, run the console consumer against the same topic:
/usr/local/kafka/bin/kafka-console-consumer.sh --bootstrap-server 192.168.108.1:9092 --topic btc-price --from-beginning
By default, consumers only start receiving messages as new ones come in. To read from the earliest available record, use the --from-beginning flag.
Using Python with Confluent Kafka
While the shell scripts are handy for quick tests, you can also produce and consume messages from Python with the confluent-kafka library. Install it using:
pip install confluent-kafka
Set up a consumer for the btc-price topic:
from confluent_kafka import Consumer

config = {
    'bootstrap.servers': '192.168.108.1:9092',  # the Kafka broker's address
    'group.id': 'btc-price@1',                  # consumers sharing a group.id split the topic's partitions
    'auto.offset.reset': 'earliest'             # start from the oldest message if no offset is stored
}

consumer = Consumer(config)
consumer.subscribe(['btc-price'])
To start consuming messages, use the poll function:
while True:
    msg = consumer.poll(timeout=1.0)  # wait up to one second for the next message
    if msg is None:
        continue
    if msg.error():
        print(f"Error: {msg.error()}")
    else:
        print(f"Received message: {msg.value().decode('utf-8')}")
Producing messages using confluent-kafka is straightforward. Set up a producer like this:
from confluent_kafka import Producer

config = {
    'bootstrap.servers': '192.168.108.1:9092',
}

producer = Producer(config)
To publish messages, use a callback function to receive feedback on the operation's success or failure.
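Here is a minimal sketch; the payload is an illustrative value for the btc-price topic:
def delivery_report(err, msg):
    # Invoked once per message (from poll() or flush()) with the delivery result
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [partition {msg.partition()}] at offset {msg.offset()}")

producer.produce('btc-price', value='42000.00', callback=delivery_report)
producer.flush()  # block until all queued messages are delivered or have failed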
Conclusion
For smaller businesses or hobby projects, a single broker setup can suffice. However, as projects scale, you may encounter challenges that necessitate a multi-broker setup. Future articles will delve into creating a robust Kafka environment with multiple brokers and partitions.
Thank you for reading! If you found this guide helpful, please show your support by clapping for the article and sharing your thoughts in the comments below.