MQTT vs. Kafka: Protocol Comparison for IoT and Data Streaming

When building IoT applications or data streaming systems, choosing the right messaging technology is crucial. This guide compares MQTT and Kafka to help you determine which is the better fit for your use case.

Protocol Overviews

MQTT

MQTT (Message Queuing Telemetry Transport) is a lightweight publish-subscribe messaging protocol designed for constrained devices and low-bandwidth, high-latency, or unreliable networks.

Key characteristics:

  • Lightweight, with minimal overhead (the fixed header can be as small as 2 bytes)
  • Push-based communication model
  • Designed for resource-constrained IoT devices
  • Three Quality of Service (QoS) levels
  • Topic-based message filtering
  • Support for last will messages and retained messages
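To make the overhead claim concrete, here is a minimal sketch that hand-encodes a QoS 0 PUBLISH packet following the MQTT 3.1.1 wire layout. The function names are illustrative and not taken from any library; a real client would also handle flags, packet identifiers, and MQTT 5 properties.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode n with MQTT's variable-length scheme (7 data bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a QoS 0 PUBLISH packet with no DUP or RETAIN flag set."""
    topic_bytes = topic.encode("utf-8")
    # Variable header for QoS 0: 2-byte topic length followed by the topic name.
    body = len(topic_bytes).to_bytes(2, "big") + topic_bytes + payload
    # Fixed header: packet type 3 (PUBLISH) in the high nibble, then the length.
    fixed_header = bytes([0x30]) + encode_remaining_length(len(body))
    return fixed_header + body


packet = encode_publish_qos0("a/b", b"hi")
overhead = len(packet) - len(b"hi")
print(len(packet), overhead)
```

For a 2-byte payload on topic "a/b" the whole packet is 9 bytes: 2 bytes of fixed header, 2 bytes of topic length, 3 bytes of topic, and the payload itself.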

Kafka

Apache Kafka is a distributed event streaming platform designed for high throughput, fault tolerance, and durable storage of data streams.

Key characteristics:

  • High-throughput, distributed architecture
  • Pull-based communication model
  • Optimized for data streaming and processing
  • Log-based storage with configurable retention
  • Strong ordering guarantees within partitions
  • Support for stream processing with Kafka Streams

Architectural Differences

Communication Model

MQTT uses a publish-subscribe (pub/sub) model where:

  • Publishers send messages to topics
  • The broker routes messages to subscribed clients
  • Communication is push-based - subscribers receive messages as they arrive
  • Clients can be both publishers and subscribers
  • Topics are hierarchical and support wildcards
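As a rough illustration of how hierarchical topics and wildcards behave, the sketch below implements MQTT filter matching in plain Python. The function name topic_matches is made up for this example; production code would normally rely on the broker or on the client library's own matcher.

```python
def topic_matches(filter_: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name.

    '+' matches exactly one level. '#' matches any number of remaining
    levels (including none) and is only valid as the last filter level.
    """
    f_levels = filter_.split("/")
    t_levels = topic.split("/")
    # Topics starting with '$' (such as $SYS/...) are not matched by a
    # filter that begins with a wildcard.
    if topic.startswith("$") and f_levels[0] in ("+", "#"):
        return False
    for i, f in enumerate(f_levels):
        if f == "#":
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False
        if f != "+" and f != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


print(topic_matches("sensors/#", "sensors/kitchen/temperature"))
print(topic_matches("sensors/+/temperature", "sensors/kitchen/humidity"))
```

The first call matches because '#' covers every level below "sensors"; the second does not, because the last level differs.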

Kafka uses a distributed log model where:

  • Producers write messages to topics
  • Topics are divided into partitions distributed across brokers
  • Communication is pull-based - consumers request messages
  • Consumers maintain their own offset (position) in the log
  • Topics are flat, with no built-in hierarchy; consumers can subscribe by regular-expression pattern, but there are no MQTT-style wildcards
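The log model above can be sketched with a toy in-memory topic. ToyTopic and ToyConsumer are hypothetical names for illustration only; they are not part of any Kafka client, and the CRC32 hash merely stands in for Kafka's real partitioner.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a Kafka topic: a fixed set of append-only partitions."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key: str, value: str):
        # The same key always lands in the same partition, which is what
        # preserves ordering per key. CRC32 is an assumption of this sketch.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def fetch(self, partition: int, offset: int, max_records: int = 10):
        # Pull model: the consumer asks for records starting at an offset.
        return self.partitions[partition][offset:offset + max_records]


class ToyConsumer:
    """Tracks its own position per partition; the topic never pushes to it."""

    def __init__(self, topic: ToyTopic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self):
        records = []
        for p in range(len(self.topic.partitions)):
            batch = self.topic.fetch(p, self.offsets[p])
            self.offsets[p] += len(batch)
            records.extend(batch)
        return records


topic = ToyTopic(num_partitions=3)
for i in range(3):
    topic.produce("sensor-001", f"reading-{i}")
    topic.produce("sensor-002", f"reading-{i}")

consumer = ToyConsumer(topic)
first = consumer.poll()
second = consumer.poll()  # nothing new was produced since the first poll
sensor_001 = [value for key, value in first if key == "sensor-001"]
print(len(first), second, sensor_001)
```

Because the position lives in the consumer, a second consumer with its own offsets would read the same records independently, which is the main contrast with MQTT's push delivery.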

Message Persistence and Delivery

MQTT focuses on message delivery with configurable persistence:

  • QoS 0: At most once delivery (fire and forget)
  • QoS 1: At least once delivery (with acknowledgment)
  • QoS 2: Exactly once delivery (with a four-step handshake)
  • Limited persistence (primarily QoS 1 and 2 messages queued for offline clients that hold a persistent session)
  • Retained messages provide simple state persistence
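The practical difference between QoS 1 and QoS 2 shows up when an acknowledgment is lost and the sender retransmits. The following is a simplified simulation, not a protocol implementation: the class names are invented for this sketch, and the QoS 2 receiver delivers on first receipt and forgets the packet id on PUBREL, which is one of the approaches the specification allows.

```python
class Qos1Receiver:
    """Delivers every PUBLISH it sees, so retransmissions reach the application."""

    def __init__(self):
        self.delivered = []

    def on_publish(self, packet_id, payload):
        self.delivered.append(payload)
        return "PUBACK"


class Qos2Receiver:
    """Remembers in-flight packet ids until PUBREL, so retransmissions are dropped."""

    def __init__(self):
        self.delivered = []
        self.in_flight = set()

    def on_publish(self, packet_id, payload):
        if packet_id not in self.in_flight:
            self.in_flight.add(packet_id)
            self.delivered.append(payload)
        return "PUBREC"

    def on_pubrel(self, packet_id):
        self.in_flight.discard(packet_id)
        return "PUBCOMP"


def send_with_retry(receiver, packet_id, payload, lost_acks):
    """Resend until an ack gets through; the first `lost_acks` acks are dropped."""
    attempts = 0
    while True:
        attempts += 1
        ack = receiver.on_publish(packet_id, payload)
        if attempts > lost_acks:
            return ack, attempts  # this ack reached the sender
        # Otherwise the ack was lost in transit, so the sender retransmits.


q1 = Qos1Receiver()
send_with_retry(q1, packet_id=1, payload="22.5", lost_acks=1)

q2 = Qos2Receiver()
send_with_retry(q2, packet_id=1, payload="22.5", lost_acks=1)
q2.on_pubrel(1)  # second half of the handshake releases the packet id

print(q1.delivered, q2.delivered)
```

With one lost acknowledgment, the QoS 1 application sees the reading twice while the QoS 2 application sees it once, at the cost of the extra round trip.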

Kafka prioritizes persistence and processing:

  • All messages are persisted to disk by default
  • Configurable retention period (time or size-based)
  • At-least-once delivery semantics by default
  • Exactly-once semantics possible with transactions
  • Messages can be replayed from any point in the log
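Retention and replay can be illustrated with a small append-only log that drops its oldest entries once a size limit is reached. RetainedLog is a hypothetical class written for this guide; Kafka itself applies retention per log segment on disk rather than per record.

```python
class RetainedLog:
    """Append-only log that keeps at most `max_records` entries (size-based retention)."""

    def __init__(self, max_records: int):
        self.max_records = max_records
        self.records = []
        self.base_offset = 0  # offset of records[0]; grows as old entries are dropped

    def append(self, value) -> int:
        self.records.append(value)
        if len(self.records) > self.max_records:
            drop = len(self.records) - self.max_records
            del self.records[:drop]
            self.base_offset += drop
        return self.base_offset + len(self.records) - 1  # offset of the new record

    def read_from(self, offset: int):
        """Replay every retained record at or after `offset`."""
        start = max(offset, self.base_offset) - self.base_offset
        return self.records[start:]


log = RetainedLog(max_records=3)
offsets = [log.append(f"event-{i}") for i in range(5)]

recent = log.read_from(3)      # replay from a known offset
everything = log.read_from(0)  # offsets 0 and 1 have aged out of the log
print(offsets, recent, everything)
```

Offsets keep increasing even after old records are removed, so a consumer that asks for an expired offset simply resumes from the oldest record still retained.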

Detailed Comparison

Feature | MQTT | Kafka
--------|------|------
Primary Use Case | IoT communication, constrained devices | Data streaming, event sourcing, log aggregation
Message Size | Small (optimized for low overhead) | Large (optimized for throughput)
Throughput | Moderate (thousands of msg/sec per broker) | High (millions of msg/sec across cluster)
Latency | Very low (milliseconds) | Low to moderate (milliseconds to seconds)
Message Ordering | Per-topic ordering (with QoS 1 or 2) | Strong ordering per partition
Client Resource Requirements | Very low (can run on microcontrollers) | Moderate to high (JVM-based)
Message Retention | Limited (offline message queuing) | Extensive (configurable log retention)
Protocol | TCP/IP, WebSockets | TCP/IP
Scalability | Vertical, limited horizontal (clustering) | Highly horizontally scalable (distributed)

When to Use Each Protocol

Choose MQTT When:

  • Device Constraints: Your devices have limited processing power, memory, or battery life
  • Network Constraints: You're operating over unreliable or bandwidth-constrained networks
  • Real-time Requirements: You need near real-time communication with minimal latency
  • Push Model: Devices need to receive messages immediately as they're published
  • Simple Hierarchical Structure: Your topics fit naturally into a hierarchical structure
  • Widespread IoT Support: You need a protocol with broad support across IoT platforms and devices

MQTT Ideal Use Cases:

  • Smart home and building automation
  • Industrial IoT sensor networks
  • Remote monitoring of distributed equipment
  • Mobile applications requiring real-time notifications
  • Automotive and telematics applications
  • Healthcare monitoring devices

Choose Kafka When:

  • High Throughput: You need to process millions of messages per second
  • Data Retention: Long-term storage and replay of message history is required
  • Stream Processing: You need complex stream processing and analytics
  • Enterprise Integration: You're connecting multiple enterprise systems
  • Scalability: You need a highly scalable, distributed architecture
  • Exactly-once Processing: Your application requires exactly-once semantics for processing

Kafka Ideal Use Cases:

  • Log aggregation and processing
  • Stream processing for real-time analytics
  • Event sourcing architectures
  • Website activity tracking and analytics
  • Operational metrics collection and monitoring
  • Data pipeline integration between systems

Using MQTT and Kafka Together

In many IoT architectures, MQTT and Kafka work together as complementary technologies:

[Figure: MQTT and Kafka architecture. MQTT at the Edge, Kafka in the Backend: A Common Architecture for IoT Data Processing]

A common pattern is to use MQTT at the edge for device communication, then bridge to Kafka for backend processing:

  1. Edge Layer: IoT devices connect via MQTT to brokers (like MQTT.pro)
  2. Integration Layer: MQTT messages are forwarded to Kafka topics through a connector or bridge
  3. Processing Layer: Kafka Streams or other stream processing technologies analyze and transform the data
  4. Storage Layer: Processed data is stored in databases or data lakes
  5. Visualization Layer: Results are displayed in dashboards or analytics tools

This architecture leverages the strengths of both protocols:

  • MQTT handles efficient device communication at the edge
  • Kafka provides scalable data streaming and processing in the backend

Implementation Examples

MQTT Implementation Example

Publishing sensor data with MQTT using Python:

import paho.mqtt.client as mqtt
import json
import time
import random

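# Connect to MQTT.pro broker (the constructor below assumes paho-mqtt >= 2.0;
# on 1.x use mqtt.Client(client_id="sensor_publisher") instead)
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2, client_id="sensor_publisher")
client.username_pw_set("username", "password")
client.connect("broker.mqtt.pro", 1883, 60)

# Run the network loop in a background thread so QoS 1 acknowledgments
# and keepalive pings are processed while the main loop sleeps
client.loop_start()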

# Publish temperature readings every 5 seconds
while True:
    temperature = round(random.uniform(20.0, 25.0), 1)
    payload = json.dumps({"device_id": "sensor-001", "temperature": temperature, "timestamp": time.time()})
    
    # Publish with QoS 1 to ensure delivery
    client.publish("sensors/temperature", payload, qos=1)
    print(f"Published: {payload}")
    time.sleep(5)

Kafka Implementation Example

Consuming and processing the same data with Kafka using Python:

from kafka import KafkaConsumer
import json

# Connect to Kafka broker
consumer = KafkaConsumer(
    'sensors.temperature',
    bootstrap_servers=['kafka-broker:9092'],
    auto_offset_reset='earliest',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

# Process incoming messages
for message in consumer:
    data = message.value
    device_id = data['device_id']
    temperature = data['temperature']
    timestamp = data['timestamp']
    
    # Process data (e.g., alert on high temperatures)
    if temperature > 24.0:
        print(f"ALERT: High temperature of {temperature}°C detected from {device_id}")
    else:
        print(f"Normal temperature of {temperature}°C from {device_id}")

MQTT to Kafka Bridge Example

Simple bridge to connect MQTT topics to Kafka using Python:

import paho.mqtt.client as mqtt
from kafka import KafkaProducer
import json

# Kafka producer setup
producer = KafkaProducer(
    bootstrap_servers=['kafka-broker:9092'],
    value_serializer=lambda m: json.dumps(m).encode('utf-8')
)

# MQTT callback when message is received
def on_message(client, userdata, msg):
    try:
        # Convert MQTT topic to Kafka topic (replace / with .)
        kafka_topic = msg.topic.replace("/", ".")
        
        # Parse JSON payload
        payload = json.loads(msg.payload.decode('utf-8'))
        
        # Send to Kafka
        producer.send(kafka_topic, payload)
        print(f"Forwarded message from MQTT topic {msg.topic} to Kafka topic {kafka_topic}")
    except Exception as e:
        print(f"Error processing message: {e}")

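# MQTT callback when the connection is established; subscribing here means
# the subscription is restored after every reconnect
def on_connect(client, userdata, flags, reason_code, properties):
    client.subscribe("sensors/#")  # Subscribe to all sensor topics

# Set up MQTT client (assumes paho-mqtt >= 2.0 and its VERSION2 callback API)
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.username_pw_set("username", "password")
client.on_connect = on_connect
client.on_message = on_message

# Connect, then block and process network traffic
client.connect("mqtt.broker.com", 1883, 60)
client.loop_forever()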
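
The bridge maps topic names with a plain replace("/", "."). MQTT topic names may contain spaces and other characters, while Kafka topic names are limited to ASCII letters, digits, ".", "_" and "-", with a maximum length of 249 characters. A slightly more defensive mapping might look like the sketch below; mqtt_to_kafka_topic is a hypothetical helper, and replacing illegal characters with "_" is one possible policy among several.

```python
import re

_ILLEGAL = re.compile(r"[^A-Za-z0-9._-]")
MAX_KAFKA_TOPIC_LENGTH = 249


def mqtt_to_kafka_topic(mqtt_topic: str) -> str:
    """Map an MQTT topic name to a string that is legal as a Kafka topic name."""
    # Drop leading/trailing separators, then turn levels into dotted segments.
    name = mqtt_topic.strip("/").replace("/", ".")
    # Replace anything Kafka would reject (spaces, non-ASCII, '+', '#', ...).
    name = _ILLEGAL.sub("_", name)
    if name in ("", ".", ".."):
        raise ValueError(f"cannot derive a Kafka topic from {mqtt_topic!r}")
    return name[:MAX_KAFKA_TOPIC_LENGTH]


print(mqtt_to_kafka_topic("sensors/temperature"))
print(mqtt_to_kafka_topic("sensors/living room/temp"))
```

Distinct MQTT topics can collapse to the same Kafka name under this policy (for example after truncation or character replacement), so a bridge that needs the original topic could also carry it in the message payload or key.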

Conclusion

MQTT and Kafka serve different but complementary purposes in modern data architectures:

  • MQTT excels at lightweight, efficient communication between IoT devices and servers, making it perfect for the "edge" of your architecture.
  • Kafka provides robust, scalable data streaming and processing capabilities for handling large volumes of data in backend systems.

Understanding the strengths and limitations of each protocol allows you to choose the right tool for each part of your architecture. In many cases, using both together provides the ideal solution for end-to-end IoT data processing.

With MQTT.pro's serverless MQTT broker service, you can quickly set up the edge communication layer of your IoT architecture, providing a reliable foundation that integrates seamlessly with Kafka and other backend systems.
