
Apache Kafka for Developers: Event Streaming Fundamentals

Infrastructure · 2026-03-04 · 4 min read · kafka · event streaming · message queue · node.js · typescript · distributed systems · developer tools · pub-sub

Kafka processes millions of events per second across thousands of companies. It's used for real-time data pipelines, event sourcing, log aggregation, and inter-service communication at scale. Understanding Kafka's model is valuable even if you're not running it at Google scale — the concepts apply at any size, and Kafka works well at moderate scale too.

Core Concepts

Event (Message): A record of something that happened. Has a key, value, timestamp, and optional headers.

Topic: A logical channel for events. Similar to a database table name but for event streams. Topics are partitioned for scalability.

Partition: A topic is divided into N partitions. Messages within a partition are ordered. A topic with 4 partitions can be consumed by up to 4 consumers in parallel (within a consumer group).
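
The key-to-partition mapping can be sketched as a pure function. This is a simplified stand-in: Kafka's real default partitioner hashes the key bytes with murmur2; the djb2 hash here is just illustrative.

```typescript
// Simplified sketch of keyed partitioning: hash the key, take it modulo
// the partition count. (Kafka's default partitioner uses murmur2, not djb2.)
function djb2(key: string): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = ((hash << 5) + hash + key.charCodeAt(i)) >>> 0; // hash * 33 + c
  }
  return hash;
}

function partitionForKey(key: string, numPartitions: number): number {
  return djb2(key) % numPartitions;
}

// The same key always maps to the same partition, preserving per-key order.
const p1 = partitionForKey("user-123", 4);
const p2 = partitionForKey("user-123", 4);
// p1 === p2
```

Because the mapping is deterministic, all events with the same key stay in order on one partition; messages with no key are spread across partitions instead.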

Producer: Application that writes events to a topic.

Consumer: Application that reads events from a topic. Maintains its position (offset) in the stream.

Consumer Group: Multiple consumers that collectively process a topic. Each partition is assigned to exactly one consumer in a group. Scale out by adding consumers (up to the number of partitions).

Offset: The position of a message within a partition. Consumers commit offsets to track their progress. On restart, they resume from the last committed offset.
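
A minimal in-memory sketch of the commit/resume cycle. In a real cluster, offsets are committed to Kafka's internal __consumer_offsets topic, and a consumer with no committed offset falls back to its auto.offset.reset policy; the map and the 0 default here are simplifications.

```typescript
// Conceptual sketch: a consumer periodically commits its position and, on
// restart, resumes from the last committed offset (not the last processed one).
class OffsetTracker {
  private committed = new Map<string, number>(); // "topic:partition" -> next offset to read

  commit(topic: string, partition: number, nextOffset: number): void {
    this.committed.set(`${topic}:${partition}`, nextOffset);
  }

  // Where a restarted consumer resumes. Returning 0 stands in for the
  // auto.offset.reset fallback a real consumer would apply.
  resumeFrom(topic: string, partition: number): number {
    return this.committed.get(`${topic}:${partition}`) ?? 0;
  }
}

const tracker = new OffsetTracker();
tracker.commit("order-events", 0, 42);
tracker.resumeFrom("order-events", 0); // 42
tracker.resumeFrom("order-events", 1); // 0, nothing committed yet
```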

Retention: Kafka retains messages for a configurable period (default 7 days) regardless of whether they've been consumed. Old messages are available to replay.

Local Development Setup

Docker Compose for local Kafka:

services:
  kafka:
    image: confluentinc/cp-kafka:latest
    hostname: kafka
    ports:
      - 9092:9092
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092'
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka:29093'
      KAFKA_LISTENERS: 'PLAINTEXT://kafka:29092,CONTROLLER://kafka:29093,PLAINTEXT_HOST://0.0.0.0:9092'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qk'
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1

  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    ports:
      - 8080:8080
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:29092
    depends_on:
      - kafka

This runs KRaft mode Kafka (no ZooKeeper dependency) with the Kafka UI dashboard.

Producing Messages (Node.js / TypeScript)

Using kafkajs:

npm install kafkajs

import { Kafka, Partitioners } from "kafkajs";

const kafka = new Kafka({
  clientId: "my-producer",
  brokers: ["localhost:9092"],
});

const producer = kafka.producer({
  // kafkajs v2 changed the default partitioner; LegacyPartitioner keeps the
  // pre-v2 key-to-partition mapping and silences the startup warning.
  createPartitioner: Partitioners.LegacyPartitioner,
});

await producer.connect();

// Send a single message
await producer.send({
  topic: "user-events",
  messages: [
    {
      key: "user-123",              // Used for partitioning
      value: JSON.stringify({
        event: "user.registered",
        userId: "user-123",
        email: "[email protected]",
        timestamp: new Date().toISOString(),
      }),
      headers: {
        "content-type": "application/json",
        "source": "user-service",
      },
    },
  ],
});

// Batch sending (more efficient than one send() per message).
// `events` is assumed to be an array of order event objects, each with an `orderId`.
await producer.send({
  topic: "order-events",
  messages: events.map(event => ({
    key: event.orderId,
    value: JSON.stringify(event),
  })),
});

await producer.disconnect();

Consuming Messages

import { Kafka, Consumer, EachMessagePayload } from "kafkajs";

const kafka = new Kafka({
  clientId: "my-consumer",
  brokers: ["localhost:9092"],
});

const consumer = kafka.consumer({
  groupId: "order-processing-service",  // Consumer group ID
});

await consumer.connect();
await consumer.subscribe({
  topic: "order-events",
  fromBeginning: false,  // true = read all historical messages on first run
});

await consumer.run({
  autoCommit: true,  // Commit offsets automatically after processing
  eachMessage: async ({ topic, partition, message }: EachMessagePayload) => {
    const key = message.key?.toString();
    const value = message.value?.toString();

    if (!value) return;

    const event = JSON.parse(value);
    console.log(`Processing ${event.event} from partition ${partition}`);

    try {
      await processEvent(event);  // your domain logic
    } catch (error) {
      // Note: swallowing the error means the offset is still committed and
      // the message won't be redelivered. Send failures to a dead-letter
      // topic, or rethrow to force a retry, instead of only logging.
      console.error("Processing failed:", error);
    }
  },
});

Consumer Groups and Scaling

Multiple instances of a service in the same consumer group split partitions:

Topic: order-events (4 partitions)
Consumer Group: order-service

Instance 1: handles partitions 0, 1
Instance 2: handles partitions 2, 3

Scale to 4 instances → each handles 1 partition. Scale to 5+ instances → extra instances are idle (no partition to consume).

Implication: The partition count caps consumer group parallelism. Choose it based on expected peak throughput; you can add partitions later, but that changes the key-to-partition mapping for new messages.
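
The assignment above can be sketched as a small function. This mimics range-style assignment (contiguous chunks, as in the example); Kafka's actual assignor is pluggable (range, round-robin, cooperative sticky).

```typescript
// Split N partitions across M consumers in contiguous chunks, as evenly as
// possible. Consumers beyond the partition count end up with no partitions.
function assignPartitions(numPartitions: number, numConsumers: number): number[][] {
  const assignments: number[][] = Array.from({ length: numConsumers }, () => []);
  const base = Math.floor(numPartitions / numConsumers);
  const extra = numPartitions % numConsumers;
  let next = 0;
  for (let c = 0; c < numConsumers; c++) {
    const count = base + (c < extra ? 1 : 0);
    for (let i = 0; i < count; i++) assignments[c].push(next++);
  }
  return assignments;
}

assignPartitions(4, 2); // [[0, 1], [2, 3]], matching the example above
assignPartitions(4, 5); // the fifth consumer gets [], i.e. it sits idle
```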

Key Selection and Partitioning

Message keys determine which partition receives the message:

For event sourcing: use the entity ID as the key (e.g., userId, orderId). All events for the same entity land on the same partition in order.

// All events for user-123 go to the same partition
await producer.send({
  topic: "user-events",
  messages: [{ key: "user-123", value: JSON.stringify(event) }],
});

Exactly-Once vs At-Least-Once Delivery

At-least-once (default): Message is processed at least once, possibly multiple times if consumer crashes after processing but before committing the offset. Make your consumers idempotent.

Exactly-once: Kafka transactions provide exactly-once semantics for read-process-write pipelines that stay within Kafka. They are more complex to set up, and they don't cover side effects in external systems, which still need idempotency. Worth it for financial or other correctness-critical pipelines.

For most use cases, at-least-once with idempotent consumers is the right approach.

When to Use Kafka

Use Kafka when:

- Consumers need to replay history (event sourcing, rebuilding read models, backfills).
- Several independent services consume the same event stream.
- You need high throughput with per-key ordering guarantees.
- Events are a durable record, not just transient work items.

Use a simpler queue (RabbitMQ, BullMQ, SQS) when:

- The pattern is "process this job once and acknowledge it."
- You want per-message routing, priorities, or delayed delivery.
- Throughput is modest and operational simplicity matters more.

Kafka's overhead (partitions, consumer groups, replication) is worth it at scale. For "process this job and acknowledge it", a simpler queue is often better.

CLI and UI Tools

The kafka-ui dashboard from the Compose file above runs at http://localhost:8080. On the command line, kcat is convenient:

# CLI: list topics
kcat -b localhost:9092 -L

# CLI: consume from topic
kcat -b localhost:9092 -t my-topic -C -o beginning