Queue Groups

In standard publish-subscribe, every subscriber receives every message. Queue groups change this: when subscribers share a queue name, NATS delivers each message to only one randomly chosen member of that group.

With three workers subscribed in the same queue group, each message flows to only one of them. NATS distributes the load automatically.

How It Works

Queue groups operate at the subject level - subscribers still filter messages by subject, but NATS adds distribution logic:

  1. Single member: A lone subscriber in a queue group receives all messages for that subject
  2. Multiple members: NATS randomly selects one member for each message
  3. Member joins/leaves: Distribution automatically adjusts without configuration

The queue name is application-defined, not server-configured: subscribers specify it when subscribing, and NATS handles the rest. Because a member is selected independently for each message, a slow or unresponsive member does not stall the group, and subsequent messages can still go to other members.
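The selection rules above can be sketched as a toy in-memory model. This is an illustration only, not the NATS client API: the class name `QueueGroupModel` and its methods are hypothetical, and subjects are matched by exact string equality for simplicity.

```python
import random
from collections import defaultdict


class QueueGroupModel:
    """Toy model of queue-group delivery (hypothetical, not a NATS API)."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        # (subject, queue name) -> list of member names
        self._groups = defaultdict(list)

    def subscribe(self, subject, queue, member):
        # The queue name is supplied by the subscriber; nothing is pre-configured.
        self._groups[(subject, queue)].append(member)

    def unsubscribe(self, subject, queue, member):
        self._groups[(subject, queue)].remove(member)

    def publish(self, subject):
        """Return {queue name: chosen member} for every group on this subject."""
        chosen = {}
        for (subj, queue), members in self._groups.items():
            if subj == subject and members:
                # One member per group is picked at random for each message.
                chosen[queue] = self._rng.choice(members)
        return chosen


model = QueueGroupModel(seed=1)
model.subscribe("orders.new", "workers", "worker-1")
print(model.publish("orders.new"))  # a lone member receives everything

model.subscribe("orders.new", "workers", "worker-2")
print(model.publish("orders.new"))  # now one of the two members is picked
```

Membership changes take effect on the next `publish` call, which mirrors how distribution adjusts without any configuration step.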

Basic Queue Groups

Multiple subscribers use the same queue group name when subscribing to a subject. NATS ensures each message is delivered to only one member of that group, chosen randomly.

Common use cases: background job processing, API request handling across service instances, event processing pipelines, batch operations.

# Terminal 1: First worker in queue group
nats sub orders.new --queue workers

# Terminal 2: Second worker in same queue group
nats sub orders.new --queue workers

# Terminal 3: Third worker in same queue group
nats sub orders.new --queue workers

# Terminal 4: Publish messages (distributed across workers)
nats pub orders.new "Order 1"
nats pub orders.new "Order 2"
nats pub orders.new "Order 3"
nats pub orders.new "Order 4"

# Each message goes to exactly one worker

Dynamic Scaling

Add or remove workers at any time and NATS automatically adjusts distribution. When a worker joins, it immediately starts receiving messages. When it leaves, NATS stops routing to it within milliseconds.

This suits auto-scaling scenarios where orchestration systems (Kubernetes, ECS) spin up new workers based on metrics. It also supports gradual rollouts, traffic spike handling, and cost optimization.

# Start with one worker
nats sub tasks --queue workers

# Load increases - add more workers (in new terminals)
nats sub tasks --queue workers
nats sub tasks --queue workers

# Load decreases - stop workers with Ctrl+C
# Remaining workers automatically take over
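To make the effect of scaling concrete, here is a small simulation that assumes uniform random selection per message. The helper `distribute` and the worker names are made up for illustration, and no NATS server is involved.

```python
import random
from collections import Counter


def distribute(message_count, members, rng):
    """Pick one member per message uniformly at random (toy model)."""
    return Counter(rng.choice(members) for _ in range(message_count))


rng = random.Random(42)

# Phase 1: a single worker handles every message.
workers = ["worker-1"]
phase1 = distribute(100, workers, rng)

# Phase 2: load increases, two more workers join the same group.
workers += ["worker-2", "worker-3"]
phase2 = distribute(300, workers, rng)

# Phase 3: worker-2 stops; the remaining workers take over.
workers.remove("worker-2")
phase3 = distribute(100, workers, rng)

for name, counts in [("phase1", phase1), ("phase2", phase2), ("phase3", phase3)]:
    print(name, dict(counts))
```

Every phase delivers each message exactly once within the model; only the share per worker changes as members join or leave.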

Queue Groups with Request-Reply

Queue groups enable horizontally scalable services without a service mesh or API gateway. Each request goes to exactly one service instance, providing automatic load balancing.

Your service code doesn't need to know about other instances, handle leader election, or coordinate work. Just subscribe with a queue group name and respond to requests.

# Terminal 1: Service instance 1 (the last argument is the reply body)
nats reply api.calculate --queue api-workers "Result from instance 1"

# Terminal 2: Service instance 2
nats reply api.calculate --queue api-workers "Result from instance 2"

# Terminal 3: Service instance 3
nats reply api.calculate --queue api-workers "Result from instance 3"

# Terminal 4: Make requests (load balanced across instances)
nats request api.calculate ""
nats request api.calculate ""
nats request api.calculate ""

Mixed Subscribers

Queue groups coexist with regular subscribers on the same subject. Regular subscribers receive every message (pub-sub), while queue group members share the load (work distribution).

Use queue groups for operational work that should be handled by a single worker per message, and regular subscribers for observational tasks (audit logging, monitoring, analytics).

# Terminal 1: Audit logger (sees all messages)
nats sub "orders.>"

# Terminal 2: Worker 1 in queue group
nats sub "orders.new" --queue workers

# Terminal 3: Worker 2 in queue group
nats sub "orders.new" --queue workers

# Terminal 4: Publish messages
nats pub orders.new "Order 123"
# Audit logger sees it
# One worker processes it
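The delivery rule for mixed subscribers can be modeled in a few lines. This is a sketch under stated assumptions: `subject_matches` and `deliver` are hypothetical helpers, and the matcher covers only the `*` (single token) and `>` (one or more trailing tokens) wildcards.

```python
import random


def subject_matches(pattern, subject):
    """Match a subject against a pattern with '*' and '>' wildcards."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must cover at least one remaining token.
            return len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def deliver(subject, plain_subs, queue_subs, rng):
    """Return the names that receive one message published on `subject`.

    plain_subs: list of (pattern, name)
    queue_subs: list of (pattern, queue, name)
    """
    # Every matching regular subscriber gets a copy.
    receivers = [name for pattern, name in plain_subs
                 if subject_matches(pattern, subject)]
    # Matching queue subscribers are grouped by queue name...
    groups = {}
    for pattern, queue, name in queue_subs:
        if subject_matches(pattern, subject):
            groups.setdefault(queue, []).append(name)
    # ...and exactly one member per group is chosen.
    receivers += [rng.choice(members) for members in groups.values()]
    return receivers


plain = [("orders.>", "audit-logger")]
queued = [("orders.new", "workers", "worker-1"),
          ("orders.new", "workers", "worker-2")]
receivers = deliver("orders.new", plain, queued, random.Random(0))
print(receivers)  # the audit logger plus one of the two workers
```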

Geo-Affinity in Super-Clusters

In globally distributed NATS super-clusters, queue groups exhibit geo-affinity - automatically preferring local workers when available.

How It Works

When you have queue group subscribers distributed across multiple regions:

  1. Local preference: Messages are delivered to workers in the same cluster/region as the publisher
  2. Automatic failover: If no local workers are available, NATS routes to workers in other regions
  3. No configuration needed: This happens automatically based on network topology

Example Scenario

Consider a queue group named "order-processors" with workers in three regions:

Region    Workers     Publisher Location
US-East   3 workers   ✅ Publisher here
US-West   2 workers   -
EU-West   2 workers   -

Result: Messages from the US-East publisher are preferentially delivered to the 3 US-East workers. Only if all US-East workers are unavailable will messages route to US-West or EU-West workers.
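The local-preference and failover behavior in this scenario can be approximated as follows. The function `pick_worker` is hypothetical, and it assumes a uniform random choice among remote workers on failover; how the server actually ranks remote clusters is not modeled here.

```python
import random


def pick_worker(publisher_region, workers_by_region, rng):
    """Prefer a worker in the publisher's region, else fail over to another region."""
    local = workers_by_region.get(publisher_region, [])
    if local:
        return rng.choice(local)
    # No local workers: consider every worker in the other regions.
    remote = [w
              for region, workers in workers_by_region.items()
              if region != publisher_region
              for w in workers]
    return rng.choice(remote) if remote else None


rng = random.Random(3)
order_processors = {
    "US-East": ["us-east-1", "us-east-2", "us-east-3"],
    "US-West": ["us-west-1", "us-west-2"],
    "EU-West": ["eu-west-1", "eu-west-2"],
}

# While US-East has workers, a US-East publisher always stays local.
local_pick = pick_worker("US-East", order_processors, rng)

# If all US-East workers are unavailable, delivery fails over to a remote region.
order_processors["US-East"] = []
failover_pick = pick_worker("US-East", order_processors, rng)
print(local_pick, failover_pick)
```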

Benefits

  • Lower latency: Local processing is faster
  • Reduced bandwidth: Fewer cross-region transfers
  • Natural failover: Automatic global distribution if local workers fail
  • No configuration: Works out of the box in super-clusters

Best Practices

Naming Conventions

Queue group names follow naming conventions similar to those for subjects. Here are some common patterns:

# Service-based naming
api.auth.workers
api.payments.workers
api.notifications.workers

# Environment-based naming
prod.order-processors
staging.order-processors
dev.order-processors

# Version-based naming
service.v1.workers
service.v2.workers

Worker Design

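  1. Idempotent processing: Messages might be redelivered (for example, when a publisher retries or when using JetStream), so processing the same message twice must be safe
  2. Graceful shutdown: Drain in-flight messages before stopping
  3. Error handling: Decide explicitly what happens to failed messages (retry, log, or route elsewhere)
  4. Health checks: Monitor worker health and availability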
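As one way to approach idempotent processing, a worker can remember which message IDs it has already handled. The sketch below assumes each message carries a unique ID supplied by the publisher; `IdempotentWorker` and `msg_id` are illustrative names, and the in-memory set is per-process only, so a real deployment would likely need a shared, bounded store.

```python
class IdempotentWorker:
    """Runs a handler at most once per message ID (in-memory sketch)."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()

    def handle(self, msg_id, payload):
        """Return True if the message was processed, False if it was a duplicate."""
        if msg_id in self._seen:
            return False
        self._handler(payload)
        # Record the ID only after the handler succeeds,
        # so a failed message can be retried later.
        self._seen.add(msg_id)
        return True


processed = []
worker = IdempotentWorker(processed.append)

first = worker.handle("order-123", "Order 123")
second = worker.handle("order-123", "Order 123")  # redelivery of the same message
print(first, second, processed)
```

Marking the ID after the handler returns means a crash mid-processing leads to a retry rather than a silently dropped message, which is why the handler itself should still tolerate partial repeats.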

Scaling Strategy

  1. Start small: Begin with a few workers
  2. Monitor metrics: Track queue depth and processing time
  3. Scale based on load: Add workers when the queue grows
  4. Auto-scaling: Use metrics to scale automatically
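A scaling rule based on these metrics might look like the sketch below. The formula and every parameter name are assumptions chosen for illustration: it sizes the group to absorb the current arrival rate while draining the backlog within a target time, then clamps the result to configured bounds.

```python
import math


def desired_workers(backlog, arrival_rate, per_worker_rate,
                    drain_seconds=60, min_workers=1, max_workers=10):
    """Estimate how many workers are needed.

    backlog:         messages currently waiting (queue depth)
    arrival_rate:    incoming messages per second
    per_worker_rate: messages one worker processes per second
    drain_seconds:   target time to clear the existing backlog
    """
    # Throughput needed = new arrivals + backlog spread over the drain window.
    required_rate = arrival_rate + backlog / drain_seconds
    needed = math.ceil(required_rate / per_worker_rate)
    return max(min_workers, min(max_workers, needed))


# 1200 queued messages, 50 msg/s arriving, 20 msg/s per worker:
# (50 + 1200 / 60) / 20 = 3.5, rounded up to 4 workers.
print(desired_workers(backlog=1200, arrival_rate=50, per_worker_rate=20))  # 4
```

The lower bound keeps at least one subscriber in the group, and the upper bound caps cost during spikes.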

Monitoring

Track these metrics for queue groups:

  • Message processing rate
  • Queue depth (with JetStream)
  • Worker count
  • Processing latency
  • Error rates