Messaging & Async
Decouple producers from consumers using queues. Covers RabbitMQ, SQS, and when to use point-to-point queues vs publish-subscribe topics.
A message queue is a middleware component that enables asynchronous communication between services. Instead of Service A calling Service B directly (synchronous), Service A puts a message on a queue and moves on. Service B picks up the message later and processes it.
This decoupling is one of the most powerful patterns in distributed systems. It improves reliability (the producer does not fail if the consumer is down), scalability (consumers can be scaled independently), and resilience (messages are buffered during load spikes).
Consider an e-commerce order flow without queues:
Synchronous (fragile):
User places order -> API
-> Charge payment (500ms)
-> Reserve inventory (200ms)
-> Send confirmation email (300ms)
-> Update analytics (100ms)
Total: 1100ms, all must succeed for order to completeNow with a queue:
Asynchronous (resilient):
User places order -> API
-> Charge payment (500ms)
-> Reserve inventory (200ms)
-> Publish "order.created" event to queue
Total: 700ms, user gets response immediately
Asynchronously:
Email service picks up event -> sends email
Analytics service picks up event -> records dataThe user gets a faster response. If the email service is down, the message waits in the queue and is delivered later. The payment and inventory (which are critical) are still synchronous.
A message is sent to a queue and consumed by exactly one consumer. Once consumed, the message is removed from the queue.
Producer -> [Queue] -> Consumer A
-> Consumer B (only one gets each message)
-> Consumer C
Messages are distributed among consumers (competing consumers pattern).Use case: Task distribution. You have 10,000 images to process and 5 worker instances. Each image is processed by exactly one worker.
Examples: Amazon SQS, RabbitMQ (default mode), Celery tasks.
A message is published to a topic and delivered to all subscribers. Each subscriber gets a copy of every message.
Publisher -> [Topic] -> Subscriber A (gets all messages)
-> Subscriber B (gets all messages)
-> Subscriber C (gets all messages)Use case: Event notification. When an order is placed, the email service, analytics service, and inventory service all need to know.
Examples: Amazon SNS, Kafka topics, Google Cloud Pub/Sub, Redis Pub/Sub.
A common pattern is to combine pub/sub with queues. An event is published to a topic, which fans out to multiple queues — one per consuming service. Each queue is processed independently.
Publisher -> [Topic: order.created]
-> [Queue: email-service] -> Email Consumer
-> [Queue: analytics-service] -> Analytics Consumer
-> [Queue: inventory-service] -> Inventory ConsumerIn AWS: SNS topic fans out to multiple SQS queues. In Kafka: A topic has multiple consumer groups, each processing all messages independently.
This gives you the broadcast semantics of pub/sub with the durability and independent scaling of queues.
This is one of the most important concepts in messaging and a frequent interview topic.
The message is delivered zero or one times. If something goes wrong, the message may be lost.
Producer -> Queue -> Consumer
If consumer crashes before processing -> message is lost
(Queue already removed it from the queue upon delivery)Implementation: The queue removes the message when it is delivered, without waiting for acknowledgment.
When acceptable: Metrics, logging, and other telemetry where losing a few data points is fine.
The message is delivered one or more times. It will never be lost, but it may be delivered multiple times.
Producer -> Queue -> Consumer -> ACK back to Queue
If consumer processes but crashes before ACK:
-> Queue re-delivers the message
-> Consumer processes it again (duplicate)Implementation: The queue keeps the message until the consumer explicitly acknowledges it. If no ACK within a timeout, the message is redelivered.
This is the most common guarantee. SQS, RabbitMQ (with ACK), and Kafka all default to at-least-once.
Critical requirement: Consumers must be idempotent — processing the same message twice must produce the same result as processing it once. For example, "set balance to 10 to balance" is not (without an idempotency key).
def process_order(message):
order_id = message["order_id"]
# Idempotency check: have we already processed this?
if db.exists("processed_orders", order_id):
return # Already handled, skip
# Process the order
charge_payment(order_id)
# Record that we processed it
db.insert("processed_orders", order_id)
queue.ack(message)The message is delivered exactly one time. This is the holy grail of messaging but is extremely difficult to achieve in distributed systems.
In theory: True exactly-once delivery across independent systems is impossible (it reduces to the Two Generals Problem). What systems actually provide is effectively exactly-once — at-least-once delivery combined with idempotent processing.
Kafka's approach: Kafka 0.11+ supports exactly-once semantics within the Kafka ecosystem using idempotent producers and transactional reads/writes. This works when both the source and sink are Kafka topics. When the consumer writes to an external database, you need application-level idempotency.
Practical advice: Design for at-least-once and make your consumers idempotent. This is simpler, more robust, and works across any messaging system.
When a message repeatedly fails processing (e.g., the consumer crashes or throws an error every time), it should not block the queue forever. After a configurable number of retries, the message is moved to a dead letter queue for investigation.
Main Queue -> Consumer -> fails -> retry 1 -> fails -> retry 2 -> fails
-> Dead Letter Queue (for manual inspection and replay)Best practices:
When producers publish messages faster than consumers can process them, the queue grows unboundedly. This can exhaust memory or disk.
Bounded queues: Set a maximum queue size. When full, either block the producer (producer waits until space is available) or reject new messages (producer gets an error).
Producer -> [Queue: max 10,000 messages] -> Consumer
Queue full: producer blocks or gets errorRate limiting producers: Limit how fast producers can publish.
Auto-scaling consumers: Monitor queue depth and automatically add consumer instances when the queue grows. This is the most common approach in cloud environments.
Queue depth > 10,000 -> scale consumers from 2 to 10
Queue depth < 1,000 -> scale consumers back to 2Dropping messages: For non-critical data (e.g., analytics events), it may be acceptable to drop messages during extreme load.
In a distributed queue with multiple consumers, messages may be processed out of order:
Producer sends: [M1, M2, M3]
Consumer A picks up M1, takes 500ms
Consumer B picks up M2, takes 100ms
Consumer C picks up M3, takes 200ms
Processing order: M2, M3, M1 (out of order!)Single consumer: Only one consumer processes messages. Ordering is preserved but throughput is limited.
Partitioned queues: Messages with the same key (e.g., user_id) are always routed to the same partition, and each partition has a single consumer. This preserves per-key ordering while allowing parallelism across keys.
Partition 0 (user_ids hashing to 0): [M1, M4] -> Consumer A
Partition 1 (user_ids hashing to 1): [M2, M5] -> Consumer B
Partition 2 (user_ids hashing to 2): [M3, M6] -> Consumer C
Ordering is preserved per user_id, not globally.This is how Kafka works. Each topic has multiple partitions. Messages with the same key go to the same partition. Within a partition, order is guaranteed.
SQS FIFO queues: SQS offers FIFO queues that guarantee ordering within a message group ID (similar to Kafka's partition key). Standard SQS queues provide best-effort ordering only.
A traditional message broker implementing the AMQP protocol.
Strengths:
Weaknesses:
Best for: Task queues, RPC patterns, workloads with complex routing needs, moderate throughput (tens of thousands of messages per second).
A fully managed, serverless message queue.
Strengths:
Weaknesses:
Best for: AWS-native applications, serverless architectures, workloads that need zero-ops queuing.
A distributed event streaming platform. Fundamentally different from traditional queues — it is a distributed commit log.
Strengths:
Weaknesses:
Best for: High-throughput event streaming, event sourcing, log aggregation, real-time data pipelines, cases where message replay is needed.
| Feature | RabbitMQ | SQS | Kafka |
|---|---|---|---|
| Throughput | ~50K msg/s | Virtually unlimited | Millions msg/s |
| Latency | Low (~1ms) | Medium (~20-50ms) | Low-Medium (~5-10ms) |
| Ordering | Per-queue | Per-message-group (FIFO) | Per-partition |
| Replay | No | No | Yes (retention period) |
| Delivery | At-least-once | At-least-once | At-least-once / exactly-once |
| Routing | Flexible (exchanges) | Simple (queue per consumer) | Partition key only |
| Operations | Self-managed or hosted | Fully managed | Self-managed or hosted |
| Best for | Task queues, routing | Serverless, AWS-native | Event streaming, high throughput |
Use two queues — a request queue and a response queue — to implement async RPC.
Client -> [Request Queue] -> Worker -> [Reply Queue] -> Client
(Client includes a correlation_id to match responses to requests)Multiple consumers process messages from the same queue. The queue distributes messages among them. Scale consumers up or down based on load.
[Queue] -> Consumer 1 (processing M1)
-> Consumer 2 (processing M2)
-> Consumer 3 (processing M3)For distributed transactions spanning multiple services, use a sequence of messages and compensating actions.
Order Service -> [Queue] -> Payment Service (charge)
If payment fails -> [Queue] -> Order Service (cancel order)
If payment succeeds -> [Queue] -> Inventory Service (reserve)
If inventory fails -> [Queue] -> Payment Service (refund)
-> [Queue] -> Order Service (cancel order)Each service is responsible for its own transaction and for publishing events that drive the next step (or trigger compensating actions).
Know when to use queues vs direct calls. Queues add latency and complexity. Use them when you need decoupling, buffering, or async processing. Do not use them for simple synchronous request-response where latency matters.
Always specify the delivery guarantee. Say "at-least-once delivery with idempotent consumers" rather than just "we will use a queue." This shows you understand the subtleties.
Discuss idempotency. This is the most common follow-up. If you say "at-least-once," the interviewer will ask how you handle duplicates. Have an answer ready (idempotency keys, deduplication tables).
Know Kafka vs SQS vs RabbitMQ. You do not need to be an expert in all three, but you should know which to recommend and why. Kafka for high-throughput event streaming, SQS for simple managed queues, RabbitMQ for flexible routing.
Address ordering. If your system requires message ordering (e.g., processing events for a single user in sequence), explain how you achieve it (partition key in Kafka, message group ID in SQS FIFO).
Mention dead letter queues. This shows you think about failure handling. "Failed messages go to a DLQ after 3 retries with exponential backoff, and we have monitoring to alert when the DLQ grows."
Connect to the bigger picture. Message queues are a building block. They enable event-driven architecture, CQRS, saga patterns, and stream processing. Mentioning these connections shows architectural maturity.
For the standalone design question ("Design a message queue"), cover: persistence (how messages are stored), delivery guarantees, ordering, consumer groups, dead letter handling, and scaling (partitioning).