Event-Driven Architecture

1. What Is It?

Event-driven architecture (EDA) is a system design style where components communicate by producing and consuming events through a durable, asynchronous channel, rather than by calling each other synchronously over RPC or HTTP. An event is an immutable record that something happened in the past (OrderPlaced, PaymentCaptured, UserSignedUp), and the producer doesn't know or care which services consume it.

The problem it solves: in request/response architectures, every new consumer of an action requires the producer to call it, adding latency, coupling, and a new failure mode for the producer. EDA decouples what happened from who reacts. Without it, your checkout service ends up calling inventory, email, analytics, fraud, recommendations, and warehouse, and each call is a new way for checkout to fail or slow down.

QUICK CHECK

A checkout service currently calls inventory, email, analytics, fraud, and warehouse services directly via HTTP whenever an order is placed. The team notices that a slowdown in the analytics service is causing checkout to time out. Which architectural change most directly addresses this problem?


2. How It Works

  1. A service performs its own business logic and writes an event to a log or broker (Kafka topic, SNS topic, EventBridge bus).
  2. The broker durably stores the event and makes it available to any subscribed consumer.
  3. Consumers independently read the event, do their own work, and update their own state. They don't acknowledge back to the producer.
  4. Each consumer maintains its own progress (offset) and failure handling. A slow consumer doesn't slow down the producer or other consumers.
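
The four steps above can be sketched with a toy in-memory log. This is a minimal illustration, not a real broker client: `InMemoryLog`, `Event`, and the consumer names are hypothetical, and a Python list stands in for durable storage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str
    type: str
    payload: dict


@dataclass
class InMemoryLog:
    """Toy stand-in for a broker: an append-only list plus per-consumer offsets."""
    events: list = field(default_factory=list)
    offsets: dict = field(default_factory=dict)

    def append(self, event: Event) -> int:
        # Step 1: the producer writes and is done; it never learns who reads.
        self.events.append(event)
        return len(self.events) - 1

    def poll(self, consumer: str, max_events: int = 10) -> list:
        # Steps 2-3: each consumer reads the same events from its own position.
        start = self.offsets.get(consumer, 0)
        return self.events[start:start + max_events]

    def commit(self, consumer: str, count: int) -> None:
        # Step 4: progress is tracked per consumer, not globally.
        self.offsets[consumer] = self.offsets.get(consumer, 0) + count

    def lag(self, consumer: str) -> int:
        # Events written but not yet processed by this consumer.
        return len(self.events) - self.offsets.get(consumer, 0)


log = InMemoryLog()
for i in range(3):
    log.append(Event(f"evt-{i}", "OrderPlaced", {"order_id": f"o-{i}"}))

# Inventory keeps up; email processes only one event and falls behind.
inventory_batch = log.poll("inventory")
log.commit("inventory", len(inventory_batch))
email_batch = log.poll("email", max_events=1)
log.commit("email", len(email_batch))

print(log.lag("inventory"), log.lag("email"))  # prints: 0 2
```

The slow email consumer accumulates lag without affecting the inventory consumer or the producer, because the only shared state is the log itself.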

Concrete example. Compare a checkout flow before and after:

  • RPC style: Checkout synchronously calls InventoryService.reserve(), EmailService.send(), AnalyticsService.track(). If email is slow, the user waits. If analytics is down, the request fails (or you add try/catch noise everywhere).
  • EDA style: Checkout writes one OrderPlaced event to Kafka and returns 201 to the client. Inventory, Email, and Analytics each have their own consumer reading the topic. Each fails or scales independently. Adding a fourth consumer (recommendations) requires zero changes to checkout.
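
To make the contrast concrete, here is a small sketch of both styles with analytics deliberately failing. All function names (`checkout_rpc`, `checkout_eda`, `run_consumers`) and the 503 status are assumptions for illustration, and a plain list stands in for the topic.

```python
class AnalyticsDown(Exception):
    pass


def reserve_inventory(order_id: str) -> None:
    pass  # pretend this succeeds


def send_email(order_id: str) -> None:
    pass  # pretend this succeeds


def track_analytics(order_id: str) -> None:
    raise AnalyticsDown("analytics unavailable")


def checkout_rpc(order_id: str) -> int:
    """RPC style: checkout's result depends on every downstream call."""
    try:
        reserve_inventory(order_id)
        send_email(order_id)
        track_analytics(order_id)
    except AnalyticsDown:
        return 503  # a non-critical dependency fails the whole request
    return 201


def checkout_eda(order_id: str, topic: list) -> int:
    """EDA style: record the fact and return; reactions happen elsewhere."""
    topic.append({"type": "OrderPlaced", "order_id": order_id})
    return 201


def run_consumers(topic: list, handlers: dict) -> dict:
    """Each consumer processes the topic on its own and fails on its own."""
    results = {}
    for name, handler in handlers.items():
        try:
            for event in topic:
                handler(event["order_id"])
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"failed: {exc}"
    return results


topic = []
rpc_status = checkout_rpc("o-1")
eda_status = checkout_eda("o-1", topic)
results = run_consumers(topic, {
    "inventory": reserve_inventory,
    "email": send_email,
    "analytics": track_analytics,
})
print(rpc_status, eda_status, results["analytics"])
```

In the EDA version the analytics failure is contained in the analytics consumer; checkout has already returned 201, and adding another entry to `handlers` would not touch `checkout_eda`.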

QUICK CHECK

A checkout service publishes an OrderPlaced event to a Kafka topic and immediately returns 201 Created to the client. Three consumers — Inventory, Email, and Analytics — each read from that topic independently. The Email consumer experiences a spike in latency and falls behind processing events. What is the effect on the other consumers and the checkout service?


3. What Mid-Senior SWEs Actually Need to Know

  • Events are facts, not commands. OrderPlaced (past tense, immutable, public fact) is an event. SendEmail (imperative, addressed to one service) is a command — that's still RPC, just over a queue.
  • The contract is the event schema, not an interface. Breaking changes to an event schema break every consumer. A schema registry (Avro/Protobuf) with compatibility rules is how you survive.
  • The producer never knows the consumers. This is the point. If your producer needs to know whether a downstream consumer succeeded, you're building request/response on top of a broker, so pick the right tool.
  • Ordering is per-key, not global. All OrderPlaced events for the same order_id will arrive in order at the consumer if you partition by order_id. Across orders, there is no guarantee.
  • Idempotency in consumers is mandatory. At-least-once delivery is the default; consumers must handle the same event twice (dedupe by event ID, or use idempotent writes).
  • Common misunderstanding: "EDA gives me real-time." Not directly: it gives you decoupling. The latency win comes from removing synchronous chains; the latency floor is the broker's end-to-end propagation, typically tens to hundreds of ms.
  • Failure mode shift. In RPC, failures are synchronous and visible. In EDA, failures are async: a stuck consumer produces a growing lag. You must monitor consumer lag the way you monitor RPC latency.
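
Two of the points above, per-key ordering and lag monitoring, can be sketched in a few lines. This assumes a generic hash partitioner and made-up offsets; real broker clients ship their own partitioner, and `partition_for`, `events_for`, and `consumer_lag` are hypothetical helper names.

```python
import hashlib

NUM_PARTITIONS = 4


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition deterministically (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Each partition is an ordered list; order is preserved only within a partition.
partitions = [[] for _ in range(NUM_PARTITIONS)]
events = [
    ("o-1", "OrderPlaced"),
    ("o-2", "OrderPlaced"),
    ("o-1", "OrderPaid"),
    ("o-2", "OrderShipped"),
    ("o-1", "OrderShipped"),
]
for order_id, event_type in events:
    partitions[partition_for(order_id)].append((order_id, event_type))


def events_for(order_id: str) -> list:
    """Events for one key, in the order its partition stored them."""
    partition = partitions[partition_for(order_id)]
    return [event_type for key, event_type in partition if key == order_id]


def consumer_lag(log_end_offsets: dict, committed_offsets: dict) -> int:
    """Total events written but not yet committed, summed over partitions."""
    return sum(
        end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    )


print(events_for("o-1"))
print(consumer_lag({0: 10, 1: 7}, {0: 10, 1: 4}))  # prints: 3
```

Because the same key always maps to the same partition, events for one order keep their relative order; events for different orders may sit in different partitions and interleave arbitrarily. A lag value that keeps growing is the async equivalent of rising RPC latency.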

QUICK CHECK

A consumer in your event-driven system is processing OrderPlaced events, but due to a network hiccup, the broker re-delivers an event that was already processed. The consumer processes it again, creating a duplicate order record in the database. Which design principle was violated, and what is the standard remedy?


4. Tradeoffs & Decisions

When to choose EDA:

  • You need to add new reactions to a business action without modifying the originator.
  • Multiple downstream systems need the same facts (analytics + ML + audit).
  • Producers and consumers have different scaling, latency, or availability characteristics.
  • You want services to fail independently.

When NOT to choose EDA:

  • You need an immediate result from a single downstream (RPC is correct).
  • You need transactional read-your-writes behavior; the consumer side is eventually consistent.
  • The team has no operational maturity for async failure modes (lag, replays, dead-letter queues).

| If you need... | Pick... | Choose the other when... |
|---|---|---|
| Decoupled fan-out, independent scaling | EDA | Single caller needs a synchronous answer |
| Immediate consistency between two services | Synchronous RPC + DB transaction | You can live with eventual consistency |
| Replayable history, multiple readers | Log-based broker (Kafka) | Strict work-queue semantics (use SQS / RabbitMQ) |
| One-shot work dispatch (each job done once) | Queue with ack/nack | You need multiple independent consumers of the same event |

Key tradeoff: decoupling vs traceability. Tracing a user action across an EDA system requires distributed tracing, structured logging with correlation IDs, and tooling discipline. In RPC, the call stack is the trace.
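
One way to recover some of that traceability is to carry a correlation ID on every event and include it in every structured log line. The sketch below assumes a hypothetical event envelope (`event_id`, `correlation_id`, `type`, `payload`) and a hypothetical follow-up event named StockReserved; it is not a tracing library.

```python
import json
import uuid


def new_event(event_type: str, payload: dict, correlation_id: str = None) -> dict:
    """Build an event envelope; reuse the caller's correlation ID if given."""
    return {
        "event_id": str(uuid.uuid4()),
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "type": event_type,
        "payload": payload,
    }


def log_line(service: str, message: str, event: dict) -> str:
    """Structured log entry that can be joined across services by correlation_id."""
    return json.dumps({
        "service": service,
        "message": message,
        "correlation_id": event["correlation_id"],
        "event_id": event["event_id"],
    }, sort_keys=True)


# Checkout starts the chain; inventory reacts and propagates the same ID.
order_placed = new_event("OrderPlaced", {"order_id": "o-1"})
stock_reserved = new_event(
    "StockReserved",
    {"order_id": "o-1"},
    correlation_id=order_placed["correlation_id"],
)

lines = [
    log_line("checkout", "published OrderPlaced", order_placed),
    log_line("inventory", "published StockReserved", stock_reserved),
]
for line in lines:
    print(line)
```

Searching logs for one `correlation_id` then reconstructs the path of a single user action, which is the role the call stack plays in RPC.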

QUICK CHECK

A checkout service places an order and must immediately confirm to the user that their inventory reservation succeeded before returning a response. Which architectural approach best fits this requirement, and why?


5. Interview & System Design Cheat Sheet

  • EDA is fundamentally about inversion of dependency: the producer publishes facts and consumers subscribe, so the producer has no dependency on consumers.
  • Use past-tense, business-meaningful event names. OrderPlaced, not OrderPlacedHandler or ProcessOrder.
  • The event log is the system of record for what happened, which is why log-based brokers (Kafka) dominate over fire-and-forget queues for serious EDA.
  • Consumers must be idempotent because the delivery contract is at-least-once. Exactly-once is achievable only at specific boundaries (Kafka transactions, idempotent sinks).
  • The cost of EDA is eventual consistency, harder debugging, and the need for a schema-registry / observability stack. Pay it when the decoupling is worth it; skip it for simple two-service interactions.
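
The idempotency point can be illustrated with a consumer that dedupes by event ID. This is a minimal sketch: `InventoryConsumer` and the event fields are hypothetical, and an in-memory set stands in for a durable dedupe store. In a real service the dedupe record and the state change would need to be committed together, or a crash between the two could still produce a duplicate or a lost update.

```python
class InventoryConsumer:
    """Applies each OrderPlaced event at most once, keyed by event_id."""

    def __init__(self, stock: dict):
        self.stock = dict(stock)
        self.processed_event_ids = set()  # assumption: durable in a real system

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event["event_id"] in self.processed_event_ids:
            return False  # redelivery under at-least-once: safely ignored
        self.stock[event["sku"]] -= event["quantity"]
        self.processed_event_ids.add(event["event_id"])
        return True


consumer = InventoryConsumer({"sku-1": 10})
event = {"event_id": "evt-42", "type": "OrderPlaced", "sku": "sku-1", "quantity": 2}

applied_first = consumer.handle(event)
applied_again = consumer.handle(event)  # the broker redelivers the same event

print(applied_first, applied_again, consumer.stock["sku-1"])  # prints: True False 8
```

Without the `processed_event_ids` check, the redelivery would decrement stock twice.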

Common follow-ups:

  • "How do you handle a consumer that needs to call back the producer with a result?" That's request/reply; do it with a correlation ID and a response topic, or use RPC directly. Don't fake RPC over events.
  • "What's the difference between event-driven and event-sourced?" EDA is about communication between services. Event sourcing is a storage pattern where the events are the source of truth for one service's state. They compose well but solve different problems.
  • "How do you handle schema evolution?" Backward-compatible additions only; deprecate fields before deleting; enforce via compatibility rules; version the event if a breaking change is unavoidable.
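
The request/reply answer in the first follow-up can be sketched as follows. Two in-memory deques stand in for a request topic and a response topic; `send_request`, `serve_requests`, and `collect_replies` are hypothetical names, and the "reserved:" result string is invented for the example.

```python
import uuid
from collections import deque

request_topic = deque()
reply_topic = deque()


def send_request(payload: dict) -> str:
    """Requester: publish a request and remember its correlation ID."""
    correlation_id = str(uuid.uuid4())
    request_topic.append({"correlation_id": correlation_id, "payload": payload})
    return correlation_id


def serve_requests() -> None:
    """Responder: handle each request and echo its correlation ID in the reply."""
    while request_topic:
        request = request_topic.popleft()
        reply_topic.append({
            "correlation_id": request["correlation_id"],
            "result": f"reserved:{request['payload']['order_id']}",
        })


def collect_replies(pending: set) -> dict:
    """Requester: match replies to outstanding requests by correlation ID."""
    matched = {}
    while reply_topic:
        reply = reply_topic.popleft()
        if reply["correlation_id"] in pending:
            matched[reply["correlation_id"]] = reply["result"]
    return matched


first = send_request({"order_id": "o-1"})
second = send_request({"order_id": "o-2"})
serve_requests()
replies = collect_replies({first, second})

print(replies[first], replies[second])  # prints: reserved:o-1 reserved:o-2
```

The correlation ID is what lets the requester pair each reply with its request when replies arrive asynchronously. If the caller has to block until the reply arrives, plain RPC is usually the simpler choice.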

If asked to design X, anchor on this: Identify which interactions are "tell me something happened" (events) vs "do this now and tell me the result" (commands/RPC). Events go on a broker; commands stay synchronous. Mixing them is where most EDA designs go wrong.

QUICK CHECK

A checkout service publishes an OrderPlaced event. The inventory service consumes it and updates stock counts, but your broker guarantees at-least-once delivery. What property must the inventory service implement to ensure correctness?
