EDA for the Rest of Us: Event Design Patterns

The most important decision in event-driven architecture isn’t which technology to use—it’s what information to put in your events. Get this wrong, and you’ll end up with a system that’s more complex and tightly coupled than the monolith you were trying to escape.

Looking at successful and failed EDA implementations at companies like Netflix, Uber, and ING Bank¹, one thing becomes clear: event design patterns are what separate the successes from the expensive mistakes. The infrastructure will work whether you choose Kafka, EventBridge, or SQS. But the patterns you choose for generating and structuring events determine whether your system becomes more maintainable or turns into a debugging nightmare.

Research shows that while EDA can provide tremendous benefits, “poor implementations of EDA can make both scalability and resiliency worse than before”². The difference often comes down to how events are designed and structured from the beginning.

Why Event Design Patterns Matter

Every event you publish is a contract with your consumers. Choose the wrong pattern, and you’ll face:

  • Consumers making cascading API calls because events lack necessary context
  • Breaking changes that ripple through your entire system when you need to modify event structure
  • Performance bottlenecks when high-volume events contain too much data
  • Security issues when events expose more information than intended
  • Operational complexity from managing multiple inconsistent event formats

The good news? Most of these problems are avoidable with the right patterns applied consistently.

The Fat Events Anti-Pattern

Before exploring specific patterns, we need to address one of the most common mistakes in event design: Fat Events. These are events that contain data not owned by the domain generating the event.

Consider an order service that publishes an “order submitted” event but also includes the customer’s credit score, loyalty tier, and current inventory levels. The order service doesn’t own any of this data: the credit score and loyalty tier belong to the customer service, and the stock levels belong to the inventory service.

This creates several problems:

  • Data Ownership Violations: Services end up publishing data they don’t actually own or control, violating domain boundaries.
  • Coupling and Fragility: Changes to customer or inventory data structures now require coordinating changes across multiple services that shouldn’t be coupled.
  • Data Staleness: The customer’s loyalty tier might have changed after the order service cached it but before the event was published.
  • Security and Governance: Events expose data that consumers might not be authorized to access, creating compliance and security issues.

The solution is simple: only include data your domain owns. If you need to reference other domains, include identifiers or URLs that consumers can use to fetch that data from the authoritative source.

This principle applies regardless of which pattern you choose. Whether you’re using Event Carried State Transfer, Event Notification, or any other pattern, resist the temptation to include “helpful” data from other domains. It will cause more problems than it solves.
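
As a sketch, here’s what the lean version of that “order submitted” event might look like (all field names are illustrative): the order service publishes only what it owns and points at other domains by identifier or URL.

```python
# A lean "order submitted" event (illustrative field names).
# The credit score, loyalty tier, and inventory levels are gone;
# consumers that need them fetch from the authoritative services.
order_submitted = {
    "eventType": "OrderSubmitted",
    "eventId": "7f9c2ba4-e88f-11ee-a0d4-0242ac120002",
    "occurredAt": "2024-05-01T12:00:00Z",
    "data": {
        "orderId": "order-123",
        "customerId": "customer-456",  # reference, not embedded customer data
        "customerUrl": "https://api.example.com/customers/customer-456",
        "lineItems": [{"sku": "SKU-789", "quantity": 2, "unitPrice": 19.99}],
        "totalAmount": 39.98,
        "currency": "USD",
    },
}
```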

Pattern 1: Event Carried State Transfer

Event Carried State Transfer (ECST) is the pattern where events include state information in the payload, reducing or eliminating the need for consumers to make additional API calls. But there are two distinct approaches, each serving different architectural needs.

Note: The “Full Entity State” and “Contextual State” approaches described here are practical interpretations of Martin Fowler’s Event Carried State Transfer pattern. The terminology helps distinguish between different scopes of state transfer, though these aren’t formally defined terms in the literature.

Full Entity State Approach

This approach includes the complete current state of an entity in every event, regardless of what specifically changed. A “Customer Updated” event would include the complete customer record with all current values—profile information, preferences, subscription details, address, and metadata. This transforms events into complete snapshots that enable consumers to rebuild their entire view of the entity from any single event.
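
A minimal sketch of such an event, with hypothetical fields:

```python
# Full-entity-state "Customer Updated" event (hypothetical fields).
# The complete current record rides along, so a consumer can rebuild
# its replica of the customer from any single event.
customer_updated = {
    "eventType": "CustomerUpdated",
    "occurredAt": "2024-05-01T12:00:00Z",
    "data": {
        "customerId": "customer-456",
        "email": "pat@example.com",
        "name": {"first": "Pat", "last": "Doe"},
        "address": {"street": "1 Main St", "city": "Springfield", "zip": "00000"},
        "preferences": {"newsletter": True, "language": "en"},
        "subscription": {"plan": "pro", "renewsAt": "2025-01-01"},
    },
}
```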

When this works well:

  • Entity lifecycle events (CRUD operations) where consumers maintain replicas
  • Small to medium entities that fit comfortably in message size limits
  • Scenarios where different consumers need different subsets of entity data
  • Analytics and reporting systems that need complete entity snapshots

This pattern works well for user profile synchronization across multiple microservices, where marketing, support, and billing systems can all consume the same events and extract the data relevant to their needs without additional API calls.

The tradeoffs:

  • Massive payloads: Complex entities can create very large events, increasing bandwidth and storage costs
  • Irrelevant data exposure: Consumers get data they don’t need, potentially creating security and compliance issues
  • Schema coupling: Changes to entity structure require coordinating updates across all consumers
  • Evolution complexity: Adding fields to entities means updating event schemas and potentially breaking consumers

Contextual State Approach

This approach includes only the state information relevant to understanding and processing the specific business event. An “Order Submitted” event might include order details, payment information, and shipping address (all relevant to fulfillment processing) but not the customer’s marketing preferences or account history.
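
A sketch of the contextual variant, again with hypothetical fields; note what is deliberately absent:

```python
# Contextual-state "Order Submitted" event (hypothetical fields).
# Carries exactly the context fulfillment needs: items, a payment
# reference, and the shipping address. Marketing preferences and
# account history are deliberately excluded.
order_submitted = {
    "eventType": "OrderSubmitted",
    "occurredAt": "2024-05-01T12:00:00Z",
    "data": {
        "orderId": "order-123",
        "customerId": "customer-456",
        "lineItems": [{"sku": "SKU-789", "quantity": 2}],
        "payment": {"method": "card", "reference": "pm-789", "amount": 39.98},
        "shippingAddress": {"street": "1 Main St", "city": "Springfield"},
    },
}
```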

When this shines:

  • Business process events with focused workflows
  • Large entities where complete state would create oversized messages
  • Data governance scenarios where complete state would expose unnecessary sensitive information
  • High-volume events where payload optimization matters

The tradeoffs:

  • Design complexity: Requires careful analysis of what context each consumer actually needs
  • Still larger than notifications: More expensive than lightweight notification events
  • Potential over-inclusion: Easy to include “just in case” data that creates unnecessary coupling
  • Consumer assumption risk: If you guess wrong about what consumers need, they still end up making API calls

Event Carried State Transfer with AWS Services

When implementing ECST on AWS, the choice of service depends on your specific requirements and constraints.

Amazon SQS is often the starting point for ECST implementations; it excels at point-to-point delivery where each message goes to exactly one consumer:

  • Message limits: 256 KB standard (extendable to 2 GB with S3)
  • Throughput: Nearly unlimited for standard queues; 300 msg/s per API action for FIFO (3,000 with batching)
  • Best for: Reliable point-to-point delivery without strict ordering requirements
    • Built-in dead letter queues for failed message handling
    • Automatic scaling without configuration
    • Long polling reduces API calls and costs

Amazon Kinesis Data Streams becomes the better choice for high-throughput scenarios, handling thousands of events per second with multiple consumers:

  • Capacity: 1 MB records, 1 MB/s per shard write, 2 MB/s read
  • Retention: 24 hours to 365 days with guaranteed ordering per shard
  • Advanced features: On-demand scaling up to 200 MB/s
    • Multiple consumers can replay the same events
    • Checkpointing for exactly-once processing
    • Integration with analytics services (Kinesis Analytics, EMR)

Amazon EventBridge shines when you need sophisticated routing and filtering capabilities across distributed systems:

  • Event handling: 256 KB limit, 10,000 events/s (soft limit)
  • Routing capabilities: Content-based filtering and transformation
  • Enterprise features: Cross-account/region delivery with compliance
    • Built-in schema registry with discovery
    • Archive and replay for debugging
    • Native integration with 20+ AWS services

For most ECST implementations, start with EventBridge + SQS: EventBridge handles the intelligent routing and filtering, while SQS provides reliable delivery to individual consumers. This combination gives you the flexibility to evolve your routing logic without changing consumer code. Move to Kinesis only when you hit throughput limits or need advanced streaming features like real-time analytics.
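
To make that concrete, here is a minimal boto3 sketch of the publishing side. The bus name, source, and detail type are illustrative, and it assumes an EventBridge rule (configured separately) already routes matching events to per-consumer SQS queues.

```python
import json

import boto3

# Publish an ECST event to a custom EventBridge bus (names are illustrative).
events = boto3.client("events")

response = events.put_events(
    Entries=[
        {
            "EventBusName": "orders-bus",
            "Source": "com.example.orders",
            "DetailType": "OrderSubmitted",
            "Detail": json.dumps({
                "orderId": "order-123",
                "customerId": "customer-456",
                "totalAmount": 39.98,
                "currency": "USD",
            }),
        }
    ]
)
# put_events is not all-or-nothing: always check per-entry failures.
assert response["FailedEntryCount"] == 0, response["Entries"]
```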

Pattern 2: Event Notification

Event Notification sends minimal information to notify that something happened, with consumers making additional API calls for details they need. A “Customer Profile Updated” event would include just the customer ID, timestamp, which fields changed, and links to fetch the complete customer data if needed.
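
A sketch of such a notification, with hypothetical fields:

```python
# Thin "Customer Profile Updated" notification (hypothetical fields).
# Just enough to act on: who changed, what changed, where to get details.
profile_updated = {
    "eventType": "CustomerProfileUpdated",
    "occurredAt": "2024-05-01T12:00:00Z",
    "data": {
        "customerId": "customer-456",
        "changedFields": ["email", "address"],
        "links": {"customer": "https://api.example.com/customers/customer-456"},
    },
}
```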

When this works well:

  • High fan-out scenarios with many different consumers
  • Bandwidth or cost constraints
  • Varied consumer needs where only some require full details
  • Large state objects that would create oversized events
  • Security concerns where not all consumers should access complete data

This pattern works well for account balance changes in financial services, where compliance requires that balance queries always hit the authoritative source. Events become lightweight triggers that tell downstream systems “something changed with account X” without including the sensitive balance data.
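
On the consumer side, handling a notification reduces to a callback plus a fetch. A minimal sketch, assuming the hypothetical payload above and the third-party requests library:

```python
import requests  # third-party HTTP client (pip install requests)

def handle_profile_updated(event: dict) -> dict:
    """Fetch current state from the authoritative source on notification.

    The URL comes from the event itself, so the consumer isn't hard-coded
    to the producer's API layout. Endpoint and auth are illustrative.
    """
    url = event["data"]["links"]["customer"]
    response = requests.get(url, timeout=5)
    response.raise_for_status()
    # Note: this returns the *current* state, which may already be newer
    # than the change that triggered the notification.
    return response.json()
```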

The tradeoffs:

  • Additional complexity as consumers must implement API client logic
  • Higher latency from additional round-trips
  • Potential inconsistency if data changes between notification and fetch
  • API dependency creates coupling between event consumers and source services

Event Notification with AWS Services

When implementing Event Notification on AWS, choose based on your scale and routing requirements.

Amazon EventBridge is the recommended starting point for most event notification implementations:

  • Advanced routing: Rule-based filtering on any JSON attribute (see the sketch after this list)
  • Schema management: Automatic discovery and versioning
  • Built-in integrations: Direct delivery to 20+ AWS services
    • Configurable archive retention for event replay
    • Event transformation before delivery
    • Dead letter queues for failed deliveries
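
A minimal sketch of that content-based filtering, assuming the orders-bus and event shape from the ECST example earlier; all names and thresholds are illustrative:

```python
import json

import boto3

# Create a rule that forwards only high-value orders to its targets.
events = boto3.client("events")

events.put_rule(
    Name="high-value-orders",
    EventBusName="orders-bus",
    State="ENABLED",
    EventPattern=json.dumps({
        "source": ["com.example.orders"],
        "detail-type": ["OrderSubmitted"],
        # Numeric matching: only orders of 500 or more pass the filter.
        "detail": {"totalAmount": [{"numeric": [">=", 500]}]},
    }),
)
```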

Amazon SNS excels at massive scale notification scenarios:

  • Scale: Millions of messages/second, 12.5M subscriptions per topic
  • Multi-protocol: SQS, Lambda, HTTP, email, SMS from one topic
  • FIFO topics: 300 TPS (3,000 with batching) for ordered delivery
    • Message filtering reduces downstream processing (see the sketch after this list)
    • Fan-out to thousands of subscribers
    • Cross-region message replication
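
A minimal sketch of publishing with message attributes so filter policies can do that work; the topic ARN and attribute names are illustrative:

```python
import json

import boto3

sns = boto3.client("sns")

sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:customer-events",
    Message=json.dumps({"customerId": "customer-456", "changedFields": ["email"]}),
    MessageAttributes={
        "eventType": {"DataType": "String",
                      "StringValue": "CustomerProfileUpdated"},
    },
)
# A subscription filter policy such as {"eventType": ["CustomerProfileUpdated"]}
# delivers this message only to subscribers that opted in to it.
```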

For most teams, EventBridge is the right default. Move to SNS when you need massive scale or its unique delivery protocols, such as email and SMS.

Pattern 3: Async Commands

Async Commands enable asynchronous task execution by sending command messages to queues for background processing. Unlike events, which represent facts about the past, commands represent requests for future actions that need to be processed by specific handlers.

When this works well:

  • Task-based UIs where user actions map to specific business operations
  • Explicit intent modeling with validation and authorization before state changes
  • Asynchronous task execution that decouples requestors from executors
  • Complex business processes requiring orchestration

A “Process Payment” command would include the order ID, payment amount, currency, payment method reference, and who requested the processing. The command handler processes this request, validates business rules, and upon success emits events like PaymentProcessed or PaymentFailed.
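
A sketch of that command message, with hypothetical fields:

```python
# "Process Payment" command (hypothetical fields). A command is an
# imperative request aimed at one handler, not a fact about the past.
process_payment = {
    "commandType": "ProcessPayment",
    "commandId": "cmd-0001",
    "requestedBy": "user-42",
    "requestedAt": "2024-05-01T12:00:00Z",
    "data": {
        "orderId": "order-123",
        "amount": 39.98,
        "currency": "USD",
        "paymentMethodRef": "pm-789",  # a token, never raw card data
    },
}
# On success the handler emits PaymentProcessed; on failure, PaymentFailed.
```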

The tradeoffs:

  • Additional complexity from command handling infrastructure
  • Tighter coupling than events as commands target specific handlers
  • More messaging overhead from command-event sequences
  • Need for robust error handling and retry mechanisms

Async Commands with AWS Services

When implementing Async Commands on AWS, SQS is the primary service for reliable command delivery.

Amazon SQS is purpose-built for command queues with reliable point-to-point delivery:

  • FIFO queues ensure commands are processed in order when required
  • Dead letter queues handle failed commands that exceed retry limits
  • Visibility timeouts prevent duplicate processing during command handling

For async command implementations, use SQS queues with your preferred command processing infrastructure. This provides reliable delivery, built-in error handling, and decouples command submission from processing.
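
A minimal sketch of both sides over a FIFO queue; the queue URL and command shape are illustrative:

```python
import json

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/payment-commands.fifo"

command = {"commandType": "ProcessPayment", "commandId": "cmd-0001",
           "data": {"orderId": "order-123", "amount": 39.98}}

# Producer: group by order so commands for one order stay in order.
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps(command),
    MessageGroupId=command["data"]["orderId"],
    MessageDeduplicationId=command["commandId"],
)

# Consumer: long-poll, process, then delete. Messages that aren't deleted
# reappear after the visibility timeout and, with a redrive policy, land
# in the dead letter queue after the retry limit.
messages = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
).get("Messages", [])
for message in messages:
    body = json.loads(message["Body"])
    # ... validate and execute the command here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```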

Pattern 4: Change Data Capture (CDC)

CDC captures changes from existing databases and transforms them into events, enabling event-driven integration without modifying existing applications. When a customer updates their email address in the database, CDC would capture that change and publish an event showing the before and after values, along with metadata about the change.
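
A sketch of what such a change event might carry; the envelope is hypothetical, loosely modeled on common CDC formats:

```python
# CDC change event (hypothetical shape): before/after images plus
# metadata about where and when the change happened.
email_changed = {
    "source": {"database": "crm", "table": "customers"},
    "operation": "UPDATE",
    "timestamp": "2024-05-01T12:00:00Z",
    "before": {"customer_id": "customer-456", "email": "old@example.com"},
    "after": {"customer_id": "customer-456", "email": "new@example.com"},
}
```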

When this works well:

  • Legacy system integration where you can’t modify existing applications
  • Database-first applications built around database operations
  • Real-time synchronization requirements
  • Audit and compliance needs for detailed change records

The tradeoffs:

  • High-change volume that can overwhelm downstream systems
  • Events may lack business context not available in database changes
  • Security concerns about exposing database-level changes
  • Additional operational complexity from CDC infrastructure

Change Data Capture with AWS Services

When implementing CDC on AWS, choose based on your database type and integration requirements.

Amazon DynamoDB Streams provides native CDC for DynamoDB tables:

  • Capture details: Real-time item changes with 24-hour retention
  • Consumers: Maximum of 2 concurrent readers per shard
  • Integration: Direct Lambda triggers for serverless processing (see the sketch after this list)
    • 400 KB record size (same as DynamoDB item limit)
    • Guaranteed ordering per item
    • Exactly-once processing with Lambda
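
A minimal sketch of that Lambda integration. The record fields follow the DynamoDB Streams format; the business logic is a placeholder, and it assumes the stream’s view type is NEW_AND_OLD_IMAGES and that ReportBatchItemFailures is enabled.

```python
def handler(event, context):
    """Process DynamoDB Streams records delivered to Lambda in batches."""
    for record in event["Records"]:
        operation = record["eventName"]    # INSERT | MODIFY | REMOVE
        keys = record["dynamodb"]["Keys"]  # DynamoDB-typed JSON
        if operation == "MODIFY":
            old_image = record["dynamodb"].get("OldImage", {})
            new_image = record["dynamodb"].get("NewImage", {})
            # Compare images here to derive a business-level change event.
    # Empty list signals the whole batch succeeded.
    return {"batchItemFailures": []}
```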

AWS Database Migration Service (DMS) enables CDC for relational databases:

  • Database support: MySQL, PostgreSQL, Oracle, SQL Server, and more
  • Performance: Varies by instance (dms.r5.large ~4,000 TPS)
  • Flexibility: Full load + CDC or CDC only operation modes
    • LOB columns up to 1 GB supported
    • Built-in data transformation
    • Multiple target endpoints (Kinesis, S3, RDS)

Amazon Kinesis Data Streams handles high-volume CDC event streaming:

  • High throughput: Process thousands of changes/second
  • Multiple consumers: Independent processing with replay capability
  • Direct integration: Receives CDC data from DMS
    • 1 MB record size limit
    • Guaranteed ordering per partition key
    • Up to 365 days retention for replay

For most CDC implementations, start with native database streams (DynamoDB Streams) when possible. Use DMS for relational databases that need CDC capabilities without application changes. As documented in AWS Prescriptive Guidance, combining CDC with event-driven patterns enables gradual modernization of legacy systems.

Choosing the Right Pattern

Your choice depends on several factors:

  • Data Requirements: How much data do consumers need, and how fresh must it be?
  • Performance Needs: What are your latency and throughput requirements?
  • Consistency Model: What consistency guarantees do you need?
  • Integration Constraints: Working with existing systems or building greenfield?
  • Team Skills: What patterns can your team effectively implement and maintain?

Most successful systems use multiple patterns simultaneously, choosing the most appropriate one for each type of data and use case.

Pattern Selection Decision Matrix

| Pattern | Event Size | Coupling | Latency | Throughput | Best Use Cases |
|---|---|---|---|---|---|
| Event Notification | Small (< 1 KB) | Lowest | Higher (requires API calls) | Highest | Many diverse consumers; security-sensitive data; bandwidth constraints; unknown consumer needs |
| ECST (Full) | Large (10-256 KB) | Medium | Lower | Medium | Data replication; analytics/reporting; offline processing; known consumer needs |
| ECST (Contextual) | Medium (1-50 KB) | Medium | Lower | Medium-High | Business workflows; specific use cases; performance critical; domain boundaries |
| Async Commands | Small-Medium | Higher | Medium | High | Task execution; user actions; orchestration; explicit operations |
| CDC | Variable | Low | Lowest | Variable | Legacy integration; database sync; audit trails; real-time ETL |

Key Tradeoffs Summary

| Consideration | Event Notification | Event Carried State Transfer |
|---|---|---|
| Runtime Coupling | Higher (API dependency) | Lower (self-contained) |
| Schema Evolution | Easier (minimal contract) | Harder (full contract) |
| Network Efficiency | Lower (multiple calls) | Higher (single message) |
| Data Freshness | Always current | Point-in-time snapshot |
| Consumer Complexity | Higher (API client needed) | Lower (just parse event) |
| Publisher Complexity | Lower (just notify) | Higher (gather all data) |

The Foundation of a Good Event-Driven Architecture

Event design patterns are the foundation that everything else in your event-driven architecture builds upon. Get these patterns right, and you’ll have events that tell clear stories about your business processes. Get them wrong, and you’ll spend more time debugging event flows than building features.

The goal isn’t to follow academic patterns perfectly—it’s to create events that make your system easier to understand, debug, and modify over time. When someone wakes up at 3 AM to debug a production issue, they should be able to read your event logs and understand the sequence of business activities that led to the problem.

Next week, we’ll explore communication patterns—how to choose between EventBridge, SNS, SQS, and Kafka for different scenarios, and when you still need request-response patterns in an event-driven world.

What event design challenges are you facing? Have you found patterns that work particularly well for your domain? I’d love to hear about your experiences.


References

Foundational Event-Driven Architecture Resources

  1. What do you mean by “Event-Driven”? - Martin Fowler’s foundational article defining event-driven architecture patterns (2017)

  2. Enterprise Integration Patterns: Messaging Patterns Overview - Gregor Hohpe and Bobby Woolf’s comprehensive messaging patterns catalog

  3. Domain-Driven Design - Eric Evans’ concepts including domain events and bounded contexts

AWS Official Documentation

  1. AWS Event-Driven Architecture - AWS’s official guide to building event-driven applications

  2. AWS Well-Architected Framework: Event-Driven Architectures - Best practices for serverless event-driven systems

  3. Building Event-Driven Architectures on AWS - AWS Prescriptive Guidance

AWS Service-Specific Documentation

  1. Amazon SQS Developer Guide - Including message size limits and best practices

  2. Amazon Kinesis Data Streams Developer Guide - Stream processing patterns and limits

  3. Amazon EventBridge User Guide - Event routing and schema registry documentation

  4. DynamoDB Streams Developer Guide - Change data capture for DynamoDB

  5. AWS Database Migration Service User Guide - CDC implementation for relational databases

Pattern Implementation Guides

  1. The Event-Carried State Transfer Pattern - Detailed implementation guide with examples

  2. Event Notification Pattern in Microservices - Chris Richardson’s microservices patterns

  3. CQRS (Command Query Responsibility Segregation) - Martin Fowler’s explanation of the CQRS pattern (relevant for Async Commands)

  4. Enriching Event-Driven Architectures with AWS Event Ruler - AWS Compute Blog on event filtering

Industry Case Studies

  1. Netflix: Scaling Event Sourcing for Millions of Devices - Netflix Technology Blog (2017)

  2. Uber: Real-Time Data Infrastructure - Uber Engineering’s event processing architecture (2021)

  3. Building Event-Driven Microservices at ING - InfoQ coverage of ING’s Kafka implementation (2018)

Academic and Research Papers

  1. Event-Driven Architecture: A Review of Definitions, Key Characteristics, and Research Challenges - IEEE International Conference on Distributed Event-Based Systems (2020)

  2. A Systematic Mapping Study on Microservices Architecture in DevOps - Journal of Systems and Software (2018)

Additional AWS Resources

  1. Best Practices for Implementing Event-Driven Architectures - AWS Architecture Blog

  2. Architectural Patterns for Real-time Analytics using Amazon Kinesis - AWS Big Data Blog

  3. Building Resilient Serverless Patterns by Combining Messaging Services - AWS Compute Blog

Books and Extended Resources

  1. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions - Hohpe & Woolf (ISBN: 978-0321200686)

  2. Building Event-Driven Microservices - Adam Bellemare, O’Reilly (2020)

  3. Domain-Driven Design: Tackling Complexity in the Heart of Software - Eric Evans (ISBN: 978-0321125217)

  4. Designing Event-Driven Systems - Ben Stopford, O’Reilly (2018) - Free ebook from Confluent

  5. Flow Architectures: The Future of Streaming and Event-Driven Integration - James Urquhart, O’Reilly (2021)

  6. Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing - Tyler Akidau, Slava Chernyak, Reuven Lax, O’Reilly (2018)

Tools and Specifications

  1. CloudEvents Specification - CNCF standard for describing event data

  2. AsyncAPI Specification - Industry standard for defining asynchronous APIs

  3. EventCatalog - Open source documentation tool for event-driven architectures