Let’s be clear about something upfront: if you’re using DynamoDB as a simple key-value store—storing session data, caching API responses, or managing user preferences—you don’t need any of this. Just use the AWS SDK directly. Store your items, retrieve them by ID, and move on with your life.
But the moment you try to use DynamoDB as your primary database with a real data model—with relationships between entities, complex access patterns, and business logic—everything changes. Suddenly, you’re fighting the database at every turn.
This is where teams either give up on DynamoDB or level up their approach with single table design and proper tooling. I’ve watched this pattern play out dozens of times, and it always follows the same three phases.
Table of Contents
- The Three Phases of DynamoDB Complexity
- The Fundamental Difference: Access Patterns vs Flexibility
- The Mental Model Shift
- Making the Decision
- What’s Next
The Three Phases of DynamoDB Complexity
Phase 1: “This is easy!”
// Simple key-value storage - DynamoDB shines here
await dynamodb.putItem({
  TableName: 'Sessions',
  Item: {
    sessionId: { S: 'abc123' },
    userId: { S: 'user-456' },
    expiresAt: { N: '1735689600' }
  }
});
// Retrieve by ID - perfect!
const session = await dynamodb.getItem({
  TableName: 'Sessions',
  Key: { sessionId: { S: 'abc123' } }
});
At this stage, DynamoDB feels magical. Consistent single-digit-millisecond reads. Predictable costs. No database servers to manage. You’re storing things and getting them back by ID. Life is good.
Phase 2: “Okay, getting complex…”
// Now we need users and their orders
// Still manageable but notice the boilerplate growing
await dynamodb.putItem({
  TableName: 'AppData',
  Item: {
    pk: { S: 'USER#user-456' },
    sk: { S: 'PROFILE' },
    email: { S: '[email protected]' },
    name: { S: 'John Doe' }
  }
});
await dynamodb.putItem({
  TableName: 'AppData',
  Item: {
    pk: { S: 'USER#user-456' },
    sk: { S: 'ORDER#2024-12-24#order-789' },
    orderId: { S: 'order-789' },
    total: { N: '99.99' },
    items: { L: [...] } // Complex type conversions
  }
});
Requirements evolved. Now you’re storing users and their orders. You’ve discovered partition keys and sort keys. You’re using generic names like pk and sk because someone on Stack Overflow said that’s what you do. The boilerplate is growing, but you’re managing.
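One aside before Phase 3: if it’s the { S: ... } wrappers that hurt the most, the SDK’s document client already removes them. Here’s a minimal sketch of the same order write, assuming AWS SDK v3 and @aws-sdk/lib-dynamodb (the line-item shape is made up):
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
const doc = DynamoDBDocument.from(new DynamoDB({}));
// Same order item as above, written through the document client,
// which handles the { S: ... } / { N: ... } conversions for you
await doc.put({
  TableName: 'AppData',
  Item: {
    pk: 'USER#user-456',
    sk: 'ORDER#2024-12-24#order-789',
    orderId: 'order-789',
    total: 99.99,
    items: [{ sku: 'widget-1', quantity: 2 }] // hypothetical line items
  }
});
That trims the type conversions, but notice it does nothing about the key design itself. That problem is still coming.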
Phase 3: “What have we done?”
// Multiple access patterns, derived keys, complex queries
// This is where teams either adopt proper patterns or suffer
// Need orders by date across all users?
// Need to find users by email?
// Need to list products in an order with their details?
// Need to handle many-to-many relationships?
// Suddenly you're managing:
// - Multiple GSIs with overloaded keys
// - Complex key generation strategies
// - Manual type conversions everywhere
// - Inconsistent patterns across the team
// - No type safety
// - Pages of boilerplate for every operation
This is where the wheels come off. Your product manager asks for a simple feature: “Can users search for orders by date?” Simple in SQL. In DynamoDB? You realize you didn’t design for that access pattern. You need a Global Secondary Index (GSI). But which keys? How do you handle filtering? What about pagination?
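To make that concrete, here’s a sketch of what the “orders by date” pattern looks like once a GSI exists for it. The index and key names (GSI1, GSI1PK, GSI1SK) and the single ORDER partition are illustrative choices only, and a lone partition key like this would need sharding at real scale:
// 'doc' is the document client from the earlier sketch.
// Assumes every order item is also written with:
//   GSI1PK = 'ORDER'              (groups all orders together)
//   GSI1SK = '<date>#<orderId>'   (sorts them chronologically)
// Pagination is manual: keep feeding LastEvaluatedKey back in.
let lastKey: Record<string, any> | undefined;
do {
  const page = await doc.query({
    TableName: 'AppData',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :pk AND GSI1SK BETWEEN :from AND :to',
    ExpressionAttributeValues: {
      ':pk': 'ORDER',
      ':from': '2024-12-01',
      ':to': '2024-12-31#~' // '~' sorts after any ASCII orderId suffix
    },
    ExclusiveStartKey: lastKey
  });
  // ...hand page.Items to the caller...
  lastKey = page.LastEvaluatedKey;
} while (lastKey);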
Different engineers solve the same problems differently. One person uses USER#123 as their partition key format. Another uses user-123. A third uses 123#USER. Nobody can find anything. Code reviews become battles over key naming conventions.
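One mitigation, sketched here rather than prescribed: centralize key construction in a single module so the format stops being a per-engineer decision.
// keys.ts: the one place in the codebase that knows how keys are spelled.
// The exact format matters less than everyone using the same one.
export const keys = {
  user: (userId: string) => ({
    pk: `USER#${userId}`,
    sk: 'PROFILE'
  }),
  order: (userId: string, isoDate: string, orderId: string) => ({
    pk: `USER#${userId}`,
    sk: `ORDER#${isoDate}#${orderId}`
  })
};
// keys.order('user-456', '2024-12-24', 'order-789')
// -> { pk: 'USER#user-456', sk: 'ORDER#2024-12-24#order-789' }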
The transition from Phase 1 to Phase 3 is where DynamoDB’s learning curve becomes a cliff. You’re no longer just storing and retrieving items—you’re implementing a complex data model without joins, without flexible queries, and without the tools you’re used to from SQL databases.
This is exactly where you need to make a choice:
- Stay simple: Keep DynamoDB for key-value patterns, use PostgreSQL for complex queries
- Level up: Embrace single table design and use proper tooling
- Suffer: Try to force relational patterns onto DynamoDB (don’t do this)
Before you choose option 2, you need to understand why DynamoDB is so different from what you’re used to.
The Fundamental Difference: Access Patterns vs Flexibility
Here’s the thing that breaks people coming from SQL: you cannot model your data generically in DynamoDB.
In a relational database, you normalize your data into clean tables with foreign keys, then figure out your queries later. Need to find all orders from California customers who bought electronics in the last 30 days? No problem—write a JOIN with a WHERE clause. The database will figure it out.
DynamoDB doesn’t work that way. At all.
There are no JOINs. There’s no flexible WHERE clause that can filter on any attribute. You can’t just “figure out the queries later.” Instead, you must know every single access pattern upfront and design your data specifically to support those patterns.
This sounds insane if you’re coming from SQL. How can you possibly know every query before you build the system? What if requirements change? But this constraint is actually DynamoDB’s superpower—by eliminating flexibility, it can guarantee consistent performance at any scale.
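In practice, “know every access pattern upfront” means literally writing the patterns down before you design anything. A sketch of what that inventory tends to look like (the entities and key shapes are illustrative):
// An access-pattern inventory, written before the table design exists.
// Every pattern must map to a key design; anything missing from this
// list will not be efficiently queryable later.
const accessPatterns = [
  { pattern: 'Get a user by id', servedBy: 'pk = USER#<id>, sk = PROFILE' },
  { pattern: 'List orders for a user', servedBy: 'pk = USER#<id>, sk begins_with ORDER#' },
  { pattern: 'List orders across users by date', servedBy: 'GSI1: GSI1PK = ORDER, GSI1SK = <date>#<orderId>' },
  { pattern: 'Look up a user by email', servedBy: 'GSI2: GSI2PK = EMAIL#<email>' }
] as const;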
The Relational Approach
graph TB
    subgraph "Relational Database Flow"
        A1[Design normalized tables] --> B1[Create foreign keys]
        B1 --> C1[Write application]
        C1 --> D1[Create queries as needed]
        D1 --> E1[Database figures out execution plan]
    end
    style E1 fill:#ffcccc
The relational approach optimizes for flexibility. You design your schema once, then write whatever queries you need. The database optimizer figures out how to execute them efficiently. This works great until it doesn’t—when your data grows large enough that the optimizer starts making expensive decisions.
The DynamoDB Approach
graph TB
    subgraph "DynamoDB Flow"
        A2[List ALL access patterns] --> B2[Design table for those patterns]
        B2 --> C2[Write application]
        C2 --> D2[Use predefined access patterns]
        D2 --> E2[Predictable performance]
    end
    style E2 fill:#ccffcc
The DynamoDB approach optimizes for performance at scale. You list every access pattern upfront, design your table to support exactly those patterns, and get predictable performance regardless of data size. The tradeoff? You lose flexibility. New access patterns require schema changes.
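Concretely, “design your table for those patterns” usually boils down to generic key attributes plus whichever GSIs the pattern list demands, all fixed before any application code exists. A sketch with AWS SDK v3 (the table, attribute, and index names are placeholders):
import { DynamoDBClient, CreateTableCommand } from '@aws-sdk/client-dynamodb';
const client = new DynamoDBClient({});
// One table, generic key names, one overloaded GSI: just enough to
// serve the access patterns listed up front, and nothing more.
await client.send(new CreateTableCommand({
  TableName: 'AppData',
  BillingMode: 'PAY_PER_REQUEST',
  AttributeDefinitions: [
    { AttributeName: 'pk', AttributeType: 'S' },
    { AttributeName: 'sk', AttributeType: 'S' },
    { AttributeName: 'GSI1PK', AttributeType: 'S' },
    { AttributeName: 'GSI1SK', AttributeType: 'S' }
  ],
  KeySchema: [
    { AttributeName: 'pk', KeyType: 'HASH' },
    { AttributeName: 'sk', KeyType: 'RANGE' }
  ],
  GlobalSecondaryIndexes: [{
    IndexName: 'GSI1',
    KeySchema: [
      { AttributeName: 'GSI1PK', KeyType: 'HASH' },
      { AttributeName: 'GSI1SK', KeyType: 'RANGE' }
    ],
    Projection: { ProjectionType: 'ALL' }
  }]
}));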
The Comparison That Matters
| Aspect | Relational (SQL) | DynamoDB | 
|---|---|---|
| Data Organization | Normalized across many tables | Denormalized in one table | 
| Query Flexibility | Any query possible with JOINs | Only pre-planned access patterns | 
| Performance at Scale | Degrades with data size | Consistent at any scale | 
| Schema Evolution | Add columns anytime | Must plan for new patterns | 
| Filtering | WHERE clause on any column | Only on primary/GSI keys | 
| Development Speed | Fast to start, slow to scale | Slow to start, fast to scale | 
| Cost Model | Pay for compute (complex queries = $$) | Pay for reads/writes (predictable) | 
| Best For | Unknown access patterns, complex queries | Known patterns, predictable performance | 
You can’t have both. This isn’t a limitation—it’s a fundamental architectural choice.
The Mental Model Shift
To really understand DynamoDB, you need to unlearn three fundamental assumptions from relational databases:
1. “Each entity type gets its own table” → Everything goes in one table
In SQL, you’d have a users table, a repos table, and an issues table. In DynamoDB, they all go in the same table. A single DynamoDB table might have dozens of different entity types, distinguished only by attributes like entityType or prefix patterns in their keys.
This feels wrong at first. Your instinct says “separate concerns!” But remember: DynamoDB doesn’t have joins. The whole point of putting related data in the same table is to fetch it all in a single request.
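Here’s the payoff in a sketch, reusing the key scheme from the Phase 2 example: because the profile and the orders share a partition key, a single Query returns all of them.
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
const doc = DynamoDBDocument.from(new DynamoDB({}));
// One request returns the PROFILE item and every ORDER item together,
// because they all live under the same partition key
const { Items = [] } = await doc.query({
  TableName: 'AppData',
  KeyConditionExpression: 'pk = :pk',
  ExpressionAttributeValues: { ':pk': 'USER#user-456' }
});
// The result is a mix of entity types; split it by sort key
const profile = Items.find((item) => item.sk === 'PROFILE');
const orders = Items.filter(
  (item) => typeof item.sk === 'string' && item.sk.startsWith('ORDER#')
);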
2. “Normalize to eliminate redundancy” → Denormalize to optimize access patterns
SQL teaches us that duplicating data is bad. DynamoDB says duplication is fine if it serves an access pattern. You might store a user’s display name both in their User item and denormalized into every Comment they create.
Why? Because when you load a comment thread, you need the author names right there. Making separate requests for each user would destroy your performance. Storage is cheap. Network round-trips are expensive.
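As a sketch of what that duplication looks like as items (the shapes here are illustrative):
// The user item owns the canonical copy of the name...
const userItem = {
  pk: 'USER#user-456',
  sk: 'PROFILE',
  displayName: 'John Doe'
};
// ...and every comment carries a denormalized copy, so rendering a
// thread never fans out into one extra read per author
const commentItem = {
  pk: 'ISSUE#issue-42',
  sk: 'COMMENT#2024-12-24T10:15:00Z#comment-9',
  authorId: 'user-456',
  authorName: 'John Doe', // duplicated on purpose; a rename means updating every copy
  body: 'Looks good to me.'
};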
3. “Filter with WHERE clauses” → Pre-compute your filters
Need to show open issues? In SQL, you’d do WHERE status = 'open'. In DynamoDB, you create a GSI that groups open issues together. The filtering happens at write time, not read time.
If you didn’t plan for a filter up front, you can’t efficiently query for it later. You’d need to scan the entire table and filter in your application code—slow, expensive, and definitely not what you want in production.
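One common way to implement this, sketched here with made-up key names, is a sparse GSI: the index key attributes exist only on items that should show up in the index, and removing them drops the item out of the index.
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
const doc = DynamoDBDocument.from(new DynamoDB({}));
// Write time: only open issues get the GSI attributes, so the index
// contains only open issues. The "filter" is baked in right here.
await doc.put({
  TableName: 'AppData',
  Item: {
    pk: 'REPO#123',
    sk: 'ISSUE#issue-42',
    status: 'open',
    GSI1PK: 'REPO#123#OPEN', // present only while the issue is open
    GSI1SK: '2024-12-24#issue-42'
  }
});
// Read time: "show open issues" is a plain Query against that index
const openIssues = await doc.query({
  TableName: 'AppData',
  IndexName: 'GSI1',
  KeyConditionExpression: 'GSI1PK = :pk',
  ExpressionAttributeValues: { ':pk': 'REPO#123#OPEN' }
});
// Closing the issue removes it from the index by removing the attributes
await doc.update({
  TableName: 'AppData',
  Key: { pk: 'REPO#123', sk: 'ISSUE#issue-42' },
  UpdateExpression: 'SET #s = :closed REMOVE GSI1PK, GSI1SK',
  ExpressionAttributeNames: { '#s': 'status' },
  ExpressionAttributeValues: { ':closed': 'closed' }
});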
Seeing It In Practice
Here’s what this looks like in code:
// SQL Mindset (what doesn't work in DynamoDB)
SELECT * FROM issues
WHERE repo_id = 123
  AND created_at > '2024-01-01'
  AND author = 'john'
  AND label = 'bug'
ORDER BY priority DESC;
// DynamoDB Mindset (what you actually do)
// Option 1: This pattern was planned - use a carefully designed GSI
await table.query({
  IndexName: 'RepoIssuesByLabel',
  KeyConditionExpression: 'GSI1PK = :pk AND begins_with(GSI1SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'REPO#123#LABEL#bug',
    ':sk': '2024'
  }
});
// Option 2: This pattern wasn't planned - you're stuck with a scan
// This is slow and expensive - avoid at all costs!
await table.scan({
  FilterExpression: 'repo_id = :repo AND author = :author',
  ExpressionAttributeValues: { ':repo': 123, ':author': 'john' }
  // The filter runs only AFTER the scan has read EVERY item in your table!
});
The lesson? In DynamoDB, you can’t discover new access patterns through clever queries. You either designed for it upfront, or you’re out of luck.
This is why the DynamoDB modeling process is so different from SQL. You start with access patterns, not entity relationships.
Making the Decision
After understanding the fundamental differences, here’s my framework for deciding whether single table design is right for your project:
Use DynamoDB with Single Table Design When:
- Your access patterns are well-defined - You know the queries you need and they won’t change dramatically
- Performance at scale is critical - You need consistent sub-10ms responses regardless of data size
- You’re building a microservice - Clear boundaries make access patterns predictable
- Your team understands DynamoDB - The learning curve is real; team buy-in is essential
- Cost predictability matters - You pay per read/write, which is far easier to forecast than compute-based query pricing
Skip Single Table Design When:
- You’re prototyping - Requirements will change rapidly
- You need ad-hoc queries - “Can you just pull a report of…” will be painful
- You’re using GraphQL - Resolvers break the single-request benefit
- Analytics is a primary use case - Use a data warehouse instead
- Your team is struggling - Better to use multiple tables correctly than one table poorly
The Hybrid Approach (Often the Right Answer)
Many successful systems use DynamoDB for operational workloads and something else for flexibility:
// Operational data in DynamoDB (fast, predictable)
const repo = await RepoEntity.get({
  owner: 'aws',
  name: 'toolkit'
});
// Analytics in Redshift (flexible, powerful)
SELECT COUNT(*) FROM repos
WHERE language = 'TypeScript'
  AND stars > 1000
  AND created_at > '2024-01-01';
// Search in Elasticsearch (full-text, fuzzy matching)
GET /repos/_search
{
  "query": {
    "match": {
      "description": "react hooks typescript"
    }
  }
}
This is often the smartest path. Use each database for what it’s best at:
- DynamoDB: Fast operational queries with known patterns
- PostgreSQL/Redshift: Complex analytics and reporting
- Elasticsearch: Full-text search and fuzzy matching
You don’t have to choose just one database. In fact, you probably shouldn’t.
The Bottom Line
Single table design with DynamoDB is powerful but not universal. It’s a specialized tool for a specific problem: maintaining predictable performance at scale for known access patterns.
Here’s the truth: single table design is learnable, but it will always feel slightly alien compared to relational modeling. That’s okay—you’re optimizing for completely different things. The patterns become familiar over time, but they never feel as natural as normalized tables with JOINs. Accept this cognitive dissonance as part of the tradeoff.
If predictable performance at scale is your problem, the complexity is worth it. If that’s not your problem, use PostgreSQL. Seriously. It’s a fantastic database that will handle 90% of applications beautifully. Don’t use DynamoDB just because it’s “web scale”—use it because you need its specific guarantees.
What’s Next
If you’ve made it this far and you’re thinking “okay, I understand the tradeoffs—show me how this actually works,” then you’re ready for Part 2.
In the next post, we’ll implement something real: GitHub’s entire backend in a single DynamoDB table. Repositories, users, organizations, issues, pull requests, stars—all the complex relationships you’d normally handle with JOINs. We’ll use DynamoDB Toolbox to make the implementation clean and type-safe, and I’ll show you exactly how to design your keys and indexes for real access patterns.
Continue to Part 2: Building GitHub’s Backend in DynamoDB →