In Part 1, we covered why DynamoDB forces you to think differently—you design for access patterns, not flexibility. Now let’s put that into practice by building something real: GitHub’s backend.
We’re talking about the whole data model—repositories, users, organizations, issues, pull requests, comments, and stars. All the relationships you’d normally handle with JOINs and foreign keys. Except we’re doing it in a single DynamoDB table.
This isn’t a toy example. These are the exact patterns you’d use in production.
Table of Contents
- Why We Need Better Tooling
- The Modeling Process
- Setting Up DynamoDB Toolbox
- Implementing the Data Model
- Query Patterns That Actually Work
- Working Example
- What’s Next
Why We Need Better Tooling
Before we dive into the implementation, let me show you why raw DynamoDB code becomes unmaintainable fast. Here’s creating a simple repository:
// Raw DynamoDB SDK - this is what you're avoiding
const putRepo = {
  TableName: 'GitHub',
  Item: {
    pk: { S: 'REPO#aws#dynamodb-toolbox' },
    sk: { S: 'REPO#aws#dynamodb-toolbox' },
    GSI1PK: { S: 'REPO#aws#dynamodb-toolbox' },
    GSI1SK: { S: 'REPO#aws#dynamodb-toolbox' },
    GSI2PK: { S: 'ACCOUNT#aws' },
    GSI2SK: { S: '#2024-12-24T10:30:00Z' },
    entity: { S: 'Repository' },
    owner: { S: 'aws' },
    name: { S: 'dynamodb-toolbox' },
    description: { S: 'A set of tools for working with DynamoDB' },
    stars: { N: '1234' },
    createdAt: { S: '2024-01-15T08:00:00Z' },
    updatedAt: { S: '2024-12-24T10:30:00Z' }
  }
};
Every attribute needs type annotations ({ S: value }, { N: value }). Keys are manually computed. There’s no type safety. Change one key format and you break queries across your codebase. Teams end up building their own abstraction layers, each slightly different.
DynamoDB Toolbox solves this. It’s not an ORM—it doesn’t try to make DynamoDB look like SQL. Instead, it provides a type-safe way to define entities while embracing DynamoDB’s patterns.
The Modeling Process
Remember from Part 1: you can’t just wing it with DynamoDB. You need a disciplined process.
Step 1: Create an Entity-Relationship Diagram
Yes, you still create an ERD like with relational databases. But these entities become different item types within your single table:
erDiagram
    User ||--o{ Repo : owns
    Organization ||--o{ Repo : owns
    User ||--o{ Organization : memberOf
    Repo ||--o{ Issue : contains
    Repo ||--o{ PullRequest : contains
    Issue ||--o{ IssueComment : has
    PullRequest ||--o{ PRComment : has
    User ||--o{ Reaction : creates
    Issue ||--o{ Reaction : receives
    PullRequest ||--o{ Reaction : receives
    IssueComment ||--o{ Reaction : receives
    PRComment ||--o{ Reaction : receives
    User ||--o{ Star : stars
    Repo ||--o{ Star : receivedBy
    Repo ||--o{ Fork : forkedFrom
    Repo ||--o{ Fork : forkedTo
Step 2: List Every Access Pattern
This is the step that doesn’t exist in relational modeling. You enumerate every way your application will access data:
| Entity | Access Pattern | Query Type |
|---|---|---|
| Repository | Get a repository by owner and name | Direct get |
| Repository | List all repositories for an account (user or org) | Query GSI3 |
| Repository | List repositories sorted by creation date | Query GSI3 |
| Repository | Get repository with recent issues and PRs | Query main table (item collection) |
| Issue | Get an issue by repo and number | Direct get |
| Issue | List issues for a repository (with status filter) | Query GSI4 |
| Issue | List recent open issues for a repository | Query GSI4 with beginsWith |
| Star | List repositories a user has starred | Query main table |
| Star | List users who starred a repository | Query GSI1 |
| Star | Check if a user has starred a repo | Query main table with beginsWith |
Write these down. Be specific. This list becomes your contract.
Step 3: Design Your Keys
Now comes the interesting part. You organize data so items needed together live together. In DynamoDB, items with the same partition key are stored together and can be retrieved with a single query.
Here’s our key design:
| Entity | Partition Key (PK) | Sort Key (SK) | Why This Design |
|---|---|---|---|
| Repo | REPO#<owner>#<name> | REPO#<owner>#<name> | Unique identifier |
| Issue | REPO#<owner>#<name> | ISSUE#<number> | Lives with parent repo |
| PR | REPO#<owner>#<name> | PR#<number> | Lives with parent repo |
| IssueComment | REPO#<owner>#<name> | ISSUE#<number>#COMMENT#<id> | Lives with issue and repo |
| PRComment | REPO#<owner>#<name> | PR#<number>#COMMENT#<id> | Lives with PR and repo |
| Reaction | REPO#<owner>#<name> | REACTION#<type>#<target>#<user>#<emoji> | Lives with repo, targets content |
| User | ACCOUNT#<username> | ACCOUNT#<username> | Unique identifier |
| Org | ACCOUNT#<orgname> | ACCOUNT#<orgname> | Unique identifier |
| Star | ACCOUNT#<username> | STAR#<owner>#<name>#<date> | User’s starred repos |
| Fork | REPO#<owner>#<name> | FORK#<fork_owner> | Original repo’s forks |
The magic: a repository, its issues, and its pull requests all share the same partition key. One query returns everything.
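If you were composing these strings by hand in application code, you would centralize them in small helpers so the formats can’t drift (an illustrative sketch with hypothetical function names; DynamoDB Toolbox will compute these for us automatically in a moment):

```typescript
// Illustrative key builders mirroring the table above
const repoKey = (owner: string, name: string) => {
  const id = `REPO#${owner}#${name}`;
  return { PK: id, SK: id }; // self-referencing item: PK === SK
};

const issueKey = (owner: string, name: string, issueNumber: number) => ({
  PK: `REPO#${owner}#${name}`,                         // same partition as the repo
  SK: `ISSUE#${String(issueNumber).padStart(6, '0')}`, // zero-padded for lexicographic order
});

const starKey = (username: string, owner: string, name: string, starredAt: string) => ({
  PK: `ACCOUNT#${username}`,
  SK: `STAR#${owner}#${name}#${starredAt}`,
});
```

Because repoKey and issueKey produce the same PK, a single query on that partition returns the repo together with all of its issues.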
Step 4: Add Global Secondary Indexes
Your primary key design handles many patterns, but not all. GSIs give you alternative ways to query your data. The trick? Overload them just like your main table:
| GSI | Purpose | Example Query | 
|---|---|---|
| GSI1 | PR queries and repo stargazers (overloaded) | “List PRs for this repo”, “Who starred this repo?” | 
| GSI2 | Repo self-reference | For future access patterns | 
| GSI3 | Repos by account + timestamp | “List all repos owned by aws, sorted by creation date” | 
| GSI4 | Issues/PRs by status | “Open issues for this repo” | 
Setting Up DynamoDB Toolbox
Let’s start with the foundation. First, install the dependencies:
npm install dynamodb-toolbox @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb
Now define the table structure:
import { Table } from 'dynamodb-toolbox/table';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';
// Define the table structure (without client initially)
const GitHubTable = new Table({
  name: 'GitHubTable',
  partitionKey: { name: 'PK', type: 'string' },
  sortKey: { name: 'SK', type: 'string' },
  indexes: {
    GSI1: {
      type: 'global',
      partitionKey: { name: 'GSI1PK', type: 'string' },
      sortKey: { name: 'GSI1SK', type: 'string' }
    },
    GSI2: {
      type: 'global',
      partitionKey: { name: 'GSI2PK', type: 'string' },
      sortKey: { name: 'GSI2SK', type: 'string' }
    },
    GSI3: {
      type: 'global',
      partitionKey: { name: 'GSI3PK', type: 'string' },
      sortKey: { name: 'GSI3SK', type: 'string' }
    },
    GSI4: {
      type: 'global',
      partitionKey: { name: 'GSI4PK', type: 'string' },
      sortKey: { name: 'GSI4SK', type: 'string' }
    }
  }
});
// Initialize with DynamoDB client
const client = new DynamoDBClient({});
GitHubTable.documentClient = DynamoDBDocumentClient.from(client, {
  marshallOptions: {
    removeUndefinedValues: true,
    convertEmptyValues: false
  }
});
Notice the generic names: PK, SK, GSI1PK, etc. This is the single table design pattern—one physical schema supports multiple logical entity types.
Implementing the Data Model
Now let’s implement our entities. DynamoDB Toolbox v2 uses a clean schema API with linked keys.
Repository Entity
import { Entity } from 'dynamodb-toolbox/entity';
import { item } from 'dynamodb-toolbox/schema/item';
import { string } from 'dynamodb-toolbox/schema/string';
import { number } from 'dynamodb-toolbox/schema/number';
import { boolean } from 'dynamodb-toolbox/schema/boolean';
const RepoEntity = new Entity({
  name: 'Repository',
  table: GitHubTable,
  schema: item({
    // Business attributes
    owner: string()
      .required()
      .validate((value: string) => /^[a-zA-Z0-9_-]+$/.test(value))
      .key(),
    repo_name: string()
      .required()
      .validate((value: string) => /^[a-zA-Z0-9_.-]+$/.test(value))
      .key(),
    description: string().optional(),
    is_private: boolean().required().default(false),
    language: string().optional()
  }).and((_schema) => ({
    // DynamoDB keys - automatically computed from business attributes
    PK: string()
      .key()
      .link<typeof _schema>(
        ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
      ),
    SK: string()
      .key()
      .link<typeof _schema>(
        ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
      ),
    // GSI1: Repo self-reference
    GSI1PK: string().link<typeof _schema>(
      ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
    ),
    GSI1SK: string().link<typeof _schema>(
      ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
    ),
    // GSI2: Repo self-reference
    GSI2PK: string().link<typeof _schema>(
      ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
    ),
    GSI2SK: string().link<typeof _schema>(
      ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
    ),
    // GSI3: Query repos by account with timestamp ordering
    GSI3PK: string().link<typeof _schema>(
      ({ owner }) => `ACCOUNT#${owner}`
    ),
    GSI3SK: string()
      .default(() => new Date().toISOString())
      .savedAs('GSI3SK')
  }))
} as const);
Look at what this gives you:
- Type safety: TypeScript knows exactly what fields are required
- Linked keys: keys are automatically computed with .link()
- Automatic timestamps: the GSI3SK timestamp is set via .default() for sorting
- Validation: regex validation on owner and repo names
- Clean API: no manual type conversions
Note: In our implementation, created and modified timestamps are managed by the Entity layer (e.g., RepositoryEntity), not in the DynamoDB schema. This keeps business logic separate from storage concerns.
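As a hedged sketch of what that entity-layer responsibility might look like (the helper and attribute names are mine, assuming the schema declares created and modified as plain string attributes):

```typescript
// Hypothetical entity-layer helper: stamp audit timestamps before the item
// reaches the DynamoDB Toolbox entity, keeping them out of the storage schema
type Timestamps = { created: string; modified: string };

function withTimestamps<T extends object>(
  item: T,
  existing?: { created: string }
): T & Timestamps {
  const now = new Date().toISOString();
  return {
    ...item,
    created: existing?.created ?? now, // preserve the original creation time on updates
    modified: now,
  };
}
```

The entity layer calls this before every put, so key computation stays in the schema and auditing stays in business code.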
Understanding the Schema API
The .and() pattern separates business logic from DynamoDB internals:
schema: item({
  // Your business domain - what your app cares about
  owner: string().required().key(),
  repo_name: string().required().key(),
  description: string().optional()
}).and((_schema) => ({
  // DynamoDB keys - infrastructure concerns
  PK: string().key().link<typeof _schema>(
    ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
  )
}))
The _schema type reference makes .link() type-safe. Change your business attributes and TypeScript will catch broken key computations.
Issue Entity with Smart Status Filtering
Here’s where it gets interesting. We want to efficiently query issues by status (open/closed):
import { set } from 'dynamodb-toolbox/schema/set';
const IssueEntity = new Entity({
  name: 'Issue',
  table: GitHubTable,
  schema: item({
    owner: string()
      .required()
      .validate((value: string) => /^[a-zA-Z0-9_-]+$/.test(value))
      .key(),
    repo_name: string()
      .required()
      .validate((value: string) => /^[a-zA-Z0-9_.-]+$/.test(value))
      .key(),
    issue_number: number().required().key(),
    title: string()
      .required()
      .validate((value: string) => value.length <= 255),
    body: string().optional(),
    status: string().required().default('open'),
    author: string().required(),
    assignees: set(string()).optional(),
    labels: set(string()).optional()
  }).and((_schema) => ({
    // Lives with parent repo
    PK: string()
      .key()
      .link<typeof _schema>(
        ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
      ),
    // Padded for consistent sorting
    SK: string()
      .key()
      .link<typeof _schema>(
        ({ issue_number }) => `ISSUE#${String(issue_number).padStart(6, '0')}`
      ),
    // GSI4: Optimized for status queries
    GSI4PK: string().link<typeof _schema>(
      ({ owner, repo_name }) => `REPO#${owner}#${repo_name}`
    ),
    GSI4SK: string().link<typeof _schema>(({ issue_number, status }) => {
      if (status === 'open') {
        // Reverse numbering: higher numbers come first
        const reverseNumber = String(999999 - issue_number).padStart(6, '0');
        return `ISSUE#OPEN#${reverseNumber}`;
      }
      // Closed issues form their own group ('#' sorts before 'I' in UTF-8)
      const paddedNumber = String(issue_number).padStart(6, '0');
      return `#ISSUE#CLOSED#${paddedNumber}`;
    })
  }))
} as const);
The GSI4SK pattern is clever:
- Open issues use reverse numbering so newer issues appear first
- The # prefix on closed issues groups them separately (# sorts before letters in UTF-8), so a beginsWith on ISSUE#OPEN# never touches them
- One query against the index can still return both groups, each in a consistent order
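Because the sort-key computation is a pure function, you can verify the ordering without touching DynamoDB. Here it is extracted from the .link() above:

```typescript
// Pure version of the GSI4SK computation from the Issue entity
function issueStatusSortKey(issueNumber: number, status: string): string {
  if (status === 'open') {
    // Reverse numbering: higher issue numbers yield smaller strings, so they sort first
    return `ISSUE#OPEN#${String(999999 - issueNumber).padStart(6, '0')}`;
  }
  // '#' (0x23) sorts before 'I' (0x49), so closed issues form their own group
  return `#ISSUE#CLOSED#${String(issueNumber).padStart(6, '0')}`;
}

// Lexicographic sort mirrors how DynamoDB orders string sort keys
const ordered = [
  issueStatusSortKey(42, 'open'),
  issueStatusSortKey(41, 'closed'),
  issueStatusSortKey(43, 'open'),
].sort();
// ordered: ['#ISSUE#CLOSED#000041', 'ISSUE#OPEN#999956', 'ISSUE#OPEN#999957']
```

Open issue #43 (the newest) sorts ahead of #42, and a beginsWith('ISSUE#OPEN#') range condition skips the closed group entirely.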
User and Organization Entities
Both are “accounts” in GitHub’s model:
const UserEntity = new Entity({
  name: 'User',
  table: GitHubTable,
  schema: item({
    username: string()
      .required()
      .validate((value: string) => /^[a-zA-Z0-9_-]+$/.test(value))
      .key(),
    email: string()
      .required()
      .validate((value: string) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)),
    bio: string().optional(),
    payment_plan_id: string().optional()
  }).and((_schema) => ({
    PK: string()
      .key()
      .link<typeof _schema>(({ username }) => `ACCOUNT#${username}`),
    SK: string()
      .key()
      .link<typeof _schema>(({ username }) => `ACCOUNT#${username}`),
    GSI1PK: string().link<typeof _schema>(
      ({ username }) => `ACCOUNT#${username}`
    ),
    GSI1SK: string().link<typeof _schema>(
      ({ username }) => `ACCOUNT#${username}`
    )
  }))
} as const);
const OrganizationEntity = new Entity({
  name: 'Organization',
  table: GitHubTable,
  schema: item({
    org_name: string()
      .required()
      .validate((value: string) => /^[a-zA-Z0-9_-]+$/.test(value))
      .key(),
    description: string().optional(),
    payment_plan_id: string().optional()
  }).and((_schema) => ({
    PK: string()
      .key()
      .link<typeof _schema>(({ org_name }) => `ACCOUNT#${org_name}`),
    SK: string()
      .key()
      .link<typeof _schema>(({ org_name }) => `ACCOUNT#${org_name}`),
    GSI1PK: string().link<typeof _schema>(
      ({ org_name }) => `ACCOUNT#${org_name}`
    ),
    GSI1SK: string().link<typeof _schema>(
      ({ org_name }) => `ACCOUNT#${org_name}`
    )
  }))
} as const);
Notice they share the same key pattern (ACCOUNT#<name>). This is intentional—repos don’t care if their owner is a user or an org.
Star Entity - The Many-to-Many Pattern
Stars represent a many-to-many relationship. We use the adjacency list pattern:
const StarEntity = new Entity({
  name: 'Star',
  table: GitHubTable,
  schema: item({
    user_name: string().required().key(),
    repo_owner: string().required().key(),
    repo_name: string().required().key(),
    starred_at: string()
      .default(() => new Date().toISOString())
      .savedAs('starred_at')
  }).and((_schema) => ({
    // Direction 1: User -> Repos they've starred
    PK: string()
      .key()
      .link<typeof _schema>(({ user_name }) => `ACCOUNT#${user_name}`),
    SK: string()
      .key()
      .link<typeof _schema>(
        ({ repo_owner, repo_name, starred_at }) =>
          `STAR#${repo_owner}#${repo_name}#${starred_at}`
      ),
    // Direction 2: Repo -> Users who starred it (via GSI)
    GSI1PK: string().link<typeof _schema>(
      ({ repo_owner, repo_name }) => `REPO#${repo_owner}#${repo_name}`
    ),
    GSI1SK: string().link<typeof _schema>(
      ({ user_name, starred_at }) => `STAR#${user_name}#${starred_at}`
    )
  }))
} as const);
The beauty: query in either direction efficiently. Want repos a user starred? Query the main table. Want users who starred a repo? Query GSI1.
Query Patterns That Actually Work
Now for the payoff. Let’s implement GitHub’s actual access patterns.
Creating Items
import { PutItemCommand } from 'dynamodb-toolbox/entity/actions/put';
// Create a repository - keys computed automatically
await RepoEntity.build(PutItemCommand)
  .item({
    owner: 'aws',
    repo_name: 'dynamodb-toolbox',
    description: 'A set of tools for working with DynamoDB',
    is_private: false,
    language: 'TypeScript'
  })
  .send();
// Create an issue - lives with its repo
await IssueEntity.build(PutItemCommand)
  .item({
    owner: 'aws',
    repo_name: 'dynamodb-toolbox',
    issue_number: 42,
    title: 'Add TypeScript support',
    status: 'open',
    author: 'developer123'
  })
  .send();
No manual key generation. No type conversions. Just business logic.
The Collection Fetch Pattern
This is where single table design shines—fetching heterogeneous items in one query:
import { QueryCommand } from 'dynamodb-toolbox/table/actions/query';
async function getRepoWithActivity(owner: string, name: string) {
  // Query for all items with this partition key
  const response = await GitHubTable.build(QueryCommand)
    .query({
      partition: `REPO#${owner}#${name}`
    })
    .options({
      limit: 50
    })
    .send();
  const items = response.Items || [];
  // Parse different entity types from raw DynamoDB items
  const repo = items.find(item => item.entity === 'Repository');
  const issues = items
    .filter(item => item.entity === 'Issue')
    .map(item => ({
      ...item,
      issueNumber: parseInt(item.SK.replace(/^ISSUE#0*/, ''))
    }));
  const pullRequests = items
    .filter(item => item.entity === 'PullRequest')
    .map(item => ({
      ...item,
      prNumber: parseInt(item.SK.replace(/^PR#0*/, ''))
    }));
  return {
    repo,
    issues,
    pullRequests,
    totalItems: items.length
  };
}
// Usage
const result = await getRepoWithActivity('aws', 'dynamodb-toolbox');
console.log(`Found ${result.issues.length} issues`);
One query. One network request. Everything you need.
Querying by Status
async function getOpenIssues(owner: string, repoName: string) {
  const response = await GitHubTable.build(QueryCommand)
    .entities(IssueEntity)  // Type-safe entity filtering
    .query({
      index: 'GSI4',
      partition: `REPO#${owner}#${repoName}`,
      range: {
        beginsWith: 'ISSUE#OPEN#'
      }
    })
    .options({
      limit: 20
    })
    .send();
  return response.Items || [];
}
The reverse numbering in GSI4SK means you get the most recent open issues first. The .entities() method ensures type safety and automatic entity parsing.
Many-to-Many Queries
async function getUserStarredRepos(username: string) {
  // Direction 1: User -> Repos
  const response = await GitHubTable.build(QueryCommand)
    .query({
      partition: `ACCOUNT#${username}`,
      range: {
        beginsWith: 'STAR#'
      }
    })
    .send();
  return response.Items;
}
async function getRepoStargazers(owner: string, repoName: string) {
  // Direction 2: Repo -> Users (via GSI1)
  const response = await GitHubTable.build(QueryCommand)
    .query({
      index: 'GSI1',
      partition: `REPO#${owner}#${repoName}`,
      range: {
        beginsWith: 'STAR#'
      }
    })
    .send();
  return response.Items;
}
Both directions work efficiently because we designed for them upfront.
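One subtlety: the third Star access pattern, “check if a user has starred a repo”, can’t be a plain get here, because the sort key embeds starred_at and the caller doesn’t know the timestamp. A prefix query with limit: 1 answers it. In this sketch the table and command are passed in as parameters so the snippet stays self-contained (the helper names are mine); in real code you would close over GitHubTable and QueryCommand directly:

```typescript
// Sort-key prefix covering every star by this user on this repo (timestamp excluded)
const starPrefix = (owner: string, repoName: string) => `STAR#${owner}#${repoName}#`;

// table/queryCommand stand in for the GitHubTable and QueryCommand defined earlier
async function hasStarred(
  table: any,
  queryCommand: any,
  username: string,
  owner: string,
  repoName: string
): Promise<boolean> {
  const response = await table
    .build(queryCommand)
    .query({
      partition: `ACCOUNT#${username}`,
      range: { beginsWith: starPrefix(owner, repoName) },
    })
    .options({ limit: 1 }) // one matching item is enough to answer yes
    .send();
  return (response.Items?.length ?? 0) > 0;
}
```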
Pagination Done Right
async function listReposPaginated(accountName: string, pageSize: number = 20) {
  const response = await GitHubTable.build(QueryCommand)
    .entities(RepoEntity)
    .query({
      index: 'GSI3',
      partition: `ACCOUNT#${accountName}`,
      range: { lt: 'ACCOUNT#' }  // Only repos (timestamps < "ACCOUNT#")
    })
    .options({
      limit: pageSize,
      reverse: true  // Newest repos first
    })
    .send();
  return {
    items: response.Items || [],
    // URL-safe base64 encode the continuation token
    nextPageToken: response.LastEvaluatedKey
      ? encodeURIComponent(btoa(JSON.stringify(response.LastEvaluatedKey)))
      : undefined
  };
}
// Next page
async function getNextPage(accountName: string, pageToken: string) {
  const lastKey = JSON.parse(atob(decodeURIComponent(pageToken)));
  const response = await GitHubTable.build(QueryCommand)
    .entities(RepoEntity)
    .query({
      index: 'GSI3',
      partition: `ACCOUNT#${accountName}`,
      range: { lt: 'ACCOUNT#' }
    })
    .options({
      exclusiveStartKey: lastKey,
      reverse: true
    })
    .send();
  return response.Items || [];
}
URL-safe base64 encoding keeps continuation tokens opaque and safe to pass in query parameters. Note that base64 is encoding, not protection: sign or encrypt the token if clients must not be able to tamper with it.
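The token round-trip is pure string work, so it’s worth factoring into helpers you can test in isolation (a sketch; btoa and atob are globals in modern Node and browsers):

```typescript
// Encode a DynamoDB LastEvaluatedKey as an opaque, URL-safe page token
function encodePageToken(lastKey: Record<string, unknown>): string {
  return encodeURIComponent(btoa(JSON.stringify(lastKey)));
}

// Decode a page token back into an ExclusiveStartKey
function decodePageToken(token: string): Record<string, unknown> {
  return JSON.parse(atob(decodeURIComponent(token)));
}
```

encodeURIComponent escapes the +, /, and = characters that standard base64 emits, so the token survives query strings intact.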
What Real Items Look Like
To make this concrete, here’s what’s actually stored in the table:
// Item 1: Repository
{
  "PK": "REPO#aws#dynamodb-toolbox",
  "SK": "REPO#aws#dynamodb-toolbox",
  "entity": "Repository",
  "owner": "aws",
  "repo_name": "dynamodb-toolbox",
  "description": "Toolbox for DynamoDB",
  "is_private": false,
  "language": "TypeScript",
  "GSI1PK": "REPO#aws#dynamodb-toolbox",
  "GSI1SK": "REPO#aws#dynamodb-toolbox",
  "GSI2PK": "REPO#aws#dynamodb-toolbox",
  "GSI2SK": "REPO#aws#dynamodb-toolbox",
  "GSI3PK": "ACCOUNT#aws",
  "GSI3SK": "2024-01-15T08:00:00.000Z"
}
// Item 2: Issue (same partition key!)
{
  "PK": "REPO#aws#dynamodb-toolbox",
  "SK": "ISSUE#000042",
  "entity": "Issue",
  "owner": "aws",
  "repo_name": "dynamodb-toolbox",
  "issue_number": 42,
  "title": "Add TypeScript support",
  "body": "We should add full TypeScript type definitions",
  "status": "open",
  "author": "developer123",
  "labels": ["enhancement", "typescript"],
  "GSI4PK": "REPO#aws#dynamodb-toolbox",
  "GSI4SK": "ISSUE#OPEN#999958"
}
// Item 3: Star (many-to-many relationship)
{
  "PK": "ACCOUNT#john",
  "SK": "STAR#aws#dynamodb-toolbox#2024-12-01T10:00:00Z",
  "entity": "Star",
  "user_name": "john",
  "repo_owner": "aws",
  "repo_name": "dynamodb-toolbox",
  "starred_at": "2024-12-01T10:00:00Z",
  "GSI1PK": "REPO#aws#dynamodb-toolbox",
  "GSI1SK": "STAR#john#2024-12-01T10:00:00Z"
}
Notice how the repository and its issue share the same partition key (REPO#aws#dynamodb-toolbox). That’s the item collection that makes single-query fetches possible.
Working Example
All the code from this post is available in a working repository: github-ddb
The repository includes:
- Complete entity definitions with DynamoDB Toolbox v2
- Working query examples for all access patterns
- Type-safe implementations
- Tests demonstrating the patterns in action
Clone it, run the examples, and experiment with the patterns. Seeing the code run makes the concepts click.
What’s Next
You’ve now seen single table design in action. We’ve implemented GitHub’s core data model with proper access patterns, type safety, and clean code.
But we’re not done. In Part 3, we’ll tackle the hard problems:
- Hot partitions and how to prevent them
- Migrations when requirements change
- Debugging your overloaded table
- Cost optimization strategies
- When to walk away from single table design
These are the production realities that separate successful DynamoDB implementations from disasters. See you in Part 3.
Continue to Part 3: Advanced Patterns and Production Realities →
Series Navigation
- Part 1: When DynamoDB Stops Being Simple
- Part 2: Building GitHub’s Backend in DynamoDB
- Part 3: Advanced Patterns and Production Realities
Additional Resources
Code Examples
- github-ddb Repository - Working implementation of all patterns from this post