JSON Schema Migrations: Versioning, Transform Functions, and Rollback

JSON schema migrations transform stored documents from one schema version to another — a schemaVersion field in each document drives which migration functions to apply. Additive changes (new optional fields) are backward-compatible and require no migration. Breaking changes (rename, remove, type change) need a migration function that transforms each document and bumps schemaVersion.

This guide covers migration function patterns, up/down migration symmetry, batch migration scripts, MongoDB document migration, event sourcing JSON schema evolution, and testing migration correctness. Related background on schema structure is available in the JSON Schema versioning and JSON Schema patterns guides.

Schema Version Field Pattern

The foundational pattern for JSON document migration is embedding a schemaVersion integer in every document at creation time. The field acts as a sentinel: when application code reads a document, it compares doc.schemaVersion against the current version constant and decides whether to apply migration functions before processing.

// Document written at schema version 1
{
  "id": "user-123",
  "schemaVersion": 1,
  "name": "Alice",
  "email": "alice@example.com"
}

// Migration registry: maps version N to its up/down functions
const migrations = {
  1: {
    // rename: name → fullName (destructuring drops the old key entirely
    // instead of leaving a "name: undefined" property behind)
    up: ({ name, ...rest }) => ({
      ...rest,
      schemaVersion: 2,
      fullName: name,
    }),
    // inverse rename: fullName → name
    down: ({ fullName, ...rest }) => ({
      ...rest,
      schemaVersion: 1,
      name: fullName,
    }),
  },
}

const CURRENT_VERSION = 2

A migration registry is a plain object (or Map) keyed by the source version number. Each entry holds an up function (version N to N+1) and a down function (version N+1 to N). The registry grows monotonically — never delete old entries, because documents at old versions may still exist in the database and need the migration chain. A sentinel value strategy starts all documents at schemaVersion: 0 when no version field is present, so legacy pre-versioning documents are handled without a separate code path.
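
The sentinel strategy can be sketched as a small read helper. This is a minimal illustration, not part of the registry above: the names readSchemaVersion and upFromV0 are hypothetical, and the v0 to v1 step is assumed to do nothing except stamp the version field.

```typescript
type AnyDoc = Record<string, unknown>;

// Hypothetical helper: a missing or malformed schemaVersion is treated as 0,
// so pre-versioning documents enter the normal migration chain.
function readSchemaVersion(doc: AnyDoc): number {
  const v = doc.schemaVersion;
  return typeof v === "number" && Number.isInteger(v) && v >= 0 ? v : 0;
}

// Assumed v0 -> v1 step: the only change is stamping the version field.
function upFromV0(doc: AnyDoc): AnyDoc {
  return { ...doc, schemaVersion: 1 };
}

const legacy: AnyDoc = { id: "user-7", name: "Bob" };
const version = readSchemaVersion(legacy);
const stamped = version === 0 ? upFromV0(legacy) : legacy;
console.log(version, stamped.schemaVersion);
```

With this in place, the registry would carry an entry keyed 0 whose up function is the stamping step, and legacy documents need no special branch elsewhere.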

Additive vs Breaking Changes

Not every schema change requires a migration function. Understanding the taxonomy prevents unnecessary migration work and keeps the registry lean.

| Change type | Examples | Migration needed? | Consumer impact |
|---|---|---|---|
| Additive: new optional field | Add middleName? | No | None; old consumers ignore unknown fields |
| Additive: new required field with default | Add status: "active" | Recommended | Old docs lack the field; migration backfills the default |
| Breaking: field rename | name → fullName | Yes | Old consumers read the wrong field name |
| Breaking: field removal | Remove legacyId | Yes | Consumers reading the deleted field get undefined |
| Breaking: type change | age: string → age: number | Yes | Type mismatch causes runtime errors |

A safe deprecation period for breaking changes follows a three-phase approach: (1) add the new field alongside the old one (dual-write period), (2) migrate all consumers to read from the new field, (3) remove the old field and write the migration function. This prevents breaking deployed consumers before they are updated. See the JSON data validation guide for schema validation patterns that catch type mismatches early.
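
The three phases can be sketched for the name to fullName rename used throughout this guide. The function names writeUser, readFullName, and dropLegacyName are hypothetical, and the sketch assumes both fields hold the same string during the dual-write period.

```typescript
type UserDoc = { id: string; name?: string; fullName?: string };

// Phase 1 (dual-write): writers populate both the old and the new field.
function writeUser(id: string, fullName: string): UserDoc {
  return { id, fullName, name: fullName };
}

// Phase 2: updated consumers prefer the new field and fall back to the old
// one for documents written before the dual-write period began.
function readFullName(doc: UserDoc): string | undefined {
  return doc.fullName ?? doc.name;
}

// Phase 3: once every consumer reads fullName, the migration drops the old field.
function dropLegacyName(doc: UserDoc): UserDoc {
  const { name: _legacy, ...rest } = doc;
  return rest;
}

const written = writeUser("u1", "Alice");
console.log(readFullName(written), dropLegacyName(written));
```

The fallback in readFullName is what lets phase 2 roll out gradually: a consumer can be updated before every stored document carries the new field.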

Migration Function Pattern

A well-designed migrate(doc, targetVersion) orchestrator reads the starting version from doc.schemaVersion and walks the registry in order, applying each version's up or down function in sequence. Up/down symmetry is the invariant that makes rollback safe: for any document, down(up(doc)) must equal doc. Write both functions together when you author a migration, and test the round-trip before merging.

// migrate.ts — generic orchestrator
import { migrations, CURRENT_VERSION } from './migrations-registry'

export function migrate<T extends { schemaVersion: number }>(
  doc: T,
  targetVersion = CURRENT_VERSION,
): T {
  let current = { ...doc }

  if (current.schemaVersion < targetVersion) {
    // Migrate up
    for (let v = current.schemaVersion; v < targetVersion; v++) {
      if (!migrations[v]?.up) throw new Error(`No up migration from v${v}`)
      current = migrations[v].up(current)
    }
  } else if (current.schemaVersion > targetVersion) {
    // Migrate down (rollback)
    for (let v = current.schemaVersion; v > targetVersion; v--) {
      if (!migrations[v - 1]?.down) throw new Error(`No down migration to v${v - 1}`)
      current = migrations[v - 1].down(current)
    }
  }

  return current as T
}

Each migration function in the chain receives the output of the previous one. This means each function only needs to handle one version increment; no migration function should know about versions it does not own. Keep migration functions pure (no side effects, no database calls) so they can be composed and tested in isolation. Use spread syntax or structuredClone to avoid mutating the input document.
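
As an illustration of a pure single-step migration, the sketch below applies the age type change from the table above. The v2 to v3 step and the names upV2 and downV2 are assumptions made for this example; the input is frozen to show that neither function mutates it.

```typescript
type Doc = { schemaVersion: number; [key: string]: unknown };

// Hypothetical v2 -> v3 step: "age" changes from string to number.
function upV2(doc: Doc): Doc {
  const age = Number(doc.age);
  if (!Number.isFinite(age)) {
    throw new Error(`Cannot convert age ${JSON.stringify(doc.age)} to a number`);
  }
  return { ...doc, schemaVersion: 3, age };
}

// Inverse step: v3 -> v2 turns the number back into a string.
function downV2(doc: Doc): Doc {
  return { ...doc, schemaVersion: 2, age: String(doc.age) };
}

const v2: Doc = { schemaVersion: 2, id: "u1", age: "42" };
Object.freeze(v2); // any accidental mutation would now throw in strict mode

const v3 = upV2(v2);
const back = downV2(v3);
console.log(v3, back);
```

Note that the round-trip only holds for canonically formatted strings: an input such as "042" would come back as "42", so a type change like this may need input normalization or a stored copy of the original value if exact rollback matters.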

Lazy Migration on Read

Lazy migration (migrate-on-access) defers document transformation until the moment a document is read from the database. The application checks schemaVersion on every read and runs the migration chain in memory if the version is stale. This eliminates the risk of a big-bang migration that locks tables or fails midway on large collections.

// user-repository.ts
import { migrate } from './migrate'
import { CURRENT_VERSION } from './migrations-registry'
import { db } from './db'

export async function getUserById(id: string) {
  const raw = await db.collection('users').findOne({ id })
  if (!raw) return null

  // Migrate in-memory to current version
  const doc = migrate(raw, CURRENT_VERSION)

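  // Write-back: persist migrated document so next read is free.
  // replaceOne swaps the whole stored document, so keys a migration omits do
  // not linger; $set would only overwrite the keys present in doc.
  if (raw.schemaVersion !== CURRENT_VERSION) {
    await db.collection('users').replaceOne(
      { id },
      doc,
    )
  }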

  return doc
}

The write-back step is optional. In-memory-only lazy migration is appropriate when the collection is read-heavy and you want to avoid extra write load. Write-back lazy migration gradually eliminates old-version documents from the database — after all documents have been read once, the entire collection is at the current version. The performance impact of the migration chain is usually negligible for small version gaps; for large gaps (e.g., v1 to v10), consider a background batch migration to pre-warm the collection before the next deployment.

Batch Migration Scripts

When lazy migration is not viable — for example, a breaking schema change that must be complete before the new application version deploys — a batch migration script runs directly against the database. MongoDB and PostgreSQL JSONB both support in-place field transforms without loading documents into application memory.

// MongoDB batch migration: rename "name" → "fullName", bump schemaVersion 1 → 2
db.users.updateMany(
  { schemaVersion: 1 },
  [
    { $set: { fullName: "$name", schemaVersion: 2 } },
    { $unset: "name" },
  ]
)

-- PostgreSQL JSONB batch migration using jsonb_set
UPDATE users
SET doc = jsonb_set(
  doc #- '{name}',           -- remove old key
  '{fullName}',              -- set new key
  doc->'name'               -- copy value
) || '{"schemaVersion": 2}'::jsonb
WHERE (doc->>'schemaVersion')::int = 1
  AND doc ? 'name';          -- skip rows without "name": jsonb_set returns NULL when given a NULL value

Idempotency is critical: a batch migration script must produce the same result whether run once or ten times. Filter by schemaVersion before the transform so already-migrated documents are skipped. For progress tracking on large collections, process documents in batches of 1,000–10,000 using _id cursors and log the last processed ID so the script can resume after a failure. Always back up the collection or table before running a destructive batch migration. See the JSON config management guide for patterns around versioned configuration documents.
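
The batching and resume logic described above can be sketched independently of any database. The Store interface, memoryStore, and runBatchMigration are hypothetical names introduced for this example; a real script would implement fetchAfter and save with database queries and persist the checkpoint somewhere durable rather than logging it.

```typescript
type Doc = { _id: number; schemaVersion: number; [key: string]: unknown };

// Hypothetical storage interface.
interface Store {
  // Returns up to `limit` documents with _id > afterId, ordered by _id.
  fetchAfter(afterId: number, limit: number): Doc[];
  save(doc: Doc): void;
}

// Processes the collection in _id order and returns the last processed _id.
function runBatchMigration(
  store: Store,
  fromVersion: number,
  up: (doc: Doc) => Doc,
  batchSize: number,
  resumeAfterId = -1,
): number {
  let lastId = resumeAfterId;
  for (;;) {
    const batch = store.fetchAfter(lastId, batchSize);
    if (batch.length === 0) return lastId;
    for (const doc of batch) {
      // Idempotency guard: only documents at the source version are transformed.
      if (doc.schemaVersion === fromVersion) store.save(up(doc));
      lastId = doc._id;
    }
    console.log(`checkpoint: last processed _id = ${lastId}`);
  }
}

// In-memory stand-in for a collection, for illustration only.
function memoryStore(initial: Doc[]): Store & { all(): Doc[] } {
  const items = initial.slice().sort((a, b) => a._id - b._id);
  return {
    fetchAfter: (afterId, limit) => items.filter((d) => d._id > afterId).slice(0, limit),
    save: (doc) => {
      const i = items.findIndex((d) => d._id === doc._id);
      if (i >= 0) items[i] = doc;
    },
    all: () => items.slice(),
  };
}

const store = memoryStore([
  { _id: 1, schemaVersion: 1, name: "Alice" },
  { _id: 2, schemaVersion: 2, fullName: "Bob" },
  { _id: 3, schemaVersion: 1, name: "Carol" },
]);
const renameUp = ({ name, ...rest }: Doc): Doc => ({ ...rest, schemaVersion: 2, fullName: name });

runBatchMigration(store, 1, renameUp, 2);
runBatchMigration(store, 1, renameUp, 2); // second run finds nothing left at v1
```

After a failure, passing the last logged checkpoint as resumeAfterId would let the script continue from that point instead of rescanning the whole collection.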

Event Sourcing JSON Schema Evolution

Event sourcing stores every state change as an immutable event in an append-only log. The constraint — past events must never be mutated — makes schema migration a read-time concern rather than a write-time one. The two standard patterns are upcasters and new event types.

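// Assumed shape of a stored event; adjust to match your event store
type StoredEvent = { type: string; version?: number; payload: unknown }

// Upcaster: transforms old UserRegistered v1 payload to v2 at read time
const upcasters: Record<string, Record<number, (payload: unknown) => unknown>> = {
  UserRegistered: {
    1: ({ name, ...rest }: any) => ({
      ...rest,
      version: 2,
      fullName: name,           // rename on the fly; the old "name" key is dropped
    }),
  },
}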

function upcastEvent(event: StoredEvent): StoredEvent {
  const eventUpcasters = upcasters[event.type]
  if (!eventUpcasters) return event

  let payload = event.payload
  let version = event.version ?? 1

  while (eventUpcasters[version]) {
    payload = eventUpcasters[version](payload)
    version++
  }

  return { ...event, payload, version }
}

// Snapshot versioning: bump snapshotVersion when aggregate shape changes
type Snapshot = {
  aggregateId: string
  snapshotVersion: number   // schema version of the snapshot payload
  eventSequence: number     // last event applied when snapshot was taken
  state: unknown
}

When the structural change is too large for an upcaster, introduce a new event type (e.g., UserRegisteredV2) and keep both projector handlers active. Old events replay through the v1 handler; new events through the v2 handler. This approach never mutates the event log and allows full replay correctness across all history. Snapshot versioning follows the same pattern: when an aggregate's state shape changes, bump snapshotVersion and discard snapshots at the old version so the projector replays from scratch.
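
A minimal sketch of the new-event-type approach follows. The payload shapes are assumptions made for illustration (here the V2 event splits the name into givenName and familyName), as are the apply and replay function names.

```typescript
// Hypothetical event shapes.
type UserRegistered = { type: "UserRegistered"; payload: { id: string; name: string } };
type UserRegisteredV2 = {
  type: "UserRegisteredV2";
  payload: { id: string; givenName: string; familyName: string };
};
type UserEvent = UserRegistered | UserRegisteredV2;

type UserState = { id: string; displayName: string };

// Both handlers stay registered so a full replay can process the whole history.
function apply(state: Map<string, UserState>, event: UserEvent): void {
  switch (event.type) {
    case "UserRegistered":
      state.set(event.payload.id, { id: event.payload.id, displayName: event.payload.name });
      break;
    case "UserRegisteredV2": {
      const { id, givenName, familyName } = event.payload;
      state.set(id, { id, displayName: `${givenName} ${familyName}` });
      break;
    }
  }
}

// Rebuilds the projection from the (never mutated) event log.
function replay(events: UserEvent[]): Map<string, UserState> {
  const state = new Map<string, UserState>();
  for (const e of events) apply(state, e);
  return state;
}

const users = replay([
  { type: "UserRegistered", payload: { id: "u1", name: "Alice Smith" } },
  { type: "UserRegisteredV2", payload: { id: "u2", givenName: "Bob", familyName: "Jones" } },
]);
console.log(users.get("u1"), users.get("u2"));
```

Modeling the events as a discriminated union means the compiler can flag a replay path that forgets one of the two handlers, which is useful while both versions coexist.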

Testing JSON Migrations

Migration functions are pure data transforms, which makes them straightforward to test without a running database. Four test categories cover correctness and regressions: forward (fixture) tests, round-trip tests, idempotency tests, and property-based tests.

// __fixtures__/user.v1.json
{ "id": "u1", "schemaVersion": 1, "name": "Alice", "email": "alice@example.com" }

// __fixtures__/user.v2.json
{ "id": "u1", "schemaVersion": 2, "fullName": "Alice", "email": "alice@example.com" }

// migrations.test.ts
import v1Fixture from './__fixtures__/user.v1.json'
import v2Fixture from './__fixtures__/user.v2.json'
import { migrate } from './migrate'

describe('User migration v1 → v2', () => {
  test('forward migration produces expected v2 shape', () => {
    const result = migrate(v1Fixture, 2)
    expect(result).toEqual(v2Fixture)
  })

  test('round-trip: up then down returns original document', () => {
    const up   = migrate(v1Fixture, 2)
    const down = migrate(up, 1)
    expect(down).toEqual(v1Fixture)
  })

  test('idempotency: migrating an already-migrated document is a no-op', () => {
    const first  = migrate(v2Fixture, 2)
    const second = migrate(first, 2)
    expect(second).toEqual(v2Fixture)
  })
})

// Property-based test using fast-check
import fc from 'fast-check'

test('round-trip holds for arbitrary v1 documents', () => {
  fc.assert(fc.property(
    fc.record({ id: fc.string(), schemaVersion: fc.constant(1), name: fc.string(), email: fc.emailAddress() }),
    (doc) => {
      const up   = migrate(doc, 2)
      const down = migrate(up, 1)
      expect(down).toEqual(doc)
    }
  ))
})

Fixture files per version serve as living documentation of what each schema version looks like. Commit them alongside the migration functions so reviewers can see exactly what changed. Property-based testing with libraries such as fast-check generates hundreds of random documents and confirms the round-trip invariant holds across the full input space, catching edge cases that handwritten fixtures miss.

Key Term Definitions

Schema version
An integer field (typically schemaVersion) embedded in every JSON document that records which version of the document schema the document was written against. The application reads this field to decide whether migration functions need to run before the document can be safely used.
Migration function
A pure function that transforms a JSON document from schema version N to version N+1 (up migration) or from N+1 to N (down migration). Migration functions are stored in a registry keyed by version number and composed by an orchestrator to handle arbitrary version gaps.
Additive change
A schema change that adds new optional fields or relaxes constraints without removing or renaming existing fields. Additive changes are backward-compatible: old documents remain valid under the new schema, and old consumers that do not know about the new fields continue to work correctly.
Breaking change
A schema change that renames, removes, or changes the type of an existing field. Breaking changes invalidate old consumers immediately because they reference field names or types that no longer exist or have different semantics. A migration function is required to transform existing documents to the new shape.
Lazy migration
A strategy that defers document schema migration until the document is read from the database, rather than migrating all documents upfront in a batch. The migration chain runs in application memory on access. An optional write-back step persists the migrated document so subsequent reads are free.
Upcaster
In event sourcing, an upcaster is a function that transforms an old event payload to a newer schema at read time without mutating the event log. Upcasters are keyed by event type and version and are chained to handle multiple version gaps. They preserve the immutability of the event store while allowing aggregate projectors to work with a consistent event shape.
Idempotent migration
A migration script or function that produces the same result whether applied once or multiple times to the same document or collection. Idempotency is achieved by filtering only documents at the source schemaVersion before transforming them, so already-migrated documents are skipped on re-runs.

FAQ

How do I version JSON documents in a database?

Embed a schemaVersion integer in every document at write time. When you read a document, check its schemaVersion against the current version. If they differ, run the chain of migration functions to bring it up to date before using it. This lazy migration pattern avoids a big-bang batch update and lets you migrate documents incrementally as they are accessed. See the JSON Schema patterns guide for schema design best practices.

What is a JSON schema migration function?

A migration function transforms a JSON document from one schema version to the next. It takes a document at version N and returns the document at version N+1 (an “up” migration) or version N-1 (a “down” migration for rollback). A registry maps each version number to its up and down functions. The migrate() orchestrator calls the chain of functions needed to move a document from its current version to the target version.

What is the difference between additive and breaking JSON schema changes?

Additive changes add new optional fields to a JSON document or schema. Existing consumers that do not know about the new field continue to work — they simply ignore it. Breaking changes include renaming a field, removing a required field, or changing a field's type. These break existing consumers immediately because they still reference the old field name or expect the old type. Breaking changes require a migration function to transform affected documents and bump their schemaVersion.

How do I migrate JSON documents in MongoDB?

Use MongoDB's updateMany with $set, $rename, and $unset to transform documents in place. For a field rename: { $rename: { "oldField": "newField" } }. For a field removal: { $unset: { "deprecatedField": "" } }. Filter by { schemaVersion: 1 } so already-migrated documents are skipped, making the script idempotent. Always back up the collection before running destructive updateMany operations.

What is lazy migration for JSON documents?

Lazy migration defers document transformation until the document is read rather than migrating all documents upfront in a batch. When a document is fetched, the application checks its schemaVersion. If it is below the current version, the migration chain runs in memory. Optionally the transformed document is written back to the database so the next read skips the migration. Lazy migration is safe to deploy alongside old code because documents at old versions remain valid until accessed.

How do I handle JSON schema evolution in event sourcing?

In event sourcing, past events are immutable — never modify the event store. Instead, use upcaster functions that run at read time to transform old event payloads to the current format. An upcaster is keyed by event type and version; when the projector encounters an old version, it passes the payload through the upcaster chain before applying it to the aggregate. For large structural changes, introduce a new event type rather than evolving the old one, and keep both projector handlers in place until old events are no longer replayed.

How do I test JSON migration functions?

Keep fixture JSON files for each schema version in a __fixtures__ directory. Write a forward migration test that loads the v1 fixture, runs migrate(doc, CURRENT_VERSION), and asserts the output matches the expected current-version fixture. Write a round-trip test that migrates forward then backward and asserts the result equals the original document. For property-based testing, generate random valid documents and confirm the round-trip invariant with a library such as fast-check. Idempotency tests confirm that running the same migration twice produces the same result.

How do I roll back a JSON schema migration?

Implement a “down” migration function for every “up” migration. The down function is the inverse transform: it renames fields back, restores removed fields from a backup column, or reverts type changes. To roll back, run the down functions in reverse order and decrement schemaVersion. For MongoDB batch migrations, always keep a backup collection before running updateMany. For lazy migrations, the down path runs automatically if you redeploy old code that sets a lower target version, provided that code still includes the down functions for the newer versions. Test down migrations with the same fixture-based approach as up migrations to verify the round-trip invariant.
