JSON Microservices Communication: Schema Versioning, Circuit Breaker & gRPC

Microservices communicate via JSON over HTTP/REST or message queues (Kafka, RabbitMQ). JSON's human readability and universal tooling make it the default choice despite payloads 3–5× larger than Protobuf. Each service should own its JSON schema, and schema changes must stay compatible across versions: a client in service B written against version 1 of service A's response must keep working when service A ships version 2 (additive changes only: new optional fields, no removed or renamed required fields). Versioning each schema explicitly, with a JSON Schema $id that identifies the schema and its version (the $schema keyword only declares the JSON Schema dialect), enables automated compatibility checking in CI.

This guide covers service-to-service JSON communication patterns, event envelope design for message queues, schema versioning strategy, Circuit Breaker for failed JSON calls, gRPC vs JSON tradeoffs, and OpenTelemetry trace propagation via JSON headers.

Service-to-Service JSON Communication: REST vs Message Queues

Service-to-service JSON communication splits into two models: synchronous REST (caller blocks until the downstream service responds) and asynchronous messaging (caller publishes a JSON event and returns immediately). Choose REST when the calling service needs to branch on the response — accept or reject an order based on payment authorization. Choose async messaging when the calling service only needs to record that something happened — the downstream consumer's availability does not block the producer.

JSON serialization strategy differs between the two. For REST, use Content-Type: application/json and validate request/response bodies with JSON Schema validation at service boundaries. For async messaging, every message body is a JSON event envelope — a consistent outer structure that wraps the business payload and carries metadata (event ID, version, trace context). Both channels benefit from a shared schema package: define JSON Schema files once in a Git monorepo and import them in each service as an npm package or Python library.
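
As a minimal sketch of validation at a service boundary, the type guard below checks a parsed order response before the caller trusts it. It is hand-written to stay dependency-free; a real service would more likely compile the shared JSON Schema with a validator library. The names OrderResponse, isOrderResponse, and parseOrderResponse are illustrative assumptions, and the field set mirrors the POST /orders response example in this section.

```typescript
// Hypothetical response shape, mirroring the POST /orders response example.
interface OrderResponse {
  orderId: string
  status: string
  createdAt: string
}

// Type guard: true only when every required field has the expected type.
// Unknown extra fields are tolerated so additive changes do not break the caller.
function isOrderResponse(value: unknown): value is OrderResponse {
  if (typeof value !== 'object' || value === null) return false
  const { orderId, status, createdAt } = value as Record<string, unknown>
  return (
    typeof orderId === 'string' &&
    typeof status === 'string' &&
    typeof createdAt === 'string' &&
    !Number.isNaN(Date.parse(createdAt))
  )
}

// Parse a raw response body and fail loudly at the boundary on a contract violation.
function parseOrderResponse(body: string): OrderResponse {
  const parsed: unknown = JSON.parse(body)
  if (!isOrderResponse(parsed)) {
    throw new Error('order-service response does not match the expected contract')
  }
  return parsed
}
```

Failing at the boundary keeps a malformed response from propagating into business logic, where the resulting error would be much harder to trace back to the contract violation.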

// Synchronous REST: POST /orders (caller blocks for response)
// Use when: payment auth, inventory check, user lookup — need immediate answer

// Request JSON
{
  "userId":      "u-abc123",
  "amountCents": 2999,
  "currency":    "USD",
  "items":       [{ "sku": "PRO-PLAN", "qty": 1 }]
}

// Response JSON
{
  "orderId":   "ord-xyz789",
  "status":    "created",
  "createdAt": "2026-01-18T10:00:00.000Z"
}

// ─────────────────────────────────────────────────────────────────
// Async messaging: publish to Kafka "orders" topic and return immediately
// Use when: confirmation emails, audit log writes, search indexing

{
  "eventId":       "550e8400-e29b-41d4-a716-446655440000",
  "eventType":     "order.created",
  "version":       1,
  "timestamp":     "2026-01-18T10:00:00.000Z",
  "source":        "order-service",
  "correlationId": "req-abc-123",
  "payload": {
    "orderId":     "ord-xyz789",
    "userId":      "u-abc123",
    "amountCents": 2999,
    "currency":    "USD"
  }
}

// ─────────────────────────────────────────────────────────────────
// Decision matrix
// REST:   payment auth, inventory checks, user lookups, read APIs
// Async:  order emails, audit log writes, search index updates,
//         downstream data replication, analytics pipelines

JSON is the default serialization for both channels because every language runtime ships a JSON parser and HTTP tooling (curl, Postman, browser DevTools) handles it natively. The cost is payload size: a JSON object with string keys, quotes, and whitespace is 3–5× larger than equivalent Protobuf binary. For internal services handling thousands of calls per second, that overhead accumulates into meaningful CPU and network costs — which is the primary motivation for evaluating gRPC. See the JSON API design guide for REST contract patterns.

JSON Event Envelope Design for Event-Driven Microservices

A JSON event envelope is the standard outer structure that wraps every async message — it carries metadata the broker infrastructure needs (deduplication, routing, tracing, versioning) separately from the business payload. A consistent envelope across all services lets generic consumer infrastructure operate without parsing business payloads. The seven required envelope fields are: eventId, eventType, version, timestamp, source, correlationId, and payload.

// Standard JSON event envelope — all seven required fields
{
  "eventId":       "550e8400-e29b-41d4-a716-446655440000",  // UUID v4 — deduplication key
  "eventType":     "order.created",                          // dot-namespaced: domain.verb
  "version":       1,                                        // integer schema version
  "timestamp":     "2026-01-18T10:00:00.000Z",              // ISO 8601 UTC with milliseconds
  "source":        "order-service",                          // producing service name
  "correlationId": "req-abc-123",                            // distributed trace correlation
  "payload": {
    "orderId":     "ord-xyz789",
    "userId":      "u-abc123",
    "amountCents": 2999,
    "currency":    "USD",
    "items": [
      { "sku": "PRO-PLAN", "qty": 1, "priceCents": 2999 }
    ]
  }
}

// TypeScript: envelope type + producer factory
import { randomUUID } from 'crypto'

interface EventEnvelope<T = Record<string, unknown>> {
  eventId:       string   // UUID v4
  eventType:     string   // "order.created"
  version:       number   // integer
  timestamp:     string   // ISO 8601 UTC
  source:        string   // producing service name
  correlationId: string
  payload:       T
}

function createEvent<T>(
  eventType: string,
  version: number,
  payload: T,
  source: string,
  correlationId: string,
): EventEnvelope<T> {
  return {
    eventId: randomUUID(),
    eventType,
    version,
    timestamp: new Date().toISOString(),
    source,
    correlationId,
    payload,
  }
}

// Publish to Kafka — use entity ID as message key to preserve ordering
const event = createEvent(
  'order.created',
  1,
  { orderId: 'ord-xyz789', userId: 'u-abc123', amountCents: 2999 },
  'order-service',
  req.headers['x-correlation-id'] as string,
)

await kafkaProducer.send({
  topic: 'orders',
  messages: [{ key: event.payload.orderId, value: JSON.stringify(event) }],
})

Use dot-namespaced eventType strings — "order.created", "order.cancelled", "payment.failed" — so consumers can subscribe to a domain prefix without inspecting the payload. The version integer enables routing to version-specific handlers when breaking schema changes are unavoidable. The Kafka message key must be the entity ID (e.g. orderId) — this ensures all events for the same order land on the same partition, preserving temporal ordering. See the JSON webhooks guide for envelope patterns in HTTP callback contexts.
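
The two conventions above, domain-prefix matching and keying by entity ID, can be sketched without any broker client. The hash below is an FNV-1a style stand-in chosen for brevity; Kafka clients ship their own partitioner, so the actual partition numbers will differ. The function names inDomain, hashKey, and partitionFor are hypothetical.

```typescript
// True when eventType belongs to the given domain, e.g. "order.created" is in "order".
function inDomain(eventType: string, domain: string): boolean {
  return eventType.startsWith(domain + '.')
}

// Illustrative 32-bit string hash (FNV-1a style over UTF-16 code units).
// Assumption: this only demonstrates determinism; it is not Kafka's partitioner.
function hashKey(key: string): number {
  let hash = 0x811c9dc5
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return hash
}

// The same key always maps to the same partition while partitionCount is unchanged.
function partitionFor(key: string, partitionCount: number): number {
  return hashKey(key) % partitionCount
}

const isOrderEvent = inDomain('order.created', 'order')
const firstPartition = partitionFor('ord-xyz789', 12)
const secondPartition = partitionFor('ord-xyz789', 12)  // equals firstPartition
```

Because the partition depends only on the key and the partition count, every event for one order is appended to one partition and read back in publish order. Changing the partition count remaps keys, which is why ordering guarantees hold only while the count stays fixed.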

Schema Versioning: Backward-Compatible JSON Evolution

Schema versioning is the discipline of changing JSON contracts without breaking deployed consumers. Because services in a microservices system deploy independently, a producer may be running schema version 2 while some consumers still expect version 1. The golden rule for backward compatibility: only additive changes. Adding a new optional field is safe — old consumers simply ignore it. Removing a field, renaming a field, or changing a field type breaks every consumer that reads it.

// ✅ Backward-compatible: additive change (new optional field)
// v1 consumers ignore "currency"; v1.1 consumers read it

// v1 payload
{ "orderId": "ord-1", "userId": "u-1", "amountCents": 2999 }

// v1.1 payload — "currency" is optional, safe to add
{ "orderId": "ord-2", "userId": "u-1", "amountCents": 2999, "currency": "USD" }

// ─────────────────────────────────────────────────────────────────
// ❌ Breaking changes — require version bump to v2

// 1. Removing a field:  remove "userId"  → consumers reading it break
// 2. Renaming a field:  "userId" → "customerId" → consumers break
// 3. Type change:       "amountCents": 2999 → "amount": "29.99" → breaks parsing
// 4. New required field: old producers omit it → new consumers reject it

// ─────────────────────────────────────────────────────────────────
// v2 payload — breaking change: "amount" (string) replaces "amountCents" (integer)
// Increment version integer; run v1 and v2 handlers side-by-side during migration
{
  "eventType": "order.created",
  "version":   2,
  "payload": {
    "orderId":  "ord-3",
    "userId":   "u-1",
    "amount":   "29.99",    // string decimal (breaking: was integer cents)
    "currency": "USD"
  }
}

// Consumer: version-aware routing with Zod schemas
import { z } from 'zod'

// v1 schema — optional currency for backward compat with v1.1 producers
const v1Schema = z.object({
  orderId:     z.string(),
  userId:      z.string(),
  amountCents: z.number().int(),
  currency:    z.string().optional(),
}).passthrough()  // keep unknown fields from future minor versions (Zod strips them by default; only .strict() rejects them)

// v2 schema — breaking change: string decimal amount
const v2Schema = z.object({
  orderId:  z.string(),
  userId:   z.string(),
  amount:   z.string(),
  currency: z.string(),
}).passthrough()

const envelopeSchema = z.object({
  eventId:       z.string().uuid(),
  eventType:     z.string(),
  version:       z.number().int(),
  timestamp:     z.string(),
  source:        z.string(),
  correlationId: z.string(),
  payload:       z.unknown(),
})

async function handleOrderCreated(raw: unknown) {
  const envelope = envelopeSchema.parse(raw)

  switch (envelope.version) {
    case 1: {
      const data = v1Schema.parse(envelope.payload)
      return processOrder({ orderId: data.orderId, amountCents: data.amountCents })
    }
    case 2: {
      const data = v2Schema.parse(envelope.payload)
      const amountCents = Math.round(parseFloat(data.amount) * 100)
      return processOrder({ orderId: data.orderId, amountCents })
    }
    default:
      throw new Error(`Unknown order.created schema version: ${envelope.version}`)
  }
}
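
The v2 branch above converts the decimal string with parseFloat and Math.round, which works for two-decimal inputs but routes a monetary value through binary floating point. A possible alternative, sketched here under the assumption that amounts are non-negative and carry at most two fraction digits, parses the string directly into integer cents. The helper name decimalToCents is hypothetical.

```typescript
// Convert a decimal string such as "29.99" to integer cents without floating point.
// Assumption: non-negative amounts with at most two fraction digits; anything else throws.
function decimalToCents(amount: string): number {
  const match = /^(\d+)(?:\.(\d{1,2}))?$/.exec(amount)
  if (!match) {
    throw new Error(`Unsupported amount format: ${amount}`)
  }
  const whole = Number(match[1])
  // Pad "9" to "90" so that "29.9" means 29 dollars and 90 cents.
  const fraction = Number((match[2] ?? '').padEnd(2, '0'))
  const cents = whole * 100 + fraction
  if (!Number.isSafeInteger(cents)) {
    throw new Error(`Amount out of range: ${amount}`)
  }
  return cents
}

const cents = decimalToCents('29.99')  // 2999
```

Rejecting unexpected formats outright, rather than rounding them, surfaces producer bugs at the consumer boundary instead of silently changing an amount.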

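Register schemas in a schema registry and configure a compatibility mode. In Confluent Schema Registry terms, BACKWARD (the default) means consumers on the new schema can read data written with the previous schema, FORWARD means consumers still on the previous schema can read data written with the new schema, and FULL requires both. When producers deploy ahead of their consumers, FORWARD or FULL is the mode that protects the old consumers. Run compatibility checks as a CI gate: a PR that removes a field or changes a type must fail the build before merge. In JSON Schema documents, the $id keyword identifies the schema and its version, while $schema declares the JSON Schema dialect; together they let automated tooling pick the right schema pair and verify compatibility without human review. See the JSON Schema validation guide for Ajv-based validation setup.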

Circuit Breaker Pattern for JSON API Calls

The Circuit Breaker pattern prevents cascade failures: when service A calls service B with a JSON REST request and service B is slow or down, without a Circuit Breaker every thread in service A blocks waiting for a timeout response, exhausting the connection pool and taking down service A too. The Circuit Breaker sits in front of the JSON call and opens after a failure threshold — rejecting calls immediately with a fallback JSON response instead of passing them through to the failing downstream service.

The circuit has three states: Closed (normal operation — all JSON calls go through), Open (tripped — calls are rejected immediately without hitting the downstream service, preserving resources), and Half-Open (probe state — one test call is allowed to check if the downstream service recovered). A typical configuration: open after 5 consecutive failures, wait 30 seconds before transitioning to Half-Open, return to Closed after 2 consecutive successes in Half-Open.
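
The typical configuration described above (open after 5 consecutive failures, probe after 30 seconds, close after 2 consecutive successes) can be sketched as a small state machine. This is a dependency-free illustration, not the implementation of any particular library. The class name SimpleCircuitBreaker, its options, and the injectable clock are assumptions made for the example, and unlike a production breaker it does not limit Half-Open to a single concurrent probe.

```typescript
type CircuitState = 'closed' | 'open' | 'half-open'

interface BreakerOptions {
  failureThreshold: number  // consecutive failures before opening
  resetTimeoutMs: number    // time to stay Open before probing
  successThreshold: number  // consecutive Half-Open successes before closing
}

class SimpleCircuitBreaker {
  private state: CircuitState = 'closed'
  private failures = 0
  private successes = 0
  private openedAt = 0
  private options: BreakerOptions
  private now: () => number

  // The clock is injectable so the timeout can be tested without waiting.
  constructor(options: BreakerOptions, now: () => number = () => Date.now()) {
    this.options = options
    this.now = now
  }

  getState(): CircuitState {
    // Open moves to Half-Open once the reset timeout has elapsed.
    if (this.state === 'open' && this.now() - this.openedAt >= this.options.resetTimeoutMs) {
      this.state = 'half-open'
      this.successes = 0
    }
    return this.state
  }

  async call<T>(action: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.getState() === 'open') return fallback()  // fail fast, no downstream call
    try {
      const result = await action()
      this.onSuccess()
      return result
    } catch {
      this.onFailure()
      return fallback()
    }
  }

  private onSuccess(): void {
    if (this.state === 'half-open') {
      this.successes += 1
      if (this.successes >= this.options.successThreshold) {
        this.state = 'closed'
        this.failures = 0
      }
    } else {
      this.failures = 0  // any success resets the consecutive-failure count
    }
  }

  private onFailure(): void {
    this.failures += 1
    // A single failed probe in Half-Open reopens the circuit immediately.
    if (this.state === 'half-open' || this.failures >= this.options.failureThreshold) {
      this.state = 'open'
      this.openedAt = this.now()
      this.failures = 0
    }
  }
}
```

Counting consecutive failures is the simplest trigger; libraries often use an error percentage over a rolling window instead, which tolerates isolated errors under high traffic.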

// Circuit Breaker with opossum (Node.js)
import CircuitBreaker from 'opossum'
import axios from 'axios'

// Wrap the JSON call that may fail
async function callInventoryService(sku: string) {
  const response = await axios.get(`http://inventory-service/stock/${sku}`, {
    timeout: 3000,  // 3 second timeout per call
    headers: { Accept: 'application/json' },
  })
  return response.data
}

// Circuit Breaker configuration
const breaker = new CircuitBreaker(callInventoryService, {
  timeout:              3000,   // call timeout (ms)
  errorThresholdPercentage: 50, // open after 50% of calls fail
  resetTimeout:         30000,  // wait 30s in Open before trying Half-Open
  volumeThreshold:      5,      // minimum calls before computing error %
})

// Fallback: return degraded JSON response when circuit is Open
breaker.fallback((sku: string) => ({
  sku,
  stockLevel: -1,
  available:  false,
  status:     'degraded',
  message:    'inventory service temporarily unavailable',
  fallback:   true,
}))

// Event hooks for structured logging and metrics
breaker.on('open',     () => log('warn', 'circuit-breaker-open',      { service: 'inventory-service' }))
breaker.on('halfOpen', () => log('info', 'circuit-breaker-half-open', { service: 'inventory-service' }))
breaker.on('close',    () => log('info', 'circuit-breaker-closed',    { service: 'inventory-service' }))
breaker.on('fallback', (result) => log('warn', 'circuit-breaker-fallback', { result }))

// Usage — transparent to caller; fallback JSON returned when open
async function getStockLevel(sku: string) {
  return breaker.fire(sku)
  // Returns real JSON when Closed, fallback JSON when Open
}

// Fallback JSON response structure — return to caller when circuit is Open
// Callers must handle the "fallback: true" flag
{
  "sku":        "PRO-PLAN",
  "stockLevel": -1,
  "available":  false,
  "status":     "degraded",
  "message":    "inventory service temporarily unavailable",
  "fallback":   true
}

// Expose circuit state in /health endpoint for monitoring
app.get('/health', (req, res) => {
  res.json({
    status: 'ok',
    circuits: {
      inventoryService: breaker.opened ? 'open' : breaker.halfOpen ? 'half-open' : 'closed',
    },
  })
})

Include a fallback: true flag in the Circuit Breaker fallback JSON so callers can distinguish real data from degraded responses; they may choose to show a "temporarily unavailable" message to the user instead of treating the placeholder values as real data. Expose circuit state in your /health endpoint and alert when any circuit opens: an open circuit signals that a downstream service is failing and is not a normal operational event. For Java services, Resilience4j provides the same Circuit Breaker primitives with Micrometer metrics integration.
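
On the calling side, handling the flag can be as small as the sketch below. The StockResponse type and the describeAvailability function are hypothetical; the fields mirror the fallback JSON shown earlier in this section.

```typescript
// Hypothetical shape of the inventory response, including the optional fallback flag.
interface StockResponse {
  sku: string
  stockLevel: number
  available: boolean
  fallback?: boolean  // present and true only on Circuit Breaker fallbacks
}

function describeAvailability(stock: StockResponse): string {
  if (stock.fallback === true) {
    // Degraded response: a stockLevel of -1 is a placeholder, not real data.
    return 'Availability temporarily unavailable'
  }
  return stock.available ? `In stock (${stock.stockLevel})` : 'Out of stock'
}

const degraded = describeAvailability({
  sku: 'PRO-PLAN', stockLevel: -1, available: false, fallback: true,
})
```

Checking the flag first matters because the fallback also sets available to false; without the check, a degraded response would be indistinguishable from a genuine out-of-stock answer.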

gRPC vs JSON REST: When to Switch

gRPC uses Protobuf binary serialization over HTTP/2 and can outperform JSON REST by roughly 7–10× in throughput on internal service networks, depending on payload shape and workload. The performance gap comes from three sources: Protobuf binary is 3–5× smaller than equivalent JSON, HTTP/2 multiplexing sends multiple concurrent streams over one TCP connection (HTTP/1.1 keep-alive reuses a connection but serves only one in-flight request at a time), and generated typed client code replaces runtime JSON text parsing with cheaper binary decoding. However, gRPC adds operational complexity: .proto schema files require a build step, browser clients need grpc-web proxying, and debugging binary payloads is harder than reading JSON.
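
To make the size difference concrete, the sketch below hand-encodes three fields of the CreateOrderRequest message defined in this section using the Protobuf wire format (varint tags, length-delimited strings) and compares the result with minified JSON. It is an illustration only: real services use generated code, the helper names are hypothetical, the items field is omitted, and strings are assumed to be ASCII.

```typescript
// Protobuf varint: 7 bits per byte, least significant group first,
// with the high bit set on every byte except the last.
// Assumption: value is a non-negative integer.
function encodeVarint(value: number): number[] {
  const bytes: number[] = []
  let v = value
  while (v > 0x7f) {
    bytes.push((v & 0x7f) | 0x80)
    v = Math.floor(v / 128)
  }
  bytes.push(v)
  return bytes
}

// Tag = (field number << 3) | wire type. Wire type 0 = varint, 2 = length-delimited.
function tag(fieldNumber: number, wireType: number): number[] {
  return encodeVarint((fieldNumber << 3) | wireType)
}

// String field, ASCII only (assumption: one byte per character).
function stringField(fieldNumber: number, text: string): number[] {
  const bytes = Array.from(text, (ch) => ch.charCodeAt(0))
  return [...tag(fieldNumber, 2), ...encodeVarint(bytes.length), ...bytes]
}

function varintField(fieldNumber: number, value: number): number[] {
  return [...tag(fieldNumber, 0), ...encodeVarint(value)]
}

const order = { userId: 'u-abc123', amountCents: 2999, currency: 'USD' }

// Field numbers follow CreateOrderRequest: user_id = 1, amount_cents = 2, currency = 3.
const protobufBytes = [
  ...stringField(1, order.userId),
  ...varintField(2, order.amountCents),
  ...stringField(3, order.currency),
]

const protobufSize = protobufBytes.length           // 18 bytes
const jsonSize = JSON.stringify(order).length       // 57 bytes (ASCII, so characters equal bytes)
```

For this three-field payload the hand-encoded message is 18 bytes against 57 bytes of minified JSON, roughly a 3× difference. Most of the saving comes from replacing quoted field names with one-byte tags, so the ratio grows with the number of fields and shrinks when long string values dominate the payload.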

// JSON REST vs gRPC — same OrderService, different wire formats

// ── JSON REST ────────────────────────────────────────────────────
// POST http://order-service/orders
// Content-Type: application/json
{
  "userId":      "u-abc123",
  "amountCents": 2999,
  "currency":    "USD",
  "items":       [{ "sku": "PRO-PLAN", "qty": 1 }]
}
// Response: 201 Created
{
  "orderId":   "ord-xyz789",
  "status":    "created",
  "createdAt": "2026-01-18T10:00:00.000Z"
}

// ── gRPC equivalent ──────────────────────────────────────────────
// order.proto
syntax = "proto3";

service OrderService {
  rpc CreateOrder(CreateOrderRequest) returns (CreateOrderResponse);
}

message CreateOrderRequest {
  string user_id      = 1;
  int64  amount_cents = 2;
  string currency     = 3;
  repeated OrderItem items = 4;
}

message OrderItem {
  string sku = 1;
  int32  qty = 2;
}

message CreateOrderResponse {
  string order_id   = 1;
  string status     = 2;
  string created_at = 3;
}

// ─────────────────────────────────────────────────────────────────
// Performance comparison (same payload, internal network)
// JSON REST: ~2,000 requests/second per instance
// gRPC:      ~18,000 requests/second per instance (9x throughput)
// JSON payload: ~280 bytes  |  Protobuf: ~60 bytes (4.7x smaller)

// Decision matrix: JSON REST vs gRPC

// Use JSON REST when:
// ✅ Public-facing API (browsers, mobile apps, third parties)
// ✅ Team values human-readable payloads for debugging
// ✅ Call volume < 1,000 req/s — JSON overhead is negligible
// ✅ Rapid iteration — no .proto schema build step
// ✅ Service already has OpenAPI spec and JSON Schema validation

// Use gRPC when:
// ✅ Internal service-to-service only (no direct browser calls)
// ✅ High volume: > 5,000 req/s where JSON overhead matters
// ✅ Latency target: < 10ms p99 (binary + HTTP/2 multiplexing helps)
// ✅ Bidirectional streaming (gRPC streams vs SSE/WebSocket for REST)
// ✅ Strongly-typed clients: generated code catches contract errors at compile time

// Hybrid: gRPC internally + JSON REST externally (API gateway translates)
// grpc-gateway: generates REST JSON API from .proto definitions automatically

A common hybrid: use JSON REST for external-facing APIs and gRPC for high-volume internal service-to-service calls, with an API gateway (Kong, Envoy, AWS API Gateway) translating between the two wire formats. The grpc-gateway tool generates a REST/JSON API from .proto definitions automatically, so you maintain one schema and serve both. For services where gRPC is overkill but JSON payload size is still a concern, consider JSON over HTTP/2 (for example, fetch() over h2) or MessagePack as a drop-in binary replacement that retains JSON's data model.

OpenTelemetry Trace Context Propagation in JSON

OpenTelemetry (OTel) is the CNCF standard for distributed tracing, metrics, and logging. In a microservices architecture, trace context propagation via JSON headers stitches together spans from multiple services into a single distributed trace — letting engineers see the full call chain, latency at each hop, and where failures occur. The W3C traceparent HTTP header carries the trace context for synchronous REST calls; JSON event envelopes carry it explicitly for async messaging.
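
A traceparent value has four dash-separated lowercase hex fields: a 2-character version, a 32-character trace ID, a 16-character parent span ID, and 2-character flags whose lowest bit marks the trace as sampled. The parser below is a minimal sketch for the version 00 layout, useful for logging or debugging; in instrumented services the OpenTelemetry propagator does this work. The names TraceParent and parseTraceparent are hypothetical.

```typescript
interface TraceParent {
  version: string
  traceId: string
  parentId: string
  sampled: boolean
}

// Returns the parsed fields, or null when the header is malformed.
function parseTraceparent(header: string): TraceParent | null {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim())
  if (!match) return null
  const [, version, traceId, parentId, flags] = match
  // Version "ff" and all-zero IDs are invalid per the W3C Trace Context spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 }
}

const parsedHeader = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01')
// parsedHeader.traceId is "4bf92f3577b34da6a3ce929d0e0e4736" and parsedHeader.sampled is true
```

Returning null instead of throwing lets a service treat a malformed header as "no incoming trace" and start a fresh one, which is generally safer than failing the request over tracing metadata.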

// W3C traceparent header format
// "00-{traceId(32hex)}-{parentSpanId(16hex)}-{flags(2hex)}"
// Example:
// traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

// ── OpenTelemetry auto-instrumentation (Node.js) ─────────────────
import { NodeSDK } from '@opentelemetry/sdk-node'
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http'
import { KafkaJsInstrumentation } from '@opentelemetry/instrumentation-kafkajs'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({ url: 'http://otel-collector:4318/v1/traces' }),
  instrumentations: [
    new HttpInstrumentation(),      // auto-propagates traceparent on HTTP calls
    new KafkaJsInstrumentation(),   // auto-injects context into Kafka message headers
  ],
})

sdk.start()

// After SDK init: HTTP calls automatically inject traceparent
// Kafka messages automatically get trace context headers injected by the instrumentation
// (a traceparent header with the default W3C propagator)

// Manual trace context injection for async JSON events
// (when auto-instrumentation does not cover your messaging library)

import { context, propagation, trace } from '@opentelemetry/api'
import { randomUUID } from 'crypto'

function createTracedEvent<T>(
  eventType: string,
  version: number,
  payload: T,
  source: string,
): EventEnvelope<T> & { traceContext: Record<string, string> } {
  // Extract current OTel trace context into a plain object
  const traceContext: Record<string, string> = {}
  propagation.inject(context.active(), traceContext)
  // traceContext now contains a W3C "traceparent" entry (plus "tracestate" when one is set)

  // Use the active span's trace ID as the correlation ID so logs and events share one key
  const spanContext = trace.getActiveSpan()?.spanContext()

  return {
    eventId:       randomUUID(),
    eventType,
    version,
    timestamp:     new Date().toISOString(),
    source,
    correlationId: spanContext?.traceId ?? randomUUID(),
    traceContext,  // embed W3C context in envelope for async consumers
    payload,
  }
}

// Consumer: restore trace context from JSON envelope
async function consumeWithTrace(envelope: EventEnvelope & { traceContext?: Record<string, string> }) {
  const parentCtx = envelope.traceContext
    ? propagation.extract(context.active(), envelope.traceContext)
    : context.active()

  // Run consumer handler as a child span of the producer's trace
  return context.with(parentCtx, async () => {
    const tracer = trace.getTracer('consumer')
    return tracer.startActiveSpan(`process:${envelope.eventType}`, async (span) => {
      try {
        await handleEvent(envelope)
        span.setStatus({ code: 1 }) // OK
      } catch (err) {
        span.recordException(err as Error)
        span.setStatus({ code: 2, message: (err as Error).message }) // ERROR
        throw err
      } finally {
        span.end()
      }
    })
  })
}

Embed trace context in every structured JSON logging line — traceId, spanId, and service.name — so log aggregators (Datadog, Grafana Loki, CloudWatch) can correlate logs with traces without a separate APM query. With OTel auto-instrumentation, HTTP calls propagate traceparent automatically; Kafka and RabbitMQ require the matching OTel instrumentation library or manual injection as shown above. Store the traceContext object directly in the JSON event envelope so consumers can restore the full parent trace context and create child spans, linking producer and consumer spans into a single distributed trace.
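
A structured log line carrying those fields might look like the sketch below. The logLine function and its parameter layout are hypothetical; in a real service the trace and span IDs would come from the active OpenTelemetry span context rather than being passed in by hand.

```typescript
interface TraceIds {
  traceId: string
  spanId: string
}

// Build one JSON log line. Reserved keys are written after the caller's fields
// so that custom fields cannot overwrite traceId, spanId, or service.name.
function logLine(
  level: 'info' | 'warn' | 'error',
  message: string,
  serviceName: string,
  trace: TraceIds | undefined,
  fields: Record<string, unknown> = {},
): string {
  return JSON.stringify({
    ...fields,
    timestamp: new Date().toISOString(),
    level,
    message,
    'service.name': serviceName,
    ...(trace ? { traceId: trace.traceId, spanId: trace.spanId } : {}),
  })
}

const exampleLine = logLine(
  'info',
  'order created',
  'order-service',
  { traceId: '4bf92f3577b34da6a3ce929d0e0e4736', spanId: '00f067aa0ba902b7' },
  { orderId: 'ord-xyz789' },
)
```

Emitting the IDs as top-level keys, rather than inside the message text, is what lets a log aggregator index them and jump from a log line to the matching trace.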

JSON Schema Registry for Multi-Service Validation

A JSON schema registry is the single source of truth for JSON Schema definitions shared across multiple microservices. Without one, schema definitions diverge: service A's definition of an Order object drifts from service B's, and breaking changes slip through because no automated tooling compares them. A schema registry enforces compatibility rules (BACKWARD, FORWARD, or FULL) and blocks deployments that violate them.
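
The three modes can be illustrated with a toy model that tracks only which fields are required. It follows the Confluent Schema Registry definitions (BACKWARD: readers on the new schema accept data written with the previous one; FORWARD: readers on the previous schema accept data written with the new one; FULL: both). The SimpleSchema type and the canRead and isCompatible functions are hypothetical, and the model assumes readers ignore unknown fields and that field types never change.

```typescript
interface SimpleSchema {
  required: string[]
  optional: string[]  // listed for readability; optional fields never affect the result here
}

// A reader can process a writer's documents when every field the reader requires
// is guaranteed to be present, that is, also required by the writer.
function canRead(reader: SimpleSchema, writer: SimpleSchema): boolean {
  return reader.required.every((field) => writer.required.includes(field))
}

type CompatibilityMode = 'BACKWARD' | 'FORWARD' | 'FULL'

function isCompatible(mode: CompatibilityMode, previous: SimpleSchema, next: SimpleSchema): boolean {
  const backward = canRead(next, previous)  // new readers, old data
  const forward = canRead(previous, next)   // old readers, new data
  if (mode === 'BACKWARD') return backward
  if (mode === 'FORWARD') return forward
  return backward && forward
}

const v1: SimpleSchema = { required: ['orderId', 'userId', 'amountCents'], optional: [] }
const addsOptional: SimpleSchema = { required: ['orderId', 'userId', 'amountCents'], optional: ['currency'] }
const addsRequired: SimpleSchema = { required: ['orderId', 'userId', 'amountCents', 'currency'], optional: [] }
const dropsRequired: SimpleSchema = { required: ['orderId', 'amountCents'], optional: [] }

const optionalIsFull = isCompatible('FULL', v1, addsOptional)             // true
const requiredIsBackward = isCompatible('BACKWARD', v1, addsRequired)     // false
const requiredIsForward = isCompatible('FORWARD', v1, addsRequired)       // true
const dropIsForward = isCompatible('FORWARD', v1, dropsRequired)          // false
```

Only the additive optional field passes every mode, which is why it is the one change that is safe regardless of whether producers or consumers deploy first.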

// Git-based schema registry: shared schemas as an npm package
// packages/schemas/src/order.schema.json

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id":     "https://schemas.acme.com/order/v1",
  "title":   "Order",
  "type":    "object",
  "required": ["orderId", "userId", "amountCents", "currency"],
  "additionalProperties": true,
  "properties": {
    "orderId":     { "type": "string" },
    "userId":      { "type": "string" },
    "amountCents": { "type": "integer", "minimum": 0 },
    "currency":    { "type": "string", "enum": ["USD", "EUR", "GBP"] },
    "createdAt":   { "type": "string", "format": "date-time" }
  }
}

// CI compatibility check: compare new schema against published version
// scripts/check-schema-compat.ts

import Ajv2020 from 'ajv/dist/2020'  // the schemas declare draft 2020-12
import addFormats from 'ajv-formats'
import { readFileSync } from 'fs'

const ajv = new Ajv2020({ strict: true })
addFormats(ajv)

const publishedSchema = JSON.parse(readFileSync('schemas/published/order.v1.json', 'utf8'))
const candidateSchema = JSON.parse(readFileSync('schemas/src/order.schema.json', 'utf8'))

// Compatibility check: structural rules comparing the candidate against the published schema.
// This covers top-level required fields and property types only, not nested schemas.
function checkBackwardCompat(published: object, candidate: object): boolean {
  // Fail fast if the candidate is not itself a valid schema
  ajv.compile(candidate)

  // Rule 1: No required fields removed (candidate required ⊇ published required)
  const pubRequired  = (published as { required?: string[] }).required ?? []
  const candRequired = (candidate as { required?: string[] }).required ?? []
  const removedRequired = pubRequired.filter((f) => !candRequired.includes(f))
  if (removedRequired.length > 0) {
    throw new Error(`Breaking: removed required fields: ${removedRequired.join(', ')}`)
  }

  // Rule 2: No new required fields added (old producers won't include them)
  const newRequired = candRequired.filter((f) => !pubRequired.includes(f))
  if (newRequired.length > 0) {
    throw new Error(`Breaking: new required fields: ${newRequired.join(', ')}`)
  }

  // Rule 3: No field type changes
  const pubProps  = (published  as { properties?: Record<string, { type?: string }> }).properties ?? {}
  const candProps = (candidate  as { properties?: Record<string, { type?: string }> }).properties ?? {}
  for (const [field, schema] of Object.entries(pubProps)) {
    if (candProps[field] && candProps[field].type !== schema.type) {
      throw new Error(`Breaking: field "${field}" type changed from ${schema.type} to ${candProps[field].type}`)
    }
  }

  console.log('✓ Schema is backward-compatible')
  return true
}

checkBackwardCompat(publishedSchema, candidateSchema)

For Kafka-based systems, use the Confluent Schema Registry with BACKWARD compatibility mode — register schemas at publish time, and the registry rejects any schema that fails the compatibility check before the producer deploys. For REST APIs, use Spectral with a custom JSON Schema ruleset to lint OpenAPI spec changes in CI. Publish the shared schema package to a private npm registry (GitHub Packages, Verdaccio) so each microservice pins a schema version in package.json — this makes schema version upgrades an explicit, reviewable dependency bump rather than an implicit drift. See the JSON Schema validation guide for Ajv setup and performance optimization.

Key Terms

Event envelope
The outer JSON structure that wraps every async message in a microservices architecture. Contains metadata fields shared across all event types: eventId (UUID v4 for deduplication), eventType (dot-namespaced string identifying the event), version (integer schema version), timestamp (ISO 8601 UTC), source (producing service name), correlationId (for distributed tracing), and payload (the business data object). A consistent envelope structure allows generic consumer infrastructure — routers, deduplication logic, DLQ handlers — to operate without parsing business payloads. The Kafka message key should be the entity ID (e.g. orderId) to preserve ordering across all events for the same entity.
Schema versioning
The discipline of changing a JSON schema over time while maintaining compatibility with existing producers and consumers. Because microservices deploy independently, a producer may be running schema version 2 while consumers still expect version 1. Backward-compatible changes are additive only: adding new optional fields is safe; removing fields, renaming fields, or changing field types breaks all consumers that read them. The version integer in the event envelope signals breaking changes, enabling consumers to route to version-specific handlers. Schema registries (Confluent, AWS Glue, Git-based) enforce compatibility rules automatically and block incompatible schema deployments.
Circuit Breaker
A resilience pattern that prevents cascade failures in microservices by stopping JSON API calls to a failing downstream service after a configurable failure threshold. The circuit has three states: Closed (normal — all calls pass through), Open (tripped — calls are rejected immediately with a fallback JSON response), and Half-Open (probe — one test call checks if the service recovered). Typical configuration: open after 5 consecutive failures, wait 30 seconds before Half-Open, close after 2 consecutive successes. Without a Circuit Breaker, a slow downstream service causes all upstream threads to block waiting for timeout, exhausting the connection pool and taking down the upstream service too. Node.js library: opossum; Java: Resilience4j; .NET: Polly.
Backward compatibility
A schema change is backward compatible, in the sense used throughout this guide, if old consumers can still read data produced by new producers. In practice this means: new producers may add optional fields (old consumers ignore unknown fields), but must never remove fields, rename fields, or change field types. Schema registry mode names are defined from the reader's side, so check them carefully: in Confluent Schema Registry, BACKWARD verifies that consumers on the new schema can read data written with the old one, FORWARD verifies that consumers on the old schema can read data written with the new one, and FULL verifies both. The registry rejects a non-conforming schema at registration time. Configure JSON Schema consumers with additionalProperties: true so they tolerate new fields added in future producer minor versions; Zod object schemas strip unknown fields by default, and .passthrough() keeps them instead.
gRPC
A high-performance RPC framework developed by Google that uses Protobuf binary serialization over HTTP/2. gRPC outperforms JSON REST by 7–10× throughput for equivalent payloads on internal networks, because Protobuf binary is 3–5× smaller than JSON and HTTP/2 multiplexing allows concurrent requests over a single connection. Services are defined in .proto schema files; code generators produce strongly-typed client and server stubs in every major language. gRPC is best for high-volume internal service-to-service calls; JSON REST is better for public-facing APIs (browser clients cannot call gRPC directly without grpc-web proxying).
OpenTelemetry
A CNCF open standard for distributed tracing, metrics, and logging in microservices. OpenTelemetry SDKs instrument services to emit spans (units of work with start/end timestamps, attributes, and status), which are collected by an OTel Collector and exported to backends like Jaeger, Zipkin, Datadog, or Grafana Tempo. Trace context is propagated between services via the W3C traceparent HTTP header for synchronous REST calls, and embedded as a traceContext field in JSON event envelopes for async messaging. Embedding traceId and spanId in every structured JSON log line correlates logs with traces in log aggregators.
Schema registry
A centralized service or repository that stores, versions, and enforces compatibility rules for JSON Schema or Protobuf definitions shared across multiple microservices. Options include Confluent Schema Registry (Kafka-native, REST API for schema registration and compatibility checks), AWS Glue Schema Registry (managed, integrates with MSK and Kinesis), and Git-based registries (shared schema files published as versioned npm packages or Python libraries). A schema registry prevents schema drift by making schema changes an explicit, reviewable artifact — producers must register a new schema version before deploying, and the registry rejects incompatible changes based on the configured compatibility mode.

FAQ

How do microservices communicate with JSON?

Microservices communicate with JSON over two primary channels: synchronous HTTP REST for request-response interactions (user lookups, payment authorization, inventory checks) and asynchronous message queues like Kafka or RabbitMQ for fire-and-forget or event-driven flows (order confirmations, audit logs, search indexing). For REST, each service exposes an OpenAPI spec and validates request/response bodies with JSON Schema at service boundaries. For async messaging, a standard JSON event envelope (containing eventId, eventType, version, timestamp, source, correlationId, and payload) travels on the wire. JSON is the default because every language runtime ships a JSON parser and HTTP tooling is universal. The primary tradeoff is payload size: JSON is 3–5× larger than equivalent Protobuf binary, which matters for high-volume internal services where switching to gRPC can yield 7–10× throughput gains.

How do I design a JSON event envelope for message queues?

A JSON event envelope wraps every async message with metadata the broker infrastructure needs, separate from the business payload. Required fields: eventId (UUID v4, the deduplication key), eventType (dot-namespaced string: "order.created"), version (integer schema version), timestamp (ISO 8601 UTC), source (producing service name), correlationId (for distributed tracing), and payload (the business data object). Keep the envelope flat: do not nest envelope metadata inside payload. Use the entity ID (e.g. orderId) as the Kafka message key so all events for the same entity land on the same partition, preserving ordering. The eventId serves as the idempotency key: consumers store processed IDs in a deduplication table (24–72 hour TTL) and skip re-delivered duplicates. A consistent envelope structure lets generic consumer infrastructure operate without parsing business payloads.
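
The deduplication step can be sketched with an in-memory store. This is a single-process illustration; a deployed consumer would more likely keep processed IDs in a shared database table or cache so that every consumer instance sees them. The DeduplicationStore class, its method names, and the injectable clock are assumptions made for the example.

```typescript
class DeduplicationStore {
  private seen = new Map<string, number>()  // eventId -> expiry time in ms
  private ttlMs: number
  private now: () => number

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Returns true the first time an eventId is seen within the TTL window,
  // and false for a re-delivered duplicate.
  markIfNew(eventId: string): boolean {
    const current = this.now()
    const expiry = this.seen.get(eventId)
    if (expiry !== undefined && expiry > current) return false
    this.seen.set(eventId, current + this.ttlMs)
    return true
  }

  // Drop expired entries; call periodically to bound memory use.
  sweep(): void {
    const current = this.now()
    this.seen.forEach((expiry, eventId) => {
      if (expiry <= current) this.seen.delete(eventId)
    })
  }

  size(): number {
    return this.seen.size
  }
}

// Usage: process an event only when its eventId has not been seen recently.
const TWENTY_FOUR_HOURS_MS = 24 * 60 * 60 * 1000
const dedupStore = new DeduplicationStore(TWENTY_FOUR_HOURS_MS)
const firstDelivery = dedupStore.markIfNew('550e8400-e29b-41d4-a716-446655440000')   // true
const secondDelivery = dedupStore.markIfNew('550e8400-e29b-41d4-a716-446655440000')  // false
```

The TTL only needs to exceed the longest window in which the broker may redeliver a message, which is why a bounded table is enough and the full event history does not have to be retained.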

How do I version JSON schemas across microservices?

Schema versioning follows one golden rule: backward-compatible changes are additive only. You can freely add new optional fields; existing consumers that do not know about a new field simply ignore it. Within a schema version you must never remove a field, rename a field, or change a field type, because any of these breaks existing consumers immediately. To signal a breaking change, increment the version integer in the event envelope (version: 2) or bump the REST API URL (/v2/users). Consumers implement version-aware handlers using a switch on the version field. Register schemas in a schema registry (Confluent, AWS Glue, or a Git-based registry) and add a CI gate that runs compatibility checks on every PR. Configure consumers with .passthrough() in Zod or additionalProperties: true in JSON Schema so they silently tolerate new optional fields added in future producer minor versions.
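
A version-aware handler might look like the following sketch. The `OrderV1` and `OrderV2` payload shapes, the `NormalizedOrder` type, and the assumption that version 1 amounts were in USD are all invented for illustration; only the switch on the version field comes from the text above.

```typescript
// Hypothetical payload shapes for two schema versions of the same event.
interface OrderV1 { orderId: string; total: number }            // total in currency units
interface OrderV2 { orderId: string; totalCents: number; currency: string }

interface VersionedEvent { version: number; payload: unknown }

// Internal shape the rest of the consumer works with.
interface NormalizedOrder { orderId: string; totalCents: number; currency: string }

function normalizeOrder(event: VersionedEvent): NormalizedOrder {
  switch (event.version) {
    case 1: {
      const p = event.payload as OrderV1;
      // Assumption for this sketch: version 1 amounts were always USD.
      return { orderId: p.orderId, totalCents: Math.round(p.total * 100), currency: "USD" };
    }
    case 2: {
      const p = event.payload as OrderV2;
      // Only known fields are read, so extra optional fields are tolerated.
      return { orderId: p.orderId, totalCents: p.totalCents, currency: p.currency };
    }
    default:
      // Fail loudly on versions this consumer has not been taught yet.
      throw new Error(`Unsupported order event version: ${event.version}`);
  }
}
```

The casts keep the example short; in practice each case would validate the payload against the schema for that version before using it.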

What is the Circuit Breaker pattern for JSON API calls?

The Circuit Breaker pattern prevents cascade failures by stopping JSON API calls to a failing downstream service after a threshold of consecutive failures. The circuit has three states: Closed (normal, all calls go through), Open (tripped, calls are rejected immediately without hitting the downstream service), and Half-Open (probe, one test call checks whether the service has recovered). A typical configuration: open after 5 consecutive failures, wait 30 seconds in the Open state before moving to Half-Open, and return to Closed after 2 consecutive successes in Half-Open. Without a Circuit Breaker, a slow or unavailable downstream service causes upstream threads to block until they time out, exhausting the connection pool and taking down the upstream service too. Return a fallback JSON response in the Open state and include a fallback: true flag so callers can distinguish real data from degraded responses. Libraries: opossum (Node.js), Resilience4j (Java), Polly (.NET).
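
To make the three states concrete, here is a small hand-rolled sketch of the state machine using the typical thresholds quoted above. It is not how opossum, Resilience4j, or Polly are implemented, and the class and method names are hypothetical. As a simplification, it lets every call through in Half-Open rather than a single probe.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;   // consecutive failures while closed
  private successes = 0;  // consecutive successes while half-open
  private openedAt = 0;
  private failureThreshold: number;
  private resetTimeoutMs: number;
  private successThreshold: number;
  private now: () => number;

  constructor(
    failureThreshold = 5,
    resetTimeoutMs = 30_000,
    successThreshold = 2,
    now: () => number = Date.now, // injectable clock, handy for tests
  ) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.successThreshold = successThreshold;
    this.now = now;
  }

  // Moves open -> half-open once the reset timeout has elapsed.
  currentState(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open";
      this.successes = 0;
    }
    return this.state;
  }

  allowRequest(): boolean {
    return this.currentState() !== "open";
  }

  recordSuccess(): void {
    if (this.state === "half-open") {
      this.successes += 1;
      if (this.successes >= this.successThreshold) {
        this.state = "closed";
        this.failures = 0;
      }
    } else {
      this.failures = 0; // a success breaks the run of consecutive failures
    }
  }

  recordFailure(): void {
    if (this.state === "half-open") {
      this.trip(); // a failed probe reopens the circuit immediately
      return;
    }
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.trip();
  }

  private trip(): void {
    this.state = "open";
    this.openedAt = this.now();
    this.failures = 0;
  }

  // Wraps a downstream call; returns the fallback when open or on failure.
  async call<T>(action: () => Promise<T>, fallback: () => T): Promise<T> {
    if (!this.allowRequest()) return fallback();
    try {
      const result = await action();
      this.recordSuccess();
      return result;
    } catch {
      this.recordFailure();
      return fallback();
    }
  }
}
```

A caller would pass the JSON request as `action` and something like `() => ({ items: [], fallback: true })` as the fallback, so degraded responses stay distinguishable from real data.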

When should I use gRPC instead of JSON REST between services?

Switch from JSON REST to gRPC for internal service-to-service calls when throughput or latency is a bottleneck. gRPC uses Protobuf binary serialization (3–5× smaller payloads than JSON) over HTTP/2 multiplexing (multiple concurrent streams on one connection) and generates strongly-typed client/server code from .proto schema files. Benchmarks show gRPC outperforms JSON REST by 7–10× throughput on internal networks. Use gRPC when services are high-volume (more than 5,000 calls per second), latency is critical (under 10ms p99 target), the service is internal-only (no browser clients), or you need bidirectional streaming. Keep JSON REST when the API is public-facing (browsers cannot call gRPC directly without grpc-web), the team values human-readable payloads for debugging, or the call volume does not justify the schema maintenance overhead of .proto files. A hybrid architecture — gRPC internally with a JSON REST gateway externally — is common.

How do I propagate trace context in JSON microservice calls?

Use the W3C traceparent header (format: "00-{traceId}-{parentSpanId}-{flags}", where traceId is 32 hex characters, parentSpanId is 16 hex characters, and flags is 2 hex characters) for distributed tracing across HTTP calls, and embed a traceContext field directly in JSON event envelopes for async messaging. With OpenTelemetry SDK auto-instrumentation, traceparent propagation is automatic for HTTP calls, but you must manually inject context into Kafka message headers and embed it in JSON event envelopes. Use propagation.inject(context.active(), traceContext) (OTel API) to write the current trace context into a plain object, then store that object in the envelope. In consumers, use propagation.extract(context.active(), envelope.traceContext) to restore the parent trace context and create child spans. Log traceId and spanId in every structured JSON log line so log aggregators can correlate logs with traces.
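
For readers who want to see what the traceparent value contains, the sketch below parses and rebuilds it by hand with no OpenTelemetry dependency. The function names are hypothetical, and in a real service the OTel propagation API mentioned above would normally do this work for you.

```typescript
interface TraceContext {
  traceId: string;      // 32 lowercase hex characters
  parentSpanId: string; // 16 lowercase hex characters
  flags: string;        // 2 hex characters, "01" means sampled
}

// Version "00" of the W3C traceparent format.
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const m = TRACEPARENT_RE.exec(header.trim());
  if (!m) return null;
  const [, traceId, parentSpanId, flags] = m;
  // The W3C spec treats all-zero trace and span IDs as invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return null;
  return { traceId, parentSpanId, flags };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentSpanId}-${ctx.flags}`;
}

// A child span keeps the trace ID and flags but carries its own span ID.
function childTraceparent(parent: TraceContext, newSpanId: string): string {
  return formatTraceparent({ ...parent, parentSpanId: newSpanId });
}
```

An envelope would then carry something like `traceContext: { traceparent: formatTraceparent(ctx) }`. The `traceparent` key matches what the W3C propagator writes into a carrier object, but treat the exact envelope layout as an assumption of this sketch.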

How do I validate JSON schemas across multiple services?

Centralize schema definitions in a schema registry — a Git monorepo with shared JSON Schema files published as a versioned npm package is the simplest starting point. Each service imports the shared schema package and validates incoming JSON at service boundaries using Ajv (JavaScript) or jsonschema (Python) before processing. Add a CI gate that runs schema compatibility checks on every PR: check that no required fields are removed, no field types change, and no new required fields are added without a version bump. For Kafka, configure the Confluent Schema Registry with BACKWARD compatibility mode — producers must register a new schema version before deploying, and the registry rejects incompatible changes. Configure consumers with additionalProperties: true or .passthrough() (Zod) so they tolerate new optional fields added by producers in future minor versions.
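
The CI gate described above could start as simply as the following sketch, which compares two flat object schemas and reports the three kinds of breaking change named in this answer. `ObjectSchema` and `findBreakingChanges` are hypothetical names; nested objects, arrays, `$ref`, and enums are deliberately not handled, and a schema registry's compatibility checker covers far more cases.

```typescript
// Minimal subset of JSON Schema: flat objects with typed properties.
interface ObjectSchema {
  properties?: Record<string, { type?: string }>;
  required?: string[];
}

// Returns a list of human-readable problems; empty means "looks compatible".
function findBreakingChanges(oldSchema: ObjectSchema, newSchema: ObjectSchema): string[] {
  const problems: string[] = [];
  const oldProps = oldSchema.properties ?? {};
  const newProps = newSchema.properties ?? {};
  const oldRequired = new Set(oldSchema.required ?? []);

  for (const [name, oldDef] of Object.entries(oldProps)) {
    const newDef = newProps[name];
    if (newDef === undefined) {
      if (oldRequired.has(name)) problems.push(`required field removed: ${name}`);
      continue;
    }
    if (oldDef.type !== newDef.type) {
      problems.push(`type changed: ${name} (${oldDef.type} -> ${newDef.type})`);
    }
  }

  for (const name of newSchema.required ?? []) {
    if (!oldRequired.has(name)) problems.push(`new required field: ${name}`);
  }
  return problems;
}
```

In a pipeline, the script would load the schema from the main branch and from the PR branch, call `findBreakingChanges`, and fail the build when the list is non-empty and the version was not bumped.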

How do I handle backward-incompatible JSON schema changes?

When a breaking JSON schema change is unavoidable (removing a field, renaming a field, changing a field type), follow a blue-green migration: (1) Deploy the new producer that emits version: 2 events and keeps dual-writing version 1 during the migration window, so consumers that have not been updated yet keep working. (2) Deploy new consumers that handle both versions using a switch on the version field. (3) Monitor that all consumers are processing version 2 events before stopping the version 1 dual-write. (4) Remove the version 1 dual-write from the producer after all consumers have updated. For REST APIs, run /v1 and /v2 endpoints simultaneously, route traffic via the API gateway, and set a deprecation sunset date using the Sunset HTTP response header. Never remove a field without a migration window, because internal services may have consumers you are not aware of.
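
For the REST side, the deprecation headers on /v1 responses could be built as in this sketch. The Sunset header takes an HTTP-date (RFC 8594); the `sunsetHeaders` helper, the successor URL, and the choice to add a `Link` header with the `successor-version` relation are assumptions for illustration.

```typescript
// Builds deprecation headers for responses served by a /v1 endpoint.
function sunsetHeaders(sunsetAt: Date, successorUrl: string): Record<string, string> {
  return {
    // toUTCString() produces the HTTP-date format the Sunset header expects.
    Sunset: sunsetAt.toUTCString(),
    // Points clients at the replacement endpoint supplied by the caller.
    Link: `<${successorUrl}>; rel="successor-version"`,
  };
}

// Example: announce that /v1/users stops working at the end of 2025.
const v1Headers = sunsetHeaders(
  new Date(Date.UTC(2025, 11, 31, 23, 59, 59)),
  "https://api.example.com/v2/users", // hypothetical URL
);
// v1Headers.Sunset is "Wed, 31 Dec 2025 23:59:59 GMT"
```

An API gateway or middleware would attach these headers to every /v1 response during the migration window.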

Further reading and primary sources