OpenAI Structured Outputs: 100% Schema-Compliant JSON with json_schema


OpenAI Structured Outputs is the response_format mode that compiles a JSON Schema into the model's decoder and guarantees the response matches that schema key for key, type for type. Released on August 6, 2024 with gpt-4o-2024-08-06, it replaces the older JSON mode for any case where you already know the shape. OpenAI claims 100% schema adherence on supported models because token sampling is constrained at generation time, so a non-conforming character cannot be emitted. The cost is a restricted JSON Schema subset: every object must set additionalProperties: false, every property must be listed in required, and many value-constraint keywords from full JSON Schema were unavailable at launch. Pydantic and Zod helpers in the official SDKs handle the schema generation; on the response side you must check the new refusal field before reading the parsed object. This guide walks through the payload anatomy, the supported keyword subset, the optional-field workaround, and the error modes you will hit in production.

Hand-rolling the schema and the model keeps rejecting it? Paste the schema into Jsonic's JSON Schema Validator — it flags missing additionalProperties: false, unlisted required keys, and unsupported keywords with exact JSON Pointer locations.


Structured Outputs vs JSON mode: what changed in August 2024

Before August 6, 2024 the only way to force OpenAI models into JSON was response_format: { type: "json_object" }, which the API calls JSON mode. JSON mode guaranteed syntactic validity (no trailing prose, no broken quotes) but said nothing about the shape. You declared the keys you wanted in the prompt, hoped the model followed, and validated on the way back. Real-world adherence sat around 95–98% on capable models, which sounds high until you ship a pipeline that processes a million requests and somewhere between 20,000 and 50,000 of them come back with the wrong shape.

Structured Outputs uses the same response_format field with a new type value of json_schema. You hand the API a JSON Schema and OpenAI compiles it into a constrained decoder — at every token step the model is allowed to sample only from tokens that keep the response schema-conformant. When you set strict: true, the conformance claim is total: every successful response matches the schema exactly.
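To make the token-masking idea concrete, here is a toy sketch in Python. It is not OpenAI's implementation, which is not public: the vocabulary, the scores, and both function names are invented for illustration. The "schema" is reduced to an enum of allowed strings, and at each step the decoder considers only tokens that keep the output a prefix of some allowed value.

```python
from typing import Dict, List

# Hypothetical toy setup: the "schema" is an enum of allowed JSON strings.
ALLOWED_VALUES = ['"red"', '"green"', '"blue"']

def allowed_tokens(prefix: str, vocab: List[str]) -> List[str]:
    """Return tokens that keep prefix + token a prefix of an allowed value."""
    return [
        tok for tok in vocab
        if any(value.startswith(prefix + tok) for value in ALLOWED_VALUES)
    ]

def constrained_greedy_decode(scores: Dict[str, float]) -> str:
    """Greedy decoding with disallowed tokens masked out at every step.

    `scores` stands in for model logits; a real model would rescore each step.
    """
    vocab = list(scores)
    output = ""
    while output not in ALLOWED_VALUES:
        candidates = allowed_tokens(output, vocab)
        if not candidates:
            raise ValueError(f"no valid continuation after {output!r}")
        # Pick the best-scoring token among those the mask permits.
        output += max(candidates, key=lambda tok: scores[tok])
    return output

# The unconstrained favourite is "Sure, " (prose), but the mask never allows it.
toy_scores = {
    "Sure, ": 9.0, '"': 5.0, "gr": 4.0, "re": 3.5, "een": 3.0,
    "d": 2.0, "bl": 1.0, "ue": 0.5,
}
result = constrained_greedy_decode(toy_scores)
print(result)
```

The highest-scoring token overall is never emitted because it cannot begin any allowed value. That is the sense in which a non-conforming response is impossible rather than merely unlikely.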

For a side-by-side on the legacy approach, see our OpenAI JSON mode guide. For the broader space of techniques across providers, see LLM JSON output strategies.

| Feature | JSON mode (json_object) | Structured Outputs (json_schema) |
| --- | --- | --- |
| Guaranteed syntactic JSON | Yes | Yes |
| Guaranteed schema match | No | Yes (100% on supported models) |
| Schema declaration | In prompt only | As json_schema.schema field |
| JSON Schema subset | n/a | Restricted (see below) |
| Refusal field on response | No | Yes (message.refusal) |
| SDK helper | None | parse() (Python), zodResponseFormat (Node) |
| Supported models | Most chat models from gpt-3.5-turbo-1106 onward | gpt-4o-2024-08-06 and later, gpt-4o-mini, o1, o3, etc. |

Anatomy of a json_schema response_format payload

The payload is a three-level object. response_format.type is the literal string json_schema. response_format.json_schema is a wrapper that carries the schema and metadata. json_schema.schema is the actual JSON Schema. Here is the minimum working shape:

{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    { "role": "system", "content": "Extract the user's contact info." },
    { "role": "user", "content": "Email me at jane@example.com — Jane Doe." }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "contact_info",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "name":  { "type": "string" },
          "email": { "type": "string" }
        },
        "required": ["name", "email"],
        "additionalProperties": false
      }
    }
  }
}

The four fields inside json_schema (the last one is optional):

  • name — a short identifier (snake_case is conventional). OpenAI uses this in logs and the constrained-decoder cache key, so reusing the same name across deploys helps with the cold-start compile.
  • schema — the JSON Schema document, written under the supported subset rules.
  • strict — set to true for the 100% adherence guarantee. Setting it to false degrades the call to a JSON-mode-like best-effort match and disables the helpful schema-validation errors.
  • description (optional) — a free-text hint the model sees alongside the schema. Useful when key names alone do not convey intent.
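If you build the payload in application code, a small wrapper keeps the three levels straight. The sketch below is one way to do it in Python using only the standard library; build_response_format is a hypothetical helper name, not part of the OpenAI SDK, and the description text is made up for the example.

```python
import json
from typing import Any, Dict, Optional

def build_response_format(
    name: str,
    schema: Dict[str, Any],
    description: Optional[str] = None,
    strict: bool = True,
) -> Dict[str, Any]:
    """Wrap a JSON Schema in the three-level response_format structure."""
    json_schema: Dict[str, Any] = {"name": name, "strict": strict, "schema": schema}
    if description is not None:
        # Only include the optional field when the caller supplies it.
        json_schema["description"] = description
    return {"type": "json_schema", "json_schema": json_schema}

contact_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
    },
    "required": ["name", "email"],
    "additionalProperties": False,
}

response_format = build_response_format(
    "contact_info",
    contact_schema,
    description="Contact details extracted from a user message.",
)
print(json.dumps(response_format, indent=2))
```

The resulting dictionary is what goes under the response_format key of the request body shown above.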

For a comparison with the OpenAI tool-call parameters shape (which uses the same subset), see function calling schemas.

Schema subset: which JSON Schema keywords are supported

OpenAI's strict-mode subset deliberately leaves out most value-constraint keywords from the full JSON Schema draft. The compiled decoder needs every keyword to translate into a token-mask rule; keywords that constrain values without shrinking the token set (numeric ranges, string patterns, format strings) were not supported at launch. Through late 2024 and into 2025 OpenAI shipped support for several of those, so always check the current docs before assuming a keyword is rejected. The structural keywords listed below have been supported since launch.

| Keyword | Strict mode status | Notes |
| --- | --- | --- |
| type | Supported | Single type or a list (e.g., ["string", "null"] for nullables) |
| properties | Supported | Every key must also appear in required |
| required | Supported | Must list every key in properties; no real optional fields |
| additionalProperties | Supported (must be false) | Required on every object node; defaults are rejected |
| items | Supported | Single schema only; tuple form (array of schemas) is not |
| enum | Supported | Up to 500 values total across the schema |
| anyOf | Supported | Each branch is a full schema; oneOf and allOf are not supported |
| $defs / $ref | Supported | Internal references only; cross-file refs not allowed |
| minLength, maxLength, pattern | Added post-launch (verify in current docs) | Not available at the August 2024 launch |
| minimum, maximum, multipleOf | Added post-launch (verify in current docs) | Not available at the August 2024 launch |
| format, uniqueItems, const | Unsupported (check current docs) | Express via prompt or post-validate on the client |
| oneOf, allOf, not | Unsupported | Rewrite to anyOf or flatten |

Hard size limits, as documented at launch: at most 100 total properties across the schema (including nested objects), at most 5 levels of nesting in the generated instance, at most 500 enum values in total, and at most 64 characters per field name. Exceeding any of these returns a schema-validation error from the API before the request reaches the model. OpenAI has revised some of these ceilings since launch, so confirm the current numbers in the official docs before relying on them.
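A pre-flight check for these limits is easy to sketch. The Python function below is a hypothetical helper, not an official tool: it walks a schema and reports violations of the four numbers quoted above, which are passed as parameters so they can be updated. How OpenAI counts nesting depth is an assumption here (each object level counts as one), and $ref targets are not unrolled.

```python
from typing import Any, Dict, List

def check_limits(
    schema: Dict[str, Any],
    max_properties: int = 100,
    max_depth: int = 5,
    max_enum_values: int = 500,
    max_key_length: int = 64,
) -> List[str]:
    """Return human-readable violations; an empty list means within limits."""
    totals = {"properties": 0, "enum_values": 0, "depth": 0}
    problems: List[str] = []

    def is_object(node: Dict[str, Any]) -> bool:
        types = node.get("type")
        return types == "object" or (isinstance(types, list) and "object" in types)

    def walk(node: Any, pointer: str, depth: int) -> None:
        if not isinstance(node, dict):
            return
        if is_object(node):
            depth += 1
            totals["depth"] = max(totals["depth"], depth)
        totals["enum_values"] += len(node.get("enum", []))
        for key, sub in node.get("properties", {}).items():
            totals["properties"] += 1
            if len(key) > max_key_length:
                problems.append(
                    f"{pointer}/properties/{key}: name exceeds {max_key_length} characters"
                )
            walk(sub, f"{pointer}/properties/{key}", depth)
        if isinstance(node.get("items"), dict):
            walk(node["items"], f"{pointer}/items", depth)
        for index, branch in enumerate(node.get("anyOf", [])):
            walk(branch, f"{pointer}/anyOf/{index}", depth)
        for name, definition in node.get("$defs", {}).items():
            # Definitions are measured on their own; references are not followed.
            walk(definition, f"{pointer}/$defs/{name}", 0)

    walk(schema, "#", 0)
    if totals["properties"] > max_properties:
        problems.append(f"{totals['properties']} properties exceeds {max_properties}")
    if totals["depth"] > max_depth:
        problems.append(f"nesting depth {totals['depth']} exceeds {max_depth}")
    if totals["enum_values"] > max_enum_values:
        problems.append(f"{totals['enum_values']} enum values exceeds {max_enum_values}")
    return problems

contact_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "email": {"type": "string"}},
    "required": ["name", "email"],
    "additionalProperties": False,
}
print(check_limits(contact_schema))  # prints []
```

Running a check like this in CI catches oversized schemas before they reach the API, but the API's own validation remains the authority.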

For a refresher on JSON Schema itself, see our JSON Schema basics guide.

additionalProperties: false is mandatory — and why

The most common schema-rejection error is some variant of additionalProperties must be false at #/properties/address. The default JSON Schema behavior is to allow keys that are not in properties; the strict-mode decoder needs a closed key set on every object node so it knows when an object is finished and can start emitting the closing brace. Half-closed schemas would make the token-mask calculation ambiguous, so the compiler refuses them.

The fix is mechanical: walk the schema tree and add additionalProperties: false to every type: "object" node. That includes objects inside anyOf branches and definitions inside $defs. A single nested object missing the field is enough to fail the whole request.

{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "name":    { "type": "string" },
        "address": {
          "type": "object",
          "properties": {
            "street": { "type": "string" },
            "city":   { "type": "string" }
          },
          "required": ["street", "city"],
          "additionalProperties": false
        }
      },
      "required": ["name", "address"],
      "additionalProperties": false
    }
  },
  "required": ["user"],
  "additionalProperties": false
}

Three nested objects, three repetitions of the field. The Pydantic and Zod helpers in the official SDKs inject the field automatically; hand-written schemas need a linter or a quick recursive walk to catch every node.
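For hand-written schemas, the recursive walk can be sketched in a few lines of Python. close_objects is a hypothetical helper name; it visits only the schema-bearing keywords from the supported subset (properties, items, anyOf, $defs), which is an assumption about what your schemas contain.

```python
import copy
from typing import Any, Dict

def _close_in_place(node: Any) -> None:
    """Set additionalProperties: false on this node and every nested object."""
    if not isinstance(node, dict):
        return
    types = node.get("type")
    if types == "object" or (isinstance(types, list) and "object" in types):
        node["additionalProperties"] = False
    # Recurse only into keywords whose values are schemas.
    for keyword in ("properties", "$defs"):
        for sub in node.get(keyword, {}).values():
            _close_in_place(sub)
    if isinstance(node.get("items"), dict):
        _close_in_place(node["items"])
    for branch in node.get("anyOf", []):
        _close_in_place(branch)

def close_objects(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a deep copy of the schema with every object node closed."""
    closed = copy.deepcopy(schema)
    _close_in_place(closed)
    return closed

# The nested example from above, written without the field.
open_schema = {
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "address": {
                    "type": "object",
                    "properties": {
                        "street": {"type": "string"},
                        "city": {"type": "string"},
                    },
                    "required": ["street", "city"],
                },
            },
            "required": ["name", "address"],
        }
    },
    "required": ["user"],
}

closed_schema = close_objects(open_schema)
print(closed_schema["properties"]["user"]["properties"]["address"]["additionalProperties"])
```

The function copies the input rather than mutating it, so the same source schema can still be used with validators that expect the open form.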

Required fields, optional fields, and the null-union trick

The second strict-mode rule that catches teams is the requirement that every key in properties also appears in required. There is no native optional field. Trying to ship a schema with a property that is not in required returns a validation error before the call even runs.

The documented workaround is the null union: declare the type as a two-element list containing the real type and null. The model can then emit a null literal to signal absence, and your application code interprets null as the optional case.

{
  "type": "object",
  "properties": {
    "name":         { "type": "string" },
    "phone_number": { "type": ["string", "null"] }
  },
  "required": ["name", "phone_number"],
  "additionalProperties": false
}

In Pydantic, phone_number: Optional[str] (or str | None in modern Python) maps to an equivalent nullable schema when the SDK helper converts the model; Pydantic expresses it as an anyOf with a string branch and a null branch rather than a type list, but the effect is the same. In Zod, z.string().nullable() produces a nullable field via zodResponseFormat. The model emits null rather than omitting the key, and your downstream code collapses that into a missing-key representation if your domain model needs the distinction. For a deeper walkthrough of Pydantic-from-JSON workflows, see Pydantic from JSON.

One side effect: every response carries every key, with possible null values. If you serialize the response into a database column that distinguishes null from missing, the schema will produce null values where the old free-form approach would have omitted the key. Document this in your data layer or normalize at the service boundary.
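Normalizing at the service boundary can be as small as one recursive function. The Python sketch below is an illustration, with drop_nulls as a hypothetical helper name: it removes dictionary keys whose value is null and leaves list items untouched, which may or may not match what your data layer expects.

```python
from typing import Any

def drop_nulls(value: Any) -> Any:
    """Recursively remove dict keys whose value is None; list items are kept."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(item) for item in value]
    return value

# A parsed response where the model used null for the absent fields.
response = {
    "name": "Jane Doe",
    "phone_number": None,
    "address": {"city": "Berlin", "postal_code": None},
}
normalized = drop_nulls(response)
print(normalized)  # {'name': 'Jane Doe', 'address': {'city': 'Berlin'}}
```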

Using Pydantic / Zod to auto-generate schemas

Writing JSON Schema by hand for anything beyond a flat object is tedious and easy to get wrong. The Python and Node SDKs ship helpers that generate the schema from a Pydantic model or a Zod schema, inject additionalProperties: false everywhere, list every property in required, and parse the response back into a typed object.

Python — client.beta.chat.completions.parse:

from openai import OpenAI
from pydantic import BaseModel
from typing import Optional

client = OpenAI()

class Address(BaseModel):
    street: str
    city: str
    postal_code: Optional[str] = None  # nullable: the generated schema accepts a string or null

class Contact(BaseModel):
    name: str
    email: str
    address: Address

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract contact info."},
        {"role": "user", "content": "Jane Doe, jane@example.com, 1 Main St, Berlin"},
    ],
    response_format=Contact,
)

message = completion.choices[0].message
if message.refusal:
    print(f"Refused: {message.refusal}")
else:
    contact: Contact = message.parsed
    print(contact.name, contact.email, contact.address.city)

Node / TypeScript — zodResponseFormat:

import OpenAI from "openai"
import { zodResponseFormat } from "openai/helpers/zod"
import { z } from "zod"

const Address = z.object({
  street: z.string(),
  city: z.string(),
  postal_code: z.string().nullable(),
})

const Contact = z.object({
  name: z.string(),
  email: z.string(),
  address: Address,
})

const client = new OpenAI()

const completion = await client.beta.chat.completions.parse({
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "Extract contact info." },
    { role: "user", content: "Jane Doe, jane@example.com, 1 Main St, Berlin" },
  ],
  response_format: zodResponseFormat(Contact, "contact"),
})

const message = completion.choices[0].message
if (message.refusal) {
  console.log("Refused:", message.refusal)
} else {
  const contact = message.parsed // typed as z.infer<typeof Contact>
  console.log(contact.name, contact.email, contact.address.city)
}

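Both helpers set strict: true automatically. The Python helper infers the schema name from the Pydantic class, while zodResponseFormat takes the name as its second argument ("contact" above). For a fuller treatment of Zod parsing patterns on the TypeScript side, see Zod parsing in TypeScript.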

Error handling: refusal field, max_tokens truncation, schema misses

Structured Outputs adds a new response field (refusal) and a new failure mode (truncation mid-object) that did not exist on free-form responses. Production code needs to handle three branches, not one.

1. The refusal branch. When safety policies block the response, message.refusal is a string and message.parsed is null. Always check refusal first.

const message = completion.choices[0].message

if (message.refusal) {
  // Model declined for safety reasons
  logger.warn({ refusal: message.refusal, requestId }, "model refused")
  return { ok: false, reason: "refused", text: message.refusal }
}

if (completion.choices[0].finish_reason === "length") {
  // Hit max_tokens mid-object — the parsed field is unreliable
  logger.error({ requestId }, "structured output truncated")
  return { ok: false, reason: "truncated" }
}

return { ok: true, data: message.parsed }

2. The truncation branch. Strict mode guarantees schema match if the response completes. If max_tokens cuts the response mid-object, finish_reason is length and the parsed object may be missing required keys. The SDK helpers raise a parse error in this case; raw HTTP callers must check finish_reason explicitly. Budget tokens generously — token-counted JSON is denser than free-form text but not by much, and repeat keys eat the budget fast on long lists.

3. The schema-validation branch. This fires before the model runs, at the API gateway. Common causes: missing additionalProperties: false, an unsupported keyword (e.g., oneOf), a property missing from required, or exceeding the 100-property / 5-level / 500-enum / 64-char limits. The error response has status 400 and an error message that names the offending location in the schema. Validate schemas in your build pipeline so the failure surfaces at CI time, not in production.
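For raw HTTP callers, the first two branches can be sketched in Python against the decoded response body. classify_completion is a hypothetical helper name; it assumes the body is a successful (status 200) Chat Completions response already parsed into a dictionary, so the status-400 schema-validation branch has to be handled one level up, where the HTTP status is visible.

```python
import json
from typing import Any, Dict

def classify_completion(body: Dict[str, Any]) -> Dict[str, Any]:
    """Sort a decoded Chat Completions body into refused / truncated / ok."""
    choice = body["choices"][0]
    message = choice["message"]

    # Branch 1: the model declined; content is not usable.
    if message.get("refusal"):
        return {"ok": False, "reason": "refused", "text": message["refusal"]}

    # Branch 2: max_tokens cut the output, so the JSON may be incomplete.
    if choice.get("finish_reason") == "length":
        return {"ok": False, "reason": "truncated"}

    # Defensive parse: should not fail in strict mode, but costs nothing to guard.
    try:
        data = json.loads(message.get("content") or "")
    except json.JSONDecodeError:
        return {"ok": False, "reason": "invalid_json"}
    return {"ok": True, "data": data}

# Hand-written fixtures standing in for real API responses.
ok_body = {"choices": [{"finish_reason": "stop", "message": {
    "content": '{"name": "Jane Doe", "email": "jane@example.com"}', "refusal": None}}]}
refused_body = {"choices": [{"finish_reason": "stop", "message": {
    "content": None, "refusal": "I can't help with that."}}]}
truncated_body = {"choices": [{"finish_reason": "length", "message": {
    "content": '{"name": "Jane', "refusal": None}}]}

for fixture in (ok_body, refused_body, truncated_body):
    print(classify_completion(fixture))
```

Checking refusal before finish_reason mirrors the ordering described above; the final parse guard is an extra precaution, not something the API requires.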

Pricing and latency overhead vs free-form output

Structured Outputs uses the same per-token pricing as free-form responses on the same model — there is no per-call surcharge for switching on json_schema. The cost differences come from two places.

Cold-start schema compile. The first time OpenAI sees a new schema in a given region, the constrained decoder compiles it. The compile is not billed but it does add latency to that one call — typically 1–3 seconds for small schemas, more for deeply nested ones with many $refs. Subsequent calls with the same schema (matched by the json_schema.name plus a hash of the schema field) reuse the cached compile and pay no extra latency. The implication: a deploy that ships ten new schemas hits a one-time compile cost per schema per region, but steady-state requests are not slower than free-form responses.

Token efficiency. JSON output is denser than free-form natural language for the same data — fewer filler words, no prose framing. On extraction tasks the response token count typically drops 20–40% versus a free-form prompt that asks the model to write the data as a paragraph. That savings usually outweighs the small overhead of carrying the schema in the request, even when the schema is large.

Steady-state latency. Once the schema is compiled, end-to-end latency is within a few percent of free-form output on the same model. The constrained decoder runs in parallel with token sampling; the masking step itself is cheap.

For reliability comparisons across the broader JSON pipeline (validation, retries, idempotency), see LLM JSON output strategies.

Key terms

Structured Outputs
OpenAI's response_format mode (type: "json_schema") that compiles a JSON Schema into the decoder and guarantees the response matches the schema exactly. Released August 6, 2024 with gpt-4o-2024-08-06.
strict mode
The strict: true flag inside json_schema. Enables the 100% schema-adherence guarantee and the schema-subset restrictions (closed additionalProperties, fully-listed required, supported keywords only).
refusal field
A response-only field at message.refusal that carries a string explanation when the model declines a Structured Outputs request for safety reasons. When refusal is set, message.parsed is null.
null-union pattern
The strict-mode idiom for optional fields: declare the type as a list containing both the real type and null (e.g., type: ["string", "null"]). Required by the rule that every key must appear in required.
constrained decoder
The token-sampling component that enforces the schema during generation by masking out tokens that would break conformance. The mechanism behind the 100% adherence claim.
parse helper
The SDK convenience methods — client.beta.chat.completions.parse (Python) and zodResponseFormat with client.beta.chat.completions.parse (Node) — that generate the JSON Schema from a Pydantic/Zod definition and deserialize the response into a typed object.

Frequently asked questions

What is the difference between Structured Outputs and JSON mode?

JSON mode (response_format type json_object) guarantees only that the model returns syntactically valid JSON — it does not constrain the shape. The model can return any keys, any nesting, any types; you still write a validator on your side and retry on mismatches. Structured Outputs (response_format type json_schema with strict true) guarantees the response matches a JSON Schema you supply, key for key, type for type. OpenAI claims 100% schema adherence on supported models because the schema is compiled into a constrained decoder that masks tokens at sampling time, so the model literally cannot emit a non-conforming character. Practically, Structured Outputs replaces JSON mode for any production use case where you have a known shape. Keep JSON mode only for free-form JSON where the keys vary per request and you cannot pre-declare a schema. The two features live in the same response_format field and switching between them is a one-line change.

Why does my schema fail with 'additionalProperties must be false'?

Strict mode requires every object in the schema (top-level and nested) to set additionalProperties to false. The default JSON Schema behavior is to allow extra keys, and OpenAI rejects schemas that rely on that default — the constrained decoder needs a closed key set to know when an object is finished. Add additionalProperties false to every type object node, including objects inside anyOf branches and inside $defs definitions. If you autogenerate the schema from Pydantic or Zod, the OpenAI SDK helpers (zodResponseFormat in the Node SDK and the beta parse method in Python) inject the field for you. If you write the schema by hand, treat additionalProperties false as a non-optional sibling of properties on every object. A single missed nested object is enough to trigger the error, so the fix is mechanical: walk the tree, add the field everywhere.

Can I make a field optional in Structured Outputs?

Strict mode treats every key in the properties object as required — you must list every property name in the required array. There is no native optional field. The workaround OpenAI documents is the null union: declare the type as a list containing both the real type and null, e.g. type: ["string", "null"]. The model can then emit null to signal absence, and your application code interprets null as the optional case. With Pydantic this maps to Optional[str] or str | None, which the SDK helper converts to the type list automatically. With Zod, use z.string().nullable() and the Node SDK helper handles the conversion. Treat the null union as the canonical way to express optionality and your application layer becomes the place where null collapses back into a missing key if your domain model needs that distinction.

Which JSON Schema keywords does OpenAI support in strict mode?

The supported subset covers structural keywords: type, properties, required, additionalProperties (must be false on objects), items, enum, anyOf, $defs, and $ref. The subset deliberately excludes most value-constraint keywords from full JSON Schema — minLength, maxLength, pattern, format, minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf, and uniqueItems were all unsupported at launch. Through late 2024 and into 2025 OpenAI shipped support for several of those (numeric ranges and a few string constraints), so check the current docs before assuming a keyword is rejected. Composition keywords other than anyOf — specifically oneOf, allOf, and not — remain unsupported. The hard limits to remember: maximum 100 total properties across the schema, maximum 5 levels of nesting, maximum 500 enum values, and key names capped at 64 characters. Hitting any of those returns a schema-validation error before the request reaches the model.

How do I handle the 'refusal' field in the response?

When a Structured Outputs call hits a safety policy, the model returns a response where message.refusal is a string explaining the refusal and message.parsed (or message.content) is null. Your code must check refusal first on every response, before trying to use the parsed object — accessing parsed when refusal is set will throw or read null. The typical pattern is an if-else: if response.choices[0].message.refusal is truthy, log it, surface a user-facing error, and skip the downstream pipeline; otherwise proceed with the parsed object. The refusal field is a regular string and you can show it to end users when appropriate, but treat it as model-generated text — do not pipe it back into another prompt without sanitization. The refusal field exists only on Structured Outputs responses; JSON mode and free-form responses do not have it.

Can I use $ref and definitions in my schema?

Yes, $defs and $ref are supported in strict mode and are the only way to express recursion or shared subschemas. Define reusable types under a top-level $defs object and reference them with #/$defs/TypeName from anywhere in the schema. A recursive type — a tree node that contains a list of child tree nodes — looks like $defs: { TreeNode: { type: object, properties: { value: { type: string }, children: { type: array, items: { $ref: "#/$defs/TreeNode" } } }, required: ["value", "children"], additionalProperties: false } }. Pydantic generates this pattern automatically when you reference a model recursively. The 5-level nesting limit applies to the unrolled instance, not to the schema definition, so a recursive schema is still safe as long as the actual generated JSON does not exceed the depth cap. References to schemas in other files are not supported — every $ref must resolve inside the request payload.
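Written out in full, the recursive TreeNode schema and a depth check on a generated instance might look like the Python sketch below. The wrapping root property and the instance_depth helper are hypothetical additions for the example, and counting one level per nested object (with arrays adding none) is an assumption about how the depth cap is measured.

```python
from typing import Any

tree_schema = {
    "type": "object",
    "properties": {"root": {"$ref": "#/$defs/TreeNode"}},
    "required": ["root"],
    "additionalProperties": False,
    "$defs": {
        "TreeNode": {
            "type": "object",
            "properties": {
                "value": {"type": "string"},
                # The definition refers to itself, which makes the type recursive.
                "children": {"type": "array", "items": {"$ref": "#/$defs/TreeNode"}},
            },
            "required": ["value", "children"],
            "additionalProperties": False,
        }
    },
}

def instance_depth(value: Any) -> int:
    """Count nested object levels in an instance; arrays add no level here."""
    if isinstance(value, dict):
        return 1 + max((instance_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return max((instance_depth(v) for v in value), default=0)
    return 0

# A small instance that conforms to tree_schema.
sample = {"root": {"value": "a", "children": [{"value": "b", "children": []}]}}
print(instance_depth(sample))
```

A check like this on the response side is useful mainly as a diagnostic, since the schema itself places no bound on how deep a recursive instance can go.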

Does Structured Outputs work with parallel tool calling?

Not reliably. The strict flag on a tool definition has the same effect as strict on response_format json_schema: the constrained decoder enforces the tool's parameter schema during generation. OpenAI's documentation, however, states that Structured Outputs is not compatible with parallel function calls. When the model invokes multiple tools in a single response (parallel_tool_calls true, which is the default), the emitted arguments may not match the supplied schemas, so set parallel_tool_calls to false whenever you depend on strict tool schemas. A second point to plan for is that strict mode adds compile time the first time a schema is seen (typically a second or two), and that compile happens per-schema per-API-region. Cold-start latency therefore shows up on the first call with a new schema, not on subsequent calls. If you have many tools each with strict schemas, expect the first request after a deploy to be slower than steady-state requests. Tool definitions and response_format json_schema can be combined in the same request.

Why does my Pydantic model fail to validate even though OpenAI returned matching JSON?

The most common cause is a Pydantic feature the SDK helper cannot translate to JSON Schema. Examples: validators that reformat strings (e.g., stripping whitespace), computed fields, custom Field constraints that map to unsupported keywords (regex patterns, length limits before they were added), or types like datetime that Pydantic accepts in multiple formats. The SDK strips these from the schema sent to OpenAI, so the model returns a value the schema accepts but Pydantic then rejects on the round-trip parse. Three fixes apply. First, narrow the Pydantic model to types and constraints the JSON Schema subset can express. Second, run the model through model_construct or model_validate with strict false when the cause is a format-flexible parser. Third, when a validator is essential, do the OpenAI call against a stripped-down dataclass-like model and copy the values into your real Pydantic model afterward. The mismatch is always at the schema-translation boundary, never at the model itself.

Further reading and primary sources