OpenAI JSON Mode and Function Calling: Structured JSON Output


Getting reliable structured JSON from OpenAI GPT models requires understanding three distinct mechanisms: JSON mode, Structured Outputs, and function calling. Each offers different guarantees, model requirements, and use cases. This guide covers all three with complete Python and Node.js examples, plus validation patterns with Zod and Pydantic.

JSON Mode — Guaranteed Valid JSON

JSON mode constrains the model to return parseable JSON, as long as generation is not cut off by max_tokens. Enable it by setting response_format: { type: "json_object" }. You must also mention "JSON" in the system or user message; the API returns an error if you do not.

from openai import OpenAI
import json

client = OpenAI()  # uses OPENAI_API_KEY env var

# JSON mode — guarantees parseable JSON, not necessarily your schema
response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": (
                "You are a data extraction assistant. "
                "Always respond with valid JSON matching this schema: "
                '{"name": string, "age": number, "email": string}'
            )
        },
        {
            "role": "user",
            "content": "Extract: Alice Smith, 30 years old, alice@example.com"
        },
    ],
)

raw = response.choices[0].message.content
data = json.loads(raw)   # parseable unless the output was truncated (finish_reason == "length")
print(data)  # → {"name": "Alice Smith", "age": 30, "email": "alice@example.com"}

Node.js equivalent:

import OpenAI from 'openai'

const client = new OpenAI()

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  response_format: { type: 'json_object' },
  messages: [
    {
      role: 'system',
      content: 'Extract entities as JSON: {"name": string, "age": number, "email": string}',
    },
    { role: 'user', content: 'Alice Smith, 30 years old, alice@example.com' },
  ],
})

const data = JSON.parse(response.choices[0].message.content!)
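
Because JSON mode only promises parseable JSON, it can help to check the parsed object against the expected shape before using it. The following is a minimal standard-library sketch, assuming the three-field schema from the prompts above; EXPECTED_FIELDS and check_shape are illustrative names, not part of the OpenAI SDK.

```python
import json
from typing import Optional

# Expected field names and accepted Python types (illustrative; mirrors the prompt's schema)
EXPECTED_FIELDS = {"name": str, "age": (int, float), "email": str}

def check_shape(raw: str) -> tuple[Optional[dict], list[str]]:
    """Parse a JSON-mode response and report missing or mistyped fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value is not an object"]
    problems = []
    for field, expected in EXPECTED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            # bool is a subclass of int in Python, so reject it explicitly
            problems.append(f"wrong type for {field}")
    return (data if not problems else None), problems

data, problems = check_shape('{"name": "Alice Smith", "age": "30"}')
print(problems)
```

Returning the list of problems rather than raising makes it straightforward to log the failure or retry the request with a corrective prompt.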

Structured Outputs — Schema-Validated JSON

Structured Outputs guarantee the response matches your JSON Schema exactly. Requires GPT-4o-2024-08-06 or later. The OpenAI SDK integrates with Pydantic (Python) and Zod (Node.js) to parse directly into typed models.

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

# Structured Outputs with Pydantic model (SDK parses automatically)
response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",  # Structured Outputs needs this model snapshot or later
    response_format=CalendarEvent,
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are meeting on Friday March 15th."},
    ],
)

event = response.choices[0].message.parsed  # → CalendarEvent instance
print(event.name)          # e.g. "Meeting"
print(event.date)          # e.g. "2026-03-15" (date is a plain string, so the format is not enforced)
print(event.participants)  # e.g. ["Alice", "Bob"]

Node.js equivalent:

// Node.js Structured Outputs with Zod
import OpenAI from 'openai'
import { zodResponseFormat } from 'openai/helpers/zod'
import { z } from 'zod'

const client = new OpenAI()

const CalendarEventSchema = z.object({
  name: z.string(),
  date: z.string(),
  participants: z.array(z.string()),
})

const response = await client.beta.chat.completions.parse({
  model: 'gpt-4o-2024-08-06',
  response_format: zodResponseFormat(CalendarEventSchema, 'calendar_event'),
  messages: [
    { role: 'system', content: 'Extract event information.' },
    { role: 'user', content: 'Alice and Bob meet on March 15.' },
  ],
})

const event = response.choices[0].message.parsed   // typed as z.infer<typeof CalendarEventSchema>
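
With Structured Outputs the model can still decline a request, in which case there is no parsed object to read. The sketch below assumes the message exposes a refusal string alongside parsed, and uses SimpleNamespace stand-ins for response.choices[0].message so it runs without an API call; unwrap_parsed and ModelRefusal are hypothetical names.

```python
from types import SimpleNamespace

class ModelRefusal(Exception):
    """Raised when the model declines to produce the structured result."""

def unwrap_parsed(message):
    """Return message.parsed, or raise if the model refused or nothing was parsed."""
    refusal = getattr(message, "refusal", None)
    if refusal:
        raise ModelRefusal(refusal)
    parsed = getattr(message, "parsed", None)
    if parsed is None:
        raise ValueError("no parsed content on message")
    return parsed

# Stand-ins for response.choices[0].message (assumed shape: .parsed and .refusal)
ok = SimpleNamespace(parsed={"name": "Meeting"}, refusal=None)
declined = SimpleNamespace(parsed=None, refusal="I can't help with that.")

print(unwrap_parsed(ok))
```

Checking for a refusal first keeps the calling code from dereferencing a missing parsed value.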

Function Calling (Tool Use)

Function calling lets the model generate a JSON arguments object to invoke a real function in your code. It is the foundation of agentic workflows where the model selects and calls tools across multiple steps.

import json
from openai import OpenAI

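client = OpenAI()

# Placeholder implementation so the example runs; replace with a real weather lookup
def get_weather(city, unit="celsius"):
    return {"city": city, "unit": unit, "temperature": 22}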

# Define tools (functions the model can call)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g., 'London'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["city"],
                "additionalProperties": False
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",  # or "required" to force tool use
)

message = response.choices[0].message
if message.tool_calls:
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)
    # → {"city": "Tokyo", "unit": "celsius"}

    # Execute the function with the extracted args
    weather_result = get_weather(args["city"], args.get("unit", "celsius"))

    # Continue conversation with tool result
    messages = [
        {"role": "user", "content": "What's the weather in Tokyo?"},
        message,  # assistant's tool_call message
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(weather_result),
        },
    ]
    final_response = client.chat.completions.create(model="gpt-4o", messages=messages)
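
The example above handles only the first tool call, but a single assistant message can contain several. A sketch of dispatching every call through a name-to-function registry follows. Tool calls are shown as plain dicts mirroring the API's JSON shape (with SDK objects, use attribute access instead); TOOL_REGISTRY, run_tool_calls, and the stub get_weather are illustrative.

```python
import json

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Stub standing in for a real weather lookup (hypothetical)."""
    return {"city": city, "unit": unit, "temperature": 21}

# Map tool names declared in `tools` to the Python callables that implement them
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute each tool call and build the matching role="tool" messages."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        func = TOOL_REGISTRY.get(name)
        if func is None:
            output = {"error": f"unknown tool: {name}"}
        else:
            try:
                args = json.loads(call["function"]["arguments"])
                output = func(**args)
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed JSON, or arguments that do not fit the function signature
                output = {"error": f"bad arguments: {exc}"}
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results

calls = [{"id": "call_1", "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}}]
print(run_tool_calls(calls))
```

Reporting failures back as tool messages, rather than raising, gives the model a chance to correct its arguments on the next turn.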

JSON Schema for Structured Outputs

You can pass a raw JSON Schema directly without Pydantic or Zod. Use strict: True to enforce the most reliable schema adherence — all properties must be in required and additionalProperties: false must appear at every level.

# Manual JSON Schema for structured outputs (no Pydantic)
response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "product_extraction",
            "strict": True,   # strict mode: model must follow schema exactly
            "schema": {
                "type": "object",
                "properties": {
                    "products": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "price": {"type": "number"},
                                "in_stock": {"type": "boolean"},
                                "category": {
                                    "type": "string",
                                    "enum": ["electronics", "clothing", "food"]
                                }
                            },
                            "required": ["name", "price", "in_stock", "category"],
                            "additionalProperties": False
                        }
                    },
                    "total_count": {"type": "integer"}
                },
                "required": ["products", "total_count"],
                "additionalProperties": False
            }
        }
    },
    messages=[
        {"role": "user", "content": "Extract products from: iPhone 15 ($999, in stock, electronics), Levi jeans ($79, out of stock, clothing)"}
    ]
)
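
The two strict-mode rules stated above (every property listed in required, additionalProperties set to false at every level) are easy to miss in a nested schema. The following sketch checks only those two rules before a schema is sent; it does not cover other JSON Schema features such as anyOf or $defs, and strict_violations is an illustrative helper, not an SDK function.

```python
def strict_violations(schema: dict, path: str = "$") -> list[str]:
    """Recursively list places where a schema breaks the two strict-mode rules."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        for name, sub in props.items():
            problems += strict_violations(sub, f"{path}.{name}")
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        # Descend into the element schema of arrays
        problems += strict_violations(schema["items"], f"{path}[]")
    return problems

example = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "price": {"type": "number"}},
    "required": ["name"],
    "additionalProperties": False,
}
print(strict_violations(example))
```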


Validate LLM JSON Output with Zod / Pydantic

Always validate — models can still hallucinate values even with Structured Outputs. Zod's safeParse() returns a result object rather than throwing, making it easy to handle validation failures gracefully.

// Validate the parsed response at runtime with Zod
import OpenAI from 'openai'
import { z } from 'zod'

const client = new OpenAI()

const ProductSchema = z.object({
  name: z.string().min(1),
  price: z.number().positive(),
  in_stock: z.boolean(),
  category: z.enum(['electronics', 'clothing', 'food']),
})

type Product = z.infer<typeof ProductSchema>

async function extractProducts(text: string): Promise<Product[]> {
  const response = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    response_format: { type: 'json_object' },
    messages: [
      { role: 'system', content: 'Extract products as JSON: {"products": [...]}' },
      { role: 'user', content: text },
    ],
  })

  const raw = JSON.parse(response.choices[0].message.content!)
  const result = z.object({ products: z.array(ProductSchema) }).safeParse(raw)

  if (!result.success) {
    console.error('Validation failed:', result.error.issues)
    return []
  }
  return result.data.products
}
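
The function above returns an empty array when any single product fails validation. Where partial results are acceptable, each item can be checked on its own instead. The following standard-library Python sketch shows that per-item variant (a different approach from the Zod example, not a port of it); the rules mirror ProductSchema, and product_errors and split_valid are illustrative names.

```python
ALLOWED_CATEGORIES = {"electronics", "clothing", "food"}

def product_errors(item) -> list[str]:
    """Return the validation errors for one product; an empty list means valid."""
    if not isinstance(item, dict):
        return ["not an object"]
    errors = []
    name = item.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name must be a non-empty string")
    price = item.get("price")
    # bool is a subclass of int, so exclude it before the numeric check
    if isinstance(price, bool) or not isinstance(price, (int, float)) or price <= 0:
        errors.append("price must be a positive number")
    if not isinstance(item.get("in_stock"), bool):
        errors.append("in_stock must be a boolean")
    category = item.get("category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        errors.append("category must be one of the allowed values")
    return errors

def split_valid(items: list) -> tuple[list, list]:
    """Separate valid products from rejected ones, keeping the reasons."""
    valid, rejected = [], []
    for item in items:
        errors = product_errors(item)
        if errors:
            rejected.append((item, errors))
        else:
            valid.append(item)
    return valid, rejected

valid, rejected = split_valid([
    {"name": "iPhone 15", "price": 999, "in_stock": True, "category": "electronics"},
    {"name": "Levi jeans", "price": -79, "in_stock": False, "category": "clothing"},
])
print(len(valid), len(rejected))
```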

Batch JSON Extraction at Scale

The OpenAI Batch API processes requests asynchronously at 50% of the standard price with a 24-hour completion window, which suits large dataset extraction where latency does not matter.

# OpenAI Batch API — 50% cost reduction for async processing
import json
from pathlib import Path

# Prepare batch of requests
requests = []
texts = ["text 1...", "text 2...", "text 3..."]

for i, text in enumerate(texts):
    requests.append({
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "response_format": {"type": "json_object"},
            "messages": [
                {"role": "system", "content": "Extract data as JSON."},
                {"role": "user", "content": text},
            ],
        }
    })

# Write JSONL file
batch_file = Path("batch_requests.jsonl")
batch_file.write_text("\n".join(json.dumps(r) for r in requests))

# Upload and submit
with open(batch_file, "rb") as f:
    uploaded = client.files.create(file=f, purpose="batch")

batch = client.batches.create(
    input_file_id=uploaded.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

print(f"Batch ID: {batch.id}")  # poll status with client.batches.retrieve(batch.id)
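
Once the batch completes, the output file is also JSONL, with one result per request matched by custom_id rather than by line position. The sketch below parses that text, assuming each line carries the chat completion under response.body and a non-null error for failed requests; the sample line is made up for illustration and parse_batch_output is a hypothetical helper.

```python
import json

def parse_batch_output(jsonl_text: str) -> tuple[dict, dict]:
    """Return (custom_id -> parsed JSON content, custom_id -> failure reason)."""
    parsed, failed = {}, {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        custom_id = record["custom_id"]
        response = record.get("response") or {}
        if record.get("error") or response.get("status_code") != 200:
            failed[custom_id] = record.get("error") or f"HTTP {response.get('status_code')}"
            continue
        content = response["body"]["choices"][0]["message"]["content"]
        try:
            parsed[custom_id] = json.loads(content)
        except (json.JSONDecodeError, TypeError) as exc:
            failed[custom_id] = f"unparseable content: {exc}"
    return parsed, failed

# Made-up sample line in the assumed output shape
sample = json.dumps({
    "custom_id": "request-0",
    "response": {
        "status_code": 200,
        "body": {"choices": [{"message": {"content": '{"name": "Alice"}'}}]},
    },
    "error": None,
})
print(parse_batch_output(sample))
```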

Comparison: JSON Mode vs Structured Outputs vs Function Calling

| Feature | JSON Mode | Structured Outputs | Function Calling |
| --- | --- | --- | --- |
| Guarantee | Valid JSON | Schema adherence | Arguments match schema |
| Models | GPT-3.5-turbo-1106+ | GPT-4o-2024-08-06+ | GPT-3.5+ |
| Schema source | Prompt (free-form) | JSON Schema in API | JSON Schema in tools |
| Strict mode | No | Yes | Yes (with strict: true) |
| Best for | Simple extraction | Data extraction, forms | Actions, multi-step agents |
| SDK helper | None needed | beta.chat.completions.parse() | Tool loop pattern |
| Cost overhead | ~5-15% | ~10-20 tokens/field | ~50-100 tokens/tool |

Definitions

JSON mode
An OpenAI chat completions setting (response_format: { type: "json_object" }) that guarantees the model's response can be parsed as JSON; does not guarantee schema conformance.
Structured Outputs
An OpenAI feature (response_format: { type: "json_schema" }) that guarantees the response conforms to a provided JSON Schema; available on GPT-4o-2024-08-06+.
function calling
An OpenAI capability where the model generates a JSON arguments object matching a declared function schema; enables agentic workflows where the model can invoke real functions.
strict mode
A Structured Outputs option (strict: true in the schema) that requires all properties to be listed in required and additionalProperties: false at every level; enables the most reliable schema adherence.
finish_reason
The field in a completion response indicating why the model stopped generating: "stop" (normal), "length" (hit max_tokens — JSON may be truncated), "tool_calls" (function calling), "content_filter" (content policy).

FAQ

What is OpenAI JSON mode?

JSON mode is enabled by setting response_format: { type: "json_object" }. It guarantees the model returns output that can be parsed as JSON — no markdown fences, no prose wrapping. You must mention "JSON" in the system or user message, or the API returns an error. It works with GPT-4o, GPT-4-turbo, and GPT-3.5-turbo-1106+. Important: JSON mode guarantees valid JSON but does NOT guarantee your specific schema — always validate with Zod or Pydantic to confirm the shape matches what you expect.

What is the difference between JSON mode and Structured Outputs?

JSON mode guarantees valid JSON in any shape — the model decides the structure. Structured Outputs guarantees the response matches your JSON Schema exactly, including required fields, types, and enum values. Structured Outputs requires GPT-4o-2024-08-06+ and strict: true in the schema. Use Structured Outputs for production data extraction; use JSON mode as a fallback for older models that do not support Structured Outputs.

How do function calling and JSON mode differ?

Function calling is designed for executing actions: the model generates a JSON arguments object so your code can call a real function. JSON mode is designed for data extraction: the model returns data as a JSON object. Function calling enables multi-step agentic workflows; JSON mode is one-shot extraction. You can combine them — define a tool whose parameters match the data shape you need, giving you function calling reliability with structured schema adherence.

Should I validate OpenAI JSON output even with Structured Outputs?

Yes — always validate. Structured Outputs guarantees schema adherence (correct types, required fields present, enum values respected) but cannot guarantee semantic correctness. The model may hallucinate values that match the declared type but are factually wrong: an incorrect city name, an off-by-magnitude price, a date in the wrong format. Use Zod's safeParse() or Pydantic's model_validate_json() for runtime type checks, and add business logic validation on top for semantic correctness.

How many tokens does JSON mode add to my request?

JSON mode itself adds no prompt tokens, but structured output typically increases output token count by 5–15% versus prose. Function calling tool definitions add ~50–100 tokens per tool to the prompt. Structured Outputs json_schema definitions add ~10–20 tokens per field defined. For high-volume workloads, use gpt-4o-mini (significantly cheaper) and the Batch API (50% cost reduction for async processing).
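
As a rough worked example of the per-tool and per-field figures above, the following sketch turns them into a low/high prompt overhead range. The constants are the approximate numbers quoted in this answer, not measured values, and prompt_overhead is an illustrative helper.

```python
# Approximate per-item prompt overhead quoted above, as (low, high) token counts
TOKENS_PER_TOOL = (50, 100)
TOKENS_PER_SCHEMA_FIELD = (10, 20)

def prompt_overhead(n_tools: int, n_schema_fields: int) -> tuple[int, int]:
    """Estimate the extra prompt tokens as a (low, high) range."""
    low = TOKENS_PER_TOOL[0] * n_tools + TOKENS_PER_SCHEMA_FIELD[0] * n_schema_fields
    high = TOKENS_PER_TOOL[1] * n_tools + TOKENS_PER_SCHEMA_FIELD[1] * n_schema_fields
    return low, high

# 3 tools and an 8-field schema: 150-300 plus 80-160 tokens
print(prompt_overhead(3, 8))  # (230, 460)
```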

How do I handle OpenAI API errors when using JSON mode?

Check response.choices[0].finish_reason. A value of "length" means the model hit max_tokens before finishing — in JSON mode this produces truncated, unparseable JSON; increase max_tokens. A value of "content_filter" means the model was blocked by content policy. Always wrap JSON.parse() / json.loads() in try/catch. For 429 and 500 errors, retry with exponential backoff — the OpenAI SDK has built-in retry support via the max_retries parameter.
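
The checks described above can be sketched as a single helper that inspects finish_reason before parsing. SimpleNamespace objects stand in for response.choices[0] so the example runs without an API call; parse_choice and TruncatedOutput are illustrative names.

```python
import json
from types import SimpleNamespace

class TruncatedOutput(Exception):
    """Raised when generation stopped at max_tokens, so the JSON is likely incomplete."""

def parse_choice(choice):
    """Check finish_reason first, then parse the message content as JSON."""
    reason = choice.finish_reason
    if reason == "length":
        raise TruncatedOutput("hit max_tokens; increase it and retry")
    if reason == "content_filter":
        raise ValueError("response blocked by content filter")
    try:
        return json.loads(choice.message.content)
    except (json.JSONDecodeError, TypeError) as exc:
        raise ValueError(f"unparseable JSON: {exc}") from exc

# Stand-ins for response.choices[0]
complete = SimpleNamespace(finish_reason="stop", message=SimpleNamespace(content='{"ok": true}'))
cut_off = SimpleNamespace(finish_reason="length", message=SimpleNamespace(content='{"ok": tr'))

print(parse_choice(complete))
```

Raising a distinct exception for truncation lets the caller retry with a larger max_tokens while treating other failures differently.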

Can I use JSON mode with streaming?

Yes — stream: true works with JSON mode and Structured Outputs. Accumulate all chunks into a buffer and call JSON.parse() only after the stream closes (finish_reason === "stop"). Do not attempt to parse partial JSON from individual chunks — it will fail. Use streaming with JSON mode primarily to show a loading indicator while the model processes, not for incremental JSON consumption.
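
The buffering approach described above can be sketched as follows. Chunks are shown as plain dicts mirroring the streaming JSON shape (choices[0].delta.content and finish_reason); with SDK objects, use attribute access instead. The simulated chunks and collect_stream are illustrative.

```python
import json

def collect_stream(chunks) -> dict:
    """Accumulate streamed content and parse it once the stream finishes normally."""
    buffer = []
    finish_reason = None
    for chunk in chunks:
        if not chunk["choices"]:
            continue  # some chunks carry no choices
        choice = chunk["choices"][0]
        piece = choice["delta"].get("content")
        if piece:
            buffer.append(piece)
        if choice.get("finish_reason"):
            finish_reason = choice["finish_reason"]
    if finish_reason != "stop":
        raise ValueError(f"stream ended with finish_reason={finish_reason!r}")
    return json.loads("".join(buffer))

# Simulated stream: the JSON text is split across two chunks
simulated = [
    {"choices": [{"delta": {"content": '{"name": "Al'}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": 'ice"}'}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(collect_stream(simulated))  # {'name': 'Alice'}
```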

How do I use the OpenAI Batch API for JSON extraction?

Create a JSONL file where each line is a JSON object with custom_id, method, url, and body (the full chat completions request). Upload with client.files.create(purpose="batch"), submit with client.batches.create(), poll until status === "completed", then download results from output_file_id. The Batch API is 50% cheaper than synchronous calls with a 24-hour processing window — ideal for large dataset extraction where latency does not matter.
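
The polling step can be sketched as a small loop. The retrieve callable is passed in so the example runs without an API call; in real use it would be client.batches.retrieve. The set of terminal statuses, the 60-second default interval, and the wait_for_batch name are assumptions made for illustration.

```python
import time
from types import SimpleNamespace

# Statuses after which the batch will not change further (assumed set)
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(retrieve, batch_id, poll_seconds=60.0, max_polls=None, sleep=time.sleep):
    """Poll retrieve(batch_id) until the batch reaches a terminal status."""
    polls = 0
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"batch {batch_id} still {batch.status} after {polls} polls")
        sleep(poll_seconds)

# Fake retrieve function that completes on the third poll (demo only)
_statuses = iter(["validating", "in_progress", "completed"])
def fake_retrieve(batch_id):
    return SimpleNamespace(id=batch_id, status=next(_statuses), output_file_id="file-demo")

done = wait_for_batch(fake_retrieve, "batch_demo", poll_seconds=0)
print(done.status)  # completed
```

A caller would check done.status before reading output_file_id, since a failed or expired batch also ends the loop.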

Further reading and primary sources