Anthropic Claude: Tool Use Blocks for Reliable JSON Output

Anthropic Claude has no response_format parameter and no JSON mode flag on the Messages API, by design. The supported path to structured JSON output is tool use: you declare a tool whose input_schema is the JSON shape you want, you pin tool_choice to force that tool, and the model replies with a tool_use content block whose input field is a parsed object shaped by your schema. Tool use has been generally available since May 2024, shortly after the Claude 3 launch, and is the unified primitive across claude-opus-4, claude-sonnet-4-5, and claude-haiku-4-5. This guide walks through the response shape, the schema constraints, forced tool_choice, streaming with input_json_delta, parallel calls, and the prefill trick, with Python and Node examples.

Building a schema for Claude tool use? Paste it into Jsonic's JSON Schema Validator to confirm it parses cleanly and matches example payloads before you ship the prompt.

Why Claude has no JSON mode — and uses tool use instead

OpenAI's API exposes two output-format levers: response_format: { type: 'json_object' } for any-JSON and response_format: { type: 'json_schema' } for strict schema enforcement. Anthropic made a different design call: there is one primitive, tool use, and structured output is just a degenerate case of it, a tool the model is forced to call. Where OpenAI splits "give me JSON" from "call a function", Anthropic treats both as the same operation: pick a tool, fill its arguments, return them as a tool_use content block.

The practical effect is one API surface to learn. The same code path that runs a weather lookup also gets you a structured product extraction or a typed form result: you just stop sending the result back as a tool_result and use the parsed input directly. The trade-off is conceptual overhead: you have to think of "return JSON" as "fill a tool I'm never going to actually call," which feels indirect coming from response_format.

For background on the broader landscape of LLM JSON output across providers, see our general LLM JSON output guide; for a side-by-side with OpenAI's approach, see the OpenAI JSON mode comparison and OpenAI Structured Outputs.

Anatomy of a tool_use response block

The Messages API always returns content as an array of typed blocks. For a tool call, the full response looks like this:

{
  "id": "msg_018X...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-5",
  "stop_reason": "tool_use",
  "content": [
    {
      "type": "text",
      "text": "I'll extract the product details now."
    },
    {
      "type": "tool_use",
      "id": "toolu_01ABC...",
      "name": "extract_product",
      "input": {
        "name": "Anker 737 Power Bank",
        "price_usd": 149.99,
        "in_stock": true,
        "tags": ["electronics", "charging", "portable"]
      }
    }
  ],
  "usage": { "input_tokens": 412, "output_tokens": 87 }
}

Three fields matter on the tool_use block:

  • id — a unique identifier (toolu_...). Echo it back in a follow-up tool_result if you continue the conversation; for one-shot JSON extraction you can ignore it.
  • name — the tool the model chose. When you force a single tool via tool_choice, this always matches the name you pinned.
  • input — the parsed JSON object. SDKs return it already deserialized; the raw HTTP body sends it as a JSON value, not a string.

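The top-level stop_reason is tool_use when the model decided to call a tool, which is useful for routing logic when tool_choice is auto. Under forced tool use, expect stop_reason to be tool_use unless the reply was cut short, in which case it is max_tokens and the input may be incomplete.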
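The field walk-through above can be sketched as a small helper that works on the raw JSON response, without the SDK. This is a minimal sketch: the function name extract_tool_inputs and the id toolu_example are made up for illustration, and the sample message mirrors the shape shown earlier in this section.

```python
from typing import Any


def extract_tool_inputs(message: dict[str, Any], tool_name: str) -> list[dict[str, Any]]:
    """Return the input object of every tool_use block that matches tool_name."""
    # A truncated reply can carry an incomplete tool call, so refuse to use it.
    if message.get("stop_reason") == "max_tokens":
        raise ValueError("response hit max_tokens; tool input may be incomplete")
    return [
        block["input"]
        for block in message.get("content", [])
        if block.get("type") == "tool_use" and block.get("name") == tool_name
    ]


# Hand-written sample in the shape of a Messages API response (placeholder id).
sample = {
    "role": "assistant",
    "stop_reason": "tool_use",
    "content": [
        {"type": "text", "text": "I'll extract the product details now."},
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "extract_product",
            "input": {"name": "Anker 737 Power Bank", "price_usd": 149.99, "in_stock": True},
        },
    ],
}

inputs = extract_tool_inputs(sample, "extract_product")
print(inputs[0]["name"])
```

Filtering by name as well as type keeps the helper usable when several tools are defined and tool_choice is auto or any.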

Defining input_schema for a forced-JSON tool

The input_schema field accepts a JSON Schema object — the same draft-07 subset used by OpenAPI request bodies and standard validators. The top level should be type: "object" with a properties map and a required array. Descriptions on each property double as model instructions — write them in plain language.

{
  "name": "extract_product",
  "description": "Extract structured product information from raw text. Always call this tool with the fields you can confidently identify.",
  "input_schema": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string",
        "description": "Product name as it would appear on a listing page."
      },
      "price_usd": {
        "type": "number",
        "description": "Price in US dollars. Omit if not stated in the source text."
      },
      "in_stock": {
        "type": "boolean",
        "description": "True if explicitly described as available; false if backordered or sold out."
      },
      "tags": {
        "type": "array",
        "items": { "type": "string" },
        "description": "Short category tags — 3 to 6 items."
      },
      "category": {
        "type": "string",
        "enum": ["electronics", "apparel", "home", "books", "other"]
      }
    },
    "required": ["name", "in_stock"]
  }
}

Keywords that behave well with Claude: type, properties, required, items, enum, description, nested objects. Keywords to avoid: anyOf/oneOf (accepted but not strictly enforced), if/then/else (unreliable), $ref (limited support), pattern (best-effort). Keep schemas flat. For deeper coverage of which keywords are safe across LLM tool schemas, see our function calling schemas reference and the broader JSON Schema tutorial.
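One way to act on the keyword list above is a pre-flight check that walks a schema and reports the keywords flagged as risky. The sketch below is an illustration only: find_risky_keywords and RISKY_KEYWORDS are hypothetical names, and the keyword set simply restates the list in the paragraph above rather than anything the API publishes.

```python
from typing import Any, Iterator

# Keywords the paragraph above suggests avoiding in tool schemas.
RISKY_KEYWORDS = {"anyOf", "oneOf", "if", "then", "else", "$ref", "pattern"}


def find_risky_keywords(schema: Any, path: str = "$") -> Iterator[str]:
    """Yield a dotted path for every risky keyword found in the schema."""
    if isinstance(schema, list):
        for index, item in enumerate(schema):
            yield from find_risky_keywords(item, f"{path}[{index}]")
        return
    if not isinstance(schema, dict):
        return
    for key, value in schema.items():
        if key == "properties" and isinstance(value, dict):
            # Names under "properties" are field names, not schema keywords.
            for field, sub_schema in value.items():
                yield from find_risky_keywords(sub_schema, f"{path}.properties.{field}")
            continue
        if key in RISKY_KEYWORDS:
            yield f"{path}.{key}"
        yield from find_risky_keywords(value, f"{path}.{key}")


schema = {
    "type": "object",
    "properties": {
        "sku": {"type": "string", "pattern": "^[A-Z]{3}-[0-9]+$"},
        "pattern": {"type": "string"},  # a field that merely happens to be named "pattern"
        "price": {"anyOf": [{"type": "number"}, {"type": "null"}]},
    },
    "required": ["sku"],
}

warnings = sorted(find_risky_keywords(schema))
print(warnings)
```

Running this in a unit test next to the tool definition is a cheap way to notice when a schema edit introduces a keyword the model may not honor.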

Forcing tool use with tool_choice: { type: 'tool', name }

tool_choice controls whether the model must call a tool and which one. For JSON extraction you want the third form — pin the model to one named tool — because auto lets it answer in text instead.

tool_choice | Behavior | When to use
{ type: 'auto' } | Default. Model decides between text reply and tool call. | Agent loops where text is also a valid answer.
{ type: 'any' } | Model must call some tool, but picks which. | Router prompts with several specialized tools.
{ type: 'tool', name: 'X' } | Model must call tool X. Exactly one tool_use block. | Structured JSON extraction (the standard pattern).
{ type: 'none' } | Tools are visible to the model but cannot be called. | Rare. Useful for showing tool definitions for reasoning context only.

Here is the full Python pattern with the official anthropic SDK, forcing one named tool and reading input back:

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from env

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=[{
        "name": "extract_product",
        "description": "Extract structured product information.",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price_usd": {"type": "number"},
                "in_stock": {"type": "boolean"},
            },
            "required": ["name", "in_stock"],
        },
    }],
    tool_choice={"type": "tool", "name": "extract_product"},
    messages=[{
        "role": "user",
        "content": "The Anker 737 Power Bank is $149.99 and shipping today.",
    }],
)

# Find the tool_use block and read its parsed input
tool_block = next(b for b in response.content if b.type == "tool_use")
product = tool_block.input  # already a dict
# price_usd is optional in the schema, so read it with .get() to avoid a KeyError
print(product["name"], product.get("price_usd"), product["in_stock"])

Node version with the official SDK:

import Anthropic from '@anthropic-ai/sdk'

const client = new Anthropic()  // reads ANTHROPIC_API_KEY

const response = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  tools: [{
    name: 'extract_product',
    description: 'Extract structured product information.',
    input_schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        price_usd: { type: 'number' },
        in_stock: { type: 'boolean' },
      },
      required: ['name', 'in_stock'],
    },
  }],
  tool_choice: { type: 'tool', name: 'extract_product' },
  messages: [{ role: 'user', content: 'The Anker 737 Power Bank is $149.99 and shipping today.' }],
})

const toolBlock = response.content.find(b => b.type === 'tool_use')
if (toolBlock?.type === 'tool_use') {
  const product = toolBlock.input as { name: string; price_usd?: number; in_stock: boolean }
  console.log(product.name, product.price_usd, product.in_stock)
}

Raw HTTP body, for callers not using an SDK — the anthropic-version header is required; without it the request is rejected:

POST https://api.anthropic.com/v1/messages
x-api-key: sk-ant-...
anthropic-version: 2023-06-01
content-type: application/json

{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "tools": [{
    "name": "extract_product",
    "input_schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price_usd": { "type": "number" },
        "in_stock": { "type": "boolean" }
      },
      "required": ["name", "in_stock"]
    }
  }],
  "tool_choice": { "type": "tool", "name": "extract_product" },
  "messages": [
    { "role": "user", "content": "The Anker 737 Power Bank is $149.99 and shipping today." }
  ]
}
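For Python callers who want the raw HTTP route without an SDK, the request above could be assembled with the standard library roughly as follows. This is a sketch under assumptions: build_messages_request is a hypothetical helper, the API key is a placeholder, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_messages_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Messages API."""
    return urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",  # required header, as noted above
            "content-type": "application/json",
        },
        method="POST",
    )


body = {
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "tools": [{
        "name": "extract_product",
        "input_schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "in_stock": {"type": "boolean"}},
            "required": ["name", "in_stock"],
        },
    }],
    "tool_choice": {"type": "tool", "name": "extract_product"},
    "messages": [{"role": "user", "content": "The Anker 737 Power Bank is $149.99 and shipping today."}],
}

request = build_messages_request("sk-ant-placeholder", body)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Keeping construction separate from sending makes the headers and body easy to inspect in a test before any network call happens.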

If your code then calls a downstream service with the parsed result, that service's Authorization header may carry a bearer JWT; see our JWT decoder guide if you need to inspect those.

Handling streaming tool_use deltas (input_json_delta)

When you set stream: true, the API switches to server-sent events. Tool input arrives in fragments — a content_block_start announces a new tool_use block (with its id and name), then a run of content_block_delta events with delta.type equal to input_json_delta carry partial_json string pieces, and a content_block_stop closes the block. You cannot parse fragments individually — concatenate them and parse once at the end.

import Anthropic from '@anthropic-ai/sdk'

const client = new Anthropic()

const stream = client.messages.stream({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  tools: [/* same tool definition */],
  tool_choice: { type: 'tool', name: 'extract_product' },
  messages: [{ role: 'user', content: 'The Anker 737...' }],
})

let toolName = ''
let buffer = ''

for await (const event of stream) {
  if (event.type === 'content_block_start' && event.content_block.type === 'tool_use') {
    toolName = event.content_block.name
    buffer = ''
  } else if (
    event.type === 'content_block_delta' &&
    event.delta.type === 'input_json_delta'
  ) {
    buffer += event.delta.partial_json
    // Optionally: show progress to the user — but do NOT try to parse buffer yet
  } else if (event.type === 'content_block_stop' && toolName) {
    // A tool called with no arguments can leave the buffer empty; treat that as {}
    const parsed = JSON.parse(buffer || '{}')
    console.log('Tool', toolName, 'called with', parsed)
    toolName = ''
  }
}

The Anthropic SDKs ship a higher-level helper that does the buffering and final parsing for you (stream.finalMessage() in Node, get_final_message() on the MessageStream in Python). Use it when you do not need per-token callbacks. Streaming is mostly a UX win, letting you render a skeleton or progress indicator while the JSON assembles, since the parsed object is only safe to consume at content_block_stop.
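The same buffering logic can be sketched in Python against already-decoded event dictionaries, which is what you have after parsing the server-sent events yourself. The function name collect_tool_inputs and the sample events are illustrative; buffers are keyed by the event's index field so that more than one tool_use block in a stream stays separate.

```python
import json
from typing import Any, Iterable, Iterator


def collect_tool_inputs(events: Iterable[dict[str, Any]]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (tool_name, parsed_input) each time a streamed tool_use block closes."""
    open_blocks: dict[int, dict[str, Any]] = {}
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start" and event["content_block"]["type"] == "tool_use":
            open_blocks[event["index"]] = {"name": event["content_block"]["name"], "parts": []}
        elif kind == "content_block_delta" and event["delta"]["type"] == "input_json_delta":
            open_blocks[event["index"]]["parts"].append(event["delta"]["partial_json"])
        elif kind == "content_block_stop" and event["index"] in open_blocks:
            block = open_blocks.pop(event["index"])
            raw = "".join(block["parts"])
            # Parse only once the block is complete; an empty buffer means no arguments.
            yield block["name"], json.loads(raw or "{}")


# Hand-written events in the shape described above (placeholder id).
events = [
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "tool_use", "id": "toolu_example", "name": "extract_product", "input": {}}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "input_json_delta", "partial_json": '{"name": "Anker 737'}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "input_json_delta", "partial_json": ' Power Bank", "in_stock": true}'}},
    {"type": "content_block_stop", "index": 0},
]

results = list(collect_tool_inputs(events))
print(results)
```

Note that neither fragment is valid JSON on its own, which is why the parse is deferred to content_block_stop.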

Multiple tools, parallel tool calls, and tool_choice values

When you define more than one tool and set tool_choice to auto or any, Claude can emit several tool_use blocks in a single response — parallel tool calls. Each block has its own id and input; iterate them all rather than taking the first match.

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "input_schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
        {
            "name": "get_flights",
            "input_schema": {
                "type": "object",
                "properties": {
                    "from": {"type": "string"},
                    "to": {"type": "string"},
                },
                "required": ["from", "to"],
            },
        },
    ],
    tool_choice={"type": "any"},  # force some tool, model picks which
    messages=[{
        "role": "user",
        "content": "Weather in Paris and any flights from SFO to CDG tomorrow?",
    }],
)

# Multiple tool_use blocks expected — handle all of them
for block in response.content:
    if block.type == "tool_use":
        print(block.name, "->", block.input)

To disable parallelism (say you want exactly one tool call per turn for easier reasoning), set disable_parallel_tool_use: true inside the tool_choice object, for example { type: 'auto', disable_parallel_tool_use: true }. Under forced single-tool tool_choice ({ type: 'tool', name: 'X' }), parallel calls already collapse to one block, so the flag mostly matters with auto and any.
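When the tools are real and the conversation continues, each tool_use id has to be echoed back in a matching tool_result block. The sketch below shows one way to package results for several parallel calls into a single user turn; build_tool_results, the stub handlers, and the placeholder ids are assumptions made for illustration.

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], str]


def build_tool_results(message: dict[str, Any], handlers: dict[str, Handler]) -> dict[str, Any]:
    """Run a handler for each tool_use block and package the outputs as one user turn."""
    results = []
    for block in message["content"]:
        if block["type"] != "tool_use":
            continue
        output = handlers[block["name"]](block["input"])
        # tool_use_id ties each result back to the call that requested it.
        results.append({"type": "tool_result", "tool_use_id": block["id"], "content": output})
    return {"role": "user", "content": results}


# Stub handlers standing in for real lookups.
handlers = {
    "get_weather": lambda args: f"Sunny in {args['city']}",
    "get_flights": lambda args: f"2 flights from {args['from']} to {args['to']}",
}

assistant_message = {
    "role": "assistant",
    "content": [
        {"type": "tool_use", "id": "toolu_a", "name": "get_weather", "input": {"city": "Paris"}},
        {"type": "tool_use", "id": "toolu_b", "name": "get_flights", "input": {"from": "SFO", "to": "CDG"}},
    ],
}

follow_up = build_tool_results(assistant_message, handlers)
print(follow_up)
```

The follow-up message would be appended to the conversation after the assistant message that contained the calls. For one-shot JSON extraction none of this is needed, since the input is the result.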

There is one more reason to define multiple tools even when you only want JSON output: a cannot_extract tool with an empty schema gives the model somewhere to land when the input is genuinely off-topic. In that setup, switch tool_choice from the forced single tool to any so the model can choose between the two.
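The fallback pattern might look like the following sketch. The tool descriptions, the REQUEST_EXTRAS name, and the interpret helper are illustrative assumptions; the two sample messages are hand-written stand-ins for API responses.

```python
from typing import Any, Optional

TOOLS = [
    {
        "name": "extract_product",
        "description": "Extract structured product information.",
        "input_schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "in_stock": {"type": "boolean"}},
            "required": ["name", "in_stock"],
        },
    },
    {
        # Fallback with an empty schema: calling it means "nothing to extract".
        "name": "cannot_extract",
        "description": "Call this when the text does not describe a product.",
        "input_schema": {"type": "object", "properties": {}},
    },
]

# With two tools defined, "any" forces a tool call but lets the model pick which.
REQUEST_EXTRAS = {"tools": TOOLS, "tool_choice": {"type": "any"}}


def interpret(message: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the extracted product, or None if the model chose the fallback tool."""
    for block in message["content"]:
        if block["type"] != "tool_use":
            continue
        if block["name"] == "extract_product":
            return block["input"]
        if block["name"] == "cannot_extract":
            return None
    raise ValueError("response contained no recognised tool_use block")


on_topic = {"content": [{"type": "tool_use", "id": "toolu_a", "name": "extract_product",
                         "input": {"name": "Anker 737 Power Bank", "in_stock": True}}]}
off_topic = {"content": [{"type": "tool_use", "id": "toolu_b", "name": "cannot_extract", "input": {}}]}

print(interpret(on_topic), interpret(off_topic))
```

Returning None for the fallback keeps "nothing to extract" distinct from a malformed response, which raises instead.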

Validating output against your schema (Zod / Pydantic)

The schema strongly shapes what Claude emits, but treat that as best-effort rather than a guarantee: a tool_use input can occasionally omit a required field or carry a wrongly typed value. Defensive validation on your side is worth the few lines: it catches those cases, gives you typed values downstream, catches drift across model upgrades, and fails loudly if a future schema change is incompatible with old code.

// Node — Zod
import { z } from 'zod'

const ProductSchema = z.object({
  name: z.string(),
  price_usd: z.number().optional(),
  in_stock: z.boolean(),
  tags: z.array(z.string()).optional(),
})

const toolBlock = response.content.find(b => b.type === 'tool_use')
if (toolBlock?.type !== 'tool_use') throw new Error('Expected tool_use block')

// Validate and get a typed result
const product = ProductSchema.parse(toolBlock.input)
// product is now typed: { name: string; price_usd?: number; in_stock: boolean; tags?: string[] }

The same check in Python with Pydantic v2:

# Python: Pydantic v2
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price_usd: float | None = None
    in_stock: bool
    tags: list[str] = []

tool_block = next(b for b in response.content if b.type == "tool_use")
product = Product.model_validate(tool_block.input)  # typed, validated
print(product.name, product.in_stock)

Keep the validator shape in lockstep with the input_schema you send. One trick: write the Zod or Pydantic model first, then generate the JSON Schema from it (Zod has zod-to-json-schema, Pydantic ships .model_json_schema()). That gives you one source of truth and the schemas cannot drift. For a deeper Zod workflow see our Zod schema validation guide.
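Where adding Zod or Pydantic is not an option, a reduced version of the same defensive check can be run directly against the input_schema you already send. The sketch below uses only the standard library and covers just top-level required fields, primitive types, and enum; check_against_schema is a hypothetical helper, not a full JSON Schema validator.

```python
from typing import Any

# JSON Schema type names mapped to the Python types that json decoding produces.
_PY_TYPES = {
    "string": str, "number": (int, float), "integer": int,
    "boolean": bool, "array": list, "object": dict,
}


def check_against_schema(value: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means these top-level checks passed."""
    problems = [f"missing required field: {name}"
                for name in schema.get("required", []) if name not in value]
    for name, spec in schema.get("properties", {}).items():
        if name not in value:
            continue
        actual = value[name]
        type_name = spec.get("type")
        expected = _PY_TYPES.get(type_name) if isinstance(type_name, str) else None
        # bool is a subclass of int in Python, so reject it for numeric fields explicitly.
        bool_as_number = isinstance(actual, bool) and type_name in ("number", "integer")
        if expected and (not isinstance(actual, expected) or bool_as_number):
            problems.append(f"wrong type for {name}: expected {type_name}")
        if "enum" in spec and actual not in spec["enum"]:
            problems.append(f"value for {name} not in enum")
    return problems


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price_usd": {"type": "number"},
        "in_stock": {"type": "boolean"},
        "category": {"type": "string", "enum": ["electronics", "apparel", "home", "books", "other"]},
    },
    "required": ["name", "in_stock"],
}

good = check_against_schema({"name": "Anker 737 Power Bank", "price_usd": 149.99, "in_stock": True}, schema)
bad = check_against_schema({"name": "Anker 737 Power Bank", "price_usd": "149.99"}, schema)
print(good, bad)
```

Because it reads the same schema object that goes into the request, this check cannot drift from the tool definition, though it is far less thorough than a real validator.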

When prefill beats tool use: prefilling JSON open brace

Prefill is the older technique of seeding the assistant turn with the start of a JSON value — usually { — so Claude continues from there instead of opening with prose. You add an assistant message at the end of the messages array whose content is just {; the API treats it as the start of the model's reply and the model continues from that token.

import json

response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=512,
    stop_sequences=["}"],   # halts at the first closing brace, so this only suits flat objects
    messages=[
        {"role": "user", "content": "Summarize this article as JSON with title and tags fields..."},
        {"role": "assistant", "content": "{"},  # the prefill
    ],
)

# The stop sequence is not included in the returned text, so re-add it.
# Reassemble: prepend the opening brace, append the closing brace.
raw = "{" + response.content[0].text + "}"
data = json.loads(raw)  # raises json.JSONDecodeError if the output was cut off or malformed

Why prefill still has a role when tool use exists:

  • Latency. Prefill skips the tool-selection step — slightly faster on Haiku for trivial extractions.
  • Free-form JSON. When the shape is loose and you do not have a stable schema, defining a tool with permissive types feels heavier than just asking for JSON.
  • Streaming text. Prefill streams as regular text deltas, which some UIs handle more naturally than input_json_delta.

Why tool use wins by default: the schema travels with the request and shapes the output directly, the SDK returns a parsed object so you skip a JSON.parse, and you never have to strip prose or reassemble braces by hand. Either approach can still be cut off by the token cap, so check stop_reason before trusting the result. Use prefill for low-stakes cases where the model is already reliable at the format; use tool use whenever correctness matters or the schema is complex.
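If a prefilled reply needs nested objects, a stop sequence on the closing brace will cut it short at the first inner brace. One alternative, sketched below, is to skip the stop sequence and let the standard library's json.JSONDecoder.raw_decode read the first complete JSON value, ignoring any commentary after it. The helper name parse_prefilled_json and the sample completions are assumptions for illustration.

```python
import json
from typing import Any


def parse_prefilled_json(prefill: str, completion_text: str) -> Any:
    """Parse the first complete JSON value in prefill + completion, ignoring trailing text."""
    combined = prefill + completion_text
    try:
        # raw_decode stops at the end of the first JSON value and reports where it stopped.
        value, _end = json.JSONDecoder().raw_decode(combined)
    except json.JSONDecodeError as exc:
        raise ValueError("completion did not contain a complete JSON value") from exc
    return value


# A completion with a nested object followed by trailing prose.
completion = '"title": "Power banks", "meta": {"tags": ["charging", "portable"]}}\n\nHope that helps!'
data = parse_prefilled_json("{", completion)
print(data)
```

A truncated completion raises ValueError instead of returning a partial object, which makes the token-cap failure mode explicit.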

Key terms

tool_use block
A content block in a Messages API response with type: "tool_use", an id, a name, and a parsed input object. The supported way to get structured JSON out of Claude.
input_schema
A JSON Schema (draft-07 subset) attached to each tool definition. Describes the shape of the input the model will fill. The model follows it closely, but validate the result on your side before relying on it.
tool_choice
Request-level control over tool calling. Accepts { type: 'auto' } (default), { type: 'any' }, { type: 'tool', name } for a forced specific tool, or { type: 'none' }.
input_json_delta
The streaming event delta type carrying fragments of a tool's input JSON. Arrives inside content_block_delta events between content_block_start and content_block_stop; concatenate partial_json across the run and parse once at the end.
prefill
An older pattern: include a trailing assistant message whose content begins the desired JSON (typically {). The model continues from that token. Does not enforce a schema; useful for free-form JSON and on Haiku for low-latency extraction.
Messages API
Anthropic's primary API for chat-style requests: POST https://api.anthropic.com/v1/messages with x-api-key and anthropic-version headers. Returns a message object whose content field is an array of typed blocks.

Frequently asked questions

Does Claude have a JSON mode like OpenAI?

No. Anthropic has never shipped a top-level response_format or json_mode parameter on the Messages API. The official way to get structured JSON out of Claude is tool use: you define a tool whose input_schema describes the JSON shape you want, set tool_choice to force that tool, and read the parsed object back from the tool_use content block in the response. The model fills the tool input rather than free-typing JSON into the text channel, so the output arrives as a parsed object that follows your schema far more reliably than prompted JSON does. Tool use has been generally available since May 2024, shortly after the Claude 3 launch, and is the supported pattern across claude-opus-4, claude-sonnet-4-5, and claude-haiku-4-5. A secondary technique, prefilling the assistant turn with an opening brace, gets you JSON without tool use but does not enforce the schema and is best reserved for free-form structures the model is good at producing on its own.

How do I force Claude to always return JSON?

Define one tool whose input_schema is the JSON shape you want, then pass tool_choice as the object form { type: 'tool', name: 'your_tool_name' }. That second part is what forces the issue: without it, Claude decides whether to call a tool or reply in text. With it pinned, the model must emit a tool_use content block whose input field is a JSON object shaped by your schema. Parse the response by iterating response.content, finding the block where type === 'tool_use', and reading block.input. The input is already a parsed object on the SDK return, so you do not need to JSON.parse it again. For maximum reliability, also validate the result with Zod (Node) or Pydantic (Python) on your side: tool use steers the structure, but your downstream code should still defensively confirm the shape it expects, especially after model version upgrades.

What goes in the input_schema field of a tool definition?

input_schema accepts a JSON Schema object — the same draft-07 subset that powers OpenAPI request bodies and most validators. The top level should be type: 'object' with a properties map and a required array. Each property entry takes a type (string, number, integer, boolean, array, object, null), an optional description (which the model reads as part of its prompt — describe each field in plain language), and constraints like enum, minimum, maxLength, items for arrays, and nested properties for sub-objects. Claude does not support every JSON Schema keyword — anyOf, oneOf, $ref, and pattern are accepted but not strictly enforced; conditional validation (if/then/else) is best avoided. Keep schemas flat and explicit. For deeper coverage of which keywords behave reliably across LLM providers, see our JSON Schema tutorial and the function calling schemas guide.

How do I parse a tool_use block from the Messages API response?

The Messages API returns response.content as an array of content blocks. Each block has a type field: 'text' for normal assistant prose and 'tool_use' for a structured tool call. To extract the JSON, iterate the array and find the first block where block.type === 'tool_use'. That block has three fields you care about: id (a unique identifier you would echo back in a tool_result if you continued the conversation), name (the tool name the model chose), and input (the parsed JSON object, already deserialized by the SDK). Read block.input directly — it is a plain dict in Python or a plain object in TypeScript, not a JSON string. If multiple tool_use blocks appear in the same response (parallel tool calls), collect them all. If you forced tool_choice to a specific tool, you can usually assume the first matching block is the one you want.

Can Claude call multiple tools in one response?

Yes, Claude supports parallel tool calls. When you define multiple tools and set tool_choice to { type: 'any' } or { type: 'auto' }, the model can emit several tool_use content blocks in a single response, one after another in the content array. Each block has its own id and input. This matters when a user asks something that requires two independent actions (look up a flight and a hotel, fetch weather for two cities): the model can request both in parallel rather than serializing them across turns. Parallel tool calls are on by default; to turn them off, set disable_parallel_tool_use: true inside the tool_choice object. When you force a single named tool via tool_choice: { type: 'tool', name }, parallel calls collapse to one: the model emits exactly one tool_use block matching that tool, which is the pattern you want for forced-JSON output.

How do I stream a tool_use input_json incrementally?

Set stream: true on the request and consume the server-sent event stream. For tool_use content, you receive a content_block_start event announcing a new tool_use block (with its id and name), then a sequence of content_block_delta events with delta.type === 'input_json_delta' carrying partial_json string fragments, and finally a content_block_stop. The partial_json fragments are pieces of the JSON string: concatenate them in order, then JSON.parse the accumulated buffer at content_block_stop. You cannot parse fragments individually; the JSON is only valid once complete. Streaming is useful for UX (show a spinner that updates as the model commits to fields) but the input object itself is only safe to consume at the end. The Anthropic SDKs ship a higher-level streaming helper (stream.finalMessage() in Node, get_final_message() on the MessageStream in Python) that handles the buffering for you.

What is the prefill trick for getting JSON without tool use?

Prefill is the technique of seeding the assistant turn with the opening of a JSON value — typically `{` — so Claude continues from there rather than starting with prose like 'Sure, here is the JSON'. You include an assistant message at the end of the messages array whose content is just an opening brace; the API treats it as the start of the model's reply and the model continues from that token. The response then begins with the rest of the JSON, which you concatenate with `{` on your side. Prefill does not enforce a schema — it nudges the model to start in JSON shape but the model can still close the brace early or produce invalid output. Use prefill for low-stakes free-form JSON where you trust the model, and tool use when correctness matters. Note: when prefilling, set stop_sequences to `]` or `}` so generation halts at the closing token instead of trailing into commentary.

Does the Anthropic API support response_format like OpenAI?

No. The Messages API has no response_format field. OpenAI exposes two layers — response_format: { type: 'json_object' } for any-JSON mode and response_format: { type: 'json_schema', json_schema: {...} } for strict structured outputs — but Anthropic takes a different design: structured output is unified with tool use rather than treated as a separate output mode. Where OpenAI splits 'I want JSON' from 'I want a function call', Anthropic treats both as the same primitive — a model picking a tool and filling its arguments. The practical upshot: if you are porting a prompt from OpenAI's response_format to Claude, rewrite it as a single tool with the same schema and force it via tool_choice. The output is equivalent in correctness but the wire format differs — read input from a tool_use block instead of parsing message.content as a JSON string. See our OpenAI JSON mode comparison for a side-by-side.
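The porting step described above can be sketched as a small converter from an OpenAI-style json_schema response_format to the tools and tool_choice arguments for Claude. The function name response_format_to_tool_kwargs and the sample schema are illustrative assumptions. The schema itself is passed through unchanged, so any keywords Claude handles loosely still need a manual review.

```python
from typing import Any


def response_format_to_tool_kwargs(response_format: dict[str, Any]) -> dict[str, Any]:
    """Turn an OpenAI json_schema response_format into forced-tool request arguments."""
    if response_format.get("type") != "json_schema":
        raise ValueError("only the json_schema response_format carries a schema to port")
    spec = response_format["json_schema"]
    tool: dict[str, Any] = {"name": spec["name"], "input_schema": spec["schema"]}
    if "description" in spec:
        tool["description"] = spec["description"]
    # Pin tool_choice to the single tool so the model must fill it.
    return {"tools": [tool], "tool_choice": {"type": "tool", "name": spec["name"]}}


openai_style = {
    "type": "json_schema",
    "json_schema": {
        "name": "extract_product",
        "description": "Extract structured product information.",
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "in_stock": {"type": "boolean"}},
            "required": ["name", "in_stock"],
        },
    },
}

kwargs = response_format_to_tool_kwargs(openai_style)
print(kwargs["tool_choice"])
```

The returned dictionary is meant to be merged into the other request arguments (model, max_tokens, messages); after the call, read the result from the tool_use block's input rather than from message text.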

Further reading and primary sources