PostgreSQL JSON Queries: jsonb, GIN Indexes, JSON Path & pg_jsonschema

Last updated:

PostgreSQL has two JSON types: json (stores text as-is, re-parses on every access) and jsonb (stores binary parsed form, supports GIN indexes — 10–100× faster for queries). The -> operator returns a JSON value as JSON; ->> returns it as text. jsonb @> '{"status":"active"}' uses a GIN index to find all documents where status is active — this query runs in O(log n) instead of O(n) full-table scan. This guide covers json vs jsonb type selection, GIN index creation and query patterns, JSON path expressions (PostgreSQL 12+), jsonb operators and functions, pg_jsonschema for database-level validation, and Prisma/Drizzle ORM JSON field handling. Every query includes an EXPLAIN plan comparison.

json vs jsonb: Choosing the Right PostgreSQL JSON Type

The choice between json and jsonb has significant performance implications. json is a thin wrapper — it validates syntax on write and returns the raw text on read, re-parsing it every time an operator or function accesses the content. jsonb decomposes JSON into a binary tree on write and navigates that tree on read — no re-parsing, direct byte-offset access to any key. For indexed queries on large tables, the difference is not marginal: containment queries on a 10-million-row table go from seconds to milliseconds.

-- json: raw text storage — validates syntax, preserves everything
CREATE TABLE audit_log (
    id         bigserial PRIMARY KEY,
    payload    json NOT NULL   -- preserves duplicate keys, whitespace, key order
);

-- jsonb: binary storage — the right choice for almost all use cases
CREATE TABLE orders (
    id         bigserial PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    data       jsonb NOT NULL DEFAULT '{}'
);

-- Key behavioral differences
-- json: preserves duplicate keys (both stored)
SELECT '{"a":1,"a":2}'::json;     -- {"a":1,"a":2}

-- jsonb: de-duplicates (last value wins)
SELECT '{"a":1,"a":2}'::jsonb;    -- {"a": 2}

-- json: preserves key order
SELECT '{"z":1,"a":2}'::json;     -- {"z":1,"a":2}

-- jsonb: sorts keys alphabetically
SELECT '{"z":1,"a":2}'::jsonb;    -- {"a": 2, "z": 1}

-- Storage size comparison on a real document:
-- json  ≈ raw text bytes (e.g., 312 bytes)
-- jsonb ≈ 30–50% larger (e.g., 430 bytes) — binary overhead

-- EXPLAIN comparison: same query, json vs jsonb
-- json column (no index possible):
EXPLAIN SELECT * FROM audit_log WHERE payload->>'status' = 'paid';
--   Seq Scan on audit_log  (cost=0.00..25000.00 rows=1 width=312)
--   Filter: ((payload->>'status') = 'paid')   <- re-parses every row

-- jsonb column with GIN index:
EXPLAIN SELECT * FROM orders WHERE data @> '{"status":"paid"}';
--   Bitmap Heap Scan on orders  (cost=12.50..45.00 rows=10 width=430)
--     Recheck Cond: (data @> '{"status": "paid"}')
--     -> Bitmap Index Scan on idx_orders_gin
--          Index Cond: (data @> '{"status": "paid"}')
-- O(log n) instead of O(n) — milliseconds vs seconds on 10M rows

-- Cast between types freely
SELECT '{"a":1}'::json::jsonb;   -- json -> jsonb
SELECT '{"a":1}'::jsonb::json;   -- jsonb -> json (normalised text)

-- When to use json (rare):
-- 1. Compliance audit logs — signed webhook bodies where byte-exact
--    preservation is required (altering whitespace breaks HMAC signatures)
-- 2. Legacy integrations that depend on duplicate key behavior

Default to jsonb. The 30–50% storage overhead is justified by every query that touches the column — even a single GIN-indexed containment query on a large table recovers the storage cost many times over in CPU and I/O savings. The only legitimate use for json is compliance-driven audit logging. For JSON Schema validation at the application layer before inserting, see the linked guide.

JSON Operators: ->, ->> , #> , @>, and ?

Understanding operator return types is the most important thing to get right when writing jsonb queries. -> returns jsonb — suitable for chaining operators but will cause a type error if compared to a plain string in a WHERE clause. ->> returns text — suitable for string comparisons and casting to numeric types. Using -> in a WHERE clause is the single most common jsonb bug.

-- Setup
INSERT INTO orders (data) VALUES
  ('{"status":"paid","amount":149.99,"user":{"id":42,"name":"Alice"},"tags":["priority","express"]}'),
  ('{"status":"pending","amount":29.99,"user":{"id":17,"name":"Bob"},"tags":["standard"]}');

-- ── Field access: -> vs ->> ─────────────────────────────────────────
-- ->  returns jsonb (keep chaining, or use in jsonb functions)
SELECT data->'user'           FROM orders;  -- {"id":42,"name":"Alice"}  (jsonb)
SELECT data->'user'->'id'     FROM orders;  -- 42  (jsonb)

-- ->> returns text (use in WHERE, ORDER BY, and casts)
SELECT data->>'status'        FROM orders;  -- paid  (text, no quotes)
SELECT data->'user'->>'name'  FROM orders;  -- Alice  (text)

-- Array element access by index (0-based)
SELECT data->'tags'->0        FROM orders;  -- "priority"  (jsonb)
SELECT data->'tags'->>0       FROM orders;  -- priority    (text)

-- ── Path operators ───────────────────────────────────────────────────
-- #>  get value at path as jsonb
SELECT data #> '{user,name}'  FROM orders;  -- "Alice"  (jsonb)

-- #>> get value at path as text
SELECT data #>> '{user,name}' FROM orders;  -- Alice   (text)

-- ── Containment: @> (GIN-indexed) ────────────────────────────────────
-- @> tests: does left contain all key-value pairs of right?
SELECT * FROM orders WHERE data @> '{"status":"paid"}';
SELECT * FROM orders WHERE data @> '{"user":{"id":42}}';   -- works on nested objects
SELECT * FROM orders WHERE data @> '{"tags":["express"]}'; -- array element containment

-- WRONG: type mismatch — data->'status' is jsonb, not text
-- SELECT * FROM orders WHERE data->'status' = 'paid';  -- ERROR

-- CORRECT with ->>:
SELECT * FROM orders WHERE data->>'status' = 'paid';

-- CORRECT with @> (also GIN-indexed — prefer this):
SELECT * FROM orders WHERE data @> '{"status":"paid"}';

-- ── Key existence: ? ?| ?& ────────────────────────────────────────────
SELECT * FROM orders WHERE data ? 'amount';               -- key exists at top level
SELECT * FROM orders WHERE data ?| ARRAY['discount','amount']; -- any key exists
SELECT * FROM orders WHERE data ?& ARRAY['status','amount'];   -- all keys exist

-- ── Numeric comparison: always cast ->> output ───────────────────────
SELECT * FROM orders WHERE (data->>'amount')::numeric > 100;
SELECT * FROM orders ORDER BY (data->>'amount')::numeric DESC;

-- ── SQL/JSON path operators (PostgreSQL 12+) ─────────────────────────
-- @? tests if path expression returns any value
SELECT * FROM orders WHERE data @? '$.user.id ? (@ == 42)';

-- @@ evaluates a boolean path expression
SELECT * FROM orders WHERE data @@ '$.amount > 100';

-- ── Modify operators ──────────────────────────────────────────────────
-- || merge (shallow — top-level keys only)
UPDATE orders SET data = data || '{"shipped_at":"2026-01-25"}' WHERE id = 1;

-- - remove key
UPDATE orders SET data = data - 'shipped_at' WHERE id = 1;

-- #- remove nested key
UPDATE orders SET data = data #- '{user,name}' WHERE id = 1;

The @> containment operator is usually the best choice for equality filters on jsonb — it is GIN-indexed, expressive for nested objects, and works with arrays. Use ->> for range queries and ORDER BY where you need to cast the value (::numeric, ::timestamptz). See our JSON data validation guide for validating field types before inserting.

GIN Indexes for Fast jsonb Queries

A GIN (Generalized Inverted Index) on a jsonb column indexes every key and value path in the document. Without it, every @> or ? query does a full sequential scan — O(n). With a GIN index, those queries run in O(log n). On a 10-million-row table, that is the difference between a 30-second scan and a 5-millisecond lookup. The critical constraint: GIN indexes on jsonb accelerate containment and key-existence operators, not the -> path extraction operators.

-- ── Full GIN index: jsonb_ops (default operator class) ──────────────
-- Supports: @>  <@  ?  ?|  ?&  @?  @@
CREATE INDEX idx_orders_gin ON orders USING GIN (data);

-- ── Compact GIN index: jsonb_path_ops ────────────────────────────────
-- Supports: @>  @?  @@  only  (NOT ?  ?|  ?&)
-- ~30% smaller index; slightly faster for @> queries
CREATE INDEX idx_orders_jpo ON orders USING GIN (data jsonb_path_ops);

-- ── B-tree index on extracted field ──────────────────────────────────
-- Best for equality/range queries on one known field
-- Supports: =  <  >  BETWEEN  ORDER BY  on that field
CREATE INDEX idx_orders_status ON orders ((data->>'status'));
CREATE INDEX idx_orders_amount ON orders (((data->>'amount')::numeric));

-- Partial index: GIN only on rows matching a condition
CREATE INDEX idx_orders_paid_gin ON orders USING GIN (data)
WHERE data->>'status' = 'paid';

-- ── Verify index usage ────────────────────────────────────────────────
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE data @> '{"status":"paid"}';
-- With GIN index:
--   Bitmap Heap Scan on orders  (cost=16.25..48.75 rows=10 width=200)
--     -> Bitmap Index Scan on idx_orders_gin
--          Index Cond: (data @> '{"status": "paid"}')
-- Without GIN index:
--   Seq Scan on orders  (cost=0.00..250000.00 rows=10000 width=200)
--     Filter: (data @> '{"status": "paid"}')  <- scans every row

-- ── Which index class to choose ───────────────────────────────────────
-- Query type               jsonb_ops    jsonb_path_ops
-- @>  (containment)           YES           YES
-- <@  (contained by)          YES           NO
-- ?   (key exists)            YES           NO
-- ?|  (any key)               YES           NO
-- ?&  (all keys)              YES           NO
-- @?  (path exists)           YES           YES
-- @@  (path match)            YES           YES
-- Index size                 100%           ~70%

-- ── GIN maintenance settings ──────────────────────────────────────────
-- GIN writes are batched by default (fastupdate = on)
-- Increase pending list for write-heavy tables to delay merge cost:
ALTER INDEX idx_orders_gin SET (fastupdate = on);

-- For consistent read latency (no pending list staleness), disable:
ALTER INDEX idx_orders_gin SET (fastupdate = off);

-- Rebuild a bloated GIN index without locking:
REINDEX INDEX CONCURRENTLY idx_orders_gin;

-- ── Expression index for a specific nested field ──────────────────────
-- GIN accelerates @> on full document; for a single nested field,
-- a B-tree on the expression is faster and smaller:
CREATE INDEX idx_orders_user_id ON orders (((data->'user'->>'id')::int));

-- Query uses expression index:
EXPLAIN SELECT * FROM orders WHERE (data->'user'->>'id')::int = 42;
-- Index Scan using idx_orders_user_id

A common mistake is creating a GIN index and then writing queries that use data->>'status' = 'paid' — this does NOT use the GIN index, it performs a sequential scan. Use data @> '{"status":"paid"}' or create a separate B-tree index on (data->>'status'). Always confirm with EXPLAIN (ANALYZE, BUFFERS). For JSON performance benchmarking methodology, see the linked guide.

JSON Path Expressions (PostgreSQL 12+)

SQL/JSON path expressions (added in PostgreSQL 12) provide a declarative language for navigating and filtering jsonb structures. The key advantage over chained -> operators: path expressions can filter nested arrays in a single function call, support arithmetic and string methods, and work with the @? and @@ operators that are accelerated by jsonb_path_ops GIN indexes. The path language is based on the SQL/JSON standard (ISO/IEC 9075-2:2016).

-- Sample data with nested arrays
INSERT INTO orders (data) VALUES ('{
  "id": 1001,
  "customer": "Alice",
  "items": [
    {"product": "Widget", "qty": 3, "price": 19.99},
    {"product": "Gadget", "qty": 1, "price": 149.99}
  ],
  "tags": ["priority", "express"],
  "score": 9.5
}');

-- ── Basic navigation ──────────────────────────────────────────────────
-- $.key  — navigate to a field
SELECT jsonb_path_query(data, '$.customer') FROM orders;       -- "Alice"

-- $.array[*] — iterate all array elements
SELECT jsonb_path_query(data, '$.items[*]') FROM orders;
-- {"product":"Widget","qty":3,"price":19.99}
-- {"product":"Gadget","qty":1,"price":149.99}

-- $.array[0] — access by index
SELECT jsonb_path_query(data, '$.items[0].product') FROM orders; -- "Widget"

-- $.array[last] — last element
SELECT jsonb_path_query(data, '$.items[last]') FROM orders;

-- ── Filtering with ? (@condition) ─────────────────────────────────────
-- Find items with price > 100
SELECT jsonb_path_query(data, '$.items[*] ? (@.price > 100)') FROM orders;
-- {"product":"Gadget","qty":1,"price":149.99}

-- Find orders where any item price > 100 (WHERE clause)
SELECT * FROM orders WHERE data @? '$.items[*].price ? (@ > 100)';

-- Combined condition: qty > 1 AND price < 50
SELECT jsonb_path_query(data, '$.items[*] ? (@.qty > 1 && @.price < 50)') FROM orders;

-- String starts with (case-sensitive)
SELECT jsonb_path_query(data, '$.tags[*] ? (@ starts with "exp")') FROM orders;
-- "express"

-- ── jsonb_path_query_array: collect results ───────────────────────────
-- Returns all matches as a single jsonb array
SELECT jsonb_path_query_array(data, '$.items[*].product') FROM orders;
-- ["Widget", "Gadget"]

SELECT jsonb_path_query_array(data, '$.items[*].price ? (@ > 50)') FROM orders;
-- [149.99]

-- ── jsonb_path_exists: boolean test ──────────────────────────────────
-- Returns true if any match found
SELECT jsonb_path_exists(data, '$.items[*] ? (@.price > 100)') FROM orders; -- true

-- Equivalent with @? operator (GIN-accelerated):
SELECT * FROM orders WHERE data @? '$.items[*] ? (@.price > 100)';

-- ── jsonb_path_match: evaluate a boolean predicate ───────────────────
-- Evaluates a path that must return true/false
SELECT * FROM orders WHERE jsonb_path_match(data, '$.score >= 9.0');

-- Equivalent with @@ operator:
SELECT * FROM orders WHERE data @@ '$.score >= 9.0';

-- ── Arithmetic in path expressions ────────────────────────────────────
-- Calculate total value of high-price items
SELECT jsonb_path_query_array(
    data,
    '$.items[*] ? (@.price > 50).price'
) FROM orders;
-- [149.99]

-- ── Variables in path (PostgreSQL 14+) ───────────────────────────────
SELECT jsonb_path_query(
    data,
    '$.items[*] ? (@.price > $min)',
    '{"min": 100}'   -- variables object
) FROM orders;
-- {"product":"Gadget","qty":1,"price":149.99}

-- ── Datetime support ──────────────────────────────────────────────────
SELECT * FROM orders
WHERE data @? '$.created_at.datetime() ? (@ > "2026-01-01".datetime())';

JSON path expressions are the best tool for filtering nested arrays — use $.items[*] ? (@.price > 100) instead of jsonb_array_elements(data->{'items'}) WHERE (elem->>'price')::numeric > 100. The path expression is more readable, composes cleanly in WHERE clauses with @?, and is accelerated by a jsonb_path_ops GIN index. Variables (PostgreSQL 14+) allow parameterizing the path without string interpolation — important for preventing injection when path conditions include user input.

jsonb Functions: jsonb_set, jsonb_strip_nulls, jsonb_each

PostgreSQL provides a comprehensive set of jsonb functions for construction, mutation, expansion, and aggregation. The most-used are jsonb_set() for partial updates, jsonb_strip_nulls() for cleaning documents before storage, jsonb_each() for expanding objects to rows, and jsonb_agg()/jsonb_build_object() for assembling JSON responses from relational data. These functions eliminate most application-layer JSON assembly.

-- ── jsonb_set: partial update ─────────────────────────────────────────
-- jsonb_set(target, path, new_value [, create_missing=true])
-- IMPORTANT: string values need inner double-quotes
UPDATE orders SET data = jsonb_set(data, '{status}', '"shipped"') WHERE id = 1;
UPDATE orders SET data = jsonb_set(data, '{address,city}', '"Austin"') WHERE id = 1;
UPDATE orders SET data = jsonb_set(data, '{score}', '9.8') WHERE id = 1;

-- Safe conversion with to_jsonb() — avoids quoting errors
UPDATE orders
SET data = jsonb_set(data, '{status}', to_jsonb('shipped'::text))
WHERE id = 1;

-- Increment a numeric field
UPDATE orders
SET data = jsonb_set(
    data, '{view_count}',
    to_jsonb(coalesce((data->>'view_count')::int, 0) + 1)
)
WHERE id = 1;

-- Chain multiple updates atomically
UPDATE orders
SET data = jsonb_set(
               jsonb_set(data, '{status}', '"shipped"'),
               '{tracking}', '"TRK-98765"'
           )
WHERE id = 1;

-- ── jsonb_insert: add to array ─────────────────────────────────────────
-- jsonb_insert(target, path, value [, insert_after=false])
-- Appends to end of tags array:
UPDATE orders
SET data = jsonb_insert(data, '{tags,-1}', '"urgent"', true)
WHERE id = 1;

-- ── jsonb_strip_nulls: remove null-valued keys ─────────────────────────
SELECT jsonb_strip_nulls('{"a":1,"b":null,"c":{"d":null,"e":2}}'::jsonb);
-- {"a":1,"c":{"e":2}}  — nulls removed recursively

-- Useful before INSERT to clean optional fields from application payloads:
INSERT INTO orders (data)
VALUES (jsonb_strip_nulls('{"status":"paid","discount":null,"amount":99.99}'));
-- Stores: {"amount": 99.99, "status": "paid"}

-- ── jsonb_each: expand object to (key, value) rows ─────────────────────
SELECT key, value
FROM orders, jsonb_each(data)
WHERE id = 1;
-- key      | value
-- ---------+-------------------
-- status   | "paid"
-- amount   | 149.99
-- tags     | ["priority","express"]

-- jsonb_each_text: value as text (no JSON quoting)
SELECT key, value FROM orders, jsonb_each_text(data) WHERE id = 1;
-- status   | paid
-- amount   | 149.99

-- ── jsonb_array_elements: expand array to rows ─────────────────────────
SELECT id, elem
FROM orders, jsonb_array_elements(data->'items') AS elem
WHERE id = 1;
-- 1 | {"product":"Widget","qty":3,"price":19.99}
-- 1 | {"product":"Gadget","qty":1,"price":149.99}

-- Access fields within expanded elements:
SELECT
    id,
    elem->>'product'       AS product,
    (elem->>'qty')::int    AS qty,
    (elem->>'price')::numeric AS price
FROM orders, jsonb_array_elements(data->'items') AS elem
WHERE id = 1;

-- ── jsonb_agg + jsonb_build_object: assemble JSON in SQL ───────────────
SELECT
    o.id AS order_id,
    jsonb_agg(
        jsonb_build_object(
            'product', li.product,
            'qty',     li.qty,
            'price',   li.unit_price
        )
        ORDER BY li.product
    ) AS items
FROM orders o
JOIN line_items li ON li.order_id = o.id
GROUP BY o.id;

-- ── jsonb_object_keys: get top-level key names ─────────────────────────
SELECT jsonb_object_keys(data) AS key FROM orders WHERE id = 1;
-- status, amount, tags, items, ...

-- ── to_jsonb: convert SQL values to jsonb ─────────────────────────────
SELECT to_jsonb(42);               -- 42
SELECT to_jsonb('hello'::text);    -- "hello"
SELECT to_jsonb(now());            -- "2026-01-25T10:00:00+00:00"
SELECT to_jsonb(ARRAY[1,2,3]);    -- [1, 2, 3]

-- ── row_to_json: entire row as JSON ──────────────────────────────────
SELECT row_to_json(orders.*) FROM orders WHERE id = 1;

jsonb_strip_nulls() is particularly useful for API endpoints that receive sparse objects — rather than manually removing null fields in application code before INSERT, strip them in SQL. jsonb_agg() with jsonb_build_object() eliminates the need for multiple database round-trips to assemble nested API responses: the entire relational-to-JSON transformation happens in a single query. For JSON API design patterns that leverage these aggregation functions, see the linked guide.

pg_jsonschema: Database-Level JSON Schema Validation

pg_jsonschema (by Supabase) is a PostgreSQL extension that adds a jsonb_matches_schema() function, enabling CHECK constraints that validate jsonb columns against a JSON Schema on every INSERT and UPDATE — at the database layer, not the application layer. This means no document can ever be written that violates the schema, regardless of which application, migration script, or direct SQL connection inserts the data.

-- ── Install pg_jsonschema ─────────────────────────────────────────────
-- On Supabase: available as a trusted extension
CREATE EXTENSION IF NOT EXISTS pg_jsonschema;

-- On self-hosted PostgreSQL: install from source
-- https://github.com/supabase/pg_jsonschema

-- ── Add a CHECK constraint with JSON Schema validation ─────────────────
ALTER TABLE orders ADD CONSTRAINT valid_order_data
CHECK (
    jsonb_matches_schema(
        '{
          "type": "object",
          "required": ["status", "amount"],
          "properties": {
            "status": {
              "type": "string",
              "enum": ["pending", "paid", "shipped", "cancelled"]
            },
            "amount": {
              "type": "number",
              "minimum": 0
            },
            "tags": {
              "type": "array",
              "items": { "type": "string" }
            },
            "user": {
              "type": "object",
              "required": ["id"],
              "properties": {
                "id":   { "type": "integer" },
                "name": { "type": "string" }
              }
            }
          },
          "additionalProperties": false
        }',
        data
    )
);

-- ── Test: valid INSERT succeeds ────────────────────────────────────────
INSERT INTO orders (data) VALUES
('{"status":"paid","amount":149.99,"user":{"id":42,"name":"Alice"}}');
-- INSERT 0 1  (succeeds)

-- ── Test: invalid INSERT fails at database level ───────────────────────
INSERT INTO orders (data) VALUES ('{"status":"invalid_status","amount":149.99}');
-- ERROR:  new row for relation "orders" violates check constraint "valid_order_data"
-- DETAIL:  Failing row contains ({"status":"invalid_status","amount":149.99}).

INSERT INTO orders (data) VALUES ('{"amount":149.99}');  -- missing required "status"
-- ERROR:  new row for relation "orders" violates check constraint "valid_order_data"

INSERT INTO orders (data) VALUES ('{"status":"paid","amount":-5}');  -- negative amount
-- ERROR:  new row for relation "orders" violates check constraint "valid_order_data"

-- ── Use jsonb_matches_schema() as a function (returns boolean) ─────────
SELECT jsonb_matches_schema(
    '{"type":"object","required":["status"],"properties":{"status":{"type":"string"}}}',
    '{"status":"paid","extra":"ignored"}'
);
-- true

-- ── JSON Schema draft-07 features supported ────────────────────────────
-- $ref, $defs, anyOf, oneOf, allOf, not, if/then/else
-- pattern, format (date, email, uuid, uri)
-- minimum, maximum, exclusiveMinimum, exclusiveMaximum
-- minLength, maxLength, minItems, maxItems
-- additionalProperties, patternProperties

ALTER TABLE orders ADD CONSTRAINT valid_order_v2
CHECK (
    jsonb_matches_schema(
        '{
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "required": ["status", "amount"],
          "properties": {
            "status": { "type": "string" },
            "amount": { "type": "number", "minimum": 0 },
            "email":  { "type": "string", "format": "email" },
            "tags":   {
              "type": "array",
              "items": { "type": "string" },
              "maxItems": 10
            }
          }
        }',
        data
    )
);

-- ── Alternative: lightweight CHECK without pg_jsonschema ──────────────
-- Enforce required keys exist (no extension required):
ALTER TABLE orders ADD CONSTRAINT has_required_keys
CHECK (data ? 'status' AND data ? 'amount');

-- Enforce status is one of an enum list:
ALTER TABLE orders ADD CONSTRAINT valid_status
CHECK (data->>'status' IN ('pending', 'paid', 'shipped', 'cancelled'));

The key advantage of pg_jsonschema over application-level validation is atomicity: the schema check and the write happen in the same transaction. A bug in one microservice, a direct psql session, or a migration script cannot bypass it. The schema string in the CHECK constraint is a SQL literal — for production use, store the canonical schema in a migration file and reference it via a SQL function rather than inlining it. For application-layer JSON Schema validation with Zod or Ajv before reaching the database, see the linked guide.

Prisma and Drizzle ORM: JSON Field Handling

Both Prisma and Drizzle ORM support jsonb columns, but with different levels of type safety and query capability. Prisma maps Json schema fields to PostgreSQL jsonb and provides structured path filters. Drizzle provides a more SQL-close API with explicit jsonb operators. For complex jsonb queries neither ORM handles natively, use raw SQL via the ORM escape hatch.

// ── Prisma: schema definition ─────────────────────────────────────────
// schema.prisma
model Order {
  id        Int      @id @default(autoincrement())
  createdAt DateTime @default(now())
  data      Json     // maps to jsonb in PostgreSQL
}

// ── Prisma: insert with jsonb data ─────────────────────────────────────
const order = await prisma.order.create({
  data: {
    data: {
      status: 'paid',
      amount: 149.99,
      user: { id: 42, name: 'Alice' },
      tags: ['priority', 'express'],
    },
  },
});

// ── Prisma: query with path filter ─────────────────────────────────────
// path filter: navigate to a field and apply a condition
const paidOrders = await prisma.order.findMany({
  where: {
    data: {
      path: ['status'],
      equals: 'paid',
    },
  },
});

// String contains filter
const expressOrders = await prisma.order.findMany({
  where: {
    data: {
      path: ['tags'],
      array_contains: 'express',
    },
  },
});

// Nested path filter
const aliceOrders = await prisma.order.findMany({
  where: {
    data: {
      path: ['user', 'name'],
      equals: 'Alice',
    },
  },
});

// ── Prisma: raw SQL for GIN-indexed @> queries ─────────────────────────
// Prisma does not generate @> containment queries — use $queryRaw
import { Prisma } from '@prisma/client';

const filter = JSON.stringify({ status: 'paid' });
const orders = await prisma.$queryRaw<Order[]>(
  Prisma.sql`SELECT * FROM "Order" WHERE data @> ${filter}::jsonb`
);

// ── Prisma: GIN index via migration ───────────────────────────────────
// Prisma does not auto-generate GIN indexes — add to migration SQL:
// CREATE INDEX idx_order_data_gin ON "Order" USING GIN (data);

// ── Drizzle ORM: schema definition ────────────────────────────────────
import { pgTable, serial, jsonb, timestamp } from 'drizzle-orm/pg-core';

export const orders = pgTable('orders', {
  id:        serial('id').primaryKey(),
  createdAt: timestamp('created_at').defaultNow(),
  data:      jsonb('data').notNull().$type<OrderData>(),  // typed jsonb
});

// TypeScript type for the jsonb document
type OrderData = {
  status: 'pending' | 'paid' | 'shipped' | 'cancelled';
  amount: number;
  user?: { id: number; name: string };
  tags?: string[];
};

// ── Drizzle: insert ───────────────────────────────────────────────────
await db.insert(orders).values({
  data: {
    status: 'paid',
    amount: 149.99,
    user: { id: 42, name: 'Alice' },
    tags: ['priority'],
  },
});

// ── Drizzle: query with sql template (raw jsonb operators) ─────────────
import { sql, eq } from 'drizzle-orm';

// Containment query using @>
const paidOrders = await db
  .select()
  .from(orders)
  .where(sql`data @> '{"status":"paid"}'::jsonb`);

// Extract a field with ->>
const statuses = await db
  .select({ status: sql<string>`data->>'status'` })
  .from(orders);

// GIN index: add via migration
await db.execute(sql`
  CREATE INDEX IF NOT EXISTS idx_orders_data_gin
  ON orders USING GIN (data)
`);

// ── Type-safe jsonb updates in both ORMs ──────────────────────────────
// Prisma: update the whole document (partial update requires jsonb_set raw SQL)
await prisma.order.update({
  where: { id: 1 },
  data: { data: { status: 'shipped', amount: 149.99 } }, // replaces whole field
});

// Drizzle: jsonb_set via sql template
await db.execute(sql`
  UPDATE orders
  SET data = jsonb_set(data, '{status}', '"shipped"')
  WHERE id = 1
`);

Neither Prisma nor Drizzle generates GIN indexes automatically — always add them via a migration SQL file. Prisma path filters translate to data->'field' = value queries, which do NOT use GIN indexes. For GIN-accelerated containment queries in production, use prisma.$queryRaw with the @> operator. Drizzle ORM gives more direct control through its sql template tag, making it easier to express native jsonb operators without a raw query escape hatch. Both ORMs support typed jsonb columns where TypeScript catches schema mismatches at compile time — define the document type and pass it as a generic to the column definition.

Key Terms

jsonb
A PostgreSQL column type that stores JSON data in a decomposed binary format. Unlike json, jsonb parses input on write, sorts keys alphabetically, de-duplicates keys (last value wins), and strips whitespace. The binary representation allows direct byte-offset access to any key without re-parsing, making reads significantly faster than json. jsonb supports the full operator set including @>, ?, ?|, ?&, and all SQL/JSON path functions. It can be indexed with GIN indexes for containment and key-existence queries. Storage overhead is 30–50% higher than json. Use jsonb for all JSON storage in PostgreSQL unless byte-exact preservation of the original input is legally required.
GIN index
A Generalized Inverted Index — a PostgreSQL index structure designed for multi-valued types including jsonb, arrays, and tsvector. A GIN index on a jsonb column stores every key-value path in the document, enabling O(log n) lookups for containment (@>), key-existence (?), and SQL/JSON path queries instead of O(n) sequential scans. Two operator classes are available: jsonb_ops (default — supports all operators) and jsonb_path_ops (supports only @> and path operators, but is ~30% smaller). GIN indexes have higher write overhead than B-tree indexes due to their inverted structure — they are best for read-heavy workloads. GIN does NOT accelerate the -> and ->> path extraction operators.
JSON path expression
A query language (SQL/JSON path, added in PostgreSQL 12) for navigating and filtering jsonb structures. The syntax uses $ as the root, .key for field access, [*] for array iteration, and ? (@condition) for filtering. Examples: $.items[*].price navigates to the price field of every item; $.items[*] ? (@.price > 100) filters to items with price above 100. Used via functions (jsonb_path_query, jsonb_path_exists, jsonb_path_match) and operators (@? for path exists, @@ for path match). Path expressions support arithmetic, string methods, datetime operations, and variables (PostgreSQL 14+). The @? and @@ operators can be accelerated by a jsonb_path_ops GIN index.
jsonb operator
Symbols that act on jsonb values. Access operators: -> (get field as jsonb), ->> (get field as text), #> (get path as jsonb), #>> (get path as text). Containment: @> (left contains right), <@ (left contained by right). Key existence: ? (key exists), ?| (any key exists), ?& (all keys exist). Path: @? (path expression returns any match), @@ (path expression evaluates to true). Modification: || (shallow merge — right overwrites left for matching top-level keys), - (delete key), #- (delete at path). The critical rule: use ->> not -> in WHERE clauses comparing to string literals.
pg_jsonschema
A PostgreSQL extension by Supabase that adds jsonb_matches_schema(schema, data) — a function that validates a jsonb value against a JSON Schema draft-07 document and returns a boolean. Used with CHECK constraints to enforce document structure at the database layer on every INSERT and UPDATE. Supports $ref, anyOf, oneOf, if/then/else, pattern, format (date, email, uuid), minimum, maximum, minItems, maxItems, additionalProperties, and patternProperties. Available as a trusted extension on Supabase; installable from source on self-hosted PostgreSQL. The constraint is enforced transactionally — no document violating the schema can be committed.
jsonb_set
A PostgreSQL function that returns a modified copy of a jsonb value with a specific path replaced by a new value. Signature: jsonb_set(target jsonb, path text[], new_value jsonb, create_missing boolean = true). The path is a PostgreSQL array literal: {'key'} for a top-level key, {'outer','inner'} for nested access. The new value must be a valid JSON literal — string values require inner double-quotes: '"shipped"' . Use to_jsonb() for safe type conversion: to_jsonb('shipped'::text). When create_missing = true (default), the path is created if absent. jsonb_set() does not mutate in place — it returns a new value that must be assigned in an UPDATE statement. Chain calls for multi-field atomic updates.
JSON_TABLE
A SQL standard table-valued function added in PostgreSQL 17 that converts JSON arrays into relational rows using standard SQL syntax. Usage: SELECT * FROM orders, JSON_TABLE(data->'items', '$[*]' COLUMNS (product TEXT PATH '$.product', qty INT PATH '$.qty')) AS items; — each element of the items array becomes a row with typed columns. Before PostgreSQL 17, the equivalent was achieved with jsonb_array_elements() in a lateral join with manual column extraction. JSON_TABLE() is more concise, handles type conversion declaratively, and supports DEFAULT values and error handling via the ON ERROR clause. It is part of the ISO SQL/JSON standard and allows portability between database systems.

FAQ

What is the difference between json and jsonb in PostgreSQL?

json stores the raw input text verbatim — it validates syntax on write but re-parses the text on every read, preserving whitespace, duplicate keys, and original key ordering. jsonb parses JSON on write and stores a binary representation: keys are sorted, duplicates de-duplicated (last value wins), whitespace stripped. The binary form makes reads 10–100× faster for indexed queries because PostgreSQL navigates a binary tree directly instead of re-parsing text. Only jsonb supports the @>, ?, ?|, ?& operators and GIN indexes. jsonb uses 30–50% more disk space than json. Use jsonb for everything — use json only for compliance audit logs where byte-exact preservation of the original payload is legally required (e.g., signed webhooks where altering whitespace breaks HMAC verification). Cast freely between types: ::json::jsonb and ::jsonb::json.

How do I query a jsonb column in PostgreSQL?

For equality filters, use containment (GIN-indexed): SELECT * FROM orders WHERE data @> '{"status":"paid"}'; — this runs in O(log n) with a GIN index. For string comparisons with ->> : {"SELECT * FROM orders WHERE data->>'status' = 'paid';"} — use a B-tree index on the extracted field for this pattern. For nested access: {"data->'address'->>'city'"} or the path operator {"data #>> '{address,city}'"}. For numeric comparisons, cast: {"WHERE (data->>'amount')::numeric > 100"}. For nested array filtering (JSON path): {"WHERE data @? '$.items[*].price ? (@ > 100)'"} — GIN-indexed with jsonb_path_ops. Never use -> in a WHERE clause comparing to a string literal — it returns jsonb and causes a type mismatch error. Use ->> for text, @> for containment.

How do I create a GIN index on a jsonb column?

Full GIN index supporting all operators: CREATE INDEX idx_data_gin ON orders USING GIN (data); — enables @>, ?, ?|, ?&, @?, @@. Compact index supporting only @> and path operators: CREATE INDEX idx_data_jpo ON orders USING GIN (data jsonb_path_ops); — ~30% smaller. For a specific extracted field, use B-tree: {"CREATE INDEX idx_status ON orders ((data->>'status'));"} — enables equality and range queries with ORDER BY. Always verify with EXPLAIN (ANALYZE) — look for "Bitmap Index Scan" or "Index Scan", not "Seq Scan". Note: GIN does not accelerate -> or ->> extraction operators — only containment and key-existence operators. A query using {"data->>'status' = 'paid'"} will NOT use the GIN index; rewrite as {"data @> '{"status":"paid"}'"} or add a B-tree index on {"(data->>'status')"}.

How do I use JSON path expressions in PostgreSQL?

PostgreSQL 12+ supports SQL/JSON path via jsonb_path_query(), jsonb_path_exists(), and the @?/@@ operators. Navigate to a field: {"jsonb_path_query(data, '$.status')"}. Iterate an array: {"jsonb_path_query(data, '$.items[*]')"}. Filter array elements: {"jsonb_path_query(data, '$.items[*] ? (@.price > 100)')"} — returns only items with price above 100. In a WHERE clause: {"WHERE data @? '$.items[*].price ? (@ > 100)'"} — tests if any item price exceeds 100 (GIN-accelerated with jsonb_path_ops). Boolean test: {"WHERE data @@ '$.score >= 9.0'"}. Collect results as array: {"jsonb_path_query_array(data, '$.items[*].product')"} returns ["Widget","Gadget"]. Variables (PG 14+): {"jsonb_path_query(data, '$.items[*] ? (@.price > $min)', '{"min": 100}')"} — parameterize safely without string interpolation.

How do I update a nested jsonb field in PostgreSQL?

Use jsonb_set(): UPDATE orders SET data = jsonb_set(data, '{status}', '"shipped"') WHERE id = 1; The path is a PostgreSQL array literal. String values need inner double-quotes: '"shipped"' not 'shipped''shipped' is not valid JSON. Safe alternative: jsonb_set(data, '{status}', to_jsonb('shipped'::text)). For nested paths: jsonb_set(data, '{address,city}', '"Austin"'). Increment a number: jsonb_set(data, '{views}', to_jsonb(coalesce((data->>'views')::int, 0) + 1)). Chain for multi-field atomic updates: jsonb_set(jsonb_set(data, '{status}', '"shipped"'), '{tracking}', '"TRK-123"') . Shallow merge new keys: data || '{"shipped_at":"2026-01-25"}' (overwrites top-level matching keys). Remove a key: {"data - 'temp'"}. Remove nested: {"data #- '{address,zip}'"}.

How do I validate JSON data in PostgreSQL with a schema?

Use the pg_jsonschema extension: CREATE EXTENSION pg_jsonschema; then add a CHECK constraint: ALTER TABLE orders ADD CONSTRAINT valid_data CHECK (jsonb_matches_schema('{"type":"object","required":["status","amount"]}', data)); This fires on every INSERT and UPDATE — if the document fails validation, PostgreSQL raises an error and rolls back the write. No application code needed; validation is enforced regardless of which client connects. pg_jsonschema supports JSON Schema draft-07: $ref, anyOf, oneOf, if/then/else, pattern, format (email, uuid, date), minimum, maximum, minItems, maxItems, additionalProperties. Available as a trusted extension on Supabase; installable from source elsewhere. Without the extension, use basic CHECK constraints: {"CHECK (data ? 'status' AND data->>'status' IN ('paid','pending'))"}. For application-layer JSON data validation before database writes, see the linked guide.

How do I handle jsonb in Prisma ORM?

In Prisma schema, declare data Json — Prisma maps this to PostgreSQL jsonb. Insert by passing a plain JavaScript object: data: { status: 'paid', amount: 99.99 }. Query with the path filter: where: { data: { path: ['status'], equals: 'paid' } }. For array containment: where: { data: { path: ['tags'], array_contains: 'express' } }. The critical limitation: Prisma path filters translate to ->> extraction queries that do NOT use GIN indexes. For GIN-accelerated @> containment queries in production, use prisma.$queryRaw: prisma.$queryRaw`SELECT * FROM "Order" WHERE data @> ${JSON.stringify(filter)}::jsonb`. GIN indexes must be added via migration SQL — Prisma does not auto-generate them. For full type safety, use a TypeScript type assertion on the Json field: data Json @db.JsonB with a TypeScript type alias. Drizzle ORM provides more direct jsonb operator access through its sql template tag without requiring raw queries.

How do I convert a JSON array to rows in PostgreSQL?

Use jsonb_array_elements() in a lateral join: SELECT id, elem FROM orders, jsonb_array_elements(data->>'items') AS elem WHERE id = 1; — each array element becomes a row as jsonb. For text output: jsonb_array_elements_text(data->'tags'). Access fields within expanded objects: {"elem->>'product' AS product, (elem->>'qty')::int AS qty"}. For JSON path: {"SELECT jsonb_path_query(data, '$.items[*]') FROM orders;"} — equivalent but composable in WHERE clauses. PostgreSQL 17 adds JSON_TABLE() for standard SQL conversion: JSON_TABLE(data->'items', '$[*]' COLUMNS (product TEXT PATH '$.product', qty INT PATH '$.qty')) — returns a proper relational table with typed columns. Aggregate back to an array: {"SELECT jsonb_agg(elem ORDER BY (elem->>'qty')::int DESC) FROM orders, jsonb_array_elements(data->'items') AS elem;"}. For objects (not arrays): {"jsonb_each(data)"} expands to (key text, value jsonb) rows.

Further reading and primary sources