MongoDB JSON Documents: BSON, EJSON, Mongoose & $jsonSchema

Q: How do I serialize a MongoDB document to JSON?

JSON.stringify() does not throw on MongoDB documents, but it silently drops BSON type information. With the bson classes used by the Node.js driver, ObjectId defines toJSON() and serializes as a plain 24-character hex string, Date serializes as an ISO 8601 string, and Decimal128 serializes as {"$numberDecimal": "..."}, so a parser on the other side cannot tell an ObjectId or a Date from an ordinary string. Three approaches give you control over the output: (1) EJSON.stringify() from the bson package: import { EJSON } from "bson"; EJSON.stringify(doc) preserves types as Extended JSON, for example {"$oid": "..."} and {"$date": "..."}. (2) A custom replacer: JSON.stringify() calls toJSON() before the replacer runs, so read the raw value from this[key], for example JSON.stringify(doc, function (key, val) { return this[key] instanceof Decimal128 ? this[key].toString() : val }). (3) Mongoose toJSON(): configure the schema with toJSON: { virtuals: true, transform: (doc, ret) => { ret.id = ret._id.toString(); delete ret._id; delete ret.__v; return ret; } } to control exactly how documents serialize in res.json(). For REST APIs, define an explicit toJSON transform or DTO mapper so consumers receive a stable, documented shape (id as a string, no __v) rather than whatever each BSON class happens to emit.

Q: What is MongoDB Extended JSON (EJSON)?

Extended JSON (EJSON) is a text-based JSON representation that preserves BSON type information using special wrapper objects. EJSON has two modes: Canonical (verbose, type-preserving) and Relaxed (human-readable, loses some type fidelity). In Canonical mode: ObjectId is {"$oid": "507f1f77bcf86cd799439011"}, Date is {"$date": {"$numberLong": "1705276800000"}}, Decimal128 is {"$numberDecimal": "19.99"}, Int64 is {"$numberLong": "9007199254740993"}, BinData is {"$binary": {"base64": "...", "subType": "00"}}. In Relaxed mode: ObjectId is {"$oid": "507f..."}, Date is {"$date": "2024-01-15T00:00:00Z"} (ISO string), numbers are plain JSON numbers. Use the bson package: import { EJSON } from "bson"; const str = EJSON.stringify(doc, null, 2); const parsed = EJSON.parse(str). mongoexport outputs Relaxed EJSON by default; pass --jsonFormat=canonical for Canonical mode. EJSON is the correct format for round-tripping MongoDB documents through text-based systems without losing type information.
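To make the two Date shapes concrete, here is a minimal hand-rolled sketch. The helpers toCanonicalDate and toRelaxedDate are hypothetical names introduced only for illustration; real code should call EJSON.stringify() from the bson package, whose exact string formatting may differ from this sketch.

```typescript
// Illustration only: hypothetical helpers, not part of the bson package.
type CanonicalDate = { $date: { $numberLong: string } };
type RelaxedDate = { $date: string };

function toCanonicalDate(d: Date): CanonicalDate {
  // Canonical mode carries milliseconds since the Unix epoch, as a string.
  return { $date: { $numberLong: String(d.getTime()) } };
}

function toRelaxedDate(d: Date): RelaxedDate {
  // Relaxed mode carries an ISO 8601 string instead.
  return { $date: d.toISOString() };
}

const sample = new Date("2024-01-15T00:00:00Z");
const canonicalJson = JSON.stringify(toCanonicalDate(sample));
const relaxedJson = JSON.stringify(toRelaxedDate(sample));

// 64-bit integers are written as strings because a JavaScript number
// cannot represent every Int64 value exactly.
const big = "9007199254740993";
const survivesAsNumber = String(Number(big)) === big; // false: rounds to ...992
```

The last line shows why Canonical mode wraps Int64 values in {"$numberLong": "..."}: once such a value passes through a plain JSON number, the low digits can change.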

Q: How do I query MongoDB with JSON-like filters?

MongoDB query filters are plain JavaScript objects (JSON-like documents) passed to find(), findOne(), updateOne(), or deleteOne(). A plain value is an implicit $eq: collection.find({ status: "active" }). Comparison operators use nested objects: { price: { $gt: 10, $lte: 100 } }. Array operators: { tags: { $in: ["sale", "featured"] } } matches any element; { tags: { $all: ["sale", "featured"] } } requires both. Logical operators: { $or: [{ price: { $lt: 5 } }, { inStock: true }] }. Nested field queries use dot notation: { "address.city": "London" }. For arrays of embedded documents, use $elemMatch when multiple conditions must match the same element: { reviews: { $elemMatch: { rating: { $gte: 4 }, verified: true } } }. Field existence: { discount: { $exists: true } }. BSON type filter: { _id: { $type: "objectId" } }. In an aggregation pipeline, the $match stage uses this same filter syntax; stages such as $group and $project use aggregation expressions instead.
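The semantics of implicit $eq, the comparison operators, and $in can be sketched with a toy in-memory matcher. This is an illustration under simplifying assumptions (flat documents, scalar fields, no dot notation, no null-matches-missing rule), not how MongoDB evaluates queries; matches, ToyFilter, and ToyDoc are hypothetical names.

```typescript
type Scalar = string | number | boolean | null;
type Operators = {
  $gt?: number; $gte?: number; $lt?: number; $lte?: number; $in?: Scalar[];
};
type ToyFilter = { [field: string]: Scalar | Operators };
type ToyDoc = { [field: string]: Scalar };

// Returns true when every field condition in the filter holds (implicit AND).
function matches(doc: ToyDoc, filter: ToyFilter): boolean {
  return Object.entries(filter).every(([field, cond]) => {
    const value: Scalar | undefined = doc[field];
    if (typeof cond !== "object" || cond === null) {
      return value === cond; // a plain value behaves like $eq
    }
    // Comparison operators in this sketch only apply to numeric fields.
    const n = typeof value === "number" ? value : undefined;
    if (cond.$gt !== undefined && !(n !== undefined && n > cond.$gt)) return false;
    if (cond.$gte !== undefined && !(n !== undefined && n >= cond.$gte)) return false;
    if (cond.$lt !== undefined && !(n !== undefined && n < cond.$lt)) return false;
    if (cond.$lte !== undefined && !(n !== undefined && n <= cond.$lte)) return false;
    if (cond.$in !== undefined && !cond.$in.some((x) => x === value)) return false;
    return true;
  });
}

const inRange = matches(
  { status: "active", price: 50 },
  { status: "active", price: { $gt: 10, $lte: 100 } },
);
```

Multiple operators inside one nested object, such as { $gt: 10, $lte: 100 }, must all hold for the same field, which is why the sketch checks each one in turn.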

Q: How do I validate MongoDB document structure with $jsonSchema?

$jsonSchema is a MongoDB validator that enforces document structure on insert and update operations using JSON Schema draft-04 syntax. Define it when creating a collection: db.createCollection("users", { validator: { $jsonSchema: { bsonType: "object", required: ["email", "createdAt"], properties: { email: { bsonType: "string", pattern: "^.+@.+$" }, age: { bsonType: "int", minimum: 0, maximum: 150 }, createdAt: { bsonType: "date" } } } } }). Key differences from standard JSON Schema: use bsonType instead of type to support BSON types ("objectId", "date", "decimal", "binData"); standard JSON Schema type values ("string", "number", "boolean") are also supported for compatibility. validationAction controls behavior on violation: "error" (default) rejects the write; "warn" logs a warning but allows the write. validationLevel: "strict" validates all inserts and updates; "moderate" only validates inserts and updates to documents that already pass validation. $jsonSchema runs on each write — it does not retroactively validate existing documents.

Q: How do I use Mongoose to return JSON from a MongoDB query?

Mongoose documents have a toJSON() method that controls JSON serialization. Configure the schema-level transform to shape the output: const userSchema = new Schema({ name: String, email: String }, { toJSON: { virtuals: true, transform: (doc, ret) => { ret.id = ret._id.toString(); delete ret._id; delete ret.__v; return ret; } } }). After this, res.json(user) and JSON.stringify(user) both call toJSON() automatically, and the ObjectId _id becomes a plain string id. To include virtual fields (like fullName computed from firstName + lastName), set virtuals: true in the toJSON option. For lean() queries (which return plain JavaScript objects instead of Mongoose documents), toJSON transforms are not applied: you must handle serialization manually or avoid lean() when JSON output matters. toObject() is configured separately through the toObject schema option; give it the same transform if you need the transformed plain object in application code rather than in a serialized response.

Q: How do I handle MongoDB ObjectId in JSON APIs?

ObjectId is a 12-byte BSON type whose string form is a 24-character hex string like "507f1f77bcf86cd799439011". In REST API responses, convert ObjectId to a plain string explicitly rather than relying on implicit serialization: the driver's ObjectId defines toJSON(), so JSON.stringify() emits the hex string, but the field is still named _id and consumers end up coupled to your storage layout. Three patterns: (1) Mongoose toJSON transform: ret.id = ret._id.toString(); delete ret._id, applied automatically on res.json(). (2) Native driver projection: add { projection: { _id: 0, id: { $toString: "$_id" } } } to find() calls to convert at the database level (aggregation expressions in find() projections require MongoDB 4.4 or later). (3) Manual conversion: doc._id.toString() wherever you build response objects. When accepting ObjectId in API request bodies (e.g., a userId field), convert the incoming string back to ObjectId before querying: new ObjectId(req.body.userId). Validate that the string is a valid 24-character hex before calling new ObjectId(), because an invalid string throws BSONError. In TypeScript, type incoming IDs as string and outgoing IDs as string; use ObjectId only internally for database operations.
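The validation step can be sketched without any driver dependency. The names isObjectIdHex and parseIdParam are hypothetical; the driver's own ObjectId.isValid() is an alternative, though it may accept inputs other than 24-character hex strings depending on version.

```typescript
// A strict check: exactly 24 hexadecimal characters.
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

function isObjectIdHex(input: unknown): input is string {
  return typeof input === "string" && OBJECT_ID_HEX.test(input);
}

type ParsedId = { ok: true; hex: string } | { ok: false; error: string };

// Validate an untrusted id (route param, body field) before it reaches
// new ObjectId(...), so bad input becomes a 400 response instead of a thrown error.
function parseIdParam(input: unknown): ParsedId {
  if (!isObjectIdHex(input)) {
    return { ok: false, error: "id must be a 24-character hex string" };
  }
  return { ok: true, hex: input.toLowerCase() };
}

const good = parseIdParam("507F1F77BCF86CD799439011");
const bad = parseIdParam("not-an-id");
```

Returning a tagged result rather than throwing keeps the route handler's control flow explicit: check ok, respond with 400 on failure, and only then construct the ObjectId.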

Q: How do I type MongoDB documents in TypeScript?

The MongoDB Node.js driver ships its own TypeScript types: use the generic parameter on Collection to type documents. Define a document interface: interface UserDoc { _id?: ObjectId; name: string; email: string; createdAt: Date; role: "admin" | "user"; }. Then: const users = db.collection<UserDoc>("users"). find() returns FindCursor<WithId<UserDoc>>, findOne() returns Promise<WithId<UserDoc> | null>, and insertOne() accepts OptionalUnlessRequiredId<UserDoc> (the driver type that lets you omit _id on insert when the interface does not require it). For REST API response types, define a separate DTO interface where _id is replaced by id: string and Date fields are string. Use a mapper function: function toUserDto(doc: WithId<UserDoc>): UserDto { return { id: doc._id.toString(), name: doc.name, email: doc.email, createdAt: doc.createdAt.toISOString() }; }. With Mongoose, either define the document interface and pass it as a generic to Schema<IUser> and model<IUser>("User", schema), or let Mongoose infer the type from the schema definition with InferSchemaType. Avoid using any for document types: it defeats the purpose of TypeScript and hides ObjectId serialization bugs that the compiler could otherwise catch.

Written and reviewed by the Jsonic editorial team — every guide is verified against the official spec or runtime before publication.

Last updated: May 20, 2026

MongoDB stores data as BSON (Binary JSON), a binary-encoded superset of JSON that adds types JSON lacks: ObjectId, Date, Binary, Decimal128, Regular Expression, and Timestamp. When you read from MongoDB via the Node.js driver, BSON values are converted to JavaScript objects: ObjectId becomes an ObjectId instance (not a plain string) and Date becomes a JavaScript Date. JSON.stringify() flattens those values through each class's toJSON() method, so type information is lost unless you use Extended JSON, a toJSON transform, or a custom replacer. This guide covers BSON vs JSON differences, Extended JSON (EJSON) for type-preserving serialization, Mongoose schema-to-JSON mapping, aggregation pipeline JSON operators, and $jsonSchema validation. The examples use TypeScript with Mongoose and the native driver.

BSON vs JSON: MongoDB's Type System

BSON is MongoDB's binary serialization format, not a text format. Where JSON supports six types (string, number, boolean, null, array, object), BSON defines roughly 20, including ObjectId (12-byte unique ID), Date (64-bit UTC milliseconds), Decimal128 (128-bit IEEE 754 decimal floating point, suited to financial math), BinData (arbitrary binary), Regex (pattern plus flags), Int32, Int64, Timestamp (internal replication clock), MinKey, and MaxKey. The MongoDB driver handles BSON serialization automatically: you insert plain JavaScript objects and the driver converts them to BSON on the wire. The practical impact is that types with no JSON equivalent must be handled explicitly when serializing to JSON for APIs or storage.

import { MongoClient, ObjectId, Decimal128, Long } from 'mongodb'

const client = new MongoClient('mongodb://localhost:27017')
const db = client.db('shop')

// ── Inserting BSON-typed fields ────────────────────────────────────
await db.collection('products').insertOne({
  _id:       new ObjectId(),                    // 12-byte unique ID (auto if omitted)
  name:      'Widget Pro',
  price:     Decimal128.fromString('19.99'),    // no float rounding — financial-safe
  createdAt: new Date(),                        // BSON Date — 64-bit UTC ms
  tags:      ['electronics', 'sale'],
  specs:     { weight: 150, unit: 'g' },        // embedded document
  inStock:   true,
  eventId:   Long.fromString('9007199254740993'), // 64-bit int — beyond JS safe range
})

// ── ObjectId encodes its own creation timestamp ────────────────────
const id = new ObjectId('507f1f77bcf86cd799439011')
console.log(id.getTimestamp())  // Date: 2012-10-17T21:13:27.000Z (first 4 bytes, 0x507f1f77 seconds)
// No separate createdAt index needed for approximate creation-time ordering

// ── BSON type comparison to JSON ──────────────────────────────────
// JSON type        BSON type     JSON.stringify() result (driver's bson classes)
// ─────────────────────────────────────────────────────────────────
// number           Int32         number
// number           Int64         number when it fits in 53 bits; larger values stay Long and need explicit handling
// (no JSON equiv)  Decimal128    {"$numberDecimal":"19.99"} (via toJSON; an EJSON wrapper, not a number)
// string           String        "string"
// boolean          Boolean       true / false
// null             Null          null
// (no JSON equiv)  ObjectId      "507f1f77bcf86cd799439011" (hex string via toJSON; type is lost)
// (no JSON equiv)  Date          "2024-01-15T00:00:00.000Z" (via Date.toJSON; type is lost)
// (no JSON equiv)  BinData       base64 string (via toJSON; subtype is lost)

// ── What BSON types look like in JSON exports (Extended JSON) ──────
const ejsonRepresentations = {
  objectId:    { '$oid': '507f1f77bcf86cd799439011' },
  date:        { '$date': '2024-01-15T00:00:00.000Z' },
  decimal128:  { '$numberDecimal': '19.99' },
  int64:       { '$numberLong': '9007199254740993' },
  binData:     { '$binary': { base64: 'abc123==', subType: '00' } },
  regex:       { '$regularExpression': { pattern: 'abc', options: 'i' } },
}

// ── Check BSON size in mongosh ─────────────────────────────────────
// Object.bsonsize(db.products.findOne())  → size in bytes (max 16 MB)

The 16 MB BSON document size limit applies to the binary-serialized form, not the JSON text size. A document with many Decimal128 fields may be smaller in BSON than its JSON representation. ObjectId's 12-byte encoding is half the size of its 24-character hex string. The key practical difference from JSON: JSON.stringify() on a MongoDB document is lossy. ObjectId and BinData collapse into plain strings, Decimal128 becomes a {"$numberDecimal": "..."} wrapper, and nothing in the output records the original BSON type. Use EJSON when types must survive a round trip, or a toJSON transform or DTO mapper when you want a clean API shape.
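The creation timestamp that getTimestamp() returns can be recovered from the hex string alone, because the first 4 of the 12 bytes are a big-endian count of seconds since the Unix epoch. A minimal sketch, assuming a well-formed 24-character hex id (objectIdTimestamp is a hypothetical helper, not a driver function):

```typescript
// Decode the creation time embedded in an ObjectId's first 4 bytes.
function objectIdTimestamp(hex: string): Date {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // 8 hex characters = 4 bytes = seconds since 1970-01-01T00:00:00Z.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("507f1f77bcf86cd799439011");
// created.toISOString() → "2012-10-17T21:13:27.000Z"
```

The resolution is one second, so this is suitable for approximate ordering and display, not for precise event timing.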

Extended JSON (EJSON): Type-Preserving Serialization

Extended JSON (EJSON) is a text format that represents BSON types as JSON objects with special $-prefixed keys. EJSON has two modes: Canonical (fully type-preserving, verbose) and Relaxed (human-readable, loses some type fidelity). Use EJSON when you need to round-trip MongoDB documents through text-based systems — message queues, REST responses that must be re-ingested, export files — without losing ObjectId or Date type information. The bson npm package provides EJSON.stringify() and EJSON.parse().

import { EJSON, ObjectId } from 'bson'  // same package for both, so the instanceof checks below hold

const doc = {
  _id:       new ObjectId('507f1f77bcf86cd799439011'),
  name:      'Alice',
  createdAt: new Date('2024-01-15T00:00:00Z'),
  visits:    42,  // plain JS number: shows how the two modes treat numeric types
}

// ── EJSON Canonical mode: fully type-preserving ────────────────────
const canonical = EJSON.stringify(doc, null, 2, { relaxed: false })
// {
//   "_id":       { "$oid": "507f1f77bcf86cd799439011" },
//   "name":      "Alice",
//   "createdAt": { "$date": { "$numberLong": "1705276800000" } },
//   "visits":    { "$numberInt": "42" }
// }

// ── EJSON Relaxed mode (default): human-readable ──────────────────
const relaxed = EJSON.stringify(doc, null, 2)
// {
//   "_id":       { "$oid": "507f1f77bcf86cd799439011" },
//   "name":      "Alice",
//   "createdAt": { "$date": "2024-01-15T00:00:00.000Z" },  ← ISO string
//   "visits":    42
// }

// ── Parse EJSON back — types are restored ─────────────────────────
const parsed = EJSON.parse(relaxed)
console.log(parsed._id instanceof ObjectId)    // true
console.log(parsed.createdAt instanceof Date)  // true

// ── mongoexport uses Relaxed EJSON by default ──────────────────────
// mongoexport --uri="..." --db=mydb --collection=users --out=users.json
// Add --jsonFormat=canonical for Canonical EJSON output

// ── Using EJSON in a REST API ──────────────────────────────────────
// When the consumer will re-ingest the data into MongoDB:
import express from 'express'
const app = express()

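app.get('/api/users/:id', async (req, res) => {
  // Reject malformed ids up front: new ObjectId() throws on invalid input
  if (!ObjectId.isValid(req.params.id)) return res.status(400).json({ error: 'Invalid id' })
  const user = await db.collection('users').findOne({
    _id: new ObjectId(req.params.id),
  })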
  if (!user) return res.status(404).json({ error: 'Not found' })
  // Use EJSON to preserve ObjectId and Date types for re-ingestion
  res.type('application/json').send(EJSON.stringify(user))
})

// ── JSON.stringify vs EJSON.stringify comparison ──────────────────
const testDoc = { _id: new ObjectId(), ts: new Date() }

JSON.stringify(testDoc)
// {"_id":"...","ts":"2024-01-15T00:00:00.000Z"}  ← valid JSON, but both values are now plain strings

EJSON.stringify(testDoc)
// {"_id":{"$oid":"..."},"ts":{"$date":"2024-01-15T00:00:00.000Z"}}  ← types preserved

Choose EJSON Canonical when the output will be parsed by a system that must reconstruct exact BSON types, for example mongoimport or another MongoDB driver. Choose Relaxed when people or generic JSON tooling will read the output and exact numeric types do not matter; ObjectId and Date still appear as {"$oid": "..."} and {"$date": "..."} wrappers in both modes. For most REST APIs, neither EJSON mode is ideal: consumers expect {"id": "507f..."}, not {"_id": {"$oid": "507f..."}}. Use a Mongoose toJSON transform or a manual DTO mapper instead.

Mongoose: Mapping JSON to MongoDB Documents

Mongoose is the most widely used MongoDB ODM (Object Document Mapper) for Node.js. It adds schema enforcement, validation, middleware (hooks), and controlled JSON serialization on top of the native driver. The toJSON() and toObject() schema options control how Mongoose documents serialize — critical for REST API responses where consumers expect plain strings, not ObjectId instances. Configure the transform once at the schema level and every res.json() call uses it automatically.

import mongoose, { Schema, model, Document, Types } from 'mongoose'

// ── Define TypeScript interface for the document ───────────────────
interface IUser {
  _id: Types.ObjectId
  name: string
  email: string
  role: 'admin' | 'user'
  createdAt: Date
  updatedAt: Date
}

// ── Schema with toJSON transform ───────────────────────────────────
const userSchema = new Schema<IUser>(
  {
    name:  { type: String, required: true },
    email: { type: String, required: true, unique: true, lowercase: true },
    role:  { type: String, enum: ['admin', 'user'], default: 'user' },
  },
  {
    timestamps: true,  // adds createdAt and updatedAt automatically
    toJSON: {
      virtuals: true,  // include virtual fields (e.g. fullName)
      transform: (_doc, ret) => {
        ret.id = ret._id.toString()  // expose id as string
        delete ret._id               // remove _id (ObjectId)
        delete ret.__v               // remove version key
        // Convert Date to ISO string explicitly (Date.toJSON() does this anyway,
        // but being explicit prevents surprises if the field is a string in DB)
        ret.createdAt = ret.createdAt?.toISOString()
        ret.updatedAt = ret.updatedAt?.toISOString()
        return ret
      },
    },
  }
)

// ── Add a virtual field ────────────────────────────────────────────
userSchema.virtual('profileUrl').get(function () {
  return `https://jsonic.io/users/${this._id.toString()}`
})

const User = model<IUser>('User', userSchema)

// ── Querying and serializing ───────────────────────────────────────
const user = await User.findOne({ email: 'alice@example.com' })

// res.json() calls JSON.stringify() which calls toJSON() on Mongoose doc
// Output: { id: "507f...", name: "Alice", email: "alice@...", role: "user",
//           createdAt: "2024-01-15T00:00:00.000Z", ..., profileUrl: "..." }

// ── lean() bypasses toJSON ─────────────────────────────────────────
// lean() returns a plain JS object — toJSON transform NOT applied
const rawDoc = await User.findOne({ email: 'alice@example.com' }).lean()
// rawDoc is null when nothing matches; otherwise rawDoc._id is ObjectId and rawDoc.__v is number
const dto = rawDoc && { id: rawDoc._id.toString(), name: rawDoc.name, email: rawDoc.email }

// ── Populate + toJSON ──────────────────────────────────────────────
// Populated referenced documents also call toJSON recursively
interface IPost {
  _id: Types.ObjectId
  title: string
  author: Types.ObjectId | IUser
}
const postSchema = new Schema<IPost>({
  title:  String,
  author: { type: Schema.Types.ObjectId, ref: 'User' },
})
const Post = model<IPost>('Post', postSchema)

const post = await Post.findById(postId).populate('author')
// post?.toJSON() → { _id: ObjectId, title: "...", author: { id: "...", name: "..." } }
// author is transformed by User's toJSON schema option; postSchema defines no transform, so the post keeps _id

The lean() option returns plain JavaScript objects (POJOs) straight from the driver, bypassing Mongoose document instantiation, virtuals, getters, and toJSON transforms. Skipping hydration makes lean queries noticeably faster and lighter on memory (the exact gain depends on document size and schema), which is ideal for read-only endpoints where you only need data, not document methods. However, you must handle ObjectId-to-string conversion manually with lean. A common pattern: use .lean() in API routes and call a dedicated DTO mapper function on the result. See also our guide on TypeScript JSON types for typing the DTO layer.

JSON.stringify() with MongoDB Documents

JSON.stringify() never throws on a MongoDB document, but it is lossy. The driver's BSON classes define toJSON(): ObjectId becomes a bare hex string, Binary becomes a base64 string, and Decimal128 becomes an Extended JSON wrapper, {"$numberDecimal": "19.99"}, which is rarely what an API consumer expects. Date produces an ISO string, which loses the BSON Date type distinction. To control the output, pass a custom replacer function as the second argument to JSON.stringify(), use EJSON, or configure Mongoose's toJSON schema option.

import { ObjectId, Decimal128, Binary } from 'mongodb'

// ── The problem ────────────────────────────────────────────────────
const doc = {
  _id:   new ObjectId('507f1f77bcf86cd799439011'),
  price: Decimal128.fromString('19.99'),
  ts:    new Date('2024-01-15'),
}

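JSON.stringify(doc)
// {"_id":"507f1f77bcf86cd799439011","price":{"$numberDecimal":"19.99"},"ts":"2024-01-15T00:00:00.000Z"}
// _id and ts are indistinguishable from ordinary strings; price leaks an EJSON wrapper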

// ── Solution 1: Custom replacer ────────────────────────────────────
// JSON.stringify() calls toJSON() BEFORE the replacer, so `value` is already
// flattened. Read the raw BSON value from the holder object via this[key].
function mongoReplacer(this: Record<string, unknown>, key: string, value: unknown): unknown {
  const raw = this[key]
  if (raw instanceof ObjectId)   return raw.toHexString()      // "507f..."
  if (raw instanceof Decimal128) return raw.toString()         // "19.99" as a string: parseFloat would reintroduce float rounding
  if (raw instanceof Binary)     return raw.toString('base64')
  return value
}

JSON.stringify(doc, mongoReplacer, 2)
// {
//   "_id":   "507f1f77bcf86cd799439011",
//   "price": "19.99",
//   "ts":    "2024-01-15T00:00:00.000Z"
// }

// ── Solution 2: Transform before stringify ─────────────────────────
function serializeDoc<T extends Record<string, unknown>>(doc: T): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(doc).map(([k, v]) => {
      if (v instanceof ObjectId)   return [k, v.toString()]
      if (v instanceof Decimal128) return [k, v.toString()]
      if (v instanceof Date)       return [k, v.toISOString()]
      if (v !== null && typeof v === 'object' && !Array.isArray(v))
        return [k, serializeDoc(v as Record<string, unknown>)]
      return [k, v]
    })
  )
}

// ── Solution 3: EJSON.stringify (bson package) ─────────────────────
import { EJSON } from 'bson'
EJSON.stringify(doc)
// {"_id":{"$oid":"507f..."},"price":{"$numberDecimal":"19.99"},"ts":{"$date":"..."}}
// Correct — but uses Extended JSON wrapper objects, not plain strings

// ── Solution 4: Mongoose toJSON schema option (recommended) ────────
// Configure once, every res.json() call uses it automatically (see the Mongoose section above)

// ── Express middleware: safe JSON response ─────────────────────────
import express from 'express'
const app = express()

// Override res.json to use EJSON — useful for internal microservices
app.use((_req, res, next) => {
  const originalJson = res.json.bind(res)
  res.json = function (data: unknown) {
    return originalJson(JSON.parse(EJSON.stringify(data)))
  }
  next()
})

// ── TypeScript: detect serialization issues at compile time ────────
// Define a serialized type where ObjectId → string, Date → string
type Serialized<T> = {
  [K in keyof T]: T[K] extends ObjectId ? string
    : T[K] extends Date ? string
    : T[K] extends Decimal128 ? string  // string, not number: keeps exact decimal digits
    : T[K]
}

The custom replacer approach is the lowest-overhead solution for native driver queries: no extra dependencies, no schema setup. However, it must read raw values through this[key] (JSON.stringify() applies toJSON() before calling the replacer), and it must be kept in sync manually as new BSON types are used. Mongoose's toJSON transform is the most maintainable approach for applications using Mongoose, because it is schema-scoped and applies automatically on every serialization. For microservices that pass MongoDB documents between services via JSON, EJSON is the safest: it is still valid JSON, parseable by any JSON parser, and consumers that use the bson package can restore types with EJSON.parse(). Relaxed mode is easier to read; switch to Canonical mode when exact numeric types (Int32 vs Int64 vs Double) must survive the round trip.
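The ordering between toJSON() and the replacer is standard JSON.stringify() behavior and can be observed without any MongoDB dependency. In this sketch, FakeId is a hypothetical stand-in for a class like ObjectId that defines toJSON():

```typescript
// Stand-in for a BSON class that defines toJSON().
class FakeId {
  readonly hex: string;
  constructor(hex: string) {
    this.hex = hex;
  }
  toJSON(): string {
    return this.hex;
  }
}

const seen: string[] = [];
const payload = { _id: new FakeId("507f1f77bcf86cd799439011") };

// A regular function (not an arrow function) so that `this` is the holder object.
const out = JSON.stringify(payload, function (this: unknown, key: string, value: unknown) {
  if (key === "_id") {
    const holder = this as Record<string, unknown>;
    seen.push(typeof value); // toJSON() already ran, so this is "string"
    seen.push(holder[key] instanceof FakeId ? "raw is FakeId" : "raw is not FakeId");
  }
  return value;
});
```

An instanceof check on the replacer's value argument would never match here; the check on this[key] does, which is why replacers for BSON classes should inspect the holder.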

Aggregation Pipeline: JSON Query Operators

MongoDB's aggregation pipeline uses JSON-like stage documents to express data transformations. Each stage is a plain object with a single $operator key. The $project stage controls field selection with JSON numeric flags (1 = include, 0 = exclude) — identical syntax to find() projections. The $match stage uses the same query operator syntax as find(). Computed expressions in $project and $addFields use JSON operator objects like {"$concat": ["$first", " ", "$last"]}.

const col = db.collection('orders')

// ── $match — JSON query filter (same syntax as find()) ─────────────
// ── $group — aggregate with JSON accumulator operators ─────────────
// ── $project — JSON field selection: 1 include, 0 exclude ──────────
const report = await col.aggregate([
  {
    $match: {                         // filter stage — uses index if first
      status:    'completed',
      createdAt: { $gte: new Date('2024-01-01') },
      amount:    { $gt: 0 },
    },
  },
  {
    $group: {
      _id:          '$category',       // group key — field reference with $prefix
      totalRevenue: { $sum: '$amount' },
      orderCount:   { $sum: 1 },
      avgOrder:     { $avg: '$amount' },
      customers:    { $addToSet: '$customerId' },  // unique set accumulator
    },
  },
  {
    $project: {
      _id:             0,              // 0 = exclude
      category:        '$_id',         // rename: _id → category
      totalRevenue:    { $round: ['$totalRevenue', 2] },  // computed expression
      orderCount:      1,              // 1 = include as-is
      avgOrder:        { $round: ['$avgOrder', 2] },
      uniqueCustomers: { $size: '$customers' },  // array length operator
    },
  },
  { $sort: { totalRevenue: -1 } },
  { $limit: 10 },
]).toArray()

// ── $lookup — JSON join between collections ─────────────────────────
const ordersWithUsers = await col.aggregate([
  {
    $lookup: {
      from:         'users',
      localField:   'customerId',
      foreignField: '_id',
      as:           'customer',
    },
  },
  { $unwind: '$customer' },
  {
    $project: {
      orderId:       { $toString: '$_id' },  // ObjectId → string in pipeline
      amount:        1,
      customerName:  '$customer.name',
      customerEmail: '$customer.email',
      _id:           0,
    },
  },
]).toArray()

// ── $addFields with JSON expressions ───────────────────────────────
await db.collection('users').aggregate([
  {
    $addFields: {
      fullName:   { $concat: ['$firstName', ' ', '$lastName'] },
      ageInDays:  { $dateDiff: { startDate: '$birthDate', endDate: '$$NOW', unit: 'day' } },
      idAsString: { $toString: '$_id' },  // ObjectId → string within pipeline
    },
  },
]).toArray()

// ── $facet — run multiple sub-pipelines in one pass ────────────────
const searchPage = await db.collection('products').aggregate([
  { $match: { $text: { $search: 'widget' }, inStock: true } },
  {
    $facet: {
      results:    [
        { $addFields: { score: { $meta: 'textScore' } } },
        { $sort: { score: -1 } },
        { $limit: 10 },
        { $project: { name: 1, price: 1, _id: 0, score: 1 } },
      ],
      totalCount: [{ $count: 'count' }],
      categories: [
        { $group: { _id: '$category', count: { $sum: 1 } } },
        { $sort: { count: -1 } },
      ],
    },
  },
]).toArray()

The JSON field-reference convention in aggregation uses a $ prefix to distinguish field names from literal string values — "$amount" means the value of the amount field, while "amount" (without $) is a literal string. System variables use $$: $$NOW (current timestamp), $$ROOT (entire input document), $$CURRENT (current document being processed). This JSON-like expression language is also used in JSON API design contexts where MongoDB is the backing store. Always place $match on indexed fields as the first stage — it reduces the document count before expensive stages like $lookup and $group.
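The $-prefix rule can be modeled with a few lines of code. The sketch below is a toy resolver written only to illustrate the convention (resolveOperand is a hypothetical name, and only $$ROOT is handled among the system variables); it is not how the server evaluates expressions.

```typescript
type Doc = { [key: string]: unknown };

// "$a.b" → value at path a.b; "$$ROOT" → the whole document; anything else → literal.
function resolveOperand(operand: unknown, doc: Doc): unknown {
  if (typeof operand !== "string" || !operand.startsWith("$")) {
    return operand; // no $ prefix: a literal value
  }
  if (operand.startsWith("$$")) {
    if (operand === "$$ROOT") return doc;
    throw new Error(`unsupported system variable in this sketch: ${operand}`);
  }
  // Single $ prefix: walk the dotted field path.
  let current: unknown = doc;
  for (const part of operand.slice(1).split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Doc)[part];
  }
  return current;
}

const order: Doc = { amount: 42, customer: { name: "Alice" } };
const fieldValue = resolveOperand("$amount", order);       // 42
const literal = resolveOperand("amount", order);           // "amount"
const nested = resolveOperand("$customer.name", order);    // "Alice"
```

This is the distinction that makes { $sum: '$amount' } add up field values while { $sum: 1 } counts documents.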

$jsonSchema: Document Validation in MongoDB

$jsonSchema is a MongoDB collection validator that enforces document structure on every insert and update using JSON Schema draft-04 syntax with BSON-specific extensions. Define it at collection creation time — or add it later with collMod. The validator runs server-side on each write operation. It does not retroactively validate existing documents unless you explicitly re-validate them. The key BSON extension is the bsonType keyword, which accepts BSON type names ("objectId", "date", "decimal") in addition to standard JSON Schema types.

// ── Create collection with $jsonSchema validator ───────────────────
await db.createCollection('users', {
  validator: {
    $jsonSchema: {
      bsonType: 'object',
      required: ['email', 'role', 'createdAt'],
      additionalProperties: false,   // reject unknown fields
      properties: {
        _id:       { bsonType: 'objectId' },
        email:     {
          bsonType:    'string',
          pattern:     '^[^@]+@[^@]+\\.[^@]+$',  // double backslash so the regex receives a literal \.
          description: 'must be a valid email address',
        },
        role: {
          bsonType: 'string',
          enum:     ['admin', 'user', 'editor'],
        },
        age: {
          bsonType:    'int',
          minimum:     0,
          maximum:     150,
          description: 'must be an integer between 0 and 150',
        },
        createdAt:  { bsonType: 'date' },
        updatedAt:  { bsonType: 'date' },
        tags:       {
          bsonType: 'array',
          items:    { bsonType: 'string' },
          maxItems: 20,
        },
        address: {
          bsonType: 'object',
          required: ['city', 'country'],
          properties: {
            street:  { bsonType: 'string' },
            city:    { bsonType: 'string' },
            country: { bsonType: 'string', minLength: 2, maxLength: 2 },
          },
        },
        price: { bsonType: 'decimal' },  // BSON Decimal128
      },
    },
  },
  validationAction: 'error',   // 'error' (default) rejects the write | 'warn' logs and allows it
  validationLevel:  'strict',  // 'strict' (default) checks all writes | 'moderate' skips updates to already-invalid docs
})

// ── Add validator to existing collection ───────────────────────────
await db.command({
  collMod:    'products',
  validator: {
    $jsonSchema: {
      bsonType:   'object',
      required:   ['name', 'price'],
      properties: {
        name:  { bsonType: 'string', minLength: 1 },
        price: { bsonType: 'decimal', description: 'must be a Decimal128' },
      },
    },
  },
})

// ── View current validator ─────────────────────────────────────────
const info = await db.listCollections({ name: 'users' }).toArray()
console.log(JSON.stringify(info[0].options.validator, null, 2))

// ── $jsonSchema in a query (not just validation) ───────────────────
// Find all documents that do NOT match the schema (existing bad data)
const invalids = await db.collection('users').find({
  $nor: [{
    $jsonSchema: {
      bsonType:   'object',
      required:   ['email', 'role'],
      properties: {
        email: { bsonType: 'string' },
        role:  { bsonType: 'string', enum: ['admin', 'user', 'editor'] },
      },
    },
  }],
}).toArray()
// Useful for auditing existing data before adding strict validation

// ── TypeScript: mirror the validator with a Zod schema ─────────────
import { z } from 'zod'
// (kept in sync by hand: no official Zod-to-BSON-schema converter)
const UserZod = z.object({
  email: z.string().email(),
  role:  z.enum(['admin', 'user', 'editor']),
  age:   z.number().int().min(0).max(150).optional(),
})

$jsonSchema uses bsonType instead of the standard JSON Schema type keyword — this is the most common source of confusion. Use bsonType: "int" for 32-bit integers (not type: "integer"), bsonType: "date" for BSON Date, bsonType: "objectId" for ObjectId. The standard type: "string" and type: "boolean" also work for basic types. additionalProperties: false rejects any field not listed in properties — useful for strict schemas but requires listing every field including _id, createdAt, and updatedAt. See also our guide on JSON Schema validation for the JSON Schema standard and JSON data validation for library-level validation.
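With validationAction set to "error", a rejected write reaches application code as a server error carrying MongoDB's error code 121 (DocumentValidationFailure). A minimal sketch of mapping that to an HTTP outcome follows; toHttpOutcome and the chosen status codes are assumptions for illustration, and the check is structural so it works on any error object that exposes a numeric code.

```typescript
// MongoDB server error code for a write rejected by a collection validator.
const DOCUMENT_VALIDATION_FAILURE = 121;

type WriteOutcome = { status: number; message: string };

function toHttpOutcome(err: unknown): WriteOutcome {
  // Read `code` defensively: err may be anything that was thrown.
  const code =
    typeof err === "object" && err !== null
      ? (err as { code?: unknown }).code
      : undefined;
  if (code === DOCUMENT_VALIDATION_FAILURE) {
    return { status: 422, message: "Document failed schema validation" };
  }
  return { status: 500, message: "Unexpected database error" };
}

const rejected = toHttpOutcome({ code: 121 });
const other = toHttpOutcome(new Error("connection reset"));
```

In a route handler, this would sit in the catch block around insertOne() or updateOne(), so that schema violations produce a client error rather than a generic 500.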

TypeScript Types for MongoDB JSON Documents

The MongoDB Node.js driver is fully generic — Collection<T> accepts a TypeScript interface and types all find, insert, and update return values accordingly. The key is defining two separate interfaces: a document interface (with BSON types like ObjectId and Date) for database operations, and a DTO interface (with serialized types like string) for API responses. Mixing the two causes subtle bugs where ObjectId instances reach JSON responses and serialize incorrectly.

import { ObjectId, Decimal128, Collection, WithId, Filter, UpdateFilter } from 'mongodb'

// ── Document interface — BSON types for DB operations ─────────────
interface UserDoc {
  _id?:      ObjectId   // optional so insertOne() can omit it; WithId<UserDoc> makes it required on reads
  name:      string
  email:     string
  role:      'admin' | 'user' | 'editor'
  age?:      number
  createdAt: Date
  updatedAt: Date
}

// ── DTO interface — serialized types for API responses ─────────────
interface UserDto {
  id:        string   // _id.toString()
  name:      string
  email:     string
  role:      'admin' | 'user' | 'editor'
  age?:      number
  createdAt: string   // Date.toISOString()
  updatedAt: string
}

// ── Mapper function — document to DTO ─────────────────────────────
function toUserDto(doc: WithId<UserDoc>): UserDto {
  return {
    id:        doc._id.toString(),
    name:      doc.name,
    email:     doc.email,
    role:      doc.role,
    age:       doc.age,
    createdAt: doc.createdAt.toISOString(),
    updatedAt: doc.updatedAt.toISOString(),
  }
}

// ── Typed collection usage ─────────────────────────────────────────
const users: Collection<UserDoc> = db.collection<UserDoc>('users')

// findOne returns WithId<UserDoc> | null — _id is ObjectId
const user = await users.findOne({ email: 'alice@example.com' })
if (user) {
  const dto: UserDto = toUserDto(user)
  res.json(dto)  // safe — no ObjectId in the response
}

// insertOne accepts OptionalUnlessRequiredId<UserDoc>: _id may be omitted
// (and is generated by the driver) unless UserDoc declares it as required
await users.insertOne({
  name:      'Alice',
  email:     'alice@example.com',
  role:      'user',
  createdAt: new Date(),
  updatedAt: new Date(),
})

// Typed filter: field types are checked
const filter: Filter<UserDoc> = {
  role:      'admin',
  createdAt: { $gte: new Date('2024-01-01') },
  // _id:    'bad-string'  ← TypeScript error: string not assignable to ObjectId
}

// Typed update
const update: UpdateFilter<UserDoc> = {
  $set:   { role: 'editor', updatedAt: new Date() },
  $unset: { age: '' },
}

// ── Mongoose typed schema ──────────────────────────────────────────
import { Schema, model, InferSchemaType } from 'mongoose'

const productSchema = new Schema({
  name:     { type: String,  required: true },
  price:    { type: Number,  required: true, min: 0 },
  inStock:  { type: Boolean, default: true },
  tags:     [String],
  category: { type: String,  required: true },
})

// Infer TypeScript type from schema definition (Mongoose 6+)
type ProductDoc = InferSchemaType<typeof productSchema>
// { name: string; price: number; inStock: boolean; tags: string[]; category: string }

const Product = model('Product', productSchema)

// ── Aggregation pipeline typed result ─────────────────────────────
interface SalesReport {
  category:     string
  totalRevenue: number
  orderCount:   number
}

const report = await db.collection('orders')
  .aggregate<SalesReport>([
    { $match: { status: 'completed' } },
    { $group: { _id: '$category', totalRevenue: { $sum: '$amount' }, orderCount: { $sum: 1 } } },
    { $project: { _id: 0, category: '$_id', totalRevenue: 1, orderCount: 1 } },
  ])
  .toArray()
// report is typed as SalesReport[]. The generic is an unchecked assertion,
// so the interface must list exactly what the final $project stage emits.

The document/DTO split is the single most important TypeScript pattern for MongoDB APIs. Without it, BSON values such as ObjectId and Decimal128 reach JSON responses in whatever form their toJSON() methods produce, and internal field names like _id and __v leak into the public contract. Use the WithId<T> utility type from the driver for documents returned by findOne() and find(); it adds _id: ObjectId automatically. For insert payloads, OptionalId<T> makes _id optional, while WithoutId<T> removes it entirely. See our guides on TypeScript JSON types and JSON API design for the broader typing and response-shaping patterns.

Key Terms

BSON: Binary JSON, MongoDB's internal binary serialization format. BSON covers all JSON types and adds types JSON lacks, including ObjectId (12-byte unique identifier), Date (64-bit UTC milliseconds), Decimal128 (128-bit IEEE 754 decimal for financial precision), BinData (arbitrary binary), Regex (pattern plus flags), Int32, Int64, Timestamp, MinKey, and MaxKey. Each element starts with a type byte, and variable-length values (strings, binary data, embedded documents, arrays) carry a length prefix, so a reader can skip fields without parsing their contents. The MongoDB driver converts application objects to BSON on insert and back to application objects on read, so application code rarely handles raw BSON bytes. A single document is limited to 16 MB of BSON-encoded data.
EJSON (Extended JSON): A text format that represents BSON types within valid JSON using special $-prefixed wrapper objects. EJSON has two modes: Canonical (fully type-preserving) and Relaxed (human-readable). In Relaxed mode, ObjectId is {"$oid": "507f..."}, Date is {"$date": "2024-01-15T00:00:00Z"}, and Decimal128 is {"$numberDecimal": "19.99"}. Use the bson npm package: EJSON.stringify(doc) and EJSON.parse(str). EJSON.stringify() produces Relaxed output by default; pass { relaxed: false } in its options for Canonical output. mongoexport outputs Relaxed EJSON by default. EJSON is the correct format for round-tripping MongoDB documents through text-based systems without losing BSON type information; choose Canonical mode when every numeric type must survive the trip.
ObjectId: A 12-byte BSON type used as the default _id in MongoDB documents. The 12 bytes encode: a 4-byte Unix timestamp (seconds), a 5-byte random value generated once per process, and a 3-byte incrementing counter. The timestamp prefix makes ObjectIds approximately time-ordered, and _id.getTimestamp() returns the creation Date. In JSON, an ObjectId is conventionally represented as a 24-character lowercase hex string: "507f1f77bcf86cd799439011". In EJSON it is {"$oid": "507f..."}. Current releases of the bson package define toJSON() on ObjectId, so JSON.stringify() emits the hex string implicitly; do not rely on that, and call .toString() or use a toJSON transform so the response shape is explicit.
Mongoose: The most widely used MongoDB ODM (Object Document Mapper) for Node.js. Mongoose adds schema enforcement, validation, middleware hooks, and controlled serialization on top of the native driver. Key serialization features: the toJSON and toObject schema options accept a transform function that controls how documents are converted, and the toJSON transform runs whenever JSON.stringify() serializes a document, which is what res.json() does. The virtuals: true option in toJSON includes computed virtual fields. The lean() query option returns plain JavaScript objects without Mongoose document wrapping, bypassing toJSON transforms; it is typically several times faster for read-only queries, though the gain depends on document size and schema complexity.
aggregation pipeline: MongoDB's server-side data processing framework, composed of sequential JSON stage documents. Each stage is a plain object with a single $operator key. Common stages: $match (filter, same syntax as find()), $group (group by key with accumulator operators $sum, $avg, $addToSet), $project (field selection with 1/0 flags and JSON expression operators), $lookup (left outer join), $unwind (deconstruct array to one document per element), $addFields (add computed fields), $facet (parallel sub-pipelines). Field references use a $ prefix: "$fieldName"; system variables use $$: $$NOW, $$ROOT.
$jsonSchema: A MongoDB collection validator that enforces document structure using JSON Schema draft-04 syntax with BSON-specific extensions. Applied as a collection option via db.createCollection(name, { validator: { $jsonSchema: {...} } }) or added later with collMod. Key BSON extension: bsonType keyword accepts BSON type names ("objectId", "date", "decimal", "binData") in addition to standard JSON Schema types. validationAction: "error" (default) rejects invalid writes; "warn" logs but allows. Runs on every insert and update — does not retroactively validate existing documents. Also usable as a find() filter operator to identify non-conforming existing documents.
toJSON(): A JavaScript method called automatically by JSON.stringify() on any object that defines it. The Mongoose schema option toJSON: { transform: (doc, ret) => ret } lets you control document serialization: rename _id to id, delete __v, convert ObjectId fields to strings, include virtual fields. Current releases of the bson package define toJSON() on ObjectId (it returns the hex string); rather than depend on that implicit behavior, call .toString() explicitly or configure a Mongoose toJSON transform. Date has a toJSON() that returns toISOString(), so Date fields serialize as ISO strings with plain JSON.stringify().
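As a small illustration of the ObjectId entry above, the 4 + 5 + 3 byte layout can be read straight from the hex string. This is a sketch only: decodeObjectIdHex and ObjectIdParts are hypothetical names, and real code would normally call getTimestamp() on a driver ObjectId instead.

```typescript
// Parts of an ObjectId as described in the Key Terms entry.
interface ObjectIdParts {
  timestamp: Date   // first 4 bytes: seconds since the Unix epoch
  random: string    // next 5 bytes, kept as hex
  counter: number   // last 3 bytes
}

// Hypothetical helper (not a driver API): split a 24-character hex string.
function decodeObjectIdHex(hex: string): ObjectIdParts {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error(`not a 24-character hex string: ${hex}`)
  }
  const seconds = parseInt(hex.slice(0, 8), 16)
  return {
    timestamp: new Date(seconds * 1000),
    random: hex.slice(8, 18).toLowerCase(),
    counter: parseInt(hex.slice(18), 16),
  }
}

const parts = decodeObjectIdHex('507f1f77bcf86cd799439011')
console.log(parts.timestamp.toISOString()) // 2012-10-17T21:13:27.000Z
```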

FAQ

What is the difference between BSON and JSON in MongoDB?

BSON (Binary JSON) is MongoDB's internal binary serialization format, a binary-encoded superset of JSON. JSON supports six types (string, number, boolean, null, array, object); BSON adds types JSON lacks: ObjectId, Date (64-bit UTC milliseconds), Decimal128, BinData, Regex, Int32, Int64, Timestamp, MinKey, MaxKey, and others. BSON is not human-readable text. It is a compact binary format that stores a length ahead of each variable-length value, so a reader can skip fields without parsing them the way a JSON parser must. The MongoDB driver converts your native language objects (JavaScript objects, Python dicts) to BSON on insert and back to native objects on find, so you rarely work with raw BSON bytes. When BSON types appear in JSON exports (mongoexport), they use Extended JSON notation: {"$oid": "..."} for ObjectId, {"$date": "..."} for Date. The key practical impact: plain JSON.stringify() flattens BSON values through their toJSON() methods (ObjectId to a hex string, Date to an ISO string, Decimal128 to a {"$numberDecimal": "..."} wrapper), so type information is lost or leaks into API responses unless you handle it explicitly.

How do I serialize a MongoDB document to JSON?

Plain JSON.stringify() is lossy with MongoDB documents: it relies on each BSON class's toJSON(), so with current bson releases an ObjectId collapses to a bare hex string, a Date to an ISO string, and a Decimal128 to a {"$numberDecimal": "..."} wrapper, and nothing in the output distinguishes an ObjectId from an ordinary string. Four correct approaches: (1) EJSON.stringify() from the bson package: import { EJSON } from "bson"; EJSON.stringify(doc) preserves types as Extended JSON wrappers. (2) Custom replacer: JSON.stringify(doc, function (k, v) { const raw = this[k]; return raw instanceof ObjectId ? raw.toString() : v }) converts ObjectId to a hex string; the replacer has to read this[k] because toJSON() has already been applied to v by the time the replacer runs. (3) Mongoose toJSON transform, configured at schema level: toJSON: { transform: (_doc, ret) => { ret.id = ret._id.toString(); delete ret._id; return ret; } }, applied automatically on every res.json() call. (4) Aggregation $toString: convert ObjectId to string at the database level in a pipeline projection. For REST APIs, the Mongoose toJSON transform is most maintainable; for internal microservices, EJSON preserves type information for re-ingestion.
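One detail of the custom-replacer approach is easy to miss: JSON.stringify() calls toJSON() on a value before passing it to the replacer, so an instanceof check on the value argument never sees the original object. The sketch below shows this with the built-in Date, which needs no driver. The { $ms: ... } wrapper and the typeAwareReplacer name are made up for this example and are not an EJSON format.

```typescript
// The replacer receives the already-converted value; the original object is
// still reachable through the holder, which JSON.stringify binds to `this`.
function typeAwareReplacer(
  this: Record<string, unknown>,
  key: string,
  value: unknown,
): unknown {
  const original = this[key]
  if (original instanceof Date) {
    // Hypothetical wire format for this sketch: epoch milliseconds.
    return { $ms: original.getTime() }
  }
  return value
}

const doc = { name: 'Alice', createdAt: new Date(Date.UTC(2024, 0, 15)) }

const plain = JSON.stringify(doc)
// {"name":"Alice","createdAt":"2024-01-15T00:00:00.000Z"}

const custom = JSON.stringify(doc, typeAwareReplacer)
// {"name":"Alice","createdAt":{"$ms":1705276800000}}
```

The replacer must be a regular function rather than an arrow function, because an arrow function does not receive the holder as this.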

What is MongoDB Extended JSON (EJSON)?

Extended JSON (EJSON) is a text representation of BSON that preserves type information using $-prefixed wrapper objects within valid JSON. EJSON has two modes: Canonical (fully type-preserving, verbose) and Relaxed (human-readable, preferred for APIs). In Relaxed mode: ObjectId is {"$oid": "507f1f77bcf86cd799439011"}, Date is {"$date": "2024-01-15T00:00:00Z"}, Decimal128 is {"$numberDecimal": "19.99"}, and Int32, Int64, and Double values are plain JSON numbers. Canonical mode wraps numbers as well, for example Int64 as {"$numberLong": "9007199254740993"}, which is the only way a value that large survives a JavaScript JSON parser intact. Use the bson package: EJSON.stringify(doc, null, 2) to serialize (Relaxed by default; pass { relaxed: false } as the options argument for Canonical) and EJSON.parse(str) to restore types. mongoexport outputs Relaxed EJSON by default; pass --jsonFormat=canonical for Canonical mode. EJSON is the correct format when you need to round-trip MongoDB documents through text-based systems (message queues, S3 files, cross-service APIs) without losing type information.
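To make the two modes concrete, here is a hand-rolled sketch of the wrapper shapes for a Date and for a large Int64. It is an illustration only, with hypothetical function names; production code should call EJSON.stringify() from the bson package rather than build wrappers by hand.

```typescript
type RelaxedDate   = { $date: string }
type CanonicalDate = { $date: { $numberLong: string } }

// Relaxed: ISO-8601 string. toISOString() always prints milliseconds,
// whereas the examples in this guide show them omitted when zero.
function relaxedDate(d: Date): RelaxedDate {
  return { $date: d.toISOString() }
}

// Canonical: epoch milliseconds as a decimal string inside $numberLong.
function canonicalDate(d: Date): CanonicalDate {
  return { $date: { $numberLong: String(d.getTime()) } }
}

const when = new Date(Date.UTC(2024, 0, 15))
const relaxed = relaxedDate(when)       // { $date: '2024-01-15T00:00:00.000Z' }
const canonical = canonicalDate(when)   // { $date: { $numberLong: '1705276800000' } }

// Why Canonical keeps Int64 as a string: this value is 2^53 + 1, which a
// JavaScript number cannot represent exactly.
const int64Text = '9007199254740993'
const asNumber = Number(int64Text)      // 9007199254740992, off by one
const canonicalInt64 = { $numberLong: int64Text }
```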

How do I query MongoDB with JSON-like filters?

MongoDB query filters are plain JavaScript objects (JSON documents) passed to find(), findOne(), updateOne(), and deleteOne(). A plain value is an implicit $eq: collection.find({ status: "active" }). Comparison operators nest inside the field object: { price: { $gt: 10, $lte: 100 } }. Array matching: { tags: { $in: ["sale", "featured"] } } matches any element; $all requires all elements. Logical: { $or: [{...}, {...}] }. Nested field queries use dot notation: { "address.city": "London" }. For arrays of embedded documents, use $elemMatch when multiple conditions must match the same element — without it, conditions can match across different elements. Field existence: { field: { $exists: true } }. BSON type filter: { _id: { $type: "objectId" } }. The same JSON filter syntax works in aggregation $match stages.
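The filter shapes described above can be written out as plain object literals. In this sketch QueryFilter is a loose stand-in for the driver's Filter<T>, and the items, sku, and qty field names are hypothetical, chosen only to contrast $elemMatch with dot notation.

```typescript
// Loose stand-in for the driver's Filter<T>, so the sketch has no dependencies.
type QueryFilter = Record<string, unknown>

const byStatus: QueryFilter = { status: 'active' }                  // implicit $eq
const byPrice: QueryFilter  = { price: { $gt: 10, $lte: 100 } }     // range on one field
const anyTag: QueryFilter   = { tags: { $in: ['sale', 'featured'] } }
const allTags: QueryFilter  = { tags: { $all: ['sale', 'featured'] } }
const either: QueryFilter   = { $or: [byStatus, byPrice] }
const byCity: QueryFilter   = { 'address.city': 'London' }          // dot notation

// Both conditions must hold for the SAME array element.
const sameItem: QueryFilter = {
  items: { $elemMatch: { sku: 'A1', qty: { $gte: 2 } } },
}

// Without $elemMatch, each condition may be satisfied by a DIFFERENT element.
const anyItems: QueryFilter = { 'items.sku': 'A1', 'items.qty': { $gte: 2 } }

// Filters are ordinary data, so they can be logged or sent over the wire.
const wire = JSON.stringify(byPrice)   // {"price":{"$gt":10,"$lte":100}}
```

Any of these objects can be passed unchanged to find() or used as the body of a $match stage.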

How do I validate MongoDB document structure with $jsonSchema?

Add a $jsonSchema validator when creating the collection or via collMod: db.createCollection("users", { validator: { $jsonSchema: { bsonType: "object", required: ["email"], properties: { email: { bsonType: "string", pattern: "^.+@.+$" }, age: { bsonType: "int", minimum: 0 } } } } }). Use bsonType instead of type for BSON-specific types: "objectId", "date", "decimal", "binData". Standard JSON Schema type values ("string", "boolean") also work. Set validationAction: "error" (default) to reject invalid writes, or "warn" to log but allow. Set validationLevel: "strict" (default) to validate all inserts and updates, or "moderate" to validate inserts plus updates to documents that already pass, leaving existing non-conforming documents writable. $jsonSchema runs on each write operation and does not validate existing documents retroactively. Use $jsonSchema in a find({ $nor: [{ $jsonSchema: {...} }] }) query to audit non-conforming existing documents.
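For a collection that already exists, the same schema can be reused in two places: the collMod command document and the audit filter. The sketch below only builds those two objects; passing them to db.command() and users.find() is assumed to happen elsewhere, and the variable names are illustrative.

```typescript
// Same schema as in the createCollection example above.
const usersSchema = {
  bsonType: 'object',
  required: ['email'],
  properties: {
    email: { bsonType: 'string', pattern: '^.+@.+$' },
    age:   { bsonType: 'int', minimum: 0 },
  },
}

// Command document for db.command(...): attach the validator to an existing
// collection, checking inserts and updates to already-valid documents, and
// logging violations instead of rejecting them during a migration window.
const collModCommand = {
  collMod: 'users',
  validator: { $jsonSchema: usersSchema },
  validationLevel: 'moderate',
  validationAction: 'warn',
}

// Filter for users.find(...): every document that does NOT satisfy the schema.
const nonConformingFilter = { $nor: [{ $jsonSchema: usersSchema }] }
```

Running the audit filter first gives a count of documents that would need fixing before switching validationAction to "error".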

How do I use Mongoose to return JSON from a MongoDB query?

Configure a toJSON transform at the schema level — it is called automatically whenever JSON.stringify() or res.json() serializes a Mongoose document: new Schema({...}, { toJSON: { virtuals: true, transform: (_doc, ret) => { ret.id = ret._id.toString(); delete ret._id; delete ret.__v; return ret; } } }). Set virtuals: true to include virtual fields computed from schema properties (e.g., fullName derived from firstName and lastName). The transform receives doc (the original Mongoose document) and ret (the plain object copy being serialized) — modify and return ret. For .lean() queries, toJSON transforms are not applied — lean returns plain objects from the driver. Use lean for high-performance read-only endpoints and add a separate mapper function for DTO conversion. Populated referenced documents also call their schema's toJSON recursively.
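Because the transform is just a function of (doc, ret), it can be written and unit-tested without Mongoose. The sketch below is one way to do that; toApiShape and the Ret type are hypothetical names, and the stand-in _id object only mimics the toString() method the transform relies on.

```typescript
// Minimal shape the transform needs: an optional _id with toString().
type Ret = Record<string, unknown> & { _id?: { toString(): string } }

// Same body as the inline transform above, extracted as a named function.
function toApiShape(_doc: unknown, ret: Ret): Record<string, unknown> {
  if (ret._id !== undefined) {
    ret.id = ret._id.toString()   // expose a plain string id
    delete ret._id                // hide the internal field name
  }
  delete ret.__v                  // drop Mongoose's version key
  return ret
}

// Assumed wiring, following the schema option described above:
//   new Schema({ ... }, { toJSON: { virtuals: true, transform: toApiShape } })

// Exercising it with a stand-in for a serialized document:
const fakeId = { toString: () => '507f1f77bcf86cd799439011' }
const shaped = toApiShape(null, { _id: fakeId, __v: 0, name: 'Alice' })
// shaped is { name: 'Alice', id: '507f1f77bcf86cd799439011' }
```

The same function can double as the mapper for .lean() results, which skip the schema-level transform.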

How do I handle MongoDB ObjectId in JSON APIs?

ObjectId is a 12-byte BSON type that JSON APIs should expose as a 24-character hex string. The bson package's ObjectId.toJSON() produces that string implicitly, but converting explicitly keeps the response shape under your control and lets you rename _id to id. Three patterns for API responses: (1) Mongoose toJSON transform: ret.id = ret._id.toString(); delete ret._id, applied on every res.json() call. (2) Aggregation $toString: { $project: { id: { $toString: "$_id" }, _id: 0, name: 1 } } converts at the database level. (3) Manual .toString(): call doc._id.toString() explicitly when building DTO objects. For incoming ObjectId strings in API requests (e.g., a userId path parameter), convert back before querying: new ObjectId(req.params.id). Validate that the value is a 24-character hex string first; an invalid string throws BSONError. In TypeScript, type incoming IDs as string and convert to ObjectId at the service/repository boundary.
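The validation step for incoming IDs can be a few lines. This is a minimal sketch under the assumption that only the 24-character hex form should be accepted at the API boundary; parseIdParam is a hypothetical helper, and the handler fragment in the comment assumes an Express-style req and res plus the driver's ObjectId.

```typescript
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/

// Hypothetical helper: returns the normalized lowercase hex string, or null
// if the input is not a string of exactly 24 hex characters.
function parseIdParam(raw: unknown): string | null {
  if (typeof raw !== 'string' || !OBJECT_ID_HEX.test(raw)) {
    return null
  }
  return raw.toLowerCase()
}

// Sketch of use in a handler (not runnable here; `users`, `req`, `res`, and
// ObjectId are assumed to come from the application and the mongodb driver):
//   const hex = parseIdParam(req.params.id)
//   if (hex === null) return res.status(400).json({ error: 'invalid id' })
//   const user = await users.findOne({ _id: new ObjectId(hex) })

const ok = parseIdParam('507F1F77BCF86CD799439011')   // '507f1f77bcf86cd799439011'
const tooShort = parseIdParam('507f1f77')             // null
const notString = parseIdParam(12345)                 // null
```

Rejecting bad input with a 400 before constructing the ObjectId avoids turning a client mistake into an unhandled exception.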

How do I type MongoDB documents in TypeScript?

Define two separate interfaces: a document interface for database operations (with BSON types like ObjectId and Date) and a DTO interface for API responses (with serialized types: string for IDs and dates). Pass the document interface to the collection generic: db.collection<UserDoc>("users"). findOne() returns WithId<UserDoc> | null, with _id typed as ObjectId. insertOne() accepts OptionalUnlessRequiredId<UserDoc>, so the driver generates _id for you as long as the interface does not declare _id as a required field. Write a mapper function to convert from document interface to DTO: function toDto(doc: WithId<UserDoc>): UserDto { return { id: doc._id.toString(), ... } }. With Mongoose, use InferSchemaType<typeof schema> to infer the TypeScript type automatically from the schema definition (Mongoose 6+). Type aggregation results with the pipeline generic: collection.aggregate<ResultType>([...]); the generic is not checked against the pipeline, so keep the two in sync by hand. Never use any for document types, because it hides BSON serialization bugs at compile time.