MongoDB JSON Documents: BSON, EJSON, Mongoose & $jsonSchema
MongoDB stores data as BSON (Binary JSON), a binary-encoded superset of JSON that adds types JSON lacks: ObjectId, Date, Binary, Decimal128, Regular Expression, and Timestamp. When you read from MongoDB via the Node.js driver, BSON values arrive as JavaScript objects: an ObjectId is an ObjectId instance (not a plain string) and a BSON Date is a JavaScript Date. Plain JSON.stringify() is therefore not a reliable way to produce the shape your API consumers expect. What it emits for ObjectId, Decimal128, and Binary is whatever each class's toJSON() method returns in the installed bson version, and the BSON type information is lost, so make the conversion explicit with a toJSON transform, a custom replacer, or EJSON. This guide covers BSON vs JSON differences, Extended JSON (EJSON) for type-preserving serialization, Mongoose schema-to-JSON mapping, aggregation pipeline JSON operators, and $jsonSchema validation. Every example includes TypeScript types with Mongoose and the native driver.
BSON vs JSON: MongoDB's Type System
BSON is MongoDB's binary serialization format, not a text format. Where JSON supports six types (string, number, boolean, null, array, object), BSON defines about 20 in total, adding among others: ObjectId (12-byte unique ID), Date (64-bit UTC milliseconds), Decimal128 (128-bit IEEE 754 decimal for financial math), BinData (arbitrary binary), Regex (pattern plus flags), Int32, Int64, Timestamp (internal replication clock), MinKey, and MaxKey. The MongoDB driver handles BSON serialization automatically: you insert plain JavaScript objects and the driver converts them to BSON on the wire. The practical impact is that types with no JSON equivalent must be handled explicitly when serializing to JSON for APIs or storage.
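To see why the extra numeric types matter, the following sketch uses only built-in JavaScript numbers (no driver calls) to show the two failure modes that Int64 and Decimal128 are meant to avoid. The variable names are illustrative.

```typescript
// JSON numbers become IEEE 754 doubles in JavaScript, which represent
// integers exactly only up to Number.MAX_SAFE_INTEGER (2^53 - 1).
const asDouble = Number('9007199254740993') // 2^53 + 1

console.log(asDouble)                       // 9007199254740992 (off by one)
console.log(Number.isSafeInteger(asDouble)) // false

// Binary floating point cannot represent most decimal fractions exactly,
// which is the rounding problem Decimal128 is meant to avoid.
const sum = 0.1 + 0.2
console.log(sum === 0.3)                          // false
console.log(Math.abs(sum - 0.3) < Number.EPSILON) // true: the error is tiny but nonzero
```

Storing such values as Int64 or Decimal128 keeps them exact in the database; the precision question then moves to the point where they are converted back to JavaScript numbers or JSON.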
import { MongoClient, ObjectId, Decimal128, Long } from 'mongodb'
const client = new MongoClient('mongodb://localhost:27017')
const db = client.db('shop')
// ── Inserting BSON-typed fields ────────────────────────────────────
await db.collection('products').insertOne({
_id: new ObjectId(), // 12-byte unique ID (auto if omitted)
name: 'Widget Pro',
price: Decimal128.fromString('19.99'), // no float rounding — financial-safe
createdAt: new Date(), // BSON Date — 64-bit UTC ms
tags: ['electronics', 'sale'],
specs: { weight: 150, unit: 'g' }, // embedded document
inStock: true,
eventId: Long.fromString('9007199254740993'), // 64-bit int — beyond JS safe range
})
// ── ObjectId encodes its own creation timestamp ────────────────────
const id = new ObjectId('507f1f77bcf86cd799439011')
console.log(id.getTimestamp()) // Date: 2012-10-17T21:13:27.000Z
// No separate createdAt index needed for approximate creation-time ordering
// ── BSON type comparison to JSON ──────────────────────────────────
// JSON type        BSON type       Plain JSON.stringify() result (current bson package)
// ─────────────────────────────────────────────────────────────────
// number           Int32 / Int64   number (Int64 loses precision above 2^53 when read as a JS number)
// number           Decimal128      {"$numberDecimal":"19.99"} (a wrapper object, not a JSON number)
// string           String          "string"
// boolean          Boolean         true / false
// null             Null            null
// (no JSON equiv)  ObjectId        "507f1f77bcf86cd799439011" (hex string via toJSON; type is lost)
// (no JSON equiv)  Date            "2024-01-15T00:00:00.000Z" (via Date.toJSON; type is lost)
// (no JSON equiv)  BinData         base64 string (via toJSON; subtype is lost)
// ── What BSON types look like in JSON exports (Extended JSON) ──────
const ejsonRepresentations = {
objectId: { '$oid': '507f1f77bcf86cd799439011' },
date: { '$date': '2024-01-15T00:00:00.000Z' },
decimal128: { '$numberDecimal': '19.99' },
int64: { '$numberLong': '9007199254740993' },
binData: { '$binary': { base64: 'abc123==', subType: '00' } },
regex: { '$regularExpression': { pattern: 'abc', options: 'i' } },
}
// ── Check BSON size in mongosh ─────────────────────────────────────
// Object.bsonsize(db.products.findOne()) → size in bytes (max 16 MB)
The 16 MB BSON document size limit applies to the binary-serialized form, not the JSON text size. A document with many Decimal128 fields may be smaller in BSON than its JSON representation. ObjectId's 12-byte encoding is half the size of its 24-character hex string. The key practical difference from JSON: plain JSON.stringify() is not a reliable way to serialize MongoDB documents. It flattens ObjectId, Date, and BinData to strings, emits Decimal128 as a wrapper object, and drops all BSON type information, so the result can neither be re-ingested faithfully nor assumed to match your API contract. Use EJSON when types must survive, or a toJSON transform or DTO mapper when consumers need plain values.
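The timestamp embedded in an ObjectId can be recovered without the driver, which makes the byte layout concrete. This sketch assumes the standard layout in which the first 4 bytes (8 hex characters) are a big-endian Unix time in seconds. `objectIdTimestamp` is a hypothetical helper name; in application code the driver's own `getTimestamp()` is the better choice.

```typescript
// Decode the creation time from a 24-character ObjectId hex string.
function objectIdTimestamp(hex: string): Date {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected a 24-character hex string')
  }
  const seconds = parseInt(hex.slice(0, 8), 16) // first 4 bytes = Unix seconds
  return new Date(seconds * 1000)
}

const created = objectIdTimestamp('507f1f77bcf86cd799439011')
console.log(created.toISOString()) // 2012-10-17T21:13:27.000Z
```

Because the timestamp occupies the leading bytes, sorting by `_id` gives approximate creation-time order with one-second resolution.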
Extended JSON (EJSON): Type-Preserving Serialization
Extended JSON (EJSON) is a text format that represents BSON types as JSON objects with special $-prefixed keys. EJSON has two modes: Canonical (fully type-preserving, verbose) and Relaxed (human-readable, loses some type fidelity). Use EJSON when you need to round-trip MongoDB documents through text-based systems — message queues, REST responses that must be re-ingested, export files — without losing ObjectId or Date type information. The bson npm package provides EJSON.stringify() and EJSON.parse().
import { EJSON } from 'bson'
import { ObjectId } from 'mongodb'
const doc = {
_id: new ObjectId('507f1f77bcf86cd799439011'),
name: 'Alice',
createdAt: new Date('2024-01-15T00:00:00Z'),
score: 1234567890123456789n, // BigInt: plain JSON.stringify() throws a TypeError on this value (its EJSON output is omitted below)
}
// ── EJSON Canonical mode — fully type-preserving ───────────────────
const canonical = EJSON.stringify(doc, null, 2, { relaxed: false })
// {
// "_id": { "$oid": "507f1f77bcf86cd799439011" },
// "name": "Alice",
// "createdAt": { "$date": { "$numberLong": "1705276800000" } },
// }
// ── EJSON Relaxed mode (default) — human-readable ─────────────────
const relaxed = EJSON.stringify(doc, null, 2)
// {
// "_id": { "$oid": "507f1f77bcf86cd799439011" },
// "name": "Alice",
// "createdAt": { "$date": "2024-01-15T00:00:00.000Z" }, ← ISO string
// }
// ── Parse EJSON back — types are restored ─────────────────────────
const parsed = EJSON.parse(relaxed)
console.log(parsed._id instanceof ObjectId) // true
console.log(parsed.createdAt instanceof Date) // true
// ── mongoexport uses Relaxed EJSON by default ──────────────────────
// mongoexport --uri="..." --db=mydb --collection=users --out=users.json
// Add --jsonFormat=canonical for Canonical EJSON output
// ── Using EJSON in a REST API ──────────────────────────────────────
// When the consumer will re-ingest the data into MongoDB:
import express from 'express'
const app = express()
app.get('/api/users/:id', async (req, res) => {
const user = await db.collection('users').findOne({
_id: new ObjectId(req.params.id),
})
if (!user) return res.status(404).json({ error: 'Not found' })
// Use EJSON to preserve ObjectId and Date types for re-ingestion
res.type('application/json').send(EJSON.stringify(user))
})
// ── JSON.stringify vs EJSON.stringify comparison ──────────────────
const testDoc = { _id: new ObjectId(), ts: new Date() }
JSON.stringify(testDoc)
// {"_id":"<24-char hex>","ts":"2024-01-15T00:00:00.000Z"} ← plain strings, BSON types lost
EJSON.stringify(testDoc)
// {"_id":{"$oid":"..."},"ts":{"$date":"2024-01-15T00:00:00.000Z"}} ← types preserved
Choose EJSON Canonical when the output will be parsed by a system that must reconstruct exact BSON types, for example mongoimport or another MongoDB driver. Choose Relaxed when readability matters more than exact numeric types: dates appear as ISO strings and numbers as plain JSON numbers, although ObjectId is still wrapped as {"$oid": "..."}. For most REST APIs, neither EJSON mode is ideal: consumers expect {"id": "507f..."}, not {"_id": {"$oid": "507f..."}}. Use a Mongoose toJSON transform or a manual DTO mapper instead.
Mongoose: Mapping JSON to MongoDB Documents
Mongoose is the most widely used MongoDB ODM (Object Document Mapper) for Node.js. It adds schema enforcement, validation, middleware (hooks), and controlled JSON serialization on top of the native driver. The toJSON() and toObject() schema options control how Mongoose documents serialize — critical for REST API responses where consumers expect plain strings, not ObjectId instances. Configure the transform once at the schema level and every res.json() call uses it automatically.
import mongoose, { Schema, model, Document, Types } from 'mongoose'
// ── Define TypeScript interface for the document ───────────────────
interface IUser {
_id: Types.ObjectId
name: string
email: string
role: 'admin' | 'user'
createdAt: Date
updatedAt: Date
}
// ── Schema with toJSON transform ───────────────────────────────────
const userSchema = new Schema<IUser>(
{
name: { type: String, required: true },
email: { type: String, required: true, unique: true, lowercase: true },
role: { type: String, enum: ['admin', 'user'], default: 'user' },
},
{
timestamps: true, // adds createdAt and updatedAt automatically
toJSON: {
virtuals: true, // include virtual fields (e.g. profileUrl)
transform: (_doc, ret) => {
ret.id = ret._id.toString() // expose id as string
delete ret._id // remove _id (ObjectId)
delete ret.__v // remove version key
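// Convert Date to ISO string explicitly. Date.toJSON() does this anyway;
// the instanceof guard leaves the value untouched if it is not a Date
// (for example, a string stored in the DB), where toISOString() would throw.
if (ret.createdAt instanceof Date) ret.createdAt = ret.createdAt.toISOString()
if (ret.updatedAt instanceof Date) ret.updatedAt = ret.updatedAt.toISOString()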
return ret
},
},
}
)
// ── Add a virtual field ────────────────────────────────────────────
userSchema.virtual('profileUrl').get(function () {
return `https://jsonic.io/users/${this._id.toString()}`
})
const User = model<IUser>('User', userSchema)
// ── Querying and serializing ───────────────────────────────────────
const user = await User.findOne({ email: 'alice@example.com' })
// res.json() calls JSON.stringify() which calls toJSON() on Mongoose doc
// Output: { id: "507f...", name: "Alice", email: "alice@...", role: "user",
// createdAt: "2024-01-15T00:00:00.000Z", ..., profileUrl: "..." }
// ── lean() bypasses toJSON ─────────────────────────────────────────
// lean() returns a plain JS object — toJSON transform NOT applied
const rawDoc = await User.findOne({ email: 'alice@example.com' }).lean()
// rawDoc may be null; when present, rawDoc._id is ObjectId and rawDoc.__v is number, so map manually
const dto = rawDoc && { id: rawDoc._id.toString(), name: rawDoc.name, email: rawDoc.email }
// ── Populate + toJSON ──────────────────────────────────────────────
// Populated referenced documents also call toJSON recursively
interface IPost {
_id: Types.ObjectId
title: string
author: Types.ObjectId | IUser
}
const postSchema = new Schema<IPost>({
title: String,
author: { type: Schema.Types.ObjectId, ref: 'User' },
})
const Post = model<IPost>('Post', postSchema)
const post = await Post.findById(postId).populate('author')
// post.toJSON() → { id: "...", title: "...", author: { id: "...", name: "..." } }
// author is also transformed by User's toJSON schema option
The lean() option returns plain old JavaScript objects (POJOs) straight from the driver, bypassing Mongoose document instantiation, getters, virtuals, and toJSON transforms. Skipping hydration makes lean queries markedly faster and lighter on memory (often cited as 3-5x faster, though the gain depends on document size and schema complexity), which makes them ideal for read-only endpoints where you only need data, not document methods. However, you must handle ObjectId-to-string conversion manually with lean. A common pattern: use .lean() in API routes and call a dedicated DTO mapper function on the result. See also our guide on TypeScript JSON types for typing the DTO layer.
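The reason lean() results skip the transform is mechanical: `JSON.stringify()` only calls `toJSON()` on objects that have that method, and a plain object has none. The sketch below shows this with stand-in classes rather than Mongoose itself; `HydratedUser` is a hypothetical name that plays the role of a hydrated Mongoose document.

```typescript
// Stand-in for a hydrated document: JSON.stringify() calls toJSON() if present.
class HydratedUser {
  _id: string
  name: string
  __v: number

  constructor(id: string, name: string, version: number) {
    this._id = id
    this.name = name
    this.__v = version
  }

  // Plays the role of the schema-level toJSON transform.
  toJSON(): { id: string; name: string } {
    return { id: this._id, name: this.name }
  }
}

const hydrated = new HydratedUser('507f1f77bcf86cd799439011', 'Alice', 0)

// Stand-in for a lean() result: same data, but a plain object with no methods.
const lean = { _id: hydrated._id, name: hydrated.name, __v: hydrated.__v }

console.log(JSON.stringify(hydrated)) // {"id":"507f1f77bcf86cd799439011","name":"Alice"}
console.log(JSON.stringify(lean))     // {"_id":"507f1f77bcf86cd799439011","name":"Alice","__v":0}
```

The second output is what leaks to API consumers when a lean result is passed to `res.json()` without a mapper.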
JSON.stringify() with MongoDB Documents
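JSON.stringify() is easy to misuse with MongoDB documents because it never fails loudly: every BSON value is serialized in whatever shape its class's toJSON() method returns. With current versions of the bson package, ObjectId becomes its 24-character hex string, Binary becomes a base64 string, and Decimal128 becomes a {"$numberDecimal": "..."} wrapper object rather than a number; a class without toJSON() would be serialized as a plain object of its internal fields. Date has toJSON() and produces an ISO string. In every case the BSON type distinction is lost. To control the output, pass a custom replacer function as the second argument to JSON.stringify(), use EJSON, or configure Mongoose's toJSON schema option.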
import { ObjectId, Decimal128, Binary } from 'mongodb'
// ── The problem ────────────────────────────────────────────────────
const doc = {
_id: new ObjectId('507f1f77bcf86cd799439011'),
price: Decimal128.fromString('19.99'),
ts: new Date('2024-01-15'),
}
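JSON.stringify(doc)
// {"_id":"507f1f77bcf86cd799439011","price":{"$numberDecimal":"19.99"},"ts":"2024-01-15T00:00:00.000Z"}
// price is a wrapper object, not a number; _id and ts are strings with no type information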
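// ── Solution 1: Custom replacer ────────────────────────────────────
// JSON.stringify() calls toJSON() before the replacer, so inspect the raw
// value on the holder object (this[key]) rather than the value argument.
function mongoReplacer(this: Record<string, unknown>, key: string, value: unknown): unknown {
const raw = this[key]
if (raw instanceof ObjectId) return raw.toString() // "507f..."
if (raw instanceof Decimal128) return parseFloat(raw.toString()) // 19.99 (lossy for high-precision values; return the string to keep precision)
if (raw instanceof Binary) return raw.toString('base64')
return value
}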
JSON.stringify(doc, mongoReplacer, 2)
// {
// "_id": "507f1f77bcf86cd799439011",
// "price": 19.99,
// "ts": "2024-01-15T00:00:00.000Z"
// }
// ── Solution 2: Transform before stringify ─────────────────────────
function serializeDoc<T extends Record<string, unknown>>(doc: T): Record<string, unknown> {
return Object.fromEntries(
Object.entries(doc).map(([k, v]) => {
if (v instanceof ObjectId) return [k, v.toString()]
if (v instanceof Decimal128) return [k, v.toString()]
if (v instanceof Date) return [k, v.toISOString()]
if (v !== null && typeof v === 'object' && !Array.isArray(v))
return [k, serializeDoc(v as Record<string, unknown>)]
return [k, v]
})
)
}
// ── Solution 3: EJSON.stringify (bson package) ─────────────────────
import { EJSON } from 'bson'
EJSON.stringify(doc)
// {"_id":{"$oid":"507f..."},"price":{"$numberDecimal":"19.99"},"ts":{"$date":"..."}}
// Correct — but uses Extended JSON wrapper objects, not plain strings
// ── Solution 4: Mongoose toJSON schema option (recommended) ────────
// Configure once, every res.json() call uses it automatically — see Section 3
// ── Express middleware: safe JSON response ─────────────────────────
import express from 'express'
const app = express()
// Override res.json to use EJSON — useful for internal microservices
app.use((_req, res, next) => {
const originalJson = res.json.bind(res)
res.json = function (data: unknown) {
return originalJson(JSON.parse(EJSON.stringify(data)))
}
next()
})
// ── TypeScript: detect serialization issues at compile time ────────
// Define a serialized type where ObjectId → string, Date → string
type Serialized<T> = {
[K in keyof T]: T[K] extends ObjectId ? string
: T[K] extends Date ? string
: T[K] extends Decimal128 ? number
: T[K]
}
The custom replacer approach is the lowest-overhead solution for native driver queries: no extra dependencies, no schema setup. However, it must be kept in sync manually as new BSON types are used. Mongoose's toJSON transform is the most maintainable approach for applications using Mongoose, because it is schema-scoped and applies automatically on every serialization. For microservices that pass MongoDB documents between services via JSON, EJSON is the safest choice: it is still valid JSON, parseable by any JSON parser, and consumers that use the bson package can restore types with EJSON.parse(). Relaxed mode is the more readable of the two; prefer Canonical mode when exact numeric types must survive the round trip.
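One subtlety when writing a replacer: `JSON.stringify()` calls a value's `toJSON()` method before it calls the replacer, so `instanceof` checks on the `value` argument never see objects that define `toJSON()`. The sketch below demonstrates this with the built-in Date class; the original object is still reachable as `this[key]`. The function names are illustrative.

```typescript
const sample = { ts: new Date('2024-01-15T00:00:00Z') }

// Naive replacer: by the time it runs, Date.prototype.toJSON() has already
// turned the Date into a string, so the instanceof branch never fires.
function naiveReplacer(_key: string, value: unknown): unknown {
  return value instanceof Date ? value.getTime() : value
}

// Holder-aware replacer: `this` is the object that contains the key, so
// this[key] is the original, unconverted value.
function rawReplacer(this: Record<string, unknown>, key: string, value: unknown): unknown {
  const raw = this[key]
  return raw instanceof Date ? raw.getTime() : value
}

const naive = JSON.stringify(sample, naiveReplacer)
const raw = JSON.stringify(sample, rawReplacer)

console.log(naive) // {"ts":"2024-01-15T00:00:00.000Z"}
console.log(raw)   // {"ts":1705276800000}
```

The same rule applies to any class that defines `toJSON()`, which is why a replacer meant to reshape such values should read `this[key]`.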
Aggregation Pipeline: JSON Query Operators
MongoDB's aggregation pipeline uses JSON-like stage documents to express data transformations. Each stage is a plain object with a single $operator key. The $project stage controls field selection with JSON numeric flags (1 = include, 0 = exclude) — identical syntax to find() projections. The $match stage uses the same query operator syntax as find(). Computed expressions in $project and $addFields use JSON operator objects like {"$concat": ["$first", " ", "$last"]}.
const col = db.collection('orders')
// ── $match — JSON query filter (same syntax as find()) ─────────────
// ── $group — aggregate with JSON accumulator operators ─────────────
// ── $project — JSON field selection: 1 include, 0 exclude ──────────
const report = await col.aggregate([
{
$match: { // filter stage — uses index if first
status: 'completed',
createdAt: { $gte: new Date('2024-01-01') },
amount: { $gt: 0 },
},
},
{
$group: {
_id: '$category', // group key — field reference with $prefix
totalRevenue: { $sum: '$amount' },
orderCount: { $sum: 1 },
avgOrder: { $avg: '$amount' },
customers: { $addToSet: '$customerId' }, // unique set accumulator
},
},
{
$project: {
_id: 0, // 0 = exclude
category: '$_id', // rename: _id → category
totalRevenue: { $round: ['$totalRevenue', 2] }, // computed expression
orderCount: 1, // 1 = include as-is
avgOrder: { $round: ['$avgOrder', 2] },
uniqueCustomers: { $size: '$customers' }, // array length operator
},
},
{ $sort: { totalRevenue: -1 } },
{ $limit: 10 },
]).toArray()
// ── $lookup — JSON join between collections ─────────────────────────
const ordersWithUsers = await col.aggregate([
{
$lookup: {
from: 'users',
localField: 'customerId',
foreignField: '_id',
as: 'customer',
},
},
{ $unwind: '$customer' },
{
$project: {
orderId: { $toString: '$_id' }, // ObjectId → string in pipeline
amount: 1,
customerName: '$customer.name',
customerEmail: '$customer.email',
_id: 0,
},
},
]).toArray()
// ── $addFields with JSON expressions ───────────────────────────────
await db.collection('users').aggregate([
{
$addFields: {
fullName: { $concat: ['$firstName', ' ', '$lastName'] },
ageInDays: { $dateDiff: { startDate: '$birthDate', endDate: '$$NOW', unit: 'day' } },
idAsString: { $toString: '$_id' }, // ObjectId → string within pipeline
},
},
]).toArray()
// ── $facet — run multiple sub-pipelines in one pass ────────────────
const searchPage = await db.collection('products').aggregate([
{ $match: { $text: { $search: 'widget' }, inStock: true } },
{
$facet: {
results: [
{ $addFields: { score: { $meta: 'textScore' } } },
{ $sort: { score: -1 } },
{ $limit: 10 },
{ $project: { name: 1, price: 1, _id: 0, score: 1 } },
],
totalCount: [{ $count: 'count' }],
categories: [
{ $group: { _id: '$category', count: { $sum: 1 } } },
{ $sort: { count: -1 } },
],
},
},
]).toArray()
The JSON field-reference convention in aggregation uses a $ prefix to distinguish field names from literal string values: "$amount" means the value of the amount field, while "amount" (without $) is a literal string. System variables use $$: $$NOW (current timestamp), $$ROOT (entire input document), $$CURRENT (current document being processed). The same expression language is relevant to JSON API design when MongoDB is the backing store. Place $match first, on indexed fields where possible: it reduces the document count before expensive stages like $lookup and $group.
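To make the semantics of the first pipeline concrete, here is a rough in-memory equivalent of its `$match`, `$group`, `$project`, and `$sort` stages over a hypothetical array of orders. It only illustrates what each stage computes (the `Order` shape and the sample data are assumptions); the real pipeline runs server-side and can use indexes.

```typescript
interface Order { category: string; status: string; amount: number; customerId: string }

const orders: Order[] = [
  { category: 'books', status: 'completed', amount: 12.5, customerId: 'c1' },
  { category: 'books', status: 'completed', amount: 7.5, customerId: 'c2' },
  { category: 'games', status: 'completed', amount: 60, customerId: 'c1' },
  { category: 'games', status: 'cancelled', amount: 40, customerId: 'c3' },
]

// $match: keep only documents that satisfy the filter.
const matched = orders.filter(o => o.status === 'completed' && o.amount > 0)

// $group: one accumulator record per distinct '$category' value.
const groups = new Map<string, { total: number; count: number; customers: Set<string> }>()
for (const o of matched) {
  const g = groups.get(o.category) ?? { total: 0, count: 0, customers: new Set<string>() }
  g.total += o.amount           // { $sum: '$amount' }
  g.count += 1                  // { $sum: 1 }
  g.customers.add(o.customerId) // { $addToSet: '$customerId' }
  groups.set(o.category, g)
}

// $project: reshape each group into the output document.
const salesReport: Array<{
  category: string; totalRevenue: number; orderCount: number
  avgOrder: number; uniqueCustomers: number
}> = []
groups.forEach((g, category) => {
  salesReport.push({
    category,
    totalRevenue: g.total,
    orderCount: g.count,
    avgOrder: g.total / g.count,       // { $avg: '$amount' }
    uniqueCustomers: g.customers.size, // { $size: '$customers' }
  })
})

// $sort: { totalRevenue: -1 }
salesReport.sort((a, b) => b.totalRevenue - a.totalRevenue)
console.log(salesReport)
```

With this sample data the games group (60 in revenue from one order) sorts ahead of the books group (20 in revenue from two orders).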
$jsonSchema: Document Validation in MongoDB
$jsonSchema is a MongoDB collection validator that enforces document structure on every insert and update using JSON Schema draft-04 syntax with BSON-specific extensions. Define it at collection creation time — or add it later with collMod. The validator runs server-side on each write operation. It does not retroactively validate existing documents unless you explicitly re-validate them. The key BSON extension is the bsonType keyword, which accepts BSON type names ("objectId", "date", "decimal") in addition to standard JSON Schema types.
// ── Create collection with $jsonSchema validator ───────────────────
await db.createCollection('users', {
validator: {
$jsonSchema: {
bsonType: 'object',
required: ['email', 'role', 'createdAt'],
additionalProperties: false, // reject unknown fields
properties: {
_id: { bsonType: 'objectId' },
email: {
bsonType: 'string',
pattern: '^[^@]+@[^@]+\\.[^@]+$', // backslash doubled so the regex receives a literal \.
description: 'must be a valid email address',
},
role: {
bsonType: 'string',
enum: ['admin', 'user', 'editor'],
},
age: {
bsonType: 'int',
minimum: 0,
maximum: 150,
description: 'must be an integer between 0 and 150',
},
createdAt: { bsonType: 'date' },
updatedAt: { bsonType: 'date' },
tags: {
bsonType: 'array',
items: { bsonType: 'string' },
maxItems: 20,
},
address: {
bsonType: 'object',
required: ['city', 'country'],
properties: {
street: { bsonType: 'string' },
city: { bsonType: 'string' },
country: { bsonType: 'string', minLength: 2, maxLength: 2 },
},
},
price: { bsonType: 'decimal' }, // BSON Decimal128
},
},
},
validationAction: 'error', // 'error' (default) rejects the write | 'warn' logs and allows it
validationLevel: 'strict', // 'strict' (default) validates all inserts and updates | 'moderate' skips updates to existing documents that are already invalid
})
// ── Add validator to existing collection ───────────────────────────
await db.command({
collMod: 'products',
validator: {
$jsonSchema: {
bsonType: 'object',
required: ['name', 'price'],
properties: {
name: { bsonType: 'string', minLength: 1 },
price: { bsonType: 'decimal', description: 'must be a Decimal128' },
},
},
},
})
// ── View current validator ─────────────────────────────────────────
const info = await db.listCollections({ name: 'users' }).toArray()
console.log(JSON.stringify(info[0].options.validator, null, 2))
// ── $jsonSchema in a query (not just validation) ───────────────────
// Find all documents that do NOT match the schema (existing bad data)
const invalids = await db.collection('users').find({
$nor: [{
$jsonSchema: {
bsonType: 'object',
required: ['email', 'role'],
properties: {
email: { bsonType: 'string' },
role: { bsonType: 'string', enum: ['admin', 'user', 'editor'] },
},
},
}],
}).toArray()
// Useful for auditing existing data before adding strict validation
// ── TypeScript: keeping a Zod schema alongside $jsonSchema ─────────
import { z } from 'zod'
// There is no official Zod-to-$jsonSchema converter, so the two schemas are mapped by hand and kept in sync manually
const UserZod = z.object({
email: z.string().email(),
role: z.enum(['admin', 'user', 'editor']),
age: z.number().int().min(0).max(150).optional(),
})
$jsonSchema uses bsonType instead of the standard JSON Schema type keyword, and this is the most common source of confusion. Use bsonType: "int" for 32-bit integers (not type: "integer"), bsonType: "date" for BSON Date, bsonType: "objectId" for ObjectId. The standard type: "string" and type: "boolean" also work for basic types. additionalProperties: false rejects any field not listed in properties, which is useful for strict schemas but requires listing every field including _id, createdAt, and updatedAt. See also our guide on JSON Schema validation for the JSON Schema standard and JSON data validation for library-level validation.
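When a `pattern` is written as a JavaScript or TypeScript string literal, backslashes need doubling: the language unescapes the string once before the server ever compiles it as a regular expression. The sketch below uses the built-in RegExp as a rough local stand-in for the server's regex engine to show the difference; the variable names are illustrative.

```typescript
// In a string literal, '\.' is just '.', so the dot matches ANY character.
const singleEscaped = '^[^@]+@[^@]+\.[^@]+$'
// '\\.' keeps a literal backslash, so the regex requires a real dot.
const doubleEscaped = '^[^@]+@[^@]+\\.[^@]+$'

console.log(singleEscaped === '^[^@]+@[^@]+.[^@]+$')             // true
console.log(new RegExp(singleEscaped).test('alice@exampleXcom')) // true (too permissive)
console.log(new RegExp(doubleEscaped).test('alice@exampleXcom')) // false
console.log(new RegExp(doubleEscaped).test('alice@example.com')) // true
```

Testing a pattern locally like this before applying it with collMod is a cheap way to catch an over-permissive validator.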
TypeScript Types for MongoDB JSON Documents
The MongoDB Node.js driver is fully generic — Collection<T> accepts a TypeScript interface and types all find, insert, and update return values accordingly. The key is defining two separate interfaces: a document interface (with BSON types like ObjectId and Date) for database operations, and a DTO interface (with serialized types like string) for API responses. Mixing the two causes subtle bugs where ObjectId instances reach JSON responses and serialize incorrectly.
import { ObjectId, Decimal128, Collection, WithId, Filter, UpdateFilter } from 'mongodb'
// ── Document interface — BSON types for DB operations ─────────────
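// _id is intentionally not declared: WithId<UserDoc> adds it on reads,
// and omitting it keeps _id optional for insertOne()
interface UserDoc {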
name: string
email: string
role: 'admin' | 'user' | 'editor'
age?: number
createdAt: Date
updatedAt: Date
}
// ── DTO interface — serialized types for API responses ─────────────
interface UserDto {
id: string // _id.toString()
name: string
email: string
role: 'admin' | 'user' | 'editor'
age?: number
createdAt: string // Date.toISOString()
updatedAt: string
}
// ── Mapper function — document to DTO ─────────────────────────────
function toUserDto(doc: WithId<UserDoc>): UserDto {
return {
id: doc._id.toString(),
name: doc.name,
email: doc.email,
role: doc.role,
age: doc.age,
createdAt: doc.createdAt.toISOString(),
updatedAt: doc.updatedAt.toISOString(),
}
}
// ── Typed collection usage ─────────────────────────────────────────
const users: Collection<UserDoc> = db.collection<UserDoc>('users')
// findOne returns WithId<UserDoc> | null — _id is ObjectId
const user = await users.findOne({ email: 'alice@example.com' })
if (user) {
const dto: UserDto = toUserDto(user)
res.json(dto) // safe — no ObjectId in the response
}
// insertOne accepts OptionalUnlessRequiredId<UserDoc>: _id may be omitted and the driver generates it
await users.insertOne({
name: 'Alice',
email: 'alice@example.com',
role: 'user',
createdAt: new Date(),
updatedAt: new Date(),
})
// Typed filter — Field types are checked
const filter: Filter<UserDoc> = {
role: 'admin',
createdAt: { $gte: new Date('2024-01-01') },
// _id: 'bad-string' ← TypeScript error: string not assignable to ObjectId
}
// Typed update
const update: UpdateFilter<UserDoc> = {
$set: { role: 'editor', updatedAt: new Date() },
$unset: { age: '' },
}
// ── Mongoose typed schema ──────────────────────────────────────────
import { Schema, model, InferSchemaType } from 'mongoose'
const productSchema = new Schema({
name: { type: String, required: true },
price: { type: Number, required: true, min: 0 },
inStock: { type: Boolean, default: true },
tags: [String],
category: { type: String, required: true },
})
// Infer TypeScript type from schema definition (Mongoose 6+)
type ProductDoc = InferSchemaType<typeof productSchema>
// { name: string; price: number; inStock: boolean; tags: string[]; category: string }
const Product = model('Product', productSchema)
// ── Aggregation pipeline typed result ─────────────────────────────
interface SalesReport {
category: string
totalRevenue: number
orderCount: number
uniqueCustomers: number
}
const report = await db.collection('orders')
.aggregate<SalesReport>([
{ $match: { status: 'completed' } },
{ $group: { _id: '$category', totalRevenue: { $sum: '$amount' }, orderCount: { $sum: 1 }, customers: { $addToSet: '$customerId' } } },
{ $project: { _id: 0, category: '$_id', totalRevenue: 1, orderCount: 1, uniqueCustomers: { $size: '$customers' } } },
])
.toArray()
// report is typed as SalesReport[]. The generic is an unchecked assertion, so the pipeline must actually produce these fields
The document/DTO split is the single most important TypeScript pattern for MongoDB APIs. Without it, BSON values such as ObjectId and Decimal128 reach JSON responses in whatever shape their toJSON() methods produce, which the compiler cannot check against your API contract. Use the WithId<T> utility type from the driver for documents returned by findOne() and find(): it adds _id: ObjectId automatically. Use WithoutId<T> when you need the document shape with _id removed, for example for insert payloads. See our guides on TypeScript JSON types and JSON API design for the broader typing and response-shaping patterns.
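The DTO interface can also be derived from the document interface rather than written by hand. The sketch below defines a recursive mapped type and a matching runtime function. `HexId` is a hypothetical stand-in for ObjectId so the example runs without the driver, and `Serialized` and `serialize` are illustrative names rather than driver utilities.

```typescript
// Stand-in for ObjectId. The private field makes the class nominal, so
// `T extends HexId` matches only real HexId instances.
class HexId {
  private readonly hex: string
  constructor(hex: string) { this.hex = hex }
  toString(): string { return this.hex }
}

// Map BSON-like types to their JSON-safe counterparts, recursively.
type Serialized<T> =
  T extends HexId ? string :
  T extends Date ? string :
  T extends Array<infer U> ? Array<Serialized<U>> :
  T extends object ? { [K in keyof T]: Serialized<T[K]> } :
  T

function serialize<T>(value: T): Serialized<T> {
  const v: unknown = value
  let out: unknown = v
  if (v instanceof HexId) out = v.toString()
  else if (v instanceof Date) out = v.toISOString()
  else if (Array.isArray(v)) out = v.map(item => serialize(item))
  else if (v !== null && typeof v === 'object') {
    const copy: Record<string, unknown> = {}
    for (const key of Object.keys(v)) {
      copy[key] = serialize((v as Record<string, unknown>)[key])
    }
    out = copy
  }
  return out as Serialized<T>
}

interface PostDoc { _id: HexId; title: string; editedAt: Date[]; author: { _id: HexId } }

const postDto: Serialized<PostDoc> = serialize<PostDoc>({
  _id: new HexId('507f1f77bcf86cd799439011'),
  title: 'Hello',
  editedAt: [new Date('2024-01-15T00:00:00Z')],
  author: { _id: new HexId('507f191e810c19729de860ea') },
})

console.log(postDto._id.toUpperCase()) // compiles because _id is typed as string
console.log(JSON.stringify(postDto))
```

The trade-off against a handwritten mapper is control: a derived type mirrors every field of the document, so fields that should stay internal still need to be removed explicitly.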
Key Terms
- BSON
- Binary JSON: MongoDB's internal binary serialization format. BSON is a binary-encoded superset of JSON that supports all JSON types and adds others (about 20 type codes in total), including ObjectId (12-byte unique identifier), Date (64-bit UTC milliseconds), Decimal128 (128-bit IEEE 754 decimal for financial precision), BinData (arbitrary binary), Regex (pattern plus flags), Int32, Int64, Timestamp, MinKey, and MaxKey. Each element starts with a type byte, and variable-length values carry a length prefix, which enables fast field skipping during traversal without full document parsing. The MongoDB driver converts application objects to BSON on insert and back to application objects on read, so application code never handles raw BSON bytes. Document size is limited to 16 MB of BSON-encoded data.
- EJSON (Extended JSON)
- A text format that represents BSON types within valid JSON using special $-prefixed wrapper objects. EJSON has two modes: Canonical (fully type-preserving) and Relaxed (human-readable). In Relaxed mode, ObjectId is {"$oid": "507f..."}, Date is {"$date": "2024-01-15T00:00:00Z"}, and Decimal128 is {"$numberDecimal": "19.99"}. Use the bson npm package: EJSON.stringify(doc) and EJSON.parse(str). mongoexport outputs Relaxed EJSON by default. EJSON is the correct format for round-tripping MongoDB documents through text-based systems without losing BSON type information.
- ObjectId
- A 12-byte BSON type used as the default _id in MongoDB documents. The 12 bytes encode a 4-byte Unix timestamp (seconds), a 5-byte random value generated once per process, and a 3-byte incrementing counter. The timestamp prefix makes ObjectIds approximately time-ordered, and _id.getTimestamp() returns the creation Date. Its usual text form is a 24-character lowercase hex string: "507f1f77bcf86cd799439011". In EJSON it is {"$oid": "507f..."}. Rather than relying on what plain JSON.stringify() emits for an ObjectId, convert it explicitly with .toString() or a toJSON transform.
- Mongoose
- The most widely used MongoDB ODM (Object Document Mapper) for Node.js. Mongoose adds schema enforcement, validation, middleware hooks, and controlled serialization on top of the native driver. Key serialization features: the toJSON() and toObject() schema options accept a transform function that controls how documents serialize, and it is called automatically on res.json(). The virtuals: true option in toJSON includes computed virtual fields. The lean() query option returns plain JavaScript objects without Mongoose document wrapping, bypassing toJSON transforms; it is often cited as 3-5x faster for read-only queries.
- aggregation pipeline
- MongoDB's server-side data processing framework, composed of sequential JSON stage documents. Each stage is a plain object with a single $operator key. Common stages: $match (filter, same syntax as find()), $group (group by key with accumulator operators $sum, $avg, $addToSet), $project (field selection with 1/0 flags and JSON expression operators), $lookup (left outer join), $unwind (deconstruct array to one document per element), $addFields (add computed fields), $facet (parallel sub-pipelines). Field references use a $ prefix: "$fieldName"; system variables use $$: $$NOW, $$ROOT.
- $jsonSchema
- A MongoDB collection validator that enforces document structure using JSON Schema draft-04 syntax with BSON-specific extensions. Applied as a collection option via db.createCollection(name, { validator: { $jsonSchema: {...} } }) or added later with collMod. Key BSON extension: the bsonType keyword accepts BSON type names ("objectId", "date", "decimal", "binData") in addition to standard JSON Schema types. validationAction: "error" (default) rejects invalid writes; "warn" logs but allows. Runs on every insert and update and does not retroactively validate existing documents. Also usable as a find() filter operator to identify non-conforming existing documents.
- toJSON()
- A JavaScript method called automatically by JSON.stringify() on any object that defines it. The Mongoose schema option toJSON: { transform: (doc, ret) => ret } lets you control document serialization: rename _id to id, delete __v, convert ObjectId fields to strings, include virtual fields. The bson package's classes define their own toJSON() in current versions (ObjectId returns its hex string), but the resulting shapes are driver conventions rather than your API contract, so convert explicitly with .toString() or configure a Mongoose toJSON transform. Date has toJSON() that returns toISOString(), so Date fields serialize as ISO strings with plain JSON.stringify().
FAQ
What is the difference between BSON and JSON in MongoDB?
BSON (Binary JSON) is MongoDB's internal binary serialization format, a binary-encoded superset of JSON. JSON supports six types (string, number, boolean, null, array, object); BSON defines about 20, adding ObjectId, Date (64-bit UTC milliseconds), Decimal128, BinData, Regex, Int32, Int64, Timestamp, MinKey, MaxKey, and others. BSON is not human-readable text: it is a compact binary format that stores lengths ahead of variable-length values, making document traversal and field skipping faster than JSON parsing. The MongoDB driver converts your native language objects (JavaScript objects, Python dicts) to BSON on insert and back to native objects on find, so you never work with raw BSON bytes. When BSON types appear in JSON exports (mongoexport) or shell output, they use Extended JSON notation: {"$oid": "..."} for ObjectId, {"$date": "..."} for Date. The key practical impact: plain JSON.stringify() of a MongoDB document drops BSON type information and emits driver-defined shapes (for example, a wrapper object for Decimal128), so serialize explicitly.
How do I serialize a MongoDB document to JSON?
The native JSON.stringify() gives you little control over MongoDB documents: each BSON value is emitted in whatever shape its toJSON() method returns (with current bson versions, a hex string for ObjectId and a {"$numberDecimal": "..."} object for Decimal128), and all type information is lost. Four explicit approaches: (1) EJSON.stringify() from the bson package: import { EJSON } from "bson"; EJSON.stringify(doc), which preserves types as Extended JSON wrappers. (2) Custom replacer: JSON.stringify(doc, function (k, v) { return this[k] instanceof Decimal128 ? this[k].toString() : v }), which reshapes the values you choose; read this[k] because toJSON() runs before the replacer. (3) Mongoose toJSON transform, configured at schema level: toJSON: { transform: (_doc, ret) => { ret.id = ret._id.toString(); delete ret._id; return ret; } }, applied automatically on every res.json() call. (4) Aggregation $toString, which converts ObjectId to string at the database level in a pipeline projection. For REST APIs, the Mongoose toJSON transform is most maintainable; for internal microservices, EJSON preserves type information for re-ingestion.
What is MongoDB Extended JSON (EJSON)?
Extended JSON (EJSON) is a text representation of BSON that preserves type information using $-prefixed wrapper objects within valid JSON. EJSON has two modes: Canonical (fully type-preserving, verbose) and Relaxed (human-readable, with numbers written as plain JSON numbers). In both modes ObjectId is {"$oid": "507f1f77bcf86cd799439011"} and Decimal128 is {"$numberDecimal": "19.99"}. In Relaxed mode Date is {"$date": "2024-01-15T00:00:00Z"} and Int64 is a plain JSON number; in Canonical mode Int64 is {"$numberLong": "9007199254740993"}. Use the bson package: EJSON.stringify(doc, null, 2) to serialize and EJSON.parse(str) to restore types. mongoexport outputs Relaxed EJSON by default; add --jsonFormat=canonical for Canonical mode. EJSON is the correct format when you need to round-trip MongoDB documents through text-based systems (message queues, S3 files, cross-service APIs) without losing type information.
How do I query MongoDB with JSON-like filters?
MongoDB query filters are plain JavaScript objects (JSON documents) passed to find(), findOne(), updateOne(), and deleteOne(). A plain value is an implicit $eq: collection.find({ status: "active" }). Comparison operators nest inside the field object: { price: { $gt: 10, $lte: 100 } }. Array matching: { tags: { $in: ["sale", "featured"] } } matches documents whose tags contain at least one of the listed values; $all requires every listed value to be present. Logical: { $or: [{...}, {...}] }. Nested field queries use dot notation: { "address.city": "London" }. For arrays of embedded documents, use $elemMatch when multiple conditions must match the same element; without it, each condition can be satisfied by a different element. Field existence: { field: { $exists: true } }. BSON type filter: { _id: { $type: "objectId" } }. The same filter syntax works in aggregation $match stages.
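The $elemMatch distinction is the one that most often surprises people, so here is a small in-memory simulation. It is only a sketch: the surveys data and both predicate functions are invented, and Array.prototype.some stands in for the server's matching logic. The corresponding MongoDB filters appear in the comments.

```typescript
interface Result {
  product: string;
  score: number;
}
interface Survey {
  _id: number;
  results: Result[];
}

const surveys: Survey[] = [
  { _id: 1, results: [{ product: "abc", score: 10 }, { product: "xyz", score: 5 }] },
  { _id: 2, results: [{ product: "xyz", score: 9 }] },
];

// Filter A (no $elemMatch):
//   { "results.product": "xyz", "results.score": { $gte: 8 } }
// Each condition may be satisfied by a different array element.
function matchesAcrossElements(doc: Survey): boolean {
  return (
    doc.results.some((r) => r.product === "xyz") &&
    doc.results.some((r) => r.score >= 8)
  );
}

// Filter B ($elemMatch):
//   { results: { $elemMatch: { product: "xyz", score: { $gte: 8 } } } }
// A single element must satisfy both conditions.
function matchesSameElement(doc: Survey): boolean {
  return doc.results.some((r) => r.product === "xyz" && r.score >= 8);
}

const idsA = surveys.filter(matchesAcrossElements).map((d) => d._id);
const idsB = surveys.filter(matchesSameElement).map((d) => d._id);

console.log(idsA); // [ 1, 2 ]
console.log(idsB); // [ 2 ]
```

Document 1 passes filter A because "xyz" and the score of 10 sit in different elements; only $elemMatch rules it out.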
How do I validate MongoDB document structure with $jsonSchema?
Add a $jsonSchema validator when creating the collection or via collMod: db.createCollection("users", { validator: { $jsonSchema: { bsonType: "object", required: ["email"], properties: { email: { bsonType: "string", pattern: "^.+@.+$" }, age: { bsonType: "int", minimum: 0 } } } } }). Use bsonType instead of type for BSON-specific types: "objectId", "date", "decimal", "binData". The standard JSON Schema type keyword with values such as "string" and "boolean" also works. Set validationAction: "error" (default) to reject invalid writes, or "warn" to log but allow. Set validationLevel: "strict" (default) to validate all inserts and updates, or "moderate" to validate inserts and updates to documents that already pass, leaving existing invalid documents updatable. $jsonSchema runs on each write operation; it does not validate existing documents retroactively. To audit existing documents that do not conform, query with find({ $nor: [{ $jsonSchema: {...} }] }).
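Keeping the schema in one constant lets the validator and the audit query share it. The sketch below only builds the two documents; the names userSchema, collModCommand, and nonConformingFilter are illustrative, and the collection name "users" is taken from the example above.

```typescript
// Same schema as in the text, held in a constant so it can be reused.
const userSchema = {
  bsonType: "object",
  required: ["email"],
  properties: {
    email: { bsonType: "string", pattern: "^.+@.+$" },
    age: { bsonType: "int", minimum: 0 },
  },
};

// Command document: attach the validator to an existing collection.
const collModCommand = {
  collMod: "users",
  validator: { $jsonSchema: userSchema },
  validationLevel: "strict",
  validationAction: "error",
};

// Query filter: documents that do NOT satisfy the schema.
const nonConformingFilter = { $nor: [{ $jsonSchema: userSchema }] };

console.log(JSON.stringify(collModCommand, null, 2));
console.log(JSON.stringify(nonConformingFilter));
```

With a connected Db handle from the Node.js driver (assumed here to be called db), these would be used as await db.command(collModCommand) and db.collection("users").find(nonConformingFilter).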
How do I use Mongoose to return JSON from a MongoDB query?
Configure a toJSON transform at the schema level — it is called automatically whenever JSON.stringify() or res.json() serializes a Mongoose document: new Schema({...}, { toJSON: { virtuals: true, transform: (_doc, ret) => { ret.id = ret._id.toString(); delete ret._id; delete ret.__v; return ret; } } }). Set virtuals: true to include virtual fields computed from schema properties (e.g., fullName derived from firstName and lastName). The transform receives doc (the original Mongoose document) and ret (the plain object copy being serialized) — modify and return ret. For .lean() queries, toJSON transforms are not applied — lean returns plain objects from the driver. Use lean for high-performance read-only endpoints and add a separate mapper function for DTO conversion. Populated referenced documents also call their schema's toJSON recursively.
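Because lean results skip the schema transform, it can help to write the transform as a standalone function that serves both paths. The following is a sketch under assumptions: RawRet is a hypothetical minimal shape for the plain object Mongoose passes as ret, and the sample object uses a stand-in for a real ObjectId.

```typescript
// Assumed minimal shape of the plain object handed to a transform.
interface RawRet {
  _id?: { toString(): string };
  __v?: number;
  [field: string]: unknown;
}

// Standalone transform: same logic as the inline version, but reusable and testable.
function apiTransform(_doc: unknown, ret: RawRet): RawRet {
  if (ret._id !== undefined) ret.id = ret._id.toString(); // expose the id as a string
  delete ret._id;
  delete ret.__v; // drop the Mongoose version key
  return ret;
}

// Stand-in for a document's plain-object copy; a real one would hold an ObjectId.
const ret: RawRet = {
  _id: { toString: () => "507f1f77bcf86cd799439011" },
  __v: 0,
  firstName: "Ada",
  lastName: "Lovelace",
};

const json = JSON.stringify(apiTransform(null, ret));
console.log(json);
// {"firstName":"Ada","lastName":"Lovelace","id":"507f1f77bcf86cd799439011"}
```

The same function could be supplied as the schema's toJSON transform and also called directly on each lean result, for example apiTransform(null, leanDoc), so both read paths produce the same response shape.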
How do I handle MongoDB ObjectId in JSON APIs?
ObjectId is a 12-byte BSON type; JSON APIs carry it as a 24-character hex string. With the Node.js driver, JSON.stringify() already emits that hex string through ObjectId's toJSON() method, but under the key _id, so most APIs convert explicitly and expose it as id. Three patterns for API responses: (1) Mongoose toJSON transform: ret.id = ret._id.toString(); delete ret._id, applied on every res.json() call. (2) Aggregation $toString: { $project: { id: { $toString: "$_id" }, _id: 0, name: 1 } } converts at the database level. (3) Manual .toString(): call doc._id.toString() explicitly when building DTO objects. For incoming ObjectId strings in API requests (e.g., a userId path parameter), convert back before querying with new ObjectId(req.params.id). Validate that the value is a 24-character hex string first, because an invalid string makes the constructor throw (a BSONError in current driver versions). In TypeScript, type incoming IDs as string and convert to ObjectId at the service/repository boundary.
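Validating an incoming ID before it reaches the ObjectId constructor can be as small as a regular expression check. This is a minimal sketch; parseObjectIdHex is a hypothetical helper name, and lowercasing the result is a design choice of this example so that IDs compare consistently.

```typescript
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

/** Return the lowercase hex string, or null if the input is not a 24-character hex id. */
function parseObjectIdHex(input: unknown): string | null {
  if (typeof input !== "string" || !OBJECT_ID_HEX.test(input)) return null;
  return input.toLowerCase();
}

console.log(parseObjectIdHex("507F1F77BCF86CD799439011")); // 507f1f77bcf86cd799439011
console.log(parseObjectIdHex("not-an-id"));                // null
console.log(parseObjectIdHex("507f1f77bcf86cd79943901"));  // null (23 characters)
console.log(parseObjectIdHex(12345));                      // null (not a string)
```

In a route handler, a null result would typically map to a 400 response, and only a non-null value would be passed to new ObjectId(...). The driver also ships ObjectId.isValid(), which, depending on the version, may accept more than 24-character hex strings; the stricter check here keeps the API contract explicit.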
How do I type MongoDB documents in TypeScript?
Define two separate interfaces: a document interface for database operations (with BSON types like ObjectId and Date) and a DTO interface for API responses (with serialized types: string for IDs and dates). Pass the document interface to the collection generic: db.collection<UserDoc>("users"). findOne() returns WithId<UserDoc> | null, with _id typed as ObjectId. insertOne() accepts OptionalUnlessRequiredId<UserDoc>, so _id may be omitted and the driver generates it. Write a mapper function to convert from document interface to DTO: function toDto(doc: WithId<UserDoc>): UserDto { return { id: doc._id.toString(), ... } }. With Mongoose, use InferSchemaType<typeof schema> to infer the TypeScript type automatically from the schema definition (Mongoose 6+). Type aggregation results with the pipeline generic: collection.aggregate<ResultType>([...]). Avoid any for document types: it lets an ObjectId or Date reach a field the API declares as string without a compile-time error.
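The two-interface pattern looks roughly like this. To keep the sketch runnable without the driver installed, ObjectIdLike and WithIdLike are hypothetical structural stand-ins for the driver's ObjectId and WithId types; the UserDoc fields are invented for the example.

```typescript
// Structural stand-ins so the sketch runs without the driver installed.
// With the real driver, import ObjectId and WithId from "mongodb" instead.
interface ObjectIdLike {
  toHexString(): string;
}
type WithIdLike<T> = T & { _id: ObjectIdLike };

// Database shape: BSON-backed types.
interface UserDoc {
  email: string;
  createdAt: Date;
}

// API shape: everything already serialized.
interface UserDto {
  id: string;
  email: string;
  createdAt: string;
}

function toDto(doc: WithIdLike<UserDoc>): UserDto {
  return {
    id: doc._id.toHexString(),
    email: doc.email,
    createdAt: doc.createdAt.toISOString(),
  };
}

// Stand-in for a findOne() result.
const found: WithIdLike<UserDoc> = {
  _id: { toHexString: () => "507f1f77bcf86cd799439011" },
  email: "ada@example.com",
  createdAt: new Date("2024-01-15T00:00:00Z"),
};

const dto = toDto(found);
console.log(JSON.stringify(dto));
// {"id":"507f1f77bcf86cd799439011","email":"ada@example.com","createdAt":"2024-01-15T00:00:00.000Z"}
```

Because UserDto declares id and createdAt as string, forgetting a conversion in toDto becomes a compile error instead of a surprise in the response body.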