JSON vs CBOR: Binary Encoding for APIs and IoT


CBOR (Concise Binary Object Representation, RFC 8949) is a binary serialization format based on the JSON data model: the same types (strings, numbers, arrays, maps, booleans, null) encoded in binary, which typically makes it 30–50% smaller than JSON text for API payloads. Like JSON and MessagePack, CBOR is self-describing (no schema required), unlike Protobuf. It adds native types JSON lacks: byte strings, dates, big integers, and tagged values. This guide compares JSON and CBOR across payload size, parse speed, tooling support, and IoT/CoAP adoption, and provides Node.js (cbor-x) and Python (cbor2) encode/decode examples. For binary format comparisons see also JSON vs Protobuf and JSON vs MessagePack.

What is CBOR — data model, major types, and RFC 8949

CBOR was designed by the IETF (first published as RFC 7049 in 2013, replaced by RFC 8949 in 2020) to be a compact, extensible binary format that shares the JSON data model. The design goals were small code size (important for constrained devices), no requirement for a schema, and extensibility through a tag mechanism.

Every CBOR value begins with a one-byte header that encodes two things: the major type (3 high bits, values 0–7) and the additional info (5 low bits). The major types are:

| Major type | Initial byte range | Encodes | JSON equivalent |
|---|---|---|---|
| 0 | 0x00–0x1b | Unsigned integer | number (integer) |
| 1 | 0x20–0x3b | Negative integer | number (negative) |
| 2 | 0x40–0x5b | Byte string | none |
| 3 | 0x60–0x7b | Text string (UTF-8) | string |
| 4 | 0x80–0x9b | Array | array |
| 5 | 0xa0–0xbb | Map | object |
| 6 | 0xc0–0xdb | Tagged value | none (semantic annotation) |
| 7 | 0xe0–0xff | Float, bool, null, break | number, boolean, null |

The additional info field carries the argument directly when it is 0–23, so a single byte holds both the type and a small value. The values 24, 25, 26, and 27 signal that 1, 2, 4, or 8 argument bytes follow. With this variable-length scheme, integers 0–23 cost exactly 1 byte and integers up to 255 cost 2, which matches or beats JSON's ASCII digits even before counting JSON's surrounding delimiters.
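The header layout described above can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library, not code from any CBOR package; the function name `encode_head` is an assumption made for this example.

```python
import struct

def encode_head(major_type: int, argument: int) -> bytes:
    """Build a CBOR initial byte plus any argument bytes that follow it."""
    if not 0 <= major_type <= 7:
        raise ValueError("major type must be 0-7")
    if argument < 0:
        raise ValueError("argument must be non-negative")
    high = major_type << 5                       # major type lives in the 3 high bits
    if argument <= 23:
        return bytes([high | argument])          # value fits in the 5 low bits
    if argument <= 0xFF:
        return bytes([high | 24, argument])      # 24 = one argument byte follows
    if argument <= 0xFFFF:
        return bytes([high | 25]) + struct.pack(">H", argument)   # two bytes
    if argument <= 0xFFFFFFFF:
        return bytes([high | 26]) + struct.pack(">I", argument)   # four bytes
    if argument <= 0xFFFFFFFFFFFFFFFF:
        return bytes([high | 27]) + struct.pack(">Q", argument)   # eight bytes
    raise ValueError("argument does not fit in 8 bytes")

# Unsigned integers are major type 0; watch the encoded length grow in steps.
for n in (10, 23, 24, 500, 70_000, 5_000_000_000):
    print(n, encode_head(0, n).hex())
```

The same function produces the length prefix for strings, arrays, and maps by passing major types 2 to 5 with the element or byte count as the argument.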

CBOR's major type 6 (tagged values) is what makes it extensible without breaking self-description. Tag 0 annotates a text string as an RFC 3339 date/time; tag 1 annotates a number as a POSIX epoch timestamp; tags 2 and 3 encode arbitrary-precision integers; tag 32 marks a text string as a URI. The full tag registry is maintained by IANA. A decoder that does not understand a tag can still decode the wrapped value; it simply ignores the semantic annotation.
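To make the wrapping concrete, the sketch below builds two tagged items by hand and shows how a decoder that does not care about tags could skip the tag head and still reach the wrapped value. The helper names `tag_head` and `strip_tag` are assumptions for this example, and the sketch only handles tag numbers up to 255.

```python
import struct

def tag_head(tag_number: int) -> bytes:
    """Initial byte(s) for a tag (major type 6, high bits 0b110)."""
    if 0 <= tag_number <= 23:
        return bytes([0xC0 | tag_number])
    if tag_number <= 0xFF:
        return bytes([0xC0 | 24, tag_number])
    raise ValueError("sketch handles tag numbers up to 255 only")

# Tag 1 wrapping the unsigned integer 1735689600 (2025-01-01T00:00:00Z).
epoch_item = bytes([0x1A]) + struct.pack(">I", 1_735_689_600)
tagged_time = tag_head(1) + epoch_item

# Tag 32 wrapping a short text string that holds a URI.
uri = "https://example.com".encode("utf-8")
uri_item = bytes([0x60 | len(uri)]) + uri        # text string shorter than 24 bytes
tagged_uri = tag_head(32) + uri_item

def strip_tag(item: bytes) -> bytes:
    """Drop one leading tag head and return the bytes of the wrapped value."""
    first = item[0]
    if first >> 5 != 6:
        return item                              # not a tag, nothing to strip
    info = first & 0x1F
    if info <= 23:
        return item[1:]
    if info == 24:
        return item[2:]
    raise ValueError("sketch handles tag numbers up to 255 only")

print(tagged_time.hex(), tagged_uri[:2].hex())
```

Stripping the tag yields exactly the untagged integer or text string, which is why unknown tags do not prevent decoding.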

Payload size — 30–50% reduction and when it helps most

The size advantage of CBOR over JSON comes from three sources: (1) no delimiter overhead (no quotes around keys, no colons or commas between entries); (2) compact integer encoding (small integers in 1–3 bytes versus 1+ ASCII digits); (3) native byte string support (no Base64 encoding for binary data). The savings are highest for payloads with many keys, small integers, or binary fields; they are lowest for payloads dominated by long text strings, where both formats store the raw UTF-8 bytes.

Example: the JSON object {"id":42,"name":"Alice","active":true} is 38 bytes as UTF-8. As CBOR it encodes to 25 bytes (1 for the map header, 15 for the three keys, 2 for the integer, 6 for "Alice", 1 for true), a 34% reduction. A 1 KB JSON API response with mixed types typically encodes to 600–700 bytes as CBOR before network compression.
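The byte counts for this example object can be checked by assembling the CBOR encoding by hand and comparing it with compact JSON. This is a standard-library sketch; the helper `text` is an assumption for this example and only covers strings shorter than 24 bytes.

```python
import json

def text(s: str) -> bytes:
    """CBOR text string (major type 3) for strings shorter than 24 bytes."""
    raw = s.encode("utf-8")
    assert len(raw) < 24, "sketch covers short strings only"
    return bytes([0x60 | len(raw)]) + raw

cbor_bytes = (
    bytes([0xA0 | 3])                   # map header: major type 5, 3 pairs
    + text("id") + bytes([0x18, 42])    # 42 is above 23, so 0x18 plus one byte
    + text("name") + text("Alice")
    + text("active") + bytes([0xF5])    # 0xf5 is the simple value true
)

json_bytes = json.dumps(
    {"id": 42, "name": "Alice", "active": True},
    separators=(",", ":"),              # compact form, no spaces
).encode("utf-8")

print(len(json_bytes), len(cbor_bytes))   # 38 25
```

Most of the saving comes from the dropped quotes, colons, and commas; the string contents themselves cost the same in both formats.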

| Payload type | JSON size | CBOR size | Savings | Notes |
|---|---|---|---|---|
| Small object (5 fields, short keys) | ~120 bytes | ~72 bytes | 40% | No key quotes or colons |
| Array of 100 integers (0–255) | ~350 bytes | ~202 bytes | 42% | Integers in 1–2 bytes each |
| Binary field (1 KB thumbnail) | ~1 368 bytes (Base64) | ~1 025 bytes | 25% | Byte string vs Base64 |
| Long text payload (article body) | ~5 000 bytes | ~4 960 bytes | 1% | String content dominates |
| Typical 10 KB REST response | 10 240 bytes | ~6 400 bytes | 37% | Mixed types, ~60 keys |
| After gzip (10 KB REST response) | ~2 100 bytes | ~1 800 bytes | 14% | Gap narrows after compression |
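The binary-field row follows from simple arithmetic, which the sketch below reproduces with the standard library. Random bytes stand in for the thumbnail, and the CBOR item is assembled by hand rather than with a CBOR package.

```python
import base64
import os
import struct

payload = os.urandom(1024)   # stand-in for a 1 KB thumbnail

# JSON has no binary type, so the bytes must travel as a Base64 text string.
base64_len = len(base64.b64encode(payload))

# CBOR byte string: 0x59 (major type 2, two-byte length follows),
# then the length, then the raw bytes unchanged.
cbor_item = bytes([0x59]) + struct.pack(">H", len(payload)) + payload

print(base64_len, len(cbor_item))   # 1368 1027
```

The CBOR overhead is a fixed 3 bytes for this size, while Base64 overhead grows with the payload at roughly one extra byte for every three.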

When HTTP gzip or Brotli compression is enabled, the wire-size gap between JSON and CBOR narrows significantly — from ~35% to roughly 10–15% — because both formats compress well. If your bottleneck is serialization CPU rather than bandwidth, CBOR (with cbor-x) still wins regardless of compression, since binary parsing avoids text-scanning overhead.

Parse and serialize speed — cbor-x vs JSON.parse in Node.js, cbor2 vs json in Python

CBOR binary parsing eliminates the most expensive parts of JSON parsing: scanning for quote/bracket/comma characters, converting ASCII digit sequences to numbers, and validating UTF-8 escape sequences. The binary type header immediately tells the decoder the value type and byte count, so parsing becomes sequential reads rather than character-level scanning.
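A minimal decoder shows why this is cheap: each step reads one header byte, derives the type and length, and jumps ahead. The sketch below is a simplified illustration written for this guide, not the implementation used by cbor-x or cbor2; it skips floats, indefinite-length items, and tag semantics.

```python
def decode(data: bytes, pos: int = 0):
    """Decode one CBOR item starting at pos; return (value, next_pos)."""
    first = data[pos]
    major, info = first >> 5, first & 0x1F       # 3 high bits, 5 low bits
    pos += 1
    if major == 7:                               # simple values
        simple = {20: False, 21: True, 22: None}
        if info in simple:
            return simple[info], pos
        raise ValueError("sketch: floats and other simple values not handled")
    if info <= 23:                               # argument stored in the header
        arg = info
    elif info <= 27:                             # 1, 2, 4, or 8 argument bytes
        size = 1 << (info - 24)
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("sketch: indefinite lengths not handled")
    if major == 0:
        return arg, pos                          # unsigned integer
    if major == 1:
        return -1 - arg, pos                     # negative integer
    if major in (2, 3):                          # byte string or text string
        raw = data[pos:pos + arg]
        return (raw if major == 2 else raw.decode("utf-8")), pos + arg
    if major == 4:                               # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if major == 5:                               # map of arg key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode(data, pos)
            result[key], pos = decode(data, pos)
        return result, pos
    return decode(data, pos)                     # major 6: ignore tag, decode wrapped value

sample = bytes.fromhex("a3626964182a646e616d6565416c69636566616374697665f5")
print(decode(sample)[0])
```

There is no scanning for closing quotes or brackets anywhere in the loop: every length is known before the content is read.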

| Library | Encode throughput | Decode throughput | Output size (10 KB JSON input) |
|---|---|---|---|
| JSON.stringify / JSON.parse | ~200 MB/s | ~250 MB/s | 10 240 bytes |
| cbor-x encode/decode | ~350 MB/s | ~1 200 MB/s | ~6 400 bytes |
| cbor (Node.js reference) | ~80 MB/s | ~90 MB/s | ~6 400 bytes |
| Python json.dumps / json.loads | ~60 MB/s | ~80 MB/s | baseline |
| Python cbor2.dumps / cbor2.loads | ~55 MB/s | ~70 MB/s | ~37% smaller |

The headline result is cbor-x's decode speed of ~1 200 MB/s — approximately 5× faster than JSON.parse. The encode side is ~1.75× faster. The reference cbor package for Node.js is slower than JSON because it is a pure-JavaScript streaming parser optimized for correctness rather than throughput; use cbor-x for performance-sensitive paths.

In Python, cbor2 is roughly equivalent to the built-in json module in speed on CPython (both are C-accelerated for common paths). The advantage in Python is not CPU throughput but payload size and native type support: cbor2 handles datetime, bytes, and Decimal without custom encoders.

CBOR in Node.js with cbor-x

cbor-x is the fastest Node.js CBOR library, supporting full RFC 8949 encoding and decoding with zero runtime dependencies. Install it with:

npm install cbor-x

Basic encode and decode:

import { encode, decode } from 'cbor-x'

const obj = {
  id: 42,
  name: 'Alice',
  active: true,
  tags: ['admin', 'user'],
  created: new Date('2025-01-01'),
}

// Encode to Buffer
const bytes = encode(obj)
console.log(bytes.byteLength)  // ~50 bytes vs ~90 bytes JSON

// Decode from Buffer or Uint8Array
const decoded = decode(bytes)
console.log(decoded.created instanceof Date)  // true — Date preserved

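Streaming encode for large payloads:

import { EncoderStream } from 'cbor-x'
import { createWriteStream } from 'fs'

// EncoderStream is cbor-x's Node.js transform stream;
// the plain Encoder class has no pipe() method
const encoder = new EncoderStream()
const out = createWriteStream('data.cbor')
encoder.pipe(out)

// largeArray stands in for whatever iterable of records you need to serialize
for (const record of largeArray) {
  encoder.write(record)
}
encoder.end()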

HTTP server that negotiates JSON or CBOR based on the Accept header:

import express from 'express'
import { encode, decode } from 'cbor-x'

const app = express()
app.use(express.json())                             // parses application/json bodies
app.use(express.raw({ type: 'application/cbor' }))  // leaves CBOR bodies as a Buffer

// `db` stands in for your own data-access layer
app.get('/api/user/:id', async (req, res) => {
  const user = await db.getUser(req.params.id)

  // Listing JSON first keeps it the default for clients that send Accept: */*
  if (req.accepts(['json', 'application/cbor']) === 'application/cbor') {
    res.set('Content-Type', 'application/cbor')
    return res.send(Buffer.from(encode(user)))
  }
  res.json(user)
})

app.post('/api/user', async (req, res) => {
  const contentType = req.headers['content-type'] ?? ''
  const body = contentType.includes('application/cbor')
    ? decode(req.body)          // req.body is a Buffer (express.raw)
    : req.body                  // already parsed by express.json

  const user = await db.createUser(body)
  res.status(201).json(user)
})

For browser fetch() clients consuming a CBOR endpoint:

import { decode } from 'cbor-x'

async function fetchUser(id: number) {
  const res = await fetch(`/api/user/${id}`, {
    headers: { Accept: 'application/cbor' },
  })
  const buffer = await res.arrayBuffer()
  return decode(new Uint8Array(buffer))
}

CBOR in Python with cbor2

cbor2 is the most complete Python CBOR library, supporting RFC 8949 tags, datetime, bytes, Decimal, and custom tag decoders. Install with:

pip install cbor2

Basic encode and decode:

import cbor2
from datetime import datetime, timezone

obj = {
    "id": 42,
    "name": "Alice",
    "active": True,
    "created": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "thumbnail": bytes.fromhex("ffd8ffe0"),  # raw binary, no Base64
}

# Encode to bytes
encoded = cbor2.dumps(obj)
print(len(encoded))          # roughly 70 bytes with the default tag 0 datetime, vs ~110 bytes as JSON

# Decode from bytes
decoded = cbor2.loads(encoded)
print(type(decoded["created"]))    # <class 'datetime.datetime'>
print(type(decoded["thumbnail"]))  # <class 'bytes'>

File-based encode/decode and streaming:

import cbor2

# Write to file
with open("data.cbor", "wb") as f:
    cbor2.dump(obj, f)

# Read from file
with open("data.cbor", "rb") as f:
    loaded = cbor2.load(f)

# Decode a stream of CBOR items (e.g. from a socket)
with open("stream.cbor", "rb") as f:
    decoder = cbor2.CBORDecoder(f)
    while True:
        try:
            item = decoder.decode()
            process(item)
        except EOFError:
            break

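Custom tag encoder: the default hook is called only for types cbor2 cannot encode on its own. cbor2 already encodes uuid.UUID as tag 37 natively, so this example uses a hypothetical Point class and an arbitrary tag number chosen for illustration (check the IANA registry before picking one in practice):

import cbor2

class Point:  # hypothetical application type
    def __init__(self, x, y):
        self.x, self.y = x, y

def encode_custom(encoder, value):
    # Invoked by cbor2 for any value it has no built-in encoder for
    if isinstance(value, Point):
        encoder.encode(cbor2.CBORTag(4000, [value.x, value.y]))
    else:
        raise TypeError(f"cannot encode {type(value)!r}")

encoded = cbor2.dumps({"origin": Point(0, 0)}, default=encode_custom)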

When to use CBOR — IoT/CoAP, WebAuthn, COSE, and constrained devices

CBOR has strong adoption in specific standards-driven domains where its IETF RFC status and extensible tag system matter. Understanding where CBOR is the correct choice — and where JSON or MessagePack is better — avoids adopting it in contexts where the tradeoffs do not pay off.

Use CBOR when:

  • WebAuthn / FIDO2 — CBOR is mandatory. Authenticator data, attestation objects, and all COSE key structures are CBOR-encoded. Server-side FIDO2 libraries handle this internally, but you need a CBOR library to inspect raw credential payloads.
  • CoAP (IoT protocol) — CoAP (RFC 7252) is payload-agnostic, but CBOR is the compact payload format most commonly paired with it, registered as Content-Format 60 (application/cbor). Microcontrollers, sensors, and edge devices in smart home, industrial IoT, and LPWAN networks send CBOR-encoded sensor readings over CoAP.
  • COSE (CBOR Object Signing and Encryption) — The IETF standard for signing and encrypting data in IoT and WebAuthn contexts. Uses CBOR maps with defined tag and key structures.
  • Constrained devices — CBOR decoders can be implemented in ~2 KB of ROM. JSON parsing requires significantly more code and RAM for full compliance. For firmware running on ARM Cortex-M0 microcontrollers, CBOR is the practical choice.
  • Payloads with binary fields — byte strings avoid 33% Base64 overhead and remove the Base64 encode/decode CPU cost on both ends.
  • Native date and big integer support — CBOR encodes these without string conventions, preserving full precision and type fidelity.

Stick with JSON when:

  • Your API is public-facing — consumers should not need a special library; curl and browser DevTools read JSON natively.
  • Human readability matters — config files, log lines, and webhook payloads need to be inspectable without a binary decoder.
  • Your ecosystem has no CBOR support — CBOR library quality and coverage vary more than for JSON (which is native in every language).
  • You already use HTTP gzip/Brotli compression — after compression, the wire-size advantage of CBOR over JSON drops to ~10–15%, which may not justify adding a dependency.

JSON vs CBOR vs MessagePack vs Protobuf comparison

The table below covers the four most-compared binary and text serialization formats across the dimensions that matter for API and IoT decisions.

| Property | JSON | CBOR | MessagePack | Protobuf |
|---|---|---|---|---|
| Format | Text (UTF-8) | Binary | Binary | Binary |
| Human-readable | Yes | No | No | No |
| Schema required | No | No | No | Yes (.proto) |
| Self-describing | Yes | Yes | Yes | No |
| Typical size vs JSON | 100% (baseline) | 50–70% | 50–80% | 25–40% |
| Native byte strings | No (Base64) | Yes | Yes | Yes |
| Native dates | No (strings) | Yes (tags 0, 1) | No (ext type) | No (well-known type) |
| Big integers (beyond 64 bits) | No (precision loss in most parsers) | Yes (tags 2, 3) | No (64-bit max) | No (64-bit max, int64/uint64) |
| IETF RFC | RFC 8259 | RFC 8949 | No | No |
| Parse speed (Node.js) | Baseline | ~5× faster (cbor-x) | ~2× faster (msgpackr) | ~2–5× faster |
| Language support | Universal (native) | Good (varies by library) | Good (most languages) | Good (generated code) |
| Primary use cases | Public APIs, config | WebAuthn, CoAP, IoT | Internal APIs, WebSocket | gRPC, microservices |

The key takeaway: Protobuf achieves the smallest payloads but requires a schema. CBOR and MessagePack are comparable in size and both self-describing, but CBOR has stronger native type support and IETF RFC status, while MessagePack has broader general-purpose library adoption. JSON remains the right default for public APIs.

For a detailed size and speed comparison with MessagePack, see JSON vs MessagePack. For schema-driven encoding, see JSON vs Protobuf and JSON to Protobuf conversion. For security contexts using CBOR, see JSON Web Encryption (JWE) and JSON Canonicalization.

Definitions

CBOR (Concise Binary Object Representation)
An IETF binary serialization format (RFC 8949) based on the JSON data model. Self-describing, extensible via tags, and designed for constrained environments.
Major type
The 3 high bits of a CBOR header byte, identifying the data type: unsigned integer (0), negative integer (1), byte string (2), text string (3), array (4), map (5), tag (6), or simple/float (7).
Self-describing format
A serialization format where each value carries its own type information in the encoded bytes. CBOR and JSON are self-describing; Protobuf is not (requires a schema to identify field types).
CoAP (Constrained Application Protocol)
An IETF protocol (RFC 7252) designed for machine-to-machine communication on constrained IoT devices. It runs over UDP instead of TCP and is commonly paired with CBOR payloads (Content-Format 60, application/cbor), making it suitable for microcontrollers and sensor networks.
COSE (CBOR Object Signing and Encryption)
An IETF standard (RFC 9052) for signing and encrypting data encoded in CBOR. Used in WebAuthn, SUIT (IoT firmware updates), and EDHOC key exchange. Analogous to JOSE (JSON Object Signing and Encryption) but for CBOR payloads.
Tagged value
A CBOR construct (major type 6) that wraps any CBOR value with a numeric tag number that conveys semantic meaning — for example, tag 1 for epoch timestamps, tag 32 for URIs, tag 37 for UUIDs. Tags are registered with IANA.
WebAuthn (Web Authentication)
A W3C standard API for strong authentication using public-key cryptography (passkeys, hardware security keys). The authenticator data, attestation objects, and COSE public keys in WebAuthn are CBOR-encoded binary structures.

Frequently asked questions

What is CBOR and how does it differ from JSON?

CBOR (Concise Binary Object Representation, RFC 8949) is a binary serialization format that shares the same data model as JSON (maps, arrays, strings, numbers, booleans, and null) but uses a compact binary encoding instead of Unicode text. JSON represents the integer 42 as the two ASCII characters 42; CBOR encodes it as the two bytes 0x18 0x2a, and integers 0–23 fit in a single byte. JSON wraps every key in double quotes and separates entries with colons and commas; CBOR uses a single header byte that encodes both the type and, for small values, the length or value itself. The result is typically 30–50% smaller payloads. Unlike Protobuf, CBOR is self-describing, so no schema is needed to decode it. CBOR also adds native types JSON lacks: byte strings, dates, big integers, and tagged values.

How much smaller is CBOR than JSON?

For typical API or IoT payloads, CBOR is 30–50% smaller than equivalent JSON. Payloads with many keys and numeric values save the most; payloads dominated by long text strings save very little (string content is stored verbatim in both formats). Binary data is where CBOR wins most decisively: JSON must Base64-encode binary, adding 33% overhead, while CBOR stores byte strings at 1:1 byte cost. After HTTP gzip or Brotli compression, the wire-size gap narrows to roughly 10–15%.

Does CBOR require a schema like Protobuf?

No. CBOR is fully self-describing — every value carries its type information in a leading header byte. A decoder can read any CBOR binary without prior knowledge of a schema or .proto file. This contrasts with Protobuf, which strips field names from the wire format and requires a compiled .proto schema to identify fields. Optional CDDL (Concise Data Definition Language, RFC 8610) schemas can be used for validation but are not required for decoding.

What types does CBOR support that JSON does not?

CBOR natively supports: byte strings (major type 2, raw binary without Base64); dates (tag 0 for RFC 3339 text, tag 1 for numeric epoch timestamps, with sub-second precision when a float is used); big integers (tags 2 and 3, arbitrary precision without IEEE 754 loss); undefined (distinct from null); half-precision floats (16-bit, 3 bytes total, useful for sensor data); and tagged values (IANA-registered tags for URIs, UUIDs, MIME messages, and application-specific types).
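Two of these types are easy to reproduce by hand with the Python standard library. The sketch below encodes a half-precision float and a positive big integer; the helper names `half_float` and `positive_bignum` are assumptions for this example, and the bignum helper only covers magnitudes shorter than 24 bytes.

```python
import struct

def half_float(value: float) -> bytes:
    """0xf9 (major type 7, additional info 25) followed by a 16-bit IEEE 754 float."""
    return bytes([0xF9]) + struct.pack(">e", value)

def positive_bignum(n: int) -> bytes:
    """Tag 2 wrapping a byte string that holds the big-endian magnitude of n."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    assert len(raw) < 24, "sketch covers short byte strings only"
    return bytes([0xC2, 0x40 | len(raw)]) + raw

print(half_float(1.5).hex())           # f93e00
print(positive_bignum(2**64).hex())    # c249010000000000000000
```

Both outputs match the corresponding examples in Appendix A of RFC 8949. The value 2**64 is one past the largest integer major type 0 can hold, which is exactly where the bignum tags take over.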

How do I encode and decode CBOR in Node.js?

The fastest option is cbor-x: npm install cbor-x, then import { encode, decode } from 'cbor-x'. Call encode(value) to get a Buffer and decode(buffer) to recover the original object. cbor-x automatically handles Date, Map, Set, and Uint8Array values. For HTTP servers, set Content-Type: application/cbor and use express.raw({ type: "application/cbor" }) to receive CBOR bodies as a Buffer. For fetch() clients, set Accept: application/cbor and call decode(new Uint8Array(await response.arrayBuffer())).

How do I use CBOR in Python?

Install cbor2 with pip install cbor2. The API mirrors the json module: cbor2.dumps(obj) encodes to bytes and cbor2.loads(data) decodes back to a Python object. cbor2 automatically encodes Python datetime objects (as CBOR tag 0, an RFC 3339 string, by default; pass datetime_as_timestamp=True to get tag 1 epoch timestamps) and bytes objects as CBOR byte strings, with no custom encoder needed. For file I/O: cbor2.dump(obj, file) and cbor2.load(file). The CBORDecoder class supports streaming decode from a file or socket.

What is CBOR used for in WebAuthn and IoT?

CBOR is the mandatory wire format for WebAuthn/FIDO2: all authenticator data, attestation objects, and COSE public keys in the Web Authentication API are CBOR-encoded binary. In IoT, CBOR is the payload format most commonly paired with CoAP (RFC 7252), the UDP-based protocol designed for microcontrollers and sensor networks; CoAP itself does not mandate a payload format. CBOR is also used in COSE (RFC 9052, signing and encryption for IoT and WebAuthn), SUIT (IoT firmware update manifests), and EDHOC (lightweight Diffie-Hellman key exchange for constrained devices).

Should I use CBOR or MessagePack for my API?

Both are self-describing binary formats with similar size savings over JSON (30–50%). Choose CBOR when you need IETF RFC compliance, interoperability with WebAuthn/CoAP/COSE standards, or native support for dates, big integers, and tagged semantic types. Choose MessagePack when you want the broadest language library support and the fastest possible encode/decode speed for general-purpose internal APIs — msgpackr slightly outperforms cbor-x for encode throughput, though cbor-x leads on decode. See JSON vs MessagePack for a detailed comparison.

Inspect JSON payloads before migrating to CBOR

Use Jsonic's JSON Formatter to validate, minify, and understand the exact structure of your current JSON payloads before writing a CBOR encoder.


Further reading and primary sources