JSON Streaming APIs: NDJSON, Server-Sent Events, ReadableStream & Chunked Transfer
JSON streaming sends data incrementally rather than waiting for a complete response, enabling real-time AI chat, live dashboards, and large dataset exports without loading the full payload into memory. Three patterns cover most streaming scenarios: NDJSON (one JSON object per line, Content-Type: application/x-ndjson) for bulk data feeds, Server-Sent Events (SSE, text/event-stream) for browser push without WebSockets, and ReadableStream with chunked transfer encoding for streaming JSON from Next.js Route Handlers. OpenAI's streaming API uses SSE: each chunk arrives as data: {"choices":[{"delta":{"content":"word"}}]}, letting the browser render tokens word-by-word. json.NewDecoder in Go and JsonParseStream in Deno parse streaming JSON without buffering the whole payload. This guide covers NDJSON format, SSE JSON encoding, Next.js streaming Route Handlers, browser ReadableStream consumption, Go streaming server/client, and OpenAI-style AI streaming.
NDJSON: Line-Delimited JSON for Bulk Streaming
NDJSON (Newline-Delimited JSON) is a simple format: one complete JSON value per line, separated by \n. Unlike a JSON array that wraps all records in [] and requires the entire document to be parsed before any record is available, NDJSON enables incremental processing — a parser reads and handles line 1 while lines 2 through 1,000,000 are still arriving over the network. The Content-Type header is application/x-ndjson (or application/jsonl for JSON Lines). Elasticsearch bulk operations, OpenAI fine-tuning datasets, and BigQuery NDJSON loads all use this format. Processing a 1 GB NDJSON file requires O(1) memory — one line at a time — versus loading the equivalent JSON array into RAM.
// ── NDJSON format — one JSON object per line ──────────────────
// Each line is a complete, independently-parseable JSON value
{"id":1,"name":"Alice","score":98.5}
{"id":2,"name":"Bob","score":87.2}
{"id":3,"name":"Carol","score":91.0}
// ── Write NDJSON in Node.js ────────────────────────────────────
import { createWriteStream } from 'fs'
const file = createWriteStream('output.ndjson')
const records = [
{ id: 1, name: 'Alice', score: 98.5 },
{ id: 2, name: 'Bob', score: 87.2 },
{ id: 3, name: 'Carol', score: 91.0 },
]
for (const record of records) {
file.write(JSON.stringify(record) + '\n')
}
file.end()
// ── Read NDJSON in Node.js with readline (O(1) memory) ────────
import { createReadStream } from 'fs'
import { createInterface } from 'readline'
const rl = createInterface({
input: createReadStream('output.ndjson'),
crlfDelay: Infinity, // handle Windows CRLF line endings
})
for await (const line of rl) {
if (!line.trim()) continue // skip blank lines
const obj = JSON.parse(line)
console.log(obj.name, obj.score)
}
// → Alice 98.5
// → Bob 87.2
// → Carol 91.0
// ── Stream NDJSON over HTTP (Express) ─────────────────────────
import express from 'express'
import { db } from './db.js'
const app = express()
app.get('/api/export', async (req, res) => {
res.setHeader('Content-Type', 'application/x-ndjson')
res.setHeader('Transfer-Encoding', 'chunked') // no Content-Length
// Stream rows from DB cursor — never load all rows into memory
const cursor = db.collection('users').find().batchSize(500)
for await (const doc of cursor) {
res.write(JSON.stringify(doc) + '\n')
}
res.end()
})
// ── Elasticsearch bulk NDJSON format ──────────────────────────
// Bulk API requires action+source pairs — each is one NDJSON line
const bulk = [
JSON.stringify({ index: { _index: 'users', _id: '1' } }),
JSON.stringify({ name: 'Alice', score: 98.5 }),
JSON.stringify({ index: { _index: 'users', _id: '2' } }),
JSON.stringify({ name: 'Bob', score: 87.2 }),
].join('\n') + '\n'
await fetch('http://localhost:9200/_bulk', {
method: 'POST',
headers: { 'Content-Type': 'application/x-ndjson' },
body: bulk,
})
The key constraint in NDJSON is that each JSON value must fit on a single line: newline characters inside string values must be escaped as \n (the two-character escape sequence), not written as literal newlines. The JSON.stringify() function in JavaScript handles this automatically when called without an indentation argument. NDJSON is distinct from a JSON array: ["a","b","c"] is valid JSON but not NDJSON; "a"\n"b"\n"c"\n is valid NDJSON but not a JSON array. At the command line, jq converts in both directions: jq --null-input '[inputs]' < data.ndjson collects NDJSON records into one array, and jq -c '.[]' < data.json writes each array element on its own line.
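As a small illustration of the single-line constraint, the sketch below serializes records whose strings contain line breaks and converts between an array and NDJSON text. The helper names toNdjson and fromNdjson are made up for this example and are not part of any library.

```typescript
// Hypothetical helpers: convert between an in-memory array and NDJSON text.
function toNdjson(records: unknown[]): string {
  // JSON.stringify escapes a line break inside a string as the two
  // characters "\" and "n", so every record stays on one physical line.
  return records.map((r) => JSON.stringify(r) + '\n').join('')
}

function fromNdjson(text: string): unknown[] {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '') // tolerate blank lines
    .map((line) => JSON.parse(line))
}

const sampleRecords = [
  { id: 1, note: 'first line\nsecond line' }, // string with an embedded line break
  { id: 2, note: 'plain' },
]
const ndjsonText = toNdjson(sampleRecords)
console.log(ndjsonText.split('\n').length - 1) // → 2 (physical lines, one per record)
console.log(fromNdjson(ndjsonText))            // the original two records
```

The round trip works because the embedded line break survives only in its escaped form inside the serialized text.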
Server-Sent Events: Push JSON to Browsers
Server-Sent Events (SSE) use a persistent HTTP connection with Content-Type: text/event-stream to push data from server to browser. The browser's built-in EventSource API handles reconnection automatically: if the connection drops, the browser retries and sends the Last-Event-ID header so the server can resume from where it left off. Each SSE event is a block of field: value lines followed by a blank line. For JSON streaming, the data: field carries the serialized JSON string. SSE requires no protocol upgrade and benefits from HTTP/2 multiplexing: browsers typically allow only 6 HTTP/1.1 connections per origin, so 6 open SSE streams exhaust that limit, whereas HTTP/2 multiplexes streams over one connection with a negotiated limit that commonly defaults to 100.
// ── SSE event format ──────────────────────────────────────────
// Each event: one or more field lines + blank line (\n\n)
id: 42
event: update
data: {"type":"message","content":"Hello, world!"}
// Minimal event (just data):
data: {"status":"ok","timestamp":1716854400000}
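// Multi-line data: consecutive data: lines form ONE event, joined with "\n" in event.data
// (here event.data holds two JSON lines, so split on "\n" before calling JSON.parse):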
data: {"type":"start"}
data: {"chunk":"Hello"}
// Comment (keep-alive, not dispatched to client):
: heartbeat
// ── SSE server in Node.js (Express) ──────────────────────────
import express from 'express'
const app = express()
app.get('/api/events', (req, res) => {
res.setHeader('Content-Type', 'text/event-stream')
res.setHeader('Cache-Control', 'no-cache')
res.setHeader('Connection', 'keep-alive')
// Allow cross-origin if needed:
// res.setHeader('Access-Control-Allow-Origin', '*')
let eventId = 0
// Send initial connection event
res.write(`id: ${++eventId}\nevent: connected\ndata: {"status":"connected"}\n\n`)
// Push a JSON event every 2 seconds
const interval = setInterval(() => {
const payload = JSON.stringify({
type: 'metric',
cpu: Math.random() * 100,
timestamp: Date.now(),
})
res.write(`id: ${++eventId}\ndata: ${payload}\n\n`)
}, 2000)
// Send keep-alive comment every 30s to prevent proxy timeouts
const keepAlive = setInterval(() => {
res.write(': keep-alive\n\n')
}, 30_000)
// Clean up when client disconnects
req.on('close', () => {
clearInterval(interval)
clearInterval(keepAlive)
res.end()
})
})
// ── SSE client in the browser ─────────────────────────────────
const es = new EventSource('/api/events')
// Default "message" events
es.onmessage = (event) => {
const data = JSON.parse(event.data)
console.log('Received:', data)
updateDashboard(data)
}
// Named events (event: update)
es.addEventListener('update', (event) => {
const data = JSON.parse(event.data)
handleUpdate(data)
})
// Error / reconnect handling
es.onerror = (err) => {
if (es.readyState === EventSource.CLOSED) {
console.log('SSE connection closed')
}
// EventSource reconnects automatically — no manual retry needed
}
// Stop streaming
function stopStream() {
es.close()
}
// ── SSE with Last-Event-ID for resumable streams ──────────────
app.get('/api/resumable', (req, res) => {
res.setHeader('Content-Type', 'text/event-stream')
res.setHeader('Cache-Control', 'no-cache')
// Client sends Last-Event-ID header on reconnect
const lastId = parseInt(req.headers['last-event-id'] ?? '0', 10)
// Resume from where we left off
const missedEvents = getEventsSince(lastId) // your DB/queue lookup
for (const evt of missedEvents) {
res.write(`id: ${evt.id}\ndata: ${JSON.stringify(evt.payload)}\n\n`)
}
// Continue streaming new events...
})
SSE is strictly server-to-client. If the client needs to send data while the stream is active, such as a follow-up question in an AI chat, it makes a separate HTTP POST to a different endpoint. SSE does not replace WebSockets for bidirectional communication; it specializes in efficient, reconnectable server push. The retry: field in the event format lets the server set the client reconnection interval in milliseconds: retry: 5000\n\n tells the browser to wait 5 seconds before reconnecting after a drop.
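The field layout described above (optional retry:, id:, and event: lines, a data: line, then a blank line) can be captured in one small formatter. This is a minimal sketch; formatSseEvent and the SseEvent shape are names invented for this example.

```typescript
// Hypothetical helper: serialize one SSE event from structured fields.
interface SseEvent {
  data: unknown          // JSON-serializable payload
  id?: string | number   // becomes Last-Event-ID on reconnect
  event?: string         // named event type; omitted means "message"
  retry?: number         // reconnection delay in milliseconds
}

function formatSseEvent({ data, id, event, retry }: SseEvent): string {
  let out = ''
  if (retry !== undefined) out += `retry: ${retry}\n`
  if (id !== undefined) out += `id: ${id}\n`
  if (event !== undefined) out += `event: ${event}\n`
  // JSON.stringify (without indentation) never emits a raw line break,
  // so a single data: line is enough for a JSON payload.
  out += `data: ${JSON.stringify(data)}\n`
  return out + '\n' // the blank line terminates the event
}

console.log(formatSseEvent({ id: 7, event: 'update', retry: 5000, data: { cpu: 42 } }))
```

A server would pass the returned string straight to res.write() or controller.enqueue(encoder.encode(...)).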
Next.js Streaming Route Handlers with ReadableStream
Next.js App Router Route Handlers support streaming responses by returning a Response with a ReadableStream body. The Fetch API's streaming primitives (ReadableStream, TransformStream, TextEncoder) work in both Node.js and Edge runtimes. For long-running AI streams, the Edge runtime is often chosen because some hosting platforms let an Edge response keep streaming longer than their default serverless function timeout; the exact limits depend on the platform and plan, so check them before relying on either runtime. Mark the route dynamic to prevent caching: export const dynamic = 'force-dynamic'. For NDJSON, set Content-Type: application/x-ndjson; for SSE, set Content-Type: text/event-stream.
// app/api/stream-ndjson/route.ts
// ── NDJSON streaming Route Handler ───────────────────────────
import { NextRequest } from 'next/server'
export const dynamic = 'force-dynamic' // no caching
export const runtime = 'edge' // Edge runtime; timeout limits depend on the host
export async function GET(req: NextRequest) {
const encoder = new TextEncoder()
const stream = new ReadableStream({
async start(controller) {
const records = [
{ id: 1, name: 'Alice', score: 98.5 },
{ id: 2, name: 'Bob', score: 87.2 },
{ id: 3, name: 'Carol', score: 91.0 },
]
for (const record of records) {
// Simulate async DB/API work
await new Promise((r) => setTimeout(r, 100))
controller.enqueue(encoder.encode(JSON.stringify(record) + '\n'))
}
controller.close()
},
})
return new Response(stream, {
headers: {
'Content-Type': 'application/x-ndjson',
'Cache-Control': 'no-cache',
},
})
}
// ── SSE streaming Route Handler ───────────────────────────────
// app/api/stream-sse/route.ts
import { NextRequest } from 'next/server'
export const dynamic = 'force-dynamic'
export const runtime = 'edge'
export async function GET(req: NextRequest) {
const encoder = new TextEncoder()
let eventId = 0
const stream = new ReadableStream({
async start(controller) {
const enqueueEvent = (data: unknown, event?: string) => {
let msg = `id: ${++eventId}\n`
if (event) msg += `event: ${event}\n`
msg += `data: ${JSON.stringify(data)}\n\n`
controller.enqueue(encoder.encode(msg))
}
enqueueEvent({ status: 'started' }, 'connected')
for (let i = 0; i < 5; i++) {
await new Promise((r) => setTimeout(r, 500))
enqueueEvent({ index: i, value: Math.random() })
}
enqueueEvent({ status: 'done' }, 'complete')
controller.close()
},
})
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
},
})
}
// ── Streaming with TransformStream (pipe pattern) ─────────────
// app/api/proxy-stream/route.ts
export async function GET() {
// Proxy an upstream streaming response through Next.js
const upstream = await fetch('https://api.example.com/stream', {
headers: { Authorization: `Bearer ${process.env.API_KEY}` },
})
// Pass the upstream ReadableStream body directly as the response
// TransformStream could intercept and modify each chunk if needed
return new Response(upstream.body, {
headers: {
'Content-Type': upstream.headers.get('Content-Type') ?? 'application/x-ndjson',
'Cache-Control': 'no-cache',
},
})
}
The ReadableStream constructor's start(controller) callback receives a ReadableStreamDefaultController. Call controller.enqueue(chunk) to push a chunk (a Uint8Array from TextEncoder) and controller.close() when done. Handle errors with controller.error(err): the response body is aborted and the client's stream reader throws. For database-backed streaming, you can iterate a cursor inside start() and enqueue each row, but enqueue() never waits, so a loop in start() keeps filling the stream's internal queue even when the client reads slowly. To get backpressure, produce rows from the pull(controller) callback instead, which the stream calls only when its queue has room, or check controller.desiredSize before enqueueing.
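A pull-based source is one way to make the stream pace itself to the reader. The sketch below wraps any async iterable in an NDJSON byte stream; ndjsonStreamFrom and fakeCursor are hypothetical names, and fakeCursor stands in for a real database cursor.

```typescript
// Hypothetical helper: wrap an async iterable of records in an NDJSON byte
// stream that produces a record only when the consumer asks for one.
function ndjsonStreamFrom<T>(source: AsyncIterable<T>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder()
  const iterator = source[Symbol.asyncIterator]()
  return new ReadableStream<Uint8Array>({
    // pull() is called only while the internal queue is below its high-water
    // mark, so a slow reader pauses the source instead of growing memory.
    async pull(controller) {
      const { done, value } = await iterator.next()
      if (done) {
        controller.close()
        return
      }
      controller.enqueue(encoder.encode(JSON.stringify(value) + '\n'))
    },
    // cancel() runs when the reader cancels, e.g. after a client disconnect.
    async cancel() {
      await iterator.return?.()
    },
  })
}

// Stand-in for a database cursor (assumption for this example).
async function* fakeCursor() {
  for (let id = 1; id <= 3; id++) yield { id }
}

const ndjsonBody = ndjsonStreamFrom(fakeCursor())
// In a Route Handler, this could be returned as:
// new Response(ndjsonBody, { headers: { 'Content-Type': 'application/x-ndjson' } })
```

Because the iterator advances only inside pull(), the generator (or cursor) is never asked for the next row until the previous chunk has been taken off the queue.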
Consuming Streaming JSON in the Browser
The Fetch API's response.body is a ReadableStream<Uint8Array>. Use a TextDecoder to convert byte chunks to strings, buffer incomplete lines, and parse each complete NDJSON line. For SSE, the browser's EventSource API handles all parsing and reconnection automatically. For custom streaming JSON that is not SSE format, the manual reader loop is required. In React, manage the stream inside a useEffect with cleanup to cancel the stream on unmount.
// ── Consume NDJSON stream with Fetch + ReadableStream ─────────
async function streamNdjson(url: string, onRecord: (record: unknown) => void) {
const res = await fetch(url)
if (!res.ok) throw new Error(`HTTP ${res.status}`)
const reader = res.body!.getReader()
const decoder = new TextDecoder()
let buffer = ''
try {
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
const lines = buffer.split('\n')
// Keep the last (potentially incomplete) fragment
buffer = lines.pop() ?? ''
for (const line of lines) {
if (line.trim()) {
onRecord(JSON.parse(line))
}
}
}
// Process any remaining complete line after stream ends
if (buffer.trim()) {
onRecord(JSON.parse(buffer))
}
} finally {
reader.releaseLock()
}
}
// Usage:
await streamNdjson('/api/export', (record) => {
console.log('Got record:', record)
})
// ── React hook for streaming NDJSON ───────────────────────────
import { useState, useEffect, useRef } from 'react'
function useNdjsonStream<T>(url: string) {
const [records, setRecords] = useState<T[]>([])
const [error, setError] = useState<string | null>(null)
const [done, setDone] = useState(false)
const readerRef = useRef<ReadableStreamDefaultReader<Uint8Array> | null>(null)
useEffect(() => {
let cancelled = false
async function start() {
try {
const res = await fetch(url)
const reader = res.body!.getReader()
readerRef.current = reader
const decoder = new TextDecoder()
let buffer = ''
while (!cancelled) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
const lines = buffer.split('\n')
buffer = lines.pop() ?? ''
const parsed = lines
.filter(l => l.trim())
.map(l => JSON.parse(l) as T)
if (parsed.length > 0 && !cancelled) {
setRecords(prev => [...prev, ...parsed])
}
}
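// Parse a final record that arrived without a trailing newline
if (!cancelled && buffer.trim()) {
const last = JSON.parse(buffer) as T
setRecords(prev => [...prev, last])
}
if (!cancelled) setDone(true)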
} catch (err) {
if (!cancelled) setError(String(err))
}
}
start()
return () => {
cancelled = true
readerRef.current?.cancel()
}
}, [url])
return { records, error, done }
}
// ── Consume SSE with EventSource ─────────────────────────────
function useSseStream(url: string) {
const [events, setEvents] = useState<unknown[]>([])
const [status, setStatus] = useState<'connecting' | 'open' | 'closed'>('connecting')
useEffect(() => {
const es = new EventSource(url)
es.onopen = () => setStatus('open')
es.onmessage = (e) => setEvents(prev => [...prev, JSON.parse(e.data)])
es.onerror = () => {
if (es.readyState === EventSource.CLOSED) setStatus('closed')
}
return () => {
es.close()
setStatus('closed')
}
}, [url])
return { events, status }
}
The buffer accumulation pattern handles chunk boundaries that do not line up with record boundaries: a single JSON object may arrive split across multiple read() calls. Splitting on newlines and keeping the last fragment in buffer holds incomplete lines until the next chunk completes them. Always call reader.releaseLock() or reader.cancel() when you are finished with the stream (in a finally block or the useEffect cleanup function); failing to do so leaves the stream locked and can cause memory leaks in long-running React apps. The { stream: true } option on TextDecoder.decode() tells the decoder that more bytes may follow, so a multi-byte UTF-8 character split across chunk boundaries is held back and decoded correctly once the rest arrives.
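To see both effects in isolation (the line buffer and the { stream: true } decoder), the sketch below feeds a parser two byte chunks that are deliberately cut in the middle of a two-byte character. NdjsonChunkParser is a name made up for this example.

```typescript
// Hypothetical helper: feed it raw byte chunks, get back complete NDJSON records.
class NdjsonChunkParser {
  private decoder = new TextDecoder()
  private buffer = ''

  push(chunk: Uint8Array): unknown[] {
    // stream: true keeps an incomplete multi-byte sequence inside the decoder
    this.buffer += this.decoder.decode(chunk, { stream: true })
    const lines = this.buffer.split('\n')
    this.buffer = lines.pop() ?? '' // the last piece may be an incomplete line
    return lines.filter((l) => l.trim() !== '').map((l) => JSON.parse(l))
  }

  // Call once after the stream ends to handle a final line without "\n".
  flush(): unknown[] {
    const rest = this.buffer + this.decoder.decode()
    this.buffer = ''
    return rest.trim() === '' ? [] : [JSON.parse(rest)]
  }
}

// Cut the bytes in the middle of the two-byte "é" to mimic an awkward chunk boundary.
const demoBytes = new TextEncoder().encode('{"name":"José"}\n{"name":"Bob"}')
const demoCut = demoBytes.indexOf(0xc3) + 1 // 0xC3 is the first byte of "é" in UTF-8
const demoParser = new NdjsonChunkParser()
console.log(demoParser.push(demoBytes.slice(0, demoCut))) // no complete line yet: empty array
console.log(demoParser.push(demoBytes.slice(demoCut)))    // the José record, decoded intact
console.log(demoParser.flush())                           // the Bob record (no trailing newline)
```

In a fetch loop, push() would be called with each value from reader.read() and flush() once done is true.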
OpenAI-Style AI Streaming with SSE and JSON Chunks
OpenAI's streaming chat completions API sends one SSE event per token, each carrying a JSON chunk with a delta field. The stream ends with data: [DONE]. This pattern — SSE + incremental JSON delta — is now the standard for AI streaming APIs and can be replicated in any backend. The client accumulates deltas to reconstruct the full message, updating the UI on each event for word-by-word rendering that reduces perceived latency from 5–10 seconds to near-instant first-token display.
// ── OpenAI streaming response format (one SSE event per token) ─
// data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",
// "choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
// data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",
// "choices":[{"index":0,"delta":{"content":", world"},"finish_reason":null}]}
// data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",
// "choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
// data: [DONE]
// ── Implement OpenAI-style streaming in Next.js ───────────────
// app/api/chat/route.ts
import { NextRequest } from 'next/server'
export const runtime = 'edge'
export const dynamic = 'force-dynamic'
export async function POST(req: NextRequest) {
const { messages } = await req.json()
const encoder = new TextEncoder()
const stream = new ReadableStream({
async start(controller) {
const enqueue = (data: string) =>
controller.enqueue(encoder.encode(`data: ${data}\n\n`))
// Call your LLM / generate tokens
const words = generateResponse(messages) // async iterable of tokens
for await (const token of words) {
enqueue(JSON.stringify({
choices: [{ index: 0, delta: { content: token }, finish_reason: null }],
}))
}
// Final chunk with finish_reason
enqueue(JSON.stringify({
choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
}))
// OpenAI-compatible termination signal
enqueue('[DONE]')
controller.close()
},
})
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
},
})
}
// ── Consume OpenAI streaming in the browser ───────────────────
async function streamChat(messages: { role: string; content: string }[]) {
const res = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ messages }),
})
const reader = res.body!.getReader()
const decoder = new TextDecoder()
let buffer = ''
let fullText = ''
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
const lines = buffer.split('\n')
buffer = lines.pop() ?? ''
for (const line of lines) {
if (!line.startsWith('data: ')) continue
const payload = line.slice(6).trim() // strip "data: " prefix
if (payload === '[DONE]') return fullText
const chunk = JSON.parse(payload)
const delta = chunk.choices?.[0]?.delta?.content ?? ''
fullText += delta
// Update UI incrementally — word-by-word rendering
updateChatUI(fullText)
}
}
return fullText
}
// ── Using Vercel AI SDK (abstracts the streaming boilerplate) ──
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
export async function POST(req: NextRequest) {
const { messages } = await req.json()
const result = await streamText({
model: openai('gpt-4o'),
messages,
})
// toDataStreamResponse() streams the AI SDK's own data stream protocol
// (consumed by the SDK's client hooks such as useChat), not raw OpenAI chunks
return result.toDataStreamResponse()
}
The Vercel AI SDK's streamText and toDataStreamResponse() encapsulate the streaming boilerplate and handle edge cases like stream cancellation when the user navigates away. For custom LLMs, the manual pattern gives full control over the delta format. The critical performance insight: streaming reduces the wait before the user sees any output from the full inference time (5–10 seconds for a long response) to the time-to-first-token (typically 200–800 ms), dramatically improving perceived responsiveness in AI applications.
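The client-side accumulation step can also be written as a pure function, which makes it easy to unit-test apart from the network code. This is a sketch under the chunk shape shown earlier; accumulateDeltas and ChatChunk are names invented here, and the parser accepts data: with or without the optional space.

```typescript
// Hypothetical helper: fold OpenAI-style SSE lines into the accumulated text.
interface ChatChunk {
  choices?: { delta?: { content?: string }; finish_reason?: string | null }[]
}

function accumulateDeltas(sseLines: string[]): { text: string; finished: boolean } {
  let text = ''
  for (const line of sseLines) {
    if (!line.startsWith('data:')) continue // skip comments, id:, event:, blank lines
    const payload = line.slice(5).trim()     // the space after "data:" is optional
    if (payload === '[DONE]') return { text, finished: true }
    const chunk = JSON.parse(payload) as ChatChunk
    text += chunk.choices?.[0]?.delta?.content ?? ''
  }
  return { text, finished: false }
}

const chatResult = accumulateDeltas([
  'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
  '',
  ': keep-alive',
  'data: {"choices":[{"index":0,"delta":{"content":", world"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
  'data: [DONE]',
])
console.log(chatResult) // text is "Hello, world" and finished is true
```

Returning finished separately lets the caller distinguish a clean [DONE] from a stream that simply ran out of lines.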
Go: Streaming JSON with json.NewEncoder and json.NewDecoder
Go's encoding/json package has built-in streaming support via json.NewDecoder and json.NewEncoder. json.NewDecoder(r).Decode(&v) reads exactly one JSON value from the reader and stops — subsequent calls continue from where the previous left off, making it perfect for NDJSON streams. json.NewEncoder(w).Encode(v) writes one JSON value followed by a newline to the writer, naturally producing NDJSON output. Unlike json.Unmarshal which requires the full JSON in memory, the decoder reads lazily from the underlying io.Reader.
package main
import (
"encoding/json"
"fmt"
"io"
"net/http"
"os"
"time"
)
type Record struct {
ID int `json:"id"`
Name string `json:"name"`
Score float64 `json:"score"`
}
// ── Stream NDJSON from a file ──────────────────────────────────
func readNdjsonFile(path string) error {
f, err := os.Open(path)
if err != nil {
return err
}
defer f.Close()
// NewDecoder reads one JSON object per Decode() call
dec := json.NewDecoder(f)
for {
var rec Record
err := dec.Decode(&rec)
if err == io.EOF {
break // end of file — normal termination
}
if err != nil {
return fmt.Errorf("decode error: %w", err)
}
fmt.Printf("Record: %s (%.1f)\n", rec.Name, rec.Score)
}
return nil
}
// ── Serve NDJSON from an HTTP handler ─────────────────────────
func ndjsonHandler(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/x-ndjson")
// No Content-Length — chunked transfer encoding
enc := json.NewEncoder(w) // Encode() appends \n after each value
records := []Record{
{1, "Alice", 98.5},
{2, "Bob", 87.2},
{3, "Carol", 91.0},
}
for _, rec := range records {
if err := enc.Encode(rec); err != nil {
return // client disconnected
}
// Flush to send the chunk immediately without buffering
if f, ok := w.(http.Flusher); ok {
f.Flush()
}
time.Sleep(100 * time.Millisecond) // simulate processing delay
}
}
// ── Stream an SSE response from an HTTP handler ───────────────
func sseHandler(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
flusher, ok := w.(http.Flusher)
if !ok {
http.Error(w, "streaming not supported", http.StatusInternalServerError)
return
}
eventId := 0
ticker := time.NewTicker(500 * time.Millisecond)
defer ticker.Stop()
for {
select {
case <-r.Context().Done():
return // client disconnected
case t := <-ticker.C:
eventId++
payload, err := json.Marshal(map[string]any{
"timestamp": t.UnixMilli(),
"value": t.Second(),
})
if err != nil {
return // payload could not be serialized
}
fmt.Fprintf(w, "id: %d\ndata: %s\n\n", eventId, payload)
flusher.Flush()
}
}
}
// ── Consume NDJSON from an HTTP response ──────────────────────
func consumeNdjson(url string) error {
resp, err := http.Get(url)
if err != nil {
return err
}
defer resp.Body.Close()
// Decode reads one JSON object at a time from the response body
dec := json.NewDecoder(resp.Body)
for {
var rec Record
err := dec.Decode(&rec)
if err == io.EOF {
break
}
if err != nil {
return err
}
fmt.Printf("Got: %+v\n", rec)
}
return nil
}
func main() {
http.HandleFunc("/ndjson", ndjsonHandler)
http.HandleFunc("/events", sseHandler)
http.ListenAndServe(":8080", nil)
}
Go's http.Flusher interface, implemented by the standard http.ResponseWriter, is essential for SSE and NDJSON streaming. Without calling Flush() after each write, Go's buffered HTTP writer holds chunks in memory and sends them in batches, preventing the client from seeing data until the buffer fills or the handler returns. Always watch r.Context().Done() in streaming handlers to detect client disconnection and stop generating data; this prevents goroutine leaks when clients disconnect mid-stream.
Chunked Transfer Encoding vs Content-Length for JSON
HTTP/1.1 supports two modes for response body transmission. With Content-Length, the server announces the exact byte count upfront, so the client knows exactly when the response ends. With Transfer-Encoding: chunked, the server sends the body in pieces without knowing the total size ahead of time: each chunk is prefixed with its length in hexadecimal, and a zero-length chunk (0\r\n\r\n) signals the end. Streaming JSON over HTTP/1.1 uses chunked transfer encoding because the total response size is unknown at the time the first byte is sent. HTTP/2 and HTTP/3 use a different framing mechanism that achieves the same result without the chunked header.
// ── HTTP/1.1 chunked transfer encoding format ─────────────────
// Response headers:
// HTTP/1.1 200 OK
// Content-Type: application/x-ndjson
// Transfer-Encoding: chunked ← no Content-Length header
//
// Body (each chunk: <hex-length>\r\n<data>\r\n):
// 18\r\n
// {"id":1,"name":"Alice"}\n\r\n ← chunk 1: 24 bytes (hex 18)
// 16\r\n
// {"id":2,"name":"Bob"}\n\r\n ← chunk 2: 22 bytes (hex 16)
// 0\r\n ← final zero-length chunk
// \r\n ← end of chunked body
// ── When does Node.js use chunked automatically? ───────────────
// res.setHeader() + res.write() → chunked (streaming mode)
// res.send() / res.json() → Content-Length (buffered mode)
import http from 'node:http'
const server = http.createServer(async (req, res) => {
if (req.url === '/buffered') {
// Buffered: Content-Length calculated, sent as one response
const body = JSON.stringify([{ id: 1 }, { id: 2 }, { id: 3 }])
res.setHeader('Content-Type', 'application/json')
res.setHeader('Content-Length', Buffer.byteLength(body))
res.end(body)
}
if (req.url === '/streaming') {
// Streaming: no Content-Length, chunked transfer encoding
res.setHeader('Content-Type', 'application/x-ndjson')
// Node.js HTTP server sets Transfer-Encoding: chunked automatically
// when you call res.write() without setting Content-Length
for (let i = 1; i <= 1000; i++) {
res.write(JSON.stringify({ id: i, value: Math.random() }) + '\n')
// Each res.write() call becomes one or more chunks
}
res.end() // sends the zero-length terminating chunk
}
})
// ── HTTP/2 streaming (no Transfer-Encoding header needed) ─────
// HTTP/2 uses DATA frames with END_STREAM flag instead of chunked encoding.
// The streaming behavior is identical from the application perspective:
// res.write() sends a DATA frame; res.end() sends the END_STREAM DATA frame.
// node:http2 handles framing transparently — same Node.js streaming API.
// ── Compress streaming JSON with gzip + chunked ───────────────
import zlib from 'node:zlib'
import { pipeline } from 'node:stream/promises'
const gzipServer = http.createServer(async (req, res) => {
res.setHeader('Content-Type', 'application/x-ndjson')
res.setHeader('Content-Encoding', 'gzip')
// Transfer-Encoding: chunked is set automatically by Node.js
const gzip = zlib.createGzip()
gzip.pipe(res)
for (let i = 1; i <= 1000; i++) {
gzip.write(JSON.stringify({ id: i, value: Math.random() }) + '\n')
}
gzip.end()
// gzip compresses JSON ~70-80% — significant bandwidth savings for large streams
})
HTTP/2 eliminates the Transfer-Encoding: chunked header entirely: HTTP/2 frames provide equivalent functionality at the protocol level, and the header is not allowed in HTTP/2 messages. When deploying streaming JSON APIs behind HTTP/2-capable infrastructure (Nginx, Caddy, AWS ALB, Cloudflare), the proxy translates between chunked HTTP/1.1 bodies and HTTP/2 DATA frames, so application code does not need to change. The practical implication: write the body incrementally (res.write() followed by res.end() in Node.js, Write plus Flush() in Go) regardless of HTTP version, because both HTTP libraries abstract the framing differences. The server's Flush() call (Go) or the absence of response buffering (Node.js) determines whether chunks appear at the client immediately or are batched by the runtime.
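For readers who want to see the HTTP/1.1 chunk framing concretely, the sketch below builds one chunk by hand. It is illustrative only: Node.js and Go produce this framing for you, and encodeChunk is a hypothetical name.

```typescript
// Illustrative only: frame one piece of body data as an HTTP/1.1 chunk.
function encodeChunk(data: string): string {
  // The size prefix counts bytes (not characters), written in hexadecimal.
  const byteLength = new TextEncoder().encode(data).length
  return byteLength.toString(16) + '\r\n' + data + '\r\n'
}

const LAST_CHUNK = '0\r\n\r\n' // the zero-length chunk that ends the body

const framedBody =
  encodeChunk('{"id":1,"name":"Alice"}\n') + // 24 bytes, so the prefix is "18"
  encodeChunk('{"id":2,"name":"Bob"}\n') +   // 22 bytes, so the prefix is "16"
  LAST_CHUNK
console.log(JSON.stringify(framedBody)) // JSON.stringify makes the \r\n pairs visible
```

Counting bytes rather than characters matters for non-ASCII payloads: a two-byte character such as "é" contributes 2 to the chunk size.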
Key Terms
- NDJSON (Newline-Delimited JSON)
- A format where each line contains one complete, independently-parseable JSON value, with lines separated by \n. Enables O(1)-memory incremental processing of arbitrarily large datasets. MIME type: application/x-ndjson. Also called JSON Lines (application/jsonl). Used by the Elasticsearch bulk API, OpenAI fine-tuning, BigQuery NDJSON loads, and log aggregators. Each line must be self-contained: embedded newlines in strings must be escaped as \n, not written as literal line breaks.
- Server-Sent Events (SSE)
- An HTTP-based protocol for unidirectional server-to-client streaming using Content-Type: text/event-stream. Events are blocks of field: value lines terminated by a blank line (\n\n). The data: field carries the payload (JSON as a serialized string). The browser's EventSource API handles reconnection using the Last-Event-ID header. SSE works over HTTP/1.1 and HTTP/2 and requires no protocol upgrade; over HTTP/2 the number of concurrent streams is negotiated per connection and commonly defaults to 100.
- ReadableStream
- A Web Streams API interface representing a readable sequence of byte chunks. Used on both client (Fetch API response.body) and server (Next.js App Router new Response(stream)). A ReadableStreamDefaultReader obtained via stream.getReader() exposes an async read() method returning { done, value }. Back-pressure is built in: the stream tracks how much queued data the consumer has yet to read, so a producer that implements pull() or checks desiredSize does not generate data faster than the consumer can process it.
- TransformStream
- A Web Streams API interface with a writable input side and a readable output side, enabling pipe-based stream transformation. A TransformStream can sit between a source ReadableStream and a response, transforming chunks: for example, parsing NDJSON lines and emitting structured objects, or compressing chunks with gzip. The transform(chunk, controller) callback receives each input chunk and calls controller.enqueue() to emit output chunks.
- chunked transfer encoding
- An HTTP/1.1 transfer mechanism where the response body is sent in variable-size pieces without a Content-Length header. Each chunk is prefixed with its size in hexadecimal, allowing the server to start transmitting before knowing the total response size. A zero-length chunk signals end-of-body. HTTP/2 achieves equivalent streaming via DATA frames with the END_STREAM flag, and no Transfer-Encoding header is used. Node.js sets chunked transfer automatically when res.write() is called without a Content-Length header.
- json.NewDecoder (Go)
- Go's streaming JSON decoder from the encoding/json package. Unlike json.Unmarshal, which requires the entire JSON payload in memory, json.NewDecoder(reader).Decode(&v) reads incrementally from an io.Reader and returns as soon as one complete JSON value has been parsed (the decoder may buffer some bytes beyond that value internally). Calling Decode() repeatedly on the same decoder advances through an NDJSON stream; to stream the elements of a single large JSON array, consume the opening bracket with Token() and loop while More() reports further elements. Decode() returns io.EOF when the stream ends, which is the correct way to detect end-of-stream rather than checking for an empty response.
FAQ
What is NDJSON and how is it different from regular JSON?
NDJSON (Newline-Delimited JSON) encodes one complete JSON value per line, using a newline character (\n) as the record separator. A regular JSON file wraps all data in a single root structure — one large object or array — which means the entire file must be parsed before any record is available. NDJSON breaks that constraint: each line is independently valid JSON, so a parser can read and process line 1 while lines 2 through 1,000,000 are still arriving. The MIME type is application/x-ndjson (also written application/jsonl for JSON Lines). NDJSON is the standard format for Elasticsearch bulk operations, OpenAI fine-tuning datasets, BigQuery NDJSON loads, and Hugging Face dataset shards. A 500 MB NDJSON log file can be processed in O(1) memory — one line at a time — whereas the equivalent single-document JSON array would require loading all 500 MB into RAM before the first record can be read. The only constraint is that individual JSON values cannot span multiple lines, so embedded newline characters inside strings must be escaped as \n.
How do Server-Sent Events send JSON data?
Server-Sent Events (SSE) transmit data over a persistent HTTP connection using the text/event-stream content type. Each event is a block of lines ending with two newlines (\n\n). The data: field carries the payload; for JSON, the value is a JSON-serialized string: data: {"type":"message","content":"hello"}\n\n. The browser's built-in EventSource API reconnects automatically after network drops, using the Last-Event-ID header to resume from the last received event. An optional id: field sets this resume token; an event: field sets a named event type instead of the default "message". SSE supports only server-to-client push: the client cannot send data on the stream after the initial HTTP request. This suits real-time feeds, notifications, and AI token streaming. Compared to WebSockets, SSE uses standard HTTP (no protocol upgrade), works through HTTP/2 multiplexing (the concurrent stream limit is negotiated per connection and commonly defaults to 100), and has built-in reconnection. SSE connections stay open for the duration of the stream; servers should send periodic keep-alive comments (: keep-alive\n\n) to prevent proxy timeouts.
How do I stream JSON from a Next.js API route?
In the Next.js App Router, a Route Handler streams by returning a Response whose body is a ReadableStream. Create the ReadableStream with an async start(controller) callback, write JSON chunks using controller.enqueue(encoder.encode(JSON.stringify(record) + '\n')), and finish with controller.close(). Return new Response(stream, { headers: { "Content-Type": "application/x-ndjson" } }). For SSE, set Content-Type: text/event-stream and format chunks as data: {...}\n\n. Add export const dynamic = 'force-dynamic' to prevent Next.js from caching the handler. For long-running AI streams that would otherwise hit the serverless function timeout on the Node.js runtime (10 seconds on some hosting plans), consider export const runtime = 'edge'. Both the Edge and Node.js runtimes support ReadableStream, and the streaming code is the same on either.
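A minimal sketch of such a Route Handler, assuming a file like app/api/records/route.ts (a hypothetical path). The loadRecords generator is a stand-in for a real data source and is not part of Next.js; the handler uses only the standard ReadableStream, TextEncoder, and Response APIs.

```typescript
// Opt out of caching so every request runs the handler.
export const dynamic = "force-dynamic";

// Hypothetical data source; replace with a database cursor or upstream API.
async function* loadRecords(): AsyncGenerator<{ id: number; name: string }> {
  yield { id: 1, name: "Alice" };
  yield { id: 2, name: "Bob" };
}

export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for await (const record of loadRecords()) {
          // One JSON value per line, sent as soon as it is available.
          controller.enqueue(encoder.encode(JSON.stringify(record) + "\n"));
        }
        controller.close();
      } catch (err) {
        // Surface failures to the client instead of leaving the stream hanging.
        controller.error(err);
      }
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "application/x-ndjson" },
  });
}
```

Writing inside start() is the simplest shape; for producers that can outpace the client, a pull()-based source lets the stream apply back-pressure.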
How do I consume a streaming JSON response in the browser?
Use the Fetch API with response.body (a ReadableStream) and a TextDecoder to process chunks as they arrive. The pattern: obtain a reader with res.body.getReader(), call reader.read() in a loop, decode each Uint8Array chunk with decoder.decode(value, { stream: true }), accumulate the text in a string buffer, split on newlines, and parse each complete line as JSON. The buffer is needed because chunk boundaries do not align with line boundaries: a JSON object may arrive split across two or more read() calls, so the last, possibly partial line stays in the buffer until the next chunk completes it. For SSE, use the browser's built-in EventSource API: new EventSource(url), handle events with es.onmessage = (e) => JSON.parse(e.data), and call es.close() to stop. In React, start the reader in useEffect and call reader.cancel() in the cleanup function so the stream is not left locked and open when the component unmounts. The { stream: true } option on TextDecoder prevents multi-byte UTF-8 characters split across chunk boundaries from being corrupted.
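The read loop described above can be sketched as an async generator. The name readNdjson is an assumption for illustration; the reader, TextDecoder, and stream option are standard Web APIs.

```typescript
// Yield one parsed record per NDJSON line from a byte stream.
async function* readNdjson<T>(body: ReadableStream<Uint8Array>): AsyncGenerator<T> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps an incomplete multi-byte character for the next chunk.
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      // The last element is either "" or a partial line; keep it for later.
      buffer = lines.pop() ?? "";
      for (const line of lines) {
        if (line.trim() !== "") yield JSON.parse(line) as T;
      }
    }
    // Flush the decoder and handle a final line that had no trailing newline.
    buffer += decoder.decode();
    if (buffer.trim() !== "") yield JSON.parse(buffer) as T;
  } finally {
    reader.releaseLock();
  }
}

// Usage in a browser (URL is hypothetical):
//   const res = await fetch("/api/records");
//   for await (const record of readNdjson(res.body!)) render(record);
```

A caller that stops early should also cancel the stream, for example with an AbortController passed to fetch, so the connection is closed rather than only unlocked.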
How do I implement OpenAI-style streaming in my own API?
OpenAI streaming uses Server-Sent Events with a specific JSON envelope per token. Each SSE event carries one JSON chunk: data: {"choices":[{"delta":{"content":"hello"},"finish_reason":null}]}. The stream ends with data: [DONE]. To replicate this: 1) Set Content-Type: text/event-stream and Cache-Control: no-cache. 2) For each token from your LLM or data source, write data: <JSON>\n\n to the response. 3) Send data: [DONE]\n\n to signal completion. In the Next.js App Router, use a ReadableStream that iterates an async generator. The client reads events, strips the data: prefix, stops at [DONE], parses the JSON, and appends delta.content to the accumulated text. Streaming does not make generation faster, but the user waits only for the first token (roughly 200–800 ms) instead of the full inference time (5–10 seconds for a long response), which greatly improves perceived responsiveness. The Vercel AI SDK's streamText with toDataStreamResponse() packages a comparable streaming pattern for Next.js Route Handlers.
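The three server steps and the client-side accumulation can be sketched together. The names openAiStyleResponse and DeltaAccumulator are hypothetical, and the chunk type covers only the fields mentioned above, not the full OpenAI schema.

```typescript
interface ChatChunk {
  choices: { delta: { content?: string }; finish_reason: string | null }[];
}

// Server side: wrap any async token source in an OpenAI-style SSE response.
function openAiStyleResponse(tokens: AsyncIterable<string>): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for await (const token of tokens) {
          const chunk: ChatChunk = {
            choices: [{ delta: { content: token }, finish_reason: null }],
          };
          controller.enqueue(encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`));
        }
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        controller.close();
      } catch (err) {
        controller.error(err);
      }
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" },
  });
}

// Client side: feed decoded text in arbitrary slices; complete events are parsed.
class DeltaAccumulator {
  private buffer = "";
  text = "";
  done = false;

  push(chunk: string): void {
    this.buffer += chunk;
    const events = this.buffer.split("\n\n");
    this.buffer = events.pop() ?? ""; // keep any incomplete trailing event
    for (const event of events) {
      for (const line of event.split("\n")) {
        if (!line.startsWith("data:")) continue; // skip comments and other fields
        const payload = line.slice(5).trim();
        if (payload === "") continue;
        if (payload === "[DONE]") {
          this.done = true;
          return;
        }
        const parsed = JSON.parse(payload) as ChatChunk;
        this.text += parsed.choices[0]?.delta?.content ?? "";
      }
    }
  }
}
```

In a browser, the accumulator would be fed from a fetch reader with TextDecoder, since EventSource cannot send the POST body that chat endpoints usually require.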
What is the difference between SSE and WebSockets for JSON streaming?
Server-Sent Events (SSE) and WebSockets solve different problems. SSE is unidirectional (server to client only) over standard HTTP/1.1 or HTTP/2, uses the text/event-stream content type, has built-in reconnection with Last-Event-ID, and requires no special server infrastructure. WebSockets are bidirectional (full-duplex), require a protocol upgrade from HTTP, and keep a persistent TCP connection open. For JSON streaming in one direction (AI token streaming, live dashboard updates, notification feeds), SSE is simpler. Over HTTP/1.1, browsers limit each origin to about 6 connections, which caps the number of simultaneous SSE streams; over HTTP/2, streams are multiplexed on a single connection, with a server-negotiated limit that is commonly 100. WebSockets are better when the client must send frequent messages back to the server while the stream is active: multiplayer games, collaborative editing, real-time chat with sub-50 ms latency requirements. The practical rule: if you only need server-push JSON, use SSE. If you need bidirectional JSON at low latency with frequent client messages, use WebSockets.
How do I stream large JSON datasets without running out of memory?
Never buffer the entire dataset in memory; process one record at a time. For NDJSON output in Node.js, write each JSON line to the response as it is generated rather than collecting all records first. For NDJSON input in Node.js, use the readline module: createInterface({ input: fs.createReadStream("data.ndjson") }) and iterate with for await (const line of rl). This processes arbitrarily large files with memory use bounded by the longest line, because only one line is held at a time. In Go, json.NewDecoder(r.Body).Decode(&v) reads the next JSON value from the stream without waiting for the rest of the body. In Python, use ijson for streaming JSON array parsing: for item in ijson.items(file, "item"): process(item). For database-backed streaming, use cursor-based pagination or a SQL CURSOR to avoid loading all rows: iterate rows in batches of 100–500 and yield each as a JSON line. This pattern handles datasets of any size (1 GB, 100 GB) without memory pressure. Because the total size is not known up front, a streaming response omits Content-Length; the body is carried incrementally by Transfer-Encoding: chunked (HTTP/1.1) or by HTTP/2 DATA frames.
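The database-backed case can be sketched as an async generator over pages. The Page and FetchPage types and the ndjsonLines name are assumptions for illustration; fetchPage stands in for whatever cursor-based query the data layer provides.

```typescript
interface Page<T> {
  rows: T[];
  nextCursor: string | null; // null means there are no more rows
}

// Hypothetical signature for a cursor-paginated query.
type FetchPage<T> = (cursor: string | null, limit: number) => Promise<Page<T>>;

// Yield NDJSON lines while holding at most one batch of rows in memory.
async function* ndjsonLines<T>(
  fetchPage: FetchPage<T>,
  batchSize = 500,
): AsyncGenerator<string> {
  let cursor: string | null = null;
  do {
    const page: Page<T> = await fetchPage(cursor, batchSize);
    for (const row of page.rows) {
      yield JSON.stringify(row) + "\n";
    }
    cursor = page.nextCursor;
  } while (cursor !== null);
}
```

Each yielded line can be written straight to the response, for example by enqueueing it on a ReadableStream controller, so peak memory depends on the batch size rather than the table size.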
How do I parse NDJSON line by line in Node.js?
Use the Node.js built-in readline module for file-based NDJSON: createInterface({ input: createReadStream("data.ndjson"), crlfDelay: Infinity }) and iterate with for await (const line of rl). The crlfDelay: Infinity option makes readline treat every \r\n pair as a single line break, which handles Windows-style line endings. Wrap each JSON.parse(line) in a try/catch, because a single malformed line should not abort the entire stream. For HTTP responses from fetch(), read the body stream chunk by chunk, accumulate in a buffer string, split on \n, parse each complete line, and keep any partial last line in the buffer for the next chunk; this correctly handles records that are split across chunks. The ndjson npm package wraps this pattern as a Node.js Transform stream. For TypeScript, validate each parsed object with Zod: MySchema.safeParse(JSON.parse(line)) reports schema violations in untrusted NDJSON input as a result object instead of throwing, although JSON.parse itself can still throw on malformed text and needs the try/catch. On typical hardware, a 1 GB NDJSON file with 10 million records can be processed in roughly 30 seconds with this pattern while using under 50 MB of RAM, though actual figures depend on disk speed and record size.
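A minimal sketch of the readline approach with per-line error handling. The LineResult type and the parseLines and readNdjsonFile names are illustrative assumptions; createInterface, createReadStream, and crlfDelay are the Node.js APIs named above.

```typescript
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

type LineResult<T> =
  | { ok: true; value: T; lineNumber: number }
  | { ok: false; error: string; lineNumber: number };

// Parse each line independently so one malformed record does not abort the run.
async function* parseLines<T>(
  lines: AsyncIterable<string>,
): AsyncGenerator<LineResult<T>> {
  let lineNumber = 0;
  for await (const line of lines) {
    lineNumber++;
    if (line.trim() === "") continue; // tolerate blank lines
    let value: T;
    try {
      value = JSON.parse(line) as T;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      yield { ok: false, error: message, lineNumber };
      continue;
    }
    yield { ok: true, value, lineNumber };
  }
}

// File-based entry point: readline yields one line at a time from the stream.
function readNdjsonFile<T>(path: string): AsyncGenerator<LineResult<T>> {
  const rl = createInterface({
    input: createReadStream(path),
    crlfDelay: Infinity, // treat every \r\n as a single line break
  });
  return parseLines<T>(rl);
}
```

Returning a result object per line lets the caller decide whether to log, count, or stop on bad records; a schema check such as Zod's safeParse would slot in right after the JSON.parse call.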
Further reading and primary sources
- NDJSON Specification — Formal NDJSON format specification — one JSON value per line, newline-separated
- MDN: Server-Sent Events — MDN reference for the EventSource API and SSE event format with browser compatibility
- MDN: ReadableStream — Web Streams API ReadableStream documentation — getReader(), enqueue(), and back-pressure
- OpenAI Streaming API Reference — OpenAI SSE streaming format: chunk structure, delta fields, [DONE] termination, and error handling
- Vercel AI SDK: streamText — Vercel AI SDK streamText() and toDataStreamResponse() for Next.js streaming AI Route Handlers