JSON String Escaping: Complete Reference, Unicode & stringify()
JSON strings must escape two syntactic characters and every control character. The specification defines 8 short escape sequences: \" (double quote), \\ (backslash), \/ (forward slash, optional), \b (backspace), \f (form feed), \n (newline), \r (carriage return), and \t (tab); every other control character below U+0020 uses the \uXXXX form. JSON.stringify() handles all required escaping automatically, so you never need to manually escape strings before calling stringify(), and JSON.parse() unescapes them back. Characters above U+FFFF may appear as-is in UTF-8 text; when written as escapes they require a surrogate pair: \uD83D\uDE00 encodes the 😀 emoji. This guide covers the complete JSON escape sequence reference, Unicode escaping with \uXXXX, surrogate pairs, JSON.stringify() escaping options, and the critical differences between JSON escaping and HTML entity encoding. Practical examples include newlines in JSON values, JSON inside HTML attributes, and escaping for database storage.
JSON Escape Sequences: The Complete Reference Table
The JSON syntax specification (RFC 8259) defines exactly 8 short escape sequences. The double quote and backslash are mandatory because they have syntactic meaning inside JSON strings. The five control characters with short forms (\b, \f, \n, \r, \t) must be escaped because raw control characters (U+0000–U+001F) are not permitted inside JSON string delimiters; the remaining control characters have no short form and use \uXXXX. Forward slash escaping is the only optional one.
| Escape Sequence | Unicode Code Point | Character Name | Mandatory? |
|---|---|---|---|
| \" | U+0022 | Double quote | Yes |
| \\ | U+005C | Backslash (reverse solidus) | Yes |
| \/ | U+002F | Forward slash (solidus) | Optional |
| \b | U+0008 | Backspace | Yes |
| \f | U+000C | Form feed | Yes |
| \n | U+000A | Line feed (newline) | Yes |
| \r | U+000D | Carriage return | Yes |
| \t | U+0009 | Horizontal tab | Yes |
| \uXXXX | U+0000–U+FFFF | Unicode escape (any BMP code point) | For U+0000–U+001F |
// ── All 8 JSON escape sequences in practice ──────────────────────
const examples = {
doubleQuote: "He said \"hello\"", // \" → "
backslash: "C:\\Users\\Alice", // \\ → \
forwardSlash: "https:\/\/jsonic.io", // \/ → / (optional)
backspace: "before\bafter", // \b → backspace char
formFeed: "page1\fpage2", // \f → form feed
newline: "line1\nline2", // \n → line feed
carriageReturn:"line1\rline2", // \r → carriage return
tab: "col1\tcol2", // \t → tab
unicode: "caf\u00e9", // \uXXXX → é
};
console.log(JSON.stringify(examples, null, 2));
// All values are auto-escaped by JSON.stringify()
// ── Characters that do NOT need escaping ──────────────────────────
// Any Unicode above U+001F (except " and \) is valid as-is:
const valid = {
japanese: "こんにちは", // U+3053 etc. — fine as-is
emoji: "Hello 👋", // U+1F44B — fine as-is in JSON
accented: "café résumé", // fine as-is
symbols: "© ™ € £ ¥", // fine as-is
angle: "< > & ' `", // fine as-is in JSON (not HTML!)
};
// ── Raw control characters are INVALID in JSON strings ────────────
// These will cause a parse error:
// {"bad": "line1
// line2"} ← raw newline: INVALID JSON
// {"bad": " tab"} ← raw tab: INVALID JSON (must use \t)
// Validate with JSON.parse — it throws on invalid JSON:
try {
  // Note the doubled backslashes: the JS string literal must contain a literal backslash.
  JSON.parse('{"ok": "line1\\nline2"}'); // valid: \n escape
  JSON.parse('{"ok": "line1\\u000Aline2"}'); // valid: \uXXXX form
  JSON.parse('{"bad": "line1\nline2"}'); // throws: JS turns \n into a raw newline first
} catch (e) {
  console.error("Invalid JSON:", e.message);
}
Every other Unicode character (U+0020 and above, excluding " and \) may appear directly in a JSON string without escaping. This means Chinese, Arabic, emoji, mathematical symbols, and currency signs are all valid JSON string content as-is, as long as the file encoding is UTF-8 (RFC 8259 requires UTF-8 for JSON exchanged between systems). The \uXXXX form is an alternative encoding, not a requirement, for those characters. See the JSON data types guide for a full overview of the string type.
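The rules in the table above are small enough to implement by hand. The sketch below is only an illustration (the names `SHORT_ESCAPES` and `escapeJsonString` are hypothetical, and in real code JSON.stringify() is the right tool); it escapes the two syntactic characters and the C0 control characters, then compares the result with the built-in.

```javascript
// Hypothetical helper: escapes a string following the escape table above.
const SHORT_ESCAPES = {
  '"': '\\"',
  "\\": "\\\\",
  "\b": "\\b",
  "\f": "\\f",
  "\n": "\\n",
  "\r": "\\r",
  "\t": "\\t",
};

function escapeJsonString(str) {
  // Match the double quote, the backslash, and every control character U+0000–U+001F.
  const body = str.replace(/["\\\u0000-\u001F]/g, (ch) =>
    SHORT_ESCAPES[ch] ??
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
  return '"' + body + '"';
}

const sample = 'tab\there "quoted" C:\\dir \u0001 café 😀';
console.log(escapeJsonString(sample) === JSON.stringify(sample)); // true
```

The forward slash is deliberately absent from the map, since escaping it is optional. The sketch also does not escape lone surrogates, which JSON.stringify() does, so the comparison only holds for well-formed strings.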
Unicode Escaping with \uXXXX and Surrogate Pairs
The \uXXXX escape encodes any Unicode code point in the Basic Multilingual Plane (U+0000 to U+FFFF) using exactly 4 hexadecimal digits. For supplementary characters (U+10000 and above) — which include emoji, historic scripts, and mathematical symbols — JSON uses surrogate pairs: two consecutive \uXXXX sequences derived from the UTF-16 encoding of the code point.
// ── \uXXXX: Basic Multilingual Plane (U+0000 to U+FFFF) ──────────
// Any BMP character encoded as exactly 4 hex digits
const bmpExamples = {
null_char: "\u0000", // U+0000 — null (control, must escape)
tab: "\u0009", // U+0009 — same as \t
newline: "\u000A", // U+000A, same as \n (hex digits may be upper- or lowercase)
space: "\u0020", // U+0020 — space (can also appear as-is)
A: "\u0041", // U+0041 — 'A' (can also appear as-is)
eAccent: "\u00E9", // U+00E9 — é
euroSign: "\u20AC", // U+20AC — €
hiragana: "\u3053", // U+3053 — こ
cjk: "\u4E2D", // U+4E2D — 中
};
// Parsing \uXXXX sequences:
JSON.parse('"\\u00E9"'); // → "é" (\u00E9 parsed to é)
JSON.parse('"caf\\u00E9"'); // → "café"
// ── Surrogate pairs: Supplementary Plane (U+10000 and above) ─────
// Step 1: subtract 0x10000 from the code point
// Step 2: high surrogate = 0xD800 + (offset >> 10)
// Step 3: low surrogate = 0xDC00 + (offset & 0x3FF)
// Example: 😀 GRINNING FACE (U+1F600)
// offset = 0x1F600 - 0x10000 = 0xF600
// high surrogate = 0xD800 + (0xF600 >> 10) = 0xD800 + 0x3D = 0xD83D
// low surrogate = 0xDC00 + (0xF600 & 0x3FF) = 0xDC00 + 0x200 = 0xDE00
// JSON encoding: \uD83D\uDE00
const emojiExamples = {
grinningFace: "\uD83D\uDE00", // 😀 U+1F600
thumbsUp: "\uD83D\uDC4D", // 👍 U+1F44D
fire: "\uD83D\uDD25", // 🔥 U+1F525
rocket: "\uD83D\uDE80", // 🚀 U+1F680
flag_us: "\uD83C\uDDFA\uD83C\uDDF8", // 🇺🇸 (two surrogate pairs!)
};
console.log(JSON.parse(JSON.stringify(emojiExamples)));
// { grinningFace: "😀", thumbsUp: "👍", fire: "🔥", ... }
// ── JavaScript surrogate pair calculation ─────────────────────────
function codePointToSurrogatePair(codePoint) {
if (codePoint <= 0xFFFF) {
return `\\u${codePoint.toString(16).padStart(4, "0").toUpperCase()}`;
}
const offset = codePoint - 0x10000;
const high = 0xD800 + (offset >> 10);
const low = 0xDC00 + (offset & 0x3FF);
return (
`\\u${high.toString(16).toUpperCase()}` +
`\\u${low.toString(16).toUpperCase()}`
);
}
codePointToSurrogatePair(0x1F600); // "\\uD83D\\uDE00"
codePointToSurrogatePair(0x00E9); // "\\u00E9"
// ── ASCII-safe JSON output (escape all non-ASCII) ─────────────────
function asciiSafeStringify(value) {
  // Without the "u" flag the regex matches UTF-16 code units, so both halves
  // of a surrogate pair are escaped separately, which is what JSON expects.
  return JSON.stringify(value).replace(/[\u0080-\uFFFF]/g, (char) =>
    `\\u${char.charCodeAt(0).toString(16).padStart(4, "0")}`
  );
}
asciiSafeStringify({ msg: "café 😀" });
// '{"msg":"caf\\u00e9 \\ud83d\\ude00"}' (pure ASCII output)
Surrogate pairs are a consequence of JSON using UTF-16 code units for \uXXXX escapes, the same encoding JavaScript uses internally for strings. A lone surrogate (a high surrogate without a following low surrogate, or vice versa) is allowed by the JSON grammar but does not encode a valid Unicode character; RFC 8259 warns that software receiving one may behave unpredictably. Flag emoji like 🇺🇸 require two surrogate pairs (4 \uXXXX sequences) because they are composed of two supplementary code points. The ASCII-safe stringify approach above is useful when transmitting JSON through systems that may mangle non-ASCII bytes, such as legacy email servers or certain XML processors.
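Going the other way, from a surrogate pair back to a code point, reverses the same arithmetic. The following is a minimal sketch under that assumption; `surrogatePairToCodePoint` is a hypothetical name, and the built-in `codePointAt()` already does this for strings.

```javascript
// Hypothetical helper: combine a high and a low surrogate into one code point.
function surrogatePairToCodePoint(high, low) {
  if (high < 0xD800 || high > 0xDBFF || low < 0xDC00 || low > 0xDFFF) {
    throw new RangeError("not a valid high/low surrogate pair");
  }
  // Undo the encoding steps: 10 bits from each half, plus the 0x10000 offset.
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

const grinning = surrogatePairToCodePoint(0xD83D, 0xDE00);
console.log(grinning.toString(16));          // "1f600"
console.log(String.fromCodePoint(grinning)); // "😀"

// The built-in codePointAt() performs the same combination:
console.log("\uD83D\uDE00".codePointAt(0) === grinning); // true
```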
JSON.stringify() Automatic Escaping Behavior
JSON.stringify() performs all required JSON escaping automatically. You should never manually escape strings before passing them to stringify — doing so causes double-escaping, one of the most common JSON bugs. Understanding exactly what stringify escapes (and does not escape) prevents both double-escaping and missed-escaping errors. See the full JSON.stringify() guide for replacer, space, and toJSON options.
// ── What JSON.stringify() escapes automatically ───────────────────
const input = {
  doubleQuote: 'He said "hello"', // → He said \"hello\"
  backslash: 'C:\\Users\\Alice', // → C:\\Users\\Alice
  newline: "line1\nline2", // → line1\nline2
  tab: "col1\tcol2", // → col1\tcol2
  nullChar: "\u0000", // → \u0000
  controlChars: "\u0001\u001F", // → \u0001\u001f
  forwardSlash: "https://jsonic.io", // NOT escaped: / → /
  angle: "<script>", // NOT escaped: < > → < >
  ampersand: "a & b", // NOT escaped: & → &
  unicode: "café 😀", // NOT escaped: kept as-is
};
const json = JSON.stringify(input, null, 2);
// Raw JSON text produced (what console.log(json) prints):
// {
//   "doubleQuote": "He said \"hello\"",
//   "backslash": "C:\\Users\\Alice",
//   "newline": "line1\nline2",
//   "tab": "col1\tcol2",
//   "nullChar": "\u0000",
//   "controlChars": "\u0001\u001f",
//   "forwardSlash": "https://jsonic.io",
//   "angle": "<script>",
//   "ampersand": "a & b",
//   "unicode": "café 😀"
// }
// ── Double-escaping: the most common stringify mistake ────────────
// BAD: manually escape then stringify
const alreadyEscaped = 'line1\\nline2'; // literal backslash-n
JSON.stringify(alreadyEscaped);
// Result: "\"line1\\\\nline2\"" — double-escaped!
// When parsed back: "line1\\nline2" — NOT a newline, a literal \n
// GOOD: pass the actual string with the real newline
const realNewline = "line1\nline2"; // real newline character
JSON.stringify(realNewline);
// Result: "\"line1\\nline2\"" — correctly escaped
// ── XSS-safe stringify for HTML embedding ─────────────────────────
// JSON.stringify() does NOT escape < > & (dangerous in HTML contexts).
// Post-process the output to escape these for safe embedding:
function xssSafeStringify(value, space) {
return JSON.stringify(value, null, space)
.replace(/</g, "\\u003C")
.replace(/>/g, "\\u003E")
.replace(/&/g, "\\u0026")
.replace(/'/g, "\\u0027");
}
xssSafeStringify({ script: "<script>alert(1)</script>" });
// '{"script":"\\u003Cscript\\u003Ealert(1)\\u003C/script\\u003E"}'
// Safe to embed inside a <script> element: no < or > characters remain
// ── JSON.parse() reverses all escaping ───────────────────────────
JSON.parse('"line1\\nline2"'); // → "line1\nline2" (real newline)
JSON.parse('"caf\\u00E9"'); // → "café"
JSON.parse('"He said \\\"hi\\\""'); // → 'He said "hi"'
// ── Forcing ASCII-safe \uXXXX output ──────────────────────────────
// A replacer cannot do this: any backslash it returns is escaped again by
// stringify (the double-escaping bug above). Post-process the output instead:
function asciiOnlyStringify(value) {
  return JSON.stringify(value).replace(/[\u007F-\uFFFF]/g, (c) =>
    `\\u${c.charCodeAt(0).toString(16).padStart(4, "0")}`
  );
}
asciiOnlyStringify({ greeting: "café" });
// '{"greeting":"caf\\u00e9"}'
The critical takeaway: JSON.stringify() does not escape <, >, &, or / by default. These are safe inside pure JSON contexts but dangerous when the JSON string is embedded in HTML. The xssSafeStringify pattern above, which replaces those characters with their \uXXXX equivalents, is the recommended approach for server-side rendering of JSON into HTML pages. The Unicode escape form \u003C is completely equivalent to < from JSON's perspective, so JSON parsers handle it correctly while HTML parsers see no angle brackets.
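The claim that \u003C is equivalent to < for a JSON parser can be checked with a round trip. This is a small sketch with a hypothetical helper name (`escapeForHtml`), covering only <, >, and &.

```javascript
// Hypothetical helper: replace HTML-sensitive characters in JSON text
// with their \uXXXX escapes.
function escapeForHtml(json) {
  return json
    .replace(/</g, "\\u003C")
    .replace(/>/g, "\\u003E")
    .replace(/&/g, "\\u0026");
}

const original = { html: "<b>Tom & Jerry</b>" };
const escaped = escapeForHtml(JSON.stringify(original));

console.log(escaped);
// {"html":"\u003Cb\u003ETom \u0026 Jerry\u003C/b\u003E"}
console.log(/[<>&]/.test(escaped));                      // false
console.log(JSON.parse(escaped).html === original.html); // true
```

The replacement is safe to run over the whole JSON text because <, >, and & can only occur inside string values, never in JSON's structural syntax.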
Embedding JSON in HTML: Avoiding XSS with Escaping
Embedding JSON directly in HTML pages is a common pattern for server-side data injection, structured data (JSON-LD), and hydration of client-side frameworks. Done incorrectly, it opens XSS vulnerabilities. The key risk: JSON strings may contain <, >, &, and / characters that HTML parsers interpret as markup, allowing injection of arbitrary HTML and JavaScript.
// ── Dangerous: raw JSON in a <script> tag ────────────────────────
// If data.comment = "</script><script>alert(1)</script>":
// <script>
// const data = {"comment": "</script><script>alert(1)</script>"};
// </script>
// The HTML parser sees </script> and closes the tag early — XSS!
// ── Safe: escape </script> sequences ─────────────────────────────
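function safeJsonForScript(data, space) {
  return JSON.stringify(data, null, space)
    // Escape every "</script" so the HTML parser does not close the tag.
    // The parser also closes on "</script " and "</script/", so match the prefix
    // and keep the original letter case via the capture group:
    .replace(/<\/(script)/gi, "<\\/$1")
    // Also escape <!-- to prevent comment injection. \u003C is used because
    // "\!" is not a valid JSON escape:
    .replace(/<!--/g, "\\u003C!--");
}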
// Usage in a server-rendered page (Node.js/Express example):
// <script>
// const __DATA__ = <%= safeJsonForScript(serverData) %>;
// </script>
// Now </script> in data becomes <\/script> — HTML parser ignores it
// ── Safest: \uXXXX escaping for all potentially dangerous chars ──
function fullyEscapedJson(data, space) {
return JSON.stringify(data, null, space)
.replace(/</g, "\\u003C") // < → \u003C
.replace(/>/g, "\\u003E") // > → \u003E
.replace(/&/g, "\\u0026") // & → \u0026
.replace(/'/g, "\\u0027") // ' → \u0027 (for HTML attr contexts)
.replace(/\//g, "\\u002F"); // / → \u002F (closes </script>)
}
// Resulting JSON is pure ASCII except for Unicode strings:
fullyEscapedJson({ url: "https://jsonic.io", tag: "<b>bold</b>" });
// '{"url":"https:\\u002F\\u002Fjsonic.io","tag":"\\u003Cb\\u003Ebold\\u003C\\u002Fb\\u003E"}'
// ── JSON-LD structured data (<script type="application/ld+json">) ─
// Inside type="application/ld+json", the browser does not execute
// the content as JS, but still stops at </script>:
function safeJsonLd(schema) {
  return JSON.stringify(schema)
    .replace(/<\/(script)/gi, "<\\/$1"); // \/ is a valid JSON escape for /
  // OR use fullyEscapedJson() for maximum safety
}
// HTML output:
// <script type="application/ld+json">
// { "@context": "https://schema.org", ... }
// </script>
// ── React / Next.js: dangerouslySetInnerHTML does NOT escape ──────
// React inserts the __html string as-is, so escape it yourself with
// fullyEscapedJson() before rendering in SSR:
// <script
// type="application/ld+json"
// dangerouslySetInnerHTML={{ __html: fullyEscapedJson(jsonLd) }}
// />
// ── NEVER use JSON in HTML event attributes ───────────────────────
// Bad (the HTML parser decodes entities before JS sees the code):
// <div onclick='handleData({"a":1,"b":"x &amp; y"})'>
// JS sees: {"a":1,"b":"x & y"}, so a value that really contains "&amp;" is corrupted
// Bad (double quotes conflict with attribute delimiters):
// <div data-json="{"key":"value"}"> (the attribute value ends at the second quote)
// Set data attributes through the DOM with JSON.stringify instead:
// element.dataset.config = JSON.stringify(config);
// const config = JSON.parse(element.dataset.config);
The safest pattern for HTML embedding uses the \uXXXX form for <, >, &, and /. This produces JSON that is valid and parseable while containing zero HTML-sensitive characters. Django's json_script template tag, for example, escapes <, >, and & this way. In React and Next.js, dangerouslySetInnerHTML does not HTML-encode the content before inserting it, so apply fullyEscapedJson() before passing the string to ensure XSS safety in SSR.
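When a data attribute has to be written into server-rendered markup rather than set through the DOM, the JSON text needs HTML attribute encoding on top of its own escaping. The sketch below assumes a double-quoted attribute; `escapeHtmlAttribute` is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: HTML-encode text for use inside a double-quoted attribute.
function escapeHtmlAttribute(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so the entities added below survive
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const config = { label: 'Say "hi" & <wave>' };
const attr = escapeHtmlAttribute(JSON.stringify(config));
const html = `<div data-config="${attr}"></div>`;
console.log(html);
```

In the browser, the HTML parser decodes the entities while building the DOM, so `JSON.parse(element.dataset.config)` receives the original JSON text. This layering (JSON escaping first, HTML encoding second) applies to attributes only; inside a script element the \uXXXX approach described above is the one to use.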
Newlines and Tabs in JSON String Values
Newlines and tabs are the most frequently mishandled JSON escape sequences. Raw newline characters (U+000A) and tab characters (U+0009) inside JSON string delimiters are invalid — they must be represented as \n and \t. The confusion usually arises when JSON is hand-edited or generated by string concatenation rather than JSON.stringify(). Understanding exactly how these characters flow through stringify and parse prevents the common \n vs real newline mismatch.
// ── Newlines in JSON string values ───────────────────────────────
// INVALID: raw newline inside JSON string
// {"message": "line1
// line2"}
// JSON parsers reject this — use \n instead:
// VALID: \n escape sequence
const validJson = '{"message": "line1\\nline2"}';
const parsed = JSON.parse(validJson);
console.log(parsed.message);
// line1
// line2 ← real newline character in output
// ── JSON.stringify() converts real newlines to \n ─────────────────
const obj = { message: "line1\nline2" }; // real newline in JS string
const json = JSON.stringify(obj);
console.log(json); // {"message":"line1\nline2"} (backslash + n, not a real newline)
console.log(json.includes("\\n")); // true — \n in JSON output
// Round-trip: stringify then parse restores the original string
const restored = JSON.parse(json);
console.log(restored.message === obj.message); // true
// ── Tabs in JSON string values ────────────────────────────────────
const tabJson = '{"row": "col1\\tcol2\\tcol3"}';
const tabParsed = JSON.parse(tabJson);
console.log(tabParsed.row);
// col1 col2 col3 ← real tab characters
// CSV data with tabs (TSV format):
const tsvData = {
headers: "name\tage\tcity",
row1: "Alice\t30\tNYC",
row2: "Bob\t25\tLA",
};
const tsvJson = JSON.stringify(tsvData);
// '{"headers":"name\\tage\\tcity","row1":"Alice\\t30\\tNYC",...}'
// ── Multi-line text: \n for LF, \r\n for CRLF ─────────────────
// Unix line endings (LF only):
const unixText = { content: "line1\nline2\nline3" };
// Windows line endings (CRLF):
const windowsText = { content: "line1\r\nline2\r\nline3" };
JSON.stringify(unixText);
// '{"content":"line1\\nline2\\nline3"}'
JSON.stringify(windowsText);
// '{"content":"line1\\r\\nline2\\r\\nline3"}'
// ── Detecting raw control characters in malformed JSON ────────────
// Caveat: tab, LF, and CR are legal whitespace BETWEEN tokens, so this
// check is only meaningful for compact (non-pretty-printed) JSON text.
function detectRawControlChars(jsonString) {
  const match = jsonString.match(/[\x00-\x1F]/);
  if (match) {
    const hex = match[0].charCodeAt(0).toString(16).padStart(4, "0").toUpperCase();
    console.warn(`Raw control character U+${hex} found at index ${match.index}`);
  }
  return match !== null;
}
// ── Normalize newlines before stringifying ────────────────────────
// If input may contain mixed line endings, normalize first:
function normalizeNewlines(str) {
return str.replace(/\r\n/g, "\n").replace(/\r/g, "\n");
}
const normalized = normalizeNewlines("line1\r\nline2\rline3");
// "line1\nline2\nline3"
JSON.stringify({ text: normalized });
// '{"text":"line1\\nline2\\nline3"}'
The most reliable approach: always use JSON.stringify() to produce JSON from JavaScript string values containing newlines and tabs, because it correctly converts the real characters to their \n and \t escape forms. Hand-editing JSON multi-line strings is error-prone; use a JSON editor or generate programmatically. When consuming JSON from external systems (APIs, databases), always validate with JSON.parse() rather than regex: the parse call will throw on raw control characters, surfacing malformed data before it reaches your application logic.
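One reason to prefer JSON.parse() over a regex check is that newlines and tabs are legal between tokens and illegal only inside a string. A short sketch of that difference, using a hypothetical `isValidJson` helper:

```javascript
// Hypothetical helper: returns true when JSON.parse accepts the text.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Raw newlines and a tab between tokens, escaped \n inside the string:
const prettyPrinted = '{\n\t"message": "line1\\nline2"\n}';
// Raw newline inside the string value:
const rawNewlineInString = '{"message": "line1\nline2"}';

console.log(isValidJson(prettyPrinted));      // true
console.log(isValidJson(rawNewlineInString)); // false
```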
Escaping JSON for SQL, Shell, and Environment Variables
When JSON strings are stored in databases, passed as shell arguments, or embedded in environment variables, additional escaping layers are required beyond JSON's own rules. Each context has its own special characters that must be escaped — and these layers are applied on top of, not instead of, JSON's escaping.
// ── SQL: parameterized queries prevent injection ─────────────────
// NEVER: string interpolation with JSON data
// const sql = `INSERT INTO logs (data) VALUES ('${jsonString}')`;
// A JSON string containing a single quote breaks the SQL string literal.
// CORRECT: parameterized query (Node.js pg / postgres):
import { Pool } from "pg";
const pool = new Pool();
const data = { message: "It's a test", tags: ["a", "b"] };
const jsonString = JSON.stringify(data);
// pg handles quoting internally — no manual escaping needed:
await pool.query(
"INSERT INTO logs (data) VALUES ($1::jsonb)",
[jsonString] // pg driver escapes the string for SQL
);
// For MySQL (mysql2):
await connection.execute(
"INSERT INTO logs (data) VALUES (?)",
[jsonString] // mysql2 driver escapes automatically
);
// ── Shell: single-quote wrapping is safest ───────────────────────
// JSON often contains double quotes — use single-quote wrapping in shell.
// Single-quoted strings in bash do not interpret any escape sequences.
// Problem: if JSON contains a single quote, it must be handled.
const shellData = JSON.stringify({ name: "Alice", age: 30 });
// '{"name":"Alice","age":30}'
// In bash — wrap in single quotes:
// echo '{"name":"Alice","age":30}' ← safe, no substitution
// If JSON might contain single quotes, escape them:
function escapeForShell(jsonStr) {
// Replace ' with '"'"' (end single-quote, double-quote the single-quote, reopen)
return "'" + jsonStr.replace(/'/g, "'\"'\"'") + "'";
}
// Or use heredoc in bash:
// curl -X POST https://api.example.com -d @- <<'EOF'
// {"message":"It's a test"}
// EOF
// ── Environment variables: JSON in .env files ─────────────────────
// Quoting rules depend on what reads the file (bash `source`, dotenv, ...).
// Single-quote wrapping is the most portable choice for JSON.
// .env file format:
// MY_CONFIG={"host":"localhost","port":5432} ← unquoted: read as-is by dotenv, but bash strips the quotes
// MY_CONFIG='{"host":"localhost","port":5432}' ← single-quote wrap (bash and dotenv)
// MY_CONFIG="{\"host\":\"localhost\",\"port\":5432}" ← double-quote wrap with \ (bash; other loaders may keep the backslashes)
// Reading in Node.js (dotenv):
// process.env.MY_CONFIG → '{"host":"localhost","port":5432}'
// JSON.parse(process.env.MY_CONFIG) → { host: "localhost", port: 5432 }
// Best practice: avoid JSON in env vars — use separate flat vars instead:
// DB_HOST=localhost
// DB_PORT=5432
// ── PostgreSQL JSONB: casting and escaping ────────────────────────
// Store JSON as JSONB (binary JSON) for indexing and query support:
// INSERT INTO settings (config) VALUES ('{"theme":"dark"}'::jsonb);
// Query JSON fields:
// SELECT config->>'theme' FROM settings WHERE config->>'theme' = 'dark';
// Node.js with pg — the driver handles escaping:
const config = { theme: "dark", fontSize: 14 };
await pool.query(
"UPDATE settings SET config = $1::jsonb WHERE id = $2",
[JSON.stringify(config), userId]
);
// ── Python: JSON in SQL with psycopg2 ────────────────────────────
// import json, psycopg2
// from psycopg2.extras import Json
//
// conn = psycopg2.connect(...)
// cur = conn.cursor()
// data = {"name": "Alice", "scores": [90, 85]}
//
// # Use psycopg2.extras.Json adapter — handles escaping:
// cur.execute("INSERT INTO logs (data) VALUES (%s)", (Json(data),))
// conn.commit()
The cardinal rule for SQL: always use parameterized queries and never interpolate JSON strings into SQL using string formatting. Database drivers handle all necessary escaping when values are passed as parameters. For shell contexts, single-quote wrapping is the simplest safe approach, since JSON always contains double quotes but only occasionally single quotes; when a single quote does occur, escape it as escapeForShell() does above. When JSON must go into a .env file, prefer flattening it into separate key-value pairs (see our guide on JSON.stringify()) rather than embedding raw JSON, which requires fragile quote escaping.
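If JSON does end up in an environment variable, parsing it defensively makes quoting mistakes visible at startup. The sketch below is one possible shape; `readJsonEnv` and the `fakeEnv` object are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: read a JSON-valued environment variable.
// `env` defaults to process.env but can be replaced, for example in tests.
function readJsonEnv(name, fallback, env = process.env) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  try {
    return JSON.parse(raw);
  } catch (err) {
    throw new Error(`Environment variable ${name} is not valid JSON: ${err.message}`);
  }
}

const fakeEnv = {
  MY_CONFIG: '{"host":"localhost","port":5432}',
  BROKEN: '{\\"host\\":1}', // backslashes left in by a loader that did not unescape them
};

console.log(readJsonEnv("MY_CONFIG", {}, fakeEnv).port);        // 5432
console.log(readJsonEnv("MISSING", { port: 0 }, fakeEnv).port); // 0
```

Reading `BROKEN` throws with the variable name in the message, which points straight at the quoting layer that went wrong.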
Common JSON String Escaping Errors and How to Fix Them
Most JSON escaping errors fall into five categories: double-escaping, raw control characters, unescaped backslashes, HTML/JSON escaping confusion, and lone surrogates. Each produces a distinct symptom that points directly to the root cause. Knowing the pattern makes debugging faster than trial-and-error string manipulation.
// ── Error 1: Double-escaping ─────────────────────────────────────
// Symptom: \n appears literally as two characters in parsed output
// Cause: manually escaped string passed to JSON.stringify()
const wrongWay = JSON.stringify("line1\\nline2");
// "line1\\nline2" is a 12-char string with literal backslash+n
// JSON.stringify() escapes the backslash: "line1\\\\nline2"
// When parsed: "line1\\nline2", still literal backslash-n!
// Fix: pass the string with the REAL character:
const rightWay = JSON.stringify("line1\nline2");
// "line1\nline2" is an 11-char string with real newline (U+000A)
// JSON.stringify() escapes it correctly: "line1\\nline2"
// When parsed: "line1\nline2", real newline restored ✓
// Detect double-escaping:
function isDoubleEscaped(str) {
return /\\\\[nrtbf"\\/]/.test(str);
}
// ── Error 2: Raw control character in JSON string ─────────────────
// Symptom: JSON.parse() throws "Bad control character in string literal" or "Unexpected token"
// Cause: raw U+000A (newline) or U+0009 (tab) inside JSON string literal
// This JSON is INVALID:
// {"text": "line1
// line2"}
// Fix: regenerate the JSON with JSON.stringify(). If you can only patch the text,
// apply this to the contents of ONE string value, not to a whole document:
// tab, LF, and CR between tokens are legal whitespace and must stay raw.
function fixRawControlChars(stringContents) {
  const escapes = { "\n": "\\n", "\r": "\\r", "\t": "\\t",
                    "\b": "\\b", "\f": "\\f" };
  // Replace raw control chars with their escape sequences
  return stringContents.replace(/[\x00-\x1F]/g, (char) =>
    escapes[char] || `\\u${char.charCodeAt(0).toString(16).padStart(4, "0")}`
  );
}
// ── Error 3: Unescaped backslash in file paths ────────────────────
// Symptom: "Bad escaped character" or "Unexpected token U" parse error on Windows paths
// Cause: Windows path with single backslash parsed as escape sequence
// {"path": "C:\Users\Alice"} ← \U is not a valid escape sequence
// Fix: double the backslashes
// {"path": "C:\\Users\\Alice"} ← correct
// In JavaScript:
const wrongPath = '{"path": "C:\\Users\\Alice"}'; // \U is invalid JSON
const fixedPath = '{"path": "C:\\\\Users\\\\Alice"}'; // \\ → \ in parsed string
JSON.parse(fixedPath); // { path: "C:\\Users\\Alice" }
// Better: use JSON.stringify() with the actual path string:
JSON.stringify({ path: "C:\\Users\\Alice" });
// '{"path":"C:\\\\Users\\\\Alice"}'
// ── Error 4: HTML entities in JSON ───────────────────────────────
// Symptom: &amp; &lt; &quot; appear literally in parsed JSON values
// Cause: the JSON was HTML-entity-encoded before being placed in a <script>
// element. Script content is raw text, so the HTML parser does NOT decode
// entities there, and JSON.parse() never decodes them.
// HTML: <script id="data">{"msg": "a &amp; b"}</script>
// Bad: JSON.parse('{"msg": "a &amp; b"}') → { msg: "a &amp; b" }
// ← &amp; was NOT decoded; it is literal text in the JSON string
// The correct JSON should be: {"msg": "a & b"}
// HTML entities are decoded by the HTML parser (in attributes and normal
// text), not by the JSON parser.
// Solution: inside <script>, use \uXXXX escapes (e.g. \u0026) instead of
// entities, then read the text back unchanged:
// const raw = document.getElementById("data").textContent;
// const parsed = JSON.parse(raw); // textContent returns the script text as written
// ── Error 5: Lone surrogate in JSON ──────────────────────────────
// Symptom: garbled characters or WTF-8 encoding issues
// Cause: high surrogate without matching low surrogate (or vice versa)
// {"char": "\uD83D"} ← lone high surrogate — invalid Unicode
// Fix: always include both surrogates for supplementary characters
// {"char": "\uD83D\uDE00"} ← correct surrogate pair for 😀
// Detection with JavaScript:
function hasLoneSurrogate(str) {
return /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/.test(str);
}
// ── Quick diagnosis: use JSON.parse() to validate ─────────────────
function validateJson(str) {
try {
JSON.parse(str);
return { valid: true };
} catch (e) {
return { valid: false, error: e.message };
}
}
The single most effective debugging tool for JSON string escaping is JSON.parse() itself: it throws errors that, in most engines, report the position of the invalid character. For double-escaping bugs, add a temporary console.log(JSON.stringify(yourString)) before stringifying the full object. If you see \\n (three characters: backslash, backslash, n) in the output rather than \n (two characters), the string was already escaped. The JSON.parse() guide covers error handling and safe parsing patterns in depth.
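A closely related symptom is a whole document that was stringified twice, so that parsing it yields a string instead of an object. The sketch below parses repeatedly while the result is still a string; `parseUnwrapping` is a hypothetical helper and only a heuristic, since a legitimate string value such as "123" would also be unwrapped.

```javascript
// Hypothetical helper: parse, then keep parsing while the result is a string.
function parseUnwrapping(text, maxLayers = 3) {
  let value = JSON.parse(text);
  let layers = 1;
  while (typeof value === "string" && layers < maxLayers) {
    try {
      value = JSON.parse(value);
      layers += 1;
    } catch {
      break; // an ordinary string, not another JSON layer
    }
  }
  return { value, layers };
}

const once = JSON.stringify({ text: "line1\nline2" });
const twice = JSON.stringify(once); // encoded a second time by mistake

console.log(parseUnwrapping(once).layers);  // 1
console.log(parseUnwrapping(twice).layers); // 2
console.log(parseUnwrapping(twice).value.text === "line1\nline2"); // true
```

A `layers` value above 1 suggests that some producer upstream called JSON.stringify() on text that was already JSON; the durable fix is to remove the extra call rather than to unwrap on every read.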
Key Terms
- escape sequence
- A combination of two or more characters that represents a single character that cannot appear literally in that context. In JSON, escape sequences begin with a backslash: \" represents a double quote, \\ represents a backslash, and \n represents a newline (U+000A). The \uXXXX form is a six-character escape sequence representing any Unicode code point in the Basic Multilingual Plane. Escape sequences are interpreted by the JSON parser and converted to the character they represent, so the resulting string contains the actual character, not the escape syntax.
- Unicode code point
- A unique number assigned to each character in the Unicode standard, written as U+ followed by hexadecimal digits (e.g., U+0041 for 'A', U+00E9 for 'é', U+1F600 for 😀). There are over 1.1 million possible code points organized into 17 planes. The Basic Multilingual Plane (U+0000 to U+FFFF) covers most modern scripts. Code points above U+FFFF are supplementary characters. In JSON, code points U+0000 to U+001F (control characters) must be escaped using \uXXXX or a named sequence; all others except " and \ may appear as-is or as \uXXXX.
- surrogate pair
- A pair of 16-bit code units (a high surrogate in U+D800–U+DBFF and a low surrogate in U+DC00–U+DFFF) that together encode a single Unicode supplementary character (U+10000 and above) in UTF-16. JSON uses the \uXXXX notation for UTF-16 code units, so supplementary characters require two \uXXXX sequences. For example, 😀 (U+1F600) encodes as \uD83D\uDE00: the high surrogate \uD83D followed by the low surrogate \uDE00. JavaScript's internal string representation is also UTF-16, so str.length counts a surrogate pair as 2; use [...str].length for the code point count.
- BMP (Basic Multilingual Plane)
- The first of Unicode's 17 planes, covering code points U+0000 to U+FFFF (65,536 code points). The BMP contains most commonly used characters: Latin, Greek, Cyrillic, Arabic, Hebrew, Chinese, Japanese, Korean (CJK), and most symbols and punctuation. Characters in the BMP can be encoded with a single \uXXXX escape in JSON. Characters outside the BMP (U+10000 and above), including emoji, historic scripts, and mathematical alphabets, require surrogate pairs in JSON's \uXXXX notation. Direct UTF-8 embedding of any Unicode character is also valid in JSON regardless of plane.
- JSON.stringify()
- A built-in JavaScript function that converts a JavaScript value (object, array, string, number, boolean, or null) to a JSON string. For strings, it wraps the value in double quotes and applies all mandatory JSON escaping: " becomes \", \ becomes \\, and control characters become their named escape sequences or \uXXXX. It does not escape /, <, >, or & by default. Accepts two optional arguments: a replacer (function or array to filter/transform values) and a space argument (number or string for pretty-printing). Returns undefined for undefined values, functions, and symbols, not a JSON string containing "undefined".
- XSS (Cross-Site Scripting)
- A security vulnerability where an attacker injects malicious JavaScript into a web page viewed by other users. In the context of JSON escaping, XSS can occur when JSON data containing </script>, <, or > is embedded directly in HTML without escaping those characters: the browser's HTML parser interprets them as markup, allowing script injection. Prevention: use \u003C, \u003E, \u0026 instead of the literal characters when embedding JSON in HTML, or use a dedicated Content-Security-Policy header to restrict inline script execution.
- control character
- Unicode characters in the range U+0000 to U+001F (C0 controls) and U+007F (DEL). These are non-printing characters originally defined for terminal control: null (U+0000), bell (U+0007), backspace (U+0008), tab (U+0009), line feed (U+000A), form feed (U+000C), carriage return (U+000D), escape (U+001B), and others. JSON prohibits raw C0 control characters (U+0000 to U+001F) inside string literals; they must be represented using named escape sequences (\b, \f, \n, \r, \t) or the \uXXXX form. U+007F may appear unescaped. JSON.stringify() automatically converts all C0 control characters to their escaped forms.
FAQ
What characters must be escaped in a JSON string?
JSON defines exactly 8 short escape sequences: \" (double quote, U+0022, mandatory because it delimits the string), \\ (backslash, U+005C, mandatory because it introduces escapes), \/ (forward slash, U+002F, optional, sometimes used to avoid </script> issues), \b (backspace, U+0008), \f (form feed, U+000C), \n (line feed / newline, U+000A), \r (carriage return, U+000D), and \t (horizontal tab, U+0009). All other Unicode characters U+0020 and above, including <, >, &, Chinese, Arabic, and emoji, may appear as-is in a JSON string. Any character can also be written as \uXXXX (or as a surrogate pair of two \uXXXX sequences above U+FFFF). Control characters U+0000–U+001F that do not have a short sequence (like U+0000, null) must use the \uXXXX form (e.g., \u0000). JSON.stringify() handles all of this automatically.
How do I include a newline in a JSON string value?
Use the escape sequence \n — a backslash followed by the letter n — inside the JSON string. The raw U+000A line feed character is prohibited inside JSON string delimiters and will cause a parse error. Example of valid JSON: {"message": "Line one\nLine two"}. When JSON.parse() processes this, the \n sequence becomes a real newline (U+000A) in the resulting JavaScript string. In JavaScript, JSON.stringify() automatically converts a string containing a real newline to the \n form — you do not need to do it manually. For Windows-style CRLF line endings, use \r\n: {"message": "Line one\r\nLine two"}. The most common mistake is hand-editing JSON files and accidentally pressing Enter inside a string value, inserting a raw newline — use a JSON-aware editor that highlights this as an error, or run the file through a JSON validator before use.
How do I escape a double quote inside a JSON string?
Use \" (a backslash immediately before the double quote). A raw double quote inside a JSON string would end the string prematurely, making the rest invalid JSON. Example: {"title": "He said \"hello\""}. When parsed, the value is He said "hello". In JavaScript, JSON.stringify() handles this automatically: JSON.stringify('He said "hello"') produces '"He said \"hello\""'. You never need to manually pre-escape double quotes before calling JSON.stringify(). If you do pre-escape and then stringify, you get double-escaping: the backslash you added is itself escaped, producing \\\" in the JSON, which parses back to \" (a literal backslash followed by a quote) rather than just a quote. When a parse error mentions "Unexpected token", check for an unescaped double quote inside a string value first; it is one of the most common hand-editing mistakes.
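The double-escaping trap is easier to see side by side. This sketch compares stringifying the original string with stringifying a manually pre-escaped copy; the variable names are illustrative.

```javascript
const spoken = 'He said "hello"';

// Correct: let stringify() add the backslashes.
const quotedOnce = JSON.stringify(spoken);
console.log(quotedOnce);

// Mistake: escape by hand first, then stringify.
const preEscaped = spoken.replace(/"/g, '\\"'); // now contains real backslashes
const quotedTwice = JSON.stringify(preEscaped);
console.log(quotedTwice);

// The first version round-trips; the second keeps the stray backslashes.
console.log(JSON.parse(quotedOnce) === spoken);
console.log(JSON.parse(quotedTwice) === spoken);
```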
How do I include Unicode characters in a JSON string?
Two approaches: (1) Embed the character directly. JSON files are UTF-8 encoded and any Unicode character above U+001F (except " and \) is valid as-is, so {"greeting": "こんにちは"} is perfectly valid JSON. (2) Use \uXXXX notation. Encode any Basic Multilingual Plane character (U+0000 to U+FFFF) as a backslash followed by u and exactly 4 hex digits: \u3053 for こ, \u00E9 for é, \u20AC for €. For supplementary characters (emoji, historic scripts) above U+FFFF, use surrogate pairs: two \uXXXX sequences. The 😀 emoji (U+1F600) encodes as \uD83D\uDE00. Direct embedding (option 1) is simpler and produces smaller JSON. The \uXXXX form (option 2) is useful when the JSON file must remain pure ASCII, for example when the transmission channel may corrupt non-ASCII bytes. JSON.stringify() uses direct embedding by default. A replacer function cannot produce \uXXXX output, because any backslash it returns is escaped again; post-process the stringify() result instead.
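A minimal sketch of ASCII-only output, assuming post-processing of the JSON.stringify() result is acceptable. The helper name `stringifyAscii` is hypothetical, not a built-in. The regex has no `u` flag, so it matches individual UTF-16 code units and an emoji naturally becomes a surrogate pair of two escapes.

```javascript
// Hypothetical helper: serialize, then rewrite every non-ASCII code unit
// (and DEL) as a \uXXXX escape. Non-ASCII characters can only occur inside
// string tokens of the output, so the rewrite keeps the JSON valid.
function stringifyAscii(value) {
  return JSON.stringify(value).replace(/[\u007f-\uffff]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const sample = { greeting: "こんにちは", price: "€5", face: "😀" };
const asciiJson = stringifyAscii(sample);

console.log(asciiJson);
// Parsing the ASCII-only text yields the same data as the original.
console.log(JSON.stringify(JSON.parse(asciiJson)) === JSON.stringify(sample));
```

The hex digits come out lowercase here; JSON parsers accept either case.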
How do JSON.stringify() and JSON.parse() handle string escaping?
JSON.stringify() converts a JavaScript value to a JSON string by wrapping string values in double quotes and escaping all characters that must be escaped: double quotes become \", backslashes become \\, newlines become \n, tabs become \t, carriage returns become \r, backspaces become \b, form feeds become \f, and other control characters (U+0000–U+001F) become \uXXXX. It does not escape /, <, >, or &. JSON.parse() performs the exact reverse: it reads the JSON string, interprets all escape sequences, and returns the original JavaScript value with the real characters. For JSON-serializable values the pair is an exact inverse: JSON.parse(JSON.stringify(value)) returns a deep copy of value. Never manually escape strings before calling JSON.stringify(): doing so produces double-escaping that corrupts the data when the result is later parsed.
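The round trip can be sketched with an object that exercises several escapes at once. The field names are made up for illustration; the Date field shows what the "JSON-serializable" caveat means in practice.

```javascript
const tricky = {
  path: "C:\\Users\\Alice", // real backslashes
  quote: 'say "hi"',
  lines: "a\nb\tc",
  ctrl: "\u0001", // control character without a named sequence
  html: "<b>&</b>/", // none of these are escaped by stringify()
};

const wire = JSON.stringify(tricky);
const restored = JSON.parse(wire);

console.log(wire);
console.log(restored.path === tricky.path && restored.lines === tricky.lines);

// Values that are not JSON-serializable do not survive unchanged:
// a Date is written as an ISO string and comes back as a string.
const withDate = JSON.parse(JSON.stringify({ when: new Date(0) }));
console.log(typeof withDate.when);
```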
What is the difference between JSON escaping and HTML entity encoding?
JSON escaping uses backslash sequences to make characters safe inside JSON string literals; the JSON parser converts them back to the original characters. HTML entity encoding uses ampersand sequences to make characters safe inside HTML markup; the browser renders them as their visual equivalent. They are entirely separate systems that operate in different contexts. A double quote in JSON is \"; in HTML attributes it is &quot;. An ampersand in JSON is just & (no escaping needed); in HTML it must be &amp;. A newline in JSON is \n; in HTML a newline in text content renders as whitespace and in attributes as a real newline. These systems must not be mixed: running JSON.parse() on HTML-entity-encoded text treats &amp; as a literal 5-character string, not as &. When embedding JSON in HTML, apply JSON escaping first (via JSON.stringify()), then optionally apply HTML-context safety by replacing <, >, & with their \uXXXX equivalents. This keeps the result valid JSON while removing HTML-dangerous characters.
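To make the separation concrete, the sketch below shows JSON.parse() leaving an HTML entity untouched, and a hypothetical `escapeHtml` helper (not a built-in) applied as a second, separate layer on top of JSON.stringify() output.

```javascript
// JSON.parse() knows nothing about HTML entities: &amp; stays five characters.
const entityJson = '{"company": "Tom &amp; Jerry"}';
const company = JSON.parse(entityJson).company;
console.log(company.length); // counts &amp; as 5 characters, not 1

// Hypothetical minimal HTML escaper for attribute or text contexts.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Layer 1: JSON escaping. Layer 2: HTML escaping of the JSON text.
const jsonText = JSON.stringify({ q: 'a "b" & c' });
const attributeValue = escapeHtml(jsonText);
console.log(jsonText);
console.log(attributeValue);
```

A browser reverses the HTML layer when it parses the attribute, so code that reads the attribute sees the original JSON text again.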
How do I safely embed JSON in an HTML page?
The safest approach uses a <script type="application/ld+json"> or <script type="application/json"> tag, because the browser does not execute the content as JavaScript. However, the HTML parser still terminates the script element at the first </script> sequence, so escape forward slashes in closing tags: replace </script> with <\/script> in the JSON output (or use \u003C\/script\u003E). For JSON embedded in a regular <script> tag (variable assignment), additionally escape <, >, and & as their \uXXXX equivalents to prevent XSS: JSON.stringify(data).replace(/</g, '\\u003C').replace(/>/g, '\\u003E').replace(/&/g, '\\u0026'). Never paste JSON unescaped into HTML attributes (onclick, data-*): the HTML parser decodes the attribute value before JavaScript sees it, so raw quotes and entity-like sequences corrupt the JSON syntax. Read JSON from data attributes via element.dataset.config and parse it with JSON.parse(), which handles the decoded string correctly.
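The replace chain above can be wrapped in a small helper. This is a sketch under the assumption that the page is generated server-side with template strings; `jsonForScriptTag` and the `config` id are hypothetical names.

```javascript
// Hypothetical helper: JSON that is safe to place inside a <script> element.
function jsonForScriptTag(data) {
  return JSON.stringify(data)
    .replace(/</g, "\\u003C")
    .replace(/>/g, "\\u003E")
    .replace(/&/g, "\\u0026");
}

const payload = {
  html: "</script><script>alert(1)</script>",
  note: "a & b",
};

const safeJson = jsonForScriptTag(payload);
const scriptTag =
  '<script type="application/json" id="config">' + safeJson + "</script>";

console.log(scriptTag);
// The escapes are ordinary JSON, so parsing restores the original values.
console.log(JSON.parse(safeJson).html === payload.html);
```

On the client, the data could then be read with something like JSON.parse(document.getElementById("config").textContent).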
Why does my JSON contain \n instead of a real newline?
This symptom (seeing the two literal characters \ and n in your output rather than an actual newline) indicates double-escaping: the string that was passed to JSON.stringify() already contained the two-character sequence backslash + n, not a real newline. JSON.stringify() then escaped the backslash to \\, producing \\n in the JSON, which parses back to the literal characters \n instead of a line break. Three common root causes: (1) the string came from a previous JSON.stringify() call and was stringified again without parsing first; fix by calling JSON.parse() first to restore the original string. (2) The string came from a template or user input where \n was typed as two literal characters rather than as a newline; fix with str.replace(/\\n/g, '\n'), a regex that matches a literal backslash-n and replaces it with a real newline. (3) The string came from a language (Python, PHP, etc.) where the value was already JSON-encoded before being passed to JavaScript; fix by not double-encoding.
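Causes (1) and (2) can be reproduced in a few lines. The sketch below stringifies a value twice to show the extra backslash, then repairs a string in which \n was typed as two characters.

```javascript
const twoLines = "Line one\nLine two"; // contains a real newline

// Cause 1: stringifying an already-stringified value.
const encodedOnce = JSON.stringify(twoLines);
const encodedTwice = JSON.stringify(encodedOnce);
console.log(encodedOnce);
console.log(encodedTwice);

// One parse only gets back to the JSON text; a second parse restores the value.
console.log(JSON.parse(encodedTwice) === encodedOnce);
console.log(JSON.parse(JSON.parse(encodedTwice)) === twoLines);

// Cause 2: backslash + n typed as two literal characters.
const typedLiteral = "Line one\\nLine two";
const repaired = typedLiteral.replace(/\\n/g, "\n");
console.log(repaired === twoLines);
```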
Further reading and primary sources
- RFC 8259: The JSON Data Interchange Syntax — The official IETF standard defining JSON syntax, including all string escape sequences (Section 7)
- MDN: JSON.stringify() — MDN reference for JSON.stringify() including replacer, space, and toJSON options with escaping details
- MDN: JSON.parse() — MDN reference for JSON.parse() including reviver function and error handling
- Unicode Surrogate Pairs Explained — Unicode Consortium FAQ on UTF-16, surrogate pairs, and encoding supplementary characters
- OWASP: DOM-based XSS Prevention — OWASP cheat sheet covering safe JSON embedding in HTML pages and XSS prevention strategies