Parse JSON in Perl: JSON, JSON::PP, Cpanel::JSON::XS, and JSON::Parse
Perl has an unusually large number of JSON modules, a quirk of CPAN's long history. The shortlist that actually matters in 2026 is four names: Cpanel::JSON::XS (the recommended XS implementation, an actively maintained fork of JSON::XS), JSON::PP (pure-Perl, in core since 5.14), JSON::XS (the original C implementation, still common in legacy code), and JSON (a wrapper module that loads JSON::XS when it is installed and falls back to JSON::PP otherwise). The procedural API, decode_json and encode_json, is identical across all four, so most code is portable. The differences show up around Unicode handling, booleans, pretty-printing options, and raw performance, where XS implementations run roughly 50 to 100 times faster than the pure-Perl fallback. This guide covers the working patterns: which module to pick, how to round-trip UTF-8 cleanly, how to represent booleans, how to pretty-print and stream, and where the per-implementation differences bite.
Got a JSON payload that decode_json is rejecting with a vague error? Paste it into Jsonic's JSON Validator — it pinpoints the exact line, column, and reason (trailing comma, single quotes, unescaped control character) so you can fix the source.
The Perl JSON module landscape: JSON, JSON::XS, Cpanel::JSON::XS, JSON::PP
CPAN has accumulated maybe twenty JSON modules over the years. The four that matter for new code in 2026 are easy to keep straight once you know the history.
- JSON::XS: Marc Lehmann's original C implementation, released 2007. Fast and widely deployed, but development has been slow for years. Bugs around tied hashes, threading, and some Unicode edge cases remain unfixed in the upstream version.
- Cpanel::JSON::XS: a maintained fork by the cPanel team. Drop-in compatible: the API and option methods match, so switching is a one-line change. Adds thread safety, extra decoder options (for example allow_dupkeys and allow_singlequote), and ongoing bug fixes. The recommended pick for new code.
- JSON::PP: pure-Perl implementation, shipped in core Perl since 5.14 (2011). No installation required, but 50 to 100 times slower than the XS modules. Useful when you cannot install C extensions: restricted shared hosting, embedded environments, or test fixtures that need to stay dependency-free.
- JSON: a wrapper that picks a backend at runtime. By default it loads JSON::XS if it is installed and JSON::PP otherwise; Cpanel::JSON::XS is used only when you request it explicitly (through the PERL_JSON_BACKEND environment variable). If you want the preference order Cpanel::JSON::XS, JSON::XS, JSON::PP, that is what the separate JSON::MaybeXS module provides. A wrapper is useful when your code might run on machines with or without an XS module installed; less useful for new projects where you can pick the backend directly.
Outside this shortlist, JSON::Parse offers a separate decode-only API with slightly different error reporting, and JSON::SL provides a streaming parser for huge documents (see the streaming section below). Mojo::JSON ships with Mojolicious and is fine if you are already in that ecosystem.
The decision tree: install Cpanel::JSON::XS if you can. Fall back to JSON::PP if you cannot. Use a wrapper (JSON, or JSON::MaybeXS if you want Cpanel::JSON::XS preferred when present) if you need the same code to work in both environments.
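The fallback in that decision tree can also be written by hand. The following is a minimal sketch, not something either module ships: pick_json_backend is a hypothetical helper name, and the only assumption is that both backends expose the same OO methods used here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name of the first JSON backend that loads.
sub pick_json_backend {
    for my $module (qw(Cpanel::JSON::XS JSON::PP)) {
        (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
        return $module if eval { require $file; 1 };
    }
    die "no JSON backend available\n";
}

my $backend = pick_json_backend();
my $coder   = $backend->new->utf8->canonical;

# Both backends share the OO API, so the calling code does not change.
my $data = $coder->decode('{"name":"Ada","age":36}');
print "$backend decoded name: $data->{name}\n";
print $coder->encode($data), "\n";
```

On a machine without Cpanel::JSON::XS the loop quietly settles on the core JSON::PP, which is the behavior the decision tree describes.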
Basic decode_json and encode_json
The procedural API is two functions and covers most code. decode_json takes a UTF-8 byte string and returns a Perl data structure; encode_json takes a Perl data structure and returns a UTF-8 byte string. JSON objects become hashrefs, JSON arrays become arrayrefs, primitives become Perl scalars.
use strict;
use warnings;
use Cpanel::JSON::XS qw(decode_json encode_json);
# Decode a JSON string into Perl data
my $json = '{"name":"Ada","age":36,"skills":["math","engines"]}';
my $data = decode_json($json);
print $data->{name}; # Ada
print $data->{age}; # 36
print $data->{skills}[0]; # math
# Encode Perl data back to JSON
my $out = encode_json({
    user   => 'Ada',
    roles  => ['admin', 'editor'],
    active => \1, # boolean true (see booleans section)
});
# {"user":"Ada","roles":["admin","editor"],"active":true}
print $out;
Reading from a file follows the same pattern with a file slurp. The :raw layer is important: decode_json expects UTF-8 bytes, not pre-decoded characters.
use Cpanel::JSON::XS qw(decode_json);
sub read_json {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "open $path: $!";
    local $/; # slurp mode
    my $bytes = <$fh>;
    close $fh;
    return decode_json($bytes);
}
my $config = read_json('config.json');
print $config->{database}{host};
Parse errors throw exceptions, so wrap calls that touch untrusted input in eval or Try::Tiny. See the error handling section below for the production-grade pattern.
Object-oriented JSON->new->decode patterns
When you need options — pretty-printing, sorted keys, relaxed parsing, custom limits — switch from the procedural API to the OO API. Every option is a chainable method on the encoder/decoder object.
use Cpanel::JSON::XS;
# Build a reusable encoder with options
my $json = Cpanel::JSON::XS->new
    ->utf8          # emit UTF-8 bytes (vs decoded characters)
    ->pretty        # shorthand for indent + space_before + space_after
    ->canonical     # sort hash keys for deterministic output
    ->allow_nonref; # allow encoding/decoding non-reference scalars
my $out = $json->encode({ name => 'Ada', age => 36 });
# Pretty, sorted output:
# {
#    "age" : 36,
#    "name" : "Ada"
# }
# Same object can decode too
my $data = $json->decode($out);
The most-used option methods:
- utf8(1): input/output is UTF-8 bytes (default for the procedural API)
- utf8(0): input/output is already-decoded Perl characters
- pretty: shorthand for indent + space_before + space_after
- canonical(1): sort hash keys; required for deterministic diffs and tests
- allow_nonref(1): let decode accept top-level primitives, not just objects/arrays
- relaxed(1): accept comments and trailing commas (non-standard, but useful for config files)
- max_depth(N): reject deeply nested input as a DoS guard
- max_size(N): reject oversized input
- allow_blessed(1) / convert_blessed(1): control how blessed objects encode (default: throw)
Build the encoder once and reuse it. Each method returns the encoder object, so chains compose cleanly. For a typical config-loading helper you might keep one relaxed decoder for human-edited files and one strict decoder for API payloads.
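As a rough illustration of the two-decoder idea, the sketch below keeps a relaxed decoder for config text and a strict one for API payloads. It uses core JSON::PP so it runs without installing anything; the same option methods exist on Cpanel::JSON::XS. The depth and size limits are arbitrary example values, not recommendations.

```perl
use strict;
use warnings;
use JSON::PP;

# Lenient decoder for human-edited config files.
my $config_decoder = JSON::PP->new->utf8->relaxed;

# Strict decoder for API payloads; the limits are illustrative assumptions.
my $api_decoder = JSON::PP->new->utf8->max_depth(32)->max_size(1_000_000);

my $config_text = <<'JSON';
{
  # shell-style comment, accepted only in relaxed mode
  "host": "localhost",
  "ports": [8080, 8081,],
}
JSON

my $config = $config_decoder->decode($config_text);
print "ports: @{ $config->{ports} }\n";

# The strict decoder rejects the same text (comment and trailing commas).
my $strict_ok = eval { $api_decoder->decode($config_text); 1 };
print "strict decoder rejected the config\n" unless $strict_ok;
```

Keeping the two objects separate means a lenient setting never leaks into the code path that handles untrusted input.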
Unicode handling: utf8, decode flag, encoding traps
The single biggest source of Perl JSON bugs is mixing up encoded bytes and decoded characters. The rules are simple once you write them down:
- decode_json (procedural) expects UTF-8 bytes and returns decoded characters with the UTF-8 flag on
- encode_json (procedural) expects decoded characters and returns UTF-8 bytes
- The OO API with utf8(1) matches the procedural behavior; utf8(0) works in characters on both sides
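The trap is reading a file with the wrong PerlIO layer. The :raw layer (or no layer at all) gives you bytes, which is the right thing to pass to decode_json. The :encoding(UTF-8) layer decodes the bytes to characters at read time, so passing the result to decode_json attempts a second decode: non-ASCII text then typically fails with a "Wide character" or "malformed UTF-8" error, and in the cases that do not die the text comes back garbled.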
use Cpanel::JSON::XS qw(decode_json encode_json);
# RIGHT: bytes in, decoded chars out
open my $fh, '<:raw', 'data.json' or die $!;
my $bytes = do { local $/; <$fh> };
my $data = decode_json($bytes); # $data->{title} is decoded chars
# RIGHT: chars in, bytes out
open my $out, '>:raw', 'out.json' or die $!;
print {$out} encode_json($data); # writes UTF-8 bytes
# WRONG: the layer already decoded the bytes, so decode_json sees characters
open my $fh2, '<:encoding(UTF-8)', 'data.json' or die $!;
my $chars = do { local $/; <$fh2> };
my $bad = eval { decode_json($chars) }; # usually dies on non-ASCII input; eval keeps this demo running
# RIGHT if you've already decoded yourself: use ->utf8(0)
my $coder = Cpanel::JSON::XS->new->utf8(0);
my $good = $coder->decode($chars); # works on chars
Pick one layer convention per project and stay with it. The :raw + procedural-API combination is the simplest and matches what almost every CPAN example shows. If you need to integrate with code that hands you decoded character strings, switch to the OO API with utf8(0) rather than re-encoding to bytes just to feed the procedural function.
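The byte/character distinction is easier to see without files involved. The sketch below is a small self-contained check, using core JSON::PP (the Cpanel::JSON::XS calls are the same); the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json encode_json);

# "café" as UTF-8 bytes: the é is the two bytes 0xC3 0xA9.
my $bytes = qq({"title":"caf\xC3\xA9"});

# Bytes in, characters out: the decoded title is 4 characters long.
my $data = decode_json($bytes);
print "decoded length: ", length($data->{title}), "\n";

# Characters in, bytes out: the result matches the original byte string.
my $out = encode_json($data);
print "round trip ", ($out eq $bytes ? "matches" : "differs"), "\n";

# Character-mode coder: takes an already-decoded string (é as U+00E9).
my $chars_coder = JSON::PP->new->utf8(0);
my $from_chars  = $chars_coder->decode(qq({"title":"caf\x{E9}"}));
print "same title: ", ($from_chars->{title} eq $data->{title} ? "yes" : "no"), "\n";
```

Both routes end at the same four-character Perl string; what differs is only which side of the call holds bytes.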
Booleans: JSON::true, JSON::false, JSON::null
Perl has no native boolean type. The Perl JSON modules covered here solve this the same way: scalar references (\1 for true, \0 for false) encode as literal true and false. Decoding goes the other way: JSON booleans become blessed objects (class JSON::PP::Boolean in current releases of both Cpanel::JSON::XS and JSON::PP) that behave correctly in boolean context.
use Cpanel::JSON::XS qw(decode_json encode_json);
# Encoding: use \1 and \0
my $out = encode_json({
active => \1, # true
archived => \0, # false
deleted => undef, # null
});
# {"active":true,"archived":false,"deleted":null}
# Sentinel functions also work
use Cpanel::JSON::XS;
my $sentinels = encode_json({
a => Cpanel::JSON::XS::true,
b => Cpanel::JSON::XS::false,
});
# Decoding: booleans become blessed objects, true/false in boolean context
my $data = decode_json('{"active":true,"count":0,"name":""}');
if ($data->{active}) { # true (booleanish)
    print "active\n";
}
print ref $data->{active}; # JSON::PP::Boolean in current releases
# Trap: a plain 1 or 0 encodes as a NUMBER, not a boolean
my $bad = encode_json({ active => 1 });
# {"active":1} # not {"active":true}
# Safe pattern: coerce explicitly
sub jbool { $_[0] ? \1 : \0 }
my $flag = 1; # stand-in for whatever condition you are reporting
my $good = encode_json({ active => jbool($flag) });
The most common source of stray 1/0 in JSON output is forgetting to wrap. A small jbool helper at the top of any module that builds API payloads removes most accidents.
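For input validation: is_bool tells you whether a value came in as a JSON boolean, distinct from a numeric 1 or 0. Call it under the namespace of the module you loaded: JSON::is_bool($value) with the wrapper, Cpanel::JSON::XS::is_bool($value) or JSON::PP::is_bool($value) with a backend. Useful when you need to enforce strict types on incoming payloads.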
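A short sketch of that check, using the core JSON::PP spelling of is_bool so it runs as-is; the payload is invented for the example.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

my $payload = decode_json('{"active":true,"count":1}');

# is_bool is true only for values that arrived as JSON true/false.
for my $key (sort keys %$payload) {
    my $kind = JSON::PP::is_bool($payload->{$key}) ? 'boolean' : 'not a boolean';
    print "$key: $kind\n";
}
```

Both values are truthy in Perl, so a plain `if ($payload->{count})` cannot tell them apart; is_bool can.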
Pretty-print with pretty(1), canonical(1) for stable ordering
The procedural encode_json always emits a single-line compact form. For indented output, switch to the OO API and set pretty. For sorted keys (deterministic output across runs, machines, and Perl versions), add canonical.
use Cpanel::JSON::XS;
my $data = {
    name   => 'Ada Lovelace',
    born   => 1815,
    works  => ['Notes on the Analytical Engine'],
    active => \0,
};
# Pretty + sorted
my $pretty = Cpanel::JSON::XS->new->utf8->pretty->canonical->encode($data);
# {
#    "active" : false,
#    "born" : 1815,
#    "name" : "Ada Lovelace",
#    "works" : [
#       "Notes on the Analytical Engine"
#    ]
# }
print $pretty;
# Manual control over the pretty options
my $custom = Cpanel::JSON::XS->new
    ->utf8
    ->indent(1)         # add newlines + indentation
    ->space_before(0)   # no space before colon: "name":"Ada"
    ->space_after(1)    # one space after colon: "name": "Ada"
    ->indent_length(2)  # 2-space indent (default is 3 for ->pretty)
    ->canonical
    ->encode($data);
canonical(1) is what you want for any output that goes into version control, content-addressed storage, or test snapshots. Without it, hash iteration order varies between Perl versions and even between runs of the same script (hash randomization has been on by default since Perl 5.18). With it, the same input always produces byte-identical output.
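The byte-identical claim is easy to check. This is a minimal sketch with core JSON::PP (canonical behaves the same way in the XS modules); the two hashes hold the same data written in different key orders.

```perl
use strict;
use warnings;
use JSON::PP;

my $coder = JSON::PP->new->utf8->canonical;

# Same data, keys written in different orders.
my $first  = $coder->encode({ name => 'Ada', born => 1815, active => \0 });
my $second = $coder->encode({ active => \0, born => 1815, name => 'Ada' });

print $first, "\n";
print $first eq $second ? "identical\n" : "different\n";
```

A snapshot test can therefore compare the encoded strings directly instead of decoding and deep-comparing.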
To reformat existing JSON without changing the data, decode then re-encode through a pretty + canonical encoder. This is a useful one-liner for normalizing JSON files in a repo:
# Reformat in place (-i) after slurping the whole file (-0777)
perl -MCpanel::JSON::XS -0777 -i -pe '
  BEGIN { our $j = Cpanel::JSON::XS->new->utf8->pretty->canonical }
  $_ = $j->encode($j->decode($_));
' config.json
Streaming JSON: JSON::SL incremental parsing
decode_json loads the entire JSON value into memory. For a configuration file, that is fine. For a gigabyte-scale export, it is not. Two patterns handle large documents:
JSON Lines (NDJSON) — one JSON value per line. Read line by line and decode each line; memory stays bounded regardless of file size. This is the right format for logs, event streams, and bulk data exports — see JSON Lines format for the spec and tooling.
use Cpanel::JSON::XS qw(decode_json encode_json);
# Read a .ndjson file, process each record
open my $fh, '<:raw', 'events.ndjson' or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next unless length $line;
    my $event = decode_json($line);
    process($event); # process() is a placeholder for your own handler
}
close $fh;
# Write a .ndjson file (a different file, so the input is not truncated)
open my $out, '>:raw', 'events-out.ndjson' or die $!;
for my $event (@events) { # @events is a placeholder list of records
    print {$out} encode_json($event), "\n";
}
close $out;
JSON::SL is a streaming parser for the case where you are stuck with a single huge JSON document (often a big array of records inside an outer wrapper). JSON::SL uses JSON Pointer paths to pick out values as the parser walks the document, emitting matches one at a time without holding the full tree in memory.
use JSON::SL;
my $sl = JSON::SL->new;
$sl->set_jsonpointer(['/records/^']); # ^ matches array elements
open my $fh, '<:raw', 'huge.json' or die $!;
while (read($fh, my $buf, 4096)) {
    $sl->feed($buf);
    while (my $obj = $sl->fetch) {
        process($obj->{Value}); # one record at a time
    }
}
For most workloads, JSON Lines wins on simplicity. Reach for JSON::SL only when the producer is outside your control and only emits big single-document JSON.
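A related case sits between the two: several JSON values arriving back to back in arbitrary chunks, for example over a socket. JSON::XS, Cpanel::JSON::XS, and JSON::PP all provide an incr_parse method for this. The sketch below uses core JSON::PP with simulated chunks; it is an aside for concatenated values, not a replacement for JSON::SL on one huge document.

```perl
use strict;
use warnings;
use JSON::PP;

my $parser = JSON::PP->new->utf8;

# Simulated network chunks; values are split at arbitrary byte boundaries.
my @chunks = ('{"id":1}{"i', 'd":2}', '{"id":3}');

my @ids;
for my $chunk (@chunks) {
    $parser->incr_parse($chunk);                  # void context: only buffers the text
    while (my $record = $parser->incr_parse) {    # scalar context: one complete value, or undef
        push @ids, $record->{id};
    }
}
print "ids: @ids\n";
```

Each complete value is handed back as soon as its closing brace arrives, so memory use tracks the largest single value rather than the whole stream.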
Performance: JSON::XS vs Cpanel::JSON::XS vs JSON::PP
The XS implementations parse and serialize in C, so they run roughly 50 to 100 times faster than the pure-Perl JSON::PP. For most code that difference is invisible — a config file decode is fast either way — but it matters for any code that processes JSON in a hot loop (log ingestion, API gateways, batch ETL).
use Benchmark qw(cmpthese);
use JSON::PP ();
use JSON::XS ();
use Cpanel::JSON::XS ();
my $sample = { id => 42, name => 'Ada', tags => [qw(a b c d e)], active => \1 };
my $pp = JSON::PP->new->utf8;
my $xs = JSON::XS->new->utf8;
my $cpx = Cpanel::JSON::XS->new->utf8;
cmpthese(-3, {
'PP encode' => sub { $pp->encode($sample) },
'XS encode' => sub { $xs->encode($sample) },
'CPX encode' => sub { $cpx->encode($sample) },
});
# Indicative results on a 2025-era laptop (rate, /s):
# PP encode ~250,000
# XS encode ~18,000,000
# CPX encode ~18,500,000
Numbers vary with payload shape: strings full of Unicode escapes hit JSON::PP harder than numeric data, and very large documents widen the gap. Treat the figures as order-of-magnitude illustrations and run the benchmark on your own hardware. The qualitative picture is stable: Cpanel::JSON::XS and JSON::XS are within a few percent of each other, and JSON::PP is one to two orders of magnitude slower.
Error handling pattern — production code that touches untrusted input needs to catch decode failures. decode_json throws via die, so eval or Try::Tiny both work:
use Try::Tiny;
use Cpanel::JSON::XS qw(decode_json);
sub safe_decode {
    my ($json) = @_;
    my $data;
    try {
        $data = decode_json($json);
    } catch {
        warn "JSON parse failed: $_";
        $data = undef;
    };
    return $data;
}
# Inside a request handler ($maybe_bad_input and error_response are placeholders):
my $result = safe_decode($maybe_bad_input);
return error_response('invalid JSON') unless defined $result;
The error text from the XS modules reports the character offset of the failure along with a snippet of the input that follows it, rather than a line and column. That is usually enough to locate the problem in a small payload; for a large document, a validator that reports line and column is quicker.
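The eval form mentioned above needs no extra module. The sketch below is one way to write it with core JSON::PP; try_decode is a hypothetical helper name, and returning the error as a second value is a design choice made here so that a failure can be told apart from a payload that legitimately decodes to undef.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: returns ($data, undef) on success, (undef, $error) on failure.
sub try_decode {
    my ($json) = @_;
    my $data;
    my $ok = eval { $data = decode_json($json); 1 };
    return (undef, $@ || 'unknown JSON error') unless $ok;
    return ($data, undef);
}

my ($good, $good_err) = try_decode('{"name":"Ada"}');
print "decoded: $good->{name}\n" unless defined $good_err;

my ($bad, $bad_err) = try_decode('{"name":"Ada",}');    # trailing comma
print "rejected: $bad_err" if defined $bad_err;
```

Testing the value of the eval block, rather than $@ alone, avoids the older pitfall where $@ is cleared before you read it.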
Key terms
- Cpanel::JSON::XS
- An actively maintained fork of JSON::XS by the cPanel team. API-compatible with JSON::XS, thread-safe, fixes a backlog of long-standing bugs, and adds extra encoder and decoder options. The recommended Perl JSON module for new code in 2026.
- JSON::PP
- A pure-Perl JSON implementation, shipped in core Perl since 5.14. No installation required and no C compiler needed, but 50 to 100 times slower than XS implementations. Useful as a fallback or for environments where installing XS modules is not an option.
- decode_json / encode_json
- The procedural API shared by JSON, JSON::XS, Cpanel::JSON::XS, and JSON::PP. decode_json takes UTF-8 bytes and returns Perl data; encode_json takes Perl data and returns UTF-8 bytes. Both always operate on UTF-8; use the OO API with utf8(0) when working in decoded characters.
- boolean reference
- Perl's convention for representing JSON booleans: \1 encodes as true, \0 encodes as false. JSON booleans decode back to blessed objects (typically JSON::PP::Boolean in current releases) that work correctly in boolean context.
- canonical mode
- An encoder option (canonical(1)) that sorts hash keys alphabetically in the output. Required for deterministic output across Perl versions, since hash iteration order has been randomized since Perl 5.18. Pair with pretty for human-readable diffs.
- JSON::SL
- A streaming JSON parser that emits values as they are matched by JSON Pointer paths, without loading the full document into memory. Useful for very large single-document JSON files; for line-based formats, JSON Lines with decode_json per line is simpler.
Frequently asked questions
Which Perl JSON module should I use in 2026?
For new code, use Cpanel::JSON::XS directly. It is the actively maintained fork of JSON::XS, fixes a long backlog of bugs the original never addressed (notably around tied hashes, threading, and Unicode edge cases), and is 50 to 100 times faster than the pure-Perl alternatives. Install it with cpanm Cpanel::JSON::XS and import the procedural API with use Cpanel::JSON::XS qw(decode_json encode_json). If you cannot install XS modules (restricted shared hosting, no C compiler, frozen Perl install), fall back to JSON::PP, which has shipped in core Perl since 5.14 and needs no installation at all. The JSON wrapper module (use JSON;) loads JSON::XS if it is installed and JSON::PP otherwise, using Cpanel::JSON::XS only when explicitly requested through PERL_JSON_BACKEND. It gives you portable code but adds one layer of indirection; for new projects, pick the backend directly.
What's the difference between JSON::XS and Cpanel::JSON::XS?
Cpanel::JSON::XS is a maintained fork of JSON::XS that the cPanel team started after JSON::XS development stalled. The API surface is the same (decode_json, encode_json, the OO constructor, and option methods all match), so switching is a one-line import change. Cpanel::JSON::XS adds bug fixes the original never merged (including fixes for threading and tied-hash edge cases), plus options that JSON::XS lacks, such as allow_dupkeys and allow_singlequote. It supports ithreads, which the original does not guarantee. Performance is comparable: both implement the parser in C and both are roughly 50 to 100 times faster than JSON::PP. For new projects there is no reason to pick JSON::XS over Cpanel::JSON::XS; existing JSON::XS code can usually swap modules with no other changes.
How do I parse a JSON file in Perl?
Slurp the file into a string and pass it to decode_json. The idiomatic pattern uses three-argument open with the :raw layer (so decode_json receives undecoded UTF-8 bytes), local $/ to disable input record separation (so a single readline returns the whole file), and decode_json on the result. Example: open my $fh, "<:raw", "data.json" or die "open: $!"; local $/; my $json = <$fh>; close $fh; my $data = decode_json($json);. File::Slurper's read_binary does the same read in one line. Either way the slurp approach loads everything into memory, so for large files (hundreds of megabytes) switch to JSON::SL for incremental parsing if memory is tight. For JSON Lines files (one JSON object per line), read line by line and call decode_json on each line individually; that pattern is memory-bounded regardless of file size.
How do I handle Unicode in Perl JSON?
The rule is: decode_json expects UTF-8 bytes (not characters), and the strings it returns are decoded Perl characters with the UTF-8 flag on. encode_json expects decoded Perl characters and returns UTF-8 bytes. Mismatching layers causes mojibake. Read files with the :raw layer, not :encoding(UTF-8), so you pass raw bytes to decode_json. If you have already decoded the input to characters yourself, use the OO API with utf8(0) instead: my $data = JSON->new->utf8(0)->decode($string);. For output, encode_json gives you bytes ready to write to a :raw file or send over a socket; the OO equivalent without utf8 gives you a character string you can print to a :encoding(UTF-8) file handle. The two paths are equivalent — pick one and stay consistent.
Why does my Perl JSON output have escaped Unicode?
The encoder escapes non-ASCII characters as \uXXXX sequences only when the ascii option is set (latin1 does the same for characters above U+00FF), or when you encode through a wrapper that sets one of them. JSON::XS and Cpanel::JSON::XS in their default state emit raw UTF-8 bytes for non-ASCII characters: encode_json($data) on a hash containing the string "café" produces literal UTF-8 bytes for the é, not \u00e9. If you are seeing \uXXXX everywhere, check whether your code calls ->ascii(1) on the encoder, or whether you are using the JSON wrapper with an older default. To force raw UTF-8 output, use JSON->new->utf8->encode($data) or the procedural encode_json($data). To force ASCII-safe output (useful for environments that mangle UTF-8), use JSON->new->ascii->encode($data), which turns every non-ASCII character into \uXXXX.
How do I represent booleans in Perl that round-trip to JSON?
Perl has no native boolean type, so JSON modules use scalar references for booleans: \1 is true, \0 is false. The Perl JSON modules covered here accept these references and emit literal true and false in the output. The modules also provide sentinel functions (JSON::true and JSON::false, or the equivalents under Cpanel::JSON::XS) that return blessed boolean objects you can compare with ==, !=, and boolean context, plus JSON::null, which returns undef. Round-tripping works in both directions: decode_json turns JSON true into a boolean object (typically JSON::PP::Boolean in current releases), and encoding that object back gives you true. The trap: a Perl 1 or 0 encodes as a number, not a boolean (you get 1, not true). If you need strict boolean output, wrap the value: my $flag = $cond ? \1 : \0;.
How do I pretty-print JSON in Perl?
Use the OO API with the pretty option: my $json = JSON->new->pretty->encode($data);. The pretty method is shorthand for setting indent(1), space_before(1), and space_after(1); together they produce a three-space-indented document with a space on each side of every colon. If you also need sorted keys for deterministic output (useful in tests, diffs, and content-addressed storage), chain canonical(1): JSON->new->pretty->canonical->encode($data). For tighter output that still has line breaks, set indent_length(2) where the backend supports it, or build your own combination of indent, space_before, and space_after. The procedural encode_json never pretty-prints; it always emits a single-line compact form for speed. To pretty-print existing JSON without changing it semantically, decode and re-encode through the OO API.
Can Perl parse JSON Lines / NDJSON?
Yes — JSON Lines is just one JSON value per line, so a while-readline loop with decode_json on each line is the whole solution. Memory stays bounded regardless of file size, which is the main reason JSON Lines exists. Example: open my $fh, "<:raw", "events.ndjson" or die $!; while (my $line = <$fh>) { chomp $line; next unless length $line; my $record = decode_json($line); process($record); }. The :raw layer is important for the same Unicode reasons as full-file decoding. For writing JSON Lines, use encode_json on each record and print with an explicit newline — print {$fh} encode_json($record), "\n";. Do not use pretty-print mode for JSON Lines output; the per-line format requires that each line be a complete, single-line JSON value.
Further reading and primary sources
- Cpanel::JSON::XS on MetaCPAN — Authoritative reference for the recommended Perl JSON module — full option list and method docs
- JSON::PP on MetaCPAN — Pure-Perl implementation shipped in core since 5.14 — fallback when XS is not available
- JSON wrapper on MetaCPAN — Backend-agnostic wrapper that auto-selects the fastest available JSON implementation
- JSON::SL on MetaCPAN — Streaming JSON parser with JSON Pointer paths for huge single-document inputs
- RFC 8259 — The JSON Data Interchange Format — The JSON specification — the rules every parser, including the Perl modules, must follow
- Parse JSON in Python — Companion guide covering json.loads, ujson, orjson, and Python booleans
- Parse JSON in Ruby — Companion guide covering JSON.parse, Oj, and Ruby symbol keys
- Parse JSON in PHP — Companion guide covering json_decode, associative vs object mode, and JSON_THROW_ON_ERROR
- Parse JSON in Bash — Companion guide covering jq, jaq, and shell-friendly JSON pipelines
- JSON Lines format — The streaming-friendly one-record-per-line variant of JSON, perfect for large data sets in Perl
- JSON encode/decode across languages — A cross-language reference for encode/decode semantics, Unicode, and boolean handling