Flatten Nested JSON in Python
Flattening nested JSON converts a multi-level object into a single-level dictionary with compound keys. For example, {"address": {"city": "London"}} becomes {"address.city": "London"}. This is essential for loading JSON into spreadsheets, databases, or pandas DataFrames that expect flat key-value structures.
Validate your JSON structure in Jsonic before flattening.
Open JSON FormatterManual recursive flatten function
The cleanest zero-dependency approach is a recursive function that walks the nested structure and builds compound keys as it descends. This works for any combination of dicts and lists.
def flatten_json(obj, parent_key='', sep='.'):
"""Flatten a nested dict/list to a single-level dict with compound keys."""
items = {}
if isinstance(obj, dict):
for k, v in obj.items():
new_key = f"{parent_key}{sep}{k}" if parent_key else k
items.update(flatten_json(v, new_key, sep))
elif isinstance(obj, list):
for i, v in enumerate(obj):
new_key = f"{parent_key}{sep}{i}" if parent_key else str(i)
items.update(flatten_json(v, new_key, sep))
else:
items[parent_key] = obj
return items
# Example
data = {
"user": {
"name": "Alice",
"address": {
"city": "London",
"zip": "SW1A 1AA"
}
},
"score": 95
}
flat = flatten_json(data)
# {
# "user.name": "Alice",
# "user.address.city": "London",
# "user.address.zip": "SW1A 1AA",
# "score": 95
# }Change sep='_' to use underscores instead of dots in compound keys — useful when the output will be used as SQL column names or CSV headers.
Flatten a JSON file
To flatten JSON loaded from disk, combine json.load() with the recursive function above:
import json
def flatten_json(obj, parent_key='', sep='.'):
items = {}
if isinstance(obj, dict):
for k, v in obj.items():
new_key = f"{parent_key}{sep}{k}" if parent_key else k
items.update(flatten_json(v, new_key, sep))
elif isinstance(obj, list):
for i, v in enumerate(obj):
new_key = f"{parent_key}{sep}{i}" if parent_key else str(i)
items.update(flatten_json(v, new_key, sep))
else:
items[parent_key] = obj
return items
with open('data.json') as f:
data = json.load(f)
flat = flatten_json(data)
print(json.dumps(flat, indent=2))The output is a plain Python dict you can write back to JSON with json.dumps(), convert to CSV rows, or insert into a database.
Flatten an array of objects
When the JSON root is a list of records, apply flatten_json to each element with a list comprehension:
records = [
{"id": 1, "user": {"name": "Alice", "role": "admin"}},
{"id": 2, "user": {"name": "Bob", "role": "viewer"}},
]
flat_records = [flatten_json(r) for r in records]
# [
# {"id": 1, "user.name": "Alice", "user.role": "admin"},
# {"id": 2, "user.name": "Bob", "user.role": "viewer"},
# ]This is useful for converting a JSON array into a CSV-ready list of flat dictionaries. Pass flat_records directly to csv.DictWriter or pd.DataFrame(flat_records).
pandas json_normalize (easiest for DataFrames)
If you are already using pandas, pd.json_normalize flattens nested dicts automatically and returns a DataFrame with dot-separated column names:
import pandas as pd
import json
with open('data.json') as f:
data = json.load(f) # should be a list of dicts
df = pd.json_normalize(data)
print(df.columns.tolist())
# ['id', 'user.name', 'user.role']
# Custom separator
df = pd.json_normalize(data, sep='_')
# columns: ['id', 'user_name', 'user_role']
# Save to CSV (now flat)
df.to_csv('output.csv', index=False)pd.json_normalize handles nested dicts automatically. It does not expand arrays by default — use the record_path parameter for nested arrays (see next section).
Expand nested arrays with json_normalize record_path
When a JSON object contains an array of sub-records, use record_path to expand those items into separate rows while keeping parent fields via meta:
data = [
{"order_id": 1, "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]},
{"order_id": 2, "items": [{"sku": "C3", "qty": 5}]},
]
df = pd.json_normalize(
data,
record_path='items', # expand this array
meta=['order_id'], # keep these parent fields
meta_prefix='order_', # optional prefix for parent fields
)
# sku qty order_order_id
# A1 2 1
# B2 1 1
# C3 5 2Each item in the items array becomes its own row. Parent fields like order_id are repeated on each row so you can join or group later.
flatten_dict library
The flatten-dict package provides more key-reducer options than a hand-rolled function, including dot notation, tuple keys, and fully custom reducers:
# pip install flatten-dict
from flatten_dict import flatten
data = {"a": {"b": {"c": 1}, "d": 2}, "e": 3}
# Dot-separated keys
flat = flatten(data, reducer='dot')
# {"a.b.c": 1, "a.d": 2, "e": 3}
# Tuple keys (default)
flat = flatten(data)
# {("a","b","c"): 1, ("a","d"): 2, ("e",): 3}
# Custom reducer
flat = flatten(data, reducer=lambda k1, k2: f"{k1}__{k2}" if k1 else k2)flatten_dict supports multiple key reducers and is more configurable than a hand-rolled function. Note that it does not handle lists — use the recursive approach above if your JSON contains arrays.
Comparison: three flattening approaches
| Approach | Dependency | Handles arrays | Use when |
|---|---|---|---|
flatten_json() (recursive) | None (stdlib) | Yes (as indexed keys) | No external deps allowed |
pd.json_normalize | pandas | Yes (with record_path) | Already using pandas, loading into DataFrame |
flatten_dict | flatten-dict | No (dicts only) | Need custom key reducers |
Frequently asked questions
What does flattening JSON mean?
Flattening converts a deeply nested JSON object into a single-level dictionary where all keys are compound strings representing the original path. {"a": {"b": 1}} becomes {"a.b": 1}. The separator (. or _) is configurable.
How do I choose the separator for flattened keys?
Use a dot (.) when the keys will be read by humans or used in pandas. Use an underscore (_) when the result will go into a SQL column name or CSV header that does not allow dots. Avoid separators that already appear in your key names to prevent ambiguity.
What happens to arrays when flattening?
The manual recursive function converts array items to indexed keys: [1,2,3] becomes {"0":1,"1":2,"2":3}. pandas json_normalize requires you to specify arrays via record_path to expand them into rows. flatten_dict by default does not handle lists.
How do I unflatten a flat dict back to nested JSON?
Write a recursive unflatten: split each key on the separator, walk down the output dict creating nested dicts for each part, and set the leaf value. There is also an unflatten() function in flatten_dict: from flatten_dict import unflatten.
How do I flatten only one level deep?
Merge just the top-level nested dicts without recursing: result = {**top, **nested_dict} or use dict comprehensions to unpack only the first layer. For pandas, json_normalize with max_level=1 limits normalization to one level deep.
Is flattening the same as normalizing JSON?
Flattening and normalizing overlap in practice. pandas uses the term "normalize" (json_normalize) for the same operation. In databases, normalization means something different (removing redundancy). In the context of JSON, both terms typically mean converting nested structures into flat key-value pairs.
Validate your JSON
Paste your JSON into Jsonic to check for syntax errors and inspect the nested structure before running your flatten script.
Open JSON Formatter