Convert JSON to CSV in Python

Converting a JSON array of objects to CSV in Python takes 3 lines with the built-in csv module — no pip install needed. For a JSON array like [{"name":"Alice","age":30},{"name":"Bob","age":25}], csv.DictWriter writes each object as a row with keys as column headers. pandas.read_json().to_csv() is an even shorter one-liner but requires pandas. Both handle the core case; the complexity comes with nested objects (you need to flatten first) and missing keys (different objects may have different fields). This guide covers both approaches plus handling nested JSON, inconsistent keys, and large JSON files with streaming. To learn more about parsing JSON in Python generally, see the dedicated guide. For background on the conversion format, see JSON to CSV concepts.

Convert your JSON to CSV instantly — no code required.

Open JSON to CSV Converter

csv.DictWriter: built-in, no dependencies

The csv module is part of Python's standard library — no pip install required. csv.DictWriter takes a list of field names and writes each dictionary in your data as one row. The key steps are:

  • fieldnames: extract from the first object's keys (preserves insertion order in Python 3.7+), or define explicitly to control column order.
  • DictWriter(f, fieldnames=fieldnames) — creates the writer bound to an open file handle.
  • writeheader() — writes the header row using the field names.
  • writerows(data) — writes all rows in one call.

import json
import csv

# Load JSON from file
with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

# data = [{"name": "Alice", "age": 30, "city": "Paris"},
#          {"name": "Bob",   "age": 25, "city": "London"}]

# Extract field names from the first record (preserves key order)
fieldnames = list(data[0].keys())
# Or define explicitly to control column order:
# fieldnames = ['name', 'age', 'city']

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)

# output.csv:
# name,age,city
# Alice,30,Paris
# Bob,25,London

Two parameters handle edge cases cleanly:

  • restval='' — the value written for any key in fieldnames that an object is missing. The default (None) also renders as an empty cell, so setting restval='' makes the intent explicit; DictWriter never raises for missing keys.
  • extrasaction='ignore' — if an object has keys not in fieldnames, silently skip them (default is 'raise', which throws a ValueError).

import json
import csv

with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

# Collect ALL unique keys across every object (handles inconsistent records)
all_keys = set().union(*[d.keys() for d in data])
fieldnames = sorted(all_keys)  # sorted for deterministic column order

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(
        f,
        fieldnames=fieldnames,
        restval='',           # fill missing keys with empty string
        extrasaction='ignore' # skip keys not in fieldnames
    )
    writer.writeheader()
    writer.writerows(data)

# The default quoting (csv.QUOTE_MINIMAL) already quotes any field that
# contains a comma, quote character, or newline. quoting=csv.QUOTE_ALL
# goes further and wraps every field in double quotes:
with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(
        f,
        fieldnames=fieldnames,
        restval='',
        quoting=csv.QUOTE_ALL
    )
    writer.writeheader()
    writer.writerows(data)

pandas: one-liner conversion

If you have pandas installed (pip install pandas), reading a JSON file and writing CSV is a single chained call. pandas automatically infers column types, handles missing values, and supports many output options.

import pandas as pd

# From a JSON file — one line
pd.read_json('data.json').to_csv('output.csv', index=False)

# index=False suppresses the default 0-based integer index column
# Without it, the CSV gains an extra first column: 0, 1, 2, ...

# From a JSON string in memory
from io import StringIO

json_str = '[{"name":"Alice","age":30},{"name":"Bob","age":25}]'
pd.read_json(StringIO(json_str)).to_csv('output.csv', index=False)

Common options you'll need in practice:

import pandas as pd

df = pd.read_json('data.json')

# Control column ordering explicitly
df[['name', 'age', 'email']].to_csv('output.csv', index=False)

# Excel-compatible UTF-8 (adds BOM so Excel opens with correct encoding)
df.to_csv('output.csv', index=False, encoding='utf-8-sig')

# European CSV format — semicolons instead of commas
df.to_csv('output.csv', index=False, sep=';')

# Combine options: ordered columns + Excel UTF-8 + no index
df[['name', 'age', 'city']].to_csv(
    'output.csv',
    index=False,
    encoding='utf-8-sig'
)

# Write to a string instead of a file (useful for HTTP responses)
csv_string = df.to_csv(index=False)
print(csv_string)

The encoding='utf-8-sig' option is particularly important for files that will be opened in Microsoft Excel on Windows. Without it, Excel interprets the file as Windows-1252, corrupting accented characters and non-Latin scripts. The BOM (byte order mark) is invisible to non-Excel tools and causes no harm elsewhere.
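A quick way to see the BOM in action — a minimal sketch that writes one accented row and inspects the raw bytes (the file path is arbitrary):

```python
import csv
import os
import tempfile

# Write a row with accented text using utf-8-sig in a temp directory
path = os.path.join(tempfile.mkdtemp(), 'bom_demo.csv')
with open(path, 'w', newline='', encoding='utf-8-sig') as f:
    writer = csv.writer(f)
    writer.writerow(['name', 'city'])
    writer.writerow(['Zoé', 'Besançon'])

# The first three bytes are the UTF-8 BOM that Excel looks for
with open(path, 'rb') as f:
    raw = f.read()
print(raw[:3] == b'\xef\xbb\xbf')  # True

# Reading back with utf-8-sig strips the BOM transparently
with open(path, encoding='utf-8-sig') as f:
    print(f.readline().strip())  # name,city
```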

Handling nested JSON objects

Flat JSON arrays convert to CSV trivially, but nested objects like {"user": {"id": 1, "name": "Alice"}} need a decision: what should the CSV column name be, and how many columns should the nested object produce? There are three practical approaches. For more detail on the flattening step itself, see flatten nested JSON in Python.

Option 1: pd.json_normalize() — automatic flattening

import json
import pandas as pd

with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

# data = [
#   {"id": 1, "address": {"city": "Paris", "zip": "75001"}},
#   {"id": 2, "address": {"city": "London", "zip": "EC1A"}}
# ]

df = pd.json_normalize(data, sep='_')
df.to_csv('output.csv', index=False)

# output.csv:
# id,address_city,address_zip
# 1,Paris,75001
# 2,London,EC1A

Option 2: Recursive flatten function — full control

import json
import csv

def flatten(d, parent_key='', sep='_'):
    """Recursively flatten a nested dict into a single-level dict."""
    items = []
    for k, v in d.items():
        new_key = parent_key + sep + k if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten(v, new_key, sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

flat_data = [flatten(record) for record in data]

all_keys = set().union(*[d.keys() for d in flat_data])
fieldnames = sorted(all_keys)

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(flat_data)

Option 3: Serialize nested objects as JSON strings

import json
import csv

with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

# Serialize any nested dict/list value to a JSON string in the cell
def stringify_nested(record):
    return {
        k: json.dumps(v, ensure_ascii=False) if isinstance(v, (dict, list)) else v
        for k, v in record.items()
    }

flat_data = [stringify_nested(record) for record in data]

fieldnames = list(flat_data[0].keys())

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(flat_data)

# The "address" column will contain: '{"city": "Paris", "zip": "75001"}'

Use this decision table to choose the right approach:

| Approach | When to use | Columns produced |
|---|---|---|
| json_normalize() | 1–2 nesting levels, need to query by nested fields | One flat column per nested field (address_city) |
| Recursive flatten | Deep nesting, custom separator, or pre-pandas workflow | Custom flat columns with full path as key |
| JSON string in cell | Complex or varying nested structure, downstream tool can parse JSON | One column per top-level key; nested value is a JSON string |

Handling inconsistent keys and large files

Inconsistent keys

Real-world JSON exports often have records with different sets of keys — one user object has "email", another doesn't. If you derive fieldnames only from the first record, you'll miss columns that appear later. The fix is to scan all records first:

import json
import csv

with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

# Collect the union of all keys across every record
all_keys = set().union(*[d.keys() for d in data])
fieldnames = sorted(all_keys)  # sorted = deterministic column order

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(
        f,
        fieldnames=fieldnames,
        restval=''  # write empty string for missing keys
    )
    writer.writeheader()
    writer.writerows(data)

# With pandas — automatic: pd.DataFrame(data) fills missing keys with NaN
# NaN becomes an empty cell in the CSV output
import pandas as pd
pd.DataFrame(data).to_csv('output.csv', index=False)

Large JSON files: streaming with ijson

Loading a 1 GB JSON file with json.load() reads the entire file into memory — easily 3 GB or more after Python object overhead. The ijson library (pip install ijson) parses JSON as a stream, keeping memory usage around 10 MB regardless of file size.

import ijson
import csv

# ijson streams the JSON file item by item — constant memory usage
# Assumes the JSON is an array at the top level: [{...}, {...}, ...]

fieldnames = ['id', 'name', 'email', 'city']  # must be known in advance

with open('large_data.json', 'rb') as json_file, \
     open('output.csv', 'w', newline='', encoding='utf-8') as csv_file:

    writer = csv.DictWriter(
        csv_file,
        fieldnames=fieldnames,
        restval='',
        extrasaction='ignore'
    )
    writer.writeheader()

    # ijson.items() yields one parsed object per array element
    for item in ijson.items(json_file, 'item'):
        writer.writerow(item)

# Memory comparison for a 1 GB JSON file:
# json.load()  → ~3 GB peak RAM (full object graph in memory)
# ijson stream → ~10 MB peak RAM (one item at a time)

The 'item' prefix in ijson.items(f, 'item') tells ijson to yield each element of the top-level array. If your data is nested under a key (e.g., {"records": [...]}), use 'records.item' as the prefix instead. For a pandas-native streaming approach, pd.read_json supports chunksize=1000 — but only in combination with lines=True, meaning the input must be JSON Lines (one object per line) rather than a single JSON array.
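The chunked pandas route sketched below; note that read_json only accepts chunksize together with lines=True, so the input must be JSON Lines. The file names are illustrative, and a small input file is created first so the snippet is self-contained:

```python
import json
import pandas as pd

# Build a small JSON Lines file for the demo (one object per line)
records = [{'id': i, 'name': f'user{i}'} for i in range(5)]
with open('large_data.jsonl', 'w', encoding='utf-8') as f:
    for r in records:
        f.write(json.dumps(r) + '\n')

# read_json yields DataFrames of 2 records at a time; write the CSV
# header only for the first chunk, then append the rest
first = True
for chunk in pd.read_json('large_data.jsonl', lines=True, chunksize=2):
    chunk.to_csv('output.csv', mode='w' if first else 'a',
                 header=first, index=False)
    first = False
```

Each chunk is an ordinary DataFrame, so per-chunk transformations (column selection, flattening) slot in before the to_csv call.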

Frequently asked questions

How do I convert a JSON file to CSV in Python?

Use csv.DictWriter from the standard library: open the JSON file with json.load(), get field names from the first record with list(data[0].keys()), then open the output CSV with open('output.csv', 'w', newline='') and write with DictWriter — call writeheader() then writerows(data). With pandas (pip install pandas), it's one line: pd.read_json('data.json').to_csv('output.csv', index=False). Use the online JSON to CSV converter for a quick no-code conversion.

How do I convert a JSON string to CSV in Python (without a file)?

With csv.DictWriter and io.StringIO: parse the string with json.loads(), create an io.StringIO() buffer, write to it exactly like a file, then call output.getvalue() to get the CSV string. With pandas: from io import StringIO; import pandas as pd; csv_str = pd.read_json(StringIO(json_str)).to_csv(index=False). Use io.StringIO as an in-memory file for both approaches — it behaves identically to a real file handle.
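The csv.DictWriter plus io.StringIO route described above, sketched end to end:

```python
import csv
import io
import json

json_str = '[{"name":"Alice","age":30},{"name":"Bob","age":25}]'
data = json.loads(json_str)

# io.StringIO behaves like an open text file, but lives in memory
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(data[0].keys()))
writer.writeheader()
writer.writerows(data)

csv_str = output.getvalue()
print(csv_str)
# name,age
# Alice,30
# Bob,25
```

The csv module terminates rows with \r\n by default; pass lineterminator='\n' to DictWriter if you need plain newlines in the string.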

How do I handle nested JSON when converting to CSV in Python?

Use pandas.json_normalize(): from pandas import json_normalize; df = json_normalize(data, sep='_'); df.to_csv('output.csv', index=False). This flattens {"address": {"city": "Paris"}} into an address_city column. For deeper control, write a recursive flatten function: iterate over the dict, and when a value is itself a dict, recurse with the current key prepended. See the flatten nested JSON in Python guide for complete implementations.

Why does my CSV have an extra first column with numbers?

This is the pandas DataFrame index — suppress it with index=False: df.to_csv('output.csv', index=False). By default, pandas adds a 0-based integer index as the first column when writing CSV. Using index=False omits this column, which is almost always what you want when the source JSON already has IDs or when you're sharing the CSV with non-Python tools like Excel or databases.

How do I handle missing keys in JSON objects when converting to CSV?

JSON objects in an array may have different keys — some records have "email", others don't. Use csv.DictWriter's restval parameter to fill missing values: writer = csv.DictWriter(f, fieldnames=all_fields, restval=''). First collect all unique keys: all_fields = sorted(set().union(*[d.keys() for d in data])). With pandas, pd.DataFrame(data) automatically fills missing keys with NaN, and to_csv() writes them as empty cells.

How do I preserve Unicode characters (non-ASCII) in the CSV output?

Open the output file with encoding='utf-8': open('output.csv', 'w', newline='', encoding='utf-8'). For Excel compatibility, use encoding='utf-8-sig' — this adds a UTF-8 BOM (byte order mark) that Excel uses to detect UTF-8, preventing mangled accented characters. With pandas: df.to_csv('output.csv', index=False, encoding='utf-8-sig'). If you're generating the CSV in memory with io.StringIO, Unicode is preserved automatically since Python strings are always Unicode internally — encoding only matters when writing to bytes (files, HTTP responses).

Try the JSON to CSV converter

Paste any JSON array and download a clean CSV file — no Python setup needed.

Open JSON to CSV Converter