Python Dataclasses and JSON: Serialization Guide
Python dataclasses (introduced in Python 3.7) give you structured, typed data classes with minimal boilerplate. But they don't come with built-in JSON support — you need to know the right tools to serialize them to JSON and deserialize JSON back into typed dataclass instances. This guide covers every approach: from the standard library's asdict(), to dacite, marshmallow-dataclass, and Pydantic.
Key Terms
- dataclass: A Python class decorated with @dataclass from the dataclasses module. Automatically generates __init__, __repr__, and __eq__ methods from annotated class fields. Available since Python 3.7.
- asdict: dataclasses.asdict(obj), a standard library function that recursively converts a dataclass instance (and any nested dataclass fields) into a plain Python dict suitable for passing to json.dumps().
- dacite: A third-party library (pip install dacite) that creates dataclass instances from dicts. Handles nested dataclasses, Optional fields, Union types, and type coercion, filling the gap that dataclasses itself leaves for deserialization.
- frozen dataclass: A dataclass created with @dataclass(frozen=True). All fields are read-only after construction; setting any attribute raises FrozenInstanceError. Automatically generates __hash__, making instances usable as dict keys or set members.
- Pydantic BaseModel: The base class for Pydantic v2 models. Provides runtime type validation, JSON serialization (model_dump_json()), deserialization (model_validate_json()), and JSON Schema generation (model_json_schema()), all built in, without additional libraries.
1. Dataclass to JSON: asdict() + json.dumps()
The standard library approach: dataclasses.asdict() converts the dataclass to a dict, then json.dumps() serializes it.
import json
import dataclasses
from dataclasses import dataclass, field
from typing import List
@dataclass
class Product:
    name: str
    price: float
    tags: List[str] = field(default_factory=list)
# Serialize to JSON
product = Product(name="Widget", price=9.99, tags=["sale", "new"])
d = dataclasses.asdict(product)
# {'name': 'Widget', 'price': 9.99, 'tags': ['sale', 'new']}
json_str = json.dumps(d, indent=2)
print(json_str)
# {
#   "name": "Widget",
#   "price": 9.99,
#   "tags": [
#     "sale",
#     "new"
#   ]
# }
# One-liner
json_str = json.dumps(dataclasses.asdict(product))
To deserialize a simple flat dataclass (no nested types), unpack the parsed dict directly:
data = json.loads(json_str)
product = Product(**data)
print(product.name) # Widget
print(product.price) # 9.99
# WARNING: This only works if all keys in data match field names exactly
# and all values already have the right types (json.loads returns only
# dict, list, str, int, float, bool, and None)
2. JSON to Dataclass: dacite
For deserializing JSON into dataclasses — especially with nested types — install dacite:
pip install dacite
import json
from dataclasses import dataclass
from typing import Optional
import dacite
@dataclass
class Address:
    street: str
    city: str
    country: str = "US"
@dataclass
class User:
    name: str
    email: str
    age: Optional[int] = None
    address: Optional[Address] = None
json_str = '''
{
"name": "Ada Lovelace",
"email": "ada@example.com",
"age": 36,
"address": {
"street": "123 Main St",
"city": "London",
"country": "GB"
}
}
'''
data = json.loads(json_str)
user = dacite.from_dict(User, data)
print(user.name) # Ada Lovelace
print(user.address.city) # London
print(type(user.address)) # <class 'Address'> -- properly typed!
dacite raises dacite.exceptions.WrongTypeError if a field has the wrong type, and dacite.exceptions.MissingValueError if a required field is absent.
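To see the gap that dacite fills, the sketch below uses only the standard library and unpacks a similar nested payload with User(**data). The class names mirror the example above and the payload is made up for illustration; nothing here is dacite API. Dataclasses do not check or convert field types at construction, so the nested object stays a plain dict unless you build it yourself.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    street: str
    city: str
    country: str = "US"


@dataclass
class User:
    name: str
    email: str
    age: Optional[int] = None
    address: Optional[Address] = None


# Illustrative payload (an assumption for this sketch)
payload = (
    '{"name": "Ada Lovelace", "email": "ada@example.com", '
    '"address": {"street": "123 Main St", "city": "London"}}'
)
data = json.loads(payload)

# Plain unpacking succeeds, but the nested value is not converted.
naive = User(**data)
print(type(naive.address).__name__)  # dict

# Manual fix: construct the nested dataclass before unpacking.
fixed = User(**{**data, "address": Address(**data["address"])})
print(fixed.address.city)     # London
print(fixed.address.country)  # US (the field default)
```

The manual fix is manageable for one nested field, but it has to be repeated for every nested type, which is the bookkeeping a library such as dacite takes over.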
3. Nested Dataclasses
dataclasses.asdict() handles nested dataclasses recursively — no extra code needed for serialization:
import json
import dataclasses
from dataclasses import dataclass
from typing import List
@dataclass
class Tag:
    name: str
    color: str
@dataclass
class Article:
    title: str
    body: str
    tags: List[Tag]
article = Article(
    title="Python Tips",
    body="Use dataclasses for structured data.",
    tags=[Tag("python", "blue"), Tag("tips", "green")]
)
print(json.dumps(dataclasses.asdict(article), indent=2))
# {
#   "title": "Python Tips",
#   "body": "Use dataclasses for structured data.",
#   "tags": [
#     {
#       "name": "python",
#       "color": "blue"
#     },
#     {
#       "name": "tips",
#       "color": "green"
#     }
#   ]
# }
For deserialization, dacite handles lists of nested dataclasses via the List[Tag] type annotation automatically:
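import dacite
# Round trip: serialize the article from above, then parse it back
json_str = json.dumps(dataclasses.asdict(article))
data = json.loads(json_str)
article = dacite.from_dict(Article, data)
print(article.tags[0].color) # blue
print(type(article.tags[0])) # <class 'Tag'>
4. Handling datetime and Enums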
datetime and Enum are not JSON-serializable by default. Here are the patterns for each.
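As a quick illustration of the problem, the sketch below passes a datetime and a plain Enum member straight to json.dumps() and records the resulting errors. The Color enum is a hypothetical class used only for this demonstration.

```python
import json
from datetime import datetime
from enum import Enum


class Color(Enum):  # hypothetical enum, only for this demonstration
    RED = 1


errors = {}
for label, value in [("datetime", datetime(2026, 5, 19)), ("enum", Color.RED)]:
    try:
        json.dumps({"value": value})
    except TypeError as exc:
        # The standard encoder rejects both types with a TypeError.
        errors[label] = str(exc)

print(errors["datetime"])
print(errors["enum"])
```

Both calls fail with a TypeError saying the object is not JSON serializable, which is why the patterns in this section pass a default= function or rely on a str-based Enum.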
datetime Fields
import json
import dataclasses
from dataclasses import dataclass
from datetime import datetime
@dataclass
class Event:
    title: str
    created_at: datetime
event = Event(title="Launch", created_at=datetime(2026, 5, 19, 12, 0, 0))
# Option 1: default=str (simplest — calls str() on non-serializable objects)
json_str = json.dumps(dataclasses.asdict(event), default=str)
# {"title": "Launch", "created_at": "2026-05-19 12:00:00"}
# Option 2: a custom default function that calls isoformat() (explicit)
def json_serial(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Type {type(obj)} not serializable")
json_str = json.dumps(dataclasses.asdict(event), default=json_serial)
# {"title": "Launch", "created_at": "2026-05-19T12:00:00"}
# Deserialization: parse the string back to datetime
data = json.loads(json_str)
data["created_at"] = datetime.fromisoformat(data["created_at"])
event = Event(**data)
Enum Fields
import json
import dataclasses
from dataclasses import dataclass
from enum import Enum
class Status(str, Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    PENDING = "pending"
@dataclass
class Account:
    name: str
    status: Status
account = Account(name="Ada", status=Status.ACTIVE)
# str Enum: inheriting from str makes Enum values JSON-serializable directly
json_str = json.dumps(dataclasses.asdict(account))
# {"name": "Ada", "status": "active"} -- works because Status inherits from str
# For plain Enum (not str Enum), use default=lambda o: o.value
class Priority(Enum):
    LOW = 1
    HIGH = 2
@dataclass
class Task:
    name: str
    priority: Priority
task = Task("Fix bug", Priority.HIGH)
json_str = json.dumps(dataclasses.asdict(task), default=lambda o: o.value)
# {"name": "Fix bug", "priority": 2}
# Deserialization with dacite and Enum
import dacite
data = json.loads(json_str)
# cast=[Enum] tells dacite to build Enum fields from their raw values (2 -> Priority.HIGH)
task = dacite.from_dict(Task, data, config=dacite.Config(cast=[Enum]))
5. marshmallow-dataclass for Validation
marshmallow-dataclass generates a full marshmallow Schema from a dataclass, adding validation and field-level error reporting on top of serialization.
pip install marshmallow marshmallow-dataclass
from marshmallow_dataclass import dataclass
from marshmallow import ValidationError
from typing import Optional
from datetime import datetime
@dataclass
class User:
    name: str
    email: str
    age: Optional[int] = None
    created_at: Optional[datetime] = None
# Schema is auto-generated
schema = User.Schema()
# Serialize: dump() -> dict, dumps() -> JSON string
user = User(name="Ada", email="ada@example.com", age=30,
created_at=datetime(2026, 5, 19))
json_str = schema.dumps(user)
# '{"name": "Ada", "email": "ada@example.com", "age": 30, "created_at": "2026-05-19T00:00:00"}'
# Deserialize + validate: load() raises ValidationError on bad data
try:
    user = schema.load({"name": "Ada", "email": "ada@example.com", "age": "not-a-number"})
except ValidationError as e:
    print(e.messages)
    # {'age': ['Not a valid integer.']}
# Successful deserialization
user = schema.load({"name": "Ada", "email": "ada@example.com", "age": 30})
print(type(user)) # <class 'User'>
print(user.age) # 30
marshmallow-dataclass handles datetime, UUID, and custom field types automatically, without any default= workarounds.
6. Pydantic vs dataclasses
Pydantic v2 BaseModel is the most capable option for JSON-heavy applications. Here is a direct comparison:
| Feature | dataclass + dacite | Pydantic BaseModel |
|---|---|---|
| Serialize to JSON | json.dumps(asdict(obj)) | obj.model_dump_json() |
| Deserialize from JSON | dacite.from_dict(C, json.loads(s)) | MyModel.model_validate_json(s) |
| Runtime validation | No (type errors only) | Yes (rich error messages) |
| datetime / UUID / Enum | Manual encoder needed | Built-in support |
| JSON Schema | Not available | MyModel.model_json_schema() |
| Immutability | @dataclass(frozen=True) | model_config = ConfigDict(frozen=True) |
| Dependency size | stdlib + ~30 KB (dacite) | ~1.5 MB |
pip install pydantic
from pydantic import BaseModel
from typing import Optional
from datetime import datetime
class Address(BaseModel):
    street: str
    city: str
    country: str = "US"
class User(BaseModel):
    name: str
    email: str
    age: Optional[int] = None
    created_at: Optional[datetime] = None
    address: Optional[Address] = None
# Deserialize from JSON string — validates types automatically
json_str = '''
{
"name": "Ada",
"email": "ada@example.com",
"age": 30,
"created_at": "2026-05-19T12:00:00",
"address": {"street": "123 Main", "city": "London", "country": "GB"}
}
'''
user = User.model_validate_json(json_str)
print(user.address.city) # London
print(type(user.created_at)) # <class 'datetime.datetime'> -- auto-parsed!
# Serialize back to JSON
print(user.model_dump_json(indent=2))
# Generate JSON Schema for OpenAPI docs
import json
print(json.dumps(User.model_json_schema(), indent=2))
7. Custom JSON Encoder for Dataclasses
For maximum control, write a custom json.JSONEncoder subclass that handles your specific non-serializable types:
import json
import dataclasses
from dataclasses import dataclass
from datetime import datetime, date
from enum import Enum
from uuid import UUID
from typing import Any
class DataclassEncoder(json.JSONEncoder):
    """
    JSON encoder that handles:
    - dataclass instances (via asdict)
    - datetime / date (via isoformat)
    - Enum (via .value)
    - UUID (via str)
    """
    def default(self, obj: Any) -> Any:
        if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
            return dataclasses.asdict(obj)
        if isinstance(obj, (datetime, date)):
            return obj.isoformat()
        if isinstance(obj, Enum):
            return obj.value
        if isinstance(obj, UUID):
            return str(obj)
        return super().default(obj)
# Usage
@dataclass
class OrderLine:
    product_id: UUID
    quantity: int
    unit_price: float
@dataclass
class Order:
    order_id: UUID
    created_at: datetime
    lines: list[OrderLine]
from uuid import uuid4
order = Order(
    order_id=uuid4(),
    created_at=datetime.now(),
    lines=[
        OrderLine(uuid4(), 2, 9.99),
        OrderLine(uuid4(), 1, 29.99),
    ]
)
# Serialize directly — no need to call asdict() first
json_str = json.dumps(order, cls=DataclassEncoder, indent=2)
print(json_str)
# {
# "order_id": "f47ac10b-...",
# "created_at": "2026-05-19T12:00:00",
# "lines": [
# {"product_id": "...", "quantity": 2, "unit_price": 9.99},
# ...
# ]
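# }
Note that DataclassEncoder.default() is called only for objects that the standard encoder cannot handle. Dataclass instances are not dicts, so they reach default(), where asdict() converts them. Nested dataclasses are converted by asdict() itself, and any datetime, Enum, or UUID values left in the resulting dict are passed back through default() in turn, so the approach works even for deeply nested structures.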
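To make the dispatch order concrete, here is a small tracing sketch. The Note dataclass and the TracingEncoder name are hypothetical and exist only for this illustration. The encoder records the type of every object that reaches default().

```python
import dataclasses
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Note:  # hypothetical dataclass for this sketch
    text: str
    created_at: datetime


seen = []  # type names of objects that reached default()


class TracingEncoder(json.JSONEncoder):  # hypothetical name
    def default(self, obj):
        seen.append(type(obj).__name__)
        if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
            return dataclasses.asdict(obj)
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)


payload = {"note": Note("hi", datetime(2026, 5, 19, 12, 0)), "count": 1}
out = json.dumps(payload, cls=TracingEncoder)

print(seen)  # ['Note', 'datetime']
print(out)   # {"note": {"text": "hi", "created_at": "2026-05-19T12:00:00"}, "count": 1}
```

The str and int values never appear in seen because the standard encoder handles them itself; only the dataclass instance and the datetime left inside the asdict() result are routed through default().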
FAQ
How do I convert a Python dataclass to JSON?
Use dataclasses.asdict(obj) to get a plain dict, then json.dumps() to serialize it: json.dumps(dataclasses.asdict(product)). For non-serializable fields like datetime or Enum, add default=str or a custom encoder class to json.dumps().
How do I deserialize JSON into a Python dataclass?
Parse the JSON with json.loads(), then either unpack with MyClass(**data) (for flat dataclasses) or use dacite.from_dict(MyClass, data) (for nested dataclasses or complex types). For validation during deserialization, use Pydantic's MyModel.model_validate_json() or marshmallow-dataclass.
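Direct unpacking fails as soon as the JSON carries a key that is not a field of the dataclass. One possible workaround for flat dataclasses, sketched below with the standard library only, is to filter the dict against dataclasses.fields() first. The helper name from_flat_dict and the extra "sku" key are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Product:
    name: str
    price: float
    tags: List[str] = field(default_factory=list)


def from_flat_dict(cls, data):
    """Hypothetical helper: keep only keys that are init fields of cls."""
    allowed = {f.name for f in fields(cls) if f.init}
    return cls(**{k: v for k, v in data.items() if k in allowed})


raw = json.loads('{"name": "Widget", "price": 9.99, "sku": "W-1"}')

try:
    Product(**raw)  # "sku" is not a field, so __init__ rejects it
except TypeError as exc:
    print("direct unpacking failed:", exc)

product = from_flat_dict(Product, raw)
print(product)  # Product(name='Widget', price=9.99, tags=[])
```

This helper silently drops unknown keys and performs no type checking or nested conversion, so it is only a fit for simple, trusted payloads.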
Does dataclasses.asdict() handle nested dataclasses?
Yes — asdict() is recursive. Nested dataclass fields become nested dicts, and lists of dataclasses become lists of dicts. The result is always a plain Python object tree that json.dumps() can serialize, as long as all leaf values are JSON-serializable types.
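A short sketch of that recursion, reusing the Tag and Article shapes from earlier in a trimmed form. It also shows that asdict() returns a copy, so editing the resulting dict does not touch the original instance.

```python
import dataclasses
from dataclasses import dataclass
from typing import List


@dataclass
class Tag:
    name: str
    color: str


@dataclass
class Article:
    title: str
    tags: List[Tag]


article = Article("Python Tips", [Tag("python", "blue")])
d = dataclasses.asdict(article)
print(d)  # {'title': 'Python Tips', 'tags': [{'name': 'python', 'color': 'blue'}]}

# The result is a copy: changing it leaves the dataclass instance untouched.
d["tags"][0]["color"] = "red"
print(article.tags[0].color)  # blue
```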
How do I handle datetime fields when serializing a dataclass to JSON?
The simplest approach is json.dumps(dataclasses.asdict(obj), default=str). This calls str() on any non-serializable value, which for datetime objects produces a space-separated string such as "2026-05-19 12:00:00" rather than the "T"-separated ISO 8601 form. For precise control, pass a custom function to default= that calls .isoformat() for datetime and raises TypeError for anything unexpected.
What is dacite and when should I use it?
dacite converts a plain dict (typically from json.loads()) into a typed dataclass instance, handling nested dataclasses, Optional fields, and Union types automatically. Use it when you need structured deserialization without adding a heavy library like Pydantic. Note that dacite does not validate data — it raises type errors on mismatches but does not provide field-level error messages suitable for API clients.
What is the difference between Pydantic BaseModel and Python dataclasses for JSON?
Pydantic BaseModel has built-in JSON support: serialize with model_dump_json(), deserialize with model_validate_json(), and generate JSON Schema with model_json_schema(). It also handles datetime, UUID, and Enum without custom encoders, and validates values at construction time with rich error messages. Plain dataclasses require asdict(), json.dumps(), dacite, and custom encoders for the same functionality. Choose Pydantic for APIs and data pipelines; choose plain dataclasses for lightweight internal data structures where Pydantic's dependency weight is not justified.
What does @dataclass(frozen=True) do?
frozen=True makes the dataclass immutable — all fields are set at construction and cannot be changed afterward (any attempt raises FrozenInstanceError). It also auto-generates __hash__, so frozen dataclass instances can be used as dict keys or in sets. Serialization with asdict() works identically on frozen dataclasses. To "update" a frozen dataclass, use dataclasses.replace(obj, field=new_value) which returns a new instance.
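The behaviours described above can be checked with a few lines of standard library code. The Point class is a hypothetical example.

```python
import dataclasses
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int


p = Point(1, 2)

try:
    p.x = 10  # assignment is blocked on frozen instances
except FrozenInstanceError:
    print("cannot assign to a frozen instance")

# Hashable, so usable as a dict key or set member.
labels = {p: "start"}
print(labels[Point(1, 2)])  # start

# replace() builds a new instance; the original is unchanged.
moved = dataclasses.replace(p, x=10)
print(moved)                      # Point(x=10, y=2)
print(dataclasses.asdict(moved))  # {'x': 10, 'y': 2}
```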
How do I use marshmallow-dataclass for JSON serialization with validation?
Import dataclass from marshmallow_dataclass instead of from the standard dataclasses module, and decorate your class with it. This attaches a .Schema attribute to the class. Use MyClass.Schema().load(data_dict) to deserialize and validate, or MyClass.Schema().dumps(obj) to serialize. It handles datetime, UUID, and Enum natively, and raises marshmallow.ValidationError with per-field error dicts on invalid input, which are suitable for returning to API clients.