Working with JSON in Python
Python's standard library has had excellent JSON support since 2.6. The json module covers the basics with zero dependencies, and the ecosystem around it — dataclasses, Pydantic, orjson — covers the advanced cases. This guide walks through what you need to know.
The json module basics
import json
text = '{"id": 42, "name": "Ada"}'
data = json.loads(text) # dict
data["name"] # "Ada"
back = json.dumps(data) # compact string
pretty = json.dumps(data, indent=2)
# File helpers
with open("user.json") as f:
    data = json.load(f)
with open("out.json", "w") as f:
    json.dump(data, f, indent=2)
Four functions cover string and file I/O: loads/load for reading, dumps/dump for writing.
Type mapping
- JSON object ↔ Python dict
- JSON array ↔ Python list
- JSON string ↔ Python str
- JSON number ↔ Python int or float
- JSON true/false ↔ Python True/False
- JSON null ↔ Python None
Note that Python tuples serialize as JSON arrays but deserialize back as lists — the round trip isn't symmetric.
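The asymmetry is easy to see in a short round trip. This is a minimal sketch; the sample dictionary is made up for illustration.

```python
import json

# A tuple goes in, but JSON only has arrays, so a list comes back out.
original = {"point": (1, 2), "ratio": 0.5, "count": 3, "missing": None}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)                          # {"point": [1, 2], "ratio": 0.5, "count": 3, "missing": null}
print(type(decoded["point"]).__name__)  # list
print(decoded == original)              # False, because [1, 2] != (1, 2)
```

If code downstream depends on tuples (for example as dict keys or set members), convert the lists back explicitly after loading.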
Common pitfalls
Non-serializable types
Datetimes, Decimal, set, bytes, and custom classes all raise TypeError: Object of type X is not JSON serializable. Two ways to fix it:
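# Option 1: custom default
import json
import datetime
def default(o):
    # date and datetime both provide isoformat()
    if isinstance(o, (datetime.date, datetime.datetime)):
        return o.isoformat()
    # Mirror the stdlib message so unsupported types are easy to identify
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
json.dumps({"now": datetime.datetime.now()}, default=default)
# Option 2: subclass JSONEncoder
class MyEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, datetime.datetime):
            return o.isoformat()
        # Defer to the base class, which raises TypeError for unknown types
        return super().default(o)
json.dumps({"now": datetime.datetime.now()}, cls=MyEncoder)
NaN and Infinity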
Python's json.dumps emits NaN, Infinity, and -Infinity as literal tokens by default — which is valid JavaScript but invalid JSON. Strict consumers will reject it. Pass allow_nan=False to raise an error instead, then decide how your app should handle those values.
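The sketch below shows both behaviors, followed by one possible way to handle the values: replacing them with null before serializing. The helper replace_non_finite is hypothetical, not part of the json module, and mapping to None is only one policy; an app might prefer to raise or to use a sentinel string instead.

```python
import json
import math

values = {"score": float("nan"), "limit": float("inf")}

# Default behavior: non-standard tokens appear in the output.
print(json.dumps(values))  # {"score": NaN, "limit": Infinity}

# Strict behavior: allow_nan=False raises ValueError instead.
try:
    json.dumps(values, allow_nan=False)
except ValueError as exc:
    print(f"rejected: {exc}")


def replace_non_finite(obj):
    """Hypothetical helper: recursively swap NaN/Infinity floats for None."""
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {key: replace_non_finite(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_non_finite(value) for value in obj]
    return obj


strict = json.dumps(replace_non_finite(values), allow_nan=False)
print(strict)  # {"score": null, "limit": null}
```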
Unicode
json.dumps escapes non-ASCII characters by default. If you want the characters written as-is (smaller and more readable), pass ensure_ascii=False. Make sure your file is opened with encoding="utf-8" when you do.
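A short comparison of the two modes, as a sketch. The sample data and the temporary file path are made up for illustration.

```python
import json
import os
import tempfile

data = {"name": "Zoë", "city": "東京"}

escaped = json.dumps(data)                  # non-ASCII becomes \uXXXX escapes
raw = json.dumps(data, ensure_ascii=False)  # characters are kept as-is

print(escaped)
print(raw)

# When writing unescaped output, set the file encoding explicitly so the
# result does not depend on the platform's default encoding.
path = os.path.join(tempfile.mkdtemp(), "user.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == data)  # True
```

Both forms decode to the same Python objects, so the choice only affects file size and readability.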
Dataclasses and serialization
from dataclasses import dataclass, asdict
import json
@dataclass
class User:
    id: int
    name: str
u = User(42, "Ada")
json.dumps(asdict(u)) # '{"id": 42, "name": "Ada"}'
asdict recursively converts a dataclass (and any nested dataclasses) into a dict you can pass to json.dumps. This is the cleanest zero-dep approach for typed JSON in Python.
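The recursion matters once dataclasses nest. The sketch below uses two made-up classes, Address and Account, to show it. Note that the standard library has no inverse of asdict, so the from_dict classmethod here is a hypothetical hand-written helper, not a dataclasses feature. It requires Python 3.9+ for the list[str] annotation.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Account:
    id: int
    name: str
    address: Address
    tags: list[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: dict) -> "Account":
        # Hypothetical helper: rebuild nested dataclasses by hand.
        return cls(
            id=raw["id"],
            name=raw["name"],
            address=Address(**raw["address"]),
            tags=list(raw.get("tags", [])),
        )


account = Account(42, "Ada", Address("London", "UK"), ["admin"])

# asdict converts the nested Address into a plain dict as well.
payload = json.dumps(asdict(account))
print(payload)
# {"id": 42, "name": "Ada", "address": {"city": "London", "country": "UK"}, "tags": ["admin"]}

restored = Account.from_dict(json.loads(payload))
print(restored == account)  # True
```

from_dict does no type checking: a string passed as "id" would be stored unchanged.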
Pydantic for runtime validation
When you need to validate JSON coming from an API or user input, Pydantic is the standard choice:
from pydantic import BaseModel
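class User(BaseModel):
    id: int
    name: str
    email: str
# Sample input; model_validate_json and model_dump_json are Pydantic v2 methods
text = '{"id": 42, "name": "Ada", "email": "ada@example.com"}'
user = User.model_validate_json(text) # raises ValidationError on bad input
user.model_dump_json(indent=2) # serialize back
Performance: orjson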
The stdlib json module is fast enough for most cases, but if you're processing millions of documents, orjson is typically 3–10x faster and handles datetimes and dataclasses out of the box. One difference to plan for: orjson.dumps returns bytes rather than str.
Debugging broken JSON
If json.loads fails, the exception message includes a line and column. For more interactive debugging, paste the input into the JSON Validator — it will surface the same error with highlighted context, and the formatter will pretty-print valid JSON so you can see the structure at a glance.
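The same position information is available programmatically: json.loads raises json.JSONDecodeError, which carries msg, lineno, colno, and pos attributes. The sketch below uses them to print the offending line with a caret under the error position. The function describe_json_error is a hypothetical helper written for this example, and the broken input (a missing comma after the first member) is made up.

```python
import json
from typing import Optional


def describe_json_error(text: str) -> Optional[str]:
    """Hypothetical helper: return a readable report, or None if text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        lines = text.splitlines()
        # lineno and colno are 1-based.
        bad_line = lines[exc.lineno - 1] if exc.lineno <= len(lines) else ""
        pointer = " " * (exc.colno - 1) + "^"
        header = f"{exc.msg} (line {exc.lineno}, column {exc.colno})"
        return f"{header}\n{bad_line}\n{pointer}"
    return None


# Missing comma after the first member.
broken = '{\n  "id": 42\n  "name": "Ada"\n}'

report = describe_json_error(broken)
print(report)
```

The exact wording of msg can differ between Python versions for some errors, so it is safer to rely on lineno and colno than to match on the message text.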