
Python Pickle Example: A Guide to Serialization & Deserialization

A practical Python pickle guide: dump/load, custom classes, protocols (including Protocol 5), compression, security best practices, and alternatives.

Drake Nguyen

Founder · System Architect


Introduction

This guide explains how to serialize and deserialize Python objects using the pickle module. Pickle converts Python objects to a byte stream so you can save them to disk, send them between processes, or cache results. It supports complex types and custom classes, which makes it useful for model persistence and caching.

Security note: never unpickle data from untrusted sources because pickle can execute arbitrary code during deserialization.

Key takeaways

  • Pickle is powerful for Python-only serialization and model persistence.
  • Use pickle.dump and pickle.load (or dumps/loads) to write/read binary pickles.
  • Prefer highest protocol (Protocol 5 on Python 3.8+) for performance, but consider compatibility.
  • For untrusted data choose safe alternatives (JSON, MessagePack, Protocol Buffers) and apply integrity checks.

Basic python pickle example — dump and load

Save and restore a simple Python object with pickle.dump and pickle.load. This dump-and-load pattern is the most common way to use pickle.

import pickle

# Write (dump) to file
items = [1, 'two', {'three': 3}]
with open('data.pkl', 'wb') as f:
    pickle.dump(items, f, protocol=pickle.HIGHEST_PROTOCOL)

# Read (load) from file
with open('data.pkl', 'rb') as f:
    loaded = pickle.load(f)
print(loaded)
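When no file is involved (for example, when caching in memory or sending bytes over a socket), the dumps/loads pair does the same job on a bytes object. A minimal sketch, reusing the sample list from the example above:

```python
import pickle

items = [1, 'two', {'three': 3}]

# dumps returns the pickle as a bytes object instead of writing to a file
payload = pickle.dumps(items, protocol=pickle.HIGHEST_PROTOCOL)

# loads rebuilds an equal, but separate, object from those bytes
copy = pickle.loads(payload)
print(type(payload).__name__, copy == items, copy is items)
```

Because the result is a new object, a dumps/loads round trip is sometimes used as a deep copy, although copy.deepcopy is usually the clearer choice.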

Pickle with a custom class

To pickle custom class objects, make sure the class is importable from the same module path when unpickling.

from dataclasses import dataclass
import pickle

@dataclass
class User:
    name: str
    age: int

users = [User('Alice', 30), User('Bob', 25)]
with open('users.pkl', 'wb') as f:
    pickle.dump(users, f, protocol=pickle.HIGHEST_PROTOCOL)

# Later
with open('users.pkl', 'rb') as f:
    restored = pickle.load(f)
print(restored)
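The importability requirement follows from how pickle stores instances: it records the class's module and name, not the class's code. The sketch below illustrates this with the standard library's fractions.Fraction standing in for a custom class (an assumption made only so the example is self-contained):

```python
import pickle
from fractions import Fraction

payload = pickle.dumps(Fraction(3, 4))

# The payload names the class by module and name; the class body is not stored.
has_module = b"fractions" in payload
has_name = b"Fraction" in payload
print(has_module, has_name)

# Unpickling imports the module and looks the class up again by name.
restored = pickle.loads(payload)
print(restored)
```

If the module or class name changes between writing and reading, that lookup fails, which is why renaming or moving a class breaks old pickles.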

Pickle protocols and protocol 5 example

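Pickle protocols determine the serialization format. Protocol 5 (Python 3.8+) adds support for out-of-band data buffers and speeds up in-band handling of large binary data. Use pickle.HIGHEST_PROTOCOL to get the newest protocol your interpreter supports, keeping in mind that older Python versions may not be able to read the result.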

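import pickle

# Any picklable object works here; this one is only a placeholder
obj = {'name': 'example', 'values': list(range(10))}

# Explicit protocol 5 example (requires Python 3.8+)
with open('obj_p5.pkl', 'wb') as f:
    pickle.dump(obj, f, protocol=5)

# Or use HIGHEST_PROTOCOL for the newest protocol this interpreter supports
with open('obj_best.pkl', 'wb') as f:
    pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)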
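The out-of-band buffers that protocol 5 introduces can be tried directly with pickle.PickleBuffer and the buffer_callback / buffers arguments. This is a minimal sketch on a plain bytearray (the sample data is an assumption); real use cases are typically large arrays passed between processes:

```python
import pickle

data = bytearray(b"x" * 1000)

# In-band: the 1000 bytes are copied into the pickle stream itself.
in_band = pickle.dumps(pickle.PickleBuffer(data), protocol=5)

# Out-of-band: the callback collects the buffer, and the stream only
# records that a buffer belongs at this position.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(data), protocol=5, buffer_callback=buffers.append
)
print(len(in_band), len(out_of_band), len(buffers))

# The same buffers must be supplied, in order, when loading.
restored = pickle.loads(out_of_band, buffers=buffers)
print(bytes(restored) == bytes(data))
```

The out-of-band stream stays small because the payload travels separately, so the transport layer decides how (and whether) to copy it.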

Compression with gzip

Compress pickles to save space. The example below writes and reads a gzip-compressed pickle file.

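import gzip
import pickle

large_obj = list(range(1_000_000))  # stand-in for your own large object

# gzip.open returns a file-like object, so pickle can write to it directly
with gzip.open('data.pkl.gz', 'wb') as f:
    pickle.dump(large_obj, f, protocol=pickle.HIGHEST_PROTOCOL)

with gzip.open('data.pkl.gz', 'rb') as f:
    obj = pickle.load(f)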
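Whether compression pays off depends on the data. A quick way to check is to compare sizes in memory before committing to a file format. The records below are hypothetical and deliberately repetitive, so they compress well; your own data may not:

```python
import gzip
import pickle

# Hypothetical sample data: many similar records
records = [{"id": i, "status": "active"} for i in range(10_000)]

raw = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)
packed = gzip.compress(raw)
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes")

# Decompress first, then unpickle
restored = pickle.loads(gzip.decompress(packed))
```

The saved space is paid for with CPU time on both write and read, so it is worth measuring both on a realistic sample.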

Security best practices

Because pickle can execute code during unpickling, never unpickle untrusted data. Follow these best practices for safe unpickling:

  • Only load pickles from trusted sources and verified storage.
  • Add integrity checks (HMAC or cryptographic hashes) before unpickling.
  • Consider schema validation or converting to JSON for external data.
  • In networked environments use encryption + signing when transporting pickle files.
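One further hardening step, not listed above but described in the Python documentation, is to restrict which globals an unpickler may resolve by overriding Unpickler.find_class. The sketch below assumes the expected data needs only a few harmless builtins; the allow-list and the helper name restricted_loads are illustrative choices:

```python
import builtins
import io
import pickle

# Assumption: the data we expect only needs these builtins.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only allow-listed builtins; reject every other global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads, but with the allow-list above enforced."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))
print(allowed)

try:
    restricted_loads(pickle.dumps(print))  # refers to builtins.print
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print("blocked:", exc)
```

This narrows the attack surface but does not make pickle safe for hostile input, so it complements the practices above rather than replacing them.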

Simple integrity wrapper

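import pickle, hmac, hashlib

def sign_bytes(data_bytes, key):
    # key must be bytes and must be kept secret
    return hmac.new(key, data_bytes, hashlib.sha256).hexdigest()

def verify_and_load(pickled, signature, key):
    # Constant-time comparison; refuse to unpickle on mismatch
    if not hmac.compare_digest(sign_bytes(pickled, key), signature):
        raise ValueError('signature mismatch: refusing to unpickle')
    return pickle.loads(pickled)

# Store: pickled = pickle.dumps(obj); signature = sign_bytes(pickled, key)
# Load:  obj = verify_and_load(pickled, signature, key)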

Alternatives to pickle

When cross-language compatibility or safety is required, use alternatives:

  • JSON: safe and human readable; limited to basic types.
  • MessagePack: compact binary, faster than JSON for many workloads.
  • Protocol Buffers: schema-based, efficient and language-agnostic.
  • Joblib: optimized for large scikit-learn models and NumPy arrays. It is built on pickle, so it is a performance alternative, not a safer one.
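For simple record-like classes, moving to JSON is often a small change. A minimal sketch, assuming the User dataclass from the custom class example and only JSON-friendly field types:

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class User:
    name: str
    age: int

users = [User("Alice", 30), User("Bob", 25)]

# Serialize: dataclass -> dict -> JSON text
text = json.dumps([asdict(u) for u in users])
print(text)

# Deserialize: JSON text -> dict -> dataclass
restored = [User(**item) for item in json.loads(text)]
```

Loading JSON never runs code from the input, but the fields still need validating before use, and types without a JSON equivalent need an explicit conversion.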

Performance and tips

For large objects, use protocol 4 or 5 (protocol 4 added support for very large objects) and weigh the compression trade-off: smaller files in exchange for extra CPU time. Benchmark pickle, json, and msgpack on your own data before choosing. For big ML models, joblib.dump can be faster for large NumPy arrays.
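A benchmark does not need to be elaborate. The sketch below times round trips with the standard library's timeit on a hypothetical sample; msgpack is left out only because it is a third-party package:

```python
import json
import pickle
import timeit

# Hypothetical sample; substitute data that resembles your workload
data = {"values": list(range(1000)), "name": "sample"}

def time_roundtrip(dumps, loads, number=200):
    """Return seconds taken for `number` serialize + deserialize round trips."""
    return timeit.timeit(lambda: loads(dumps(data)), number=number)

results = {
    "pickle": time_roundtrip(
        lambda o: pickle.dumps(o, protocol=pickle.HIGHEST_PROTOCOL), pickle.loads
    ),
    "json": time_roundtrip(json.dumps, json.loads),
}
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f}s")
```

Timings vary by machine, Python version, and data shape, so treat the numbers as relative and rerun them on representative data.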

Troubleshooting common issues

AttributeError during unpickling

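This usually means the class or function recorded in the pickle can no longer be found at its original module path, for example because it was renamed, moved, or defined in a script's __main__ module. Keep the old import path available, or alias the old name to the new class. When only the attributes have changed, use __getstate__/__setstate__ or provide defaults to preserve backward compatibility.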
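The __setstate__ approach can be sketched as follows. The User class and its later-added email field are hypothetical, and the old pickle is simulated by deleting the attribute before dumping:

```python
import pickle

class User:
    def __init__(self, name, age, email="unknown"):
        self.name = name
        self.age = age
        self.email = email  # hypothetical field added in a later version

    def __setstate__(self, state):
        # Older pickles have no 'email' entry, so fill in a default.
        state.setdefault("email", "unknown")
        self.__dict__.update(state)

# Simulate a pickle written before 'email' existed.
old_user = User("Alice", 30)
del old_user.email
payload = pickle.dumps(old_user)

restored = pickle.loads(payload)
print(restored.name, restored.email)
```

Unpickling does not call __init__, so defaults declared there are not applied; __setstate__ is the place to patch up state from older versions.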

Protocol compatibility

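If you see "unsupported pickle protocol" errors, the file was written with a newer protocol than the reading interpreter supports. Re-serialize with a protocol both sides understand: protocol 4 works on Python 3.4+, protocol 3 on any Python 3, and protocol 2 is only needed when Python 2 must read the file.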
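To find out which protocol an existing pickle uses, inspect its first two bytes: pickles written with protocol 2 or later start with the PROTO opcode (0x80) followed by the protocol number. The helper name below is illustrative:

```python
import pickle
from typing import Optional

def pickle_protocol(data: bytes) -> Optional[int]:
    """Return the protocol number from the pickle header, if there is one."""
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return None  # protocol 0 or 1 (no header), or not a pickle at all

payload = pickle.dumps({"a": 1}, protocol=4)
print(pickle_protocol(payload), "supported up to", pickle.HIGHEST_PROTOCOL)
```

For a fuller view of a pickle's contents without executing it, the standard library's pickletools.dis prints the opcode stream.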

Handling large files and memory

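pickle.load rebuilds the whole object in memory, so to avoid memory spikes split large structures into smaller records and write them one at a time, or memory-map large arrays (joblib.load accepts an mmap_mode argument for this). Compression reduces the size on disk, not the memory needed once the object is loaded.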
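Splitting a large structure can be as simple as pickling each record separately into the same file and reading them back one by one. A minimal sketch with hypothetical helper names and a temporary file:

```python
import os
import pickle
import tempfile

def write_records(path, records):
    """Pickle each record separately, one after another, into one file."""
    with open(path, "wb") as f:
        for record in records:
            pickle.dump(record, f, protocol=pickle.HIGHEST_PROTOCOL)

def read_records(path):
    """Yield records one at a time; only one is held in memory at once."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:  # raised when the file has no more pickles
                return

path = os.path.join(tempfile.mkdtemp(), "records.pkl")
write_records(path, ({"id": i} for i in range(3)))
for record in read_records(path):
    print(record)
```

Both sides work with generators here, so neither the writer nor the reader needs the full dataset in memory.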

Frequently asked questions

Is pickle safe?

No — pickle is not safe for untrusted data. For untrusted input use JSON or other secure serialization and add integrity checks.

When should I use pickle vs json?

Use pickle when you need to preserve arbitrary Python objects and performance matters within a trusted Python environment. Use JSON for interoperability and safety.
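The difference in type fidelity is easy to see on a small example. The dictionary below is a made-up sample containing a tuple value and an integer key:

```python
import json
import pickle

config = {"point": (1, 2), 7: "seven"}

via_pickle = pickle.loads(pickle.dumps(config))
via_json = json.loads(json.dumps(config))

print(via_pickle)  # tuple and int key come back unchanged
print(via_json)    # tuple becomes a list, int key becomes a string
```

Types with no JSON equivalent at all, such as sets, make json.dumps raise TypeError unless you supply a conversion.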

How do I securely transport pickle files between servers?

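Encrypt and sign the pickle file (for example with Fernet from the cryptography package, which combines encryption with an HMAC signature) and transfer it over secure channels (SFTP/SSH/HTTPS). Verify the signature before unpickling.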

Conclusion

This Python pickle guide covered dump/load usage, custom class pickling, protocol selection (including Protocol 5 on Python 3.8+), compression, security best practices, and alternatives. Use pickle for Python-specific serialization and model persistence, but treat unpickling from external sources with caution and apply validation, integrity checks, and encryption where appropriate.
