PEP 574: Pickle Protocol 5 and Out-of-Band Data
This is Part 1 of a two-part series on Python zero-copy serialization. Part 2 covers the practical choice between Protocol 5 and shared_memory.
Pickle was designed in 1995 for disk persistence. Today it’s the backbone of Python’s multiprocessing IPC, Dask task graphs, and Ray’s object store. The problem: serializing a large NumPy array triggers three redundant memory copies. PEP 574, shipped in Python 3.8, fixes this with a mechanism called out-of-band data.
The Copy Problem
Serializing a bytearray with the old protocol:
bytearray.__reduce_ex__creates abytescopy of the data.pickle.dumpscopies thatbytesobject into the pickle stream.- On deserialization, a temporary
bytesobject is created again before the final object is reconstructed.
Three copies for one array. Third-party libraries (Dask, PyArrow) worked around this with custom serialization, but that only helped objects that supported it — plain Python objects were stuck with Pickle.
The Design: Metadata + Data Separation
PEP 574’s core idea: split the pickle stream into two channels.
- Metadata stream: object type, structure, references — the existing pickle format, unchanged.
- Data buffers: large binary payloads sent as out-of-band buffers, passed as raw memory views with no copy.
The producer signals which parts of an object are “large data” by wrapping them in PickleBuffer. The consumer decides whether to handle them out-of-band (zero-copy) or inline.
New APIs
| API | Role |
|---|---|
pickle.PickleBuffer | Wraps a buffer for out-of-band handling in __reduce_ex__ |
buffer_callback (pickling) | Called with each out-of-band buffer; return False to transmit out-of-band |
buffers (unpickling) | Iterable of pre-allocated buffers to inject on deserialization |
| Protocol 5 | New protocol version required for all of the above |
Existing code that doesn’t use these APIs is completely unaffected — Protocol 5 is backwards compatible for all previously serializable objects.
Producer API: PickleBuffer
A class implements out-of-band support by returning PickleBuffer from __reduce_ex__:
import pickle
class LargeArray:
def __reduce_ex__(self, protocol):
if protocol >= 5:
return type(self), (pickle.PickleBuffer(self),)
# fall back to copying for older protocols
return type(self), (bytes(self),)
PickleBuffer wraps anything that implements the PEP 3118 buffer protocol — NumPy arrays, bytearray, memoryview. It exposes:
memoryview(pb)— get shape, strides, formatpb.raw()— raw contiguous view (needed for Fortran-order arrays)pb.release()— explicitly release the underlying buffer
The buffer must be contiguous (C or Fortran order). Non-contiguous buffers raise an error at pickle time.
Consumer API: Pickling
buffers = []
data = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
When buffer_callback is provided, it’s called for each PickleBuffer the object returns. The callback’s return value controls routing:
- Return
False(orNone) → buffer goes out-of-band, stored inbuffers, aNEXT_BUFFERopcode is written into the pickle stream - Return
True→ buffer is serialized inline (same as old behavior)
This lets the caller decide per-buffer: transmit large arrays out-of-band, but serialize small metadata inline.
Consumer API: Unpickling
obj = pickle.loads(data, buffers=buffers)
buffers is an iterable consumed in order. Each NEXT_BUFFER opcode in the stream pops the next buffer from this iterable. The consumer can pre-allocate these buffers in shared memory, GPU memory, or any target — the deserializer writes into them directly with no intermediate copy.
New Protocol 5 Opcodes
| Opcode | Effect |
|---|---|
BYTEARRAY8 | Construct bytearray from inline stream data |
NEXT_BUFFER | Pop the next buffer from the buffers iterable and push it on the stack |
READONLY_BUFFER | Convert the top-of-stack buffer to a read-only view |
READONLY_BUFFER is important for objects like bytes that are immutable — the consumer can map a read-only memory region as the deserialized value.
Zero-Copy Shared Memory Example
With buffer_callback and buffers, two objects can point at the same physical memory:
import numpy as np
import pickle
a = np.zeros(10)
buffers = []
# serialize: buffer_callback captures the array's memory view, no copy
data = pickle.dumps(a, protocol=5, buffer_callback=buffers.append)
# deserialize: inject the same buffers back — no copy
b = pickle.loads(data, buffers=buffers)
b[0] = 42
print(a[0]) # 42.0 — a and b share the same memory
In a real multiprocessing scenario, the consumer would allocate shared memory and pass memoryview objects as buffers, giving the deserialized array a zero-copy view into shared memory.
What Was Rejected
Persistent load interface: persistent_id() / persistent_load() could theoretically carry buffers, but every object — even int — would trigger a persistent_id() call. Too slow.
Batch buffer passing: passing a list of buffers instead of a callback prevents the callback from signaling inline vs. out-of-band per buffer. Rejected because the flexibility matters.
PickleBuffer in old protocols: allowing PickleBuffer in Protocol 4 would require two extra copies to embed the buffer inline, defeating the purpose. Protocol 5 is required.
Ecosystem Status
| Project | Status |
|---|---|
| CPython 3.8+ | Merged |
pickle5 (PyPI) | Backport for Python 3.6/3.7 |
| NumPy | Protocol 5 + out-of-band buffer support added |
| Apache Arrow | Support in Python bindings |
Python 3.14 will make Protocol 5 the default. Until then, you need protocol=5 explicitly.
Summary
- Old Pickle copies large arrays 3 times during serialization. Protocol 5 reduces this to zero copies on the data path.
PickleBufferis the producer-side signal;buffer_callbackandbuffersare the consumer-side hooks.- The caller controls routing: each buffer can independently go out-of-band or inline.
- The deserialization target is caller-supplied — shared memory, GPU memory, or any buffer-protocol object.
- Requires
protocol=5explicitly until Python 3.14.