How to save an object in Python
Learn how to save an object in Python. Explore different methods, tips, real-world applications, and how to debug common errors.
You can save a Python object to preserve its state between sessions. This process, called serialization, is essential for data persistence, caching, and transferring complex data structures.
In this article, we'll cover several techniques for object serialization. We'll provide practical tips, review real-world applications, and offer debugging advice to help you select the right approach for your project.
Using pickle for basic object serialization
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person = Person("Alice", 30)

with open("person.pkl", "wb") as file:
    pickle.dump(person, file)

with open("person.pkl", "rb") as file:
    loaded_person = pickle.load(file)

print(f"Name: {loaded_person.name}, Age: {loaded_person.age}")

--OUTPUT--

Name: Alice, Age: 30
The pickle module is Python’s native solution for object serialization. The code demonstrates this by saving a Person object to a file. The pickle.dump() function serializes the object, writing it to person.pkl in binary format, which is why the file is opened in "wb" (write binary) mode.
To retrieve the object, pickle.load() reads the binary data from the file—opened in "rb" (read binary) mode—and reconstructs the original Person object in memory. This process, known as deserialization, restores the object's state, including its attributes like name and age.
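When you need the serialized bytes in memory rather than on disk, for instance to send an object over a network socket or store it in a database column, pickle.dumps() and pickle.loads() skip the file entirely. A minimal sketch reusing the same Person class:

```python
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person = Person("Alice", 30)

# Serialize to a bytes object instead of a file
data = pickle.dumps(person)

# Deserialize straight from the bytes
restored = pickle.loads(data)
print(f"Name: {restored.name}, Age: {restored.age}")
```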
Alternative serialization methods
While pickle is a powerful tool, other libraries like json, shelve, and dill offer unique strengths for different serialization needs.
Serializing objects with json
import json

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

user = User("Bob", "bob@example.com")
user_dict = user.__dict__

with open("user.json", "w") as file:
    json.dump(user_dict, file)

with open("user.json", "r") as file:
    loaded_user_dict = json.load(file)

print(loaded_user_dict)

--OUTPUT--

{'name': 'Bob', 'email': 'bob@example.com'}
The json module offers a human-readable, text-based format, making it ideal for web APIs and configuration files. Unlike pickle, it can't serialize custom Python objects directly.
To work around this, the code first converts the object's attributes into a dictionary.
- The user.__dict__ attribute provides a dictionary representation of the User object.
- json.dump() then serializes this dictionary into a text file.
- Upon deserialization with json.load(), you get a dictionary back, not the original User object instance.
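If the class's __init__ parameters match the dictionary keys, as User's do here, you can rebuild an instance by unpacking the loaded dictionary back into the constructor. A small sketch of the full round trip:

```python
import json

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

user = User("Bob", "bob@example.com")

# Serialize the attribute dictionary to a JSON string
payload = json.dumps(user.__dict__)

# Deserialize, then unpack the keys back into the constructor
restored = User(**json.loads(payload))
print(restored.name, restored.email)
```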
Using persistent dictionary-like objects with shelve
import shelve

with shelve.open("mydata") as db:
    db["users"] = ["Alice", "Bob", "Charlie"]
    db["settings"] = {"theme": "dark", "notifications": True}

with shelve.open("mydata") as db:
    users = db["users"]
    settings = db["settings"]

print(f"Users: {users}")
print(f"Settings: {settings}")

--OUTPUT--

Users: ['Alice', 'Bob', 'Charlie']
Settings: {'theme': 'dark', 'notifications': True}
The shelve module offers a simple way to persist Python objects using a dictionary-like interface. It essentially gives you a dictionary that's saved to a file, handling the serialization for you automatically.
- You use shelve.open() to create or open a persistent storage file.
- Data is stored and retrieved using standard dictionary syntax, like db["users"].
- This approach is great for managing multiple objects under different keys within a single file.
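One subtlety worth knowing: by default, shelve only writes an entry when you assign to a key, so mutating a nested object you fetched from the shelf is silently lost. Opening the shelf with writeback=True fixes this, at the cost of extra memory. A sketch (the file name mydata_wb is illustrative):

```python
import shelve

with shelve.open("mydata_wb") as db:
    db["settings"] = {"theme": "dark"}
    db["settings"]["theme"] = "light"  # mutates a throwaway copy; not written back

with shelve.open("mydata_wb") as db:
    theme_without_writeback = db["settings"]["theme"]  # still "dark"

# writeback=True keeps fetched entries in a cache and flushes changes on close
with shelve.open("mydata_wb", writeback=True) as db:
    db["settings"]["theme"] = "light"

with shelve.open("mydata_wb") as db:
    theme_with_writeback = db["settings"]["theme"]  # now "light"

print(theme_without_writeback, theme_with_writeback)
```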
Enhanced serialization with dill
import dill

def greet(name):
    return f"Hello, {name}!"

with open("function.dill", "wb") as file:
    dill.dump(greet, file)

with open("function.dill", "rb") as file:
    loaded_function = dill.load(file)

result = loaded_function("World")
print(result)

--OUTPUT--

Hello, World!
The dill library extends pickle’s capabilities, allowing you to serialize a wider range of Python objects. It’s particularly useful for types that pickle can’t handle on its own, like functions, lambdas, and generators.
- The code demonstrates this by using dill.dump() to save the entire greet function to a file.
- After being loaded with dill.load(), the function is fully restored and can be executed as if it were never serialized.
Advanced serialization techniques
When the standard tools aren't enough, you can customize serialization with methods like __getstate__, optimize performance, and handle complex objects involving inheritance.
Implementing custom serialization with __getstate__ and __setstate__
import pickle

class Database:
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.connect()

    def connect(self):
        # Simulating connection setup
        self.connection = f"Connected to {self.connection_string}"

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["connection"]  # Don't pickle the connection
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.connect()  # Reconnect when unpickling

db = Database("postgresql://localhost:5432")

with open("db.pkl", "wb") as file:
    pickle.dump(db, file)

with open("db.pkl", "rb") as file:
    loaded_db = pickle.load(file)

print(loaded_db.connection)

--OUTPUT--

Connected to postgresql://localhost:5432
Sometimes you can't serialize an entire object, especially if it contains transient attributes like a live database connection. The __getstate__ and __setstate__ methods give you fine-grained control over this process, letting you define exactly how an object is saved and restored.
- The __getstate__ method is called during serialization. It returns a dictionary of the object's state to be pickled; in this example, it excludes the live connection attribute.
- The __setstate__ method is called during deserialization. It takes the pickled state to restore the object and then re-establishes the database connection by calling connect().
Optimizing serialization with protocol selection and compression
import pickle
import gzip
import time

data = [i for i in range(1000000)]

start = time.time()
with gzip.open("compressed_data.pkl.gz", "wb") as f:
    pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
compressed_time = time.time() - start

start = time.time()
with open("regular_data.pkl", "wb") as f:
    pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
regular_time = time.time() - start

print(f"Compressed pickling: {compressed_time:.4f}s")
print(f"Regular pickling: {regular_time:.4f}s")

--OUTPUT--

Compressed pickling: 0.2345s
Regular pickling: 0.1234s
You can boost serialization performance by choosing the right protocol and applying compression. The code uses pickle.HIGHEST_PROTOCOL for both operations, which creates a more efficient binary representation than older protocols. This generally results in smaller and faster file operations.
- Combining pickle with gzip further reduces the final file size, which is great for storage or network transfer.
- However, this compression adds overhead. As the timing shows, serializing with gzip is slower than writing an uncompressed file. You're trading processing speed for a smaller file.
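Reading the compressed file back is symmetrical: gzip.open decompresses transparently, so pickle.load() never knows the data was compressed. A self-contained sketch with a smaller list:

```python
import gzip
import pickle

data = list(range(1000))

# Write a compressed pickle
with gzip.open("compressed_data.pkl.gz", "wb") as f:
    pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)

# Read it back; gzip decompresses before pickle sees the bytes
with gzip.open("compressed_data.pkl.gz", "rb") as f:
    restored = pickle.load(f)

print(restored == data)
```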
Serializing complex objects with inheritance
import pickle

class Animal:
    def __init__(self, species):
        self.species = species

class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__("Canine")
        self.name = name
        self.breed = breed

    def bark(self):
        return f"{self.name} says woof!"

dog = Dog("Rex", "German Shepherd")

with open("dog.pkl", "wb") as file:
    pickle.dump(dog, file)

with open("dog.pkl", "rb") as file:
    loaded_dog = pickle.load(file)

print(f"Species: {loaded_dog.species}, Name: {loaded_dog.name}")
print(loaded_dog.bark())

--OUTPUT--

Species: Canine, Name: Rex
Rex says woof!
Serializing objects that use inheritance is straightforward with pickle. The code demonstrates this with a Dog class that inherits from an Animal class. When you serialize an instance of Dog, pickle automatically preserves the entire object hierarchy.
- It saves attributes from both the child class, like name, and the parent class, like species.
- After deserialization, pickle reconstructs the complete object, so you can access all its original attributes and methods, such as bark().
Move faster with Replit
Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. This lets you focus on building, not on configuration.
Instead of piecing together techniques like serialization, you can use Agent 4 to build complete applications from a simple description. It handles writing the code, connecting to databases, and managing APIs. Here are a few examples of what you could build:
- A user settings manager that saves application preferences, like a chosen theme or notification settings, to a file so they persist between sessions.
- A data caching utility that serializes the results of expensive API calls, speeding up future requests by loading the saved data instead of re-fetching it.
- A session saver for a web application that pickles user state but excludes transient data like temporary network sockets, using custom serialization logic.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
While serialization is powerful, you might encounter issues like unpickleable attributes, version conflicts, or even security risks.
- Handling unpickleable attributes with __getstate__ and __setstate__: You can't pickle everything. Objects with live states like database connections or open files will cause an error. To work around this, you can use the __getstate__ and __setstate__ methods to exclude those attributes during serialization and reinitialize them upon loading.
- Dealing with pickle protocol version incompatibilities: Python's pickle protocol evolves. An object saved with a newer version of Python might not load in an older one, causing an error. For better compatibility, you can specify an older protocol in pickle.dump(), but this often comes at the cost of efficiency.
- Avoiding security vulnerabilities with untrusted pickle data: This is critical. Never unpickle data from a source you don't trust. A specially crafted pickle file can execute arbitrary code during deserialization, opening a major security hole in your application. When handling external data, always prefer safer formats like JSON.
Handling unpickleable attributes with __getstate__ and __setstate__
Not all attributes can be serialized. Objects managing live resources, like file handles or network connections, can't be pickled because their state is temporary. Attempting to serialize an object with such an attribute will raise a TypeError, as the code below demonstrates.
import pickle

class Connection:
    def __init__(self, url):
        self.url = url
        self.connect()

    def connect(self):
        # Simulate creating a connection
        self.handle = open("temp.txt", "w")

conn = Connection("database://localhost")

with open("connection.pkl", "wb") as file:
    pickle.dump(conn, file)  # Raises TypeError: cannot pickle '_io.TextIOWrapper' object
The pickle.dump() function fails because the conn object's handle attribute is an open file, which can't be serialized. The code below shows how to customize the serialization process to get around this.
import pickle

class Connection:
    def __init__(self, url):
        self.url = url
        self.connect()

    def connect(self):
        # Simulate creating a connection
        self.handle = open("temp.txt", "w")

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["handle"]  # Remove unpickleable attribute
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.connect()  # Recreate the connection

conn = Connection("database://localhost")

with open("connection.pkl", "wb") as file:
    pickle.dump(conn, file)
To solve this, you can implement custom serialization logic. The __getstate__ method excludes the unpickleable handle attribute from the object's state before it's saved. When the object is loaded, the __setstate__ method restores its state and then calls connect() to re-establish the file handle. This is a common pattern when working with objects that manage live resources like database connections or open files, which can't be persisted directly.
Dealing with pickle protocol version incompatibilities
Python's pickle protocol isn't static; it evolves with new releases. This can cause problems when you try to load an object saved with a newer Python version into an older one. The code below shows how this incompatibility can arise.
import pickle

data = {"users": ["Alice", "Bob"], "settings": {"theme": "dark"}}

with open("data.pkl", "wb") as file:
    pickle.dump(data, file, protocol=pickle.HIGHEST_PROTOCOL)

# This could fail if loaded with an older Python version
# that doesn't support the latest protocol
Using pickle.HIGHEST_PROTOCOL ties the file to your current Python version. An older Python environment that doesn't recognize this protocol will raise an error when trying to load the data. The code below shows how to ensure compatibility.
import pickle

data = {"users": ["Alice", "Bob"], "settings": {"theme": "dark"}}

with open("data.pkl", "wb") as file:
    pickle.dump(data, file, protocol=2)  # Protocol 2 is widely compatible

# Now the file can be safely loaded in Python 2.3+ and all Python 3 versions
with open("data.pkl", "rb") as file:
    loaded_data = pickle.load(file)
To ensure your pickled files are portable, you can specify an older protocol version. By passing protocol=2 to the pickle.dump() function, you create a file that’s readable across a wide range of Python versions. This is crucial when sharing data between different environments or archiving it for long-term use. While this might result in a slightly less efficient file, it guarantees broader compatibility and prevents frustrating version-related errors when loading the data later.
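You can also verify which protocol a file was written with by inspecting its first bytes: for protocol 2 and above, every pickle starts with the PROTO opcode (byte 0x80) followed by a single byte giving the protocol number. A quick check (the file name data_p2.pkl is illustrative):

```python
import pickle

data = {"users": ["Alice", "Bob"]}

with open("data_p2.pkl", "wb") as f:
    pickle.dump(data, f, protocol=2)

# Protocol 2+ pickles begin with the PROTO opcode (0x80)
# followed by the protocol number
with open("data_p2.pkl", "rb") as f:
    header = f.read(2)

print(hex(header[0]), header[1])
```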
Avoiding security vulnerabilities with untrusted pickle data
Deserializing data with pickle is inherently risky because it can execute arbitrary code. A malicious pickle file can compromise your system when loaded. For this reason, you should never unpickle data from a source you don't fully trust. The following code shows this dangerous operation in action.
import pickle

# Dangerous! Never unpickle data from untrusted sources
with open("suspicious_file.pkl", "rb") as file:
    data = pickle.load(file)  # Could execute arbitrary code
    print(data)
The pickle.load() function doesn't just read data. It can execute embedded commands, allowing a malicious file to run harmful code on your system. The code below demonstrates a more secure alternative for handling external data.
import pickle

class SafeUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only allow safe builtins
        if module == "builtins":
            if name in ["list", "dict", "tuple", "set", "bool", "int", "float", "str"]:
                return getattr(__import__(module), name)
        # Deny everything else
        raise pickle.UnpicklingError(f"Forbidden: {module}.{name}")

with open("suspicious_file.pkl", "rb") as file:
    unpickler = SafeUnpickler(file)
    try:
        data = unpickler.load()
        print(data)
    except pickle.UnpicklingError as e:
        print(f"Security issue detected: {e}")
To safely handle external data, you can create a custom SafeUnpickler that inherits from pickle.Unpickler. By overriding the find_class method, you can create a whitelist of safe, built-in types like lists and dictionaries.
This prevents the unpickler from loading potentially malicious objects. If it encounters a forbidden type, it raises an UnpicklingError instead of executing harmful code. This approach is crucial when you must handle data from untrusted sources.
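As a complementary audit step, the standard library's pickletools module can disassemble a pickle stream into its opcodes without executing any of it. Malicious pickles must use the GLOBAL or STACK_GLOBAL opcodes to import code, so their absence is a useful (though not sufficient) signal. A sketch that inspects a locally created sample standing in for an untrusted payload:

```python
import io
import pickle
import pickletools

# A sample pickle standing in for an untrusted payload
payload = pickle.dumps({"user": "alice", "admin": False})

# Disassemble the opcode stream without running it
buffer = io.StringIO()
pickletools.dis(payload, out=buffer)
listing = buffer.getvalue()

# Plain data produces no GLOBAL/STACK_GLOBAL opcodes
print("GLOBAL" in listing)
```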
Real-world applications
Beyond the technical details, object serialization is a powerful tool for solving common challenges like caching data and deploying machine learning models.
Caching API responses with pickle
By serializing API responses with pickle, you can create a simple cache that avoids unnecessary network calls and makes your application faster.
import pickle
import os

# Use cached data if available, otherwise simulate API data
if os.path.exists("weather_cache.pkl"):
    with open("weather_cache.pkl", "rb") as f:
        weather = pickle.load(f)
else:
    weather = {"temp": 22, "condition": "Sunny"}  # Simulated API response
    with open("weather_cache.pkl", "wb") as f:
        pickle.dump(weather, f)

print(f"Weather: {weather['temp']}°C, {weather['condition']}")
This script demonstrates how to conditionally save and load data. It uses os.path.exists() to determine if the file weather_cache.pkl is already on disk.
- If the file exists, the script reads and deserializes its contents with pickle.load().
- If it doesn't exist, the script creates a new data object and writes it to the file using pickle.dump().
This ensures that the data is created only on the first run and then reused in all subsequent executions.
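One limitation of this pattern is that the cache never expires. A common extension (the file name and time limit below are illustrative) checks the file's modification time and treats old entries as stale:

```python
import os
import pickle
import time

CACHE_FILE = "weather_cache_ttl.pkl"   # hypothetical cache path
MAX_AGE_SECONDS = 600                  # treat entries older than 10 minutes as stale

def load_cached_weather():
    """Return cached data if the file exists and is fresh, else None."""
    if not os.path.exists(CACHE_FILE):
        return None
    age = time.time() - os.path.getmtime(CACHE_FILE)
    if age > MAX_AGE_SECONDS:
        return None
    with open(CACHE_FILE, "rb") as f:
        return pickle.load(f)

weather = load_cached_weather()
if weather is None:
    weather = {"temp": 22, "condition": "Sunny"}  # simulated API response
    with open(CACHE_FILE, "wb") as f:
        pickle.dump(weather, f)

print(weather["condition"])
```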
Saving trained models for later prediction
In machine learning, serialization allows you to save a trained model's state, so you can load it later to make predictions without repeating the training process.
import pickle
import numpy as np

# Create and train a simple model (just random weights)
model = {"weights": np.random.rand(3), "bias": 0.5}

# Save the model
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Load and use the model
with open("model.pkl", "rb") as f:
    loaded_model = pickle.load(f)

prediction = np.dot(loaded_model["weights"], [1.2, 0.8, 2.4]) + loaded_model["bias"]
print(f"Prediction: {prediction:.2f}")
This code demonstrates how to save a model's state using pickle. A simple model, represented as a dictionary with weights and a bias, is serialized and written to a file named model.pkl using pickle.dump().
- The model is then deserialized from the file with pickle.load(), restoring it completely.
- Finally, the loaded model is used to perform a calculation, simulating a prediction with new data.
This allows you to persist complex objects like NumPy arrays and use them in later sessions.
Get started with Replit
Now, turn this knowledge into a real tool. Give Replit Agent a prompt like: "Build a weather app that caches API data" or "Create a tool to save user settings to a file."
Replit Agent writes the code, tests for errors, and deploys your app. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.