How to decode base64 in Python
Learn to decode Base64 in Python. This guide covers various methods, practical tips, real-world applications, and common error debugging.

Base64 is a common method to transmit binary data across text-only channels. Python’s standard library provides powerful, built-in tools to decode this data back into its original form.
In this article, you'll explore techniques to decode Base64, with practical tips and real-world applications. You'll also get debugging advice to help you confidently resolve common issues that arise.
Basic decoding with base64 module
import base64
encoded_string = "SGVsbG8gV29ybGQh"
decoded_bytes = base64.b64decode(encoded_string)
decoded_string = decoded_bytes.decode('utf-8')
print(decoded_string)
--OUTPUT--
Hello World!
The core of this process is the base64.b64decode() function. It takes the encoded string and returns a bytes object, not a regular string. This is because Base64 is a binary-to-text encoding scheme, so the first step is always to get back to the raw binary data.
Once you have the bytes, you need to convert them into a human-readable string. That's where .decode('utf-8') comes in. It interprets the sequence of bytes using the UTF-8 character set, revealing the original message.
Common techniques for base64 decoding
Beyond the basics, you'll often encounter variations like incorrect padding, URL-safe encoding, or the need to work directly with the binary data itself.
Handling padding in base64 strings
import base64
# Base64 strings should have padding (=) for correct length
encoded_with_padding = "SGVsbG8gV29ybGQ="
encoded_without_padding = "SGVsbG8gV29ybGQ"
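# Append only as many '=' as needed to reach a multiple of four
padding = "=" * (-len(encoded_without_padding) % 4)
decoded = base64.b64decode(encoded_without_padding + padding)
print(decoded.decode('utf-8'))
--OUTPUT--
Hello World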
Base64 strings require a length that is a multiple of four. The = character acts as padding to satisfy this rule. If this padding is missing, base64.b64decode() raises a binascii.Error with the message "Incorrect padding".
- A simple fix is to add the padding back yourself. Append = characters until the string's length is a multiple of four, which takes one or two characters for any validly encoded data and allows the decoder to process it correctly.
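The padding rule can be wrapped in a small helper. The sketch below is illustrative only: the function name decode_unpadded and the sample strings are assumptions, not part of the original examples.

```python
import base64

def decode_unpadded(encoded: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding may have been stripped."""
    # -len % 4 yields 0, 1, or 2 for any length a valid encoder can produce.
    missing = -len(encoded) % 4
    return base64.b64decode(encoded + "=" * missing)

# "A", "AB", and "ABC" encode to "QQ==", "QUI=", and "QUJD" respectively.
for sample in ("QQ", "QUI", "QUJD", "SGVsbG8gV29ybGQ"):
    print(sample, "->", decode_unpadded(sample))
```

Computing the count from the length keeps the helper safe to call on strings that are already correctly padded, since it then appends nothing.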
Decoding URL-safe base64
import base64
# URL-safe base64 uses '-' and '_' instead of '+' and '/'
url_safe_encoded = "SGVsbG8gV29ybGQh"
decoded_bytes = base64.urlsafe_b64decode(url_safe_encoded)
print(decoded_bytes.decode('utf-8'))
--OUTPUT--
Hello World!
Standard Base64 can be problematic in URLs because the + and / characters have special meanings. The URL-safe variant solves this by replacing them with - and _, making the encoded string safe to transmit as a URL parameter or path segment.
- To handle this, you can use the base64.urlsafe_b64decode() function.
- It works just like the standard decoder, returning a bytes object that you then convert to a string.
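The two alphabets only differ for the characters at positions 62 and 63, so a string like "SGVsbG8gV29ybGQh" looks identical in both variants. As a small illustration, the sketch below uses three bytes chosen (as an assumption for demonstration purposes) so that the difference becomes visible.

```python
import base64

raw = b"\xfb\xff\xfe"  # bytes chosen so the encoding uses alphabet positions 62 and 63

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)   # +//+
print(url_safe)   # -__-

# Each decoder recovers the same bytes from its own alphabet.
assert base64.b64decode(standard) == raw
assert base64.urlsafe_b64decode(url_safe) == raw
```

Mixing the two up is worth avoiding: the standard decoder does not understand - and _, and in its default lenient mode it may skip them instead of raising an error.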
Decoding to binary data
import base64
# Decode binary data (like images)
binary_base64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII="
binary_data = base64.b64decode(binary_base64)
print(f"Decoded {len(binary_data)} bytes of binary data")
print(f"First 10 bytes: {binary_data[:10].hex()}")
--OUTPUT--
Decoded 68 bytes of binary data
First 10 bytes: 89504e470d0a1a0a0000
Sometimes, Base64 doesn't encode simple text. It's frequently used to transmit binary files like images, audio, or executables. In these situations, your goal isn't to get a readable string but to reconstruct the original binary data.
- The process starts the same way, using base64.b64decode() to convert the encoded string into a bytes object.
- The crucial difference is that you stop there. You don't call .decode('utf-8') because the data isn't text. This raw binary data can then be saved to a file or used by another part of your application.
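Working with the raw bytes directly might look like the following sketch, which checks the PNG signature and reads the image dimensions from the header. The helper name png_dimensions is hypothetical, and the byte offsets assume a standard PNG layout in which the IHDR chunk comes first.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(encoded: str) -> tuple[int, int]:
    """Decode Base64 PNG data and return (width, height) from its IHDR chunk."""
    data = base64.b64decode(encoded)
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("decoded data does not start with a PNG header")
    # Width and height are big-endian 32-bit integers at byte offsets 16 and 20.
    return struct.unpack(">II", data[16:24])

pixel = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII="
print(png_dimensions(pixel))
```

Checking a known signature like this is a cheap way to confirm that a decoded payload is the kind of file you expect before saving or processing it.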
Advanced base64 decoding approaches
For more complex scenarios, you can move beyond the standard base64 module to handle errors gracefully, process large files, and use alternative decoding libraries.
Using binascii for alternative decoding
import binascii
encoded_data = "SGVsbG8gV29ybGQh"
# Alternative way to decode base64 using binascii
decoded_data = binascii.a2b_base64(encoded_data)
print(decoded_data.decode('utf-8'))
--OUTPUT--
Hello World!
The binascii module provides a lower-level alternative for Base64 conversions. Its a2b_base64() function performs the same core task, converting an encoded string back into a bytes object, and base64.b64decode() is in fact built on top of it. The name a2b is shorthand for "ASCII to binary."
- By default, both functions are lenient: characters outside the Base64 alphabet, including whitespace and line breaks, are skipped rather than rejected. For strict validation, pass validate=True to base64.b64decode() or, on Python 3.11 and later, strict_mode=True to binascii.a2b_base64().
- The function returns a bytes object, so you'll still need to call .decode('utf-8') to convert the binary data into a readable string.
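A quick way to see how each function treats stray whitespace is to feed both the same input. This is a minimal sketch, assuming a sample string with one embedded space. It uses only validate=True for the strict case, so it does not depend on a particular Python version.

```python
import base64
import binascii

spaced = "SGVs bG8="  # "SGVsbG8=" with a stray space in the middle

# Default (lenient) behaviour: the space is skipped by both functions.
print(binascii.a2b_base64(spaced))
print(base64.b64decode(spaced))

# Strict behaviour: validate=True rejects any non-alphabet character.
try:
    base64.b64decode(spaced, validate=True)
except binascii.Error as exc:
    print("rejected:", exc)
```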
Adding error handling to base64 decoding
import base64
def safe_decode(encoded_string):
    try:
        decoded = base64.b64decode(encoded_string)
        return decoded.decode('utf-8')
    # binascii.Error and UnicodeDecodeError are both subclasses of ValueError
    except ValueError as e:
        return f"Error decoding: {e}"
print(safe_decode("SGVsbG8gV29ybGQh"))
print(safe_decode("Invalid Base64!"))
--OUTPUT--
Hello World!
Error decoding: Invalid base64-encoded string: number of data characters (13) cannot be 1 more than a multiple of 4
Decoding can easily fail if the input string isn't valid Base64, which could crash your application. You can prevent this by wrapping the decoding logic in a try...except block. This approach makes your code more resilient by allowing it to handle malformed data without stopping execution.
- The safe_decode function first attempts to run base64.b64decode().
- If successful, it returns the original string.
- If an error occurs, the except block catches it and returns a descriptive message instead of letting the program crash.
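Returning an error message as a string makes it hard for callers to tell success from failure. One possible variation, sketched here with a hypothetical decode_text helper, returns None on failure and reports separately whether the Base64 itself or the UTF-8 text was invalid.

```python
import base64
import binascii
from typing import Optional

def decode_text(encoded: str) -> Optional[str]:
    """Return the decoded text, or None if the input is not Base64-encoded UTF-8."""
    try:
        raw = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        print(f"Not valid Base64: {exc}")
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        print(f"Valid Base64, but not UTF-8 text: {exc}")
        return None

print(decode_text("SGVsbG8gV29ybGQh"))  # valid Base64, valid UTF-8
print(decode_text("Invalid Base64!"))   # fails the Base64 check
print(decode_text("/w=="))              # decodes to b"\xff", which is not UTF-8
```

Whether to return None, raise a custom exception, or log the failure is a design choice that depends on how the calling code wants to react.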
Processing base64 data in chunks
import base64
import io
# Useful for handling large base64 encoded files
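def decode_in_chunks(base64_string, chunk_size=1024):
    # Chunks must align with Base64's 4-character groups
    if chunk_size % 4:
        raise ValueError("chunk_size must be a multiple of 4")
    bytes_io = io.BytesIO()
    for i in range(0, len(base64_string), chunk_size):
        chunk = base64_string[i:i+chunk_size]
        # Only the final chunk can be short enough to need padding
        bytes_io.write(base64.b64decode(chunk + "=" * (-len(chunk) % 4)))
    return bytes_io.getvalue()
result = decode_in_chunks("SGVsbG8gV29ybGQh")
print(result.decode('utf-8'))
--OUTPUT--
Hello World!
When dealing with large inputs, decoding the entire Base64 string at once can consume a lot of memory. A more economical pattern is to process the data in chunks. This function walks the encoded string piece by piece, decodes each segment, and writes the resulting binary data to an in-memory buffer. Because the full input string and the full output buffer both stay in memory here, the example illustrates the pattern rather than saving much memory. The savings come when the same loop reads from and writes to file objects.
- The decode_in_chunks function iterates over the string, handling a chunk_size portion at a time. The chunk_size must be a multiple of four so that no 4-character Base64 group is split across two chunks.
- It uses io.BytesIO to create a binary stream that collects the decoded bytes.
- It pads each chunk with = to a multiple of four before decoding, which in practice restores padding missing from the final chunk so that base64.b64decode() can process it.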
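To keep only one chunk in memory at a time, the same idea can be applied to files. The following is a sketch under stated assumptions: the function name decode_base64_file, its parameters, and the file layout (Base64 text that may contain line breaks) are illustrative and not taken from the example above.

```python
import base64

def decode_base64_file(src_path, dst_path, chunk_size=4096):
    """Stream-decode a Base64 text file to a binary file.

    chunk_size is the number of encoded bytes read per iteration.
    """
    leftover = b""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # Drop line breaks and spaces, then keep only whole 4-character groups.
            data = leftover + b"".join(chunk.split())
            usable = len(data) - (len(data) % 4)
            dst.write(base64.b64decode(data[:usable]))
            leftover = data[usable:]
        if leftover:
            # Restore padding that the producer may have stripped.
            dst.write(base64.b64decode(leftover + b"=" * (-len(leftover) % 4)))
```

Carrying the incomplete group over in leftover means chunk_size does not have to be a multiple of four here, and line-wrapped input is handled as well.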
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. Describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.
For the Base64 decoding techniques covered in this article, Replit Agent can turn them into production-ready tools.
- Build a web utility that converts Base64 strings back into downloadable image files.
- Create a debugging tool that safely decodes URL-safe Base64 strings, automatically handling padding and other common errors.
- Deploy a data processing pipeline that efficiently handles large, Base64-encoded files by decoding them in chunks.
Describe your app idea, and the agent writes the code, tests it, and fixes issues automatically. Try Replit Agent to turn your concepts into working applications faster.
Common errors and challenges
Even with the right tools, you can run into issues with character encoding, corrupted data, or incorrect padding when decoding Base64 strings.
Handling encoding issues with encode() and decode()
A common mistake is confusing bytes and strings. The base64.b64decode() function returns a bytes object, which is a raw sequence of bytes. If you treat this object like a regular string, for example by concatenating it with a str, you'll get a TypeError, and if you decode it with the wrong character set, you'll get a UnicodeDecodeError or garbled text.
- To convert the bytes object into a human-readable string, you must call .decode() on it, specifying the character set, such as .decode('utf-8').
- The reverse is also true. Before encoding a string with base64.b64encode(), you first have to turn it into bytes using .encode('utf-8').
Fixing corrupted base64 strings with re.sub()
Sometimes, Base64 data picks up extra characters, like whitespace, line breaks, or stray punctuation, that aren't part of the standard alphabet. By default, base64.b64decode() silently discards such characters, which can hide corruption. With validate=True, it raises an error instead.
You can clean the string explicitly before attempting to decode it. Python's regular expression module offers a powerful function, re.sub(), that can strip out any characters that don't belong. By defining a pattern that matches everything except valid Base64 characters, you can use re.sub() to remove the rest, leaving a clean string that decodes properly.
Troubleshooting base64 padding with % operator
An Incorrect padding error is one of the most frequent issues you'll encounter. It happens when the Base64 string's length isn't a multiple of four, usually because one or two = padding characters are missing at the end.
You can programmatically fix this by calculating the missing padding. The modulo operator (%) is perfect for this. The remainder len(your_string) % 4 tells you how far the string extends past the last full group of four, so when the remainder isn't zero, you append (4 - remainder) padding characters. A remainder of one means the data itself is truncated, and no amount of padding will fix it.
Handling encoding issues with encode() and decode()
The bytes-versus-string confusion also shows up on the encoding side. If you forget to call .encode() before passing a string to base64.b64encode(), you'll trigger a TypeError, whether or not the string contains non-ASCII characters. The code below shows this scenario with a Japanese string.
import base64
# A string with non-ASCII characters
original = "こんにちは世界" # "Hello World" in Japanese
encoded = base64.b64encode(original)
print(encoded)
decoded = base64.b64decode(encoded).decode('utf-8')
print(decoded)
The base64.b64encode() function expects a bytes-like object, but the Japanese text is passed as a str, causing a TypeError. The same error occurs for a plain ASCII string, because the function accepts no str input at all. The corrected code shows the necessary adjustment before encoding.
import base64
# A string with non-ASCII characters
original = "こんにちは世界" # "Hello World" in Japanese
encoded = base64.b64encode(original.encode('utf-8'))
print(encoded)
decoded = base64.b64decode(encoded).decode('utf-8')
print(decoded)
To fix the TypeError, you must convert the string to bytes before encoding it. The corrected code does this with original.encode('utf-8'). This step is crucial because base64.b64encode() is designed to work with binary data, not text strings. The choice of character set matters most when your data includes non-ASCII characters, such as those found in languages like Japanese, because the decoding side must use the same one to recover the text. Always remember to encode your strings to bytes first.
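If you convert between text and Base64 often, it can help to wrap both directions in small helpers so the encode and decode steps always use the same character set. The helper names text_to_base64 and base64_to_text below are hypothetical, and the sketch assumes UTF-8 as the default.

```python
import base64

def text_to_base64(text: str, charset: str = "utf-8") -> str:
    """Encode text as a Base64 str (b64encode itself returns bytes)."""
    return base64.b64encode(text.encode(charset)).decode("ascii")

def base64_to_text(encoded: str, charset: str = "utf-8") -> str:
    """Decode Base64 back to text using the same charset."""
    return base64.b64decode(encoded).decode(charset)

encoded = text_to_base64("こんにちは世界")
print(encoded)
print(base64_to_text(encoded))

# Passing a str straight to b64encode fails regardless of its contents.
try:
    base64.b64encode("plain ascii")
except TypeError as exc:
    print("TypeError:", exc)
```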
Fixing corrupted base64 strings with re.sub()
If a Base64 string contains invalid characters, like spaces or punctuation, strict decoding will fail, because the decoder only recognizes a specific set of characters. By default, b64decode() quietly skips such characters, so the following code passes validate=True to surface the error that occurs when you attempt to decode a corrupted string.
import base64
# This base64 string has invalid characters
corrupted_base64 = "SGVsbG8gV 29ybGQh!"
decoded = base64.b64decode(corrupted_base64, validate=True)
print(decoded.decode('utf-8'))
With validate=True, the b64decode function fails because the input string contains a space and an exclamation mark. Since these characters aren't part of the Base64 alphabet, the decoder raises a binascii.Error. The corrected code below shows how to handle this.
import base64
import re
# This base64 string has invalid characters
corrupted_base64 = "SGVsbG8gV 29ybGQh!"
cleaned = re.sub(r'[^A-Za-z0-9+/=]', '', corrupted_base64)
try:
    decoded = base64.b64decode(cleaned)
    print(decoded.decode('utf-8'))
# binascii.Error and UnicodeDecodeError are both subclasses of ValueError
except ValueError as e:
    print(f"Decoding error: {e}")
The corrected code uses Python's regular expression module to sanitize the data. The re.sub() function strips out any characters that aren't part of the standard Base64 alphabet before the string reaches the decoder. It's especially useful when you're processing data from external sources, such as user uploads or API responses, where extra whitespace or invalid characters can sneak in. Note that the pattern also removes - and _, so adjust it before using it on URL-safe input.
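Stripping every unknown character can also mask real corruption. A more conservative variation, sketched here with a hypothetical decode_strict helper, removes only whitespace and then lets validate=True reject anything else.

```python
import base64
import binascii
import re

def decode_strict(encoded: str) -> bytes:
    """Strip whitespace, then decode with validation so other junk is rejected."""
    compact = re.sub(r"\s+", "", encoded)
    return base64.b64decode(compact, validate=True)

print(decode_strict("SGVsbG8g\nV29ybGQh"))  # line-wrapped input is accepted

try:
    decode_strict("SGVsbG8gV29ybGQh!")      # trailing '!' is not whitespace
except binascii.Error as exc:
    print("rejected:", exc)
```

Which approach fits depends on the source of the data: line-wrapped Base64 from email or PEM-style files is normal, while stray punctuation usually signals that something went wrong upstream.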
Troubleshooting base64 padding with % operator
You'll often run into an Incorrect padding error when a Base64 string is missing its trailing = characters. This makes its length invalid for decoding. The code below triggers this common error by attempting to process an incomplete string.
import base64
# Base64 string with incorrect length
incomplete_base64 = "SGVsbG8gV29ybGQ" # Missing padding
decoded = base64.b64decode(incomplete_base64)
print(decoded.decode('utf-8'))
The b64decode() function requires the string's length to be a multiple of four. Since the provided string doesn't meet this rule, the decoding fails. The corrected code below shows how to fix this programmatically.
import base64
# Base64 string with incorrect length
incomplete_base64 = "SGVsbG8gV29ybGQ" # Missing padding
remainder = len(incomplete_base64) % 4
if remainder:
    # e.g. remainder 3 needs one '=', remainder 2 needs two
    incomplete_base64 += "=" * (4 - remainder)
decoded = base64.b64decode(incomplete_base64)
print(decoded.decode('utf-8'))
The corrected code programmatically fixes padding errors by ensuring the string's length is a multiple of four. This handles data whose padding was stripped in transit, as many URL and token formats do. It cannot repair data that lost actual content, since padding only fills out the final group.
- It uses the modulo operator (%) to calculate how many padding characters are needed.
- Then, it appends the correct number of = characters, making the string valid for b64decode().
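Input in the wild may have missing padding, too much padding, or a length that no padding can repair. The sketch below covers these cases in one place. The function name normalize_padding is an assumption for illustration.

```python
import base64

def normalize_padding(encoded: str) -> str:
    """Return the string with exactly the padding its length requires."""
    stripped = encoded.rstrip("=")
    remainder = len(stripped) % 4
    if remainder == 1:
        # One leftover character carries only 6 bits, less than a full byte.
        raise ValueError("length is 1 more than a multiple of 4; data is truncated")
    return stripped + "=" * (-len(stripped) % 4)

print(normalize_padding("SGVsbG8gV29ybGQ"))    # padding missing
print(normalize_padding("SGVsbG8gV29ybGQ=="))  # too much padding
print(base64.b64decode(normalize_padding("SGVsbG8gV29ybGQ")))
```

Raising on a remainder of one surfaces truncated data early with a clear message instead of leaving the decoder to fail later.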
Real-world applications
With a solid grasp of decoding, you can now tackle real-world applications like saving image files and parsing JWT authentication tokens.
Decoding base64 to save an image file
A practical application of Base64 decoding is converting an image embedded as a string back into a binary file that you can save and view.
import base64
# Base64 string representing a small image (truncated for brevity)
image_base64 = "iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAACXBIWXMAAAsTAAALEwEAmpwYAAABp0lEQVR4nKWTzUtUURjGf+85904jODYwpOIiYhBcuJB2QkGLoP9AEtwU1SpoJbjRFrUscON/ELhVCCJoIYiLoEUfkNMiJCFrqJCZ0bk693NxZ8bRibbwgZfDec/7Pu/zcQ78b8n/MN3ubJbMWjA3KPwIgkDDhDx0tR/TMzNBrVqkVi2S+DWW1zBnFEWssGyZUBc6U8qoaqpU+E6+MIZfiOEXYtQaOSoNn6phzilsAc8WRjdVtfT+8RO6olEAVgprrD4/QlP1NGjnmZ6ZYK44z2pjhXJjjQ1VJpFkRFDc3lPVww+ePsMYQ0fK58WTx3x684q9CuTa28gnr6mqMqvKjAQByMfjvTVjOH/lKt++fGZm+j3X7tyju7ePwoer5BIwCd+BQR+gs7OLsdk5jDH8+PqVuaP7SVXJJODl0iLOr/clsBPUDwKkq5NLN27hOQ5+xzb6Ll7CWstS6Rf5BFjg0NYQYjubVEtlXj56SK1WY3R8kkw2S2FlmcnREzRE6AXOhqfYbO8BLN/GKdVr9B4aYGhoiGB9nY+zs9TVkNvCeBiVrR1sy/+yMn9bvwGleoeVVF+mRAAAAABJRU5ErkJggg=="
# Decode and save the image
# Decode first so a bad string doesn't leave an empty file behind
image_data = base64.b64decode(image_base64)
with open("decoded_image.png", "wb") as image_file:
    image_file.write(image_data)
print(f"Image saved: decoded_image.png ({len(image_data)} bytes)")
This code reconstructs an image from its Base64 text representation. The process hinges on decoding the string back into binary data and then writing that data to a new file.
- The base64.b64decode() function converts the string into a bytes object, which contains the raw image data.
- A new file is opened using write-binary mode ("wb"). This mode is essential because you are handling binary content, not plain text.
- Finally, the write() method saves the raw bytes to disk, creating a viewable image file from the original string.
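Embedded images often arrive as data URIs of the form data:image/png;base64,... rather than as bare Base64. As a hedged sketch, the hypothetical save_data_uri helper below splits off that prefix before decoding. It assumes the URI carries a ;base64 marker and reuses the 1x1 PNG from the binary data example earlier in this article.

```python
import base64
import tempfile
from pathlib import Path

def save_data_uri(data_uri: str, path) -> int:
    """Write the payload of a Base64 data URI to `path`; return bytes written."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a URI of the form data:<type>;base64,<payload>")
    return Path(path).write_bytes(base64.b64decode(payload))

uri = (
    "data:image/png;base64,"
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII="
)
target = Path(tempfile.gettempdir()) / "pixel.png"
written = save_data_uri(uri, target)
print(f"Wrote {written} bytes to {target}")
```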
Parsing JWT authentication tokens with base64
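JSON Web Tokens (JWTs) use URL-safe Base64 without padding to package information, and decoding the token's header and payload is a common way to inspect its contents. Decoding alone does not verify the signature, so treat the decoded values as untrusted.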
import base64
import json
# A sample JWT token (this is a dummy token)
jwt_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
# Split the token into its parts
header_b64, payload_b64, signature = jwt_token.split('.')
# Decode the header and payload
def decode_jwt_part(part):
    # JWT segments omit '=' padding, so restore it before decoding
    padding = '=' * (-len(part) % 4)
    decoded = base64.urlsafe_b64decode(part + padding)
    return json.loads(decoded)
header = decode_jwt_part(header_b64)
payload = decode_jwt_part(payload_b64)
print("Header:", header)
print("Payload:", payload)
This code inspects a JSON Web Token (JWT) by decoding its header and payload. Since a JWT is made of three parts separated by dots, the code first uses jwt_token.split('.') to isolate them.
- The custom decode_jwt_part function handles the core logic. It calculates and adds any missing = padding, which is often omitted in URL-safe Base64.
- It then uses base64.urlsafe_b64decode() to convert the URL-safe string back into binary data.
- Finally, json.loads() parses the resulting JSON into a readable Python dictionary.
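For tokens from untrusted sources, it may be worth checking the structure before decoding and converting timestamp claims into readable dates. The inspect_jwt function below is a hypothetical sketch of that idea. It does not verify the signature, and it assumes the payload carries an iat claim, as the sample token does.

```python
import base64
import json
from datetime import datetime, timezone

def inspect_jwt(token: str) -> tuple[dict, dict]:
    """Return (header, payload) from a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 3 dot-separated parts, got {len(parts)}")

    def decode_part(part: str) -> dict:
        padded = part + "=" * (-len(part) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return decode_part(parts[0]), decode_part(parts[1])

token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)
header, payload = inspect_jwt(token)
# "iat" is a Unix timestamp in seconds; convert it to an aware UTC datetime.
issued_at = datetime.fromtimestamp(payload["iat"], tz=timezone.utc)
print(header["alg"], payload["name"], issued_at.isoformat())
```

For anything beyond inspection, such as authentication decisions, a dedicated JWT library that validates the signature and expiry is the safer choice.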
Get started with Replit
Turn what you've learned into a real tool with Replit Agent. Describe what you want, like “a web utility that decodes Base64 to a downloadable file” or “a JWT inspector that displays the decoded payload.”
The agent writes the code, tests for errors, and deploys your application automatically. Start building with Replit and bring your ideas to life.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.