How to get the file size in Python
Learn how to get file size in Python. This guide covers various methods, practical tips, real-world applications, and how to debug common errors.

You often need to know a file's size in Python. This is crucial when you manage disk space, validate uploads, or optimize data processes. Python offers several built-in methods for this task.
In this article, we'll cover techniques to get file sizes. You'll also find practical tips, real-world applications, and advice on how to debug common issues for your projects.
Using os.path.getsize() for basic file size check
import os
file_path = "example.txt"
size_in_bytes = os.path.getsize(file_path)
print(f"File size: {size_in_bytes} bytes")

Output:
File size: 1024 bytes
The os.path.getsize() function offers the most direct way to check a file's size. It's part of Python's standard os module, which provides a portable way of using operating system-dependent functionality. You just need to provide the path to your file.
The function returns the file's size as an integer representing the number of bytes. This raw byte count is a precise, low-level measurement, making it perfect for programmatic checks where accuracy matters—like verifying file integrity or managing storage quotas.
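As a sketch of such a programmatic check, the snippet below compares the raw byte count against an upload limit. MAX_UPLOAD_BYTES and the temporary file are assumptions made for illustration only:

```python
import os
import tempfile

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical 5 MiB upload limit

def is_upload_allowed(path):
    # Compare the raw byte count from os.path.getsize() against the quota
    return os.path.getsize(path) <= MAX_UPLOAD_BYTES

# Create a small temporary file so the example runs anywhere
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1024)  # 1 KiB of data
    tmp_path = tmp.name

print(is_upload_allowed(tmp_path))  # a 1 KiB file passes the 5 MiB limit
os.remove(tmp_path)
```

Because the comparison works on exact byte counts, there is no rounding ambiguity in the check itself.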
Standard file size operations
While os.path.getsize() gives you the raw byte count, Python offers more advanced methods for gathering detailed file metadata and formatting the output.
Using os.stat() to get detailed file information
import os
file_info = os.stat("example.txt")
print(f"Size: {file_info.st_size} bytes")
print(f"Last modified timestamp: {file_info.st_mtime}")

Output:
Size: 1024 bytes
Last modified timestamp: 1617283456.123456
When you need more than just the file size, os.stat() is the way to go. It returns a special object that’s packed with file metadata. While the size is available through the st_size attribute, you also get access to other useful details.
- Timestamps: Find out when the file was last modified (st_mtime) or last accessed (st_atime).
- Permissions: Check the file's permissions and mode using the st_mode attribute.
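Those raw values are easy to make readable: st_mtime converts with datetime.fromtimestamp(), and stat.filemode() renders st_mode as an ls-style permission string. A small self-contained sketch (it creates its own throwaway file):

```python
import os
import stat
import tempfile
from datetime import datetime

# Create a throwaway file so the example runs anywhere
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name

info = os.stat(path)
print(f"Size: {info.st_size} bytes")
print(f"Modified: {datetime.fromtimestamp(info.st_mtime)}")
print(f"Permissions: {stat.filemode(info.st_mode)}")  # e.g. -rw-------
os.remove(path)
```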
Using pathlib.Path for object-oriented file handling
from pathlib import Path
file_path = Path("example.txt")
size = file_path.stat().st_size
print(f"The file is {size} bytes")

Output:
The file is 1024 bytes
The pathlib module offers a modern, object-oriented way to handle file paths. Instead of passing strings to functions, you create a Path object that represents your file. This approach is often preferred for its readability and cleaner code structure.
- The Path object bundles path-related operations into convenient methods.
- You can call .stat() directly on it to get the same stat result as os.stat(), accessing the size via the st_size attribute.
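As a sketch of this style, the snippet below (which sets up its own temporary file) guards with is_file() before calling .stat():

```python
import os
import tempfile
from pathlib import Path

# Self-contained setup: write a sample file to inspect
tmp_dir = tempfile.mkdtemp()
file_path = Path(tmp_dir) / "example.txt"
file_path.write_text("hello world")

if file_path.is_file():  # guard against a missing path before stat()
    size = file_path.stat().st_size
    print(f"{file_path.name} ({file_path.suffix}) is {size} bytes")

file_path.unlink()
os.rmdir(tmp_dir)
```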
Converting file size to human-readable format
import os
def format_size(size_bytes):
    for unit in ['B', 'KB', 'MB', 'GB']:
        if size_bytes < 1024 or unit == 'GB':
            return f"{size_bytes:.2f} {unit}"
        size_bytes /= 1024

print(format_size(os.path.getsize("example.txt")))

Output:
1.00 KB
While a byte count is precise, it isn't always easy to read. This custom format_size function converts the raw byte value from os.path.getsize() into a more familiar format like kilobytes or megabytes.
- The function repeatedly divides the size by 1024, moving up through units like 'B', 'KB', and 'MB'.
- It stops and returns a formatted string once the number is small enough, attaching the correct unit.
This simple approach makes file sizes much easier to understand at a glance in user interfaces or logs.
Advanced file size techniques
Moving beyond individual files, you'll often need to handle more complex cases, like calculating the total size of a directory or checking a remote file.
Getting size of multiple files using glob
import os
import glob
total_size = sum(os.path.getsize(f) for f in glob.glob("*.txt"))
print(f"Total size of all text files: {total_size} bytes")

Output:
Total size of all text files: 5120 bytes
When you need to work with multiple files at once, the glob module is incredibly useful. It finds all pathnames matching a specified pattern, like all text files in a directory. This example combines it with a generator expression for an efficient, one-line solution.
- The glob.glob("*.txt") call returns a list of all files in the current directory ending with .txt.
- A generator expression then iterates through this list, calling os.path.getsize() on each file.
- Finally, the sum() function adds up all the individual file sizes to give you a grand total.
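If matching files can also live in subdirectories, glob supports a recursive `**` pattern when you pass recursive=True. A self-contained sketch that builds its own sample tree:

```python
import glob
import os
import shutil
import tempfile

# Build a small sample tree: one .txt at the top level, one in a subfolder
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for rel in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(root, rel), "w") as f:
        f.write("data")  # 4 bytes per file

# ** matches zero or more directory levels when recursive=True
pattern = os.path.join(root, "**", "*.txt")
total = sum(os.path.getsize(f) for f in glob.glob(pattern, recursive=True))
print(f"Total size of all text files (recursive): {total} bytes")  # 8 bytes here

shutil.rmtree(root)  # clean up the sample tree
```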
Walking directory tree to calculate folder size
import os
def get_dir_size(path):
    total = 0
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_file():
                total += entry.stat().st_size
            elif entry.is_dir():
                total += get_dir_size(entry.path)
    return total

print(f"Folder size: {get_dir_size('my_folder')} bytes")

Output:
Folder size: 10485760 bytes
Calculating a folder's total size requires you to account for every file within it, including those in subdirectories. This custom get_dir_size function recursively "walks" the directory tree to accomplish this. It uses os.scandir(), which is an efficient way to iterate through a directory's contents.
- The function checks if an entry is a file using entry.is_file() and adds its size to the total.
- If the entry is a directory, confirmed with entry.is_dir(), the function calls itself. This recursive step calculates the subdirectory's size and adds it to the running total.
Checking remote file size with requests
import requests
url = "https://www.example.com/sample.txt"
response = requests.head(url)
size = int(response.headers.get('content-length', 0))
print(f"Remote file size: {size} bytes")

Output:
Remote file size: 2048 bytes
You don't need to download a remote file to find its size. The requests library handles this efficiently by sending a HEAD request, which fetches only the file's metadata—not its content. This saves bandwidth and time.
- The server's response includes headers, and the file's size is found in the Content-Length header.
- Using response.headers.get('content-length', 0) is a safe way to retrieve this value, as it provides a default of 0 if the header is missing.
Move faster with Replit
Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. You don't need to configure environments or manage packages.
Instead of just piecing together techniques, you can build a complete application. Agent 4 takes your idea and builds a working product by handling the code, databases, APIs, and deployment directly from your description. You can create tools like:
- A disk usage analyzer that recursively calculates folder sizes with os.scandir() and displays the results in a human-readable format.
- A batch processing utility that calculates the total size of all matching files using glob before starting a task.
- A website monitoring tool that checks the content-length of remote assets to verify their integrity without a full download.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
Getting a file's size is usually straightforward, but a few common errors can trip you up if you're not prepared for them.
- Handling FileNotFoundError: This error occurs if the file path is incorrect or the file doesn't exist. Functions like os.path.getsize() will fail immediately. You can prevent crashes by wrapping your code in a try...except block to catch the FileNotFoundError and handle it gracefully, like logging a warning or skipping the file.
- Avoiding PermissionError: You'll encounter a PermissionError if your script lacks the necessary rights to access a file. This often happens with system files or in restricted environments. Again, a try...except block is your best tool for catching this error and allowing your program to continue running without interruption.
- Incorrect file size units: A classic mix-up is using 1000 instead of 1024 for conversions. File systems report sizes in binary units, where a kilobyte (strictly, a kibibyte) is 1024 bytes, not 1000. This mistake can lead to significant inaccuracies with large files, so always use powers of 1024 for binary-style human-readable formatting.
Handling FileNotFoundError when checking file size
One of the most frequent issues you'll face is the FileNotFoundError. It's Python's way of telling you the file you're looking for doesn't exist. This error will immediately stop your script if you don't handle it. The following code demonstrates this problem in action.
import os
def get_file_size(filename):
    return os.path.getsize(filename)
size = get_file_size("nonexistent_file.txt")
print(f"File size: {size} bytes")
The code calls os.path.getsize() on "nonexistent_file.txt", a file that isn't there. Since the function can't find the file, it raises a FileNotFoundError and crashes the script. The following example shows how to manage this.
import os
def get_file_size(filename):
    if os.path.exists(filename):
        return os.path.getsize(filename)
    else:
        return None
size = get_file_size("nonexistent_file.txt")
if size is not None:
    print(f"File size: {size} bytes")
else:
    print("File does not exist")
To avoid a crash, you can check if a file exists before trying to read it. The updated code uses os.path.exists() to verify the file's presence first. If the file is found, os.path.getsize() returns its size. Otherwise, the function returns None, allowing your program to handle the missing file gracefully. This is crucial when dealing with files that might not exist, such as user inputs or temporary files.
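Note that an exists-first check leaves a small window in which the file could vanish between the check and the read (a race condition). An alternative sketch catches the exception directly instead:

```python
import os

def get_file_size(filename):
    try:
        return os.path.getsize(filename)
    except FileNotFoundError:
        return None  # let the caller decide how to handle a missing file

size = get_file_size("nonexistent_file.txt")
print(f"File size: {size} bytes" if size is not None else "File does not exist")
```

This "ask forgiveness" style avoids the race entirely, since the only existence check is the read itself.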
Avoiding PermissionError when accessing restricted files
A PermissionError arises when your script tries to access a file or directory without the necessary permissions. This often happens with protected system folders. The code below demonstrates how this error can crash your program when it encounters a restricted directory.
import os
def get_sizes(directory):
    sizes = {}
    for item in os.listdir(directory):
        path = os.path.join(directory, item)
        sizes[item] = os.path.getsize(path)
    return sizes
print(get_sizes("/root")) # May cause PermissionError
Calling os.listdir() on a protected directory can raise a PermissionError outright, and even in a readable directory, os.path.getsize() can fail on individual restricted entries. The following example demonstrates a safer approach.
import os
def get_sizes(directory):
    sizes = {}
    try:
        entries = os.listdir(directory)
    except PermissionError:
        # The directory itself may be unreadable
        return {directory: "Access denied"}
    for item in entries:
        path = os.path.join(directory, item)
        try:
            sizes[item] = os.path.getsize(path)
        except (PermissionError, OSError):
            sizes[item] = "Access denied"
    return sizes
print(get_sizes("/root")) # Handles permission issues gracefully
The improved function wraps the os.path.getsize() call within a try...except block. This prevents the script from crashing when it hits a file you don't have permission to read. Instead of halting, it catches the PermissionError or OSError and simply records "Access denied" for that item. This approach is essential when your script needs to scan entire directories that might contain protected files, ensuring it can finish its job without interruption.
Incorrect file size units conversion
A common pitfall is confusing binary and decimal prefixes. File systems use binary, where one kilobyte equals 1024 bytes, but it's easy to mistakenly divide by 1000. This small error can lead to significant miscalculations, especially with larger files.
The following code demonstrates this issue, where dividing by 1000 instead of 1024 gives an inaccurate result.
import os

# Convert file size to MB (decimal prefixes, wrong for binary sizes)
file_path = "database.db"
size_mb = os.path.getsize(file_path) / (1000 * 1000)
print(f"File size: {size_mb} MB")

The code divides the byte count by 1,000,000, which is based on decimal prefixes. This results in a slightly inflated and inaccurate file size because storage is measured in binary. The following example shows the correct approach.
import os
# Convert file size to MB using binary units
file_path = "database.db"
size_mb = os.path.getsize(file_path) / (1024 * 1024)
print(f"File size: {size_mb:.2f} MB")
The corrected code uses the proper binary conversion factor. To get megabytes from bytes, you divide by 1024 * 1024. Using 1000 is a common mistake that leads to inaccurate sizes. This distinction is crucial for large files, where the difference becomes significant. You'll want to stick to powers of 1024 anytime you're formatting file sizes for display in logs or user interfaces to ensure the numbers are correct.
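The gap between the two conventions is easy to quantify. For a 500 MiB file:

```python
size_bytes = 500 * 1024 * 1024  # a 500 MiB file

decimal_mb = size_bytes / (1000 * 1000)  # decimal megabytes
binary_mb = size_bytes / (1024 * 1024)   # binary megabytes (MiB)

print(f"Decimal: {decimal_mb:.2f} MB")  # 524.29 MB
print(f"Binary:  {binary_mb:.2f} MB")   # 500.00 MB
```

The roughly 4.9% discrepancy per MB compounds at larger scales, which is why mixing the two conventions in logs or dashboards causes confusion.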
Real-world applications
Now that you can reliably get a file's size, you can use that skill for practical tasks like managing disk space and analyzing compression.
Finding the largest files in a directory using heapq
For finding just the top few largest files in a directory, Python's heapq module offers a more efficient solution than sorting every single file.
import os
import heapq
def find_largest_files(directory, n=5):
    file_sizes = []
    for root, _, files in os.walk(directory):
        for file in files:
            filepath = os.path.join(root, file)
            size = os.path.getsize(filepath)
            file_sizes.append((filepath, size))
    return heapq.nlargest(n, file_sizes, key=lambda x: x[1])
# Find the 3 largest files in the current directory
for file, size in find_largest_files('.', 3):
    print(f"{file}: {size} bytes")
This find_largest_files function identifies the biggest files within a directory. It uses os.walk() to recursively explore all subdirectories.
- For every file found, it pairs the file's path with its size.
- The heapq.nlargest() function then processes this list, efficiently selecting the top n files. It uses a lambda function as the key to focus specifically on the size for comparison.
This gives you the largest files without fully sorting the list, making it well suited to projects that require efficient file analysis.
Calculating compression efficiency of files
By comparing a file's size before and after compression, you can calculate exactly how much space an algorithm saves you. This check_compression_ratio function does just that. It creates a compressed copy of a file using gzip and then grabs the sizes of both the original and the new compressed file with os.path.getsize().
With both sizes in hand, the function calculates the efficiency as a percentage. This tells you precisely how much storage you've saved. This kind of analysis is invaluable when you're trying to optimize storage or speed up data transfers, as it helps you see which files benefit most from compression.
import os
import gzip
import shutil
def check_compression_ratio(filename):
    compressed_name = filename + ".gz"
    with open(filename, 'rb') as f_in:
        with gzip.open(compressed_name, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
    original_size = os.path.getsize(filename)
    compressed_size = os.path.getsize(compressed_name)
    ratio = (original_size - compressed_size) / original_size * 100
    return original_size, compressed_size, ratio
# Check compression efficiency of a text file
orig, comp, ratio = check_compression_ratio("example.txt")
print(f"Original: {orig} bytes, Compressed: {comp} bytes")
print(f"Space saving: {ratio:.1f}%")
The check_compression_ratio function offers a practical workflow for measuring compression. It reads an original file in binary mode ('rb') and uses the gzip module to create a compressed version.
- The shutil.copyfileobj function efficiently streams data from the original file to the compressed one, which is ideal for large files.
- Once the .gz file is created, os.path.getsize retrieves the sizes of both files.
- The final calculation shows the space-saving percentage, letting you quantify the compression's effectiveness.
Get started with Replit
Turn these techniques into a real tool. Tell Replit Agent: “Build a disk space analyzer that finds large files” or “Create a tool to check remote file sizes from a URL.”
Replit Agent writes the code, tests for errors, and deploys the app directly from your description. Start building with Replit.
Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.



