How to get the file size in Python

Learn how to get file size in Python. This guide covers various methods, tips, real-world applications, and common error debugging.

Published on: Tue, Feb 24, 2026
Updated on: Mon, Apr 6, 2026
By the Replit Team

You often need to know a file's size in Python. This is crucial when you manage disk space, validate uploads, or optimize data processes. Python offers several built-in methods for this task.

In this article, we'll cover techniques to get file sizes. You'll also find practical tips, real-world applications, and advice on how to debug common issues for your projects.

Using os.path.getsize() for basic file size check

import os

file_path = "example.txt"
size_in_bytes = os.path.getsize(file_path)
print(f"File size: {size_in_bytes} bytes")
# Output: File size: 1024 bytes

The os.path.getsize() function offers the most direct way to check a file's size. It's part of Python's standard os module, which provides a portable way of using operating system-dependent functionality. You just need to provide the path to your file.

The function returns the file's size as an integer representing the number of bytes. This raw byte count is a precise, low-level measurement, making it perfect for programmatic checks where accuracy matters—like verifying file integrity or managing storage quotas.
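
As a sketch of that kind of programmatic check, the snippet below validates a file against an upload size limit. The 5 MiB cap and the function name are illustrative assumptions, not part of any particular framework.

```python
import os

# Hypothetical limit for illustration: reject anything over 5 MiB
MAX_UPLOAD_BYTES = 5 * 1024 * 1024

def is_upload_allowed(path):
    """Return True when the file at `path` fits within the size limit."""
    return os.path.getsize(path) <= MAX_UPLOAD_BYTES
```

Because the comparison works on the raw byte count, there's no rounding involved: a file one byte over the limit is rejected.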

Standard file size operations

While os.path.getsize() gives you the raw byte count, Python offers more advanced methods for gathering detailed file metadata and formatting the output.

Using os.stat() to get detailed file information

import os

file_info = os.stat("example.txt")
print(f"Size: {file_info.st_size} bytes")
print(f"Last modified timestamp: {file_info.st_mtime}")
# Output:
# Size: 1024 bytes
# Last modified timestamp: 1617283456.123456

When you need more than just the file size, os.stat() is the way to go. It returns a special object that’s packed with file metadata. While the size is available through the st_size attribute, you also get access to other useful details.

  • Timestamps: Find out when the file was last modified (st_mtime) or last accessed (st_atime).
  • Permissions: Check the file's permissions and mode using the st_mode attribute.
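
To illustrate those extra attributes, here's a small sketch (the helper name and return format are my own) that decodes st_mtime into a readable datetime and st_mode into an ls-style permission string using the standard stat module:

```python
import os
import stat
from datetime import datetime

def describe_file(path):
    """Return a (modified, permissions) tuple for the file at `path`."""
    info = os.stat(path)
    # st_mtime is a Unix timestamp; convert it to a datetime for display
    modified = datetime.fromtimestamp(info.st_mtime)
    # filemode() renders the mode bits in the familiar '-rw-r--r--' form
    perms = stat.filemode(info.st_mode)
    return f"{modified:%Y-%m-%d %H:%M:%S}", perms
```

stat.filemode() saves you from decoding the permission bits by hand.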

Using pathlib.Path for object-oriented file handling

from pathlib import Path

file_path = Path("example.txt")
size = file_path.stat().st_size
print(f"The file is {size} bytes")
# Output: The file is 1024 bytes

The pathlib module offers a modern, object-oriented way to handle file paths. Instead of passing strings to functions, you create a Path object that represents your file. This approach is often preferred for its readability and cleaner code structure.

  • The Path object bundles path-related operations into convenient methods.
  • You can call .stat() directly on it to get the same stat result as os.stat(), accessing the size via the st_size attribute.
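
As a quick illustration of the object-oriented style, this sketch (the function name and the *.txt pattern are arbitrary choices) combines Path.glob() with .stat() to total the text files directly inside a directory:

```python
from pathlib import Path

def total_txt_size(directory):
    """Sum the sizes of all .txt files directly inside `directory`."""
    # Path.glob yields Path objects, so .stat() is available on each match
    return sum(p.stat().st_size for p in Path(directory).glob("*.txt"))
```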

Converting file size to human-readable format

import os

def format_size(size_bytes):
    for unit in ['B', 'KB', 'MB', 'GB']:
        if size_bytes < 1024 or unit == 'GB':
            return f"{size_bytes:.2f} {unit}"
        size_bytes /= 1024

print(format_size(os.path.getsize("example.txt")))
# Output: 1.00 KB

While a byte count is precise, it isn't always easy to read. This custom format_size function converts the raw byte value from os.path.getsize() into a more familiar format like kilobytes or megabytes.

  • The function repeatedly divides the size by 1024, moving up through units like 'B', 'KB', and 'MB'.
  • It stops and returns a formatted string once the number is small enough, attaching the correct unit.

This memory-efficient approach makes file sizes much easier to understand at a glance in user interfaces or logs.
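
If you expect very large files, the same idea extends naturally to more units. This variant is a sketch (the unit list is an assumption) that adds terabytes by making the units a parameter:

```python
def format_size(size_bytes, units=("B", "KB", "MB", "GB", "TB")):
    """Format a byte count with binary units, extended through terabytes."""
    for unit in units[:-1]:
        if size_bytes < 1024:
            return f"{size_bytes:.2f} {unit}"
        size_bytes /= 1024
    # Anything that survives every division is reported in the largest unit
    return f"{size_bytes:.2f} {units[-1]}"
```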

Advanced file size techniques

Moving beyond individual files, you'll often need to handle more complex cases, like calculating the total size of a directory or checking a remote file.

Getting size of multiple files using glob

import os
import glob

total_size = sum(os.path.getsize(f) for f in glob.glob("*.txt"))
print(f"Total size of all text files: {total_size} bytes")
# Output: Total size of all text files: 5120 bytes

When you need to work with multiple files at once, the glob module is incredibly useful. It finds all pathnames matching a specified pattern, like all text files in a directory. This example combines it with a generator expression for an efficient, one-line solution.

  • The glob.glob("*.txt") function returns a list of all files in the current directory ending with .txt.
  • A generator expression then iterates through this list, calling os.path.getsize() on each file.
  • Finally, the sum() function adds up all the individual file sizes to give you a grand total.
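
If your files are spread across subdirectories, glob can search recursively. A minimal sketch (the function name is my own), using the ** wildcard together with recursive=True:

```python
import os
import glob

def total_size_recursive(pattern):
    """Total size of files matching `pattern`, searching subdirectories too."""
    # recursive=True lets '**' in the pattern match any number of directory levels
    return sum(os.path.getsize(f)
               for f in glob.glob(pattern, recursive=True)
               if os.path.isfile(f))
```

The os.path.isfile() guard skips any directories that happen to match the pattern.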

Walking directory tree to calculate folder size

import os

def get_dir_size(path):
    total = 0
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_file():
                total += entry.stat().st_size
            elif entry.is_dir():
                total += get_dir_size(entry.path)
    return total

print(f"Folder size: {get_dir_size('my_folder')} bytes")
# Output: Folder size: 10485760 bytes

Calculating a folder's total size requires you to account for every file within it, including those in subdirectories. This custom get_dir_size function recursively "walks" the directory tree to accomplish this. It uses os.scandir(), which is an efficient way to iterate through a directory's contents.

  • The function checks if an entry is a file using entry.is_file() and adds its size to the total.
  • If the entry is a directory, confirmed with entry.is_dir(), the function calls itself. This recursive step calculates the subdirectory's size and adds it to the running total.
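
For comparison, the standard library's os.walk() reaches the same total without explicit recursion, since it flattens the tree for you. A sketch of that alternative (the function name is my own):

```python
import os

def get_dir_size_walk(path):
    """Total size of all files under `path`, using os.walk instead of recursion."""
    total = 0
    # os.walk yields (directory, subdirectories, files) for every level of the tree
    for root, _, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total
```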

Checking remote file size with requests

import requests

url = "https://www.example.com/sample.txt"
response = requests.head(url)
size = int(response.headers.get('content-length', 0))
print(f"Remote file size: {size} bytes")
# Output: Remote file size: 2048 bytes

You don't need to download a remote file to find its size. The requests library handles this efficiently by sending a HEAD request, which fetches only the file's metadata—not its content. This saves bandwidth and time.

  • The server's response includes headers, and the file's size is found in the content-length header.
  • Using response.headers.get('content-length', 0) is a safe way to retrieve this value, as it provides a default of 0 if the header is missing.
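
Because servers aren't required to send Content-Length, it can pay to parse the header defensively. This helper is my own sketch, not part of requests; it returns None instead of a misleading 0 when the header is missing or malformed:

```python
def size_from_headers(headers):
    """Parse a Content-Length value out of a response-headers mapping.

    Returns the size in bytes as an int, or None when the header is
    missing or not a valid integer.
    """
    # Try both common capitalizations so the helper works with plain dicts
    value = headers.get("content-length") or headers.get("Content-Length")
    try:
        return int(value)
    except (TypeError, ValueError):
        return None
```

requests uses a case-insensitive headers mapping, so either key would match there; the double lookup just keeps the helper safe with ordinary dicts too.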

Move faster with Replit

Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. You don't need to configure environments or manage packages.

Instead of just piecing together techniques, you can build a complete application. Agent 4 takes your idea and builds a working product by handling the code, databases, APIs, and deployment directly from your description. You can create tools like:

  • A disk usage analyzer that recursively calculates folder sizes with os.scandir() and displays the results in a human-readable format.
  • A batch processing utility that calculates the total size of all matching files using glob before starting a task.
  • A website monitoring tool that checks the content-length of remote assets to verify their integrity without a full download.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Getting a file's size is usually straightforward, but a few common errors can trip you up if you're not prepared for them.

  • Handling FileNotFoundError: This error occurs if the file path is incorrect or the file doesn't exist. Functions like os.path.getsize() will fail immediately. You can prevent crashes by wrapping your code in a try...except block to catch the FileNotFoundError and handle it gracefully, like logging a warning or skipping the file.
  • Avoiding PermissionError: You'll encounter a PermissionError if your script lacks the necessary rights to access a file. This often happens with system files or in restricted environments. Again, a try...except block is your best tool for catching this error and allowing your program to continue running without interruption.
  • Incorrect file size units: A classic mix-up is using 1000 instead of 1024 for conversions. Most operating systems and file managers report sizes in binary units, where one kilobyte (strictly, a kibibyte) is 1024 bytes, not 1000. The discrepancy grows with file size, so use powers of 1024 whenever your output should match what the OS reports.

Handling FileNotFoundError when checking file size

One of the most frequent issues you'll face is the FileNotFoundError. It's Python's way of telling you the file you're looking for doesn't exist. This error will immediately stop your script if you don't handle it. The following code demonstrates this problem in action.

import os

def get_file_size(filename):
    return os.path.getsize(filename)

size = get_file_size("nonexistent_file.txt")
print(f"File size: {size} bytes")

The code calls os.path.getsize() on "nonexistent_file.txt", a file that isn't there. Since the function can't find the file, it raises a FileNotFoundError and crashes the script. The following example shows how to manage this.

import os

def get_file_size(filename):
    if os.path.exists(filename):
        return os.path.getsize(filename)
    else:
        return None

size = get_file_size("nonexistent_file.txt")
if size is not None:
    print(f"File size: {size} bytes")
else:
    print("File does not exist")

To avoid a crash, you can check if a file exists before trying to read it. The updated code uses os.path.exists() to verify the file's presence first. If the file is found, os.path.getsize() returns its size. Otherwise, the function returns None, allowing your program to handle the missing file gracefully. This is crucial when dealing with files that might not exist, such as user inputs or temporary files.
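
One caveat with check-then-act: the file can disappear between the os.path.exists() call and the os.path.getsize() call (a race condition). The more idiomatic Python alternative is to attempt the operation and catch the failure. A sketch of that "EAFP" style, with a hypothetical function name:

```python
import os

def get_file_size_eafp(filename):
    """Attempt the call and handle the failure, instead of checking first."""
    try:
        return os.path.getsize(filename)
    except FileNotFoundError:
        # The file is missing (or vanished between calls); report that cleanly
        return None
```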

Avoiding PermissionError when accessing restricted files

A PermissionError arises when your script tries to access a file or directory without the necessary permissions. This often happens with protected system folders. The code below demonstrates how this error can crash your program when it encounters a restricted directory.

import os

def get_sizes(directory):
    sizes = {}
    for item in os.listdir(directory):
        path = os.path.join(directory, item)
        sizes[item] = os.path.getsize(path)
    return sizes

print(get_sizes("/root"))  # May cause PermissionError

The function attempts to read a protected directory with os.listdir(). When os.path.getsize() is called on an item inside, the script’s lack of access rights raises a PermissionError. The following example demonstrates a safer approach.

import os

def get_sizes(directory):
    sizes = {}
    for item in os.listdir(directory):
        path = os.path.join(directory, item)
        try:
            sizes[item] = os.path.getsize(path)
        except (PermissionError, OSError):
            sizes[item] = "Access denied"
    return sizes

print(get_sizes("/root"))  # Handles permission issues gracefully

The improved function wraps the os.path.getsize() call within a try...except block. This prevents the script from crashing when it hits a file you don't have permission to read. Instead of halting, it catches the PermissionError or OSError and simply records "Access denied" for that item. This approach is essential when your script needs to scan entire directories that might contain protected files, ensuring it can finish its job without interruption.

Incorrect file size units conversion

A common pitfall is confusing binary and decimal prefixes. Operating systems typically report sizes in binary units, where one kilobyte (strictly, a kibibyte) equals 1024 bytes, but it's easy to mistakenly divide by 1000. That small discrepancy compounds into significant miscalculations with larger files.

The following code demonstrates this issue, where dividing by 1000 instead of 1024 gives an inaccurate result.

import os

# Convert file size to MB
file_path = "database.db"
size_mb = os.path.getsize(file_path) / (1000 * 1000)
print(f"File size: {size_mb} MB")

The code divides the byte count by powers of 1000, which are decimal prefixes. Because storage is reported in binary units, this slightly inflates the result. The following example shows the binary conversion.

import os

# Convert file size to MB using binary units
file_path = "database.db"
size_mb = os.path.getsize(file_path) / (1024 * 1024)
print(f"File size: {size_mb:.2f} MB")

The corrected code uses the proper binary conversion factor. To get megabytes from bytes, you divide by 1024 * 1024. Using 1000 is a common mistake that leads to inaccurate sizes. This distinction is crucial for large files, where the difference becomes significant. You'll want to stick to powers of 1024 anytime you're formatting file sizes for display in logs or user interfaces to ensure the numbers are correct.
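
One way to keep the factors straight is to name them as constants. A small sketch (the constant and function names are my own) that makes the roughly 4.9% gap between the two conventions visible:

```python
BYTES_PER_MIB = 1024 * 1024   # binary megabyte (MiB), what the OS reports
BYTES_PER_MB = 1000 * 1000    # decimal megabyte (MB), the SI convention

def to_mib(size_bytes):
    """Convert a byte count to binary megabytes."""
    return size_bytes / BYTES_PER_MIB

def to_decimal_mb(size_bytes):
    """Convert a byte count to decimal megabytes."""
    return size_bytes / BYTES_PER_MB
```

For 100 MiB of data, the decimal division reports about 104.86 "MB", which is where mismatched numbers in user interfaces often come from.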

Real-world applications

Now that you can reliably get a file's size, you can use that skill for practical tasks like managing disk space and analyzing compression.

Finding the largest files in a directory using heapq

For finding just the top few largest files in a directory, Python's heapq module offers a more efficient approach than sorting the entire file list.

import os
import heapq

def find_largest_files(directory, n=5):
    file_sizes = []
    for root, _, files in os.walk(directory):
        for file in files:
            filepath = os.path.join(root, file)
            size = os.path.getsize(filepath)
            file_sizes.append((filepath, size))

    return heapq.nlargest(n, file_sizes, key=lambda x: x[1])

# Find the 3 largest files in the current directory
for file, size in find_largest_files('.', 3):
    print(f"{file}: {size} bytes")

This find_largest_files function identifies the biggest files within a directory. It uses os.walk() to recursively explore all subdirectories.

  • For every file found, it pairs the file's path with its size.
  • The heapq.nlargest() function then processes this list, efficiently selecting the top n files. It uses a lambda function as the key to focus specifically on the size for comparison.

This gives you the largest files without fully sorting the list: heapq.nlargest only tracks the current top n candidates, which keeps the selection step fast even for large directory trees.
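
If memory is also a concern, you can avoid building the full (path, size) list at all. This alternative sketch (the function name is my own) maintains a bounded min-heap of only the current top n entries while walking the tree:

```python
import os
import heapq

def find_largest_files_streaming(directory, n=5):
    """Yield the n largest (size, path) pairs, keeping only n entries in memory."""
    heap = []  # min-heap: heap[0] is the smallest of the current top n
    for root, _, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if len(heap) < n:
                heapq.heappush(heap, (size, path))
            elif size > heap[0][0]:
                # New file beats the smallest of the current top n; swap it in
                heapq.heapreplace(heap, (size, path))
    return sorted(heap, reverse=True)  # largest first
```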

Calculating compression efficiency of files

By comparing a file's size before and after compression, you can calculate exactly how much space an algorithm saves you. This check_compression_ratio function does just that. It creates a compressed copy of a file using gzip and then grabs the sizes of both the original and the new compressed file with os.path.getsize().

With both sizes in hand, the function calculates the efficiency as a percentage. This tells you precisely how much storage you've saved. This kind of analysis is invaluable when you're trying to optimize storage or speed up data transfers, as it helps you see which files benefit most from compression.

import os
import gzip
import shutil

def check_compression_ratio(filename):
    compressed_name = filename + ".gz"
    with open(filename, 'rb') as f_in:
        with gzip.open(compressed_name, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)

    original_size = os.path.getsize(filename)
    compressed_size = os.path.getsize(compressed_name)
    ratio = (original_size - compressed_size) / original_size * 100

    return original_size, compressed_size, ratio

# Check compression efficiency of a text file
orig, comp, ratio = check_compression_ratio("example.txt")
print(f"Original: {orig} bytes, Compressed: {comp} bytes")
print(f"Space saving: {ratio:.1f}%")

The check_compression_ratio function offers a practical workflow for measuring compression. It reads an original file in binary mode ('rb') and uses the gzip module to create a compressed version.

  • The shutil.copyfileobj function efficiently streams data from the original file to the compressed one, which is ideal for large files.
  • Once the .gz file is created, os.path.getsize retrieves the sizes of both files.
  • The final calculation shows the space-saving percentage, letting you quantify the compression's effectiveness.

Get started with Replit

Turn these techniques into a real tool. Tell Replit Agent: “Build a disk space analyzer that finds large files” or “Create a tool to check remote file sizes from a URL.”

Replit Agent writes the code, tests for errors, and deploys the app directly from your description. Start building with Replit.

Build your first app today

Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.
