How to read all files in a directory in Python
Discover how you can read all files in a directory with Python. Get tips, see real-world uses, and learn to debug common errors.
The ability to read all files in a directory is a core skill for Python developers who manage data or automate workflows. Python provides powerful modules to simplify this common task.
In this article, we'll cover several techniques to list and read files from a directory. We'll also share practical tips, real-world applications, and essential debugging advice to help you confidently master directory operations.
Using os.listdir() to get all files
```python
import os

directory = "sample_dir"
files = os.listdir(directory)

for file in files:
    print(file)
```

Output:

```
file1.txt
file2.csv
data.json
image.png
subdirectory
```
The os.listdir() function is a straightforward way to get a list of all entries within a given directory. It returns a Python list containing the names of everything inside, but it's important to note a few things:
- It includes both files (like `file1.txt`) and subdirectories (like `subdirectory`).
- The list is unsorted and returns only the base names, not the full paths.
Because you only get the names, you'll need to join the original directory path to each item before you can read or modify it.
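For example, here is a minimal self-contained sketch of that join step (it builds its own temporary directory with a hypothetical `file1.txt`, so the names are illustrative):

```python
import os
import tempfile

# Self-contained setup: a throwaway directory containing one file
directory = tempfile.mkdtemp()
open(os.path.join(directory, "file1.txt"), "w").close()

# os.listdir() returns bare names; join each to the directory before opening
paths = [os.path.join(directory, name) for name in os.listdir(directory)]
print(paths)
```

Each entry in `paths` is now a full path that `open()` can resolve regardless of the script's working directory.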
Common approaches for directory traversal
While os.listdir() is useful for simple listings, Python’s os.walk(), glob, and pathlib modules offer more powerful ways to navigate directories and find specific files.
Using os.walk() for recursive traversal
```python
import os

directory = "sample_dir"

for root, dirs, files in os.walk(directory):
    for file in files:
        file_path = os.path.join(root, file)
        print(file_path)
```

Output:

```
sample_dir/file1.txt
sample_dir/file2.csv
sample_dir/data.json
sample_dir/image.png
sample_dir/subdirectory/nested_file.txt
```
The os.walk() function is your go-to for recursively exploring a directory. It generates a tuple for each directory it enters, including the top one. This makes it perfect for finding all files, even those nested in subdirectories. For each level, it provides:
- `root`: The path to the current directory.
- `dirs`: A list of subdirectories within it.
- `files`: A list of files within it.
By combining root and a file name with os.path.join(), you get the full path needed to access each file directly.
Using glob module for pattern matching
```python
import glob
import os

directory = "sample_dir"
file_pattern = os.path.join(directory, "*.*")
files = glob.glob(file_pattern)

for file in files:
    print(file)
```

Output:

```
sample_dir/file1.txt
sample_dir/file2.csv
sample_dir/data.json
sample_dir/image.png
```
The glob module is your best bet for finding files that match a specific pattern, using familiar Unix-style wildcards. The glob.glob() function takes a pattern and returns a list of full paths, so you don't need to join them manually.
- The `*` wildcard matches any sequence of characters. In the example, `"*.*"` finds all files with an extension.
- This approach is more direct than `os.walk()` if you don't need to search subdirectories, making it ideal for simple filtering.
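If you do need to reach into subdirectories, glob can handle that too: passing `recursive=True` makes the `**` wildcard descend into nested folders. A small self-contained sketch (the directory and file names here are illustrative):

```python
import glob
import os
import tempfile

# Self-contained setup: a directory with one nested .txt file
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "subdirectory"))
open(os.path.join(base, "subdirectory", "nested_file.txt"), "w").close()

# With recursive=True, ** matches any number of directory levels (including zero)
matches = glob.glob(os.path.join(base, "**", "*.txt"), recursive=True)
print(matches)
```

This gives you `os.walk()`-style depth with glob-style pattern filtering in a single call.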
Using pathlib for modern path handling
```python
from pathlib import Path

directory = Path("sample_dir")

for file_path in directory.iterdir():
    if file_path.is_file():
        print(file_path)
```

Output:

```
sample_dir/file1.txt
sample_dir/file2.csv
sample_dir/data.json
sample_dir/image.png
```
The pathlib module offers a modern, object-oriented way to handle filesystem paths. Instead of working with strings, you create Path objects that have useful methods, which can make your code cleaner and more intuitive.
- The `iterdir()` method iterates over all items in the directory.
- Each item is a `Path` object, so you can call methods like `is_file()` directly on it to filter out subdirectories.
- This approach avoids manual string joining and often leads to more readable code than older methods.
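pathlib also has its own pattern matching: `Path.glob()` filters by a wildcard, and `Path.rglob()` does the same recursively. A brief self-contained sketch with illustrative file names:

```python
from pathlib import Path
import tempfile

# Self-contained setup: one .txt file and one .csv file
base = Path(tempfile.mkdtemp())
(base / "notes.txt").write_text("hello")
(base / "table.csv").write_text("a,b")

# Path.glob() yields matching Path objects; rglob() would also search subfolders
txt_files = sorted(base.glob("*.txt"))
print([p.name for p in txt_files])
```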
Advanced techniques and optimizations
Now that you know how to find files, you can build on those skills to read their contents safely, filter them by type, and process them efficiently.
Reading file contents with context managers
```python
import os

directory = "sample_dir"

for filename in os.listdir(directory):
    file_path = os.path.join(directory, filename)
    if os.path.isfile(file_path):
        # Note: binary files (like .png) may raise UnicodeDecodeError in text mode
        with open(file_path, 'r') as file:
            content = file.read()
            print(f"{filename}: {content[:20]}...")
```

Output:

```
file1.txt: This is the content...
file2.csv: name,age,city...
data.json: {"key": "value"...
image.png: [binary content]...
```
Once you have a file path, reading its contents safely is the next step. Using a with open(...) statement, known as a context manager, is the standard way to handle files in Python. It’s a crucial practice because it automatically closes the file for you, which prevents errors and resource leaks.
- Before reading, it's wise to confirm the path points to a file using `os.path.isfile()`.
- Inside the `with` block, you can use methods like `.read()` to access the file's content without worrying about cleanup.
Filtering files by extension
```python
import os

directory = "sample_dir"
extension = ".txt"
txt_files = [f for f in os.listdir(directory)
             if f.endswith(extension)]

print(f"Found {len(txt_files)} {extension} files:")
for file in txt_files:
    print(file)
```

Output:

```
Found 1 .txt files:
file1.txt
```
Often, you'll only need files of a specific type. A list comprehension offers a concise way to filter the results from os.listdir(). This technique builds a new list containing only the items that meet your criteria.
- The key is the `endswith()` string method, which checks if a filename ends with a certain suffix, like `".txt"`.
- This approach is efficient and highly readable, making it a popular choice for simple filtering tasks without needing to import other modules.
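A handy detail: `endswith()` also accepts a tuple of suffixes, so a single check can match several extensions at once. A self-contained sketch with illustrative file names:

```python
import os
import tempfile

# Self-contained setup: three files with different extensions
base = tempfile.mkdtemp()
for name in ("a.txt", "b.csv", "c.png"):
    open(os.path.join(base, name), "w").close()

# One endswith() call covers every suffix in the tuple
wanted = (".txt", ".csv")
matches = sorted(f for f in os.listdir(base) if f.endswith(wanted))
print(matches)
```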
Using generators for memory-efficient processing
```python
import os

def file_reader(directory):
    for filename in os.listdir(directory):
        filepath = os.path.join(directory, filename)
        if os.path.isfile(filepath):
            yield filepath, filename

for filepath, filename in file_reader("sample_dir"):
    print(f"Processing {filename}")
```

Output:

```
Processing file1.txt
Processing file2.csv
Processing data.json
Processing image.png
```
When you're dealing with a large number of files, creating a list of all file paths at once can consume a lot of memory. A generator function offers a more efficient solution. Instead of returning a complete list, it uses the yield keyword to produce one item at a time, right when you need it.
- The function pauses its execution after each `yield` and resumes from the same spot on the next iteration.
- This approach is ideal for large directories because it processes files one by one, keeping memory usage low.
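If you want lazy iteration without writing your own generator, the standard library's `os.scandir()` already returns one. It yields `DirEntry` objects whose cached type information usually lets `is_file()` avoid an extra `stat()` call per entry. A self-contained sketch:

```python
import os
import tempfile

# Self-contained setup: one file and one subdirectory
base = tempfile.mkdtemp()
open(os.path.join(base, "file1.txt"), "w").close()
os.mkdir(os.path.join(base, "subdirectory"))

# os.scandir() is an iterator of DirEntry objects; use it as a context
# manager so the underlying directory handle is closed promptly
names = []
with os.scandir(base) as entries:
    for entry in entries:
        if entry.is_file():
            names.append(entry.name)
print(names)
```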
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. You can describe what you want to build, and its AI capabilities help bring your idea to life—complete with a user interface, backend logic, and deployment.
The file handling techniques you've learned can be the foundation for powerful tools. With Replit Agent, you can turn these concepts into production-ready applications simply by describing them. The agent can build, test, and deploy entire projects from a single prompt.
For example, you could use Replit Agent to build:
- A log file analyzer that uses `os.walk()` to recursively scan a project directory, read all `.log` files, and generate a consolidated error report.
- A batch data processor that uses `glob` to find all CSV files in a directory, reads their contents, and aggregates the data into a summary dashboard.
- A digital asset manager that uses `pathlib` to organize files into subdirectories based on their extension or creation date.
Describe your next project, and Replit Agent will write the code, handle dependencies, and deploy it for you, all within your browser.
Common errors and challenges
Navigating directories can sometimes lead to common pitfalls like permission errors, missing folders, or incorrect file path references.
You might encounter a PermissionError when your script tries to access a directory it doesn't have read permissions for, which is common in protected system folders. To prevent your program from crashing, you can wrap your directory traversal code in a try...except PermissionError block. This allows your script to gracefully skip the inaccessible directory and continue its work.
Attempting to list files in a directory that doesn't exist will raise a FileNotFoundError. You can handle this by wrapping your os.listdir() call in a try...except FileNotFoundError block. A more direct approach is to first check if the path is valid with os.path.exists(), ensuring you only try to read from directories that actually exist.
A frequent mistake is trying to open a file using only the name returned by os.listdir(). Since this function provides just the base filename and not the full path, your script will fail unless it's running from inside that same directory.
- The problem: Calling `open('report.txt')` will look for the file in the current working directory, not the target directory.
- The solution: Always construct the full path by joining the directory path with the filename using `os.path.join()` before attempting to open or process it.
Handling permission errors when traversing directories
Traversing system directories often triggers a PermissionError because your script lacks the required access rights. This is a common issue when working with protected locations like /var/log. The following code demonstrates what happens when you run into this problem.
```python
import os

directory = "/var/log"

for root, dirs, files in os.walk(directory):
    for file in files:
        file_path = os.path.join(root, file)
        with open(file_path, 'r') as f:
            print(f"Content: {f.read()[:10]}")
```
This code fails because it tries to open() and read every file, including protected system logs. The operating system denies access, which raises a PermissionError and halts the script. The following example demonstrates how to prevent this crash.
```python
import os

directory = "/var/log"

for root, dirs, files in os.walk(directory):
    for file in files:
        file_path = os.path.join(root, file)
        try:
            with open(file_path, 'r') as f:
                print(f"Content: {f.read()[:10]}")
        except PermissionError:
            print(f"Permission denied: {file_path}")
```
By wrapping the file operation within a try...except PermissionError block, the script can safely handle protected files. If an error occurs, the except block catches it and prints a warning, allowing the loop to continue to the next file. This prevents a single inaccessible file from halting your entire program. It's a crucial pattern to use when scanning directories where you don't control all the permissions, such as system folders or user-generated content.
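One related subtlety: the `try...except` above guards `open()`, but errors raised while listing a directory itself are silently swallowed by `os.walk()` unless you pass its `onerror` callback, which lets you log them instead. A minimal sketch that triggers the callback with a path we assume does not exist (the same hook fires for permission failures while listing a directory):

```python
import os

failed = []

def log_error(error):
    # os.walk() hands the OSError to this callback instead of raising it,
    # so a bad directory is recorded and the traversal simply moves on
    failed.append(error.filename)

# Walking a nonexistent top directory yields nothing and invokes the callback once
results = list(os.walk("no_such_directory_xyz", onerror=log_error))
print(results, failed)
```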
Dealing with non-existent directories using os.listdir()
A common mistake is assuming a directory path is always valid. If you pass a non-existent path to os.listdir(), your program will crash with a FileNotFoundError. This often happens when dealing with user input. The following code demonstrates this exact issue.
```python
import os

user_input = input("Enter directory to list: ")
files = os.listdir(user_input)

for file in files:
    print(file)
```
This code directly passes the user's raw input to os.listdir(). If the entered path doesn't exist, the program crashes. The following example demonstrates how to handle this gracefully by first validating the path.
```python
import os

user_input = input("Enter directory to list: ")

try:
    files = os.listdir(user_input)
    for file in files:
        print(file)
except FileNotFoundError:
    print(f"Directory does not exist: {user_input}")
```
By wrapping the os.listdir() call in a try...except FileNotFoundError block, you can catch the error when a directory doesn't exist. This prevents the program from crashing. Instead of stopping, the except block runs, printing a user-friendly message. This is especially useful when the directory path comes from user input or a configuration file, where you can't guarantee its validity beforehand.
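The validate-first alternative mentioned earlier can be wrapped in a small helper. A sketch (the helper name and the deliberately missing path are hypothetical):

```python
import os

def list_dir_safely(path):
    # Look before you leap: check the path instead of catching the exception
    if not os.path.isdir(path):
        print(f"Directory does not exist: {path}")
        return []
    return os.listdir(path)

print(list_dir_safely("no_such_directory_xyz"))
```

Note that a pre-check can race with another process deleting the directory, so the `try...except` version remains the more airtight option; the check-first style simply reads more directly.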
Using incorrect file paths with os.listdir()
A classic mistake is using the filenames from os.listdir() directly. This function only returns the name, not the full path, causing an error if your script isn't in the same directory. The code below shows exactly what goes wrong.
```python
import os

directory = "data/logs"
files = os.listdir(directory)

for file in files:
    with open(file, 'r') as f:
        content = f.read()
        print(f"{file}: {content[:10]}")
```
The open() call searches for the file in the script's current directory, not inside the data/logs folder. This path mismatch causes a FileNotFoundError. The following example demonstrates the correct approach.
```python
import os

directory = "data/logs"
files = os.listdir(directory)

for file in files:
    file_path = os.path.join(directory, file)
    with open(file_path, 'r') as f:
        content = f.read()
        print(f"{file}: {content[:10]}")
```
The fix is to always build the full path. By using os.path.join(directory, file), you combine the directory path with the filename returned by os.listdir(). This creates a complete, correct path that open() can find. This simple step prevents FileNotFoundError and ensures your script reliably accesses files, no matter where it's run from. It's a crucial habit to adopt whenever you're iterating over directory contents.
Real-world applications
With the fundamentals and error handling covered, you can apply these skills to practical tasks like finding large files and organizing directories.
Finding large files with os.walk() for disk cleanup
By pairing os.walk() with os.path.getsize(), you can create a simple script to locate large files hidden anywhere in a directory tree, making cleanup much easier.
```python
import os

def find_large_files(directory, threshold_mb=10):
    large_files = []
    for root, _, files in os.walk(directory):
        for file in files:
            file_path = os.path.join(root, file)
            size_mb = os.path.getsize(file_path) / (1024 * 1024)
            if size_mb > threshold_mb:
                large_files.append((file_path, size_mb))
    return large_files

results = find_large_files("sample_dir", 5)
for path, size in results:
    print(f"{path}: {size:.2f} MB")
```
This function, find_large_files, recursively scans a directory to find files exceeding a specified size. It combines a few key operations to get the job done efficiently.
- It uses `os.walk()` to traverse the entire directory tree, ensuring no file is missed.
- For each file, `os.path.getsize()` retrieves its size in bytes, which the code then converts to megabytes.
- A simple comparison checks if the file's size is greater than the `threshold_mb` you provide.
Finally, it returns a list containing the path and size of every file that meets the criteria.
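Because the function returns plain `(path, size_mb)` tuples, ranking the results by size is a one-liner with `sorted()`. A sketch using hypothetical data in that same shape:

```python
# Hypothetical (path, size_mb) tuples in the shape find_large_files() returns
results = [("logs/app.log", 42.5), ("media/video.mp4", 310.0), ("db/dump.sql", 95.2)]

# Sort on the size element, largest first, so the biggest cleanup wins come up top
ranked = sorted(results, key=lambda item: item[1], reverse=True)
for path, size in ranked:
    print(f"{path}: {size:.2f} MB")
```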
Organizing files by type with shutil and os.makedirs()
You can bring order to a messy directory by using os.makedirs() to create subfolders for different file types and the shutil module to copy each file into its correct place.
```python
import os
import shutil

for file in os.listdir("sample_dir"):
    file_path = os.path.join("sample_dir", file)
    ext = os.path.splitext(file)[1].lower()

    # Determine category based on extension
    category = None
    if ext in ['.jpg', '.png', '.gif']:
        category = 'images'
    elif ext in ['.pdf', '.txt', '.csv']:
        category = 'documents'

    if category:
        os.makedirs(f"organized/{category}", exist_ok=True)
        shutil.copy2(file_path, f"organized/{category}/{file}")
        print(f"Copied {file} to {category} folder")
```
This script automates file organization by looping through a directory. For each file, it uses os.path.splitext() to isolate the extension and then assigns it to a category like 'images' or 'documents'.
- The `os.makedirs()` function creates a destination folder for the category. Using `exist_ok=True` prevents the script from crashing if the folder already exists.
- Finally, `shutil.copy2()` copies the file into its new home, preserving metadata such as the original modification time.
Get started with Replit
Now, turn these concepts into a real tool. Tell Replit Agent to “build a script that finds all JPGs over 10MB” or “create a dashboard that aggregates sales data from all CSV files in a directory.”
The agent writes the code, handles testing, and deploys the app for you. It turns your description into a finished product. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.