How to read a folder in Python

Learn how to read a folder in Python. Explore different methods, tips, real-world applications, and common error debugging.

How to read a folder in Python
Published on: 
Tue
Apr 21, 2026
Updated on: 
Wed
Apr 22, 2026
The Replit Team

The ability to read folders in Python is a core skill for developers. It lets you automate file management, process data collections, and build applications that interact with the file system.

Here, you'll learn several techniques, from basic os.listdir() to advanced pathlib methods. We'll cover practical tips, real-world applications, and advice to debug common issues.

Using os.listdir() to list folder contents

import os
folder_path = "sample_folder"
files = os.listdir(folder_path)
print(f"Contents of {folder_path}:")
for item in files:
print(item)--OUTPUT--Contents of sample_folder:
file1.txt
file2.txt
subfolder
image.jpg

The os.listdir() function offers a straightforward way to get the contents of a directory. It returns a Python list where each item is a string representing the name of a file or subfolder inside the specified path. As you can see from the output, it captures everything—from file1.txt to subfolder—without distinction.

This approach is ideal for quick and simple directory scans. However, it only provides the base names, not the full paths. To work with these items, you'd need to join them with the original folder path. It also doesn't tell you whether an item is a file or a directory, requiring further checks for more complex tasks.

Basic folder reading techniques

To overcome the limitations of os.listdir(), you can use more robust methods that provide detailed file information and advanced filtering capabilities.

Using os.scandir() for detailed file information

import os
with os.scandir('sample_folder') as entries:
for entry in entries:
file_type = "File" if entry.is_file() else "Directory"
print(f"{entry.name}: {file_type}")--OUTPUT--file1.txt: File
file2.txt: File
subfolder: Directory
image.jpg: File

For more demanding tasks, os.scandir() is a significant upgrade. It returns an iterator of os.DirEntry objects, which hold detailed file information right from the start. This makes it much more efficient than combining os.listdir() with separate calls to check file types.

  • Performance: It retrieves file type and other attributes during the initial directory scan, reducing system overhead.
  • Convenience: You can directly use methods like entry.is_file() and access attributes such as entry.name on each object, simplifying your code.

Working with pathlib for object-oriented path handling

from pathlib import Path
folder = Path('sample_folder')
for item in folder.iterdir():
if item.is_file():
print(f"File: {item.name} ({item.suffix})")
else:
print(f"Directory: {item.name}")--OUTPUT--File: file1.txt (.txt)
File: file2.txt (.txt)
Directory: subfolder
File: image.jpg (.jpg)

The pathlib module provides a modern, object-oriented way to interact with filesystem paths. Instead of treating paths as strings, it wraps them in Path objects that come with helpful methods and attributes, making your code more readable and less error-prone.

  • The folder.iterdir() method efficiently iterates over directory contents, similar to os.scandir().
  • Each item is a full path object, so you can directly call methods like item.is_file().
  • Attributes like item.suffix give you easy access to file extensions without manual string manipulation.

Using glob to filter files by pattern

import glob
import os

txt_files = glob.glob(os.path.join('sample_folder', '*.txt'))
print("Text files found:")
for file in txt_files:
print(os.path.basename(file))--OUTPUT--Text files found:
file1.txt
file2.txt

The glob module is your go-to for finding files that match a specific pattern, much like using search in your terminal. It uses wildcards to create flexible search criteria, letting you filter directory contents without manual checks.

  • The glob.glob() function returns a list of paths that match a specified pattern.
  • In the example, the asterisk (*) in '*.txt' is a wildcard that matches any sequence of characters, effectively selecting all files ending with the .txt extension.

Advanced folder manipulation and traversal

Going beyond basic listing and filtering, you can tackle more complex tasks by traversing directory trees, inspecting file metadata, and reading folders asynchronously for better performance.

Walking directory trees with os.walk()

import os
for root, dirs, files in os.walk('sample_folder'):
level = root.replace('sample_folder', '').count(os.sep)
indent = ' ' * 4 * level
print(f"{indent}{os.path.basename(root)}/")
for file in files:
print(f"{indent} {file}")--OUTPUT--sample_folder/
file1.txt
file2.txt
image.jpg
subfolder/
nested_file.txt

When you need to explore a directory and all its subdirectories, os.walk() is your best bet. It recursively walks through a directory tree from the top down. For each folder it visits, it yields a tuple containing the current path (root), a list of its subdirectories (dirs), and a list of its files (files).

  • This structure makes it easy to differentiate between folders and files without extra checks.
  • It’s perfect for tasks like finding all files with a specific extension or performing batch operations across a nested project.

Working with file metadata using os.stat()

import os
import time
folder_path = "sample_folder"
for item in os.listdir(folder_path):
item_path = os.path.join(folder_path, item)
stats = os.stat(item_path)
modified_time = time.ctime(stats.st_mtime)
print(f"{item}: Modified {modified_time}, Size {stats.st_size} bytes")--OUTPUT--file1.txt: Modified Wed Sep 22 10:34:15 2021, Size 256 bytes
file2.txt: Modified Wed Sep 22 11:15:20 2021, Size 128 bytes
subfolder: Modified Wed Sep 22 09:45:30 2021, Size 4096 bytes
image.jpg: Modified Wed Sep 22 12:05:40 2021, Size 24576 bytes

The os.stat() function lets you access detailed file metadata, which is perfect for tasks like organizing files by date or checking their size. Since it requires a full path, you'll often use it with os.path.join() to combine the folder path and item name. The function returns a stat result object containing various details:

  • stats.st_size gives you the file size in bytes.
  • stats.st_mtime provides the last modification time as a timestamp, which you can then format with a function like time.ctime().

Asynchronous directory reading with asyncio and aiofiles

import asyncio
import os
import aiofiles

async def list_folder_async(folder_path):
files = os.listdir(folder_path)
for file in files:
file_path = os.path.join(folder_path, file)
if os.path.isfile(file_path):
async with aiofiles.open(file_path, 'rb') as f:
size = len(await f.read())
print(f"{file}: {size} bytes")

asyncio.run(list_folder_async('sample_folder'))--OUTPUT--file1.txt: 256 bytes
file2.txt: 128 bytes
image.jpg: 24576 bytes

For I/O-heavy tasks like reading multiple files, asynchronous code can significantly boost performance. Using the asyncio library with aiofiles, you can read files concurrently instead of one by one. This prevents your application from freezing while it waits for the disk.

  • The aiofiles library provides non-blocking file operations, such as aiofiles.open().
  • Keywords async and await signal to Python that it can pause a task—like reading a file—and work on something else until the operation is complete, improving overall efficiency.

Move faster with Replit

Replit is an AI-powered development platform where you can skip setup and start coding instantly. It comes with Python dependencies pre-installed, so you don't have to worry about managing environments or packages.

The techniques in this article are powerful building blocks, but moving from individual functions to a finished product is a big leap. Agent 4 helps you make that jump.

Instead of piecing together techniques, describe the app you want to build and Agent 4 will take it from idea to working product:

  • A file organizer that uses os.walk() to scan a directory and sort files into folders based on their modification date.
  • A cleanup script that leverages glob to find and delete all temporary (.tmp) files within a project.
  • A disk usage reporter that uses pathlib and os.stat() to generate a summary of the largest files in your home directory.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Even with the right tools, you might run into a few common roadblocks, but they’re all straightforward to solve with the right approach.

Handling non-existent directories with os.listdir()

One of the most frequent errors is the FileNotFoundError. This happens when you try to use os.listdir() on a path that doesn't exist. To prevent your script from crashing, you can wrap the call in a try...except block or check if the path is valid with os.path.exists() before proceeding.

Fixing path construction with os.path.join()

Since functions like os.listdir() return only the names of files and folders, you can't work with them directly unless your script is in the same directory. You need the full path. While you might be tempted to build it by concatenating strings, this can fail across different operating systems. The correct method is to use os.path.join(), which automatically uses the right path separator (like / or \) for the host system.

Distinguishing files from directories with os.path.isfile()

When you iterate over directory contents, you get a mix of files and subfolders. If you try to perform a file-specific action—like reading its contents—on a directory, you'll get an IsADirectoryError. You can avoid this by using os.path.isfile() to confirm an item is a file before you try to process it. Remember that this check requires the full path, which is where os.path.join() comes in handy again.

Handling non-existent directories with os.listdir()

Attempting to list contents from a directory that doesn't exist is a guaranteed way to trigger a FileNotFoundError. This common mistake will halt your script instantly. The following code snippet shows exactly what happens when you provide an invalid path.

import os
folder_path = "nonexistent_folder"
files = os.listdir(folder_path)
print(f"Contents of {folder_path}:")
for item in files:
print(item)

The os.listdir() function asks the operating system for the folder's contents. When the folder isn't there, the program immediately crashes. The following example shows how to build a safeguard against this common issue.

import os
folder_path = "nonexistent_folder"
try:
files = os.listdir(folder_path)
print(f"Contents of {folder_path}:")
for item in files:
print(item)
except FileNotFoundError:
print(f"Error: The directory {folder_path} does not exist")

By wrapping the os.listdir() call in a try block, you can gracefully handle cases where a directory doesn't exist. If the path is invalid, Python raises a FileNotFoundError, which the except block catches. Instead of crashing, your script can then execute alternative code, like printing a helpful error message. This is a robust way to prevent unexpected stops, especially when your program relies on paths that might not always be present.

Fixing path construction with os.path.join()

When you need to access a file, it’s tempting to just add strings together with a slash. While this works on some systems, it isn't a reliable solution because operating systems use different path separators. The following example demonstrates this brittle approach.

import os
base_dir = "projects"
subfolder = "python_scripts"
filename = "main.py"
file_path = base_dir + "/" + subfolder + "/" + filename
print(f"Looking for file at: {file_path}")

Hardcoding the path separator with + "/" makes your code platform-dependent, and it's likely to fail on systems like Windows. A better approach creates paths that work anywhere. The next example shows how to do this correctly.

import os
base_dir = "projects"
subfolder = "python_scripts"
filename = "main.py"
file_path = os.path.join(base_dir, subfolder, filename)
print(f"Looking for file at: {file_path}")

Using os.path.join() is the correct way to build file paths because it's platform-independent. The function automatically inserts the right path separator—like / on macOS or \ on Windows—so you don't have to guess. This prevents your script from breaking when it runs on a different operating system. It’s a simple change that makes your code far more robust and portable, especially when collaborating with others or deploying to servers.

Distinguishing files from directories with os.path.isfile()

Since os.listdir() returns both files and folders without distinction, you can't assume every item is a file. Treating a directory like a file will crash your script. The following code demonstrates a situation where this can easily happen.

import os
folder_path = "project_folder"
files = os.listdir(folder_path)
python_files = []
for file in files:
if file.endswith('.py'):
python_files.append(file)
print(f"Python files: {python_files}")

The endswith() method only checks the item's name, not its type. A directory named archive.py would be incorrectly added to your list, causing errors down the line. See how to fix this in the next example.

import os
folder_path = "project_folder"
files = os.listdir(folder_path)
python_files = []
for file in files:
full_path = os.path.join(folder_path, file)
if os.path.isfile(full_path) and file.endswith('.py'):
python_files.append(file)
print(f"Python files: {python_files}")

The fix is to combine os.path.join() with os.path.isfile(). This two-step check ensures you don't accidentally treat a directory as a file. First, you construct the full path for an item. Then, you verify it’s a file before running checks like endswith('.py'). This approach is essential for preventing an IsADirectoryError, which occurs if you try to perform a file operation on a directory that happens to have a file-like name.

Real-world applications

With these techniques mastered, you can build practical scripts to organize your files or even find duplicates with functions like hashlib.md5().

Organizing files by extension

You can create a simple script to organize a folder by using os.path.splitext() to get a file's extension and shutil.move() to place it in a corresponding directory.

import os
import shutil

source_dir = "downloads"
for filename in os.listdir(source_dir):
if os.path.isfile(os.path.join(source_dir, filename)):
ext = os.path.splitext(filename)[1][1:].lower()
folder = os.path.join(source_dir, ext)
os.makedirs(folder, exist_ok=True)
shutil.move(os.path.join(source_dir, filename), os.path.join(folder, filename))
print(f"Moved {filename} to {ext}/")

This script automates file organization within a directory. It iterates through each item, first confirming it's a file with os.path.isfile() to avoid processing subfolders. For each valid file, the script performs a few key actions:

  • It extracts the file extension using os.path.splitext().
  • It creates a new directory named after that extension using os.makedirs(folder, exist_ok=True), which won't cause an error if the folder is already there.
  • Finally, it moves the file into the target folder with shutil.move().

Finding duplicate files with hashlib.md5()

You can also identify duplicate files by calculating a unique fingerprint for each one's content using the hashlib.md5() function.

import os
import hashlib

files_by_hash = {}
for root, _, files in os.walk("sample_folder"):
for filename in files:
filepath = os.path.join(root, filename)
with open(filepath, 'rb') as f:
file_hash = hashlib.md5(f.read()).hexdigest()
if file_hash in files_by_hash:
print(f"Duplicate found: {filepath} and {files_by_hash[file_hash]}")
else:
files_by_hash[file_hash] = filepath

This script identifies duplicate files by comparing their content, not just their names. It uses os.walk() to recursively scan a directory and then processes each file it finds.

  • For each file, it calculates a unique MD5 hash—a digital fingerprint—from its binary content using hashlib.md5().
  • A dictionary stores each hash and the path of the first file that generated it.
  • If a new file produces a hash that already exists in the dictionary, the script flags it as a duplicate and prints both paths.

Get started with Replit

Now, turn these concepts into a working utility. Describe your goal to Replit Agent, like "Build a tool that uses glob to count lines of code in all Python files" or "Create a script that uses os.stat() to rename new files based on their creation date."

The Agent writes the code, tests for errors, and can even deploy your app directly from the workspace. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.