How to read a file line by line in Python
Learn how to read a file line by line in Python. Discover various methods, tips, real-world applications, and how to debug common errors.

For data processing and log analysis, you often need to read a file line by line in Python. The language offers simple, efficient methods to handle large files without high memory use.
In this guide, you'll explore several techniques with practical examples. We'll also cover real-world applications and debugging advice to help you select the right method for your specific use case.
Using a for loop to read lines
with open('sample.txt', 'r') as file:
    for line in file:
        print(line.strip())

Output:
Hello
World
Python
Iterating directly on the file object is the most Pythonic approach for reading line by line. It's highly memory-efficient because it doesn't load the entire file into memory. Instead, the file object acts as an iterator, feeding one line at a time into the loop. This makes it perfect for handling large datasets without performance hits.
You'll notice the use of line.strip(). This is a crucial step to remove any leading or trailing whitespace. More importantly, it removes the invisible newline character that's included at the end of each line read from the file, ensuring clean output.
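To see exactly what strip() removes, here's a minimal, self-contained sketch that first writes a small sample.txt (the filename and contents are assumptions matching the article's example) and then compares the raw and stripped lines:

```python
# Write a tiny sample file first so the example is self-contained
with open('sample.txt', 'w') as f:
    f.write('Hello\nWorld\n')

raw, clean = [], []
with open('sample.txt', 'r') as file:
    for line in file:
        raw.append(line)            # keeps the trailing '\n'
        clean.append(line.strip())  # newline and surrounding whitespace removed

print(raw)    # ['Hello\n', 'World\n']
print(clean)  # ['Hello', 'World']
```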
Alternative file reading methods
While iterating with a for loop is highly efficient, methods like readlines() and readline() can offer more control for specific file handling tasks.
Using the readlines() method
with open('sample.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())

Output:
Hello
World
Python
The readlines() method reads the entire file at once and returns a list of strings, with each string representing a line. This approach gives you a complete list of lines to work with from the start.
- The main benefit is flexibility. You can immediately perform list operations, like accessing a specific line by its index.
- The significant drawback is memory usage. Since it loads the whole file into memory, it's not ideal for very large files.
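The index-access benefit mentioned above can be sketched as follows; the snippet recreates the article's sample.txt first so it runs on its own:

```python
# Recreate the sample file so the snippet is self-contained
with open('sample.txt', 'w') as f:
    f.write('Hello\nWorld\nPython\n')

with open('sample.txt', 'r') as file:
    lines = file.readlines()

# With the whole file in a list, you can jump straight to any line
second_line = lines[1].strip()
line_count = len(lines)
print(second_line, line_count)  # World 3
```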
Using readline() in a while loop
with open('sample.txt', 'r') as file:
    line = file.readline()
    while line:
        print(line.strip())
        line = file.readline()

Output:
Hello
World
Python
The readline() method offers a more manual way to read a file line by line. It fetches a single line, and the while loop continues as long as that line isn't empty. Once it reaches the end of the file, readline() returns an empty string, which evaluates to False and stops the loop.
- This approach is just as memory-efficient as a for loop, since only one line is stored at a time.
- It provides more control, allowing you to execute code between line reads.
- Remember to call readline() again inside the loop to advance to the next line and prevent an infinite loop.
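On Python 3.8 and later, the assignment expression (walrus) operator can fold the two readline() calls into one, which removes the risk of forgetting the second call. A minimal sketch against the same sample.txt:

```python
# Recreate the sample file so the snippet is self-contained
with open('sample.txt', 'w') as f:
    f.write('Hello\nWorld\nPython\n')

results = []
with open('sample.txt', 'r') as file:
    # := assigns and tests in one step, so readline() appears only once
    while line := file.readline():
        results.append(line.strip())
print(results)  # ['Hello', 'World', 'Python']
```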
Using list comprehension
with open('sample.txt', 'r') as file:
    lines = [line.strip() for line in file]
print(lines)

Output:
['Hello', 'World', 'Python']
List comprehension offers a more compact way to achieve a similar result to readlines(). It lets you build a list from an iterable in a single, readable line. Here, it iterates through the file, applies line.strip() to each line, and stores the clean lines in a list.
- This method is highly expressive and very Pythonic, combining the loop and processing into one step.
- Keep in mind that it loads the entire file into memory to create the list, so it's not ideal for extremely large files.
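The comprehension also accepts a condition, which is handy for dropping blank lines while you read. A sketch against a sample.txt that deliberately contains empty lines:

```python
# Sample file with blank lines, written so the example is self-contained
with open('sample.txt', 'w') as f:
    f.write('Hello\n\nWorld\n\nPython\n')

with open('sample.txt', 'r') as file:
    # The if clause filters out lines that are empty after stripping
    non_empty = [line.strip() for line in file if line.strip()]
print(non_empty)  # ['Hello', 'World', 'Python']
```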
Advanced file reading techniques
For more complex scenarios, Python provides advanced techniques that build on these fundamentals to handle errors, optimize memory, and process multiple files seamlessly.
Reading with error handling
try:
    with open('sample.txt', 'r') as file:
        for line in file:
            print(line.strip())
except FileNotFoundError:
    print("The file was not found.")
except IOError:
    print("Error reading the file.")

Output:
Hello
World
Python
When you're working with files, things can go wrong. Wrapping your code in a try...except block is a crucial practice for building resilient applications. This structure lets you anticipate and manage errors gracefully instead of letting your program crash.
- The except FileNotFoundError block runs if the specified file can't be found.
- The except IOError block catches more general input and output errors, like permission problems. (In Python 3, IOError is an alias for OSError.)
This gives you precise control over how your program responds to problems.
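One way to package this pattern is a small helper that returns None instead of raising when the file can't be read. The function name and its None-on-failure behavior are our own design choices here, not a standard library API:

```python
def read_lines_safely(path):
    """Return a list of stripped lines, or None if the file can't be read."""
    try:
        with open(path, 'r') as file:
            return [line.strip() for line in file]
    except FileNotFoundError:
        return None
    except OSError:  # IOError is an alias of OSError in Python 3
        return None

# Self-contained demo file
with open('sample.txt', 'w') as f:
    f.write('Hello\nWorld\n')

print(read_lines_safely('sample.txt'))        # ['Hello', 'World']
print(read_lines_safely('no_such_file.txt'))  # None
```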
Using generators for memory-efficient reading
def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

for line in read_large_file('sample.txt'):
    print(line)

Output:
Hello
World
Python
Generators offer a powerful way to create iterators for memory-efficient processing. The read_large_file function becomes a generator because it uses the yield keyword. Instead of returning all lines at once, it produces them one by one, pausing its state between each call.
- This approach is incredibly memory-efficient—it only holds a single line in memory at any given moment, making it ideal for massive files.
- It also encapsulates the file-reading logic into a clean, reusable function that you can use anywhere in your project.
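Because the generator yields plain strings, it composes naturally with further filtering. A sketch using a hypothetical app.log written on the spot:

```python
def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

# Hypothetical log file for demonstration
with open('app.log', 'w') as f:
    f.write('INFO start\nERROR disk full\nINFO done\nERROR timeout\n')

# Lines stream through the generator one at a time; only matches are kept
errors = [line for line in read_large_file('app.log') if line.startswith('ERROR')]
print(errors)  # ['ERROR disk full', 'ERROR timeout']
```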
Using the fileinput module
import fileinput

for line in fileinput.input(files=['sample.txt']):
    print(f"{fileinput.filename()}, line {fileinput.lineno()}: {line.strip()}")

Output:
sample.txt, line 1: Hello
sample.txt, line 2: World
sample.txt, line 3: Python
The fileinput module simplifies reading from multiple sources. It lets you iterate over lines from a sequence of files, treating them as a single input stream. This is especially useful for writing scripts that process several log files at once.
- It provides helpful metadata. You can use fileinput.filename() to see which file you're on and fileinput.lineno() for the cumulative line number.
- The module is memory-efficient because it reads one line at a time, just like iterating directly on a file object.
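Here's a self-contained sketch of the multi-file case, using two made-up input files; note how lineno() keeps counting across the file boundary:

```python
import fileinput

# Two hypothetical input files for demonstration
for name, text in [('a.txt', 'one\ntwo\n'), ('b.txt', 'three\n')]:
    with open(name, 'w') as f:
        f.write(text)

seen = []
with fileinput.input(files=['a.txt', 'b.txt']) as stream:
    for line in stream:
        # lineno() is cumulative across files; filename() tracks the source
        seen.append((fileinput.filename(), fileinput.lineno(), line.strip()))
print(seen)
```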
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. It’s designed to help you build software by simply describing what you want—complete with databases, APIs, and deployment.
For the file-reading techniques we've covered, Replit Agent can turn them into production-ready tools. It understands your descriptions and builds the application for you, handling the underlying code so you can focus on the outcome.
- Build a log analyzer that processes server logs line by line to identify and count specific error types.
- Create a data cleaning utility that reads large CSV files, validates each row, and outputs a sanitized version.
- Deploy a batch processing tool that reads a list of text files and applies a transformation to each line.
You don't have to write all the boilerplate code from scratch. Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically.
Common errors and challenges
Reading files line by line can sometimes lead to tricky errors involving file types, character encoding, and resource management.
Handling binary file reading errors
Attempting to read a binary file—like an image or an executable—in Python's default text mode (`'r'`) will often cause problems. The interpreter tries to decode the file's bytes as text, which usually results in a `UnicodeDecodeError` or garbled output because the byte sequences don't correspond to valid characters. To fix this, you must open the file in binary read mode by using `'rb'`. This tells Python to read the raw bytes without any decoding, which is the correct way to handle non-text files.
Resolving UnicodeDecodeError with proper encoding
A `UnicodeDecodeError` is a common roadblock that occurs when Python tries to read a file using the wrong character encoding. For example, if a file is saved with `UTF-16` encoding but you try to read it using your system's default encoding (often `UTF-8`), Python won't be able to interpret the bytes correctly. The solution is to specify the correct encoding when you open the file. You can do this by passing the `encoding` argument to the `open()` function, such as `open('data.csv', 'r', encoding='latin-1')`. Finding the right encoding can sometimes require a bit of investigation, but it's essential for correctly reading text from various sources.
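When you don't know a file's encoding up front, one simple investigative tactic is to try a short list of candidates in order. This helper is our own sketch, not a library function; note that latin-1 accepts any byte sequence, so it must come last:

```python
def read_with_fallback(path, encodings=('utf-8', 'utf-16', 'latin-1')):
    """Try each candidate encoding until one decodes the file cleanly."""
    for enc in encodings:
        try:
            with open(path, 'r', encoding=enc) as file:
                return enc, file.read()
        except UnicodeError:  # covers UnicodeDecodeError
            continue
    raise ValueError(f"none of {encodings} could decode {path}")

# Hypothetical file saved as UTF-16
with open('data.txt', 'w', encoding='utf-16') as f:
    f.write('café')

enc, text = read_with_fallback('data.txt')
print(enc, text)  # utf-16 café
```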
Avoiding resource leaks with proper file closing
Forgetting to close a file after you're done with it can lead to resource leaks. Each open file consumes a system resource, and if your program opens many files without closing them, it can eventually run out of available resources and crash. The most reliable way to prevent this is by using the `with open(...) as file:` syntax. This construct automatically ensures the file is closed as soon as the block is exited, even if an error occurs. While you could manually call `file.close()`, the `with` statement is safer and considered the standard practice in Python for managing file resources effectively.
Handling binary file reading errors
When you try to read a non-text file like an image using the default text mode ('r'), Python will raise an error. This happens because the file's bytes don't represent valid text characters. The code below shows this UnicodeDecodeError in action.
with open('image.jpg', 'r') as file:
    content = file.read()  # Raises UnicodeDecodeError on binary data
    print(content[:20])
The code opens the image in text mode ('r'), so the read() operation fails when it encounters binary data it can't decode. The following example demonstrates the correct way to handle this.
with open('image.jpg', 'rb') as file:
    content = file.read()
    print(f"File size: {len(content)} bytes")
To fix the error, you open the file in binary read mode using 'rb'. This tells Python to read the file as raw bytes instead of trying to decode it as text, which prevents the UnicodeDecodeError.
This method is essential whenever you're working with non-text files like images, executables, or compressed archives. The code correctly reads the bytes and prints the file's size, demonstrating how to handle binary data without causing a crash.
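A self-contained version of the same idea, using a small binary file written on the spot instead of a real image:

```python
# Write 256 raw bytes so the example doesn't depend on a real image file
with open('blob.bin', 'wb') as f:
    f.write(bytes(range(256)))

with open('blob.bin', 'rb') as file:
    content = file.read()  # bytes, not str — no decoding happens

print(f"File size: {len(content)} bytes")  # File size: 256 bytes
print(content[:4])                         # b'\x00\x01\x02\x03'
```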
Resolving UnicodeDecodeError with proper encoding
A UnicodeDecodeError occurs when Python reads a file with an encoding it doesn't expect—like trying to open a UTF-16 file with the default UTF-8. This mismatch garbles the text and causes the program to fail. The code below shows this error in action.
with open('international.txt', 'r') as file:
    content = file.read()  # Fails with a UnicodeDecodeError on non-UTF-8 files
    print(content)
The code opens the file in default text mode ('r'), but the file contains characters that don't conform to the expected UTF-8 encoding. This mismatch causes a UnicodeDecodeError. The next example shows how to handle this.
with open('international.txt', 'r', encoding='utf-16') as file:
    content = file.read()
    print(content)
The fix is to explicitly tell Python which encoding to use. By passing the encoding argument to the open() function, such as encoding='utf-16' for a file saved as UTF-16, you ensure the file is read correctly. This prevents the UnicodeDecodeError by matching the read operation with the file's actual format. You'll often run into this issue when handling files from different operating systems or those containing text in multiple languages, as they may not use your system's default encoding.
Avoiding resource leaks with proper file closing
Failing to close a file is a common mistake that leads to resource leaks. Your system allocates resources for every open file, and not releasing them can eventually cause your program to crash. The code below shows how this happens when the file is never explicitly closed.
file = open('data.txt', 'r')
content = file.read()
print(content)
# File never closed, potential resource leak
The code assigns the opened file to the file variable, but no instruction follows to release it. This leaves the file handle open, consuming system resources. The next example shows how to manage this properly.
with open('data.txt', 'r') as file:
    content = file.read()
    print(content)
# File automatically closed when exiting the with block
The solution is to use the with open(...) as file: statement, which is the standard for file handling in Python. This structure automatically closes the file once the block is exited, even if an error occurs. It prevents resource leaks by ensuring the file handle is always released without you needing to manually call file.close(). This makes your code cleaner, safer, and more reliable, especially when your application opens many files.
Real-world applications
Beyond the mechanics and error handling, reading files line by line is essential for practical tasks like parsing CSVs and analyzing text.
Processing a CSV file with csv module
Python's csv module offers a structured way to parse comma-separated files, and using csv.DictReader conveniently turns each row into a dictionary so you can access data by column headers.
import csv

with open('employees.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        if int(row['salary']) > 50000:
            print(f"{row['name']} - ${row['salary']}")
This snippet demonstrates how to filter a CSV file efficiently. By iterating directly over the csv.DictReader object, you process the file one row at a time, which keeps memory usage low even with large files.
- A key step is converting the salary to a number with int(). Since all data from a CSV is initially read as text, you'll need to do this for any numerical comparisons.
- The code then prints the details for employees who meet the salary condition.
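A self-contained version that writes a small, made-up employees.csv first, so you can see DictReader hand back each row as a dictionary of strings:

```python
import csv

# Hypothetical data for demonstration
with open('employees.csv', 'w', newline='') as f:
    f.write('name,salary\nAlice,60000\nBob,45000\nCara,75000\n')

high_earners = []
with open('employees.csv', 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        if int(row['salary']) > 50000:  # fields arrive as strings
            high_earners.append(row['name'])
print(high_earners)  # ['Alice', 'Cara']
```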
Text analysis with word frequency counting
Another practical application is performing text analysis, like counting word frequencies to quickly identify a document's key topics.
from collections import Counter

def count_word_frequency(filename):
    with open(filename, 'r') as file:
        text = file.read().lower()
        words = text.split()
        return Counter(words).most_common(5)

print(count_word_frequency('article.txt'))
This function efficiently processes a text file using Python's collections module. First, it reads the file and standardizes the text by converting it to lowercase with .lower(). Next, .split() breaks the content into a list of individual words.
The magic happens with Counter, a dictionary subclass designed for tallying. It takes the word list and builds a map of words to their counts. Finally, .most_common(5) extracts the five most frequent words, returning them as a list of (word, count) pairs.
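One caveat: str.split() treats "rocks." and "rocks" as different words. A common refinement is to tokenize with a regular expression instead; here's a sketch using a made-up article.txt (the regex and top_n parameter are our additions):

```python
from collections import Counter
import re

def count_word_frequency(filename, top_n=5):
    with open(filename, 'r') as file:
        text = file.read().lower()
    # Extract runs of letters so punctuation doesn't split the counts
    words = re.findall(r"[a-z]+", text)
    return Counter(words).most_common(top_n)

# Hypothetical article for demonstration
with open('article.txt', 'w') as f:
    f.write('Python is great. Python is simple. Python rocks.')

print(count_word_frequency('article.txt', 2))  # [('python', 3), ('is', 2)]
```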
Get started with Replit
Turn what you've learned into a real tool with Replit Agent. Just describe your goal, like "Build a CSV log analyzer that counts error types" or "Create a utility that finds and replaces text across multiple files."
The agent writes the code, tests for errors, and deploys the app, turning your description into a finished tool. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.