How to merge two sorted lists in Python
Learn how to merge two sorted lists in Python. Explore different methods, tips, real-world applications, and common error debugging.
.png)
You often need to merge two sorted lists in Python. This fundamental operation improves data processing efficiency and is a key part of many complex algorithms.
In this article, we'll cover several techniques to do this. We'll also share practical tips, show real-world applications, and offer advice for debugging common problems.
Basic merge with a while loop
def merge_sorted_lists(list1, list2):
result = []
i = j = 0
while i < len(list1) and j < len(list2):
if list1[i] <= list2[j]:
result.append(list1[i])
i += 1
else:
result.append(list2[j])
j += 1
result.extend(list1[i:])
result.extend(list2[j:])
return result
print(merge_sorted_lists([1, 3, 5, 7], [2, 4, 6, 8]))--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
This function, merge_sorted_lists, uses a classic two-pointer technique. The while loop iterates as long as there are elements in both lists to compare. The logic is straightforward:
- It compares the elements at pointers
iandj. - The smaller element is appended to the
resultlist. - The pointer for that list is then incremented.
Once the loop finishes, one of the lists might still have elements left. The extend method efficiently appends these leftovers. This step is safe because any remaining items are already sorted and are guaranteed to be larger than all elements already in the result.
Standard library approaches
While the manual approach is instructive, Python's standard library offers more concise and often more efficient ways to achieve the same result.
Using heapq.merge() for efficient merging
import heapq
list1 = [1, 3, 5, 7]
list2 = [2, 4, 6, 8]
merged = list(heapq.merge(list1, list2))
print(merged)--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
The heapq module offers a powerful tool for this task. Its heapq.merge() function takes any number of sorted iterables and combines them into a single, sorted output.
- It returns an iterator, which makes it very memory-efficient. Instead of building a new list in memory, it yields items one by one.
- Because it's an iterator, you need to convert the result to a list using
list()to see the final merged collection.
This method is concise, Pythonic, and scales well—making it a great choice for larger datasets.
Using sorted() with list concatenation
list1 = [1, 3, 5, 7]
list2 = [2, 4, 6, 8]
merged = sorted(list1 + list2)
print(merged)--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
This is arguably the most direct method. It first concatenates the two lists using the + operator, creating a single, unsorted list. The built-in sorted() function then sorts this new list from top to bottom.
- This approach is incredibly readable and requires minimal code.
- While simple, it's often less performant than other methods. It doesn't leverage the pre-sorted nature of the input lists, which can make it slower for very large datasets.
Merging with itertools module
from itertools import chain
list1 = [1, 3, 5, 7]
list2 = [2, 4, 6, 8]
merged = sorted(chain(list1, list2))
print(merged)--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
The itertools module offers chain(), a function that treats consecutive sequences as a single sequence. It’s a memory-efficient alternative to list concatenation because it creates an iterator instead of building a whole new list in memory. The combined, unsorted sequence is then passed to the sorted() function to produce the final result.
- This approach is conceptually similar to using the
+operator but can be more performant for very large lists since it avoids creating an intermediate list. - Like the concatenation method, it doesn't leverage the pre-sorted nature of the inputs.
Advanced techniques
If the standard library solutions don't quite fit, you can craft your own advanced merging functions for tailored performance and memory management.
Creating a generator-based merge function
def merge_generator(list1, list2):
i = j = 0
while i < len(list1) and j < len(list2):
if list1[i] <= list2[j]:
yield list1[i]
i += 1
else:
yield list2[j]
j += 1
yield from list1[i:]
yield from list2[j:]
print(list(merge_generator([1, 3, 5, 7], [2, 4, 6, 8])))--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
This function, merge_generator, adapts the logic from our first while loop example into a generator. Instead of building a new list, it uses the yield keyword to produce items one at a time, making it extremely memory-efficient.
- It processes elements lazily, meaning it only computes the next item when asked. This is perfect for handling massive datasets without consuming significant memory.
- The
yield fromsyntax provides a clean way to hand off the remaining items from either list once the main loop finishes.
Since it returns an iterator, you need to consume it—for instance, by wrapping it in list()—to view the complete merged list.
Implementing a divide-and-conquer merging strategy
import heapq
def merge_multiple_lists(lists):
if not lists:
return []
if len(lists) == 1:
return lists[0]
mid = len(lists) // 2
left = merge_multiple_lists(lists[:mid])
right = merge_multiple_lists(lists[mid:])
return list(heapq.merge(left, right))
print(merge_multiple_lists([[1, 5], [2, 6], [3, 7], [4, 8]]))--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
This function, merge_multiple_lists, uses a divide-and-conquer strategy to merge several sorted lists. It recursively splits the collection of lists in half until it's left with individual lists. This approach is highly efficient for combining many lists at once.
- The function finds the middle of the list of lists and splits it into
leftandrighthalves. - It calls itself on each half, continuing to divide them until only single lists remain.
- Finally, it uses
heapq.merge()to combine the sorted halves back together, working its way up the recursion stack.
Optimizing with collections.deque
from collections import deque
def merge_with_deque(list1, list2):
result = []
dq1, dq2 = deque(list1), deque(list2)
while dq1 and dq2:
result.append(dq1.popleft() if dq1[0] <= dq2[0] else dq2.popleft())
result.extend(dq1 or dq2)
return result
print(merge_with_deque([1, 3, 5, 7], [2, 4, 6, 8]))--OUTPUT--[1, 2, 3, 4, 5, 6, 7, 8]
For a clever performance boost, you can use collections.deque. A deque (double-ended queue) is optimized for adding and removing items from both ends, making it a great fit for our merging logic.
- The function converts the input lists into
dequeobjects. - Instead of tracking indices, it uses
popleft()to efficiently remove the smaller of the two leading elements. This operation is much faster than removing from the start of a standard list. - The remaining non-empty
dequeis then appended to the result.
Move faster with Replit
Replit is an AI-powered development platform where you can skip setup and start coding instantly. It comes with all Python dependencies pre-installed, so you don't have to worry about environment configuration.
Instead of just piecing together techniques, you can use Agent 4 to build complete applications from a simple description. It moves you from learning individual functions to creating working products by handling the code, databases, APIs, and deployment for you.
- A price comparison tool that merges sorted product lists from several vendors into a single, unified view.
- A log analysis utility that combines timestamped event logs from different microservices into one chronological stream.
- A dashboard widget that aggregates and merges sorted activity feeds from multiple social media accounts.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
Even with the right approach, you might run into a few common pitfalls when merging sorted lists if you're not careful.
Handling empty lists in merge_sorted_lists()
Our manual merge_sorted_lists() function handles empty lists gracefully without any special checks. If one or both lists are empty, the condition for the while loop is immediately false. The code then proceeds to the extend calls, which correctly append the contents of the non-empty list—or nothing at all—to the result.
Avoiding infinite loops in merge algorithms
An infinite loop is a classic bug that usually happens when you forget to increment the pointers i and j inside the while loop. If the pointers aren't advanced after an element is appended, the loop's condition never changes, causing it to run forever. Always ensure that either i += 1 or j += 1 is executed in every iteration.
Handling custom objects in merge algorithms
Merging lists of custom objects, like products or users, requires an extra step. Python doesn't inherently know how to compare two objects, so an expression like object1 <= object2 will raise a TypeError. To fix this, you need to tell Python how to compare them.
- You can define a comparison method, such as
__lt__()(less than), within your custom class. This gives all instances of your class a natural ordering. - Alternatively, when using a function like
sorted(), you can provide akeyargument with a lambda function to specify which attribute to sort by, such askey=lambda item: item.price.
Handling empty lists in merge_sorted_lists()
One of the most common errors is failing to handle empty lists. An unguarded index access on an empty list will immediately raise an IndexError, stopping your program. Consider the following implementation, which can run into this exact issue.
def merge_sorted_lists(list1, list2):
result = []
i = j = 0
while i < len(list1) and j < len(list2):
if list1[i] <= list2[j]: # IndexError if list1 is empty
result.append(list1[i])
i += 1
else:
result.append(list2[j])
j += 1
result.extend(list1[i:])
result.extend(list2[j:])
return result
print(merge_sorted_lists([], [2, 4, 6]))
The while loop condition is meant to prevent this, but the logic can still attempt to access an element in an empty list, triggering an IndexError. See how the corrected function below handles this edge case gracefully.
def merge_sorted_lists(list1, list2):
if not list1:
return list2.copy()
if not list2:
return list1.copy()
result = []
i = j = 0
while i < len(list1) and j < len(list2):
if list1[i] <= list2[j]:
result.append(list1[i])
i += 1
else:
result.append(list2[j])
j += 1
result.extend(list1[i:])
result.extend(list2[j:])
return result
print(merge_sorted_lists([], [2, 4, 6]))
The corrected merge_sorted_lists() adds explicit checks upfront. If either input list is empty, it immediately returns a copy of the other, bypassing the main loop entirely. This makes the function's handling of this edge case more robust and easier to read.
Using .copy() is a key detail. It prevents unintended side effects by ensuring that any later changes to the returned list don't affect the original input list you provided.
Avoiding infinite loops in merge algorithms
Avoiding infinite loops in merge algorithms
Infinite loops are a notorious issue, often caused by subtle logic errors rather than obvious mistakes. A flawed condition or a missed pointer update can trap your function in a cycle that never ends. See how a small oversight creates this problem below.
def merge_sorted_lists(list1, list2):
result = []
i = j = 0
while i < len(list1) or j < len(list2):
if i < len(list1) and (j >= len(list2) or list1[i] <= list2[j]):
result.append(list1[i])
# Forgot to increment i, will cause infinite loop
else:
result.append(list2[j])
j += 1
return result
# Will hang due to infinite loop
merge_sorted_lists([1, 3, 5], [2, 4, 6])
The loop never terminates because the pointer i isn't advanced after its corresponding element is appended. This oversight means the same comparison is made in every cycle, trapping the function. The corrected code below shows the fix.
def merge_sorted_lists(list1, list2):
result = []
i = j = 0
while i < len(list1) or j < len(list2):
if i < len(list1) and (j >= len(list2) or list1[i] <= list2[j]):
result.append(list1[i])
i += 1 # Fixed: properly increment i
else:
result.append(list2[j])
j += 1
return result
print(merge_sorted_lists([1, 3, 5], [2, 4, 6]))
The corrected function fixes the infinite loop by adding i += 1 after appending an element from the first list, ensuring the loop always makes progress. This type of error is especially common in loops with complex or conditions, where it's easy to miss an update for one of the branches. Always double-check that every path inside your loop moves closer to the exit condition to prevent your program from hanging indefinitely.
Handling custom objects in merge algorithms
Merging lists of custom objects, like user profiles or products, presents a unique challenge. Python can't automatically compare two complex objects, so using the standard <= operator will fail. The code below demonstrates how this triggers a TypeError.
class Person:
def __init__(self, name, age):
self.name = name
self.age = age
def merge_by_age(people1, people2):
result = []
i = j = 0
while i < len(people1) and j < len(people2):
# TypeError: '<=' not supported between instances
if people1[i] <= people2[j]:
result.append(people1[i])
i += 1
else:
result.append(people2[j])
j += 1
result.extend(people1[i:])
result.extend(people2[j:])
return result
p1 = [Person("Alice", 25), Person("Bob", 35)]
p2 = [Person("Charlie", 30), Person("Dave", 40)]
merge_by_age(p1, p2)
The comparison people1[i] <= people2[j] triggers a TypeError because Python doesn't have a built-in rule for ordering Person objects. See how the corrected implementation below provides the necessary context for a successful comparison.
class Person:
def __init__(self, name, age):
self.name = name
self.age = age
def __repr__(self):
return f"{self.name} ({self.age})"
def merge_by_age(people1, people2):
result = []
i = j = 0
while i < len(people1) and j < len(people2):
if people1[i].age <= people2[j].age:
result.append(people1[i])
i += 1
else:
result.append(people2[j])
j += 1
result.extend(people1[i:])
result.extend(people2[j:])
return result
p1 = [Person("Alice", 25), Person("Bob", 35)]
p2 = [Person("Charlie", 30), Person("Dave", 40)]
print(merge_by_age(p1, p2))
The corrected merge_by_age function resolves the TypeError by comparing a specific attribute of the Person objects. Instead of comparing the objects directly, the logic now uses people1[i].age <= people2[j].age. This provides a clear, comparable value for Python to sort by. You'll need to apply this principle whenever you merge lists of custom objects, as Python requires an explicit key for comparison unless a natural order is defined within the class itself.
Real-world applications
Mastering these merge techniques and their common pitfalls unlocks powerful solutions for many real-world data processing challenges.
Creating a timeline from multiple activity streams
You can efficiently create a unified, chronological timeline by merging separate, timestamped event streams from various sources.
import heapq
def create_unified_timeline(user_actions, system_events):
all_events = list(heapq.merge(user_actions, system_events, key=lambda x: x[0]))
return [f"{time}: {event}" for time, event in all_events]
user_actions = [(1000, "Login"), (1500, "Update profile")]
system_events = [(1200, "Server backup"), (2000, "Maintenance start")]
print(create_unified_timeline(user_actions, system_events))
The create_unified_timeline function uses heapq.merge to combine two lists of tuples. The key argument is what makes the logic work correctly.
- The expression
key=lambda x: x[0]tellsmergeto compare items using only the first element of each tuple, which is the timestamp. This ensures the final sequence is ordered properly. - A list comprehension then iterates over the merged result, formatting each tuple into a user-friendly string.
Implementing a k-way merge for external sorting
A k-way merge is a core technique in external sorting, used to combine multiple sorted data chunks when the entire dataset is too large to hold in memory.
import heapq
def k_way_merge(sorted_chunks):
heap = []
result = []
# Add first element from each chunk to the heap
for i, chunk in enumerate(sorted_chunks):
if chunk:
heapq.heappush(heap, (chunk[0], i, 0))
while heap:
val, chunk_idx, pos = heapq.heappop(heap)
result.append(val)
next_pos = pos + 1
if next_pos < len(sorted_chunks[chunk_idx]):
heapq.heappush(heap, (sorted_chunks[chunk_idx][next_pos], chunk_idx, next_pos))
return result
chunks = [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
print(k_way_merge(chunks))
This k_way_merge function uses a min-heap to efficiently merge multiple sorted lists, often called chunks. It works by keeping track of the smallest available element from each list.
- First, it initializes a heap with the first element from every chunk. The heap automatically keeps the smallest value at the top.
- The function then repeatedly extracts this smallest element, adds it to the result, and replaces it with the next element from the chunk it came from.
This process continues until all chunks are exhausted, guaranteeing a fully sorted final list.
Get started with Replit
Turn your knowledge into a working tool. Give Replit Agent a prompt like, “Merge two sorted price lists from CSVs” or “Combine two timestamped log files into one chronological stream.”
Replit Agent writes the code, tests for errors, and deploys your application directly from your browser. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.



