How to implement BFS in Python

Learn how to implement Breadth-First Search (BFS) in Python. Explore different methods, tips, real-world applications, and debugging common errors.

Published on:

Mon

Apr 6, 2026

Updated on:

Wed

Apr 8, 2026

The Replit Team

ON THIS PAGE

Example H2

Breadth-First Search (BFS) is a core algorithm for graph and tree traversal. Python's data structures make its implementation straightforward, which allows you to explore nodes layer by layer efficiently.

In this article, we'll guide you through a step-by-step BFS implementation. You'll find practical techniques, real-world applications, and debugging advice to help you confidently apply this essential algorithm to your own projects.

Basic BFS implementation

def bfs(graph, start): visited = set([start]) queue = [start] while queue: vertex = queue.pop(0) print(vertex, end=" ") for neighbor in graph[vertex]: if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) graph = {'A': ['B', 'C'], 'B': ['A', 'D', 'E'], 'C': ['A', 'F'], 'D': ['B'], 'E': ['B', 'F'], 'F': ['C', 'E']} bfs(graph, 'A')--OUTPUT--A B C D E F

This implementation hinges on two core components: a queue for managing nodes and a visited set to prevent reprocessing. Using a set for visited is deliberate—it offers near-instant lookups to check if a node has been seen, which is more efficient than searching a list.

The process works like this:

The while loop runs as long as the queue isn't empty.
queue.pop(0) removes the oldest node, ensuring a level-by-level traversal.
Each neighbor is checked against the visited set before being added to the queue, which avoids infinite loops in cyclic graphs.

Efficient BFS implementations

While the basic implementation works, you can significantly improve its performance and adapt it for specific tasks like binary tree traversal and level tracking.

Using `collections.deque` for better performance

from collections import deque def bfs(graph, start): visited = set([start]) queue = deque([start]) while queue: vertex = queue.popleft() # O(1) operation vs O(n) for lists print(vertex, end=" ") for neighbor in graph[vertex]: if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'F'], 'D': ['B'], 'F': ['C']} bfs(graph, 'A')--OUTPUT--A B C D F

Using a standard list as a queue isn't ideal. The pop(0) operation is inefficient because it requires shifting all remaining elements, which slows down traversal in large graphs. For a more performant solution, you should use collections.deque.

A deque is a double-ended queue optimized for adding and removing elements from both ends.
Its popleft() method is an O(1) operation, offering constant-time performance that doesn't degrade as the queue grows.
This simple switch makes your BFS algorithm much more scalable.

BFS for binary tree traversal

from collections import deque class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def bfs_tree(root): if not root: return [] result = [] queue = deque([root]) while queue: node = queue.popleft() result.append(node.val) if node.left: queue.append(node.left) if node.right: queue.append(node.right) return result root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3)) print(bfs_tree(root))--OUTPUT--[1, 2, 3, 4, 5]

BFS adapts naturally to binary trees. Instead of a dictionary, you'll typically use a TreeNode class to define the structure. The traversal logic remains the same, ensuring you visit all nodes at the current depth before moving to the next level.

The queue is initialized with the root node.
As each node is processed, its left and right children are added to the queue if they exist.
Unlike general graph traversal, a visited set is unnecessary since trees are acyclic. This simplifies the implementation.

BFS with level tracking

from collections import deque def level_order_bfs(graph, start): visited = set([start]) queue = deque([(start, 0)]) # (node, level) levels = {} while queue: vertex, level = queue.popleft() if level not in levels: levels[level] = [] levels[level].append(vertex) for neighbor in graph[vertex]: if neighbor not in visited: visited.add(neighbor) queue.append((neighbor, level + 1)) return levels graph = {'A': ['B', 'C'], 'B': ['A', 'D', 'E'], 'C': ['A', 'F'], 'D': ['B'], 'E': ['B'], 'F': ['C']} print(level_order_bfs(graph, 'A'))--OUTPUT--{0: ['A'], 1: ['B', 'C'], 2: ['D', 'E', 'F']}

Sometimes you need to know not just which nodes are reachable, but also how far away they are. This implementation modifies BFS to track the level, or distance, of each node from the start.

The queue now stores tuples of (node, level), starting with the initial node at level 0.
When a vertex is processed, its neighbors are added to the queue with an incremented level: (neighbor, level + 1).
A dictionary groups nodes by their level, making it easy to see the graph's structure layer by layer.

Advanced BFS applications

Beyond simple traversal, these optimized BFS methods unlock powerful applications, like calculating the shortest route in a graph or identifying its separate, interconnected sections.

Finding shortest paths in unweighted graphs

from collections import deque def shortest_path(graph, start, end): visited = {start: None} queue = deque([start]) while queue: current = queue.popleft() if current == end: path = [] while current: path.append(current) current = visited[current] return path[::-1] # Reverse to get path from start to end for neighbor in graph[current]: if neighbor not in visited: visited[neighbor] = current queue.append(neighbor) return None graph = {'A': ['B', 'C'], 'B': ['A', 'D', 'E'], 'C': ['A', 'F'], 'D': ['B'], 'E': ['B'], 'F': ['C']} print(shortest_path(graph, 'A', 'F'))--OUTPUT--['A', 'C', 'F']

BFS is ideal for finding the shortest path in unweighted graphs since it explores level by level, guaranteeing the first time you reach a node is via the shortest route. This function’s key is its visited tracker. It’s a dictionary, not a set, which lets you store the path by mapping each node to the one that led to it.

When the end node is found, the algorithm reconstructs the path by backtracking to the start node using the dictionary.
The path is then reversed to show the correct sequence from start to finish.

Finding connected components using BFS

from collections import deque def find_connected_components(graph): visited = set() components = [] for node in graph: if node not in visited: component = [] queue = deque([node]) visited.add(node) while queue: current = queue.popleft() component.append(current) for neighbor in graph[current]: if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) components.append(component) return components graph = {'A': ['B'], 'B': ['A'], 'C': ['D'], 'D': ['C'], 'E': ['F'], 'F': ['E']} print(find_connected_components(graph))--OUTPUT--[['A', 'B'], ['C', 'D'], ['E', 'F']]

This function is perfect for identifying disconnected subgraphs, or "connected components." It systematically finds all self-contained clusters of nodes within a larger, potentially fragmented graph.

The algorithm iterates through every node. If a node hasn't been seen, it triggers a new BFS traversal from that starting point.
This inner BFS explores all reachable nodes, grouping them into a single component.
The visited set is crucial—it ensures that once a node is part of a component, it isn't processed again, which effectively isolates each cluster.

Bi-directional BFS for faster search

from collections import deque def bidirectional_bfs(graph, start, end): if start == end: return [start] # Forward and backward searches forward_queue = deque([start]) backward_queue = deque([end]) forward_visited = {start: None} backward_visited = {end: None} while forward_queue and backward_queue: # Expand forward current = forward_queue.popleft() for neighbor in graph[current]: if neighbor not in forward_visited: forward_visited[neighbor] = current forward_queue.append(neighbor) if neighbor in backward_visited: # Intersection found return reconstruct_path(forward_visited, backward_visited, neighbor) # Expand backward current = backward_queue.popleft() for neighbor in graph[current]: if neighbor not in backward_visited: backward_visited[neighbor] = current backward_queue.append(neighbor) if neighbor in forward_visited: # Intersection found return reconstruct_path(forward_visited, backward_visited, neighbor) return None def reconstruct_path(forward_visited, backward_visited, intersection): # Reconstruct path from start to intersection forward_path = [] current = intersection while current: forward_path.append(current) current = forward_visited[current] forward_path.reverse() # Reconstruct path from intersection to end backward_path = [] current = backward_visited[intersection] while current: backward_path.append(current) current = backward_visited[current] # Combine paths return forward_path + backward_path graph = {'A': ['B', 'C'], 'B': ['A', 'D', 'E'], 'C': ['A', 'F'], 'D': ['B'], 'E': ['B', 'F'], 'F': ['C', 'E']} print(bidirectional_bfs(graph, 'A', 'F'))--OUTPUT--['A', 'C', 'F']

This clever optimization speeds up pathfinding by running two BFS searches at once—one from the start and another from the end. The searches expand toward each other and stop when they meet in the middle, which is far more efficient in large graphs than one long search.

The algorithm uses two separate queues and visited dictionaries to track each search independently.
An intersection is found when a node from one search appears in the other's visited dictionary.
The reconstruct_path helper then stitches the two paths together to form the complete route.

Move faster with Replit

Replit is an AI-powered development platform where you can start coding Python instantly. All the necessary dependencies come pre-installed, so you can skip the tedious setup and focus on building.

While learning algorithms like BFS is crucial, the real goal is building working applications. This is where Agent 4 helps you move from piecing together individual functions to creating complete products. Simply describe an app that uses these concepts, and Agent will build it:

A simple route planner that uses shortest_path logic to find the most efficient delivery sequence in a city grid.
A social network analysis tool that uses find_connected_components to identify separate user groups or communities.
A "six degrees of separation" calculator that finds the connection path between two nodes in a network, optimized with bi-directional search.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Implementing BFS is straightforward, but a few common pitfalls can easily derail your code.

Forgetting to mark nodes as visited before enqueueing
A classic mistake is waiting to mark a node as visited until after you’ve popped it from the queue. This can cause you to add the same node multiple times, leading to redundant work and, in graphs with cycles, an infinite loop. Always add a node to the visited set the moment you add it to the queue.
Handling missing nodes in the graph with .get()
If your code tries to access neighbors for a node that isn't in your graph dictionary, it will raise a KeyError and crash. You can avoid this by using the .get() method. Calling graph.get(vertex, []) safely returns an empty list for missing nodes, allowing your traversal to continue smoothly.
Performance issues with list instead of deque
While a Python list can technically work as a queue, its performance is poor for this task. The pop(0) operation is slow because it has to shift every remaining element. A deque is purpose-built for this job, offering a much faster popleft() method that ensures your algorithm remains efficient even with large graphs.

Forgetting to mark nodes as `visited` before enqueueing

It’s a classic mistake: waiting to mark a node as visited until after it’s popped from the queue. This causes the same node to be added multiple times, creating redundant work and risking infinite loops. The code below shows this error in action.

from collections import deque def buggy_bfs(graph, start): visited = set() queue = deque([start]) while queue: vertex = queue.popleft() visited.add(vertex) # Wrong: marking as visited after dequeuing print(vertex, end=" ") for neighbor in graph[vertex]: if neighbor not in visited: queue.append(neighbor) graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']} buggy_bfs(graph, 'A') # Will cause duplicate processing

Because visited.add(vertex) runs after dequeuing, a shared neighbor like D gets added to the queue by both B and C. This creates redundant work. The corrected implementation below prevents this with a small adjustment.

from collections import deque def correct_bfs(graph, start): visited = set([start]) # Mark start as visited immediately queue = deque([start]) while queue: vertex = queue.popleft() print(vertex, end=" ") for neighbor in graph[vertex]: if neighbor not in visited: visited.add(neighbor) # Mark as visited when discovered queue.append(neighbor) graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']} correct_bfs(graph, 'A')

The correct_bfs function fixes the issue by marking a node as visited the moment it's discovered, not after it's processed. By moving visited.add(neighbor) before queue.append(neighbor), you guarantee that each node is added to the queue only once.

This simple change is crucial. It prevents redundant processing and stops your algorithm from getting trapped in infinite loops in graphs with cycles, making your traversal both correct and efficient.

Handling missing nodes in the graph with `.get()`

A common oversight is assuming every node listed as a neighbor also exists as a key in the graph. If it doesn't, your code will crash with a KeyError when it tries to access graph[vertex]. The example below illustrates this issue.

from collections import deque def buggy_bfs(graph, start): visited = set([start]) queue = deque([start]) while queue: vertex = queue.popleft() print(vertex, end=" ") for neighbor in graph[vertex]: # KeyError if vertex not in graph if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['F']} # Missing 'D' and 'F' nodes buggy_bfs(graph, 'A')

The traversal adds D and F to the queue, but they don't exist as keys in the graph. When the loop tries to access neighbors for these nodes, it triggers a KeyError. The corrected implementation below prevents this crash.

from collections import deque def robust_bfs(graph, start): if start not in graph: # Check if start node exists return visited = set([start]) queue = deque([start]) while queue: vertex = queue.popleft() print(vertex, end=" ") for neighbor in graph.get(vertex, []): # Safely get neighbors if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['F']} # Missing 'D' and 'F' nodes robust_bfs(graph, 'A')

The robust_bfs function prevents this crash by using graph.get(vertex, []). This method safely fetches a node's neighbors. If the node isn't a key in the graph, it returns an empty list instead of raising a KeyError. This makes your traversal resilient to incomplete graph data, where a node might be referenced as a neighbor but not fully defined. It’s a simple change that ensures your algorithm runs smoothly without unexpected interruptions.

Performance issues with `list` instead of `deque`

Using a Python list as a queue might seem intuitive, but it's a performance trap. The pop(0) method is inefficient because it must shift every remaining element, which slows down your algorithm significantly in large graphs. The code below demonstrates this bottleneck.

def slow_bfs(graph, start): visited = set([start]) queue = [start] # Using list as queue (inefficient) while queue: vertex = queue.pop(0) # O(n) operation print(vertex, end=" ") for neighbor in graph[vertex]: if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) large_graph = {str(i): [str(i+1), str(i+2)] for i in range(100)} # slow_bfs(large_graph, '0') # This would be very slow

In slow_bfs, using queue.pop(0) on a list forces Python to re-index every remaining item. This bottleneck worsens as the graph grows, slowing traversal significantly. The corrected implementation below resolves this inefficiency by using a more suitable data structure.

from collections import deque def fast_bfs(graph, start): visited = set([start]) queue = deque([start]) # Using deque (efficient) while queue: vertex = queue.popleft() # O(1) operation print(vertex, end=" ") for neighbor in graph[vertex]: if neighbor not in visited: visited.add(neighbor) queue.append(neighbor) large_graph = {str(i): [str(i+1), str(i+2)] for i in range(100)} # fast_bfs(large_graph, '0') # This is much faster

The fast_bfs function swaps the inefficient list for a collections.deque. A deque is a double-ended queue built for this job. Its popleft() method is an O(1) operation, meaning it takes constant time regardless of the queue's size. This is a huge improvement over a list's pop(0), which is an O(n) operation. This switch is crucial for keeping your BFS algorithm fast and scalable, especially with large graphs.

Real-world applications

Beyond the code, BFS is a powerful tool for solving real-world challenges, from simulating web crawlers to finding a path through a maze.

A web crawler operates just like a graph traversal algorithm. You can think of the internet as a massive graph where each web page is a node and every hyperlink is an edge. BFS is a natural fit for this task because it explores the graph layer by layer, discovering pages that are one click away before moving on to pages that are two clicks away.

Starting from a single URL, a bfs crawler would visit that page, extract all its links, and add them to a queue. It then processes each link, systematically discovering new pages. A visited set is essential here to keep track of crawled URLs, preventing the crawler from getting stuck in loops or wasting resources on duplicate content.

Finding the shortest way out of a maze is another classic problem that BFS solves elegantly. You can model a maze as a grid where each cell is a node and a path between adjacent cells is an edge. The goal is to find the shortest sequence of moves from a start point to an end point.

This is a perfect use case for the shortest_path function discussed earlier. Because BFS explores all possible paths of length one, then all paths of length two, and so on, it's guaranteed to find the shortest route first. The algorithm uses a queue to explore open cells and a visited dictionary to reconstruct the path once the exit is found.

Using `bfs` to simulate a web crawler

Here’s how you can translate that theory into code, building a simple crawler that explores a dictionary of URLs and stops after a set number of pages. The web_crawler_bfs function takes a graph, a starting URL, and a max_urls limit to prevent it from running indefinitely. It’s a practical demonstration of how BFS can be controlled and applied to a finite dataset.

from collections import deque def web_crawler_bfs(site_graph, start_url, max_urls=10): queue = deque([start_url]) visited = set([start_url]) while queue and len(visited) <= max_urls: current_url = queue.popleft() print(f"Crawling: {current_url}") for link in site_graph.get(current_url, []): if link not in visited: visited.add(link) queue.append(link) return visited # Simulated website structure as a graph website = { 'homepage.com': ['homepage.com/about', 'homepage.com/products', 'homepage.com/contact'], 'homepage.com/about': ['homepage.com', 'homepage.com/team'], 'homepage.com/products': ['homepage.com', 'homepage.com/product/1'], 'homepage.com/contact': ['homepage.com'], 'homepage.com/team': ['homepage.com/about'], 'homepage.com/product/1': ['homepage.com/products'] } crawled_urls = web_crawler_bfs(website, 'homepage.com', 5) print(f"\nDiscovered URLs: {list(crawled_urls)}")

This code models a basic web crawler by treating the website as a graph. The web_crawler_bfs function systematically discovers pages in a practical demonstration of BFS.

It uses a deque to manage the list of URLs to visit, ensuring an efficient, level-by-level search.
A visited set prevents reprocessing the same URL, and the loop's condition with len(visited) caps the crawl's scope.
Using site_graph.get() makes the code robust by safely handling pages that don't have any defined outgoing links, preventing potential errors.

Finding the shortest path in a maze using `bfs`

This function adapts the shortest-path algorithm to solve a maze, treating the grid as a graph to find the most direct route from a start to an end coordinate.

The solve_maze function explores a 2D grid where 0 represents an open path and 1 is a wall. Instead of a pre-defined graph, it determines neighbors on the fly by checking adjacent cells.

A queue stores the (row, col) coordinates of cells to visit, ensuring the search expands one level at a time.
The visited dictionary is key—it not only tracks which cells have been seen but also stores the "parent" cell that led to the current one.
Once the end coordinate is reached, the function reconstructs the path by backtracking from the end to the start using the parent pointers stored in visited.
If the queue runs out before the end is found, it means no path exists, and the function returns an empty list.

from collections import deque def solve_maze(maze, start, end): rows, cols = len(maze), len(maze[0]) queue = deque([(start[0], start[1])]) visited = {start: None} # Store parent to reconstruct path directions = [(0, 1), (1, 0), (0, -1), (-1, 0)] # Right, down, left, up while queue: row, col = queue.popleft() if (row, col) == end: # Reconstruct path path = [] current = (row, col) while current: path.append(current) current = visited[current] return path[::-1] # Reverse to get path from start to end for dr, dc in directions: new_row, new_col = row + dr, col + dc new_pos = (new_row, new_col) if (0 <= new_row < rows and 0 <= new_col < cols and maze[new_row][new_col] == 0 and new_pos not in visited): visited[new_pos] = (row, col) queue.append(new_pos) return [] # No path found # Simple maze (0 = open path, 1 = wall) maze = [ [0, 0, 1, 0], [1, 0, 1, 0], [0, 0, 0, 0], [0, 1, 1, 0] ] start = (0, 0) end = (3, 3) path = solve_maze(maze, start, end) print(f"Path from {start} to {end}:") for i, pos in enumerate(path): print(f"Step {i}: {pos}")

This solve_maze function navigates a grid to find the most direct path from start to end. It uses a queue to explore cells layer by layer, which guarantees the first path it finds is the shortest one.

The algorithm checks four potential moves from each cell using the directions list.
A move is only valid if it’s within the grid, isn’t a wall (a 1), and hasn’t already been visited.
The visited dictionary logs the path taken, creating a breadcrumb trail that’s retraced to build the final solution once the end is reached.

Get started with Replit

Turn these algorithms into a real tool. Just tell Replit Agent what you want: “a web app that finds the shortest route on a grid map” or “a utility to identify separate user groups in a network.”

It will write the code, test for errors, and deploy your application, handling the tedious parts of development for you. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.

Get started for free

Follow @Replit