How to use multiprocessing in Python

Learn how to use multiprocessing in Python. This guide covers different methods, tips, real-world applications, and debugging common errors.

Published on: Tue, Mar 17, 2026
Updated on: Tue, Mar 24, 2026
The Replit Team

Python's multiprocessing module lets you run tasks in parallel across multiple CPU cores, bypassing the Global Interpreter Lock (GIL) that limits threads to one core at a time. For CPU-bound work, this can dramatically improve your application's performance.

Here, you'll explore core techniques, practical tips, and real-world applications. You'll also find effective advice to debug your code, so you can confidently implement multiprocessing in your own projects.

Using the multiprocessing module for basic parallelism

import multiprocessing

def worker(num):
    return f"Worker {num} result"

if __name__ == "__main__":
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.map(worker, [1, 2, 3])
        print(results)

--OUTPUT--
['Worker 1 result', 'Worker 2 result', 'Worker 3 result']

The if __name__ == "__main__" guard is a critical safeguard. It ensures the main script's code doesn't re-run inside the child processes that are spawned, which prevents errors and infinite loops.

The multiprocessing.Pool object is what orchestrates the parallel execution. It creates a pool of worker processes—three, in this example. The pool.map() function then distributes the worker function to these processes, applying it to each item in the list. It automatically manages the workload and gathers the results in order once all tasks are complete.

Core multiprocessing techniques

While Pool is great for many cases, you can get more control by managing processes directly with Process and sharing data using Queue or Manager.

Using Process to run functions in parallel

import multiprocessing

def task(name):
    print(f"Running task {name}")

if __name__ == "__main__":
    processes = []
    for i in range(3):
        p = multiprocessing.Process(target=task, args=(f"Process-{i}",))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

--OUTPUT--
Running task Process-0
Running task Process-1
Running task Process-2

The Process class offers fine-grained control by letting you manage individual processes directly. You instantiate it by passing your function to the target parameter and its arguments via args. This setup gives you a handle on each parallel task. Because the processes run independently, their output can appear in any order, not necessarily the order in which they were started.

  • Calling p.start() kicks off the process, letting it run independently.
  • Using p.join() makes your main program wait for the process to finish. It’s essential for ensuring all your parallel tasks are complete before the script exits.

Passing data with Queue between processes

from multiprocessing import Process, Queue

def producer(q):
    q.put("Hello")
    q.put("from")
    q.put("another process")

if __name__ == "__main__":
    q = Queue()
    p = Process(target=producer, args=(q,))
    p.start()
    p.join()

    while not q.empty():
        print(q.get())

--OUTPUT--
Hello
from
another process

When processes need to exchange data, the multiprocessing.Queue provides a safe and simple way to do it. It acts like a pipeline between your main script and worker processes, handling all the complex synchronization for you.

  • The producer process uses q.put() to add items to the queue.
  • Back in the main process, q.get() retrieves items one by one.
  • The queue follows a first-in, first-out order, so you get the data in the same sequence it was sent.
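One caveat: draining with q.empty() is dependable here only because the producer has already been joined; if producers may still be running, empty() can race with them. A common alternative is to send a sentinel value that marks the end of the stream. Here is a minimal sketch of that pattern (the SENTINEL name and item strings are illustrative):

from multiprocessing import Process, Queue

SENTINEL = None  # Illustrative marker meaning "no more items"

def producer(q, count):
    for i in range(count):
        q.put(f"item-{i}")
    q.put(SENTINEL)  # Signal the end of the stream

if __name__ == "__main__":
    q = Queue()
    p = Process(target=producer, args=(q, 3))
    p.start()

    items = []
    while True:
        item = q.get()  # Blocks until an item arrives, so no empty() race
        if item is SENTINEL:
            break
        items.append(item)

    p.join()
    print(items)

Because q.get() blocks until data is available, the consumer never has to guess whether the producer is finished; the sentinel tells it explicitly.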

Sharing data with Manager dictionaries

from multiprocessing import Process, Manager

def update_dict(shared_dict, key, value):
    shared_dict[key] = value

if __name__ == "__main__":
    with Manager() as manager:
        shared_dict = manager.dict()
        processes = []
        for i in range(3):
            p = Process(target=update_dict, args=(shared_dict, f"key{i}", i * 10))
            processes.append(p)
            p.start()

        for p in processes:
            p.join()

        print(dict(shared_dict))

--OUTPUT--
{'key0': 0, 'key1': 10, 'key2': 20}

For sharing complex data structures like dictionaries, the Manager class is a powerful tool. It creates a server process that holds Python objects, allowing multiple processes to modify them safely. This is ideal when you need a shared state rather than just passing messages with a Queue.

  • The with Manager() as manager: block starts this server and ensures it’s cleaned up afterward.
  • You create a shared dictionary using manager.dict().
  • Each process can then update this dictionary directly, and the manager handles the underlying synchronization to prevent data corruption.

Advanced process management

Beyond the basics of Process, Queue, and Manager, the multiprocessing module offers even more powerful tools for synchronizing tasks and sharing memory efficiently.

Synchronizing processes with Lock and Event

from multiprocessing import Process, Lock, Event
import time

def worker(lock, event, worker_id):
    with lock:
        print(f"Worker {worker_id} acquired the lock")
    event.wait()
    print(f"Worker {worker_id} received event signal")

if __name__ == "__main__":
    lock = Lock()
    event = Event()
    processes = [Process(target=worker, args=(lock, event, i)) for i in range(2)]

    for p in processes:
        p.start()

    time.sleep(0.5)
    event.set()

    for p in processes:
        p.join()

--OUTPUT--
Worker 0 acquired the lock
Worker 1 acquired the lock
Worker 0 received event signal
Worker 1 received event signal

Synchronization primitives like Lock and Event help you coordinate complex workflows. They prevent race conditions and manage the timing of your processes.

  • A Lock ensures only one process can access a critical section of code at a time. The with lock: statement automatically acquires and releases the lock, making it a safe way to protect shared resources.
  • An Event acts as a simple signaling mechanism. Processes can wait for a signal using event.wait(), and another process can send the signal to all waiting processes by calling event.set().

Creating a custom Pool with different processing methods

from multiprocessing import Pool
import time

def slow_task(x):
    time.sleep(0.1)
    return x * x

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # Asynchronous processing
        result = pool.apply_async(slow_task, (10,))
        print(f"Async result: {result.get()}")

        # Map for multiple inputs
        print(f"Map results: {pool.map(slow_task, [1, 2, 3])}")

--OUTPUT--
Async result: 100
Map results: [1, 4, 9]

The Pool object gives you flexible ways to manage tasks. You can run a single function asynchronously or apply one function to many inputs at once, depending on your needs.

  • apply_async lets you offload a single task to a worker process. Your main program doesn't have to wait and can continue running. You retrieve the result later by calling .get() on the returned object.
  • map is perfect for parallelizing a function across a list of items. It blocks execution until all tasks are complete and returns the results in a list, maintaining their original order.
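Beyond apply_async and map, Pool also provides starmap for functions that take multiple arguments: it unpacks each tuple in the input list into the function's parameters. A brief sketch (the power function is just an illustrative task):

from multiprocessing import Pool

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # starmap unpacks each tuple into the function's arguments
        results = pool.starmap(power, [(2, 3), (3, 2), (4, 2)])
    print(results)  # [8, 9, 16]

Like map, starmap blocks until all tasks finish and returns results in input order. Pool also offers imap and imap_unordered when you want to stream results as they complete.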

Using multiprocessing.Value and Array for shared memory

from multiprocessing import Process, Value, Array
import ctypes

def update_shared(number, arr):
    number.value += 100
    for i in range(len(arr)):
        arr[i] *= 2

if __name__ == "__main__":
    num = Value(ctypes.c_int, 0)
    arr = Array(ctypes.c_int, [1, 2, 3, 4])

    p = Process(target=update_shared, args=(num, arr))
    p.start()
    p.join()

    print(f"Shared number: {num.value}")
    print(f"Shared array: {list(arr)}")

--OUTPUT--
Shared number: 100
Shared array: [2, 4, 6, 8]

For high-performance data sharing, Value and Array let processes modify data in shared memory directly. It's more efficient than using a Manager for simple types because it avoids inter-process communication overhead. These objects are essentially memory-safe wrappers around C data types from the ctypes module.

  • You create a Value for a single piece of data, like a number, and access it using the .value attribute.
  • An Array works for a sequence of items and can be modified in place by worker processes.
  • Both require a ctypes data type, like ctypes.c_int, to define how the data is stored.
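One subtlety worth knowing: the example above is safe because only one process touches the shared data, but an update like number.value += 100 is a read-modify-write, not a single atomic operation. When several processes update the same Value concurrently, you can serialize access with its built-in get_lock(). A minimal sketch of that pattern:

from multiprocessing import Process, Value
import ctypes

def increment(counter, times):
    for _ in range(times):
        # += reads, modifies, and writes back, so it is not atomic;
        # get_lock() serializes the update across processes
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    counter = Value(ctypes.c_int, 0)
    workers = [Process(target=increment, args=(counter, 1000)) for _ in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print(counter.value)  # 4000

Without the lock, concurrent increments can interleave and lose updates, so the final count would often come out below 4000.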

Move faster with Replit

Replit is an AI-powered development platform that transforms natural language into working applications. You can take the concepts from this article and use Replit Agent to build complete apps—with databases, APIs, and deployment—directly from a description.

For the multiprocessing techniques we've explored, Replit Agent can turn them into production-ready tools. For example, you could:

  • Build a batch image processor that resizes thousands of files in parallel using a process Pool.
  • Create a web scraper where one process finds links and adds them to a Queue, while multiple worker processes fetch and parse the pages.
  • Deploy a real-time monitoring dashboard that uses shared Value and Array objects to display system metrics from different processes.

Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically, all in your browser.

Common errors and challenges

Navigating multiprocessing requires avoiding a few common pitfalls, from infinite loops to data corruption and unresponsive programs.

  • Forgetting the if __name__ == "__main__" guard is a classic mistake. Without it, each new process re-imports and runs your main script, creating a recursive loop that quickly overwhelms your system.
  • Directly sharing standard Python objects like lists or dictionaries between processes can lead to silent bugs. Each process gets its own copy, so modifications made in one won't appear in another. For shared state, you must use a Manager to ensure all processes are working on the same, synchronized object.
  • A worker process can sometimes hang, causing your main program to wait forever when it calls join(). You can prevent this by adding a timeout, like p.join(timeout=10). This tells the main process to wait for a set number of seconds before moving on, making your application more resilient.

Avoiding recursive spawning with if __name__ == "__main__"

Each new process re-imports the main script. Without the if __name__ == "__main__" guard, your process-creation code runs again inside the child process, triggering an infinite loop that can crash your program. The code below shows this error in action.

import multiprocessing

def worker(num):
    return f"Worker {num} result"

# Missing if __name__ == "__main__" guard
pool = multiprocessing.Pool(processes=3)
results = pool.map(worker, [1, 2, 3])
print(results)

When the script is imported by a new process, the multiprocessing.Pool() line runs again, creating another pool. This triggers an infinite loop of process creation. The corrected code below shows how to prevent this from happening.

import multiprocessing

def worker(num):
    return f"Worker {num} result"

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, [1, 2, 3])
    print(results)

By wrapping your process-creation logic in an if __name__ == "__main__" block, you ensure it only runs when the script is executed directly. When a child process is created, it imports the script, but the code inside this block won't run again. This is the standard way to prevent the recursive spawning error. It's a crucial habit to adopt whenever you're using the multiprocessing module to keep your applications stable.

Properly handling mutable objects with Manager() vs direct sharing

Directly sharing mutable objects like a list or dict between processes can lead to unexpected behavior. Each process receives a separate copy, not a shared reference, so modifications in one process are invisible to others, causing silent data inconsistencies. The following code illustrates this problem in action.

from multiprocessing import Process

def update_list(lst):
    lst.append(100)
    print(f"Inside process: {lst}")

if __name__ == "__main__":
    my_list = [1, 2, 3]
    p = Process(target=update_list, args=(my_list,))
    p.start()
    p.join()
    print(f"In main process: {my_list}")  # Still [1, 2, 3]

The update_list function modifies its own private copy of the list, so the original my_list in the main process remains untouched. The changes aren't synchronized back. The corrected code below shows how to fix this.

from multiprocessing import Process, Manager

def update_list(lst):
    lst.append(100)
    print(f"Inside process: {lst}")

if __name__ == "__main__":
    with Manager() as manager:
        my_list = manager.list([1, 2, 3])
        p = Process(target=update_list, args=(my_list,))
        p.start()
        p.join()
        print(f"In main process: {my_list}")  # Now contains 100

The Manager solves this by creating a proxy object—a special version of the list that can be safely shared across processes. This ensures that any modifications are synchronized correctly.

  • When you pass the manager.list() to the new process, both the main and child processes are working on the same underlying data.
  • This guarantees that changes, like lst.append(100), are reflected everywhere.

You'll need this solution whenever multiple processes must read from and write to the same mutable object.

Safely terminating processes with timeout in join()

A worker process can sometimes hang, leaving your main program stuck waiting indefinitely when it calls join(). This can make your entire application unresponsive. The code below shows what happens when a long-running task blocks the main process from continuing.

from multiprocessing import Process
import time

def long_task():
    print("Starting long task...")
    time.sleep(10)  # Simulate a task that might hang
    print("Task complete")

if __name__ == "__main__":
    p = Process(target=long_task)
    p.start()
    p.join()  # This will wait indefinitely if the process hangs
    print("Main process continued")

The main program is stuck because p.join() has no time limit, forcing it to wait for the 10-second sleep to finish. If the task hangs, the application freezes. The corrected code below shows a more resilient approach.

from multiprocessing import Process
import time

def long_task():
    print("Starting long task...")
    time.sleep(10)  # Simulate a task that might hang
    print("Task complete")

if __name__ == "__main__":
    p = Process(target=long_task)
    p.start()
    p.join(timeout=5)  # Wait for at most 5 seconds
    if p.is_alive():
        print("Process is taking too long, terminating")
        p.terminate()
    print("Main process continued")

By adding a timeout to p.join(), you prevent your main program from getting stuck. If the process doesn't finish within the specified time, your script moves on. You can then check if it's still running with p.is_alive() and forcefully stop it using p.terminate(). This is crucial for tasks that might hang, like network requests or long computations, as it keeps your application responsive and stable.

Real-world applications

With the core techniques and error-handling patterns covered, you can now apply multiprocessing to solve practical problems in data processing and task scheduling.

Processing large datasets with Pool.map()

You can significantly speed up data analysis by using Pool.map() to divide a large dataset into manageable chunks and process each one on a separate CPU core.

import multiprocessing
import random

def analyze_chunk(chunk):
    # Simulate processing a chunk of data
    return sum(chunk) / len(chunk)

if __name__ == "__main__":
    data = [random.randint(1, 100) for _ in range(1000)]
    chunks = [data[i:i + 250] for i in range(0, len(data), 250)]

    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(analyze_chunk, chunks)

    print(f"Chunk averages: {results}")

This example breaks a large list of random numbers into smaller chunks for parallel processing. A multiprocessing.Pool manages a group of four worker processes to handle the workload, distributing the chunks among them.

  • The pool.map() function is the core of the operation. It assigns the analyze_chunk task to each process, applying it to one of the data chunks.
  • Once all processes complete their calculations, pool.map() gathers the individual results and returns them in a single, ordered list.
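Manual chunking works, but when you want one result per item rather than per chunk, pool.map can do the batching for you through its chunksize parameter, which groups items into batches per worker task and cuts inter-process communication overhead. A quick sketch (the square function is just an illustrative task):

import multiprocessing

def square(x):
    return x * x

if __name__ == "__main__":
    data = list(range(1000))
    with multiprocessing.Pool(processes=4) as pool:
        # chunksize=250 sends items to workers in batches of 250,
        # instead of one inter-process message per item
        results = pool.map(square, data, chunksize=250)
    print(results[:5])  # [0, 1, 4, 9, 16]

Tuning chunksize is a trade-off: larger chunks mean less communication overhead, while smaller chunks balance the load better when individual items take uneven amounts of time.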

Creating a simple parallel task scheduler with Pool.map()

You can also use Pool.map() to build a simple task scheduler, which is perfect for running a batch of independent jobs like data backups and log analysis concurrently.

import multiprocessing
import time
from datetime import datetime

def scheduled_task(task_info):
    name, delay = task_info
    time.sleep(delay)  # Simulate task running time
    return name, datetime.now().strftime("%H:%M:%S")

if __name__ == "__main__":
    tasks = [("Data backup", 2), ("Log analysis", 1), ("Email sending", 3)]

    print(f"Starting tasks at {datetime.now().strftime('%H:%M:%S')}")
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.map(scheduled_task, tasks)

    for task_name, completion_time in results:
        print(f"{task_name} completed at {completion_time}")

This code uses a Pool to run several independent functions, each with different arguments. The tasks list defines three jobs, each with a unique name and a simulated delay. A Pool with three processes is then created to handle the work concurrently.

  • The pool.map() function is the core of the operation. It applies the scheduled_task function to every item in the tasks list.
  • Because each task runs in its own process, they don't have to wait for each other to finish.

The program gathers all the results and prints them in order once every task is complete.

Get started with Replit

Turn these concepts into a real tool with Replit Agent. Describe what you want to build, like “a batch image resizer that uses a process Pool” or “a web scraper that uses a Queue to manage tasks.”

Replit Agent writes the code, tests for errors, and deploys your app. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
