How to split a string with multiple delimiters in Python

Learn how to split a Python string with multiple delimiters. Explore various methods, tips, real-world uses, and common error debugging.

Published on: Tue, Apr 21, 2026
Updated on: Wed, Apr 22, 2026
The Replit Team

Splitting a string on multiple delimiters is a frequent task when you work with data. Python provides flexible methods for complex string manipulation, which simplifies how you process data from various sources.

In this article, you'll explore several techniques, from simple methods to more advanced ones. You'll find practical tips, real-world applications, and advice to fix common errors so you can select the right approach.

Using re.split() with a regex pattern

import re
text = "apple,banana;cherry:grape|orange"
result = re.split(r'[,;:|]', text)
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']

The re.split() function from Python's re module is a powerful tool for splitting a string by multiple characters. It uses a regular expression pattern to identify all the delimiters at once, offering more flexibility than basic string methods.

In this example, the pattern r'[,;:|]' does the heavy lifting:

  • The square brackets [] create a character set.
  • The regex engine matches any single character within this set—in this case, a comma, semicolon, colon, or pipe.

The function then splits the string at these points, resulting in a clean list of substrings. This method is efficient for handling inconsistent or mixed data formats.
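If your data also carries stray whitespace or runs of consecutive separators, the same pattern extends naturally. This is a sketch, not part of the original example: \s* absorbs surrounding spaces and the + quantifier collapses delimiter runs.

```python
import re

text = "apple, banana;; cherry : grape"
# \s* eats spaces around each separator; the trailing + merges runs of delimiters
result = re.split(r'\s*[,;:|]+\s*', text)
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape']
```

Without the extra quantifiers, the same input would produce empty strings and tokens with leading spaces.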

Basic techniques for splitting with multiple delimiters

For cases where a full regular expression might be overkill, you can use a combination of Python's basic string methods to split your text.

Using str.replace() to standardize delimiters

text = "apple,banana;cherry:grape|orange"
standardized = text.replace(',', '#').replace(';', '#').replace(':', '#').replace('|', '#')
result = standardized.split('#')
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']

This approach works in two main steps. First, you chain several str.replace() calls to swap each unique delimiter with a single, uniform character—in this case, '#'. This process creates a standardized string where all the original, mixed separators are gone.

  • Each replace() call targets one specific delimiter.
  • The modified string is then passed to the next replace() call.

After unifying the delimiters, a single call to split('#') is all it takes to get the final list of substrings.
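When the delimiter list grows, the chain of replace() calls can be collapsed into a loop. A minimal sketch of the same standardization idea:

```python
text = "apple,banana;cherry:grape|orange"
# Rewrite every extra delimiter as a comma, then split once
for delim in [';', ':', '|']:
    text = text.replace(delim, ',')
result = text.split(',')
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']
```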

Iterative splitting with multiple calls to split()

text = "apple,banana;cherry:grape|orange"
result = text.split(',')
temp = []
for item in result:
    temp.extend(item.split(';'))
result = temp
print(result)
# Output: ['apple', 'banana', 'cherry:grape|orange']

Another way to tackle this is by splitting the string iteratively. This method involves breaking down the string one delimiter at a time. The code first uses split(',') to create an initial list from the original text.

  • A for loop then processes each item from that list.
  • Inside the loop, extend() is used with another split(';') call to further divide the substrings and add them to a new list.

This process must be repeated for every delimiter you want to handle. As you can see, the example only processes commas and semicolons, which is why 'cherry:grape|orange' remains intact.
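To cover every delimiter, the same pattern can be wrapped in an outer loop over the delimiter list. A sketch of the completed iterative version:

```python
text = "apple,banana;cherry:grape|orange"
result = [text]
# Apply each delimiter in turn to every fragment produced so far
for delim in [',', ';', ':', '|']:
    temp = []
    for item in result:
        temp.extend(item.split(delim))
    result = temp
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']
```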

Using string translation with str.maketrans()

text = "apple,banana;cherry:grape|orange"
trans_table = str.maketrans({',': ' ', ';': ' ', ':': ' ', '|': ' '})
result = text.translate(trans_table).split()
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']

The str.maketrans() method offers an efficient way to replace multiple characters at once. It builds a translation table that maps each delimiter—like , or ;—to a single space character. This approach is often faster than chaining multiple replace() calls.

  • The translate() method applies this table to the string, swapping all specified delimiters for spaces in one operation.
  • A final call to split() with no arguments then splits the string by any whitespace, neatly handling cases with multiple spaces between words.
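str.maketrans() also accepts two equal-length strings, mapping each character in the first to the character at the same position in the second. A sketch of the same split using that form:

```python
text = "apple,banana;cherry:grape|orange"
# Each of , ; : | maps to a space (both strings are four characters long)
table = str.maketrans(",;:|", "    ")
result = text.translate(table).split()
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']
```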

Advanced techniques for delimiter handling

Building on these basic methods, Python also provides advanced techniques for when you need more control, such as preserving delimiters or defining a split hierarchy.

Using functools.reduce() for sequential splitting

from functools import reduce
text = "apple,banana;cherry:grape|orange"
delimiters = [',', ';', ':', '|']
result = reduce(lambda acc, delim: sum([i.split(delim) for i in acc], []), delimiters, [text])
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']

The functools.reduce() function offers a concise, functional programming approach. It works by applying a function cumulatively to your list of delimiters, starting with an initial list that contains just the original string.

  • The lambda function processes one delimiter at a time, splitting every string fragment currently held in an accumulator.
  • A list comprehension performs the split, and the sum(..., []) pattern cleverly flattens the resulting list of lists back into one.

This cycle repeats for each delimiter, progressively breaking the string down into the final list of words.
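If the sum(..., []) flattening feels opaque (it also rebuilds the list on every step), itertools.chain.from_iterable does the same job more directly. A sketch of the equivalent:

```python
from functools import reduce
from itertools import chain

text = "apple,banana;cherry:grape|orange"
delimiters = [',', ';', ':', '|']
# chain.from_iterable flattens the list of lists produced by each split
result = reduce(
    lambda acc, d: list(chain.from_iterable(i.split(d) for i in acc)),
    delimiters,
    [text],
)
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']
```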

Splitting with regex groups to retain delimiters

import re
text = "apple,banana;cherry:grape|orange"
pattern = r'([,;:|])'
result = [x for x in re.split(pattern, text) if x not in ',;:|']
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']

You can gain more control over splitting by using a capturing group in your regex. When you wrap the pattern in parentheses, as in r'([,;:|])', you're telling re.split() to not just split by the delimiters but to also keep them in the resulting list.

  • This initially produces a list containing both the substrings and the delimiters that separated them.
  • The code then uses a list comprehension to create a new list, keeping only the elements that aren't the original delimiters, giving you a clean final output.
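Because re.split() with a capturing group strictly alternates substrings and separators, you can also keep both, recovering each with slicing. A small sketch:

```python
import re

text = "apple,banana;cherry"
tokens = re.split(r'([,;])', text)
# Even indices hold the values, odd indices hold the delimiters between them
values = tokens[0::2]
seps = tokens[1::2]
print(values)  # ['apple', 'banana', 'cherry']
print(seps)    # [',', ';']
```

This is handy when you need to reassemble the string later with its original separators intact.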

Creating a custom split function with delimiter priority

def custom_split(text, delimiters):
    if not delimiters:
        return [text]
    parts = text.split(delimiters[0])
    return [subpart for part in parts for subpart in custom_split(part, delimiters[1:])]

text = "apple,banana;cherry:grape|orange"
result = custom_split(text, [',', ';', ':', '|'])
print(result)
# Output: ['apple', 'banana', 'cherry', 'grape', 'orange']

A custom function gives you full control over the splitting logic, especially when the order of delimiters matters. This custom_split function uses recursion to process each delimiter sequentially. It effectively creates a priority system based on the order of delimiters in your list.

  • The function first splits the text by the initial delimiter, delimiters[0].
  • It then recursively calls itself on each resulting piece, passing along the rest of the delimiters.
  • This continues until there are no delimiters left, which is the base case for the recursion.

Move faster with Replit

Learning individual techniques is one thing, but building a complete application is another. Replit is an AI-powered development platform where all Python dependencies come pre-installed, so you can skip setup and start coding instantly. It's designed to help you move from knowing methods like re.split() to shipping full projects with Agent 4.

Instead of piecing together functions, you can describe the app you want to build, and Agent will take it from idea to working product. For example, you could build:

  • A data-cleaning utility that ingests messy text files with mixed delimiters and outputs a clean, standardized CSV.
  • A tag management tool that processes user-inputted tags separated by commas, spaces, or semicolons into a unified list for a database.
  • A log parser that reads server logs and splits each line into structured fields based on a set of defined separators for easier analysis.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Even with powerful tools, you might run into a few common pitfalls when splitting strings with multiple delimiters.

Forgetting to escape special characters in re.split() patterns

Regular expressions use special characters like ., +, or * to define patterns. If your delimiter is one of these and appears outside a character set, you can't just include it directly in your pattern. Forgetting to escape it with a backslash (\) will cause re.split() to apply its special meaning instead of treating it as a literal character, leading to unexpected splits or errors.

Handling empty strings when using re.split()

You may notice empty strings appearing in your output list. This often happens if your text starts or ends with a delimiter, or if multiple delimiters are next to each other. While sometimes useful, these empty strings are usually noise. You can clean them up afterward by filtering them out of your results, for instance, with a list comprehension.

Using capturing groups with re.split() unexpectedly

When you wrap part of your regex pattern in parentheses, you create a capturing group. If you do this with re.split(), the delimiters themselves are included in the output list. This is a feature, not a bug, and it’s useful when you need to preserve the separators. However, if you add parentheses unintentionally, you’ll get a list that contains both your substrings and the delimiters, which can be confusing if you weren't expecting it.

Forgetting to escape special characters in re.split() patterns

When your delimiters include regex metacharacters like . or +, dropping them straight into a pattern can backfire. Outside a character set, the regex engine applies their special function instead of their literal value, which can break your split. The code below shows what happens when . is used on its own.

import re
text = "apple.banana.cherry"
result = re.split(r'.', text)  # '.' matches any character, not just a dot
print(result)
# Output: a list of empty strings -- every character was treated as a delimiter

The pattern r'.' matches any single character, so the split consumes the entire string and leaves only empty strings behind. Inside a character set such as [.$+], most metacharacters are in fact treated literally, but characters like -, ], ^, and \ keep their special meaning even there. Escaping every delimiter is the safest habit. See the code below for a pattern that always behaves.

import re
text = "apple.banana.cherry$grape+orange"
result = re.split(r'[\.\$\+]', text)
print(result)

The fix is to escape the metacharacters, or to build the set with re.escape(). A backslash (\) before each special character tells the regex engine to treat it as literal text, not a command.

  • The pattern r'[\.\$\+]' splits on a literal dot, dollar sign, or plus sign. The backslashes are redundant for these three inside a character set, but they are harmless and protect you if a delimiter such as - or ] joins the list.
  • This prevents the engine from misinterpreting them, ensuring your string splits exactly where you intend.

Handling empty strings when using re.split()

This issue often arises when your input data isn't perfectly clean. For example, consecutive delimiters or a delimiter at the start or end of the string will cause re.split() to produce empty strings. The following code demonstrates this common side effect.

import re
text = "apple,,banana;;cherry"
result = re.split(r'[,;]', text)
print(result) # Contains unwanted empty strings

The function splits at every delimiter it finds. When two delimiters appear together, like ,,, the function correctly identifies an empty string between them. See the next example for a clean way to handle this output.

import re
text = "apple,,banana;;cherry"
result = [item for item in re.split(r'[,;]', text) if item]
print(result) # Empty strings removed

A clean way to remove empty strings is with a list comprehension. This technique filters the output from re.split(). The expression if item inside the comprehension checks if each string is non-empty. Since empty strings evaluate to False, they're automatically excluded from the final list. It's a concise, Pythonic way to clean up your data after splitting, especially when dealing with messy inputs that might have consecutive delimiters.
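An equivalent one-liner uses filter(None, ...), which drops every falsy element, including empty strings:

```python
import re

text = ",apple,,banana;;cherry;"
# filter(None, ...) keeps only truthy items, so '' entries disappear
result = list(filter(None, re.split(r'[,;]', text)))
print(result)
# Output: ['apple', 'banana', 'cherry']
```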

Using capturing groups with re.split() unexpectedly

It's a common slip-up to wrap a regex pattern in parentheses, creating a capturing group. With re.split(), this means the delimiters you're splitting by also appear in your output, which is often not the goal. The code below shows this in action.

import re
text = "apple,banana;cherry:grape"
result = re.split(r'(,|;|:)', text)
print(result) # Includes delimiters in the output

The parentheses in r'(,|;|:)' form a capturing group, telling re.split() to include the delimiters in its output. This mixes your substrings with the separators, creating a list that's more cluttered than you probably intended. To fix this, you need to tell the regex engine to group the delimiters without capturing them. See how to adjust the pattern in the code below.

import re
text = "apple,banana;cherry:grape"
result = re.split(r'(?:,|;|:)', text) # Non-capturing group
print(result) # Only contains the split items

The solution is to use a non-capturing group, written as (?:...). This syntax groups your delimiters for the | operator to work on, but it prevents re.split() from including them in the output.

  • The pattern (?:,|;|:) tells the function to match any of the specified delimiters.
  • Because the group is non-capturing, the delimiters aren't added to the final list.

This gives you a clean output with just the substrings. It's the right approach whenever your pattern needs grouping but you want to exclude the grouped part from the results.
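Note that for single-character delimiters, a plain character set like [,;:] avoids grouping altogether; a non-capturing group is only required when a delimiter is longer than one character. A sketch with a hypothetical two-character '::' separator:

```python
import re

text = "apple::banana,cherry"
# '::' cannot live in a character set, so group the alternatives without capturing
result = re.split(r'(?:::|,)', text)
print(result)
# Output: ['apple', 'banana', 'cherry']
```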

Real-world applications

Now that you can sidestep common errors with re.split(), you're ready to apply it to tasks like parsing logs or creating key-value parsers.

Parsing log files with re.split() to extract structured information

Log files often use multiple delimiters to structure data, making re.split() an ideal tool for breaking down each entry into meaningful parts.

import re

log_entry = "2023-10-15|ERROR|user:admin|module:auth|Failed login attempt"
parts = re.split(r'[|:]', log_entry)
print(f"Timestamp: {parts[0]}, Level: {parts[1]}, User: {parts[3]}")

This code uses re.split() to efficiently parse a structured log entry. The function takes the log_entry string and breaks it apart based on the regular expression pattern r'[|:]'.

  • The square brackets [] create a character set that matches any single character within them.
  • Here, it tells the function to split the string at every occurrence of either a pipe | or a colon :.

This single operation produces a list of substrings, making it simple to access individual data points from the log by their index.
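One caveat: a time-of-day such as 14:30:45 contains colons, so splitting on ':' everywhere would break the timestamp apart. A safer sketch for such entries splits each line on '|' first and only then parses the key:value fields:

```python
log_entry = "2023-10-15 14:30:45|ERROR|user:admin|module:auth|Failed login attempt"
fields = log_entry.split('|')          # the timestamp stays intact
timestamp, level = fields[0], fields[1]
user = fields[2].split(':', 1)[1]      # 'user:admin' -> 'admin'
print(f"Timestamp: {timestamp}, Level: {level}, User: {user}")
```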

Creating a key-value parser with re.split() and dictionary conversion

You can also use re.split() to build a simple key-value parser, which is perfect for converting a data string with mixed delimiters into a Python dictionary.

import re

data = "name:John,age:30;city:New York|country:USA"
pairs = re.split(r'[,;|]', data)
person = {}
for pair in pairs:
    key, value = pair.split(':', 1)
    person[key] = value
print(person)

This approach uses a two-step process to parse the data string. First, re.split() uses the pattern r'[,;|]' to break the string into a list of potential key-value pairs. It's a neat way to handle the inconsistent separators between each pair.

  • The code then iterates through each resulting string, like 'name:John' or 'city:New York'.
  • Inside the loop, pair.split(':', 1) divides each string at the first colon it finds. The 1 ensures it only splits once, correctly handling values that might contain colons.

Finally, it populates the person dictionary with these key-value pairs.
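The loop can also be condensed into a dict comprehension; this sketch is equivalent to the code above:

```python
import re

data = "name:John,age:30;city:New York|country:USA"
# Split into pairs, then split each pair once at its first colon
person = {k: v for k, v in
          (pair.split(':', 1) for pair in re.split(r'[,;|]', data))}
print(person)
# Output: {'name': 'John', 'age': '30', 'city': 'New York', 'country': 'USA'}
```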

Get started with Replit

Turn your knowledge of string splitting into a real tool. With Replit Agent, you can describe what you want: "a script that cleans data separated by commas and pipes" or "a tool that standardizes user-inputted tags."

Replit Agent will write the code, test for errors, and help you deploy your app. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.