How to drop a column in Python

Learn how to drop a column in Python. This guide covers various methods, tips, real-world examples, and common error debugging.

Published on: Fri, Feb 20, 2026

Updated on: Mon, Apr 6, 2026

The Replit Team


Column removal is a frequent step in data preprocessing and is often needed to refine datasets for machine learning or analysis. In Python, the pandas library provides simple, effective methods to handle this task with precision and control.

Here, we'll cover key techniques, including the pandas drop() function. You'll also get real-world examples, practical tips, and debugging advice to help you manage your data more effectively.

Basic column dropping with pandas `drop()`

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
    result = df.drop('B', axis=1)
    print(result)

Output:

       A  C
    0  1  7
    1  2  8
    2  3  9

This example assumes familiarity with creating a DataFrame in Python using pandas constructor syntax.

The drop() method is your primary tool for removing data. The two crucial arguments here are the column label you want to remove—'B' in this case—and the axis=1 parameter.

Setting axis=1 is essential because it tells pandas to operate on columns. If you omit it, pandas defaults to axis=0 and searches for a row index, which would cause an error here. The method returns a new DataFrame, leaving your original df untouched unless you reassign it or use the inplace=True argument.
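As a quick check of the behavior described above, the following minimal sketch reuses the same sample data and compares the column lists before and after the call. The variable names `original_columns` and `result_columns` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# drop() returns a new DataFrame; df itself is not modified
result = df.drop('B', axis=1)

original_columns = list(df.columns)
result_columns = list(result.columns)

print(original_columns)  # ['A', 'B', 'C']
print(result_columns)    # ['A', 'C']
```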

Basic column manipulation techniques

Building on the basic drop() function, you can also remove multiple columns at once, specify the axis differently, and modify your DataFrame directly using inplace.

Using `drop()` with different axis notation

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

    # Using axis='columns' instead of axis=1
    result = df.drop('B', axis='columns')
    print(result)

Output:

       A  C
    0  1  7
    1  2  8
    2  3  9

For better code readability, pandas allows you to use axis='columns' as a more descriptive alternative to axis=1. This string argument makes your intent clearer without changing the outcome.

Functionality: Both axis='columns' and axis=1 tell the drop() method to target columns, not rows.
Readability: Using the string makes your code more self-documenting, which is helpful when you or others revisit it later.

This is purely a stylistic choice, but one that can significantly improve code clarity.
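A third spelling worth knowing is the `columns=` keyword of `drop()`, which removes the need for `axis` altogether. The sketch below, using the same sample data, checks that it produces the same result as `axis='columns'`.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

by_axis = df.drop('B', axis='columns')
by_keyword = df.drop(columns='B')  # no axis argument needed

# Both calls should yield identical DataFrames
same_result = by_axis.equals(by_keyword)
print(same_result)
```

Which form to use is a matter of team convention; `columns=` has the advantage that it cannot be confused with a row operation.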

Dropping multiple columns at once

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})
    result = df.drop(['B', 'D'], axis=1)
    print(result)

Output:

       A  C
    0  1  5
    1  2  6

To remove multiple columns simultaneously, pass a list of column names to the drop() method. Instead of a single label, you provide a list like ['B', 'D'], and pandas removes all specified columns in one go. It’s a clean and scalable way to trim your DataFrame.

This avoids building an intermediate DataFrame for each column, as repeated drop() calls would.
The axis=1 argument works the same way, ensuring the operation targets columns.
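When the list of names comes from configuration or user input, some entries may not exist in the DataFrame, and `drop()` raises a `KeyError` by default. A minimal sketch of the `errors='ignore'` option follows; the column name `'Z'` is a made-up example of a missing label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

# 'Z' is not a column; errors='ignore' skips it instead of raising KeyError
result = df.drop(['B', 'D', 'Z'], axis=1, errors='ignore')

remaining = list(result.columns)
print(remaining)  # ['A', 'C']
```

Use this option deliberately: it also hides typos in column names, so the default `errors='raise'` is usually the safer choice during development.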

Using `inplace` parameter with `drop()`

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
    df.drop('B', axis=1, inplace=True)
    print(df)

Output:

       A  C
    0  1  7
    1  2  8
    2  3  9

By setting inplace=True, you modify the DataFrame directly. This saves you from having to reassign the variable, as the drop() operation alters the original df on the spot. It's a more direct way to manage your data.

Since the DataFrame is changed in place, the method returns None.
The inplace=True option does not reliably save memory: pandas generally still builds the modified data internally before swapping it into df, so treat it as a convenience rather than an optimization. For comprehensive strategies on handling large datasets in Python, consider dedicated memory optimization techniques.
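Because the in-place call returns `None`, combining it with an assignment is a common slip. The sketch below captures the return value in a variable named `returned` (an illustrative name) to make that visible.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# With inplace=True the method mutates df and returns None
returned = df.drop('B', axis=1, inplace=True)
columns_after = list(df.columns)

print(returned)       # None
print(columns_after)  # ['A', 'C']
```

Writing `df = df.drop('B', axis=1, inplace=True)` would therefore replace `df` with `None`. Use either assignment or `inplace=True`, never both.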

Advanced column operations

Moving beyond basic removal, you can also drop columns dynamically using conditional logic, pattern matching, or by filtering for the columns you wish to retain.

Dropping columns based on conditions

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

    # Drop columns where the sum is greater than 10
    to_drop = [col for col in df if df[col].sum() > 10]
    result = df.drop(to_drop, axis=1)
    print(result)

Output:

       A  B
    0  1  3
    1  2  4

You can drop columns dynamically by first identifying them with a condition. This example uses a list comprehension—a concise way to create lists—to build a collection of columns to remove based on their content. This technique builds on fundamental concepts of filtering lists in Python.

The code iterates through each column and checks if its total sum is greater than 10 using df[col].sum() > 10.
Column names that meet this condition are collected into a list named to_drop.
Finally, this list is passed to the drop() method to remove all targeted columns at once.
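The same three steps apply to other conditions. As an illustrative sketch, the example below drops columns whose share of missing values exceeds a threshold; the sample data and the `max_missing` value of 0.5 are assumptions chosen for the demonstration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [None, None, None, 4],
    'C': [1, None, 3, 4],
})

max_missing = 0.5  # hypothetical threshold: drop if more than half is missing

# isna().mean() gives the fraction of missing values per column
missing_share = df.isna().mean()
to_drop = [col for col in df.columns if missing_share[col] > max_missing]
result = df.drop(to_drop, axis=1)

print(to_drop)               # ['B']
print(list(result.columns))  # ['A', 'C']
```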

Dropping columns using regular expressions

    import pandas as pd
    import re

    df = pd.DataFrame({'A_1': [1, 2], 'A_2': [3, 4], 'B_1': [5, 6], 'B_2': [7, 8]})

    # Drop all columns starting with 'A_'
    pattern = re.compile('^A_')
    to_drop = [col for col in df.columns if pattern.match(col)]
    result = df.drop(to_drop, axis=1)
    print(result)

Output:

       B_1  B_2
    0    5    7
    1    6    8

Regular expressions offer a flexible way to drop columns that follow a specific naming convention. This is especially useful for cleaning datasets with many similarly named columns. The process involves creating a pattern, finding columns that match it, and then removing them. For more advanced pattern matching, see our guide on using regular expressions in Python.

First, you compile a pattern using Python’s re module, like re.compile('^A_'). The ^ anchor ensures the pattern only matches column names that start with A_.
A list comprehension then builds a list of columns to drop by checking each name against the pattern with pattern.match().
Finally, this list is passed to df.drop() to remove all matching columns.
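If you prefer to stay within pandas, the column index exposes vectorized string methods that can do the matching without an explicit loop. The following is one possible sketch using `str.match()` on `df.columns`, applied to the same sample data.

```python
import pandas as pd

df = pd.DataFrame({'A_1': [1, 2], 'A_2': [3, 4], 'B_1': [5, 6], 'B_2': [7, 8]})

# str.match() tests the pattern at the start of each column name
# and returns a boolean array used to pick the matching labels
to_drop = df.columns[df.columns.str.match('^A_')]
result = df.drop(columns=to_drop)

print(list(to_drop))         # ['A_1', 'A_2']
print(list(result.columns))  # ['B_1', 'B_2']
```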

Using column filtering as an alternative to `drop()`

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

    # Keep specific columns (equivalent to dropping others)
    columns_to_keep = ['A', 'C']
    result = df[columns_to_keep]
    print(result)

Output:

       A  C
    0  1  5
    1  2  6

Instead of dropping columns, you can simply select the ones you want to keep. This approach, known as column filtering, is often more direct when you know exactly which columns you need for your analysis.

First, you create a list containing the names of the columns you want to retain, such as columns_to_keep.
Next, you pass this list to your DataFrame using square brackets, like df[columns_to_keep].

This action returns a new DataFrame with only the specified columns, effectively dropping all others. It’s a clean and highly readable alternative to using drop().
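One caveat: selecting with a list raises a `KeyError` if any listed column is absent. A minimal sketch of a defensive variant follows, where the `wanted` list and its missing entry `'Z'` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

wanted = ['A', 'C', 'Z']  # 'Z' does not exist in df

# Keep only the requested names that are actually present
present = [col for col in wanted if col in df.columns]
result = df[present]

print(present)  # ['A', 'C']
```

Whether to skip missing names silently or fail loudly depends on the pipeline; failing is often preferable when a missing column signals bad input.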

Move faster with Replit

Replit is an AI-powered development platform where all Python dependencies come pre-installed, so you can skip setup and start coding instantly. While mastering individual functions like drop() is useful, building a full application is the next step. That’s where Agent 4 comes in.

Instead of piecing together techniques, you can describe the app you want to build. The Agent handles everything from writing the code to connecting databases and deploying it live. For example, you could build:

A data cleaning utility that automatically removes columns with too many missing values from an uploaded CSV.
A report generator that filters out personally identifiable information columns before exporting a dataset.
An interactive dashboard that lets users select which data columns to display in a chart, effectively dropping the rest from view.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Dropping columns is straightforward, but a few common mistakes can lead to unexpected results or errors when you're cleaning your data.

Forgetting to assign the result when not using `inplace=True`

A frequent source of confusion is when the drop() method appears to do nothing. By default, drop() returns a new DataFrame with the specified columns removed, leaving the original one untouched. If you don't assign this new DataFrame back to a variable, the change is effectively lost.

Correct approach: You must reassign the result, like df = df.drop('B', axis=1).
Alternative: Use the inplace=True parameter, which modifies the original DataFrame directly and returns None.

Using incorrect axis parameter when dropping columns

The axis parameter is crucial for telling pandas whether to target rows or columns. If you omit it, pandas defaults to axis=0, which targets rows. This causes a KeyError because pandas will search for a row index with the name of the column you intended to drop, which it likely won't find.

Always specify axis=1 or the more readable axis='columns' to ensure you're operating on columns.

Accidentally dropping rows with numeric indexes when targeting columns

This is a subtle but dangerous error. If your DataFrame has integer column names (e.g., 0, 1, 2) and you forget to set axis=1, an operation like df.drop(1) will silently drop the row with index 1 instead of the column named 1. This can corrupt your data without raising an error, making it difficult to debug. Explicitly setting the axis prevents this ambiguity.

Forgetting to assign the result when not using `inplace=True`

It's a common pitfall: you call drop(), but the column remains. This happens because the method returns a new DataFrame by default, leaving the original untouched. If you don't capture this result, your change is lost. The following code demonstrates this.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
    df.drop('B', axis=1)  # Missing assignment
    print(df)  # Will still contain column 'B'

The drop() method executes, but its output—a new DataFrame without column 'B'—is discarded. The original df is then printed, which remains unchanged. The next example shows how to capture the result correctly.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
    df = df.drop('B', axis=1)  # Assign result back to df
    print(df)  # Column 'B' is now dropped

The fix is to assign the result back to your variable: df = df.drop('B', axis=1). Since drop() returns a new, modified DataFrame by default, this assignment updates your df variable to point to the new version. Without it, the change is simply discarded. This is the standard way to make the change stick when you aren't using the inplace=True parameter. Keep an eye out for this whenever you're transforming data.

Using incorrect axis parameter when dropping columns

The axis parameter is critical for telling pandas whether to drop a row or a column. If you set it to 0 or omit it, pandas will search for a row label instead of a column name, triggering a KeyError. The following code demonstrates this common mistake.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

    # Using axis=0 instead of axis=1: pandas looks for a row labeled 'B',
    # finds none, and raises KeyError
    result = df.drop('B', axis=0)
    print(result)  # Never reached

With axis=0, pandas searches for a row index labeled 'B' instead of a column. Since the DataFrame's index is numeric, it can't find the label and raises an error. The next example shows the correct syntax.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
    result = df.drop('B', axis=1)  # Using axis=1 for columns
    print(result)  # Column 'B' is dropped

The fix is simple: always specify axis=1 or its more readable alias, axis='columns', when you intend to drop columns. This explicitly tells pandas to operate along the columnar axis, successfully removing column 'B' and returning the corrected DataFrame. This prevents the KeyError that arises from searching for a column name in the row index. This distinction is key to ensuring your data transformations behave as intended.
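To observe the failure without crashing a script, you can wrap the faulty call in a `try`/`except` block. This is only a diagnostic sketch; the `raised` flag is an illustrative variable, and the exact wording of the error message may vary between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

try:
    df.drop('B', axis=0)  # looks for a ROW labeled 'B'
    raised = False
except KeyError as exc:
    raised = True
    print(f"KeyError: {exc}")

# The column-wise call succeeds
result = df.drop('B', axis=1)
print(list(result.columns))  # ['A', 'C']
```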

Accidentally dropping rows with numeric indexes when targeting columns

This error is subtle and can silently corrupt your data. When your DataFrame has integer column names, or when you try to drop a column by its position, forgetting to specify axis=1 can cause pandas to drop a row by its index instead of the intended column, often without any warning.

The following code demonstrates how this ambiguity leads to unexpected results.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

    # Attempting to drop the second column by its position
    result = df.drop(1)  # This drops the row with index 1 instead!
    print(result)

Since drop() matches labels rather than positions and defaults to targeting rows, the operation removes the row labeled 1 instead of the second column. No error is raised, so the mistake is easy to miss. The following example shows how to resolve this ambiguity.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

    # Specify axis=1 to drop columns, or use column names
    result = df.drop('B', axis=1)
    print(result)

The fix is to explicitly target columns by name and specify the axis. Using df.drop('B', axis=1) tells pandas to remove the column labeled 'B', resolving the ambiguity. This matters even more when your DataFrame has integer column names, because the same integer can be both a valid row label and a valid column label. Always set the axis to ensure your drop() operations behave predictably and you don't accidentally delete the wrong data without warning.
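The example above uses string column names, so here is a sketch of the integer-label case, followed by one way to drop a column by position. The sample values and variable names are illustrative only.

```python
import pandas as pd

# Integer column labels 0, 1, 2 and the default row index 0, 1
df = pd.DataFrame({0: [10, 20], 1: [30, 40], 2: [50, 60]})

rows_dropped = df.drop(1)          # removes the ROW labeled 1
cols_dropped = df.drop(1, axis=1)  # removes the COLUMN labeled 1

print(rows_dropped.shape)  # (1, 3)
print(cols_dropped.shape)  # (2, 2)

# To drop by position, look up the label at that position first
named = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
by_position = named.drop(named.columns[1], axis=1)
print(list(by_position.columns))  # ['A', 'C']
```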

Real-world applications

With an understanding of the common errors, you can confidently apply column dropping techniques to practical data cleaning and aggregation tasks using Python.

Cleaning a sales dataset by removing irrelevant columns

Dropping columns like internal_code and last_updated helps streamline a sales dataset, leaving only the essential information for analysis. In real-world scenarios, these datasets often come from reading CSV files into pandas DataFrames.

    import pandas as pd

    # Sample sales dataset with some irrelevant columns
    sales_df = pd.DataFrame({
        'product_id': [101, 102, 103, 104],
        'product_name': ['Laptop', 'Phone', 'Tablet', 'Monitor'],
        'price': [1200, 800, 350, 250],
        'stock': [15, 25, 40, 10],
        'last_updated': ['2023-01-15', '2023-02-10', '2023-01-30', '2023-02-05'],
        'internal_code': ['A123', 'B456', 'C789', 'D012']
    })

    # Remove columns not needed for sales analysis
    clean_sales_df = sales_df.drop(['internal_code', 'last_updated'], axis=1)
    print(clean_sales_df)

This example demonstrates how to remove multiple columns in a single operation. By passing a list of column names, like ['internal_code', 'last_updated'], to the drop() method, you can efficiently prepare your data for a specific task. This is much cleaner than dropping columns one by one.

The axis=1 argument is crucial—it directs pandas to operate on columns.
The result is stored in a new DataFrame, clean_sales_df, which preserves the original sales_df.
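When the data comes from a CSV file, you can also skip unwanted columns at load time instead of dropping them afterwards. The sketch below uses the `usecols` parameter of `pd.read_csv()` with a callable; the inline CSV text and the `unwanted` set are stand-ins for a real file and a real exclusion list.

```python
import io
import pandas as pd

# Inline CSV standing in for a file on disk
csv_text = """product_id,product_name,price,internal_code
101,Laptop,1200,A123
102,Phone,800,B456
"""

unwanted = {'internal_code'}  # hypothetical columns to skip

# usecols accepts a callable that is evaluated against each column name;
# only names for which it returns True are loaded
sales_df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name not in unwanted,
)

print(list(sales_df.columns))  # ['product_id', 'product_name', 'price']
```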

Efficient data aggregation using `drop()` for intermediate columns

You can use the drop() method to tidy up your DataFrame after an aggregation by removing columns that were only needed for intermediate calculations.

    import pandas as pd

    # Sales data with multiple product categories
    sales_data = pd.DataFrame({
        'product': ['A', 'B', 'A', 'C', 'B', 'A'],
        'region': ['East', 'West', 'East', 'North', 'East', 'West'],
        'units_sold': [10, 15, 12, 8, 20, 25],
        'price_per_unit': [100, 80, 100, 120, 80, 100]
    })

    # Calculate revenue and profit with 25% margin
    sales_data['revenue'] = sales_data['units_sold'] * sales_data['price_per_unit']
    sales_data['cost'] = sales_data['revenue'] * 0.75
    sales_data['profit'] = sales_data['revenue'] - sales_data['cost']

    # Create summary by product; numeric_only=True skips the text column 'region'
    summary = sales_data.groupby('product').sum(numeric_only=True)

    # Drop intermediate calculation columns
    summary_clean = summary.drop(['price_per_unit', 'cost'], axis=1)
    print(summary_clean)

This code calculates revenue and profit before summarizing sales data by product. The groupby('product').sum(numeric_only=True) call aggregates only the numeric columns; without numeric_only=True, recent pandas versions (2.0 and later) would also try to aggregate the text column region. Summing still leaves some columns, like price_per_unit, meaningless in the summary.

The drop() method is then called on the aggregated summary DataFrame.
It removes the now-irrelevant price_per_unit and cost columns.

This leaves a clean, final report focused only on total units sold, revenue, and profit for each product, which is the intended outcome.
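An alternative to dropping after the aggregation is to select the columns you want before summing, so the unwanted ones never reach the summary. The following sketch uses a trimmed version of the sales data above and is one possible arrangement, not the only one.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'product': ['A', 'B', 'A', 'C', 'B', 'A'],
    'units_sold': [10, 15, 12, 8, 20, 25],
    'price_per_unit': [100, 80, 100, 120, 80, 100],
})
sales_data['revenue'] = sales_data['units_sold'] * sales_data['price_per_unit']

# Select the columns to aggregate, then sum per product
summary = sales_data.groupby('product')[['units_sold', 'revenue']].sum()

print(summary)
```

Selecting first keeps the intent explicit and avoids summing columns that are discarded immediately afterwards.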

Get started with Replit

Now, turn your knowledge into a functional tool. Describe what you want to build to Replit Agent, like "a utility that cleans CSVs by removing specified columns" or "an app that drops columns with too many nulls."

The Agent will write the code, test for errors, and deploy your app. Start building with Replit.

Build your first app today

Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.

Get started free
