How to plot a histogram in Python

Learn how to plot histograms in Python. Discover different methods, tips, real-world applications, and how to debug common errors.

Published on: Thu, Feb 12, 2026
Updated on: Tue, Feb 24, 2026
The Replit Team

A histogram is a powerful tool to visualize the distribution of numerical data. Python libraries offer simple functions to create these plots, which helps you understand data patterns at a glance.

In this article, we'll explore techniques to plot histograms with popular libraries. You'll find practical tips, see real-world applications, and get advice to debug common issues you might face.

Basic histogram with matplotlib

import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)  # Generate random data
plt.hist(data, bins=30)
plt.title('Simple Histogram')
plt.show()

Output:
[No textual output - displays a histogram of normally distributed data with 30 bins]

The magic happens with the plt.hist() function. It takes your dataset—in this case, 1000 randomly generated numbers—and organizes it visually. The key parameter here is bins, which controls the number of vertical bars in your histogram.

By setting bins=30, you're telling the function to divide the entire range of your data into 30 equal segments. The height of each bar then shows how many data points fall into that specific segment. Choosing the right number of bins is crucial; too few can oversimplify the distribution, while too many might introduce noise and obscure the underlying pattern.
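
To make the "equal segments" idea concrete, here is a minimal sketch that computes the bin edges with NumPy's np.histogram_bin_edges(). The fixed seed and the use of the "auto" rule are illustrative assumptions, not part of the example above.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed (an assumption) so the run is repeatable
data = rng.standard_normal(1000)

# 30 bins means 31 edges that split [data.min(), data.max()] into equal widths
edges = np.histogram_bin_edges(data, bins=30)
widths = np.diff(edges)
print(len(edges), widths.min(), widths.max())

# Let NumPy suggest a bin count instead of guessing one
auto_edges = np.histogram_bin_edges(data, bins="auto")
print("suggested number of bins:", len(auto_edges) - 1)
```

plt.hist() accepts the same string rules, so bins="auto" is a reasonable starting point when you aren't sure how many bins to use.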

Customizing and enhancing histograms

While a basic histogram is useful, you can unlock more insights by customizing its appearance and using other libraries for deeper data analysis.

Customizing histogram appearance

import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)
plt.hist(data, bins=20, color='skyblue', edgecolor='black', alpha=0.7)
plt.title('Customized Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')

Output:
Text(0, 0.5, 'Frequency')

You can easily tweak the histogram's look by passing more arguments into the plt.hist() function. This helps make your chart clearer and more visually appealing.

  • The color argument sets the fill color of the bars.
  • edgecolor defines the border color for each bar, which can improve contrast.
  • alpha adjusts the transparency; a value like 0.7 makes the bars partially see-through.

Adding context is just as important. The plt.xlabel() and plt.ylabel() functions let you label the axes, so anyone looking at your plot immediately understands what the values and frequencies represent.

Using numpy for histogram calculations

import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(170, 10, 250)  # Height data (mean=170, std=10)
counts, bins = np.histogram(data, bins=15)
plt.stairs(counts, bins)
plt.xlabel('Height (cm)')
plt.ylabel('Count')

Output:
Text(0, 0.5, 'Count')

Instead of plotting directly, you can use NumPy to pre-calculate histogram data. This gives you more control over the results before visualization.

  • The np.histogram() function processes your dataset and returns two key pieces of information: the frequency counts and the bin edges.
  • You can then pass these results to a different plotting function, such as plt.stairs(), which draws a step plot that outlines the histogram's shape.
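
As a quick check of what np.histogram() hands back, the sketch below inspects the two returned arrays. The seed and the bin-center calculation are illustrative additions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for repeatability
data = rng.normal(170, 10, 250)

counts, edges = np.histogram(data, bins=15)

# There is always one more edge than there are bins
print(len(counts), len(edges))  # 15 16

# Every sample lands in exactly one bin, so the counts add up to the sample size
print(counts.sum())  # 250

# Bin centers are handy for labeling or for plotting with plt.bar()
centers = (edges[:-1] + edges[1:]) / 2
```

Because the default range runs from the smallest to the largest value, no sample is dropped. If you pass a narrower range argument, the counts will no longer sum to the full sample size.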

Creating histograms with pandas

import pandas as pd
import numpy as np

df = pd.DataFrame({'values': np.random.normal(0, 1, 1000)})
hist = df.hist(column='values', bins=25, grid=False, figsize=(8, 6))

Output:
array([[<Axes: title={'center': 'values'}>]], dtype=object)

If your data's already in a pandas DataFrame, you can plot a histogram with the built-in .hist() method. It’s a convenient wrapper around matplotlib that simplifies the process—just call the method on your DataFrame and specify which column to visualize.

  • The bins argument works just as it does in matplotlib.
  • You can pass other formatting options, like grid=False to remove grid lines or figsize to adjust the plot's size.
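
If you want the numbers behind the plot rather than the picture, one option is to bin the column with pd.cut(). This is a sketch and not what .hist() does internally: pd.cut() uses right-closed intervals and nudges the lowest edge slightly, so the boundaries can differ marginally from the plotted ones.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # seed chosen only for repeatability
df = pd.DataFrame({"values": rng.normal(0, 1, 1000)})

# pd.cut assigns each row to one of 25 equal-width intervals
binned = pd.cut(df["values"], bins=25)

# sort=False keeps the intervals in ascending order instead of ordering by count
counts = binned.value_counts(sort=False)
print(counts.head())
```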

Advanced histogram techniques

To tell a more compelling story with your data, you can use advanced libraries for richer statistical plots, interactive charts, and direct data comparisons.

Statistical visualization with seaborn

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

data = np.random.normal(0, 1, 1000)
sns.histplot(data, kde=True, stat="density", linewidth=0)
plt.title('Histogram with Kernel Density Estimate')

Output:
Text(0.5, 1.0, 'Histogram with Kernel Density Estimate')

Seaborn is built on matplotlib and simplifies creating statistically rich visualizations. Its sns.histplot() function offers more than just bars; it can reveal deeper patterns in your data.

  • Setting kde=True adds a Kernel Density Estimate. This is a smooth line that estimates the probability distribution of the data, giving you a clearer view of the shape.
  • Using stat="density" normalizes the histogram. Instead of showing raw counts, the area of the bars sums to one, which makes it easier to compare with the KDE curve.
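
The density normalization can be verified with plain NumPy. The sketch below uses np.histogram(density=True), which applies the same idea; it is not seaborn's internal code, and the seed is an assumption.

```python
import numpy as np

rng = np.random.default_rng(7)  # seed chosen only for repeatability
data = rng.normal(0, 1, 1000)

# density=True rescales the counts so that bar height times bar width sums to 1
density, edges = np.histogram(data, bins=30, density=True)
area = np.sum(density * np.diff(edges))
print(round(area, 6))  # 1.0
```

Note that individual bar heights can exceed 1 when the bins are narrow; it is the total area, not the height, that is normalized.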

Interactive histograms with plotly

import plotly.express as px
import numpy as np

data = np.random.normal(0, 1, 1000)
fig = px.histogram(data, nbins=30, marginal="box",
                  title="Interactive Histogram with Box Plot")
fig.show()

Output:
[No textual output - displays an interactive histogram with a box plot]

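Plotly specializes in creating interactive charts that you can explore directly. Using px.histogram(), you can generate a plot where hovering over bars reveals exact counts and you can zoom into specific regions. This makes data inspection much more dynamic. Note that nbins acts as an upper limit rather than an exact count, so Plotly may draw fewer than 30 bins to keep the bin boundaries round.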

  • A key feature is the marginal="box" argument, which adds a box plot above the histogram.
  • This gives you a quick statistical summary of the data's distribution, including its median and quartiles.

Multiple histograms for comparison

import matplotlib.pyplot as plt
import numpy as np

data1 = np.random.normal(0, 1, 1000)
data2 = np.random.normal(3, 1.5, 1000)
plt.hist([data1, data2], bins=30, alpha=0.7, label=['Distribution 1', 'Distribution 2'])
plt.legend()
plt.title('Comparing Multiple Distributions')

Output:
Text(0.5, 1.0, 'Comparing Multiple Distributions')

You can easily compare distributions by plotting them on the same axes. Simply pass a list of your datasets, such as [data1, data2], directly into the plt.hist() function. The datasets share one set of bin edges, and with the default bar style matplotlib draws their bars side by side within each bin, making it easy to spot differences in their shapes and spreads.

  • To distinguish between the datasets, provide a list of names to the label argument.
  • Call the plt.legend() function to display these labels on the plot. The alpha argument makes the bars slightly transparent, which matters most when histograms are drawn on top of each other rather than side by side.
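
If you would rather draw the two distributions on top of each other, one option is histtype="stepfilled" combined with a shared set of bin edges. The sketch below is one way to do it; the seed and the "Agg" backend line are assumptions made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)  # seed chosen only for repeatability
data1 = rng.normal(0, 1, 1000)
data2 = rng.normal(3, 1.5, 1000)

# One set of edges covering both datasets keeps the bars directly comparable
shared_edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=30)

fig, ax = plt.subplots()
# histtype="stepfilled" draws the datasets over each other instead of side by side
counts, edges, _ = ax.hist([data1, data2], bins=shared_edges,
                           histtype="stepfilled", alpha=0.5,
                           label=["Distribution 1", "Distribution 2"])
ax.legend()
ax.set_title("Overlaid Distributions")
```

With an interactive backend, call plt.show() afterwards to display the figure.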

Move faster with Replit

Replit is an AI-powered development platform that transforms natural language into working applications. You can describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.

For the histogram techniques we've explored, Replit Agent can turn them into production-ready tools. You can use it to build complete applications based on the plotting methods covered in this article.

  • Build a user analytics dashboard that visualizes engagement metrics with interactive histograms, similar to those made with plotly.
  • Create a financial analysis tool that overlays histograms to compare the performance distributions of different stocks.
  • Deploy a scientific data utility that uses density plots from seaborn to analyze and present experimental results.

Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically, all within your browser.

Common errors and challenges

Even with simple functions, you can run into a few common roadblocks when plotting histograms in Python.

The plt.hist() function expects numerical data. If you pass it strings, then depending on your matplotlib version you'll either get a TypeError or a plot that treats each string as a category rather than a number. Before plotting, always ensure your data is cleaned and converted to a numerical format like an integer or float.

When you overlay multiple histograms, they can easily obscure one another, making the chart difficult to read. You can solve this by using the alpha argument to control transparency. Setting a value like alpha=0.7 makes the bars partially see-through, which clearly shows how the different distributions overlap.

If your histogram doesn't appear at all, the problem often lies with the bins parameter. Setting bins to zero or a negative number raises a ValueError instead of drawing anything. Similarly, if you define custom bin edges that fall completely outside your data's range, the plot will show no bars because no data points fall into any of the specified intervals.

Fixing incorrect data types for plt.hist()

The plt.hist() function is meant for numerical data. If you pass it a list of strings, even ones that look like numbers, it can't place them on a numeric scale. Depending on your matplotlib version, the call either raises a TypeError or treats each string as a category, which gives a misleading plot. The code below reproduces the problem.

import matplotlib.pyplot as plt

# Numbers stored as strings
data = ['1', '2', '3', '4', '5', '3', '2', '1', '2', '3']
plt.hist(data)
plt.title('Histogram of String Data')
plt.show()

The code goes wrong because the data list contains strings instead of numbers, so plt.hist() can't bin the values numerically. The example below shows how to prepare the data correctly.

import matplotlib.pyplot as plt

# Convert string data to numeric
data = ['1', '2', '3', '4', '5', '3', '2', '1', '2', '3']
numeric_data = [int(x) for x in data]
plt.hist(numeric_data)
plt.title('Histogram of Numeric Data')
plt.show()

The fix is to convert the string data into a numerical format. The solution uses a list comprehension, [int(x) for x in data], to iterate through the list and transform each string element into an integer. This new list of numbers, numeric_data, can then be correctly processed by plt.hist(). This error often appears when you import data from files like CSVs, where numbers can be accidentally read as text by default.
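
For data loaded from a CSV, int(x) fails on the first entry that isn't a clean number. A more forgiving approach is pandas' pd.to_numeric() with errors="coerce". The sample values below are made up for illustration.

```python
import pandas as pd

# Values as they might arrive from a CSV column read as text (made-up sample)
raw = pd.Series(["1", "2", "3", "n/a", "5", "4"])

# errors="coerce" turns anything unparseable into NaN instead of raising
numeric = pd.to_numeric(raw, errors="coerce")

# Drop the NaN entries before handing the values to plt.hist()
clean = numeric.dropna()
print(clean.tolist())  # [1.0, 2.0, 3.0, 5.0, 4.0]
```

Dropping rows silently can hide data-quality problems, so it is worth checking how many values were coerced, for example with numeric.isna().sum().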

Resolving overlapping histograms with proper alpha settings

When you're plotting multiple histograms together, one can easily block the other, making it impossible to see where they overlap. This happens because the bars are solid by default. The code below shows how the second histogram completely hides parts of the first.

import matplotlib.pyplot as plt
import numpy as np

data1 = np.random.normal(0, 1, 1000)
data2 = np.random.normal(2, 1, 1000)

plt.hist(data1, bins=30, color='blue')
plt.hist(data2, bins=30, color='red')
plt.title('Two Distributions')
plt.show()

By calling plt.hist() twice, the code layers one histogram on top of another. The second plot's opaque red bars are drawn over the first, hiding whatever part of the blue bars sits behind them. The following example demonstrates the fix.

import matplotlib.pyplot as plt
import numpy as np

data1 = np.random.normal(0, 1, 1000)
data2 = np.random.normal(2, 1, 1000)

plt.hist(data1, bins=30, color='blue', alpha=0.5)
plt.hist(data2, bins=30, color='red', alpha=0.5)
plt.title('Two Distributions')
plt.show()

The fix is to add the alpha argument to each plt.hist() call. Setting alpha=0.5 makes the bars semi-transparent, so you can see where the distributions intersect. This reveals the full shape of each dataset, preventing one from completely hiding another.

This technique is crucial whenever you're layering multiple plots on the same axes. It ensures your comparative visualizations are clear and easy to interpret.

Troubleshooting missing histogram when using incorrect bin values

If your histogram doesn't appear, the issue often lies with the bins parameter. Setting it to an invalid number, such as zero, stops the plot from rendering: with no intervals to group data into, the call raises a ValueError. See what happens below.

import matplotlib.pyplot as plt
import numpy as np

data = np.random.normal(0, 1, 100)
plt.hist(data, bins=0)  # Invalid bin value
plt.title('Histogram with Invalid Bins')
plt.show()

The code passes bins=0 to plt.hist(), which is an invalid argument. With no bins available to sort the data, the function raises a ValueError before anything is drawn. The following example shows how to provide a valid value.

import matplotlib.pyplot as plt
import numpy as np

data = np.random.normal(0, 1, 100)
plt.hist(data, bins=10)  # Using a valid bin value
plt.title('Histogram with Valid Bins')
plt.show()

The fix is to provide a positive integer to the bins parameter, such as bins=10. This gives plt.hist() a valid number of intervals to group your data into, allowing it to render the plot correctly. A plot with no visible bars can also result if you manually define bin edges that fall completely outside your data's range, so always double-check your binning strategy if a histogram looks empty.
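
Both failure modes can be reproduced without plotting, since plt.hist() relies on NumPy for the binning. The sketch below uses np.histogram() directly; the seed and the out-of-range edges are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)  # seed chosen only for repeatability
data = rng.normal(0, 1, 100)

# bins=0 is rejected outright
try:
    np.histogram(data, bins=0)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# Edges far outside the data range are valid but catch no samples
counts, _ = np.histogram(data, bins=[100, 101, 102])
print(counts)  # [0 0]
```

Comparing data.min() and data.max() against your custom edges before plotting is a quick way to catch the second case.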

Real-world applications

With the technical challenges solved, you can apply histograms to real-world scenarios like analyzing exam scores and detecting outliers.

Analyzing exam scores with plt.hist()

Using plt.hist(), you can visualize the distribution of exam scores to quickly spot performance patterns and understand the overall class dynamic.

import matplotlib.pyplot as plt
import numpy as np

# Simulated exam scores with two performance groups
scores = np.concatenate([np.random.normal(65, 10, 30), np.random.normal(85, 8, 70)])

plt.hist(scores, bins=10, color='green', alpha=0.7)
plt.axvline(scores.mean(), color='red', linestyle='dashed')
plt.title('Distribution of Exam Scores')
plt.xlabel('Score')
plt.ylabel('Number of Students')
plt.show()

This code simulates a bimodal distribution of exam scores, which represents a class with two distinct performance groups. It uses np.concatenate() to combine two separate datasets into one before plotting.

  • The plt.hist() function visualizes the distribution of these combined scores.
  • A dashed vertical line is added with plt.axvline() to mark the overall average score, which lets you compare the mean against the two performance clusters.

Detecting outliers with the np.std() threshold

By calculating the standard deviation with np.std(), you can set a simple threshold on a histogram to visually flag potential outliers in your dataset.

import matplotlib.pyplot as plt
import numpy as np

# Generate data with outliers
data = np.concatenate([np.random.normal(0, 1, 500), np.random.uniform(5, 10, 10)])

plt.hist(data, bins=30)
plt.title('Dataset with Outliers')
plt.axvline(np.mean(data) + 2*np.std(data), color='red', linestyle='dashed')
plt.show()

This code creates a dataset by combining two different distributions with np.concatenate(). The bulk of the data is normally distributed, with a smaller set of uniformly distributed points between 5 and 10 added to act as outliers.

  • The plt.hist() function visualizes the combined data across 30 bins.
  • A vertical dashed red line is drawn using plt.axvline().
  • Its position is calculated at two standard deviations (np.std()) above the mean (np.mean()). Values to the right of this line are candidates for outliers.
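
The same threshold can be used to pull out the flagged values, not just draw the line. The sketch below is a minimal version under the same assumptions as the example (a one-sided cutoff at two standard deviations), with a fixed seed added for repeatability.

```python
import numpy as np

rng = np.random.default_rng(11)  # seed chosen only for repeatability
data = np.concatenate([rng.normal(0, 1, 500), rng.uniform(5, 10, 10)])

# Same cutoff as the dashed line: mean plus two standard deviations
threshold = np.mean(data) + 2 * np.std(data)

# Boolean mask selects every value to the right of the line
flagged = data[data > threshold]
print(f"threshold={threshold:.2f}, flagged={flagged.size} of {data.size}")
```

Keep in mind that the outliers themselves pull the mean and standard deviation upward, so this rule is a rough first pass rather than a formal test.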

Get started with Replit

Turn your knowledge into a real application. Tell Replit Agent: "Build a dashboard to visualize user engagement with interactive histograms" or "Create a tool to compare stock performance distributions."

The agent writes the code, tests for errors, and deploys your app automatically. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.