How to compare NaN in Python

Learn to compare NaN in Python. This guide covers methods, tips, real-world uses, and debugging common errors.

Published on: Mon, Apr 6, 2026
Updated on: Wed, Apr 8, 2026
The Replit Team

In Python, NaN (Not a Number) values present unique comparison challenges. An equality check with == always returns False for NaN, even when both sides are NaN, so you need dedicated functions such as math.isnan() to handle them correctly in data analysis.

Here, you'll learn techniques to compare NaN values, with real-world applications and debugging advice. These tips help you manage data accurately and avoid common pitfalls in your projects.

Using math.isnan() to check for NaN values

import math
x = float('nan')
is_nan = math.isnan(x)
print(f"Is x a NaN? {is_nan}")
print(f"x == x is {x == x}")  # NaN is never equal to itself

Output:
Is x a NaN? True
x == x is False

The math.isnan() function is the standard library's answer to the unique behavior of NaN. Because a NaN value is never equal to itself, you can't use the standard equality operator == for checks. The code confirms this by showing x == x returns False, which is a core principle of the IEEE 754 floating-point standard.

Using math.isnan() provides a reliable way to detect these values. It's designed specifically to handle this exception, ensuring your data validation and cleaning processes work as expected without silent failures from faulty comparisons.
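As a quick illustration of what math.isnan() accepts, the sketch below runs it on a few representative inputs. The sample values are made up for demonstration. Note that it only takes real numbers, so strings and None raise a TypeError rather than returning False.

```python
import math

print(math.isnan(math.nan))        # True
print(math.isnan(float('inf')))    # False: infinity is not NaN
print(math.isnan(42))              # False: ints are accepted

# Non-numeric inputs are rejected instead of being treated as "not NaN".
for bad in ("nan", None):
    try:
        math.isnan(bad)
    except TypeError:
        print(f"math.isnan({bad!r}) raised TypeError")
```

If your data may contain non-numeric values, guard the call with an isinstance() check or catch the TypeError.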

Standard library approaches for NaN comparison

While math.isnan() is the standard, you can also handle NaN comparisons by using operators like ==, leveraging the cmath module, or building a custom function.

Using == and != operators with NaN

import math
nan1 = float('nan')
nan2 = float('nan')
print(f"nan1 == nan2: {nan1 == nan2}")
print(f"nan1 != nan2: {nan1 != nan2}")
print(f"nan1 is nan2: {nan1 is nan2}")

Output:
nan1 == nan2: False
nan1 != nan2: True
nan1 is nan2: False

The equality operator == always returns False when comparing two NaN values, which is why nan1 == nan2 evaluates to False. Consequently, the inequality operator != returns True. This behavior is a core feature of how floating-point numbers are defined.

  • You can use the fact that x != x is only true when x is NaN as a clever way to detect it.
  • The identity check nan1 is nan2 also returns False, because each float('nan') call creates a distinct object in memory, even though both represent the same concept.
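Both points can be sketched with built-ins only. The helper name is_nan_by_self_compare is hypothetical, chosen for this illustration. The last lines show a side effect of the identity rule: list membership checks identity before equality, so the same NaN object is found in a list while a freshly created one is not.

```python
def is_nan_by_self_compare(x):
    """Return True when x is NaN, relying on NaN being unequal to itself."""
    return x != x

nan = float('nan')
print(is_nan_by_self_compare(nan))   # True
print(is_nan_by_self_compare(1.5))   # False

# The same object is identical to itself, even though it is not equal to itself.
print(nan is nan)                    # True
print(nan == nan)                    # False

# Containers test identity before equality, so membership depends on the object.
values = [nan]
print(nan in values)                 # True: same object
print(float('nan') in values)        # False: different object, and == is False
```

The self-comparison trick works, but math.isnan() states the intent more clearly and is usually the better choice in shared code.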

Using the cmath module for complex NaN detection

import cmath
import math
complex_nan = complex(float('nan'), 0)
print(f"Is real part NaN? {math.isnan(complex_nan.real)}")
print(f"Is complex number NaN? {cmath.isnan(complex_nan)}")

Output:
Is real part NaN? True
Is complex number NaN? True

The cmath module is your tool for handling NaN in complex numbers. You can't use math.isnan() on a complex number object directly; it only works on its individual floating-point components.

  • Use math.isnan() to check the .real or .imag parts separately.
  • Use cmath.isnan() to check the entire complex number at once. It returns True if either part is NaN.
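To make the "either part" rule concrete, here is a small sketch with a NaN in the real part, a NaN in the imaginary part, and a clean value. The variable names are illustrative only.

```python
import cmath
import math

z_real_nan = complex(float('nan'), 1.0)
z_imag_nan = complex(2.0, float('nan'))
z_clean = complex(2.0, 1.0)

for z in (z_real_nan, z_imag_nan, z_clean):
    print(f"{z}: cmath.isnan -> {cmath.isnan(z)}")

# math.isnan() rejects complex input, so check the parts individually.
try:
    math.isnan(z_imag_nan)
except TypeError:
    print("math.isnan() raised TypeError for a complex argument")

print(math.isnan(z_imag_nan.imag))  # True
```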

Creating a custom nan_equal() function

import math

def nan_equal(a, b):
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b

x, y = float('nan'), float('nan')
print(f"Regular equality: {x == y}")
print(f"Custom equality: {nan_equal(x, y)}")

Output:
Regular equality: False
Custom equality: True

Sometimes you need to treat two NaN values as equal, which standard operators don't allow. A custom function like nan_equal() gives you control over this logic. It's useful when you want to consider all NaNs as representing the same "missing data" state in your analysis.

  • The function first checks if both arguments are NaN using math.isnan(). If they are, it returns True, overriding the default behavior.
  • If one or both values are not NaN, it falls back to a standard a == b comparison.
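One possible extension, sketched here as an assumption rather than a standard recipe, is a variant that tolerates non-float inputs and a hypothetical helper, sequences_nan_equal(), that applies the same rule element by element. The isinstance() guard is added because math.isnan() raises a TypeError on strings and None.

```python
import math

def nan_equal(a, b):
    """Treat two NaN floats as equal; otherwise use ordinary equality."""
    both_nan = (
        isinstance(a, float) and isinstance(b, float)
        and math.isnan(a) and math.isnan(b)
    )
    return both_nan or a == b

def sequences_nan_equal(xs, ys):
    """Hypothetical helper: element-wise comparison of two sequences."""
    return len(xs) == len(ys) and all(nan_equal(x, y) for x, y in zip(xs, ys))

left = [1.0, float('nan'), "n/a"]
right = [1.0, float('nan'), "n/a"]
print(left == right)                     # False: the NaN elements differ
print(sequences_nan_equal(left, right))  # True
```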

Advanced techniques for NaN handling

While standard library functions are great for single values, data science libraries like NumPy and pandas offer more powerful, vectorized tools for handling NaN in arrays.

Using NumPy's np.isnan() for array operations

import numpy as np
arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
nan_mask = np.isnan(arr)
print(f"Array: {arr}")
print(f"NaN mask: {nan_mask}")
print(f"Non-NaN values: {arr[~nan_mask]}")

Output:
Array: [ 1. nan  3. nan  5.]
NaN mask: [False  True False  True False]
Non-NaN values: [1. 3. 5.]

NumPy's np.isnan() function is built for speed and efficiency with arrays. Instead of checking elements one by one, it operates on the entire array at once, producing a boolean mask that flags the position of each NaN value.

  • This mask—in this case, [False, True, False, True, False]—pinpoints exactly where the NaNs are.
  • You can then use this mask to easily filter your data. By inverting it with the ~ operator, you can select only the valid, non-NaN numbers from your array.
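The same self-inequality rule applies when you compare whole arrays. The sketch below assumes NumPy 1.19 or newer, where np.array_equal() accepts an equal_nan argument. The arrays are made-up sample data.

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, np.nan, 3.0])

# Element-wise == is False at the NaN position.
print((a == b).tolist())                     # [True, False, True]

# By default the arrays are therefore not considered equal.
print(np.array_equal(a, b))                  # False

# equal_nan=True treats NaNs in matching positions as equal.
print(np.array_equal(a, b, equal_nan=True))  # True

# Summing the mask counts the NaN entries.
print(int(np.isnan(a).sum()))                # 1
```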

Working with NaN in pandas dataframes

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [np.nan, 5, 6]})
print(df)
print("\nRows with any NaN values:")
print(df[df.isna().any(axis=1)])

Output:
     A    B
0  1.0  NaN
1  NaN  5.0
2  3.0  6.0

Rows with any NaN values:
     A    B
0  1.0  NaN
1  NaN  5.0

Pandas is your go-to for handling NaN in tabular data like DataFrames. The isna() method works like NumPy's isnan() but on the entire structure, creating a boolean mask that flags missing values.

  • You can chain .any(axis=1) to this mask. This checks each row for at least one True value, identifying all rows that contain a NaN.
  • This boolean result is then used to filter the original DataFrame, giving you a clean way to isolate and inspect incomplete records.
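When the goal is to compare two DataFrames rather than inspect one, a sketch like the following may help. It contrasts element-wise == with DataFrame.equals(), which treats NaNs in the same location as equal, and shows dropna() as the counterpart to isolating incomplete rows. The frame names df1 and df2 are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1, np.nan, 3], 'B': [np.nan, 5, 6]})
df2 = df1.copy()

# Element-wise == reports False wherever a NaN is involved.
print((df1 == df2).all().all())   # False

# DataFrame.equals() treats NaNs in the same position as equal.
print(df1.equals(df2))            # True

# Drop incomplete rows instead of isolating them.
print(df1.dropna())
```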

Using pd.isna() for vectorized NaN checking

import pandas as pd
import numpy as np
values = [1, np.nan, "string", None, pd.NA]
result = pd.isna(values)
for val, is_na in zip(values, result):
    print(f"{val!r} is NaN/NA: {is_na}")

Output:
1 is NaN/NA: False
nan is NaN/NA: True
'string' is NaN/NA: False
None is NaN/NA: True
<NA> is NaN/NA: True

The pd.isna() function is a versatile tool for detecting missing values across mixed data types. It's a powerful, vectorized check that goes beyond just floating-point NaNs, giving you a single function to handle various "not available" markers common in data analysis.

  • It correctly identifies not only np.nan but also Python's built-in None object.
  • It also recognizes the experimental pd.NA value, which is pandas' modern approach to handling missing data consistently across different types.

Move faster with Replit

Replit is an AI-powered development platform where all Python dependencies come pre-installed, so you can skip setup and start coding instantly. This lets you move from learning individual techniques, like handling NaN values, to building complete applications faster.

Instead of piecing together functions, you can use Agent 4 to build a working product directly from a description. It handles writing the code, connecting to databases, and managing deployment. You can describe the app you want to build, and Agent will take it from there. For example, you could build:

  • A data cleaning utility that scans CSV files, identifies rows with NaN or None values using pd.isna(), and exports a sanitized version.
  • A financial data dashboard that pulls stock data, flags missing data points in the time series, and visualizes only the complete data.
  • A data validation tool that checks numerical inputs, using math.isnan() to reject invalid entries before they are saved to a database.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Even with the right tools, NaN values can cause subtle bugs in conditionals, calculations, and data pipelines if you're not careful.

Troubleshooting NaN comparisons in conditionals

A frequent mistake is using the equality operator == to check for NaN inside an if statement. Because a NaN value is never equal to itself, a condition like if my_var == float('nan'): will always fail, even if my_var is NaN. This can lead to silent bugs where your error-handling logic is never triggered.

  • The Fix: Always use math.isnan(my_var) inside conditionals. It's the most direct and reliable way to test whether a single floating-point value is NaN.
  • The Takeaway: Don't rely on direct comparison for NaN. It breaks the standard rules of equality you'd expect from other values.

Avoiding division by zero errors with NaN handling

In standard Python, dividing by zero raises a ZeroDivisionError. However, in libraries like NumPy, dividing a float by zero produces inf (infinity), and 0.0 / 0.0 results in NaN. If you're not prepared for this, unexpected NaN values can appear in your data after a calculation and corrupt downstream results.

A good practice is to check for NaN immediately after any vectorized division. This allows you to handle the results gracefully—for example, by replacing the NaN with a zero or another default value—before it affects further analysis.
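A minimal sketch of that practice, assuming NumPy arrays of floats and a default of 0.0 (both are choices made for this example). np.errstate() silences the RuntimeWarning that NumPy would otherwise emit for division by zero and invalid operations.

```python
import numpy as np

numerators = np.array([1.0, 0.0, 4.0])
denominators = np.array([2.0, 0.0, 0.0])

# Suppress the warnings for x / 0 and 0 / 0 inside this block only.
with np.errstate(divide='ignore', invalid='ignore'):
    ratios = numerators / denominators

print(ratios.tolist())   # [0.5, nan, inf]

# Replace anything that is not finite (NaN or inf) with a default of 0.0.
cleaned = np.where(np.isfinite(ratios), ratios, 0.0)
print(cleaned.tolist())  # [0.5, 0.0, 0.0]
```

Whether 0.0 is an appropriate default depends on the analysis. In some pipelines it is safer to keep the NaN and exclude it later.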

Debugging issues with NaN propagation in calculations

NaN values are contagious. Almost any arithmetic operation involving a NaN produces another NaN. For instance, 5 + np.nan results in NaN. This is called propagation, and it can make debugging a nightmare because the original source of the NaN might be many steps removed from where you finally notice it.

When you find a NaN in your final output, you have to trace it backward. Calling df.isna().sum() on a DataFrame df reports the number of missing values in each column, which helps you pinpoint where NaNs are accumulating and gives you a starting point for your investigation.
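The following sketch shows that kind of trace on a small, made-up DataFrame. The column names price, volume, and turnover are hypothetical. The derived column inherits a NaN from either input, and the per-column counts make that visible.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'price': [10.0, np.nan, 12.0, np.nan],
    'volume': [100.0, 200.0, np.nan, 400.0],
})
# A derived column inherits NaN from either input.
df['turnover'] = df['price'] * df['volume']

# Number of missing values per column.
nan_counts = df.isna().sum()
print(nan_counts)

# Index labels of the rows where the derived column is missing.
print(df.index[df['turnover'].isna()].tolist())  # [1, 2, 3]
```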

Troubleshooting NaN comparisons in conditionals

A frequent mistake is using the equality operator == to check for NaN inside a conditional. Since NaN is never equal to itself, this check always fails, leading to unexpected outcomes. The code below shows exactly why this logic is flawed.

import math
value = float('nan')
if value == float('nan'):  # This will never be true
    print("Value is NaN")
else:
    print("Value is not NaN")  # This will always execute

The else block runs because the == operator always returns False when comparing NaN values, making the conditional check unreliable. The correct way to test for NaN is demonstrated in the next snippet.

import math
value = float('nan')
if math.isnan(value):  # Proper way to check for NaN
    print("Value is NaN")
else:
    print("Value is not NaN")

This code demonstrates the correct fix. The math.isnan() function is specifically designed to return True for NaN values, making your conditional checks reliable. This approach ensures the if block executes correctly, unlike comparisons with the == operator.

  • Watch for this issue during data validation or when handling results from calculations. An == comparison will never catch a NaN, while math.isnan() will.

Avoiding division by zero errors with NaN handling

In standard Python, dividing by zero triggers a ZeroDivisionError, which can crash your program if it isn't handled. This differs from libraries like NumPy, where such operations might result in inf or NaN. The following code demonstrates this common runtime error.

def safe_divide(a, b):
    return a / b  # Will raise ZeroDivisionError if b is 0

result = safe_divide(10, 0)  # Raises ZeroDivisionError

The safe_divide function attempts to compute 10 / 0 directly. This operation is invalid in standard Python, triggering a ZeroDivisionError that halts the program. The following code demonstrates how to handle this gracefully.

import math

def safe_divide(a, b):
    if b == 0:
        return float('nan')  # Return NaN instead of raising error
    return a / b

result = safe_divide(10, 0)
print(f"Result: {result}, is NaN: {math.isnan(result)}")

This safe_divide function sidesteps a ZeroDivisionError by checking if the divisor is 0. Instead of crashing, it returns float('nan') to represent the undefined result. This strategy is crucial in data processing, as it allows calculations to proceed without interruption. You can then handle the resulting NaN values later in your workflow, preventing a single invalid operation from halting your entire script.
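To illustrate the "handle it later" step, this sketch runs the same safe_divide() idea over a batch of made-up input pairs and then separates the valid results from the NaN markers. The sample data and the variable names valid and skipped are assumptions for the example.

```python
import math

def safe_divide(a, b):
    """Return NaN instead of raising on a zero divisor."""
    if b == 0:
        return float('nan')
    return a / b

pairs = [(10, 2), (7, 0), (9, 3)]   # made-up sample data
results = [safe_divide(a, b) for a, b in pairs]

# Handle the NaN markers once the whole batch has been processed.
valid = [r for r in results if not math.isnan(r)]
skipped = len(results) - len(valid)
print(f"Valid results: {valid}, skipped: {skipped}")
# Valid results: [5.0, 3.0], skipped: 1
```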

Debugging issues with NaN propagation in calculations

Because NaN values are contagious, they can silently corrupt your calculations. A single NaN in a dataset can spread through your functions, making the final output unusable and hard to debug. The following code shows how this happens in a simple averaging function.

import math

def calculate_average(numbers):
    total = 0
    for num in numbers:
        total += num  # NaN will propagate silently
    return total / len(numbers)

data = [1, 2, float('nan'), 4]
avg = calculate_average(data)
print(f"Average: {avg}")  # Will print: Average: nan

The total += num operation turns the entire sum into NaN the moment it encounters the NaN value, corrupting the final average. The following code shows how to modify the function to handle this case correctly.

import math

def calculate_average(numbers):
    valid_nums = [num for num in numbers if not math.isnan(num)]
    if not valid_nums:
        return 0  # or float('nan') depending on requirements
    return sum(valid_nums) / len(valid_nums)

data = [1, 2, float('nan'), 4]
avg = calculate_average(data)
print(f"Average: {avg}")  # Will print: Average: 2.3333...

This revised calculate_average function solves the problem by first filtering out any NaN values before performing the calculation.

  • It uses a list comprehension with math.isnan() to create a new list containing only valid numbers.
  • The average is then calculated using this clean list, preventing the NaN from tainting the final result.

This sanitization step is crucial for any data aggregation, as it stops silent errors from spreading through your calculations.
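Filtering hides where the NaN came from, so when you are debugging it can help to locate it first. The helper below, first_nan_index(), is a hypothetical name for a small sketch that reports the position of the first NaN in a sequence.

```python
import math

def first_nan_index(numbers):
    """Hypothetical helper: index of the first NaN, or None if there is none."""
    for i, num in enumerate(numbers):
        if isinstance(num, float) and math.isnan(num):
            return i
    return None

data = [1, 2, float('nan'), 4]
idx = first_nan_index(data)
if idx is not None:
    print(f"First NaN found at index {idx}")  # First NaN found at index 2
```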

Real-world applications

Beyond debugging, correctly handling NaN values is fundamental to practical data science tasks like interpolation and machine learning preprocessing.

Interpolating missing values in time series data

When working with time series data, you can use methods like pandas’ interpolate() to fill in NaN values by estimating them from the surrounding data points.

import pandas as pd
import numpy as np

# Create time series with missing values
ts = pd.Series([10, np.nan, 15, np.nan, 20, 25])
print("Original time series with NaNs:")
print(ts)

# Interpolate missing values
ts_interpolated = ts.interpolate()
print("\nTime series after interpolation:")
print(ts_interpolated)

This code shows how pandas can fill gaps in your data. By default, interpolate() uses linear interpolation and treats the values as equally spaced, so a single missing value is replaced by the midpoint between its neighbors. It's a common technique for cleaning up time series data before analysis.

  • The first NaN is between 10 and 15, so interpolate() replaces it with 12.5.
  • Similarly, the second NaN falls between 15 and 20, becoming 17.5.

This default linear strategy creates a complete dataset from incomplete information, which is essential for tasks like plotting or running statistical models.
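Two edge cases are worth seeing, sketched here with made-up series. With consecutive gaps, the linear default spaces the filled values evenly instead of using a midpoint. A leading NaN has no earlier value to interpolate from, so it stays NaN under the default settings.

```python
import numpy as np
import pandas as pd

# Two consecutive gaps between 10 and 16 are filled in equal steps.
ts = pd.Series([10, np.nan, np.nan, 16])
filled = ts.interpolate()  # default method is 'linear'
print(filled.tolist())     # [10.0, 12.0, 14.0, 16.0]

# A leading NaN is left as-is by default; the interior gap is still filled.
leading = pd.Series([np.nan, 1.0, np.nan, 3.0]).interpolate()
print(leading.tolist())    # [nan, 1.0, 2.0, 3.0]
```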

Handling missing values in machine learning preprocessing with SimpleImputer

In machine learning, you can use scikit-learn's SimpleImputer to systematically replace NaN values with a calculated statistic, like the mean of a column.

import numpy as np
from sklearn.impute import SimpleImputer

# Sample dataset with missing values
X = np.array([[1, 2], [np.nan, 3], [7, 6], [np.nan, 5]])

# Create an imputer to fill missing values with the mean
imputer = SimpleImputer(strategy='mean')
X_imputed = imputer.fit_transform(X)

print("Original data with NaNs:")
print(X)
print("\nImputed data:")
print(X_imputed)

Scikit-learn's SimpleImputer is a key tool for data preprocessing. It's essential because many machine learning algorithms can't work with datasets that have missing values. The imputer offers a clean way to fix this problem automatically.

  • The strategy='mean' setting instructs the imputer to calculate the average of each column from its non-missing values and use that average to fill the column's NaN entries.
  • The fit_transform() method first learns these averages from the data and then fills in the missing spots, all in one step.

This makes your data complete and ready for modeling.
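To see the arithmetic behind mean imputation, here is a plain NumPy sketch on the same sample data. It is an illustration of the idea, not scikit-learn's actual implementation. The names col_means and X_filled are chosen for this example.

```python
import numpy as np

X = np.array([[1, 2], [np.nan, 3], [7, 6], [np.nan, 5]])

# Column means computed over the non-missing entries only.
col_means = np.nanmean(X, axis=0)
print(col_means.tolist())  # [4.0, 4.0]

# Fill each NaN with the mean of its own column (broadcast across rows).
X_filled = np.where(np.isnan(X), col_means, X)
print(X_filled.tolist())
# [[1.0, 2.0], [4.0, 3.0], [7.0, 6.0], [4.0, 5.0]]
```

In a real pipeline, SimpleImputer has the advantage of remembering the means learned during fit so the same values can be applied to new data with transform().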

Get started with Replit

Turn your knowledge into a tool. Prompt Replit Agent with: “Build a CSV cleaner that flags rows with NaN values” or “Create a safe calculator that returns NaN for invalid math operations.”

Replit Agent will write the code, test for errors, and deploy your application. You just provide the instructions. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
