How to compare NaN in Python
Learn to compare NaN in Python. This guide covers methods, tips, real-world uses, and debugging common errors.

In Python, NaN (Not a Number) values present unique comparison challenges. Standard equality checks with == fail, so you need special functions to handle them correctly in data analysis.
Here, you'll learn techniques to compare NaN values, with real-world applications and debugging advice. These tips help you manage data accurately and avoid common pitfalls in your projects.
Using math.isnan() to check for NaN values
import math
x = float('nan')
is_nan = math.isnan(x)
print(f"Is x a NaN? {is_nan}")
print(f"x == x is {x == x}")  # NaN is never equal to itself

Output:
Is x a NaN? True
x == x is False
The math.isnan() function is the standard library's answer to the unique behavior of NaN. Because a NaN value is never equal to itself, you can't use the standard equality operator == for checks. The code confirms this by showing x == x returns False, which is a core principle of the IEEE 754 floating-point standard.
Using math.isnan() provides a reliable way to detect these values. It's designed specifically to handle this exception, ensuring your data validation and cleaning processes work as expected without silent failures from faulty comparisons.
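To make that concrete, here is a minimal sketch of a cleaning helper built on math.isnan(); the clean_readings name is invented for this example:

```python
import math

def clean_readings(readings):
    """Split a list of floats into the valid values and a count of NaNs."""
    valid = [r for r in readings if not math.isnan(r)]
    dropped = len(readings) - len(valid)
    return valid, dropped

values, nan_count = clean_readings([21.5, float('nan'), 19.8, float('nan')])
print(values)     # [21.5, 19.8]
print(nan_count)  # 2
```

Because the filter uses math.isnan() rather than equality, the NaN entries are caught reliably instead of slipping through a faulty == check.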
Standard library approaches for NaN comparison
While math.isnan() is the standard, you can also handle NaN comparisons by using operators like ==, leveraging the cmath module, or building a custom function.
Using == and != operators with NaN
import math
nan1 = float('nan')
nan2 = float('nan')
print(f"nan1 == nan2: {nan1 == nan2}")
print(f"nan1 != nan2: {nan1 != nan2}")
print(f"nan1 is nan2: {nan1 is nan2}")

Output:
nan1 == nan2: False
nan1 != nan2: True
nan1 is nan2: False
The equality operator == always returns False when comparing two NaN values, which is why nan1 == nan2 evaluates to False. Consequently, the inequality operator != returns True. This behavior is a core feature of how floating-point numbers are defined.
- You can use the fact that x != x is only true when x is NaN as a clever way to detect it.
- The identity operator is also returns False because nan1 and nan2 are distinct objects in memory, even if they represent the same concept.
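The self-inequality trick above can be packaged into a tiny helper that needs no imports at all; a sketch:

```python
def is_nan(x):
    """Detect NaN using the fact that NaN is the only float unequal to itself."""
    return x != x

print(is_nan(float('nan')))  # True
print(is_nan(3.14))          # False
```

This works on any IEEE 754 float, though math.isnan() communicates the intent more clearly to readers of your code.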
Using the cmath module for complex NaN detection
import cmath
import math
complex_nan = complex(float('nan'), 0)
print(f"Is real part NaN? {math.isnan(complex_nan.real)}")
print(f"Is complex number NaN? {cmath.isnan(complex_nan)}")

Output:
Is real part NaN? True
Is complex number NaN? True
The cmath module is your tool for handling NaN in complex numbers. You can't use math.isnan() on a complex number object directly; it only works on its individual floating-point components.
- Use math.isnan() to check the .real or .imag parts separately.
- Use cmath.isnan() to check the entire complex number at once. It returns True if either part is NaN.
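To see the "either part" rule in action, here is a short check with the NaN placed in the imaginary component instead:

```python
import cmath
import math

z = complex(1.0, float('nan'))  # real part valid, imaginary part NaN
print(math.isnan(z.real))  # False
print(math.isnan(z.imag))  # True
print(cmath.isnan(z))      # True, because one component is NaN
```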
Creating a custom nan_equal() function
import math

def nan_equal(a, b):
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b
x, y = float('nan'), float('nan')
print(f"Regular equality: {x == y}")
print(f"Custom equality: {nan_equal(x, y)}")

Output:
Regular equality: False
Custom equality: True
Sometimes you need to treat two NaN values as equal, which standard operators don't allow. A custom function like nan_equal() gives you control over this logic. It's useful when you want to consider all NaNs as representing the same "missing data" state in your analysis.
- The function first checks if both arguments are NaN using math.isnan(). If they are, it returns True, overriding the default behavior.
- If one or both values are not NaN, it falls back to a standard a == b comparison.
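Building on nan_equal(), here is a hedged sketch of comparing two lists element by element while treating NaNs as matching; the lists_equal name is invented for this example:

```python
import math

def nan_equal(a, b):
    """Treat two NaNs as equal; otherwise use standard equality."""
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b

def lists_equal(xs, ys):
    """Compare equal-length lists, counting NaN == NaN as a match."""
    return len(xs) == len(ys) and all(nan_equal(x, y) for x, y in zip(xs, ys))

a = [1.0, float('nan'), 3.0]
b = [1.0, float('nan'), 3.0]
print(a == b)             # False: the distinct NaN objects break list equality
print(lists_equal(a, b))  # True
```

This pattern is handy in tests that compare numeric results containing missing values.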
Advanced techniques for NaN handling
While standard library functions are great for single values, data science libraries like NumPy and pandas offer more powerful, vectorized tools for handling NaN in arrays.
Using NumPy's np.isnan() for array operations
import numpy as np
arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
nan_mask = np.isnan(arr)
print(f"Array: {arr}")
print(f"NaN mask: {nan_mask}")
print(f"Non-NaN values: {arr[~nan_mask]}")

Output:
Array: [ 1. nan 3. nan 5.]
NaN mask: [False True False True False]
Non-NaN values: [1. 3. 5.]
NumPy's np.isnan() function is built for speed and efficiency with arrays. Instead of checking elements one by one, it operates on the entire array at once, producing a boolean mask that flags the position of each NaN value.
- This mask, [False, True, False, True, False] in this case, pinpoints exactly where the NaNs are.
- You can then use this mask to easily filter your data. By inverting it with the ~ operator, you can select only the valid, non-NaN numbers from your array.
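The same masking idea underlies NumPy's built-in nan-aware reductions, which skip NaNs for you; a brief sketch, assuming NumPy is installed:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

# Plain reductions propagate NaN; nan-aware ones ignore it
print(arr.mean())       # nan
print(np.nanmean(arr))  # 3.0
print(np.nansum(arr))   # 9.0
```

Reaching for np.nanmean() or np.nansum() is often simpler than building and applying a mask by hand.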
Working with NaN in pandas dataframes
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [np.nan, 5, 6]})
print(df)
print("\nRows with any NaN values:")
print(df[df.isna().any(axis=1)])

Output:
     A    B
0  1.0  NaN
1  NaN  5.0
2  3.0  6.0

Rows with any NaN values:
     A    B
0  1.0  NaN
1  NaN  5.0
Pandas is your go-to for handling NaN in tabular data like DataFrames. The isna() method works like NumPy's isnan() but on the entire structure, creating a boolean mask that flags missing values.
- You can chain .any(axis=1) to this mask. This checks each row for at least one True value, identifying all rows that contain a NaN.
- This boolean result is then used to filter the original DataFrame, giving you a clean way to isolate and inspect incomplete records.
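Once the incomplete rows are identified, pandas also offers one-liners to act on them; a small sketch, assuming pandas is installed:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [np.nan, 5, 6]})

dropped = df.dropna()  # keep only fully populated rows
filled = df.fillna(0)  # replace every NaN with a default value

print(dropped)  # only row 2 survives
print(filled)
```

Whether to drop or fill depends on your analysis: dropping loses information, while filling can bias statistics if the default is unrepresentative.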
Using pd.isna() for vectorized NaN checking
import pandas as pd
import numpy as np
values = [1, np.nan, "string", None, pd.NA]
result = pd.isna(values)
for val, is_na in zip(values, result):
    print(f"{val!r} is NaN/NA: {is_na}")

Output:
1 is NaN/NA: False
nan is NaN/NA: True
'string' is NaN/NA: False
None is NaN/NA: True
<NA> is NaN/NA: True
The pd.isna() function is a versatile tool for detecting missing values across mixed data types. It's a powerful, vectorized check that goes beyond just floating-point NaNs, giving you a single function to handle various "not available" markers common in data analysis.
- It correctly identifies not only np.nan but also Python's built-in None object.
- It also recognizes the experimental pd.NA value, which is pandas' modern approach to handling missing data consistently across different types.
Move faster with Replit
Replit is an AI-powered development platform with all Python dependencies pre-installed, so you can skip setup and start coding instantly. This lets you move from learning individual techniques, like handling NaN values, to building complete applications faster.
Instead of piecing together functions, you can use Agent 4 to build a working product directly from a description. It handles writing the code, connecting to databases, and managing deployment. You can describe the app you want to build, and Agent will take it from there.
- A data cleaning utility that scans CSV files, identifies rows with NaN or None values using pd.isna(), and exports a sanitized version.
- A financial data dashboard that pulls stock data, flags missing data points in the time series, and visualizes only the complete data.
- A data validation tool that checks numerical inputs, using math.isnan() to reject invalid entries before they are saved to a database.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
Even with the right tools, NaN values can cause subtle bugs in conditionals, calculations, and data pipelines if you're not careful.
Troubleshooting NaN comparisons in conditionals
A frequent mistake is using the equality operator == to check for NaN inside an if statement. Because a NaN value is never equal to itself, a condition like if my_var == float('nan'): will always fail, even if my_var is NaN. This can lead to silent bugs where your error-handling logic is never triggered.
- The Fix: Always use math.isnan(my_var) inside conditionals. It's the only reliable way to test if a single floating-point value is NaN.
- The Takeaway: Don't rely on direct comparison for NaN. It breaks the standard rules of equality you'd expect from other values.
Avoiding division by zero errors with NaN handling
In standard Python, dividing by zero raises a ZeroDivisionError. However, in libraries like NumPy, dividing a float by zero produces inf (infinity), and 0.0 / 0.0 results in NaN. If you're not prepared for this, unexpected NaN values can appear in your data after a calculation and corrupt downstream results.
A good practice is to check for NaN immediately after any vectorized division. This allows you to handle the results gracefully—for example, by replacing the NaN with a zero or another default value—before it affects further analysis.
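One way to sketch that post-division cleanup with NumPy (assuming NumPy is installed) is to silence the floating-point warnings during the divide, then replace the resulting NaN and inf values in one pass:

```python
import numpy as np

a = np.array([1.0, 0.0, -2.0])
b = np.array([0.0, 0.0, 4.0])

# Suppress divide-by-zero and invalid-operation warnings for this block
with np.errstate(divide='ignore', invalid='ignore'):
    result = a / b  # 1/0 -> inf, 0/0 -> nan, -2/4 -> -0.5

# Replace NaN and infinities with a chosen default before further analysis
cleaned = np.nan_to_num(result, nan=0.0, posinf=0.0, neginf=0.0)
print(cleaned)  # [ 0.   0.  -0.5]
```

The defaults passed to np.nan_to_num() are a judgment call; zero is shown here, but a sentinel or column mean may suit your data better.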
Debugging issues with NaN propagation in calculations
NaN values are contagious. Any arithmetic operation involving a NaN will produce another NaN. For instance, 5 + np.nan results in NaN. This is called propagation, and it can make debugging a nightmare because the original source of the NaN might be many steps removed from where you finally notice it.
When you find a NaN in your final output, you have to trace it backward. Using functions like pd.isna().sum() can help you pinpoint which columns in a DataFrame are accumulating missing values, giving you a starting point for your investigation.
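A minimal sketch of that tracing step, assuming pandas is installed and using invented column names:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'price': [10.0, np.nan, 12.5, np.nan],
    'volume': [100, 200, np.nan, 400],
    'ticker': ['A', 'B', 'C', 'D'],
})

# Count missing values per column to see where NaNs accumulate
print(df.isna().sum())
```

A column with an unexpectedly high count points you to the upstream calculation or data source that introduced the NaNs.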
Troubleshooting NaN comparisons in conditionals
A frequent mistake is using the equality operator == to check for NaN inside a conditional. Since NaN is never equal to itself, this check always fails, leading to unexpected outcomes. The code below shows exactly why this logic is flawed.
import math
value = float('nan')
if value == float('nan'):  # This will never be true
    print("Value is NaN")
else:
    print("Value is not NaN")  # This will always execute
The else block runs because the == operator always returns False when comparing NaN values, making the conditional check unreliable. The correct way to test for NaN is demonstrated in the next snippet.
import math
value = float('nan')
if math.isnan(value):  # Proper way to check for NaN
    print("Value is NaN")
else:
    print("Value is not NaN")
This code demonstrates the correct fix. The math.isnan() function is specifically designed to return True for NaN values, making your conditional checks reliable. This approach ensures the if block executes correctly, unlike comparisons with the == operator.
- Keep an eye out for this issue during data validation or when handling results from calculations, as math.isnan() is the only way to reliably catch NaN values.
Avoiding division by zero errors with NaN handling
In standard Python, dividing by zero triggers a ZeroDivisionError, which can crash your program if it isn't handled. This differs from libraries like NumPy, where such operations might result in inf or NaN. The following code demonstrates this common runtime error.
def safe_divide(a, b):
    return a / b  # Will raise ZeroDivisionError if b is 0

result = safe_divide(10, 0)  # Raises ZeroDivisionError
The safe_divide function attempts to compute 10 / 0 directly. This operation is invalid in standard Python, triggering a ZeroDivisionError that halts the program. The following code demonstrates how to handle this gracefully.
import math

def safe_divide(a, b):
    if b == 0:
        return float('nan')  # Return NaN instead of raising an error
    return a / b

result = safe_divide(10, 0)
print(f"Result: {result}, is NaN: {math.isnan(result)}")
This safe_divide function sidesteps a ZeroDivisionError by checking if the divisor is 0. Instead of crashing, it returns float('nan') to represent the undefined result. This strategy is crucial in data processing, as it allows calculations to proceed without interruption. You can then handle the resulting NaN values later in your workflow, preventing a single invalid operation from halting your entire script.
Debugging issues with NaN propagation in calculations
Because NaN values are contagious, they can silently corrupt your calculations. A single NaN in a dataset can spread through your functions, making the final output unusable and hard to debug. The following code shows how this happens in a simple averaging function.
import math

def calculate_average(numbers):
    total = 0
    for num in numbers:
        total += num  # NaN will propagate silently
    return total / len(numbers)

data = [1, 2, float('nan'), 4]
avg = calculate_average(data)
print(f"Average: {avg}")  # Will print: Average: nan
The total += num operation turns the entire sum into NaN the moment it encounters the NaN value, corrupting the final average. The following code shows how to modify the function to handle this case correctly.
import math

def calculate_average(numbers):
    valid_nums = [num for num in numbers if not math.isnan(num)]
    if not valid_nums:
        return 0  # or float('nan') depending on requirements
    return sum(valid_nums) / len(valid_nums)

data = [1, 2, float('nan'), 4]
avg = calculate_average(data)
print(f"Average: {avg}")  # Will print: Average: 2.3333...
This revised calculate_average function solves the problem by first filtering out any NaN values before performing the calculation.
- It uses a list comprehension with math.isnan() to create a new list containing only valid numbers.
- The average is then calculated using this clean list, preventing the NaN from tainting the final result.
This sanitization step is crucial for any data aggregation, as it stops silent errors from spreading through your calculations.
Real-world applications
Beyond debugging, correctly handling NaN values is fundamental to practical data science tasks like interpolation and machine learning preprocessing.
Interpolating missing values in time series data
When working with time series data, you can use methods like pandas’ interpolate() to fill in NaN values by estimating them from the surrounding data points.
import pandas as pd
import numpy as np
# Create time series with missing values
ts = pd.Series([10, np.nan, 15, np.nan, 20, 25])
print("Original time series with NaNs:")
print(ts)
# Interpolate missing values
ts_interpolated = ts.interpolate()
print("\nTime series after interpolation:")
print(ts_interpolated)
This code shows how pandas can intelligently fill gaps in your data. By default, interpolate() replaces each NaN with a linear estimate drawn between the surrounding known values; for a single missing point, that works out to the midpoint of its neighbors. It's a common technique for cleaning up time series data before analysis.
- The first NaN is between 10 and 15, so interpolate() replaces it with 12.5.
- Similarly, the second NaN falls between 15 and 20, becoming 17.5.
This default linear strategy creates a complete dataset from incomplete information, which is essential for tasks like plotting or running statistical models.
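Linear interpolation is only the default. For data where estimating between points isn't appropriate, two common alternatives are forward-fill and back-fill; a short sketch, assuming pandas is installed:

```python
import pandas as pd
import numpy as np

ts = pd.Series([10, np.nan, 15, np.nan, 20, 25])

# Forward-fill repeats the last known value instead of estimating
print(ts.ffill().tolist())  # [10.0, 10.0, 15.0, 15.0, 20.0, 25.0]

# Back-fill pulls the next known value backward into the gap
print(ts.bfill().tolist())  # [10.0, 15.0, 15.0, 20.0, 20.0, 25.0]
```

Forward-fill is often preferred for time series where a reading stays valid until the next observation arrives.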
Handling missing values in machine learning preprocessing with SimpleImputer
In machine learning, you can use scikit-learn's SimpleImputer to systematically replace NaN values with a calculated statistic, like the mean of a column.
import numpy as np
from sklearn.impute import SimpleImputer
# Sample dataset with missing values
X = np.array([[1, 2], [np.nan, 3], [7, 6], [np.nan, 5]])
# Create an imputer to fill missing values with the mean
imputer = SimpleImputer(strategy='mean')
X_imputed = imputer.fit_transform(X)
print("Original data with NaNs:")
print(X)
print("\nImputed data:")
print(X_imputed)
Scikit-learn's SimpleImputer is a key tool for data preprocessing. It's essential because many machine learning algorithms can't work with datasets that have missing values. The imputer offers a clean way to fix this problem automatically.
- The strategy='mean' setting instructs the imputer to calculate the average for each column that contains a NaN.
- The fit_transform() method first learns these averages from the data and then fills in the missing spots, all in one step.
This makes your data complete and ready for modeling.
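If scikit-learn isn't available, the same column-mean imputation can be sketched with pandas alone; this is a rough equivalent for illustration, not SimpleImputer itself:

```python
import pandas as pd
import numpy as np

X = pd.DataFrame({'a': [1.0, np.nan, 7.0, np.nan], 'b': [2.0, 3.0, 6.0, 5.0]})

# Fill each column's NaNs with that column's mean, mirroring strategy='mean'
X_imputed = X.fillna(X.mean())
print(X_imputed)  # column 'a' has mean (1 + 7) / 2 = 4.0, so both NaNs become 4.0
```

Note that SimpleImputer additionally remembers the learned means, so it can apply the same fill to unseen test data, which this one-liner does not.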
Get started with Replit
Turn your knowledge into a tool. Prompt Replit Agent with: “Build a CSV cleaner that flags rows with NaN values” or “Create a safe calculator that returns NaN for invalid math operations.”
Replit Agent will write the code, test for errors, and deploy your application. You just provide the instructions. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.