# IsNaN

## Overview

The IsNaN expression checks whether a floating-point value is NaN (Not a Number). It returns `true` if the input value is NaN and `false` otherwise, including for null inputs.
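Because NaN never compares equal to anything, including itself, a plain equality check cannot detect it. A small plain-Scala sketch of why a dedicated predicate is needed:

```scala
object NaNCheckDemo {
  def main(args: Array[String]): Unit = {
    val x = 0.0 / 0.0                    // evaluates to Double.NaN
    println(x == x)                      // false: NaN is not even equal to itself
    println(java.lang.Double.isNaN(x))   // true: the kind of check IsNaN relies on
  }
}
```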
## Syntax
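As a sketch (assuming the standard Spark entry points), the check is exposed as the `isnan(expr)` SQL function and the `org.apache.spark.sql.functions.isnan` Column function, each taking a single floating-point argument:

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, isnan}

// SQL form:           isnan(expr)
// DataFrame API form: isnan(e: Column): Column
val flagged: Column = isnan(col("double_col"))   // "double_col" is a hypothetical column name
```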
## Arguments
| Argument | Type | Description |
|---|---|---|
| child | Expression | The floating-point expression to check for NaN |
## Return Type

`BooleanType` - Returns a boolean value indicating whether the input is NaN.
## Supported Data Types

- `DoubleType`
- `FloatType`
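A small sketch applying the check to one `DoubleType` and one `FloatType` column (the local session and column names are assumptions for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, isnan}

val spark = SparkSession.builder().master("local[*]").appName("isnan-types").getOrCreate()
import spark.implicits._

// One DoubleType column ("d") and one FloatType column ("f")
val df = Seq((Double.NaN, 1.5f), (2.0, Float.NaN)).toDF("d", "f")

df.select(isnan(col("d")).as("d_is_nan"), isnan(col("f")).as("f_is_nan")).show()
// Row 1: d_is_nan = true,  f_is_nan = false
// Row 2: d_is_nan = false, f_is_nan = true
```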
## Algorithm

- Evaluates the child expression to get the input value
- Returns `false` immediately if the input value is null
- For non-null values, calls the appropriate `isNaN()` method based on the data type
- Uses `Double.isNaN()` for double values and `Float.isNaN()` for float values
- Always returns a non-null boolean result, as sketched below
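A minimal sketch of that evaluation logic in plain Scala (not the actual Catalyst implementation; the helper name is hypothetical):

```scala
// Null yields false; non-null values are delegated to the JDK isNaN checks.
def isNaNValue(value: Any): Boolean = value match {
  case null      => false                       // null input never produces a null result
  case d: Double => java.lang.Double.isNaN(d)   // DoubleType path
  case f: Float  => java.lang.Float.isNaN(f)    // FloatType path
  case _         => false                       // other types are rejected by analysis-time type checks
}
```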
## Partitioning Behavior
This expression preserves partitioning since it operates on individual rows without requiring data movement:
- Does not require shuffle operations
- Maintains existing data partitioning
- Can be pushed down to individual partitions (see the example below)
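For instance, the NaN predicate is applied row by row within each partition, so a filter built on it should introduce no shuffle (Exchange) in the physical plan (a sketch; the local session setup is an assumption):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, isnan}

val spark = SparkSession.builder().master("local[*]").appName("isnan-partitioning").getOrCreate()
import spark.implicits._

val prices = Seq(1.0, Double.NaN, 3.5).toDF("price")

// The NaN check is evaluated per row within each partition,
// so this filter adds no Exchange (shuffle) to the physical plan.
prices.filter(!isnan(col("price"))).explain()
```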
## Edge Cases

- Null handling: Returns `false` for null inputs (the expression is never nullable)
- NaN values: Returns `true` for both positive and negative NaN values
- Infinity values: Returns `false` for positive and negative infinity
- Regular numbers: Returns `false` for all finite floating-point numbers, including zero (see the demonstration below)
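A quick demonstration of these cases with literal values (a minimal sketch; the local session is assumed):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{isnan, lit}

val spark = SparkSession.builder().master("local[*]").appName("isnan-edge-cases").getOrCreate()

spark.range(1).select(
  isnan(lit(Double.NaN)).as("nan"),                    // true
  isnan(lit(Double.PositiveInfinity)).as("pos_inf"),   // false
  isnan(lit(Double.NegativeInfinity)).as("neg_inf"),   // false
  isnan(lit(0.0)).as("zero"),                          // false
  isnan(lit(null).cast("double")).as("null_input")     // false, never null
).show()
```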
## Code Generation

This expression supports Tungsten code generation for both DoubleType and FloatType inputs. It generates compact Java code that calls `Double.isNaN()` or `Float.isNaN()` directly, depending on the input type, avoiding the overhead of interpreted evaluation.
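To inspect the whole-stage generated Java for a query that uses the check, the `debugCodegen()` helper from `org.apache.spark.sql.execution.debug` can be used (a sketch; the printed code should contain the inlined null check and `isNaN` call):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.debug._   // adds debugCodegen() to Dataset
import org.apache.spark.sql.functions.{col, isnan}

val spark = SparkSession.builder().master("local[*]").appName("isnan-codegen").getOrCreate()
import spark.implicits._

// Print the generated Java for a simple projection using isnan.
Seq(1.0, Double.NaN).toDF("v").select(isnan(col("v"))).debugCodegen()
```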
## Examples
```sql
-- Check for NaN values in a double column
SELECT ISNAN(CAST('NaN' AS DOUBLE));
-- Result: true

-- Check regular numeric value
SELECT ISNAN(42.5);
-- Result: false

-- Check null value
SELECT ISNAN(CAST(NULL AS DOUBLE));
-- Result: false

-- Check infinity
SELECT ISNAN(CAST('Infinity' AS DOUBLE));
-- Result: false
```
```scala
// DataFrame API usage
import org.apache.spark.sql.functions._

// Check for NaN values
df.select(isnan(col("double_col")))

// Filter out NaN values
df.filter(!isnan(col("price")))

// Count NaN values in a column
df.select(sum(when(isnan(col("value")), 1).otherwise(0)))
```
## See Also

- `IsNull` - Check for null values
- `IsNotNull` - Check for non-null values
- `Coalesce` - Handle null values with defaults
- `NaNvl` - Replace NaN values with alternative values