IsNull¶
Overview¶
The IsNull expression is a predicate function that tests whether a given expression evaluates to null. It returns true if the input expression is null, and false otherwise. This is a fundamental null-checking operation in Spark SQL.
Syntax¶
Arguments¶
| Argument | Type | Description |
|---|---|---|
| child | Expression | The expression to test for null value |
Return Type¶
Boolean - returns true if the input expression is null, false otherwise.
Supported Data Types¶
All data types are supported since null checking is a universal operation that can be applied to any expression regardless of its data type:
- Numeric types (IntegerType, LongType, DoubleType, DecimalType, etc.)
- String types (StringType, BinaryType)
- Date and time types (DateType, TimestampType)
- Complex types (ArrayType, MapType, StructType)
- Boolean type
Algorithm¶
The evaluation follows these steps:
- Evaluate the child expression against the input row
- Compare the result directly with null using
==operator - Return true if the evaluation result equals null, false otherwise
- The result itself is never null (nullable = false)
- Uses context-independent foldability from the child expression for optimization
Partitioning Behavior¶
This expression preserves partitioning characteristics:
- Does not require shuffle operations
- Can be pushed down to individual partitions
- Maintains existing data distribution since it's a row-level predicate
- Does not affect partition boundaries or require data movement
Edge Cases¶
- The expression itself never returns null (nullable = false)
- If the child expression cannot be evaluated due to errors, the error propagates up
- Works correctly with complex nested data types where null checking applies to the outer structure
- Handles all Spark internal data types consistently through the InternalRow interface
Code Generation¶
This expression supports Tungsten code generation for optimal performance:
- Implements
doGenCodemethod for wholestage code generation - Generated code directly uses the child expression's
isNullflag as the result value - Avoids interpreted evaluation overhead in tight loops
- Sets
isNull = FalseLiteralsince the result is never null
Examples¶
-- Basic null checking
SELECT name IS NULL FROM users;
-- In WHERE clause
SELECT * FROM products WHERE price IS NULL;
-- Example from documentation
SELECT IsNull(1);
-- Returns: false
// DataFrame API usage
import org.apache.spark.sql.functions._
df.filter(col("name").isNull)
df.select(col("price").isNull.as("price_is_null"))
See Also¶
IsNotNull- Tests whether an expression is not nullCoalesce- Returns the first non-null expressionNullIf- Returns null if two expressions are equalIfNull- Returns alternative value if expression is null