# TryDivide

## Overview
TryDivide is a Spark Catalyst expression that performs division with error handling, returning NULL instead of throwing an exception when a division error occurs (such as division by zero). It is a runtime replaceable expression that delegates to an appropriate division implementation based on the input data types.
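For instance, with ANSI SQL mode enabled a plain division by zero fails the query, while `try_divide` degrades to `NULL`. A minimal sketch, assuming a local `SparkSession`:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("try-divide-overview").getOrCreate()

// With ANSI semantics enabled, the plain `/` operator raises a DIVIDE_BY_ZERO error...
spark.conf.set("spark.sql.ansi.enabled", "true")
// spark.sql("SELECT 10 / 0").show()                       // would fail the query

// ...whereas try_divide returns NULL instead of throwing.
spark.sql("SELECT try_divide(10, 0) AS quotient").show()   // quotient is NULL
```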
## Syntax
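The SQL function takes a dividend and a divisor: `try_divide(dividend, divisor)`. At the Catalyst level the expression is constructed from those two operands as `TryDivide(left, right)`; the `replacement` argument listed below is derived internally from the operand types rather than supplied by the caller.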
## Arguments
| Argument | Type | Description |
|---|---|---|
| left (dividend) | Expression | The numerator expression to be divided |
| right (divisor) | Expression | The denominator expression to divide by |
| replacement | Expression | Internal replacement expression that performs the actual evaluation; derived from the operand types rather than supplied by the caller |
## Return Type

Returns the result type of the underlying division for the given input types (DoubleType for integral and floating-point inputs, DecimalType for decimal inputs), or NULL when a division error occurs.

## Supported Data Types
- Numeric Types: All Spark numeric types (ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType, DecimalType)
- Interval Types: year-month and day-time interval dividends (with a numeric divisor) are supported through the fallback path with the `TryEval` wrapper
- Mixed numeric type combinations follow standard Spark type promotion rules (illustrated in the sketch after this list)
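As a quick illustration of the resulting types (a minimal sketch assuming a local `SparkSession`; the schema comments are abbreviated), integral operands are promoted so the quotient is a `double`, while decimal operands keep decimal precision/scale semantics:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("try-divide-types").getOrCreate()

// Integral inputs are promoted before the division, so the result type is DOUBLE.
spark.sql("SELECT try_divide(7, 2) AS q").printSchema()
// |-- q: double (nullable = true)

// Decimal inputs follow Spark's decimal precision/scale rules for division.
spark.sql("SELECT try_divide(CAST(7 AS DECIMAL(10, 2)), CAST(2 AS DECIMAL(10, 2))) AS q").printSchema()
// |-- q: decimal(..., ...) (nullable = true)
```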
## Algorithm
- For numeric inputs: delegates to `Divide(left, right, EvalMode.TRY)` for optimized execution
- For non-numeric inputs (e.g. interval operands): uses `TryEval(Divide(left, right, EvalMode.ANSI))` as a fallback
- TRY evaluation mode captures division exceptions and returns `NULL` instead of failing
- Inherits analysis rules from the underlying division expression
- Acts as a runtime replaceable expression, substituting itself with the appropriate implementation during query planning (see the sketch after this list)
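A simplified sketch of the replacement selection described above (these are internal Catalyst APIs; package paths and signatures may vary across Spark versions, so treat this as illustrative rather than the exact implementation):

```scala
import org.apache.spark.sql.catalyst.expressions.{Divide, EvalMode, Expression, TryEval}
import org.apache.spark.sql.types.NumericType

// Hypothetical helper mirroring how the replacement expression is chosen.
def chooseReplacement(left: Expression, right: Expression): Expression =
  (left.dataType, right.dataType) match {
    // numeric / numeric: optimized Divide running in TRY evaluation mode
    case (_: NumericType, _: NumericType) => Divide(left, right, EvalMode.TRY)
    // anything else (e.g. an interval dividend): ANSI Divide wrapped in TryEval,
    // which converts runtime failures into NULL
    case _ => TryEval(Divide(left, right, EvalMode.ANSI))
  }
```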
## Partitioning Behavior
- Preserves Partitioning: Yes, as it operates row-by-row without requiring data redistribution
- Requires Shuffle: No, the expression is evaluated locally on each partition
- Does not affect the partitioning scheme of the input data (see the sketch below)
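One way to check this is to look at the physical plan (a sketch assuming an existing DataFrame `df` with numeric columns `a` and `b`): the projection should appear without any `Exchange` (shuffle) operator.

```scala
// A row-wise projection: the printed plan should contain no Exchange node,
// so the input partitioning is left untouched.
df.selectExpr("try_divide(a, b) AS quotient").explain()
```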
## Edge Cases
- Null Handling: returns `NULL` when either the dividend or the divisor is `NULL`
- Division by Zero: returns `NULL` instead of throwing an `ArithmeticException` (exercised, together with NULL operands, in the sketch after this list)
- Overflow/Underflow: for numeric types, overflow conditions return `NULL` in TRY mode
- Type Compatibility: non-numeric (interval) operands fall back to the `TryEval` wrapper for error handling
- Decimal Precision: follows Spark's decimal arithmetic rules with appropriate precision and scale
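The NULL-returning behavior can be exercised end to end with a small DataFrame (a minimal sketch assuming a local `SparkSession`):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder().master("local[*]").appName("try-divide-edge-cases").getOrCreate()
import spark.implicits._

// Dividend/divisor pairs covering a zero divisor and NULL operands.
val pairs: Seq[(Option[Int], Option[Int])] = Seq(
  (Some(10), Some(2)),  // normal case      -> 5.0
  (Some(10), Some(0)),  // division by zero -> NULL
  (None,     Some(5)),  // NULL dividend    -> NULL
  (Some(10), None)      // NULL divisor     -> NULL
)

pairs.toDF("dividend", "divisor")
  .withColumn("quotient", expr("try_divide(dividend, divisor)"))
  .show()
```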
## Code Generation
- Numeric Types: supports Tungsten code generation through the optimized `Divide` expression with `EvalMode.TRY`
- Non-Numeric Types: falls back to interpreted mode via the `TryEval` wrapper
- Code generation eligibility depends on the underlying replacement expression's capabilities (see the sketch below)
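To see which path was taken for a given plan, the generated code can be inspected with Spark's debug helpers (a sketch assuming an existing DataFrame `df` with numeric columns `a` and `b`):

```scala
import org.apache.spark.sql.execution.debug._

// Dumps the whole-stage-generated Java code for this projection; subtrees that
// are not codegen-eligible fall back to interpreted evaluation.
df.selectExpr("try_divide(a, b) AS quotient").debugCodegen()
```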
## Examples

```sql
-- Basic division with error handling
SELECT try_divide(10, 2) as result;    -- Returns 5.0
SELECT try_divide(10, 0) as result;    -- Returns NULL

-- Handling NULL inputs
SELECT try_divide(NULL, 5) as result;  -- Returns NULL
SELECT try_divide(10, NULL) as result; -- Returns NULL

-- Mixed numeric types
SELECT try_divide(100, 3.0) as result; -- Returns 33.333...
```
```scala
// DataFrame API usage
import org.apache.spark.sql.functions._

df.select(expr("try_divide(revenue, quantity)").alias("unit_price"))

// Handling potential division by zero in aggregations
df.groupBy("category")
  .agg(expr("try_divide(sum(sales), sum(units))").alias("avg_unit_price"))
```
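Recent Spark releases also expose the function directly on the Scala DataFrame API; this assumes `org.apache.spark.sql.functions.try_divide` is available (added around Spark 3.5, so check your version), which avoids embedding a SQL string:

```scala
import org.apache.spark.sql.functions.{col, try_divide}

// Equivalent to expr("try_divide(revenue, quantity)") on newer Spark versions.
df.select(try_divide(col("revenue"), col("quantity")).alias("unit_price"))
```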
## See Also

- `Divide` - standard division expression that throws exceptions on errors
- `TryEval` - generic wrapper for converting exceptions to `NULL` values
- `SafeDivide` - alternative safe division implementations
- Math Functions: other mathematical expressions with TRY evaluation modes