DecimalDivideWithOverflowCheck
Overview
A specialized decimal division expression that provides configurable overflow handling for decimal arithmetic operations. This expression is primarily used internally by Spark SQL for aggregate functions like avg() where precise control over overflow behavior is required, with the ability to either return null on overflow or throw an exception with detailed error context.
Syntax
This is an internal Spark Catalyst expression and is not directly exposed in SQL syntax. It is automatically generated by Spark's query planner for decimal division operations in aggregate contexts.
Arguments
| Argument | Type | Description |
|---|---|---|
| left | Expression | The dividend expression (must evaluate to DecimalType) |
| right | Expression | The divisor expression (must evaluate to DecimalType) |
| dataType | DecimalType | The target decimal type with specific precision and scale |
| context | QueryContext | Query execution context for error reporting |
| nullOnOverflow | Boolean | Controls overflow behavior - true returns null, false throws exception |
Return Type
Returns a DecimalType with the precision and scale specified in the dataType parameter. The result can be null if nullOnOverflow is true and an overflow occurs.
Supported Data Types
- Input Types: Both operands must be `DecimalType`
- Output Type: `DecimalType` with configurable precision and scale
Algorithm
- Evaluates the left operand (dividend) first and checks for null values
- If left operand is null and `nullOnOverflow` is false, throws an overflow exception with "try_avg" suggestion
- If left operand is null and `nullOnOverflow` is true, returns null
- Evaluates the right operand (divisor) when left operand is non-null
- Performs decimal division using Spark's internal fractional arithmetic
- Applies precision and scale conversion with `ROUND_HALF_UP` rounding mode
- Returns null if the final precision conversion overflows and `nullOnOverflow` is true
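The steps above can be approximated outside Spark with plain `java.math.BigDecimal`. The following is a minimal sketch, not Spark's implementation: `DivideSketch`, `DecimalOverflowException`, the use of `Option` to model SQL null, and the choice of `MathContext.DECIMAL128` for the intermediate quotient are all assumptions made for illustration.

```scala
import java.math.{BigDecimal => JBigDecimal, MathContext, RoundingMode}

// Hypothetical stand-in for the overflow error Spark would raise.
final class DecimalOverflowException(msg: String) extends ArithmeticException(msg)

object DivideSketch {
  // Round to `scale` with HALF_UP, then check that the result fits in `precision` digits.
  def toPrecision(v: JBigDecimal, precision: Int, scale: Int,
                  nullOnOverflow: Boolean): Option[JBigDecimal] = {
    val rounded = v.setScale(scale, RoundingMode.HALF_UP)
    if (rounded.precision() <= precision) Some(rounded)
    else if (nullOnOverflow) None
    else throw new DecimalOverflowException(
      s"$v cannot be represented as Decimal($precision, $scale)")
  }

  // `left` is None when the dividend is null; `right` is by-name so it is
  // only evaluated once the dividend is known to be non-null.
  def divide(left: Option[JBigDecimal], right: => JBigDecimal,
             precision: Int, scale: Int,
             nullOnOverflow: Boolean): Option[JBigDecimal] =
    left match {
      case None if nullOnOverflow => None
      case None =>
        throw new DecimalOverflowException("Overflow in sum of decimals; consider try_avg")
      case Some(l) =>
        // Assumption: DECIMAL128 (34 digits) stands in for Spark's internal math context.
        // A zero divisor throws ArithmeticException here, unlike the behavior described for Spark.
        val quotient = l.divide(right, MathContext.DECIMAL128)
        toPrecision(quotient, precision, scale, nullOnOverflow)
    }
}
```

Under these assumptions, `DivideSketch.divide(Some(new JBigDecimal("10")), new JBigDecimal("4"), 20, 4, nullOnOverflow = true)` yields `Some(2.5000)`, while a `None` dividend yields `None` or throws depending on the flag.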
Partitioning Behavior
This expression does not affect partitioning behavior as it operates on individual rows:
- Preserves existing partitioning schemes
- Does not require data shuffle operations
- Can be applied within existing partition boundaries
Edge Cases
- Null Left Operand: Behavior depends on `nullOnOverflow` flag - either throws exception or returns null
- Null Right Operand: Standard null propagation applies, result is null
- Division by Zero: Handled by underlying decimal arithmetic, typically returns null
- Precision Overflow: When result exceeds target precision/scale, behavior controlled by `nullOnOverflow`
- Scale Truncation: Uses `ROUND_HALF_UP` rounding mode for scale adjustments
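The last two edge cases interact: rounding half-up can itself push a value past the target precision. The sketch below uses `java.math.BigDecimal` directly to show this; `RoundingSketch`, `rescale`, and `fits` are hypothetical names introduced here, not Spark APIs.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object RoundingSketch {
  // Adjust `value` to `scale` fractional digits, rounding ties away from zero.
  def rescale(value: String, scale: Int): JBigDecimal =
    new JBigDecimal(value).setScale(scale, RoundingMode.HALF_UP)

  // True when the rounded value still fits in `precision` total digits.
  def fits(value: String, precision: Int, scale: Int): Boolean =
    rescale(value, scale).precision() <= precision

  def main(args: Array[String]): Unit = {
    println(rescale("1.23445", 4))   // 1.2345 (tie rounds up)
    println(rescale("1.23444", 4))   // 1.2344
    println(rescale("-1.23445", 4))  // -1.2345 (tie rounds away from zero)
    println(fits("9.99994", 5, 4))   // true:  rounds to 9.9999
    println(fits("9.99995", 5, 4))   // false: rounds to 10.0000, which needs 6 digits
  }
}
```

The final case is the one where the `nullOnOverflow` setting would decide between a null result and an exception.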
Code Generation
- Generates inline Java code for both null checking and arithmetic operations
- Includes specialized null handling code based on `nullOnOverflow` setting
- Uses direct method calls to decimal arithmetic operations (`$div`)
- Integrates error context generation for exception handling
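To illustrate how a flag known at planning time can specialize the generated Java, here is a simplified sketch that builds a Java snippet as a string. It is not Spark's actual codegen: `CodegenSketch`, its method names, the variable-name parameters, and the exception text are all hypothetical, and the precision conversion and error-context wiring are omitted.

```scala
object CodegenSketch {
  // Java statement emitted for a null dividend; chosen once, when the code is generated.
  def nullLeftBranch(nullOnOverflow: Boolean, isNullVar: String): String =
    if (nullOnOverflow) s"$isNullVar = true;"
    else """throw new ArithmeticException("Overflow in sum of decimals; consider try_avg");"""

  // Returns Java source text. All variable names are placeholders supplied by the caller.
  def genDivide(leftVar: String, rightVar: String, resultVar: String,
                isNullVar: String, nullOnOverflow: Boolean): String =
    s"""
       |if ($leftVar == null) {
       |  ${nullLeftBranch(nullOnOverflow, isNullVar)}
       |} else {
       |  $resultVar = $leftVar.$$div($rightVar);
       |}
     """.stripMargin.trim

  def main(args: Array[String]): Unit = {
    println(genDivide("left", "right", "result", "isNull", nullOnOverflow = true))
    println(genDivide("left", "right", "result", "isNull", nullOnOverflow = false))
  }
}
```

Because the branch is selected while the source text is built, the emitted Java contains only one of the two null-handling paths and never tests the flag at run time.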
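Examples
```sql
-- This expression is not directly accessible in SQL
-- It's used internally by aggregate functions like:
SELECT AVG(decimal_column) FROM table_name;
```
```scala
// Internal usage in Spark Catalyst (not user-facing API)
import org.apache.spark.sql.catalyst.expressions.DecimalDivideWithOverflowCheck
import org.apache.spark.sql.types.DecimalType

// leftExpr, rightExpr, and queryContext are placeholders assumed to be in scope
val divExpr = DecimalDivideWithOverflowCheck(
  left = leftExpr,                // dividend, must evaluate to DecimalType
  right = rightExpr,              // divisor, must evaluate to DecimalType
  dataType = DecimalType(20, 4),  // target precision 20, scale 4
  context = queryContext,         // used for error reporting
  nullOnOverflow = true           // return null instead of throwing on overflow
)
```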
See Also
- `Divide` - Standard division expression without overflow checking
- `DecimalType` - Decimal data type specification
- `BinaryExpression` - Base class for binary expressions
- Aggregate functions that use decimal division (`avg`, `mean`)