
DecimalDivideWithOverflowCheck

Overview

DecimalDivideWithOverflowCheck is a specialized decimal division expression with configurable overflow handling. Spark SQL uses it internally for aggregate functions such as avg(), where overflow behavior must be controlled precisely: on overflow, the expression either returns null or throws an exception with detailed error context.

Syntax

This is an internal Spark Catalyst expression and is not directly exposed in SQL syntax. It is automatically generated by Spark's query planner for decimal division operations in aggregate contexts.

Arguments

Argument         Type           Description
left             Expression     The dividend expression (must evaluate to DecimalType)
right            Expression     The divisor expression (must evaluate to DecimalType)
dataType         DecimalType    The target decimal type with specific precision and scale
context          QueryContext   Query execution context for error reporting
nullOnOverflow   Boolean        Controls overflow behavior: true returns null, false throws an exception

Return Type

Returns a DecimalType with the precision and scale specified in the dataType parameter. The result can be null if nullOnOverflow is true and an overflow occurs.

Supported Data Types

  • Input Types: Both operands must be DecimalType
  • Output Type: DecimalType with configurable precision and scale

Algorithm

  • Evaluates the left operand (dividend) first and checks for null values
  • If the left operand is null and nullOnOverflow is false, throws an overflow exception that suggests "try_avg" (in the avg() use case, a null dividend indicates that the upstream sum has already overflowed)
  • If left operand is null and nullOnOverflow is true, returns null
  • Evaluates the right operand (divisor) when left operand is non-null
  • Performs decimal division using Spark's internal fractional arithmetic
  • Applies precision and scale conversion with ROUND_HALF_UP rounding mode
  • If the final precision conversion overflows, returns null when nullOnOverflow is true and throws an overflow exception when it is false
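
The steps above can be sketched in plain Scala. This is a minimal illustration built on java.math.BigDecimal, not Spark's implementation: the object name DecimalDivideSketch, the WorkingScale constant, the use of Option to model a null dividend, and the exception messages are all assumptions made for the example.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical stand-in for the evaluation steps; this is not Spark's implementation.
object DecimalDivideSketch {
  // Fractional digits kept during the raw division (an arbitrary choice for this sketch).
  private val WorkingScale = 38

  def divide(
      left: Option[JBigDecimal], // None models a null dividend
      right: JBigDecimal,
      precision: Int,
      scale: Int,
      nullOnOverflow: Boolean): Option[JBigDecimal] = left match {
    // A null dividend is treated as an earlier overflow.
    case None if nullOnOverflow => None
    case None =>
      throw new ArithmeticException("Overflow in sum of decimals; consider try_avg")
    case Some(dividend) =>
      // java.math.BigDecimal throws on a zero divisor; that case is not modeled here.
      val quotient = dividend.divide(right, WorkingScale, RoundingMode.HALF_UP)
      // Convert to the target scale, then check the total digit count.
      val rounded = quotient.setScale(scale, RoundingMode.HALF_UP)
      if (rounded.precision() <= precision) Some(rounded)
      else if (nullOnOverflow) None
      else throw new ArithmeticException(
        s"Result $rounded does not fit DECIMAL($precision, $scale)")
  }
}
```

For instance, dividing 10 by 4 with a DECIMAL(20, 4) target gives Some(2.5000), while dividing 1000 by 1 with a DECIMAL(5, 4) target gives None when nullOnOverflow is true and throws when it is false.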

Partitioning Behavior

This expression does not affect partitioning behavior as it operates on individual rows:

  • Preserves existing partitioning schemes
  • Does not require data shuffle operations
  • Can be applied within existing partition boundaries

Edge Cases

  • Null Left Operand: Returns null when nullOnOverflow is true; throws an overflow exception when it is false
  • Null Right Operand: Standard null propagation applies, result is null
  • Division by Zero: Handled by underlying decimal arithmetic, typically returns null
  • Precision Overflow: When the result does not fit the target precision and scale, returns null if nullOnOverflow is true and throws an exception otherwise
  • Scale Reduction: When the result has more fractional digits than the target scale, it is rounded with ROUND_HALF_UP rather than truncated
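
The rounding and precision cases can interact: rounding half up can carry into a new leading digit and push a value past the target precision. The sketch below shows this with java.math.BigDecimal. The helper name fit and the object RoundingEdgeCases are hypothetical, and the digit-count check is an assumption about how "fits the target type" is decided, not a copy of Spark's logic.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical helper for exploring the rounding and overflow edge cases.
object RoundingEdgeCases {
  /** Rounds `value` to `scale` with HALF_UP and reports whether it fits in `precision` digits. */
  def fit(value: String, precision: Int, scale: Int): (JBigDecimal, Boolean) = {
    val rounded = new JBigDecimal(value).setScale(scale, RoundingMode.HALF_UP)
    (rounded, rounded.precision() <= precision)
  }

  def main(args: Array[String]): Unit = {
    println(fit("2.34565", 20, 4))     // tie is rounded away from zero
    println(fit("-2.34565", 20, 4))    // same rule for negative values
    println(fit("99999.99994", 9, 4))  // stays within 9 digits
    println(fit("99999.99995", 9, 4))  // carry adds a digit and exceeds 9
  }
}
```

Here 2.34565 rounds to 2.3457 and -2.34565 to -2.3457, whereas 99999.99995 rounds to 100000.0000, which needs 10 digits and therefore no longer fits DECIMAL(9, 4).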

Code Generation

Supports Spark's Tungsten code generation with optimized Java code paths:

  • Generates inline Java code for both null checking and arithmetic operations
  • Includes specialized null handling code based on nullOnOverflow setting
  • Uses direct method calls to decimal arithmetic operations ($div)
  • Integrates error context generation for exception handling
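
To make the shape of such generated code concrete, the sketch below assembles a Java snippet as a string, specializing the null-handling branch on nullOnOverflow. It is illustrative only: genDivideSnippet, the object CodegenSketch, and every variable name in the emitted Java (leftIsNull, leftValue, rightValue, value, isNull) are hypothetical, and the emitted toPrecision call is a simplified placeholder rather than Spark's actual generated code.

```scala
// Hypothetical sketch of specializing generated Java on nullOnOverflow.
object CodegenSketch {
  def genDivideSnippet(precision: Int, scale: Int, nullOnOverflow: Boolean): String = {
    // The null branch is decided when the code is generated, not at run time.
    val nullHandling =
      if (nullOnOverflow) "// null dividend: result stays null"
      else "throw new ArithmeticException(\"Overflow in sum of decimals\");"

    // "$$" emits a literal "$", so the snippet calls the $div method by name.
    s"""boolean isNull = leftIsNull;
       |Decimal value = null;
       |if (leftIsNull) {
       |  $nullHandling
       |} else {
       |  value = leftValue.$$div(rightValue).toPrecision($precision, $scale, ROUND_HALF_UP, $nullOnOverflow);
       |  isNull = value == null;
       |}""".stripMargin
  }
}
```

Because the branch is chosen at generation time, the emitted Java contains a throw statement only when nullOnOverflow is false, which keeps the per-row path free of a flag check.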

Examples

-- This expression is not directly accessible in SQL
-- It's used internally by aggregate functions like:
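SELECT AVG(decimal_column) FROM table_name;

// Internal usage in Spark Catalyst (not a user-facing API).
// leftExpr, rightExpr, and queryContext are placeholders for values built elsewhere.
import org.apache.spark.sql.catalyst.expressions.DecimalDivideWithOverflowCheck
import org.apache.spark.sql.types.DecimalType

val divExpr = DecimalDivideWithOverflowCheck(
  left = leftExpr,
  right = rightExpr,
  dataType = DecimalType(20, 4),
  context = queryContext,
  nullOnOverflow = true // return null on overflow instead of throwing
)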

See Also

  • Divide - Standard division expression without overflow checking
  • DecimalType - Decimal data type specification
  • BinaryExpression - Base class for binary expressions
  • Aggregate functions that use decimal division (avg, mean)