Skip to content

CheckOverflowInTableInsert

Overview

The CheckOverflowInTableInsert expression is a wrapper that captures arithmetic overflow errors during numeric casting operations in table insert scenarios. It provides enhanced error messages that include the source and target data types along with the column name where the overflow occurred, making debugging easier for users.

Syntax

This expression is not directly accessible through SQL syntax. It's an internal Catalyst expression used by Spark's query planner to wrap cast operations during table insertions.

// Internal usage in Catalyst
CheckOverflowInTableInsert(castExpression, columnName)

Arguments

Argument Type Description
child Expression The underlying expression (typically a Cast) that may cause overflow
columnName String The name of the target column for error reporting

Return Type

Returns the same data type as the child expression (child.dataType).

Supported Data Types

Supports all data types that the underlying child expression supports, but is primarily designed for numeric casting operations where overflow can occur:

  • Byte, Short, Integer, Long
  • Float, Double
  • Decimal types
  • Any other data types supported by the wrapped expression

Algorithm

  • Executes the child expression normally during evaluation
  • Catches any SparkArithmeticException thrown by the child expression
  • If the child is a Cast operation, throws an enhanced error message with source type, target type, and column name
  • If the child is not a Cast, re-throws the original exception
  • For code generation, wraps the child's generated code in a try-catch block with enhanced error handling

Partitioning Behavior

This expression preserves the partitioning behavior of its child expression:

  • Does not affect partitioning directly as it's a wrapper expression
  • Partitioning behavior depends entirely on the wrapped child expression
  • No additional shuffle operations are introduced

Edge Cases

  • Null handling: Preserves the null handling behavior of the child expression
  • Non-Cast children: When wrapping non-Cast expressions, falls back to standard error handling
  • Proxy expressions: Handles ExpressionProxy wrappers around Cast expressions
  • Overflow detection: Only catches SparkArithmeticException, allowing other exceptions to propagate normally
  • Error message enhancement: Only provides enhanced error messages when the child is identifiable as a Cast operation

Code Generation

Supports full code generation (Tungsten):

  • Generates optimized code when the child is a Cast expression
  • Wraps the child's generated code in a try-catch block
  • Includes proper null handling in generated code
  • Falls back to child's code generation for non-Cast expressions
  • Uses CodegenContext to manage object references for data types and column names

Examples

-- This expression is used internally during operations like:
INSERT INTO target_table (numeric_column) 
SELECT large_value FROM source_table;
-- When large_value causes overflow in numeric_column
// Internal Catalyst usage (not user-facing)
val castExpr = Cast(child = largeIntExpr, dataType = ByteType)
val safeExpr = CheckOverflowInTableInsert(castExpr, "target_column")

See Also

  • Cast - The primary expression this wrapper enhances
  • UnaryExpression - The base class this expression extends
  • ExpressionProxy - Handled specially in cast detection
  • QueryExecutionErrors.castingCauseOverflowErrorInTableInsert - The enhanced error method used