CheckOverflowInTableInsert¶
Overview¶
The CheckOverflowInTableInsert expression is a wrapper that captures arithmetic overflow errors during numeric casting operations in table insert scenarios. It provides enhanced error messages that include the source and target data types along with the column name where the overflow occurred, making debugging easier for users.
Syntax¶
This expression is not directly accessible through SQL syntax. It's an internal Catalyst expression used by Spark's query planner to wrap cast operations during table insertions.
Arguments¶
| Argument | Type | Description |
|---|---|---|
| child | Expression | The underlying expression (typically a Cast) that may cause overflow |
| columnName | String | The name of the target column for error reporting |
Return Type¶
Returns the same data type as the child expression (child.dataType).
Supported Data Types¶
Supports all data types that the underlying child expression supports, but is primarily designed for numeric casting operations where overflow can occur:
- Byte, Short, Integer, Long
- Float, Double
- Decimal types
- Any other data types supported by the wrapped expression
Algorithm¶
- Executes the child expression normally during evaluation
- Catches any `SparkArithmeticException` thrown by the child expression
- If the child is a Cast operation, throws an enhanced error message with source type, target type, and column name
- If the child is not a Cast, re-throws the original exception
- For code generation, wraps the child's generated code in a try-catch block with enhanced error handling
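
The interpreted evaluation path described above can be sketched without any Spark dependency. This is a minimal illustration, not Spark's source: `Expr`, `OverflowException` (standing in for `SparkArithmeticException`), `CastLongToByte`, `AddOne`, and `CheckOverflow` are all hypothetical names invented for this sketch.

```scala
// Hypothetical stand-ins for Catalyst types; none of these are Spark classes.
object OverflowWrapperSketch {
  // Plays the role of SparkArithmeticException in this sketch.
  final class OverflowException(msg: String) extends ArithmeticException(msg)

  sealed trait Expr { def eval(input: Long): Any }

  // A cast from Long to Byte that fails on overflow instead of truncating.
  final case class CastLongToByte() extends Expr {
    def eval(input: Long): Any = {
      if (input < Byte.MinValue || input > Byte.MaxValue) {
        throw new OverflowException(s"$input out of range for TINYINT")
      }
      input.toByte
    }
  }

  // A non-cast expression that can also raise an overflow error.
  final case class AddOne() extends Expr {
    def eval(input: Long): Any = {
      if (input == Long.MaxValue) throw new OverflowException("long overflow")
      input + 1
    }
  }

  // The wrapper: evaluate the child, and enrich the error only for casts.
  final case class CheckOverflow(child: Expr, columnName: String) extends Expr {
    def eval(input: Long): Any =
      try child.eval(input)
      catch {
        case e: OverflowException =>
          child match {
            case _: CastLongToByte =>
              // Enhanced message: source type, target type, and column name.
              throw new OverflowException(
                s"Overflow casting BIGINT to TINYINT for column '$columnName'")
            case _ =>
              // Not a cast: re-throw the original exception unchanged.
              throw e
          }
      }
  }
}
```

With these definitions, `CheckOverflow(CastLongToByte(), "c1").eval(1000L)` fails with a message that names column `c1`, while wrapping `AddOne()` surfaces the child's own message.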
Partitioning Behavior¶
This expression preserves the partitioning behavior of its child expression:
- Does not affect partitioning directly as it's a wrapper expression
- Partitioning behavior depends entirely on the wrapped child expression
- No additional shuffle operations are introduced
Edge Cases¶
- Null handling: Preserves the null handling behavior of the child expression
- Non-Cast children: When wrapping non-Cast expressions, re-throws the original exception unchanged
- Proxy expressions: Handles `ExpressionProxy` wrappers around Cast expressions
- Overflow detection: Only catches `SparkArithmeticException`, allowing other exceptions to propagate normally
- Error message enhancement: Only provides enhanced error messages when the child is identifiable as a Cast operation
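
The cast-detection cases listed above amount to a small pattern match. The following is a hedged sketch using hypothetical stand-in types (`Cast`, `Proxy`, and `Other` here are simplified placeholders, and `castTypes` is an invented helper), intended only to show how a cast hidden behind a proxy can still be recognized.

```scala
// Simplified placeholders; Spark's Cast and ExpressionProxy carry more fields.
object CastDetectionSketch {
  sealed trait Expr
  final case class Cast(from: String, to: String) extends Expr
  final case class Proxy(child: Expr) extends Expr // stands in for ExpressionProxy
  final case class Other(name: String) extends Expr

  // Returns (sourceType, targetType) when the child is identifiable as a cast,
  // either directly or behind a single proxy wrapper; None otherwise.
  def castTypes(child: Expr): Option[(String, String)] = child match {
    case Cast(from, to)        => Some((from, to))
    case Proxy(Cast(from, to)) => Some((from, to))
    case _                     => None
  }
}
```

A `None` result corresponds to the fallback case, where no enhanced message can be built and the original error is left as is.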
Code Generation¶
Supports full code generation (Tungsten):
- Generates optimized code when the child is a Cast expression
- Wraps the child's generated code in a try-catch block
- Includes proper null handling in generated code
- Falls back to child's code generation for non-Cast expressions
- Uses `CodegenContext` to manage object references for data types and column names
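
To make the shape of the generated code more concrete, the sketch below builds a Java source fragment as a string, roughly the way a code generator would splice a child's code into a try-catch block. It is an assumption-laden illustration: `ChildCode`, `wrapInTryCatch`, the variable names, and the emitted exception are hypothetical, and the real generator resolves types and column names through `CodegenContext` references rather than string literals.

```scala
object CodegenSketch {
  // Fragments a child expression would contribute; all names are hypothetical.
  // `code` is assumed to be a single line so that stripMargin below stays safe.
  final case class ChildCode(code: String, isNull: String, value: String)

  // Wraps the child's generated Java code in a try-catch block with null handling.
  def wrapInTryCatch(
      child: ChildCode,
      javaType: String,
      resultVar: String,
      isNullVar: String,
      columnName: String): String =
    s"""boolean $isNullVar = true;
       |$javaType $resultVar = 0;
       |try {
       |  ${child.code}
       |  $isNullVar = ${child.isNull};
       |  $resultVar = ${child.value};
       |} catch (ArithmeticException e) {
       |  throw new ArithmeticException("Overflow in column $columnName");
       |}""".stripMargin
}
```

For example, `wrapInTryCatch(ChildCode("byte v1 = castToByte(in);", "false", "v1"), "byte", "result", "isNull", "c1")` yields a fragment in which the child's statement runs inside the `try` and any arithmetic failure is replaced by an error naming `c1`.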
Examples¶
-- This expression is used internally during operations like:
INSERT INTO target_table (numeric_column)
SELECT large_value FROM source_table;
-- When large_value causes overflow in numeric_column
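// Internal Catalyst usage (not user-facing)
import org.apache.spark.sql.catalyst.expressions.{Cast, CheckOverflowInTableInsert, Literal}
import org.apache.spark.sql.types.ByteType
val largeIntExpr = Literal(1000) // illustrative input; 1000 does not fit in a Byte
val castExpr = Cast(child = largeIntExpr, dataType = ByteType)
val safeExpr = CheckOverflowInTableInsert(castExpr, "target_column")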
See Also¶
- `Cast` - The primary expression this wrapper enhances
- `UnaryExpression` - The base class this expression extends
- `ExpressionProxy` - Handled specially in cast detection
- `QueryExecutionErrors.castingCauseOverflowErrorInTableInsert` - The enhanced error method used