UnresolvedCollation¶
Overview¶
UnresolvedCollation is a temporary leaf expression used during Spark SQL's analysis phase to represent a collation name that has not yet been resolved from a fully qualified collation name. This expression serves as a placeholder until the collation specification can be validated and resolved into a concrete collation implementation.
Syntax¶
Arguments¶
| Argument | Type | Description |
|---|---|---|
| collationName | Seq[String] | A sequence of strings representing the hierarchical collation name parts (e.g., database, schema, collation) |
Return Type¶
This expression throws an UnresolvedException when dataType is accessed, as it is not meant to be evaluated directly. The return type is determined after resolution.
Supported Data Types¶
This expression does not directly support data types as it is an intermediate representation. After resolution, collations typically apply to string-based data types.
Algorithm¶
-
Stores the unresolved collation name as a sequence of string identifiers
-
Implements the
Unevaluabletrait, preventing direct evaluation -
Maintains
resolved = falseto indicate it requires further analysis -
Throws
UnresolvedExceptionwhen data type information is requested -
Uses the
UNRESOLVED_COLLATIONtree pattern for identification during analysis
Partitioning Behavior¶
This expression does not affect partitioning behavior as it is resolved away during the analysis phase before physical planning:
-
Does not preserve or affect partitioning (resolved before physical planning)
-
No shuffle requirements (not present during execution)
Edge Cases¶
-
Throws
UnresolvedExceptionwhendataTypeis accessed before resolution -
Returns
falsefornullableproperty by default -
Cannot be evaluated directly due to
Unevaluabletrait implementation -
Must be resolved during analysis phase or query compilation will fail
Code Generation¶
This expression does not support code generation as it implements the Unevaluable trait and is designed to be resolved away during the analysis phase before code generation occurs.
Examples¶
-- Internal representation during parsing of:
SELECT col COLLATE utf8_general_ci FROM table
-- Creates UnresolvedCollation(Seq("utf8_general_ci"))
// This is an internal expression, not directly created in DataFrame API
// It would be created during SQL parsing of collation specifications
val unresolvedCollation = UnresolvedCollation(Seq("utf8", "general", "ci"))
See Also¶
- ResolvedCollation expressions (after analysis phase)
- Collation-related catalog functions
- String expressions that support collation