ToJavaArray¶
Overview¶
The ToJavaArray expression converts Spark's internal ArrayData representation to a native Java array. This expression is primarily used internally by Spark for interoperability with Java code and external systems that require native Java array types.
Syntax¶
This expression is typically used internally and not directly exposed in SQL syntax. It may be used programmatically in custom expressions or UDFs.
Arguments¶
| Argument | Type | Description |
|---|---|---|
| array | Expression | The array expression to be converted to a Java array |
Return Type¶
Returns a native Java array type corresponding to the element type of the input ArrayType. The specific Java array type depends on the element type (e.g., int[], String[], Object[]).
Supported Data Types¶
- ArrayType with any supported element type
- Supports nested arrays (multidimensional arrays)
- Compatible with all Spark SQL data types that can be represented as array elements
Algorithm¶
- Evaluates the input array expression to get the
ArrayDataobject - Determines the appropriate Java array type based on the element data type
- Calls the corresponding
ArrayData.to{XXX}Array()method to perform the conversion - Returns the native Java array object
- Leverages constant folding optimization when the input expression is foldable
Partitioning Behavior¶
This expression does not affect partitioning behavior:
- Preserves existing partitioning as it operates on individual array values
- Does not require shuffle operations
- Executed locally on each partition
Edge Cases¶
- Null handling: If the input array expression evaluates to null, returns null
- Empty arrays: Empty
ArrayDatais converted to an empty Java array of the appropriate type - Null elements: Null elements within the array are preserved in the resulting Java array
- Type compatibility: Element types must be convertible to corresponding Java types
Code Generation¶
This expression supports Tungsten code generation when possible. The code generation optimization depends on the underlying array element types and the specific conversion method used.
Examples¶
// This expression is primarily used internally
// Example in custom expression implementation
case class CustomArrayProcessor(input: Expression) extends UnaryExpression {
override def child: Expression = ToJavaArray(input)
// Process the Java array...
}
// Example usage in UDF context
import org.apache.spark.sql.catalyst.expressions.ToJavaArray
// Used internally when Spark needs to convert ArrayData to Java arrays
// for external function calls or interoperability
See Also¶
ArrayData- Spark's internal array representation- Array functions (
array,array_contains,array_join) - Collection expressions in Spark Catalyst
ConstantFoldingoptimizer rule