Inline¶
Overview¶
The Inline expression explodes an array of structs into a table: each struct in the array becomes one output row, and the struct's fields become that row's columns.
Syntax¶
Arguments¶
| Argument | Type | Description |
|---|---|---|
| child | Expression | An expression that evaluates to an array of struct types |
Return Type¶
Returns a collection of InternalRow objects, where each row has the schema of the struct type contained in the input array.
Supported Data Types¶
- Input: `ArrayType` containing `StructType` elements (with or without nulls)
- Output: Individual rows with the schema matching the struct fields
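
The input constraint above can be illustrated with a small, Spark-free sketch. The data type classes below are hypothetical stand-ins that only borrow Catalyst's names, and `acceptsInput` is an illustrative helper, not a Spark API.

```scala
object InlineTypeCheckSketch {
  // Tiny hypothetical model of Catalyst data types; these are NOT the Spark classes.
  sealed trait DataType
  case object IntegerType extends DataType
  case object StringType extends DataType
  final case class StructType(fieldNames: Seq[String]) extends DataType
  final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

  // Accepts only arrays whose elements are structs, with or without nulls.
  def acceptsInput(dt: DataType): Boolean = dt match {
    case ArrayType(_: StructType, _) => true
    case _                           => false
  }
}
```

Under this model, an array of structs is accepted regardless of its `containsNull` flag, while an array of integers or a bare struct is rejected.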
Algorithm¶
- Evaluates the child expression to get an `ArrayData` object
- Iterates through each element in the array (from 0 to `numElements() - 1`)
- For each element, extracts the struct using `getStruct(i, numFields)`
- Returns each struct as a separate row, using `generatorNullRow` for null structs
- Returns empty collection (`Nil`) if the input array is null
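
The steps above can be sketched in plain Scala without a Spark dependency. This is a simplified model, not the Catalyst implementation: `Row`, `nullRow`, and `inlineEval` are hypothetical names, and ordinary `Seq` values stand in for `ArrayData` and `InternalRow`.

```scala
object InlineEvalSketch {
  // A row is modelled as a sequence of field values (hypothetical type).
  type Row = Seq[Any]

  // Plays the role of generatorNullRow: one null per struct field.
  def nullRow(numFields: Int): Row = Seq.fill(numFields)(null)

  // `input` models the evaluated child expression:
  // null stands for a null array, and a null entry for a null struct element.
  def inlineEval(input: Seq[Row], numFields: Int): Seq[Row] =
    if (input == null) Nil
    else input.map(struct => if (struct == null) nullRow(numFields) else struct)
}
```

For instance, `InlineEvalSketch.inlineEval(Seq(Seq("John", 25), null), numFields = 2)` yields two rows, the second holding a null for each field, while a null or empty input yields no rows.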
Partitioning Behavior¶
- Does not shuffle data on its own: rows generated from an input row stay in the same partition as that row
- May require data redistribution depending on downstream operations
- The number of output rows per input row depends on the array size
Edge Cases¶
- Null array input: Returns empty collection (`Nil`)
- Null struct elements: Replaced with `generatorNullRow` (a row with all null values)
- Empty array: Returns empty collection
- Nullable struct fields: Handled based on the array's `containsNull` property; if true, the output schema is made nullable using `st.asNullable`
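
The schema rule in the last bullet can be sketched as follows. `Field`, `Struct`, and `elementSchema` here are hypothetical, flattened stand-ins for Spark's schema types; the sketch assumes only top-level fields and does not model nested types.

```scala
object InlineSchemaSketch {
  // Hypothetical, minimal stand-ins for a struct field and a struct schema.
  final case class Field(name: String, nullable: Boolean)
  final case class Struct(fields: Seq[Field]) {
    // Same idea as asNullable: every field is marked nullable.
    def asNullable: Struct = Struct(fields.map(_.copy(nullable = true)))
  }

  // Output schema for an array-of-struct input, given the array's containsNull flag.
  def elementSchema(st: Struct, containsNull: Boolean): Struct =
    if (containsNull) st.asNullable else st
}
```

The reasoning is that a null struct element produces a row of nulls, so every output column must be able to hold null whenever the array may contain null elements.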
Code Generation¶
Supports Tungsten code generation through the doGenCode method, though the implementation delegates to the child expression's code generation.
Examples¶
```sql
-- Example SQL usage
SELECT INLINE(array_of_people)
FROM (
  SELECT array(
    struct('John' as name, 25 as age),
    struct('Jane' as name, 30 as age)
  ) as array_of_people
)
-- Results in two rows: (John, 25) and (Jane, 30)
```
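```scala
// Example DataFrame API usage
// Assumes an active SparkSession named `spark` (as in spark-shell)
import org.apache.spark.sql.functions._
import spark.implicits._ // needed for toDF on a local Seq

val df = Seq(
  Array(("John", 25), ("Jane", 30))
).toDF("people")

// If your Spark version lacks functions.inline, expr("inline(people)") is an alternative
df.select(inline(col("people")))
// Results in DataFrame with columns: _1 (String), _2 (Int)
```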
See Also¶
- `Explode` - for exploding arrays into rows without struct expansion
- `PosExplode` - for exploding with position information
- `CollectionGenerator` - parent trait for collection-based generators