Left¶
Overview¶
The `Left` expression extracts the leftmost `len` characters of a string, or the leftmost `len` bytes of a binary value. It is implemented as a `RuntimeReplaceable` that is rewritten to the `Substring` expression with a starting position of 1.
Syntax¶
`LEFT(str, len)`
Arguments¶
| Argument | Type | Description |
|---|---|---|
| str | String/Binary | The input string or binary data from which to extract characters |
| len | Integer | The number of characters to extract from the left side |
Return Type¶
Returns the same data type as the input str argument:
- String input returns String
- Binary input returns Binary
Supported Data Types¶
- String types: String types of any collation, including collations that use a trim specifier
- Binary type: Raw binary data
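For binary input, the length is assumed to count bytes rather than characters. A minimal sketch of that behavior in plain Scala, with no Spark dependency; `LeftBinary` is a hypothetical helper used only for illustration, not part of Spark:

```scala
object LeftBinary {
  // Hypothetical model of LEFT on binary input: keep the leftmost `len` bytes.
  // A non-positive `len` yields an empty array; a `len` past the end yields all bytes.
  def left(bytes: Array[Byte], len: Int): Array[Byte] =
    bytes.take(math.max(len, 0))
}
```

Under this model, the first four bytes of the UTF-8 encoding of `binary_data` are the bytes of `bina`, and the result is still a byte array.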
Algorithm¶
- Validates input types, ensuring `str` is string/binary and `len` is integer
- Performs implicit type casting when necessary through `ImplicitCastInputTypes`
- Is replaced by `Substring(str, Literal(1), len)` before execution; despite the name `RuntimeReplaceable`, the substitution is made while the query is optimized
- The substring operation starts at position 1 (1-based indexing) and extracts `len` characters
- Leverages the existing `Substring` expression implementation for actual evaluation
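The rewrite step can be pictured with a toy expression tree. The sketch below is plain Scala and does not use Spark's classes; `Expr`, `Str`, `IntLit`, `SubstringExpr`, `LeftExpr`, and `LeftRewrite` are hypothetical names chosen for illustration, and the traversal is a simplified stand-in for whatever Spark does internally.

```scala
// Toy model of the rewrite; these types are illustrative, not Spark's classes.
sealed trait Expr
final case class Str(value: String) extends Expr
final case class IntLit(value: Int) extends Expr
final case class SubstringExpr(str: Expr, pos: Expr, len: Expr) extends Expr
final case class LeftExpr(str: Expr, len: Expr) extends Expr {
  // Mirrors the replacement described above: Substring(str, Literal(1), len)
  def replacement: Expr = SubstringExpr(str, IntLit(1), len)
}

object LeftRewrite {
  // Replace every LeftExpr in the tree with its Substring form, including nested ones.
  def rewrite(e: Expr): Expr = e match {
    case l: LeftExpr            => rewrite(l.replacement)
    case SubstringExpr(s, p, n) => SubstringExpr(rewrite(s), rewrite(p), rewrite(n))
    case other                  => other
  }
}
```

After `LeftRewrite.rewrite`, no `LeftExpr` node remains in the tree, which is why evaluation only ever needs the substring logic.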
Partitioning Behavior¶
This expression preserves partitioning behavior:
- Does not require data shuffle as it operates on individual rows
- Maintains existing data partitioning since it's a row-level transformation
- Can be pushed down to individual partitions for parallel execution
Edge Cases¶
- Null handling: If either `str` or `len` is null, the result is null
- Negative length: Returns an empty string, because `Substring` yields an empty result for any `len` less than or equal to 0
- Length exceeds string: Returns the entire string when `len` is greater than the string length
- Zero length: Returns an empty string when `len` is 0
- Empty string input: Returns an empty string for any non-null `len` value
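These cases can be checked against a small reference model. The sketch below is plain Scala without Spark; `LeftSemantics` is a hypothetical helper, `None` stands in for SQL NULL, and it assumes that a non-positive `len` yields an empty string (as Spark's built-in function documentation describes for `left`) and that characters are counted as Unicode code points.

```scala
object LeftSemantics {
  // Hypothetical reference model for string inputs; None stands in for SQL NULL.
  def left(str: Option[String], len: Option[Int]): Option[String] =
    for {
      s <- str // a null string gives a null result
      n <- len // a null length gives a null result
    } yield {
      if (n <= 0) "" // zero or negative length: empty string (assumed)
      else {
        val count = s.codePointCount(0, s.length)
        if (n >= count) s // length exceeds the string: return it whole
        else s.substring(0, s.offsetByCodePoints(0, n)) // leftmost n code points
      }
    }
}
```

For example, `LeftSemantics.left(Some("Apache Spark"), Some(3))` evaluates to `Some("Apa")`, and `LeftSemantics.left(None, Some(3))` evaluates to `None`.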
Code Generation¶
This expression supports Tungsten code generation through its `RuntimeReplaceable` nature:
- The replacement `Substring` expression supports code generation
- The rewrite happens during query planning, before any code is generated, so the compiled code is that of `Substring` rather than `Left`
- Falls back to interpreted mode only if the underlying `Substring` expression cannot be code-generated
Examples¶
```sql
-- Extract first 3 characters
SELECT LEFT('Apache Spark', 3); -- Returns 'Apa'

-- With column reference
SELECT LEFT(name, 5) FROM users;

-- With binary data
SELECT LEFT(CAST('binary_data' AS BINARY), 4);
```
```scala
// DataFrame API examples; `df` is assumed to be an existing DataFrame
import org.apache.spark.sql.functions._

// Extract first 3 characters (the length is a Column, so wrap literals in lit)
df.select(left(col("text_column"), lit(3)))

// Dynamic length based on another column
df.select(left(col("description"), col("max_length")))

// With literal string
df.select(left(lit("Apache Spark"), lit(5)))
```
See Also¶
- `Substring` - The underlying expression used for implementation
- `Right` - Extracts characters from the right side of a string
- `Mid`/`Substr` - General substring extraction with custom start position