StringRepeat¶
Overview¶
The StringRepeat expression, exposed in SQL as repeat, takes a string and an integer count and returns a new string consisting of the input repeated that many times.
Syntax¶
`repeat(str, times)`
Arguments¶
| Argument | Type | Description |
|---|---|---|
| str | StringTypeWithCollation | The input string to be repeated |
| times | IntegerType | The number of times to repeat the string |
Return Type¶
Returns the same data type as the input string (str.dataType), which is a string type with collation support.
Supported Data Types¶
- Input string: String types with collation support (including trim collation)
- Input count: Integer type only
- Implicit casting is supported through `ImplicitCastInputTypes`
Algorithm¶
- Validates that both input expressions are not null (null intolerant behavior)
- Casts the string input to `UTF8String` format internally
- Casts the times parameter to `Integer` type
- Calls the native `repeat()` method on the UTF8String object
- Returns the concatenated result as a UTF8String
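The steps above can be sketched in plain Scala without a Spark dependency. This is a simplified model, not Spark's implementation: `RepeatEvalSketch`, `repeatString`, and `evalRepeat` are hypothetical names, `Option` stands in for SQL null, a plain `String` stands in for `UTF8String`, and the sketch assumes that counts of zero or less produce the empty string.

```scala
// Simplified model of the evaluation path; not Spark's actual code.
// Assumptions: Option models SQL NULL, String models UTF8String.
object RepeatEvalSketch {

  // Stand-in for UTF8String.repeat(): counts <= 0 yield the empty string.
  def repeatString(s: String, times: Int): String =
    if (times <= 0) "" else s * times

  // Null-intolerant evaluation: if either input is null, the result is null.
  def evalRepeat(str: Option[String], times: Option[Int]): Option[String] =
    for {
      s <- str
      n <- times
    } yield repeatString(s, n)

  def main(args: Array[String]): Unit = {
    println(evalRepeat(Some("123"), Some(2))) // Some(123123)
    println(evalRepeat(None, Some(2)))        // None
    println(evalRepeat(Some("abc"), None))    // None
  }
}
```

The for-comprehension short-circuits on the first missing input, which mirrors the rule that the repetition step only runs when both arguments are present.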
Partitioning Behavior¶
This expression does not affect partitioning behavior:
- Preserves existing partitioning since it's a row-level transformation
- Does not require shuffle operations
- Can be applied within each partition independently
Edge Cases¶
- Null handling: Returns null if either input argument is null (null intolerant)
- Zero repetitions: Returns empty string when times = 0
- Negative repetitions: Returns empty string, because UTF8String.repeat() treats any count less than or equal to zero as zero repetitions
- Empty string input: Returns empty string regardless of repetition count
- Large repetition counts: The result occupies roughly the input's byte length multiplied by times, so a large count on a long string may exhaust memory or overflow the size computation
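As a rough illustration of the large-count case, the sketch below estimates the output size before repeating. It is a hypothetical guard and not something Spark is known to perform: `RepeatSizeSketch`, `projectedBytes`, `safeRepeat`, and the `maxBytes` limit are illustrative names and values, and a plain `String` encoded as UTF-8 stands in for `UTF8String`.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical size guard around string repetition; not part of Spark.
object RepeatSizeSketch {

  // Projected UTF-8 size of the repeated string, computed in Long so the
  // multiplication cannot wrap around the way Int arithmetic would.
  def projectedBytes(s: String, times: Int): Long =
    if (times <= 0) 0L
    else s.getBytes(StandardCharsets.UTF_8).length.toLong * times.toLong

  // Repeat only if the projected size fits within maxBytes (assumed limit).
  def safeRepeat(s: String, times: Int, maxBytes: Long): Either[String, String] = {
    val size = projectedBytes(s, times)
    if (size > maxBytes) Left(s"result would need $size bytes, limit is $maxBytes")
    else Right(if (times <= 0) "" else s * times)
  }

  def main(args: Array[String]): Unit = {
    println(safeRepeat("abc", 0, 1024L))                    // Right()
    println(safeRepeat("", 1000, 1024L))                    // Right()
    println(projectedBytes("abcd", Int.MaxValue))           // 8589934588
    println(safeRepeat("abcd", Int.MaxValue, 1024L).isLeft) // true
  }
}
```

A four-byte string repeated `Int.MaxValue` times would need about 8.6 GB, which is why the estimate is done in `Long` before any allocation is attempted.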
Code Generation¶
This expression supports Tungsten code generation through the doGenCode method. It generates optimized code that directly calls the repeat() method on UTF8String objects, avoiding interpretation overhead.
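To make the idea concrete, here is a toy model of template-based code generation in plain Scala. It is not Spark's `doGenCode`: `CodegenSketch`, `genRepeat`, and the variable names passed in are hypothetical, and the emitted Java text only approximates the shape such generated code might take.

```scala
// Toy code generator; illustrates the idea only, not Spark's codegen API.
object CodegenSketch {

  // Builds a Java source fragment that null-checks both inputs and then
  // calls repeat() directly on the string variable.
  def genRepeat(strVar: String, timesVar: String, resultVar: String): String =
    s"""boolean ${resultVar}IsNull = ${strVar}IsNull || ${timesVar}IsNull;
       |UTF8String $resultVar = null;
       |if (!${resultVar}IsNull) {
       |  $resultVar = ($strVar).repeat($timesVar);
       |}""".stripMargin

  def main(args: Array[String]): Unit =
    println(genRepeat("str_0", "times_0", "value_1"))
}
```

The point of the template is that the per-row work reduces to a null check and one direct method call, with no expression-tree traversal at run time.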
Examples¶
```sql
-- Basic usage
SELECT repeat('123', 2);
-- Result: '123123'

-- With column reference
SELECT repeat(name, 3) FROM table;

-- Zero repetitions
SELECT repeat('abc', 0);
-- Result: ''
```

```scala
// DataFrame API usage
import org.apache.spark.sql.functions._

df.select(repeat(col("text_column"), 3))

// With literal values
df.select(repeat(lit("hello"), 2))
```
See Also¶
- StringRepeat - Current expression for string repetition
- Concat - For concatenating multiple strings
- StringPad - For padding strings to specific lengths
- Other string manipulation functions in the string_funcs group