Luhncheck¶
Overview¶
The Luhncheck expression validates whether a given string represents a valid number according to the Luhn algorithm (also known as the "modulus 10" algorithm). This is commonly used to validate credit card numbers, IMEI numbers, and other identification numbers that use Luhn checksum validation.
Syntax¶
Arguments¶
| Argument | Type | Description |
|---|---|---|
| input | String | The string to be validated using the Luhn algorithm |
Return Type¶
BooleanType - Returns true if the input string passes Luhn validation, false otherwise.
Supported Data Types¶
- String types with collation support (supports trim collation)
- Input values are implicitly cast to string type if not already strings
Algorithm¶
- Delegates validation to the
ExpressionImplUtils.isLuhnNumberstatic method viaStaticInvoke - Implements the Luhn algorithm which processes digits from right to left
- Doubles every second digit from the right, subtracting 9 if the result exceeds 9
- Sums all digits and checks if the total is divisible by 10
- Returns boolean result indicating whether the checksum validates
Partitioning Behavior¶
This expression preserves partitioning behavior:
- Does not require data shuffling across partitions
- Can be evaluated independently on each partition
- Maintains existing partitioning scheme when used in transformations
Edge Cases¶
- Null handling: Returns
nullif the input string isnull - Empty string: Returns
falsefor empty strings as they cannot contain valid Luhn numbers - Non-numeric characters: Behavior depends on the underlying
isLuhnNumberimplementation - Leading/trailing whitespace: Handled according to trim collation support
- Single digit numbers: May return
falseas Luhn algorithm typically requires multiple digits
Code Generation¶
This expression uses RuntimeReplaceable pattern with StaticInvoke:
- Does not generate custom code via Tungsten code generation
- Falls back to interpreted mode by invoking the static method at runtime
- The
StaticInvokeprovides some optimization by avoiding object creation
Examples¶
-- Valid credit card number
SELECT luhn_check('4532015112830366');
-- Returns: true
-- Invalid number
SELECT luhn_check('79927398714');
-- Returns: false
-- Using with table data
SELECT customer_id, luhn_check(credit_card_number) as is_valid
FROM customers;
// DataFrame API usage
import org.apache.spark.sql.functions._
df.select(col("credit_card_number"),
expr("luhn_check(credit_card_number)").as("is_valid"))
// Using in filter conditions
df.filter(expr("luhn_check(account_number) = true"))
See Also¶
- String manipulation functions
- Validation expressions
regexp_likefor pattern-based validationlengthfor string length validation