Skip to content

ArrayExists

Overview

The ArrayExists expression checks whether any element in an array satisfies a given predicate function. It applies a lambda function to each element of the array and returns true if at least one element makes the predicate evaluate to true.

Syntax

EXISTS(array, lambda_function)

Arguments

Argument Type Description
argument ArrayType The input array to evaluate
function LambdaFunction The predicate function to apply to each array element
followThreeValuedLogic Boolean Controls null handling behavior (internal parameter)

Return Type

BooleanType - Returns true if any element satisfies the predicate, false if none do, or null in certain null-handling scenarios.

Supported Data Types

  • Input: Any ArrayType with elements of any data type
  • Lambda function must return a boolean result
  • Supports arrays with nullable elements

Algorithm

  • Iterates through each element in the input array sequentially
  • Applies the lambda function to each element by binding it to the lambda variable
  • Short-circuits and returns true as soon as any element satisfies the predicate
  • Tracks null results from the lambda function for three-valued logic handling
  • Returns appropriate result based on findings and null-handling configuration

Partitioning Behavior

This expression preserves partitioning as it operates on individual array values within each partition:

  • Does not require data shuffle
  • Maintains existing data partitioning
  • Can be evaluated independently on each partition

Edge Cases

  • Null array input: Returns null if the input array itself is null
  • Empty array: Returns false for empty arrays
  • Null lambda results: When followThreeValuedLogic is true and lambda returns null for some elements but no element returns true, the overall result is null
  • Legacy mode: When followThreeValuedLogic is false, null lambda results are ignored and only affect final result if no true value is found
  • Nullable elements: Properly handles null elements within the array by passing them to the lambda function

Code Generation

This expression uses CodegenFallback, meaning it falls back to interpreted evaluation mode rather than generating optimized Java code via Tungsten code generation.

Examples

-- Check if any element is null
SELECT EXISTS(array(1, 2, 3), x -> x IS NULL);
-- Returns: false

-- Check if any element is greater than 2
SELECT EXISTS(array(1, 2, 3), x -> x > 2);
-- Returns: true

-- Check with null elements
SELECT EXISTS(array(1, null, 3), x -> x IS NULL);
-- Returns: true
// DataFrame API usage
import org.apache.spark.sql.functions._

df.select(exists(col("array_column"), x => x > lit(10)))

See Also

  • ArrayForAll - Checks if all elements satisfy a predicate
  • ArrayFilter - Filters array elements based on a predicate
  • ArrayTransform - Transforms array elements using a lambda function