BitmapAndAgg¶
Overview¶
BitmapAndAgg is an imperative aggregate function that computes the bitwise AND of all input bitmaps in a group. Each bitmap is a fixed-size binary value, and the aggregation buffer is ANDed byte by byte with every input. A bit is therefore set in the result only if it is set in every aggregated bitmap, which makes the result the intersection of the inputs.
Syntax¶
bitmap_and_agg(child)
Arguments¶
| Argument | Type | Description |
|---|---|---|
| child | Expression | The bitmap expression to aggregate, must evaluate to binary data |
| mutableAggBufferOffset | Int | Internal offset for mutable aggregation buffer (default: 0) |
| inputAggBufferOffset | Int | Internal offset for input aggregation buffer (default: 0) |
Return Type¶
BinaryType - Returns a fixed-size binary array representing the aggregated bitmap.
Supported Data Types¶
- Input: BinaryType only. The expression strictly validates that the child expression returns binary data.
- Output: BinaryType, a fixed-size binary array.
Algorithm¶
- Initialize the aggregation buffer with all bits set to 1 (every byte is -1, i.e. 0xFF)
- For each input bitmap, perform bitwise AND operation between current buffer and input
- Merge partial aggregates by performing bitwise AND between buffer contents
- Null input values are skipped during aggregation
- Final result is the accumulated bitmap buffer
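The steps above can be sketched in plain Scala, without any Spark dependency. This is a minimal illustration, not the actual Spark implementation: the object name `BitmapAndSketch`, its methods, and the 4-byte `NumBytes` are hypothetical and chosen only to keep the example small (the real size comes from BitmapExpressionUtils.NUM_BYTES).

```scala
object BitmapAndSketch {
  // Hypothetical small size for illustration only.
  val NumBytes: Int = 4

  // Step 1: start with every bit set (each byte is 0xFF, i.e. -1).
  def createBuffer(): Array[Byte] = Array.fill(NumBytes)(0xFF.toByte)

  // Steps 2 and 3: AND another bitmap into the buffer in place.
  // The same operation serves per-row updates and merging partial buffers.
  def andInto(buffer: Array[Byte], other: Array[Byte]): Unit = {
    require(other.length == buffer.length, "bitmaps must have the same size")
    var i = 0
    while (i < buffer.length) {
      buffer(i) = (buffer(i) & other(i)).toByte
      i += 1
    }
  }

  // Steps 4 and 5: skip null inputs, then return the accumulated buffer.
  def aggregate(inputs: Seq[Array[Byte]]): Array[Byte] = {
    val buffer = createBuffer()
    inputs.filter(_ != null).foreach(andInto(buffer, _))
    buffer
  }
}
```

Because the buffer starts as all ones, the first non-null input is copied into it unchanged, and every later input can only clear bits.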
Partitioning Behavior¶
As an aggregate function, BitmapAndAgg has the following partitioning characteristics:
- Does not preserve input partitioning
- Requires shuffle for global aggregation across partitions
- Supports partial aggregation with merge capability for distributed processing
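Partial aggregation is valid here because bitwise AND is associative and commutative, and the all-ones bitmap is its identity. The following sketch, in plain Scala with hypothetical names (`PartialBitmapAnd`, `partial`, `merge`), suggests why merging per-partition buffers should give the same result as aggregating all rows at once. It models partitions as nested sequences and does not use Spark itself.

```scala
object PartialBitmapAnd {
  // AND two equally sized bitmaps into a new array.
  def and(a: Array[Byte], b: Array[Byte]): Array[Byte] =
    a.zip(b).map { case (x, y) => (x & y).toByte }

  // Identity element for AND: every bit set.
  def allOnes(size: Int): Array[Byte] = Array.fill(size)(0xFF.toByte)

  // Aggregate the rows of one partition locally.
  def partial(rows: Seq[Array[Byte]], size: Int): Array[Byte] =
    rows.foldLeft(allOnes(size))(and)

  // Merge the per-partition buffers into the global result.
  def merge(partials: Seq[Array[Byte]], size: Int): Array[Byte] =
    partials.foldLeft(allOnes(size))(and)
}
```

An empty partition contributes the all-ones identity, so it leaves the merged result unchanged.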
Edge Cases¶
- Null handling: Null input bitmaps are ignored and do not affect the aggregation result
- Empty input: Returns a bitmap with all bits set to 1 (the untouched initial buffer, in which every byte is -1)
- Non-nullable result: The function always returns a non-null bitmap result
- Fixed size: All bitmaps must conform to the BitmapExpressionUtils.NUM_BYTES size requirement
- Type validation: Throws a DataTypeMismatch error if the input is not BinaryType
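The null and empty-input cases can be illustrated with a small plain-Scala model in which `None` stands in for a SQL null. The names (`BitmapAndEdgeCases`, `NumBytes` set to 4) are hypothetical, and the `require` size check is an assumption made for this sketch; how Spark itself enforces the size is not described here.

```scala
object BitmapAndEdgeCases {
  // Hypothetical small size; stands in for BitmapExpressionUtils.NUM_BYTES.
  val NumBytes: Int = 4

  def aggregate(inputs: Seq[Option[Array[Byte]]]): Array[Byte] = {
    // The result is never null: it starts as the all-ones buffer.
    val buffer = Array.fill(NumBytes)(0xFF.toByte)
    // flatten drops the None entries, mirroring how nulls are ignored.
    inputs.flatten.foreach { bitmap =>
      require(bitmap.length == NumBytes,
        s"expected $NumBytes bytes, got ${bitmap.length}")
      for (i <- buffer.indices) buffer(i) = (buffer(i) & bitmap(i)).toByte
    }
    buffer
  }
}
```

One consequence worth noting: an empty group, a group containing only nulls, and a group whose bitmaps are all ones produce the same all-ones result, so callers that need to tell these apart would have to track the row count separately.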
Code Generation¶
This expression does not support code generation and runs in interpreted mode only, because it extends ImperativeAggregate, which uses the imperative aggregation framework rather than Tungsten code generation.
Examples¶
-- Aggregate bitmap intersection across user groups
SELECT bitmap_and_agg(user_permissions_bitmap)
FROM user_groups
WHERE group_type = 'admin';
// DataFrame API usage; df is assumed to have a BinaryType column named permission_bitmap
import org.apache.spark.sql.functions._
df.select(expr("bitmap_and_agg(permission_bitmap)"))
  .collect()
See Also¶
- BitmapOrAgg - Bitwise OR aggregation for bitmap union operations
- BitmapXorAgg - Bitwise XOR aggregation for bitmap difference operations
- Other bitmap manipulation expressions in the bitmap expression family