TypeOf¶

Overview¶

The TypeOf expression returns the Catalyst data type of its input expression as a string, using the type's catalog string representation (for example, "int" or "array<int>"). The result is derived from the expression's resolved data type, not from the values in individual rows.

Syntax¶

typeof(expr)

// DataFrame API
import org.apache.spark.sql.functions._
df.select(expr("typeof(column_name)"))

Arguments¶

Argument	Type	Description
expr	Any	The expression whose data type should be returned

Return Type¶

StringType - Returns a UTF8String containing the catalog string representation of the input expression's data type.

Supported Data Types¶

This expression supports all Catalyst data types as input, including:

All primitive types (IntegerType, StringType, BooleanType, etc.)
Complex types (ArrayType, MapType, StructType)
Decimal types with precision and scale
Timestamp and date types
Binary types

Algorithm¶

The TypeOf expression evaluation follows these steps:

Reads the dataType property of the child expression
Converts the DataType to its catalog string representation using the catalogString method
Wraps the result in a UTF8String for return
Because the expression is foldable and context-independent, the optimizer can evaluate it once during query planning rather than once per row
Code generation emits a constant string, so the generated code performs no type inspection
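
The steps above can be sketched with Spark's public type classes. This is a minimal illustration rather than the expression's actual source: typeNameOf is a hypothetical helper introduced here, and the snippet assumes the Spark SQL libraries are on the classpath (for example, when pasted into spark-shell).

```scala
import org.apache.spark.sql.types._
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical helper mirroring the steps above: take a DataType,
// render its catalog string, and wrap the result in a UTF8String.
def typeNameOf(dt: DataType): UTF8String =
  UTF8String.fromString(dt.catalogString)

val intName = typeNameOf(IntegerType).toString
val arrayName = typeNameOf(ArrayType(IntegerType)).toString
val decimalName = typeNameOf(DecimalType(10, 2)).toString

println(intName)     // int
println(arrayName)   // array<int>
println(decimalName) // decimal(10,2)
```

The helper takes only a DataType and never sees a row, which is why the real expression can be treated as a constant.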

Partitioning Behavior¶

This expression has no impact on partitioning behavior:

Preserves existing partitioning as it doesn't depend on data values
Does not require shuffle operations
Can be computed independently on each partition
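
One way to check this behavior is to inspect the query plans for a projection that uses typeof. The sketch below assumes a local SparkSession; the names df and typed and the alias id_type are illustrative. The exact plan text varies by Spark version, so no plan output is shown here.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("typeof-plan-sketch")
  .getOrCreate()

// spark.range produces a single bigint column named "id"
val df = spark.range(5)
val typed = df.select(expr("typeof(id)").as("id_type"))

// Print the parsed, analyzed, optimized, and physical plans.
// The projection is expected to need no exchange (shuffle) step.
typed.explain(true)

// Every row carries the same type name, whatever the row's value is
val typeNames = typed.collect().map(_.getString(0)).toSet
println(typeNames.mkString(",")) // bigint
```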

Edge Cases¶

Null input values: The result is never null (nullable = false), even when the input expression evaluates to null, because only the input's data type is inspected
Complex nested types: Returns the full type specification including nested structure (e.g., "array<array<int>>")
Parameterized types: Includes type parameters like precision/scale for decimals
Empty containers: Returns the container type specification regardless of content
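
The edge cases above can be tried out with a few SQL expressions. This is a sketch under stated assumptions: typeOfSql is a hypothetical helper defined here, and a local SparkSession is assumed.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("typeof-edge-cases")
  .getOrCreate()

// Hypothetical helper: evaluate typeof(<sqlExpr>) and return the type name
def typeOfSql(sqlExpr: String): String =
  spark.sql(s"SELECT typeof($sqlExpr)").head().getString(0)

// A null value still has a declared type, so the result is not null
val nullInt = typeOfSql("CAST(NULL AS INT)")
// Nested element types are spelled out in full
val nested = typeOfSql("array(array(1))")
// Precision and scale are part of the decimal type name
val dec = typeOfSql("CAST(1 AS DECIMAL(10,2))")
// Filtering out every element leaves an empty array with the same type
val emptyArr = typeOfSql("filter(array(1), x -> false)")

println(nullInt)  // int
println(nested)   // array<array<int>>
println(dec)      // decimal(10,2)
println(emptyArr) // array<int>
```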

Code Generation¶

This expression supports Tungsten code generation. The doGenCode method emits Java code that produces a constant UTF8String, so the generated code performs no type inspection per row at runtime.

Examples¶

-- Basic primitive type
SELECT typeof(1);
-- Result: "int"

-- Array type
SELECT typeof(array(1, 2, 3));
-- Result: "array<int>"

-- Struct type with named fields
SELECT typeof(struct(1 as id, 'name' as label));
-- Result: "struct<id:int,label:string>"

// DataFrame API usage (df is an existing DataFrame containing these columns)
import org.apache.spark.sql.functions._

df.select(expr("typeof(age)")).show()
df.select(expr("typeof(array_col)")).show()
df.withColumn("col_type", expr("typeof(some_column)"))