Skip to content

TryUrlDecode

Overview

TryUrlDecode is a URL decoding expression that safely decodes URL-encoded strings without throwing exceptions. It is implemented as a RuntimeReplaceable expression that wraps the UrlDecode function with error handling to prevent failures on malformed input.

Syntax

try_url_decode(str)
// DataFrame API
col("url_column").tryUrlDecode()

Arguments

Argument Type Description
expr String The URL-encoded string expression to decode

Return Type

String - Returns the URL-decoded string, or null if decoding fails.

Supported Data Types

  • String types (StringType)
  • Any expression that can be cast to string

Algorithm

  • Accepts a single string expression containing URL-encoded data
  • Internally wraps the UrlDecode expression with the strict parameter set to false
  • Uses RuntimeReplaceable pattern to delegate actual decoding to UrlDecode implementation
  • Returns null instead of throwing exceptions when encountering malformed URL encoding
  • Preserves the original input expression through the parameters method for analysis

Partitioning Behavior

This expression preserves partitioning behavior:

  • Does not require shuffle operations
  • Maintains existing data partitioning scheme
  • Can be pushed down to individual partitions for parallel execution

Edge Cases

  • Null input: Returns null without error
  • Empty string: Returns empty string
  • Malformed URL encoding: Returns null instead of throwing exception (key difference from strict url_decode)
  • Invalid percent-encoding sequences: Handled gracefully by returning null
  • Non-ASCII characters: Properly decoded according to URL encoding standards

Code Generation

This expression supports Catalyst code generation through its RuntimeReplaceable implementation. The actual code generation is delegated to the underlying UrlDecode expression, enabling efficient Tungsten execution.

Examples

-- Basic URL decoding
SELECT try_url_decode('Hello%20World') as decoded;
-- Result: "Hello World"

-- Handling malformed input safely
SELECT try_url_decode('invalid%GG') as decoded;
-- Result: null (no exception thrown)

-- Decoding special characters
SELECT try_url_decode('user%40example.com') as decoded;
-- Result: "user@example.com"
// DataFrame API usage
import org.apache.spark.sql.functions._

df.select(col("encoded_url").tryUrlDecode().as("decoded_url"))

// Using with other transformations
df.withColumn("clean_url", 
  coalesce(col("url").tryUrlDecode(), lit("INVALID_URL")))

See Also

  • UrlDecode - Strict version that throws exceptions on malformed input
  • UrlEncode - Companion expression for URL encoding
  • Base64 expressions - Related encoding/decoding functionality