LastDay¶
Overview¶
The LastDay expression calculates the last day of the month for a given date. It takes a date as input and returns a new date representing the final day of that same month and year.
Syntax¶
Arguments¶
| Argument | Type | Description |
|---|---|---|
| startDate | DateType | The input date for which to find the last day of the month |
Return Type¶
DateType - Returns a date representing the last day of the month containing the input date.
Supported Data Types¶
- DateType (input will be implicitly cast to DateType if compatible)
Algorithm¶
-
Extracts the year and month components from the input date
-
Determines the number of days in that specific month, accounting for leap years when necessary
-
Constructs a new date using the same year and month but with the day set to the maximum day of that month
-
Returns the resulting date as an integer representation (days since epoch)
-
Delegates the actual calculation to
DateTimeUtils.getLastDayOfMonth()
Partitioning Behavior¶
-
Preserves partitioning since it operates on individual rows without requiring data movement
-
Does not require shuffle as the transformation is applied locally to each partition
Edge Cases¶
-
Null handling: Returns null if the input date is null (null intolerant behavior)
-
Leap years: Correctly handles February in leap years (returns February 29) vs non-leap years (returns February 28)
-
Month boundaries: Properly handles months with different numbers of days (30, 31, 28, or 29 days)
-
Date range limits: Behavior depends on the underlying
DateTimeUtils.getLastDayOfMonth()implementation for edge dates
Code Generation¶
Supports Tungsten code generation through the doGenCode method. The generated code directly calls DateTimeUtils.getLastDayOfMonth() for optimal performance, avoiding interpretation overhead.
Examples¶
-- Get the last day of January 2009
SELECT last_day('2009-01-12');
-- Result: 2009-01-31
-- Get the last day for multiple dates
SELECT last_day('2020-02-15'), last_day('2021-02-15');
-- Result: 2020-02-29, 2021-02-28 (leap year vs non-leap year)
// DataFrame API usage
import org.apache.spark.sql.functions.last_day
// Calculate last day of month for a date column
df.select(last_day($"order_date").alias("month_end"))
// Use in filtering
df.filter(last_day($"date_col") === lit("2023-12-31"))
See Also¶
date_add()- Add days to a datedate_sub()- Subtract days from a datetrunc()- Truncate date to specified unitadd_months()- Add months to a date