Sunday, April 28, 2013

How to manipulate dates in DataStage 8.5

DataStage has two data types that can be used to store dates - DATE and TIMESTAMP. Transforming data often requires that we manipulate dates in some way. DataStage has a rich set of functions for all types of date manipulation yet I constantly see hand-written code to perform operations that are readily available out of the box.
The following is a discussion of some of the more common date operations and transformations.

Let's begin with some definitions of data types straight out of IBM's documentation.

Data Types

SQL Type Underlying Data Type Size Description
Date date 4 bytes Date with month, day, and year
Timestamp timestamp 9 bytes Single field containing both date and time value

Validation

If you are lucky enough to be working with Date types from a database then you won't have to worry about invalid dates but more often than not you will have to validate a string to ensure that you have a valid date. DataStage provides a function that can validate all data types.
IsValid(%typestring%,%valuestring%,[%format%]) 
Where typestring is one of the following:
"date", "decimal", "dfloat", "sfloat", "int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64", "raw", "string", "time", "timestamp". "ustring"
DataStage also provides a number of convenience functions to make life just that little bit easier so instead of the function shown above you can use
IsValidDate(%date%) 

Date Arithmetic

Adding and Subtracting Days
Converting to Julian date is quite a common way of performing date arithmetic. Once you have a Julian day, you can simply add or subtract any number of days from the original value.
JDATE = JulianDayFromDate(DT_VAR)
TOMORROW = JDATE + 1
YESTERDAY = JDATE - 1
LAST_WEEK = JDATE - 7
Once you have performed your arithmetic, you simply change back to a date.
NEW_DT = DateFromJulianDay(LAST_WEEK)
Another way is to use the DateFromDaysSince function.
TOMORROW = DateFromDaysSince(1,DT_VAR)
YESTERDAY = DateFromDaysSince(-1,DT_VAR)
The great thing about this function is it will accept a string so long as the string is in the format ("%yyyy-%mm-%dd").
Both options have the same drawback - no options to add months or years. Trying to roll your own functions to add months and years (although fairly simple) is a pain at the best of times and it's just another component that has to be deployed. It would be nice to have a built-in function that does the job - and there is. The best and most flexible option is to use the DateOffsetByComponents function. This function provides almost everything you will need for date aritmetic. The function takes four parameters - date, year_offset, month_offset, day_offset. In other words, this function can perform calculations similar to the following examples.
Use DateOffsetByComponents(%basedate%,%yearoffset%,%monthoffset%,%dayoffset%) to:
  • Calculate the date 5 days before or after date
  • Calculate the date 3 months before or after date
  • Calculate the date 2 years before or after date
  • Calculate the date 2 years, 3 months and 5 days before or after the date
Furthermore, this function also accepts strings in the format "%yyyy-%mm-%dd".
Date Year Offset Month Offset Day Offset Result
2004-02-29 0 0 5 2004-03-05
2004-02-29 0 3 0 2004-05-29
2004-02-29 2 0 0 2006-03-01
2004-02-29 2 3 5 2006-06-03

Days Elapsed

Another common requirement is to calculate the number of days between 2 dates. This is easily accomplished with the DaysSinceFromDate function.
DaysSinceFromDate("2001-01-20","2001-01-28") = -8

Last Day of Month

Calculating the last day of the month is a little bit more work but is certainly easier than using if-else statements as I've seen in some jobs. This is how it's done using stage variables. Of course you could just use a single variable with lots of nested functions but it's a lot easier to understand (and maintain) with mulitple variables.

svDate = inputlink.dateVar
svYear = YearFromDate(svDate)
svMonth = MonthFromDate(svDate)
svDay = DaysInMonth(svDate)
svEndOfMonth = DateFromComponents(svYear,svMonth,svDay)

Date Transformations

DataStage provides a number of format strings that can be used to convert dates to strings and vice-versa. In the following examples, we use a DATE variable (date_value) that contains a value of "2000-02-29".
Description Syntax Result
Convert date to string DateToString(date_value,"%dd-%mm-%yyyy") 29-02-2000
Get short month name DateToString(date_value,"%mmm") Feb
Get long month name DateToString(date_value,"%mmmm") February
Get short weekday name DateToString(date_value,"%eee") Tue
Get long weekday name DateToString(date_value,"%eeee") Tuesday
Note that although the documentation mentions the "%eee" and "%eeee" format strings , they don't seem to work in isolation. I've had success with a format such as "%eee, %yyyy-%mm-%dd" but if I use "%eee" in isolation, I get the following error.
Transformer_2,0: Caught exception from runLocally(): APT_ParseError: Parsing pa rameters "%eee" for conversion "string=string_from_date[%yyyy-%mm-%dd](date)": APT_Conversion_String_Date: Invalid Format [%eee] used for string_from_date type conversion.

0 comments:

Post a Comment

Please Post your Comments..!