DataStage has two data types that can be used to store dates - DATE
and TIMESTAMP. Transforming data often requires that we manipulate dates
in some way. DataStage has a rich set of functions for all types of
date manipulation yet I constantly see hand-written code to perform
operations that are readily available out of the box.
The following is a discussion of some of the more common date operations and transformations.
Let's begin with some definitions of data types straight out of IBM's documentation.
Converting to Julian date is quite a common way of performing date arithmetic. Once you have a Julian day, you can simply add or subtract any number of days from the original value.
Both options have the same drawback - no options to add months or years. Trying to roll your own functions to add months and years (although fairly simple) is a pain at the best of times and it's just another component that has to be deployed. It would be nice to have a built-in function that does the job - and there is. The best and most flexible option is to use the DateOffsetByComponents function. This function provides almost everything you will need for date aritmetic. The function takes four parameters - date, year_offset, month_offset, day_offset. In other words, this function can perform calculations similar to the following examples.
Use DateOffsetByComponents(%basedate%,%yearoffset%,%monthoffset%,%dayoffset%) to:
Note that although the documentation mentions the "%eee" and "%eeee" format strings
, they don't seem to work in isolation. I've had success with a format
such as "%eee, %yyyy-%mm-%dd" but if I use "%eee" in isolation, I get
the following error.
The following is a discussion of some of the more common date operations and transformations.
Let's begin with some definitions of data types straight out of IBM's documentation.
Data Types
SQL Type | Underlying Data Type | Size | Description |
---|---|---|---|
Date | date | 4 bytes | Date with month, day, and year |
Timestamp | timestamp | 9 bytes | Single field containing both date and time value |
Validation
If you are lucky enough to be working with Date types from a database then you won't have to worry about invalid dates but more often than not you will have to validate a string to ensure that you have a valid date. DataStage provides a function that can validate all data types.IsValid(%typestring%,%valuestring%,[%format%])
Where typestring is one of the following:
"date", "decimal", "dfloat", "sfloat", "int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64", "raw", "string", "time", "timestamp". "ustring"
DataStage also provides a number of convenience functions to make life
just that little bit easier so instead of the function shown above you
can use IsValidDate(%date%)
Date Arithmetic
Adding and Subtracting DaysConverting to Julian date is quite a common way of performing date arithmetic. Once you have a Julian day, you can simply add or subtract any number of days from the original value.
JDATE = JulianDayFromDate(DT_VAR)
TOMORROW = JDATE + 1
YESTERDAY = JDATE - 1
LAST_WEEK = JDATE - 7
Once you have performed your arithmetic, you simply change back to a date.NEW_DT = DateFromJulianDay(LAST_WEEK)
Another way is to use the DateFromDaysSince function.
TOMORROW = DateFromDaysSince(1,DT_VAR)
YESTERDAY = DateFromDaysSince(-1,DT_VAR)
The great thing about this function is it will accept a string so long as the string is in the format ("%yyyy-%mm-%dd").Both options have the same drawback - no options to add months or years. Trying to roll your own functions to add months and years (although fairly simple) is a pain at the best of times and it's just another component that has to be deployed. It would be nice to have a built-in function that does the job - and there is. The best and most flexible option is to use the DateOffsetByComponents function. This function provides almost everything you will need for date aritmetic. The function takes four parameters - date, year_offset, month_offset, day_offset. In other words, this function can perform calculations similar to the following examples.
Use DateOffsetByComponents(%basedate%,%yearoffset%,%monthoffset%,%dayoffset%) to:
- Calculate the date 5 days before or after date
- Calculate the date 3 months before or after date
- Calculate the date 2 years before or after date
- Calculate the date 2 years, 3 months and 5 days before or after the date
Date | Year Offset | Month Offset | Day Offset | Result |
---|---|---|---|---|
2004-02-29 | 0 | 0 | 5 | 2004-03-05 |
2004-02-29 | 0 | 3 | 0 | 2004-05-29 |
2004-02-29 | 2 | 0 | 0 | 2006-03-01 |
2004-02-29 | 2 | 3 | 5 | 2006-06-03 |
Days Elapsed
Another common requirement is to calculate the number of days between 2 dates. This is easily accomplished with the DaysSinceFromDate function.DaysSinceFromDate("2001-01-20","2001-01-28") = -8
Last Day of Month
Calculating the last day of the month is a little bit more work but is certainly easier than using if-else statements as I've seen in some jobs. This is how it's done using stage variables. Of course you could just use a single variable with lots of nested functions but it's a lot easier to understand (and maintain) with mulitple variables.
svDate = inputlink.dateVar
svYear = YearFromDate(svDate)
svMonth = MonthFromDate(svDate)
svDay = DaysInMonth(svDate)
svEndOfMonth = DateFromComponents(svYear,svMonth,svDay)
Date Transformations
DataStage provides a number of format strings that can be used to convert dates to strings and vice-versa. In the following examples, we use a DATE variable (date_value) that contains a value of "2000-02-29".Description | Syntax | Result |
---|---|---|
Convert date to string | DateToString(date_value,"%dd-%mm-%yyyy") | 29-02-2000 |
Get short month name | DateToString(date_value,"%mmm") | Feb |
Get long month name | DateToString(date_value,"%mmmm") | February |
Get short weekday name | DateToString(date_value,"%eee") | Tue |
Get long weekday name | DateToString(date_value,"%eeee") | Tuesday |
Transformer_2,0: Caught exception from runLocally():
APT_ParseError: Parsing pa rameters "%eee" for conversion
"string=string_from_date[%yyyy-%mm-%dd](date)":
APT_Conversion_String_Date: Invalid Format [%eee] used for
string_from_date type conversion.
0 comments:
Post a Comment
Please Post your Comments..!