ETL Run Management ~ Midhun`s Home

Sunday, April 28, 2013

ETL Run Management

This is a very simple run management system. It records only the bare minimum of information, yet that is enough to report elapsed times, error frequency and trends in processing times. It is also quite easy to implement.

Requirements

Every run must be uniquely identifiable
The following run attributes must be recorded
- start time
- end time
- data start date
- data end date
- run status
The run identifier must be system generated
A run that follows a failed run must be related back to the failed run
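
As an illustration only, the attributes listed above could be held in a record like the one below. This is a minimal sketch, not the actual table behind this system: the class name RunRecord, the field names and the status values (RUNNING, SUCCESS, FAILED) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from decimal import Decimal
from typing import Optional


@dataclass
class RunRecord:
    """One row of a hypothetical run-control table (names are assumptions)."""
    run_id: Decimal                      # system generated, unique per run
    start_time: datetime                 # when the run began
    data_start_date: date                # first date of data covered by the run
    data_end_date: date                  # last date of data covered by the run
    status: str = "RUNNING"              # assumed values: RUNNING, SUCCESS, FAILED
    end_time: Optional[datetime] = None  # unset until the run finishes

    def elapsed(self) -> Optional[timedelta]:
        """Elapsed time of the run, or None while it is still in progress."""
        if self.end_time is None:
            return None
        return self.end_time - self.start_time


# Record the start of a run, then close it off when it finishes.
run = RunRecord(
    run_id=Decimal("12"),
    start_time=datetime(2013, 4, 28, 1, 0, 0),
    data_start_date=date(2013, 4, 27),
    data_end_date=date(2013, 4, 27),
)
run.end_time = datetime(2013, 4, 28, 1, 45, 30)
run.status = "SUCCESS"
print(run.elapsed())  # 0:45:30
```

With start and end times stored per run, elapsed times and their trend over successive runs can be derived without recording anything further.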

Process Flow

Implementation Details

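The run id is defined as a decimal value. After every successful run, the next run id is set to the next highest integer. If a run is unsuccessful, the next run id is set to the last run id + 0.01. Every rerun therefore shares its integer part with the run that originally failed, which is how a failed run and its reruns are related to each other. For example, if run 12 fails, the next run is 12.01; if that fails, the next is 12.02; once a rerun succeeds, the following run is 13. The increment chosen (0.01) leaves room for 99 reruns (.01 to .99) under one integer run id.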
This process is implemented using a DataStage sequence.
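
The run id rule can be sketched outside DataStage as a small function. This is only an illustration of the rule described above, not the DataStage sequence itself: the function name next_run_id and the constant RETRY_STEP are made up for the example, and raising an error once the .99 rerun has failed is an assumption, since the text does not say what happens at that point.

```python
from decimal import Decimal

RETRY_STEP = Decimal("0.01")  # increment applied after an unsuccessful run


def next_run_id(last_run_id: Decimal, last_run_succeeded: bool) -> Decimal:
    """Return the id for the next run, given the outcome of the last one."""
    if last_run_succeeded:
        # Next highest integer: drop any rerun fraction, then add one.
        return Decimal(int(last_run_id)) + 1

    candidate = last_run_id + RETRY_STEP
    if int(candidate) != int(last_run_id):
        # Assumption: refuse to roll over into the next integer, because
        # that id would no longer relate back to the original failed run.
        raise ValueError(f"no rerun ids left under run {int(last_run_id)}")
    return candidate


# Run 12 fails twice, the second rerun succeeds, and the next run is 13.
run_id = Decimal("12")
run_id = next_run_id(run_id, last_run_succeeded=False)  # 12.01
run_id = next_run_id(run_id, last_run_succeeded=False)  # 12.02
run_id = next_run_id(run_id, last_run_succeeded=True)   # 13
print(run_id)
```

The sketch uses Decimal rather than a float because 0.01 has no exact binary floating-point representation, so repeated additions could drift; a fixed-scale decimal column in the database avoids the same problem.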
