Sunday, April 28, 2013

DataStage Schema Files

Using runtime column propagation (RCP) is great when your source is a database whose meta-data can be discovered, but what about flat files? How do we create generic jobs that will process flat files?
The answer is schema files. Schema files allow you to move column definitions from the job to an external file. This way a single generic job can be created to move flat files from source to destination.
There are two steps involved in creating a generic job using schema files:
  1. Create a schema file
  2. Create the DataStage job
      Example CSV File

      We have a very simple CSV file to load for this example. The file contains four fields: Name, Street Address, Suburb and Post Code.

      Creating a schema file

      A schema file is simply a text file containing an Orchestrate table definition; for this example it describes the four fields of the employee file.
      Note that the data types used here are internal (Orchestrate) data types, not the SQL data types used in column definitions. Note also that the Format tab of the Sequential File stage is ignored if a schema file is used, so the schema file itself must contain the formatting options.
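
      As a rough sketch, a schema file for the four-field CSV file might look something like the following. The field names, the string lengths and the formatting options (comma delimiter, double-quoted values) are assumptions made for illustration, since the original file is not reproduced here; adjust them to match the actual data.

      ```
      record
      {final_delim=end, delim=',', quote=double}
      (
        Name: string[max=50];
        StreetAddress: string[max=100];
        Suburb: string[max=50];
        PostCode: string[max=10];
      )
      ```

      The options in braces take the place of the Format tab: `final_delim=end` says the last field runs to the end of the record, `delim=','` sets the field delimiter and `quote=double` says values are enclosed in double quotes. The types are internal ones such as `string[max=50]`, rather than an SQL type such as VarChar(50). PostCode is declared as a string here on the assumption that leading zeros should be preserved.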

      Creating the DataStage job

      The DataStage job is again very simple. It uses the schema file to define the formatting and meta-data of the sequential file, and it uses RCP to propagate the column definitions. The name of the schema file is defined on the Properties tab of the Sequential File stage.

      The use of schema files allows a single DataStage job to populate multiple tables from multiple files. Since the schema files are plain text files, maintenance is easy and no special skills are required.
