Loading data from DataStage

You can use an IBM® InfoSphere® DataStage® server to load data to your database.

You can view the following tutorial videos about how to load data by using DataStage:

Follow these steps to load data from a DataStage server:

Define database connections in DataStage
Define a job to load data
Compile and run a job to load data

Prerequisites

Verify that the user ID that is running the bulk load operation is granted one of the following privileges:

BIND: To be able to bind DB2® client packages.
EXECUTE: If the DataStage administrator has prebound the Db2 client packages, grant the EXECUTE privilege on those packages.

Define database connections in DataStage

From the DataStage and Quality Stage Designer (DataStage Designer), you need to define a connection to the Db2 database, BLUDB. You can find the connection information that you need in the web console. See also Connecting DataStage to a Db2 database.

Define a job to load data

Use the DataStage Designer to define and run a parallel job that loads data from your source database to the BLUDB database.

This task includes the following prerequisites:

The target BLUDB database must include target tables that do not contain any rows. You can create these tables in the Tables page of the web console.
The source database must be cataloged on the DataStage server, and source table definitions must be imported to the DataStage server.
The DataStage Designer must be connected to the DataStage server and must contain a connection object for the source database.

To define a DataStage job to load data to a BLUDB database:

From the DataStage Designer, click File > New > Parallel Job to open the workspace for creating the job.
Add source and target Db2 connector objects to the Job window and create a link between them:
1. From the Database palette, drag two Db2 connector objects to the Job window (one object for the source database connection, and one object for the target database connection).
2. From the General palette, select Link.
3. Create a link by dragging the cursor from one Db2 connector object to the other connector object.
Define the source data for the load process:
1. Double-click the Db2 Connector object that represents the connection to the source database to open the Db2 Connector window.
2. In the Properties tab, click Load.
3. Select the source database connection and click Open.
4. In the Instance field, ensure that the value is the name of the DataStage instance.
5. In the Database field, enter the name of the source database.
6. In the Usage area, select Yes for the Generate SQL option.
7. In the Table name field, enter the name of the table that contains the data to load.
8. In the Columns tab, click Load to open the Table Definitions window.
9. Open the definition for the source table that contains the data to load. Click OK to display the columns in the Columns tab.
10. Select the columns to load from the source table.
11. In the Db2 Connector window, click OK.
Define the target data in the BLUDB database for the load process:
1. Double-click the Db2 Connector object that represents the connection to the target BLUDB database to open the Db2 Connector window.
2. In the Properties tab, click Load.
3. Select the target database connection and click Open.
4. Ensure that the Instance value is the name of the DataStage instance.
5. In the Database field, enter the name of the target BLUDB database.
6. In the Usage area, select Yes for the Generate SQL option.
7. In the Write Mode field, select Bulk load.
8. In the Table name field, enter the name of the target table in the BLUDB database.
9. In the Db2 Connector window, click OK.
Click File > Save to save the job.
Optional: The following steps are required only when data is being loaded to the target BLUDB through an SSL connection.
Click File > Close to close the job.
Set DB2CLIINIPATH as a run option:
1. In the Repository pane to the left, navigate to the job, right-click, and select Properties.
2. On the Parameters tab, click Add Environment Variable.
3. In the Choose Environment Variable box, click New, and enter DB2CLIINIPATH in the Name field and DB2CLIINIPATH in the Prompt field. Click OK.
4. Ensure that the newly added variable has the following properties:
```
Parameter name :  $DB2CLIINIPATH
Prompt : DB2CLIINIPATH
Type : String
Default Value : /home/db2inst2/sqllib/cfg/db2cli.ini
```

For more information, see IBM Db2 connector.

Compile and run a job to load data

After you create a job to load data, you can compile the job and then run the load process.

With the job open in the DataStage Designer workspace, click File > Run to begin compiling the load process that is defined in the job.
When prompted to compile the job, choose Yes. If you receive errors about the job, use the job workspace to correct the errors, and then click File > Run to restart the compile process. When the job has compiled successfully, the Run Options window opens.
Click Run to start the load process.
After the load process completes, the Log View area of the job workspace will display the number of rows that are inserted into the target BLUDB database.
Verify that the load process was successful. For example, compare the number of rows in the source and target tables. To obtain the number rows in the source table, run the following command on the source database:
select count(*) from SCHEMA.TABLENAME

If the load process was successful, the number of rows in the target BLUDB database table should match the number of rows in the source table.

For more information, see the DataStage documentation: IBM Db2 databases.