Making assets available for use in activities

You must import metadata for all sources and targets that you want to make available for InfoSphere® Data Click activities. Among other things, importing the metadata ensures that connection information is available when activities are processed.

Before you begin

Before you can successfully import metadata into the metadata repository by using InfoSphere Metadata Asset Manager, you must set up connections to all sources and targets that you want to use in InfoSphere Data Click.

For information about configuring each connector, see the documentation for that connector. You must configure the connectors before you use InfoSphere Metadata Asset Manager or InfoSphere Data Click. The following connectors are supported in InfoSphere Data Click:

Note: The Amazon S3 connector comes pre-configured for use with the InfoSphere Information Server suite. You do not need to configure the Amazon S3 connector.

Before you import Amazon S3 metadata into the metadata repository by using InfoSphere Metadata Asset Manager, you must describe the format of each file that you want to move from Amazon S3 by using InfoSphere Data Click. Either create an .osh schema file or specify the format of the file in the first row of each file. Creating an .osh schema file is recommended because it provides more comprehensive details about the structure of the data in your files. If you used the Amazon S3 file as a target in a previous InfoSphere Data Click activity, InfoSphere Data Click automatically created an .osh schema file for the file, so you do not need to create one.

To move data from Amazon S3, you must also verify that the character set of the Amazon S3 files matches the encoding properties that are set for the InfoSphere DataStage® project that you are using in your activity. For example, if the character set of the files in Amazon S3 is UTF-8, then the encoding for InfoSphere DataStage must also be set to UTF-8. To set the encoding, update the APT_IMPEXP_CHARSET property in the InfoSphere DataStage project that runs the activity.
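As a rough pre-check of the character set requirement, the following sketch tests whether a local copy of a file decodes cleanly in the encoding that the project expects. The function name `decodes_as` and the chunk size are illustrative assumptions; nothing here is part of InfoSphere Data Click or InfoSphere DataStage.

```python
import codecs


def decodes_as(path, encoding="utf-8", chunk_size=1 << 20):
    """Return True if the whole file at `path` decodes with `encoding`.

    The file is read in chunks through an incremental decoder so that
    multi-byte characters split across chunk boundaries are handled.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    try:
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                decoder.decode(chunk)
            # Flush the decoder; raises if the file ends mid-character.
            decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True
```

A successful decode is a necessary check, not a proof: single-byte encodings such as Latin-1 accept any byte sequence, so this check is most useful when the expected encoding is UTF-8.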

If you plan to use InfoSphere Data Click to move data into a relational database that connects to InfoSphere Data Click by using the JDBC connector, you must verify that the relational database has the same NLS encoding as the local operating system.
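One way to compare the two encodings is to resolve both names to a canonical form first, because the same encoding is often spelled differently (for example, UTF8 and utf-8). The sketch below assumes that you already know the database encoding and can express it as a name that Python recognizes. Vendor-specific names need to be mapped by hand first. Both function names are hypothetical.

```python
import codecs
import locale


def same_encoding(first, second):
    """Compare two encoding names after resolving aliases such as UTF8 vs utf-8."""
    # codecs.lookup raises LookupError for names that Python does not know.
    return codecs.lookup(first).name == codecs.lookup(second).name


def matches_local_encoding(database_encoding):
    """Compare a database encoding name with the operating system's preferred encoding."""
    local_encoding = locale.getpreferredencoding(False)
    return same_encoding(database_encoding, local_encoding)
```

Run `matches_local_encoding` on the computer whose operating system encoding you want to compare against, because the result depends on the local locale settings.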

If you plan to use InfoSphere Data Click to move data into a Hadoop Distributed File System in InfoSphere BigInsights® on a Linux computer, refer to the syncbi command for information about configuring your computer. See Configuring access to Hive tables (AIX only) and Configuring access to HDFS (AIX only) for information about setting up connections to InfoSphere BigInsights on Windows and AIX® computers.

To import metadata into the metadata repository through the InfoSphere Metadata Asset Manager, you must have the Common Metadata Importer role.

About this task

The following task describes how to import the metadata by using InfoSphere Metadata Asset Manager.

You must import metadata for all sources and targets that you want to use in InfoSphere Data Click activities, as well as every asset that you want to use in the Information Governance Catalog.

For every distributed file system that you import by using the HDFS bridge, you must import at least one associated Hive table. The distributed file system and the Hive table become associated when you create a separate data connection to each one and specify the same host name in the parameters of both connections. For example, in the data connection that connects to the distributed file system by using the HDFS connector, you might specify 9.182.77.14 in the Host field. When you import the associated Hive table, you must then specify 9.182.77.14 as part of the URL field in the data connection parameters, for example, jdbc:bigsql://9.182.77.14:7052/default.
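The host name rule can be checked before you create the two data connections. The following sketch extracts the host from a JDBC URL of the form shown in the example and compares it with the value intended for the Host field. The helper names `jdbc_host` and `connections_associated` are hypothetical.

```python
from urllib.parse import urlsplit


def jdbc_host(jdbc_url):
    """Extract the host from a URL of the form jdbc:<subprotocol>://host:port/database."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError(f"not a JDBC URL: {jdbc_url!r}")
    # Drop the "jdbc:" prefix so the rest parses as an ordinary URL.
    host = urlsplit(jdbc_url[len(prefix):]).hostname
    if host is None:
        raise ValueError(f"no host found in: {jdbc_url!r}")
    return host


def connections_associated(hdfs_host, hive_jdbc_url):
    """Return True if the HDFS Host field and the Hive URL name the same host."""
    return hdfs_host.strip().lower() == jdbc_host(hive_jdbc_url)


print(connections_associated("9.182.77.14", "jdbc:bigsql://9.182.77.14:7052/default"))
```

The comparison is purely textual, which matches the requirement to specify the same host name in both connections. An IP address in one connection and a DNS name in the other are treated as different hosts.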

To ensure that you can view lineage for your jobs in Information Governance Catalog, enable operational metadata.

Procedure

1. Log in to InfoSphere Metadata Asset Manager by clicking the icon in the InfoSphere Information Server Launchpad or by selecting Import Additional Assets in the InfoSphere Data Click activity creation wizard.
2. On the Import tab, click New Import Area.
   a. Specify a unique name and a description for the import area.
   b. Select the metadata interchange server that you want to run the import from.
   c. Select the bridge or connector to import with. Choose the bridge or connector according to the source metadata that it imports. Help for the selected bridge or connector is displayed in the Import Help pane.
   d. Click Next.
3. Select or create a data connection. You can edit the properties of a selected data connection. If you create a new data connection, select Save password if you want InfoSphere Data Click users to be able to access the data without entering the user name and password for the database when they create activities. If you do not save the password, users are prompted for credentials when they use data from the database in activities. If you are using an existing data connection, you might want to click Edit and update the Save password parameter.
4. Specify import parameters for the selected bridge or connector. Help for each parameter is displayed when you hover over the value field.
   a. Optional: After you enter connection information for an import from a server, click Test Connection.
   b. For imports from Amazon S3 buckets, select Import file structure if you want to use InfoSphere Data Click to move data from Amazon S3 files. This option imports the .osh schema files that define the structure of the data in the Amazon S3 files, or it imports the metadata about the first row of each file, which describes the structure of that file.
      Note: You do not need to select Import file structure if you use Amazon S3 only as a target in InfoSphere Data Click, that is, if you only move data to Amazon S3.

      Figure 1. A view of the Import file structure option on the Import Parameters panel in InfoSphere Metadata Asset Manager
   c. For imports from databases, repositories, and Amazon S3 buckets, browse to select the specific assets that you want to import.
   d. Click Next.
5. On the Identity Parameters screen, enter the required information. Click Next.
6. Type a description for the import event and select Express Import.
7. Click Import. The import area is created. The import runs and status messages are displayed.
8. Leave the import window open so that long imports do not time out.
9. Repeat the preceding steps until you have imported all metadata assets that you want to make available to InfoSphere Data Click users.
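When you plan several imports, it can help to review the choices from the procedure before you open the wizard. The following sketch is a planning aid only: `ImportPlan` and `review` are hypothetical names, the fields mirror options mentioned in the steps above, and the code does not call any InfoSphere Metadata Asset Manager interface.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ImportPlan:
    """Hypothetical planning record for one import area (not a product API)."""
    import_area_name: str
    connector: str
    save_password: bool = False
    s3_used_as_source: bool = False
    import_file_structure: bool = False


def review(plan: ImportPlan, existing_area_names: set[str]) -> list[str]:
    """Return reminders for choices that conflict with the procedure."""
    notes = []
    if plan.import_area_name in existing_area_names:
        notes.append("Import area name must be unique.")
    if not plan.save_password:
        notes.append("Password not saved: users are prompted for credentials in activities.")
    needs_structure = plan.connector == "Amazon S3" and plan.s3_used_as_source
    if needs_structure and not plan.import_file_structure:
        notes.append("Select Import file structure to move data from Amazon S3 files.")
    return notes
```

An empty list means that none of these three checks raised a concern. It does not replace the validation that the import wizard itself performs.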

Results

After running an express import, take one of the actions that are listed in the following table.

Table 1. Choices after an express import
In this case	Take this action
If the analysis shows problems that you must fix	The Staged Imports tab is displayed. Review the analysis results. If necessary, reimport the staged event.
If your administration settings require a preview	The View Share Preview screen is displayed. Preview the result of sharing the import.
If your administration settings do not require a preview	The assets are shared to the metadata repository. The Shared Imports tab is displayed. You can browse the assets on the Repository Management tab and work with them in other suite tools.
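The choices in Table 1 can be summarized as a small decision function. This is a sketch under one assumption that the table does not state: analysis problems are checked before the preview setting. The function name `after_express_import` is hypothetical.

```python
def after_express_import(analysis_has_problems, preview_required):
    """Return (screen displayed, action to take) for the cases in Table 1.

    Assumption: problems found by the analysis take precedence over
    the preview setting; the table does not state an order.
    """
    if analysis_has_problems:
        return ("Staged Imports tab",
                "Review the analysis results; reimport the staged event if necessary.")
    if preview_required:
        return ("View Share Preview screen",
                "Preview the result of sharing the import.")
    return ("Shared Imports tab",
            "Browse the shared assets on the Repository Management tab.")
```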