HDFS bridge reference

Prerequisites and parameter information for the HDFS bridge.

About this bridge

The HDFS bridge imports metadata about directories in the Hadoop Distributed File System (HDFS) that is installed with InfoSphere® BigInsights®.

The bridge stores the directory metadata, including the full name and path of each directory, as a data file folder asset in the metadata repository. You can use this information to create InfoSphere Data Click activities that move data files and their contents to the HDFS directory.
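
As an illustration of the information that is stored, the following Python sketch derives a folder name and full path from an HDFS directory string. The DataFileFolder class and the folder_from_hdfs_directory function are hypothetical names that are used only for this example. They are not part of the bridge or of the metadata repository API.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFileFolder:
    """Hypothetical stand-in for a data file folder asset."""
    name: str  # last component of the directory path
    path: str  # full, normalized HDFS path


def folder_from_hdfs_directory(directory: str) -> DataFileFolder:
    """Split an absolute HDFS directory into its name and full path."""
    # HDFS paths use forward slashes, so posixpath is used on any host OS.
    if not directory.startswith("/"):
        raise ValueError(f"expected an absolute HDFS path, got {directory!r}")
    full_path = posixpath.normpath(directory)
    # The root directory has no last component, so fall back to "/".
    name = posixpath.basename(full_path) or "/"
    return DataFileFolder(name=name, path=full_path)


if __name__ == "__main__":
    # The directory below is an example value only.
    print(folder_from_hdfs_directory("/user/sales/2014/"))
```

The sketch assumes that a trailing slash is not significant, so it normalizes the path before it splits off the name.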

To import metadata, you select or create a data connection to the HDFS that contains the directories. For each import, you specify one or more HDFS directories.

Prerequisites

The HDFS bridge is installed on the engine tier when connectors are installed. The engine tier must be on a Linux or AIX computer.

The HDFS bridge uses a data connection to access the InfoSphere BigInsights REST API and communicate with InfoSphere BigInsights. The REST communication is done over the HTTP or HTTPS protocol. Security and configuration procedures are described in the InfoSphere BigInsights web console security document.
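
The choice between HTTP and HTTPS affects only the scheme of the address that the connection uses. The following Python sketch shows that idea under stated assumptions: the console_base_url function is hypothetical, and the host name and port numbers are example values, not documented defaults. No InfoSphere BigInsights REST endpoint is called or implied.

```python
from urllib.parse import urlunsplit


def console_base_url(host: str, port: int, use_ssl: bool) -> str:
    """Build a base URL whose scheme depends on whether SSL is required."""
    scheme = "https" if use_ssl else "http"
    # Path, query, and fragment are left empty; only scheme and netloc are set.
    return urlunsplit((scheme, f"{host}:{port}", "", "", ""))


if __name__ == "__main__":
    # Example host and ports only; substitute the values for your installation.
    print(console_base_url("bi.example.com", 8080, use_ssl=False))
    print(console_base_url("bi.example.com", 8443, use_ssl=True))
```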

Before you set up the HDFS connection, ensure that you meet the following configuration prerequisite:

If SSL is required for the connection, you must set up the truststore options.
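
The truststore options themselves are set in the product. As a general illustration of what trusting a server certificate involves, the following Python sketch builds a TLS client context that verifies the server certificate and host name. The build_tls_context function and the ca_bundle parameter are hypothetical, and a PEM certificate file is assumed here in place of the truststore that the bridge uses.

```python
import ssl
from typing import Optional


def build_tls_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Return a client context that verifies the server certificate."""
    # The default context requires a valid certificate and checks the host name.
    context = ssl.create_default_context()
    if ca_bundle is not None:
        # Assumption: ca_bundle is a PEM file that holds the certificate
        # (or its issuing CA) of the server that you want to trust.
        context.load_verify_locations(cafile=ca_bundle)
    return context


if __name__ == "__main__":
    context = build_tls_context()
    print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

If the server uses a certificate that is not signed by a well-known certificate authority, the connection fails verification until that certificate or its issuer is added to the trusted set.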

HDFS bridge import parameters

The HDFS bridge has the following import configuration parameter.

HDFS directory

Browse to select one or more HDFS directories.

When you share the import to the metadata repository, the full name and path of each directory are stored in the metadata repository as a data file folder asset.
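
Because the parameter accepts one or more directories, a script that prepares a list of directories before an import might check the list first. The following Python sketch shows one way to do that. The normalize_selection function is hypothetical and is not part of the bridge; the rules that it applies (absolute paths, no duplicates, at least one entry) are assumptions for the example.

```python
import posixpath
from typing import Iterable, List


def normalize_selection(directories: Iterable[str]) -> List[str]:
    """Normalize HDFS directories, drop duplicates, and keep input order."""
    seen = set()
    result = []
    for directory in directories:
        normalized = posixpath.normpath(directory)
        if not normalized.startswith("/"):
            raise ValueError(f"not an absolute HDFS path: {directory!r}")
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    if not result:
        # The import needs one or more directories.
        raise ValueError("select at least one HDFS directory")
    return result


if __name__ == "__main__":
    # Example directories only.
    print(normalize_selection(["/data/in/", "/data/in", "/data/out"]))
```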