IBM Support

Connection refused using HDFS operators to connect to Hadoop.

Question & Answer


Question

Why do InfoSphere Streams HDFS operators fail to connect to Hadoop?

Cause

There are three scenarios in which Streams acts as an external client and the explicit namenode URL is not visible to it:

  • A private network
  • A public network with namenode high availability
  • GPFS

Answer

When the explicit namenode URL is not visible, you must use the webhdfs scheme with a credentials file, as explained in the hdfsUri and credFile sections of the following operator topics:

HDFS2DirectoryScan operator

HDFS2FileSource operator

HDFS2FileSink operator.
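Before configuring the operators, it can help to confirm that the webhdfs endpoint is actually reachable from the Streams host. A minimal check using curl against the standard Hadoop WebHDFS REST API is sketched below; the host, port, and directory path are placeholders, not values from this document:

```shell
# Hypothetical namenode host and default WebHDFS port (adjust for your cluster).
NN_HOST=namenode.example.com
NN_PORT=50070

# Build the WebHDFS URL for a directory listing (LISTSTATUS operation).
URL="http://${NN_HOST}:${NN_PORT}/webhdfs/v1/user/biadmin?op=LISTSTATUS"

# An HTTP response (even 401/403) proves the endpoint is reachable;
# "connection refused" here means Streams will not connect either.
curl -i "$URL"
```

If this request is refused, the problem is network visibility rather than the operator configuration, and falls under one of the three scenarios listed above.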

Example:

    () as HDFS2FileSink_1 = HDFS2FileSink(Tuple_out0 as inPort0Alias)
    {
        param
            file     : "/user/biadmin/StreamsTst.txt" ;
            hdfsUser : "<ValidUser>" ;
            hdfsUri  : "webhdfs://<host:port>" ;
            credFile : "<path>/credfile.txt" ;
    }
For details about the credentials file, see the BigInsights documentation section "Specifying a credentials file in a webhdfs URI".
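The authoritative format of the credentials file is defined in that BigInsights documentation section; as an assumption for illustration only, a commonly used layout is plain key=value properties naming the user and password:

    # <path>/credfile.txt -- illustrative layout only; confirm the exact
    # keys and format in the BigInsights documentation cited above.
    user=<ValidUser>
    password=<password>

Because the file holds a password in clear text, restrict its permissions so only the user running the Streams instance can read it.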

[{"Product":{"code":"SSCRJU","label":"IBM Streams"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Not Applicable","Platform":[{"code":"PF016","label":"Linux"}],"Version":"4.0;4.0.1","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21967040