Question & Answer
Question
Why do InfoSphere Streams HDFS operators fail to connect to Hadoop?
Cause
The connection fails when Streams is an external client and the explicit namenode URL is not visible to it. There are three such scenarios:
- a private network,
- a public network with namenode high availability,
- GPFS.
Answer
When the explicit namenode URL is not visible, you must use the webhdfs URI scheme with a credential file, as explained in the hdfsUri and credFile sections of the documentation for:
- HDFS2DirectoryScan operator
- HDFS2FileSource operator
- HDFS2FileSink operator
Example:
() as HDFS2FileSink_1 = HDFS2FileSink(Tuple_out0 as inPort0Alias)
{
    param
        file : "/user/biadmin/StreamsTst.txt" ;
        hdfsUser : "<ValidUser>" ;
        hdfsUri : "webhdfs://<host:port>" ;
        credFile : "<path>/credfile.txt" ;
}
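The same hdfsUri and credFile settings should carry over to the reading side. The following is a minimal sketch only, assuming the HDFS2FileSource operator is used to read back the file written by the sink. The stream name Lines_in and the attribute name line are hypothetical, and the placeholders are left unfilled as in the sink example.

```spl
// Sketch: read the file line by line over webhdfs.
// Lines_in and line are illustrative names, not taken from the technote.
stream<rstring line> Lines_in = HDFS2FileSource()
{
    param
        file : "/user/biadmin/StreamsTst.txt" ;
        hdfsUser : "<ValidUser>" ;
        hdfsUri : "webhdfs://<host:port>" ;   // webhdfs scheme instead of an explicit namenode URL
        credFile : "<path>/credfile.txt" ;    // credential file, as in the sink example
}
```

Check the hdfsUri and credFile sections of the operator documentation for the exact credential file contents expected by your toolkit version.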
Document Information
Modified date:
16 June 2018
UID
swg21967040