IBM InfoSphere Streams Version 4.1.0

Developing and running applications that use the HBase Toolkit


To create applications that use the HBase Toolkit, you must tell either Streams Studio or the SPL compiler where the toolkit is located.

Before you begin

  • Install IBM® InfoSphere® Streams. Configure the product environment variables by entering the following command:
    source product-installation-root-directory/4.0.0.0/bin/streamsprofile.sh
  • Install Apache HBase (http://hbase.apache.org/), Apache Hadoop (http://hadoop.apache.org/), and Apache ZooKeeper (http://zookeeper.apache.org/). All of these products are installed as part of IBM InfoSphere BigInsights®.
    • This toolkit supports IBM InfoSphere BigInsights Version 3.0.0 and later, and HBase Version 0.96 and later.
    • Note: The host or resource where InfoSphere Streams is installed must be able to communicate with the resources where HBase is installed, but the two products do not need to be installed on the same resource.
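The following is a minimal sketch of a check for the first prerequisite, not part of the product. It assumes that streamsprofile.sh exports STREAMS_INSTALL (as the $STREAMS_INSTALL references in the procedure suggest) and puts the sc and streamtool commands on the PATH. The function name check_streams_env is hypothetical.

```shell
#!/bin/sh
# Report which Streams prerequisites are missing from the current shell.
# Prints one line per problem and returns non-zero if any problem was found.
# check_streams_env is a hypothetical helper, not a Streams command.
check_streams_env() {
    missing=0
    # Assumption: streamsprofile.sh exports STREAMS_INSTALL.
    if [ -z "${STREAMS_INSTALL:-}" ]; then
        echo "STREAMS_INSTALL is not set; source streamsprofile.sh first"
        missing=1
    fi
    # sc and streamtool are the commands that the procedure uses later.
    for cmd in sc streamtool; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "command not found on PATH: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

Run check_streams_env after you source streamsprofile.sh. No output and a zero return code indicate that the variable and both commands are visible in the current shell.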

About this task

After the location of the toolkit is communicated to the compiler, the SPL artifacts that are specified in the toolkit can be used by an application. The application can include a use directive to bring the necessary namespaces into scope. Alternatively, you can fully qualify the operators that are provided by the toolkit with their namespaces as prefixes.

Procedure

  1. Supply the operator with HBase configuration information. Choose one of the following methods:
    • Copy the hbase-site.xml file from the conf directory in your HBase installation into the etc directory in your application. Use the hbaseSite parameter in the HBase Toolkit operators to point to the hbase-site.xml file.
    • If the Streams processing application runs on a resource where HBase and Hadoop are installed, you can instead set the HBASE_HOME environment variable to the directory that contains HBase. The operators then look in HBASE_HOME/conf/hbase-site.xml for HBase configuration information.
  2. Configure the SPL compiler to find the toolkit root directory. Use one of the following methods:
    • Set the STREAMS_SPLPATH environment variable to the root directory of the toolkit. To list multiple toolkits, separate their root directories with a colon (:). For example:
      export STREAMS_SPLPATH=$STREAMS_INSTALL/toolkits/com.ibm.streamsx.hbase
    • Specify the -t or --spl-path command parameter when you run the sc command. For example:
      sc -t $STREAMS_INSTALL/toolkits/com.ibm.streamsx.hbase -M MyMain
      where MyMain is the name of the SPL main composite. Note: These command parameters override the STREAMS_SPLPATH environment variable.
    • Add the toolkit location in InfoSphere Streams Studio.
  3. Develop your application. To avoid fully qualifying each operator name, add a use directive to your application.
    • For example, you can add the following clause in your SPL source file:
      use com.ibm.streamsx.hbase::*;
      You can also specify a use clause for individual operators by replacing the asterisk (*) with the operator name. For example:
      use com.ibm.streamsx.hbase::HBASEDelete;
  4. Build your application. You can use the sc command or Streams Studio.
  5. Start the InfoSphere Streams instance.
  6. Run the application. You can submit the application as a job by using the streamtool submitjob command or by using Streams Studio.
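Steps 1 and 2 of the procedure can be sketched as two small shell helpers. This is an illustration under stated assumptions, not product code. The function names join_splpath and locate_hbase_site are hypothetical, and so is the order in which locate_hbase_site checks the two locations. It mirrors the two methods in step 1 but is not a description of how the operators resolve the file.

```shell
#!/bin/sh
# Join toolkit root directories with ':' as the separator,
# which is the format that STREAMS_SPLPATH uses.
join_splpath() {
    result=""
    for dir in "$@"; do
        if [ -z "$result" ]; then
            result="$dir"
        else
            result="$result:$dir"
        fi
    done
    printf '%s\n' "$result"
}

# Print the first hbase-site.xml that exists for the two methods in step 1.
# $1 is the application directory. Checking etc before HBASE_HOME/conf is
# an assumption made for this sketch only.
locate_hbase_site() {
    app_dir="$1"
    if [ -f "$app_dir/etc/hbase-site.xml" ]; then
        printf '%s\n' "$app_dir/etc/hbase-site.xml"
    elif [ -n "${HBASE_HOME:-}" ] && [ -f "$HBASE_HOME/conf/hbase-site.xml" ]; then
        printf '%s\n' "$HBASE_HOME/conf/hbase-site.xml"
    else
        echo "no hbase-site.xml found" >&2
        return 1
    fi
}

# Example use (commented out because it needs a Streams installation;
# the second toolkit path is a placeholder):
#   export STREAMS_SPLPATH=$(join_splpath \
#       "$STREAMS_INSTALL/toolkits/com.ibm.streamsx.hbase" "$HOME/toolkits/other")
#   locate_hbase_site . && sc -M MyMain
```

Because the -t and --spl-path parameters override STREAMS_SPLPATH, use either the exported variable or the command parameter for a given build, not both.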