IBM InfoSphere Streams Version 4.1.0

Assigning host tags

Telecommunications Event Data Analytics applications use host tags to distribute the processing load evenly across the available hosts, to store enrichment data on all hosts that need it and to overcome CPU or memory limitations.

For example, assume that you have four hosts with 64 GB of RAM each. Your application is structured into twenty groups that process data in parallel, and each group needs 8 GB of RAM to fulfill its task. To distribute the processing load evenly, you run five groups on each host, which consumes 40 GB of RAM per host. Thus you need to assign the host tags of five groups to each host.
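
The arithmetic in this example can be checked with a short script. The following is a minimal sketch and not part of the product. The function name plan_groups is hypothetical, and the sketch assumes that groups are spread as evenly as possible, rounding up when the number of groups is not a multiple of the number of hosts.

```shell
# plan_groups HOSTS GROUPS RAM_PER_GROUP_GB
# Prints "<groups per host> <GB of RAM used per host>".
# Hypothetical planning helper; it does not call streamtool.
plan_groups() {
    hosts=$1
    groups=$2
    ram_per_group=$3
    # Round up so that no group is left without a host.
    per_host=$(( (groups + hosts - 1) / hosts ))
    echo "$per_host $(( per_host * ram_per_group ))"
}

# Values from the example: 4 hosts, 20 groups, 8 GB per group.
plan_groups 4 20 8
```

For the values in the example, the script prints 5 40, which stays below the 64 GB that each host provides.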

An ITE application requires the following host tags, where <ite-namespace> is the namespace of the ITE application:

  • <ite-namespace>_ingest: Assign this host tag to the host that shall read the input files.
  • <ite-namespace>_controller: Assign this host tag to the host that shall control the ITE application.
  • <ite-namespace>_chain: Assign this host tag to the host or hosts that shall run the business logic (variants A and B only).
  • <ite-namespace>_chain_<id>: Assign these host tags to the host or hosts that shall run the business logic (variant C only).
  • <ite-namespace>_context_<id>: Assign these host tags to the host or hosts that shall run the group logic (variants B and C only).

The Lookup Manager application requires the following host tags, where <lm-namespace> is the namespace of the Lookup Manager application:

  • <lm-namespace>_lookup_writer: Assign this host tag to the host that shall control the Lookup Manager application.
  • <lm-namespace>_lookup_host_writer: Assign this host tag to the host or hosts on which the enrichment data resides.

The <lm-namespace>_lookup_host_writer host tag must be assigned to every host that uses enrichment data and that has the <ite-namespace>_chain host tag or an <ite-namespace>_chain_<id> host tag assigned. The number of such hosts is variable and must correspond to the number of <lm-namespace>_lookup_host_writer host tag assignments.
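
This rule can be checked mechanically before any tags are applied. The sketch below assumes a hypothetical planning file (not a Streams file format) in which each line holds a host name, a space, and a comma-separated list of tags. The function name check_lookup_tags and the namespaces demo and lm are made up for illustration, and the check treats every host with a chain tag as a host that uses enrichment data.

```shell
# check_lookup_tags PLAN ITE_NAMESPACE LM_NAMESPACE
# PLAN is a hypothetical planning file with lines of the form
#   <host> <tag>,<tag>,...
# Prints every host that has a chain tag but lacks the
# <lm-namespace>_lookup_host_writer tag; returns 1 if any host is printed.
check_lookup_tags() {
    awk -v ite="$2" -v lm="$3" '
        BEGIN { bad = 0 }
        NF >= 2 {
            n = split($2, tags, ",")
            has_chain = 0
            has_lookup = 0
            for (i = 1; i <= n; i++) {
                # Matches <ite>_chain and <ite>_chain_<id>.
                if (tags[i] == (ite "_chain") || index(tags[i], ite "_chain_") == 1) {
                    has_chain = 1
                }
                if (tags[i] == (lm "_lookup_host_writer")) {
                    has_lookup = 1
                }
            }
            if (has_chain && !has_lookup) {
                print $1
                bad = 1
            }
        }
        END { exit bad }
    ' "$1"
}
```

Called as check_lookup_tags plan.txt demo lm, the function lists the hosts that still need the lookup host tag and returns a nonzero status if there are any.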

Before you begin

Build your application. The build creates the ‘hosttags.txt’ file in the ‘<project-base>/<application-folder>/config’ directory. This file contains all host tags that your application uses. You must assign every tag to at least one of your hosts. Use the ‘hosttags.txt’ file as input when you apply host tags to the hosts in your Streams domain.
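
Whether every tag is assigned to at least one host can be verified with a small script. This is a minimal sketch under two assumptions that the text does not confirm: that hosttags.txt lists one host tag per line, and that the intended assignments are kept in a hypothetical planning file with lines of the form "<host> <tag>,<tag>,...". The function name unassigned_tags is made up for illustration.

```shell
# unassigned_tags HOSTTAGS PLAN
# HOSTTAGS: assumed to list one host tag per line (format not confirmed).
# PLAN: hypothetical planning file with lines "<host> <tag>,<tag>,...".
# Prints every tag from HOSTTAGS that no host in PLAN carries.
unassigned_tags() {
    awk '
        FILENAME == ARGV[1] {
            # First file (the plan): collect the tags that it assigns.
            n = split($2, tags, ",")
            for (i = 1; i <= n; i++) {
                assigned[tags[i]] = 1
            }
            next
        }
        # Second file (the tag list): report tags missing from the plan.
        NF > 0 && !($1 in assigned) { print $1 }
    ' "$2" "$1"
}
```

An empty result from unassigned_tags hosttags.txt plan.txt means that the plan covers every tag in the file.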

Discuss with your development team which host tags to assign to which hosts. Your developers might also have defined additional host tags whose assignment restrictions can only be explained by them.

Procedure

  • Create a Streams domain as described in the Streams documentation.
  • Create a Streams instance as described in the Streams documentation.
  • Build the application.
  • Open the ‘hosttags.txt’ file of your application.
  • Using the information in the file, assign the host tags to your hosts, either in the Streams Explorer of Streams Studio or on the command line with the streamtool command.

    Example:

    You have an instance with the hosts host_1, host_2, and host_3. Assign the host tags to the hosts by copying the host tag names to the command line after the --tags option of the streamtool chhost command:

     streamtool chhost --add --tags <ite-namespace>_ingest,<ite-namespace>_controller,<lm-namespace>_lookup_writer <host_1>
     streamtool chhost --add --tags <lm-namespace>_lookup_host_writer,<ite-namespace>_chain <host_2>
     streamtool chhost --add --tags <lm-namespace>_lookup_host_writer,<ite-namespace>_chain <host_3>
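
When the number of hosts grows, the commands can be generated from a planning file instead of being typed by hand. The following is a minimal sketch, assuming a hypothetical file with one line per host of the form "<host> <tag>,<tag>,..." and a made-up function name, print_chhost_commands. It only prints the commands, using the same streamtool chhost options as the example above, so that they can be reviewed before they are run.

```shell
# print_chhost_commands PLAN
# PLAN is a hypothetical planning file with lines of the form
#   <host> <tag>,<tag>,...
# Prints one "streamtool chhost" command per host; nothing is executed.
print_chhost_commands() {
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r host tags || [ -n "$host" ]; do
        # Skip empty lines.
        [ -n "$host" ] || continue
        printf 'streamtool chhost --add --tags %s %s\n' "$tags" "$host"
    done < "$1"
}
```

After reviewing the output of print_chhost_commands plan.txt, you could redirect it to a script file and run that file in the environment in which streamtool is configured for your domain.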