IBM InfoSphere Streams Version 4.1.1

Postinstallation roadmap for InfoSphere Streams

The postinstallation roadmap summarizes required and optional postinstallation tasks and options for InfoSphere® Streams.
Table 1. Postinstallation tasks for InfoSphere Streams
Task Description
Configure the InfoSphere Streams environment variables.
Before you can use InfoSphere Streams, you must configure the product environment variables. For more information, see Configuring the InfoSphere Streams environment by running streamsprofile.sh.
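As a quick check that the environment is configured, the following minimal sketch tests whether the STREAMS_INSTALL variable, which streamsprofile.sh sets, points to an existing directory. The function name streams_env_ready is hypothetical, and <installation-directory> is a placeholder for your own installation path, not a real location.

```shell
#!/bin/sh
# Sketch only: streams_env_ready is a hypothetical helper name.
# Returns 0 if STREAMS_INSTALL is set and names an existing directory.
streams_env_ready() {
  [ -n "${STREAMS_INSTALL:-}" ] && [ -d "$STREAMS_INSTALL" ]
}

if streams_env_ready; then
  echo "InfoSphere Streams environment is configured: $STREAMS_INSTALL"
else
  # <installation-directory> is a placeholder; substitute your own path.
  echo "Environment is not configured. Source streamsprofile.sh first, for example:"
  echo "  source <installation-directory>/bin/streamsprofile.sh"
fi
```

Because streamsprofile.sh sets variables in the current shell, it is sourced rather than executed, and the check must run in the same shell session.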
If you are upgrading InfoSphere Streams from a release prior to Version 4.0, review the migration requirements and options for applications, instances, and Streams Studio.
If a streams processing application was compiled on a release prior to InfoSphere Streams Version 4.0, you must migrate the application to Version 4.1.1.

You can use the instance upgrade tool to export the host lists, properties, and security settings of Version 3.2.1 instances for use in Version 4.1.1. This tool is most helpful if you want to continue using features that were required before Version 4.0, such as SSH and shared file system installations.

You cannot reuse an existing Streams Studio workspace from a release prior to Version 4.0. However, you can import the projects in an existing workspace into a new Version 4.1.1 workspace.

For more information, see Migration guidelines for releases prior to Version 4.0.

If you are upgrading InfoSphere Streams from Version 4.0 or later, review the migration requirements and options for Streams Studio and applications.
To reuse your existing installation of Streams Studio Version 4.0 or later, you must update Streams Studio. Otherwise, your existing Streams Studio directory and all of its contents are removed. The removed contents include any Streams Studio workspaces that reside under the Streams Studio installation directory, customized tools, and other customizations of the Streams Studio development environment.

For more information, see Migration guidelines for Version 4.0 or later.

If you are using the default InfoSphere Streams resource manager, set up automatic recovery from failures on your main InfoSphere Streams management hosts.
Run the streamtool registerdomainhost command on each host where you performed a full installation of the product by using the main installation package. This command sets up a Linux system service for the InfoSphere Streams domain controller service, which enables InfoSphere Streams to recover automatically from host failures. Running the domain controller service as a system service is required if you plan to use the security.runAsRoot domain property.

You can run the streamtool registerdomainhost command before or after you create an InfoSphere Streams enterprise domain.

Notes:
  • You must have root authority to run this command.

  • You must specify a domain name and the --zkconnect option on this command: streamtool registerdomainhost -d domain-id --zkconnect host:port. The domain that you specify does not need to exist before you run the command.

    The --zkconnect option specifies one or more host and port pairs for the configured external ZooKeeper ensemble. This value is the external ZooKeeper connection string. To obtain this value, enter the streamtool getzk -d domain-id command.

  • For additional information about this command, enter streamtool man registerdomainhost.
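
The notes above can be sketched as a small dry-run helper. It checks that the connection string is a comma-separated list of host:port pairs, which is the usual ZooKeeper connection string form, and then prints the registerdomainhost command instead of running it, because the real command requires root authority. The function names, the domain name StreamsDomain, and the zk*.example.com hosts are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Sketch only: function names, StreamsDomain, and the example.com hosts
# are hypothetical. Nothing here runs streamtool; the command is printed.

# Succeeds if $1 is a non-empty, comma-separated list of host:port pairs.
# The body is a subshell, so the IFS and globbing changes stay local.
valid_zkconnect() (
  [ -n "$1" ] || exit 1
  # Reject empty entries from leading, trailing, or doubled commas.
  case "$1" in
    ,*|*,|*,,*) exit 1 ;;
  esac
  IFS=','
  set -f
  for entry in $1; do
    case "$entry" in *:*) ;; *) exit 1 ;; esac
    host=${entry%:*}
    port=${entry##*:}
    case "$host" in ''|*[!A-Za-z0-9._-]*) exit 1 ;; esac
    case "$port" in ''|*[!0-9]*) exit 1 ;; esac
  done
  exit 0
)

# Prints the command line for domain $1 and connection string $2.
print_register_cmd() {
  if [ -z "$1" ] || ! valid_zkconnect "$2"; then
    echo "usage: print_register_cmd domain-id host:port[,host:port...]" >&2
    return 1
  fi
  printf 'streamtool registerdomainhost -d %s --zkconnect %s\n' "$1" "$2"
}

# The connection string would normally come from: streamtool getzk -d domain-id
print_register_cmd StreamsDomain zk1.example.com:2181,zk2.example.com:2181
```

After you review the printed line, you can run it yourself as root on each host that has a full installation of the product.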
Set up an InfoSphere Streams domain.
Before you use InfoSphere Streams, you must create at least one domain. An InfoSphere Streams domain is a container for InfoSphere Streams instances that provides a single point for configuring and managing common resources, security, and instances.

A basic domain has a single InfoSphere Streams resource and user. This type of domain is typically used for test or development environments.

An enterprise domain can have multiple resources and users. This type of domain is typically used for production environments. You can configure high availability to ensure that InfoSphere Streams can continue to run even if resources fail or are not available.

Optional: Configure SSH for InfoSphere Streams.
In previous releases, InfoSphere Streams used Secure Shell (SSH) to run product applications, commands, and scripts. Beginning in Version 4.0, using SSH is optional. If you are using SSH, see Configuring a Secure Shell environment for InfoSphere Streams.
Optional: Install IBM® InfoSphere Streams Studio.
Streams Studio is an integrated development environment that enables you to create, edit, visualize, test, debug, and run streams processing applications. You can install Streams Studio on a Linux or Microsoft Windows system. For information about installation requirements and procedures for Streams Studio, see Installing IBM InfoSphere Streams Studio.
Optional: Install IBM InfoSphere Streams for Microsoft Excel.
You can use Streams for Excel to analyze and visualize streaming data in Microsoft Excel worksheets. Installing the Streams for Excel add-in is optional. For more information, see Installing IBM InfoSphere Streams for Microsoft Excel.
Optional: Install the Uncrustify source code formatter.
When you generate C++ code by using the sc command or Streams Studio, InfoSphere Streams can use the Uncrustify source code formatter to ensure that the generated code is indented and formatted correctly. Installing and using the Uncrustify source code formatter with InfoSphere Streams is optional. For more information, see Installing the Uncrustify source code formatter for InfoSphere Streams.