IBM InfoSphere Streams Version 4.1.1

What's New in Version 4.1

These InfoSphere® Streams Version 4.1 enhancements are included in Version 4.1.1.

New options are available for customizing user authentication in an enterprise domain.
When you create an enterprise domain, you must specify either the Lightweight Directory Access Protocol (LDAP) or the Pluggable Authentication Module (PAM) service as the default user authentication method for InfoSphere Streams. After creating the domain, you can now use the following new options to customize user authentication:
  • Use InfoSphere Streams or user-defined login modules.
  • Set up client certificate authentication for the domain by using X.509 certificates.
The following InfoSphere Streams interfaces support certificate authentication: Domain Manager, Streams Console, streamtool command-line interface, JMX API, and REST API.

Secure communication to and from domain and instance services by using cryptographic protocols

Many domain and instance services support connections that use Transport Layer Security (TLS) cryptographic protocols. You can specify which cryptographic protocols the services use for secure communication by setting domain and instance properties.

In previous releases, you could specify which cryptographic protocols to use for the web management service; the meaning of some of those options changed in Version 4.1. For example, the SSL_TLS and SSL_TLSv2 options no longer use Secure Sockets Layer (SSL) 3.0 and later protocols.

InfoSphere Streams no longer requires that LDAP servers enable anonymous binds.

If you use LDAP for authenticating users, you no longer need to enable anonymous binds in the LDAP server. Instead, you can set LDAP administrator authentication credentials when you create a domain by specifying the security.ldapAdministratorUser and security.ldapAdministratorPassword properties. To change the credentials for an existing domain, you can use the new streamtool setldapadminconfig command.

Support is added for IBM® Power® Systems that run little endian.

InfoSphere Streams now supports both big endian and little endian on IBM Power Systems™.

Support is added for IBM Platform Symphony® and user-defined external resource managers.

You can now manage InfoSphere Streams resources by using IBM Platform Symphony and user-defined resource managers.

If you are using an external resource manager, you can use the new resource installation package to add externally managed resources to the domain.

Document native functions and create tables in SPLDOC markup.

In Version 4.1, you can use SPLDOC annotation tags to document both SPL and native functions. There is also a new @throws annotation, which you can use to describe the exceptions that are thrown by native functions.

Version 4.1 also includes support for nested text markup. In particular, you can combine different typefaces, such as italics, bold, and monospace.

You can also include tables in the comments for the SPL artifacts in a toolkit. The spl-make-doc command generates HTML documentation from the marked up artifacts.

SPL Standard Toolkit contains SPLDOC comments.

All of the toolkits, including the SPL Standard Toolkit, now use SPLDOC markup to associate comments with their SPL artifacts. You can use the spl-make-doc command to generate HTML documentation for the toolkits. The output from that command is included in the product documentation and in the doc directory within the toolkit.

Develop applications with new toolkits
Version 4.1 includes the following new toolkits:
  • The Cybersecurity Toolkit (com.ibm.streams.cybersecurity) provides operators that can analyze Domain Name System (DNS) response records. The operators in this toolkit use machine learning models to analyze DNS traffic and report on suspicious behavior.
  • The Distributed Process Store Toolkit (com.ibm.streamsx.dps) enables multiple applications that are running processing elements on one or more machines to share application-specific state information. The shared information is stored in an external Redis NoSQL Key-Value (K/V) data store. The toolkit thus allows non-fused SPL, C++ and Java™ operators that are running on different machines to share information.
  • The Spark MLlib Toolkit (com.ibm.streamsx.sparkmllib) allows the machine learning library in Apache Spark to be used for real time scoring of data in InfoSphere Streams.
  • The Topology Toolkit (com.ibm.streamsx.topology) provides mechanisms to build streams processing applications in programming languages other than SPL, with support for Java and Scala.

InfoSphere Streams REST API supports cross-origin resource sharing (CORS).

For security purposes, web browsers typically restrict scripts from accessing data with a different origin than the page that contains the script. This restriction, which is known as the same-origin policy, can prevent you from directly accessing the InfoSphere Streams REST API from a script. To address this restriction, the InfoSphere Streams REST API supports cross-origin resource sharing (CORS), which provides a mechanism for the browser and server to determine whether to allow cross-origin requests. To enable this support, you must specify which origins should be permitted to make cross-origin REST requests. You can manage the list of trusted origins by using the Streams Console or the new streamtool addtrustedorigin, rmtrustedorigin, and lstrustedorigins commands.
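
The origin check that CORS relies on can be sketched as follows. This is a minimal illustration in Python of how a server might compare a request origin against a trusted-origin list; it is not the InfoSphere Streams implementation. The function names (origin_of, cors_headers) and the host names are hypothetical, and the sketch assumes that an origin is the scheme, host, and port of a URL.

```python
from urllib.parse import urlsplit

# Default ports per scheme, so "https://host" and "https://host:443" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) triple that defines the origin of a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def cors_headers(request_origin, trusted_origins):
    """Return CORS response headers if the origin is trusted, else an empty dict."""
    trusted = {origin_of(o) for o in trusted_origins}
    if origin_of(request_origin) in trusted:
        # Echo the origin back so the browser allows the script to read the response.
        return {"Access-Control-Allow-Origin": request_origin, "Vary": "Origin"}
    return {}


# Hypothetical trusted-origin list; the host names are placeholders.
trusted = ["https://dashboard.example.com", "http://localhost:3000"]

allowed = cors_headers("https://dashboard.example.com:443", trusted)
denied = cors_headers("https://other.example.com", trusted)
```

In this sketch, a request from an origin that is not in the list receives no CORS headers, so the browser continues to apply the same-origin policy to it.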

Capture the current state of the instance in an output file.

The streamtool capturestate command captures the current state of the instance, including the state of objects in the system and the metrics available on those objects. Sometimes the command outputs a large amount of XML information. In IBM InfoSphere Streams Version 4.1.1, you can use the --file parameter to specify a file path for the command output. The command also changed such that the --select parameter is no longer optional.

New and changed streamtool commands

IBM InfoSphere Streams Version 4.1.1 includes several new streamtool commands, such as mkresourcepkg and setigcadminconfig. All streamtool commands also changed such that the level value is no longer optional when you specify the --verbose option.

Provide information governance for your Streams applications.

You can now develop streams processing applications or extend your existing streams processing applications to take advantage of the features of IBM InfoSphere Information Governance Catalog.

Catch exceptions thrown during tuple processing

You can now catch exceptions of a specified type that are thrown by an operator while it processes tuples. To catch these exceptions, use the new @catch annotation.
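
The general idea of catching one exception type per tuple can be illustrated with a small Python sketch. This is not SPL syntax and does not show the @catch annotation itself. The decorator name (catching), the function parse_reading, and the drop-and-continue behavior are assumptions made for illustration only.

```python
import logging

logger = logging.getLogger("tuple_processing")


def catching(*exception_types):
    """Decorator that catches the given exception types while one tuple is processed."""
    def decorate(process):
        def wrapper(tup):
            try:
                return process(tup)
            except exception_types as exc:
                # Assumed behavior: log the failure, drop the tuple, and keep running.
                logger.warning("dropped tuple %r: %s", tup, exc)
                return None
        return wrapper
    return decorate


@catching(ValueError)
def parse_reading(tup):
    """Hypothetical per-tuple logic that can fail on malformed input."""
    return float(tup["value"])


tuples = [{"value": "1.5"}, {"value": "n/a"}, {"value": "2.0"}]
results = [r for r in map(parse_reading, tuples) if r is not None]
```

Only the listed exception type is caught in this sketch. A tuple without a "value" key would still raise KeyError, which mirrors the idea of catching exceptions of a specified type rather than all exceptions.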

Write streams processing applications in Java™ and Scala

You can now write streaming applications for InfoSphere Streams entirely in Java™ with the new Java application API. The API uses a functional style of programming so that you can define graph flow and data manipulation simultaneously.
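
To illustrate what a functional style means here, the following Python sketch chains operations so that the flow of the graph and the logic applied to each item are declared together. It is a conceptual stand-in only, not the Java application API: the Stream class and its filter, transform, and collect methods are hypothetical, and Python is used only for brevity.

```python
class Stream:
    """Hypothetical, minimal stand-in for a stream in a functional-style API."""

    def __init__(self, source):
        # A zero-argument callable that returns an iterable of items.
        self._source = source

    def filter(self, predicate):
        """Declare a downstream stream that keeps items for which predicate is true."""
        return Stream(lambda: (t for t in self._source() if predicate(t)))

    def transform(self, fn):
        """Declare a downstream stream that applies fn to every item."""
        return Stream(lambda: (fn(t) for t in self._source()))

    def collect(self):
        """Run the declared flow and return the resulting items as a list."""
        return list(self._source())


# Each call adds a node to the flow and supplies its logic as a function.
readings = Stream(lambda: [12.5, 99.0, 47.3, 101.2])
alerts = readings.filter(lambda c: c > 90.0).transform(lambda c: f"ALERT {c:.1f}")
result = alerts.collect()
```

Nothing is processed until collect is called; the chained calls only describe the flow, which is the sense in which the graph and the data manipulation are defined in one place.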

Scala support also enables you to create streaming applications entirely in Scala for InfoSphere Streams.

The new Topology Toolkit (com.ibm.streamsx.topology) supports building streaming applications for InfoSphere Streams in different programming languages, such as Java and Scala.

Monitor your applications with the InfoSphere Streams Application Dashboard
The InfoSphere Streams Console enables you to administer your domain and monitor the applications running in that domain. You can monitor your applications in the following ways:
  • Monitor all of the applications within a domain

    Use the management dashboard to monitor the status of the domain and all of the resources, instances, jobs, processing elements (PEs), operators, and streams that are associated with the domain.

  • Monitor specific applications
    In IBM InfoSphere Streams Version 4.1.1, you can also create an application dashboard to monitor an application, or set of applications, by specifying the data you want to see. You can use application dashboards to proactively manage your applications because they enable you to:
    • Specify which data is relevant to the health of your application.
    • Focus on a specific application or set of applications.
From an application dashboard, you can:
  • Create queries to focus on the InfoSphere Streams objects that are relevant to your application.
  • Create derived metrics to dive deeper into the health of your applications.
  • Display the results of a query on a card, which enables you to visualize important data about your applications with tools, such as live graphs, tables, and charts.
  • View product, application trace, and console logs.

Support for incremental checkpointing for InfoSphere Streams applications

You can now checkpoint the state of your InfoSphere Streams applications more efficiently with incremental checkpointing. Incremental checkpointing tracks, with low overhead, the changes that are made to the operator state during tuple processing. When the operator state is checkpointed, only the portion that changed since the previous checkpoint is saved. This approach reduces checkpointing overhead and saves storage space and I/O bandwidth on the checkpointing backend storage. In IBM InfoSphere Streams Version 4.1.1, incremental checkpointing is enabled for the Join, Aggregate, and Sort operators in the SPL Standard Toolkit and works transparently for InfoSphere Streams applications that use those operators.
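
The change-tracking idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the mechanism that the Join, Aggregate, and Sort operators use: the IncrementalState class, the record layout, and the restore function are hypothetical, and operator state is modeled as a plain key-value map.

```python
import copy


class IncrementalState:
    """Key-value state that records which keys changed since the last checkpoint."""

    def __init__(self):
        self._data = {}
        self._dirty = set()      # keys written or removed since the last checkpoint
        self._has_base = False   # True after the first (full) checkpoint

    def put(self, key, value):
        self._data[key] = value
        self._dirty.add(key)

    def remove(self, key):
        if key in self._data:
            del self._data[key]
            self._dirty.add(key)

    def checkpoint(self):
        """Return a full record the first time, then only the changes."""
        if not self._has_base:
            record = {"kind": "full", "updates": copy.deepcopy(self._data), "removed": []}
            self._has_base = True
        else:
            updates = {k: copy.deepcopy(self._data[k]) for k in self._dirty if k in self._data}
            removed = [k for k in self._dirty if k not in self._data]
            record = {"kind": "delta", "updates": updates, "removed": removed}
        self._dirty.clear()
        return record


def restore(records):
    """Rebuild the state by applying a full record and the deltas that follow it."""
    data = {}
    for record in records:
        if record["kind"] == "full":
            data = dict(record["updates"])
        else:
            data.update(record["updates"])
            for key in record["removed"]:
                data.pop(key, None)
    return data


state = IncrementalState()
for i in range(1000):
    state.put(f"key{i}", i)
full = state.checkpoint()    # first checkpoint saves all 1000 entries

state.put("key7", -1)
state.remove("key8")
delta = state.checkpoint()   # second checkpoint saves one update and one removal

recovered = restore([full, delta])
```

In this sketch the second checkpoint stores two entries instead of 999, which is where the savings in storage space and I/O bandwidth come from. The trade-off is that recovery must apply the deltas in order on top of the last full checkpoint.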