IBM InfoSphere Streams Version 4.1.0

Glossary

This glossary includes terms and definitions for InfoSphere® Streams Telecommunications Event Data Analytics.

The following cross-references are used in this glossary:

  • see refers you from a term to a preferred synonym, or from an acronym or abbreviation to the defined full form.
  • see also refers you to a related or contrasting term.

application control

Implements the synchronization and communication between the Lookup Manager and the ITE applications.

application control directory

A directory in the (shared) file system. All ITE and Lookup Manager applications that belong to a Telecommunications Event Data Analytics application must have access to the directory. The applications store status information in the directory. The status information is used for inter-application communication.

application framework

A set of Streams application templates. The application templates are used to build Lookup Manager applications and ITE applications that are typically used in Telecommunications-oriented applications. The application framework also provides wizards and tools to support the setup, configuration, and operation of these applications.

ASN.1

Abstract Syntax Notation One is a common encoding format for CDRs.

business logic

Business logic is the part of the program that encodes the real-world business rules.

In the application framework, the business logic and the implemented rules validate and change the input data as needed for the use case, for example, transformation and enrichment.

CDR

Call Detail Record. Typically, it contains information about a transaction of a mobile network element.

chain

In the ITE application, a chain is responsible for reading an input file, parsing its content, applying the business rules and writing the transformed and enriched tuples to a file. To process multiple files in parallel, multiple chains can be configured.

checkpointing

The mechanism of the ITE application that saves state information to disk. The state information is used to recover the application state after a job restart. A job restart might be required, for example, because of maintenance tasks.
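
The save-and-recover cycle described in this entry can be sketched in Python as follows. This is an illustration only, not the framework's implementation: the JSON file format and the helper names `save_checkpoint` and `load_checkpoint` are assumptions made for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state to disk atomically (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so that a crash never leaves a
    # partially written checkpoint under the final name.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists."""
    if not os.path.exists(path):
        return default
    with open(path) as handle:
        return json.load(handle)
```

After a restart, a job would call `load_checkpoint` once at startup and continue from the returned state.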

cleanup

A configurable mechanism in the application framework that removes data that is no longer valid from stateful operators.

context

The context is the part of the ITE application that is stateful and supports tuple correlation or aggregation. The context consists of the built-in tuple deduplication and a customizable part. Both parts can be enabled or disabled independently of each other.
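
As a rough sketch of the two independently switchable parts, the Python class below keeps one set of seen keys for deduplication and one counter map as a stand-in for the customizable part. The class name `Context`, the flags, and the tuple attributes `id` and `caller` are hypothetical and chosen only for this example.

```python
class Context:
    """Hypothetical stateful context with two independently switchable parts."""

    def __init__(self, dedup_enabled=True, custom_enabled=True):
        self.dedup_enabled = dedup_enabled
        self.custom_enabled = custom_enabled
        self.seen_keys = set()  # state for the deduplication part
        self.counts = {}        # state for the custom part (a simple aggregation)

    def process(self, tuple_):
        """Return the tuple if it continues downstream, or None if it is dropped."""
        if self.dedup_enabled:
            key = tuple_["id"]
            if key in self.seen_keys:
                return None  # duplicate: drop it before the custom part sees it
            self.seen_keys.add(key)
        if self.custom_enabled:
            caller = tuple_["caller"]
            self.counts[caller] = self.counts.get(caller, 0) + 1
        return tuple_
```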

control path

See application control directory.

CRM

Customer Relationship Management. A system to manage interactions with customers.

CSV

Comma-separated values is a common data-encoding format.

DBLoader toolkit

An InfoSphere Streams toolkit that loads files into a database. It uses batch processing methods to improve performance. The DbLoader application is available on GitHub in the IBMStreams/samples directory.

enrichment

Data enrichment is a process to enhance, refine or otherwise improve raw data.

In the context of a Telecommunications Event Data Analytics application, enrichment means that data that is not part of the input is added to a tuple, for example, a contract number or demographic information. The added data comes from external data sources, for example, a database.

See also enrichment data.

enrichment data

The enrichment data is typically a rarely changing set of data that often comes from a CRM system and that is used by the ITE application to enrich tuples. The enrichment data is stored in shared memory. The ITE application uses the lookup mechanism to access the enrichment data, which is also called lookup data.

group

The ITE application supports tuple grouping. All tuples of a group flow through the same context that is typically stateful. The groups and the corresponding contexts remove potential performance or memory bottlenecks because many context instances process tuples in parallel and each instance can be placed on a different host.
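
The routing idea behind grouping can be sketched in Python: a stable hash of a group key selects the context instance, so every tuple of a group reaches the same context. The function names, the choice of CRC32 as the hash, and the `imsi` attribute used below are assumptions for illustration, not the framework's actual partitioning scheme.

```python
import zlib


def context_index(group_key, num_contexts):
    """Map a group key to a context instance (hypothetical routing function)."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(group_key.encode("utf-8")) % num_contexts


def route(tuples, key_attr, num_contexts):
    """Distribute tuples so that each group always lands in the same context."""
    contexts = [[] for _ in range(num_contexts)]
    for t in tuples:
        contexts[context_index(t[key_attr], num_contexts)].append(t)
    return contexts
```

For example, `route(tuples, "imsi", 4)` returns four lists, and all tuples that share an `imsi` value appear in exactly one of them, in their original order.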

housekeeping

Housekeeping is an automated or manual process to free resources, such as memory or hard disk space, or to remove information that is no longer valid.

The application framework provides an automated and configurable process to remove data that is no longer valid. See also cleanup.

The administrator of Telecommunications Event Data Analytics applications is responsible for removing files that are no longer needed.

ITE

Ingest, Transform, and Enrich are the main functions of an ITE application.

ITE application

A Streams application that is created with the application framework. The ITE application ingests CDRs that are stored in files, and transforms, enriches, and stores them. You can configure and customize this application to fit your business needs.

lookup

A mechanism to access enrichment data that is stored as key-value pairs in the shared memory.

lookup data

See enrichment data.

Lookup Manager application

A Streams application that is created with the application framework. The Lookup Manager application manages the content of the shared memory that contains the enrichment data, and suspends the ITE applications while updating the shared memory. The ITE application uses the enrichment data during its enrichment step. You can configure and customize this application to fit your business needs.

namespace

In the application framework, the SPL namespace of an ITE application is used as a unique identifier. Only one instance of an ITE application with the unique namespace can run at a time.

ODM

Operational Decision Management. An IBM® product for creating and managing business rules.

processing chain

See chain.

segment

The shared memory consists of one or more named segments. Each segment can have one or more stores.

shared memory

Shared memory is memory that might be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies of data. In the context of Streams and the application framework, the shared memory holds enrichment data as key-value pairs that can be accessed by every processing element (PE) on the same host. The Lookup Manager application stores enrichment data into the shared memory and the ITE applications read the enrichment data. The shared memory consists of segments and stores.

SQL

Structured Query Language. A programming language for querying and manipulating databases.

store

A store is a named map that holds enrichment data as key-value pairs. A shared memory segment can have several stores.
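
The relationship between segments, stores, and lookups can be pictured with nested Python dictionaries. This is a conceptual sketch only: real shared memory is not a Python dictionary, and the segment name, store names, keys, and values below are invented sample data.

```python
# Hypothetical layout: segment name -> store name -> key-value map.
shared_memory = {
    "crm_segment": {
        "contracts": {"491700000001": "C-1001"},
        "demographics": {"491700000001": "group-A"},
    }
}


def lookup(segment, store, key, default=None):
    """Read one value from a named store inside a named segment."""
    return shared_memory[segment][store].get(key, default)


def enrich(tuple_):
    """Return a copy of the tuple with a contract number added from the lookup data."""
    enriched = dict(tuple_)
    enriched["contract"] = lookup("crm_segment", "contracts", tuple_["msisdn"])
    return enriched
```

In this picture, the Lookup Manager application would be the only writer of `shared_memory`, and the ITE applications would only call `lookup`.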

table file

The ITE application creates a CSV output file that can be loaded into a database table. Typically, the ITE application creates many such output files simultaneously, one for each database table.

tap

The ITE application supports some customizable taps that get copies of the flowing tuples. The copied tuples can be monitored or processed independently of the normal flow.
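
A tap can be pictured as a callback that receives its own copy of each tuple, so that nothing it does affects the normal flow. The function `run_with_tap` and its parameters are hypothetical names used only for this Python sketch.

```python
import copy


def run_with_tap(tuples, transform, tap):
    """Apply transform to every tuple while handing a private copy to the tap."""
    output = []
    for t in tuples:
        # The tap gets a deep copy, so changes it makes never reach the main flow.
        tap(copy.deepcopy(t))
        output.append(transform(t))
    return output
```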

Telecommunications Event Data Analytics application

A set of one or more ITE applications and an optional Lookup Manager application. These applications work together to implement a use case.

Telecommunications Event Data Analytics application framework

See application framework.

Telecommunications Event Data Analytics toolkit

A Streams toolkit that contains a set of generic operators and the application framework.

transformation

The process of changing values or adding attributes by applying business rules to the input data.

variant

A set of configuration parameters that specify the main features of an ITE application, for example, tuple deduplication or grouping. The supported variants are called A, B, and C.