IBM InfoSphere Streams Version 4.1.1

Tutorial

This tutorial explains how to use the ASN1Parse operator by walking through the source code of the teda.sample.ASN1Parse.Assignments sample application. It also shows how to modify the SPL output schema.

To review the discussed code, import the sample project into your IBM InfoSphere Streams Studio workspace.

The following ASN.1 grammar is taken from the structure definition document, the etc/grammar.asn file.

MyModule DEFINITIONS IMPLICIT TAGS ::=
BEGIN

A ::= CHOICE
{
	integer [0] INTEGER,
	b [1] B,
	...
}

B ::= SEQUENCE
{
	boolean [0] BOOLEAN OPTIONAL,
	integer [1] INTEGER DEFAULT 3,
	c [2] SEQUENCE OF C,
	...
}

C ::= CHOICE
{
	d [0] D,
	e [1] E,
	...
}

D ::= SEQUENCE
{
	string [0] UTF8String,
	...
}

E ::= SEQUENCE
{
	integer [0] INTEGER,
	strings [1] SEQUENCE OF UTF8String OPTIONAL,
	...
}

END

The following SPL code shows the ASN1Parse operator call. The code is taken from the Main.spl file of the sample.

(
	stream<Record> Records as O
) as ParsedRecords = ASN1Parse(DataBlocks as I)
{
	param
		payloadAttribute: payload;
		structureDocument: "etc/grammar.asn";
		trigger: "/b/c";
		checkConstraints: true;
	output O:
		blockNumber = fromInput(blockNo);
}

The ASN1Parse operator automatically selects the A ASN.1 element as the trigger root because it is the only ASN.1 element that does not have parent elements. Alternatively, you can set the root explicitly by adding the pdu parameter with the value "A", for example:

		pdu: "A";

In this example, the trigger is specified as "/b/c", and the corresponding ASN.1 path is A.b.c.

Because the c field is repeating (c [2] SEQUENCE OF C) in the B ASN.1 SEQUENCE, the "/b/c" trigger generates one output tuple for each occurrence of a C element on that ASN.1 path.

SPL output schema

The C element is a CHOICE, and each alternative of that choice is optional because only one of them is present at a time. For optional elements, it is recommended that you use an SPL list type. If the optional element is present, the list contains one element. If the optional element is absent, the list is empty. Alternatively, you can use an SPL set type. Because an SPL set stores only unique elements, a set is useful only if you do not want duplicates or if the input data does not contain any.

The d and e fields of the C element are of ASN.1 SEQUENCE type. An ASN.1 SEQUENCE, like an ASN.1 SET or CHOICE, maps to an SPL tuple type, so the fields are represented by the D and E tuple types, which hold the attributes of the corresponding elements.

The E element has the repeating strings field. As with optional elements, it is recommended that you use the SPL list or set type. Because the field is an ASN.1 SEQUENCE OF a primitive type, the list contains the corresponding SPL primitive type directly and not another SPL tuple type.

Mandatory elements, like string in D and integer in E, normally map to a mandatory SPL attribute of a matching SPL primitive type. You can use an SPL list instead, as for optional elements, but the list then always contains exactly one element.
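
As an illustration of the list alternative for a mandatory element, the D type could be declared as in the following sketch. This variant is not part of the sample application; it only restates the rule above in SPL.

```spl
type D = tuple // SEQUENCE
<
	// UTF8String, mandatory, non-repeating.
	// Declared as a list for illustration only; because the ASN.1
	// field is mandatory, the list always holds exactly one element.
	list<rstring> string
>;
```

The plain rstring attribute is usually the simpler choice, because downstream code can read the value without indexing into a list.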

The following snippet shows the SPL schema that is required to convert the complete C ASN.1 element:

type C = tuple // CHOICE
<
	list<D> d, // optional, non-repeating
	list<E> e // optional, non-repeating
>;

type D = tuple // SEQUENCE
<
	rstring string // UTF8String, mandatory, non-repeating
>;

type E = tuple // SEQUENCE
<
	int64 integer, // INTEGER, mandatory, non-repeating
	list<rstring> strings // UTF8String, optional, repeating
>;

If, for example, you are not interested in E, remove the list<E> e line from the C type definition (together with the comma that follows the d attribute) and delete the E type definition. Only ASN.1 fields that have a corresponding SPL counterpart are converted.

The following snippet shows the reduced SPL schema:

type C = tuple // CHOICE
<
	list<D> d // optional, non-repeating
>;

type D = tuple // SEQUENCE
<
	rstring string // UTF8String, mandatory, non-repeating
>;

If you need an additional SPL attribute, for example, X in D, which is set by downstream operators, add it to the D type definition. SPL attributes that do not have a corresponding ASN.1 counterpart are either taken from the input (which is valid only for top-level SPL attributes) or initialized with the default value.

The following snippet shows the extended SPL schema:

type C = tuple // CHOICE
<
	list<D> d // optional, non-repeating
>;

type D = tuple // SEQUENCE
<
	rstring X, // will be set downstream
	rstring string // UTF8String, mandatory, non-repeating
>;

The spl-schema-from-asn1 tool generates the proper SPL type definitions from the ASN.1 grammar.

The SPL type definition in the Main composite of the sample, shown below, has two additional top-level SPL attributes, filename and blockNumber. Both output attributes get their values from the input tuple. The d and e SPL attributes correspond to the SPL schema described first, except that the D and E types are replaced by their definitions, and C is renamed to Record.

Record = tuple
<
	// The filename is taken from the input stream.
	rstring filename,
	// The block number is taken from the input attribute 'blockNo'.
	int64 blockNumber,
	// D is a SEQUENCE containing the UTF8String field 'string'.
	// The SPL attribute is a list because C is a choice. The
	// list will either be empty or will have exactly one element.
	// The SEQUENCE requires that the list contains a tuple type.
	// The name of the inner SPL attribute 'string' is identical
	// to the name of the ASN.1 field.
	list<tuple<rstring string>> d,
	// E is a SEQUENCE containing the INTEGER field 'integer'
	// and a SEQUENCE OF UTF8String, stored as field 'strings'.
	// The SPL attribute is a list because C is a choice. The
	// list will either be empty or will have exactly one element.
	// The SEQUENCE requires that the list contains a tuple type.
	// The names of the inner SPL attributes 'integer' and 'strings'
	// are identical to the names of the ASN.1 fields.
	// The SPL attribute 'strings' is a list because the ASN.1
	// field 'strings' is repeating (SEQUENCE OF).
	list<tuple<int64 integer, list<rstring> strings>> e
>;
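
A downstream operator can tell which alternative of the CHOICE was decoded by checking which of the two lists is non-empty. The following sketch is not part of the sample application: the RecordPrinter name and the printed text are hypothetical, and it assumes that the Records stream uses the Record type shown above.

```spl
// Hypothetical consumer of the Records stream (illustration only).
() as RecordPrinter = Custom(Records as I)
{
	logic
		onTuple I:
		{
			if (size(I.d) > 0)
			{
				// Alternative d was present: the list holds one D tuple.
				println("block " + (rstring) I.blockNumber
					+ ": D, string=" + I.d[0].string);
			}
			else if (size(I.e) > 0)
			{
				// Alternative e was present: the list holds one E tuple.
				// The inner strings list is empty if the optional
				// ASN.1 field was absent.
				println("block " + (rstring) I.blockNumber
					+ ": E, integer=" + (rstring) I.e[0].integer
					+ ", number of strings=" + (rstring) size(I.e[0].strings));
			}
		}
}
```

Checking the list size before indexing avoids reading an element from an empty list when the other alternative was decoded.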