
CDPz: Streaming your application's structured data to analytics platforms

How To


Summary

In addition to the data streams that IBM Common Data Provider for z Systems (CDPz) provides for processing and streaming the SMF records produced by IBM products and several third-party products, you can extend CDPz to stream your own SMF records and application data.

Objective

This capability lets you stream your application's structured data to analytics platforms (Logstash or Splunk) via CDPz. You implement code that writes your application's structured data to SMF, with the structured data as the payload of the SMF record, and you define the CDPz artifacts that describe that data. The System Data Engine (SDE) component of CDPz then picks up the special SMF record, processes your application's structured data, and streams it to the appropriate subscribers.

Environment

To stream your application's structured data via CDPz, the central task is to use the System Data Engine (SDE) collector language to define the necessary definition objects for your application's structured data. The main tasks are:

 

  • If your application's structured data is non-SMF data, implement code that writes the data to SMF via a special SMF record.
  • On the SDE, create record definitions, update definitions, and optionally template definitions for your special SMF record.
  • In the Configuration Tool, create the corresponding user data streams for your update definitions. Once the user data streams are defined, you can create or update the policy for the SDE and the Data Streamer to include the user data streams in the same way as the data streams delivered by CDPz.
  • On the analytics platform (Logstash or Splunk), update the configuration files to support the new data source types.

Steps

Follow the procedures below for step-by-step instructions to stream your application's structured data.

 

Step 1: Write your application’s structured data to SMF to be picked up by CDPz.

 

If the application data you wish to stream via CDPz is already available in SMF records, you can skip Step 1 and move on to Step 2.

 

If your application’s structured data is non-SMF data, you need to implement code that writes the data to SMF so that CDPz can process it. You pass the data to CDPz by packaging it inside a special SMF record and writing the record with the SMFWTM or SMFEWTM macro or another appropriate method.

 

SMF record types 128 through 255 are available for user-written records. In addition, you can use SMF record type 127 subtype 1000 to stream your non-SMF application structured data. The SMF user exit provided by CDPz suppresses the recording of the SMF 127/1000 records. If your SMF is in data set recording mode and you want the records to be processed by CDPz only and not recorded to VSAM data sets, you can write your application’s structured data as SMF 127/1000 records. Note that CDPz may be enhanced in the future to make the SMF record type and subtype configurable instead of using the fixed type 127 and subtype 1000.

 

If you use SMF record type 127 subtype 1000 for your application’s structured data, you must define the record layout according to the following table. Because the SMF 127/1000 records are used in CDPz by multiple data providers, CDPz uses two additional fields (SM127SRC and SM127SRS), following the standard SMF record header, to identify the user data payload type and subtype.

 

Offset    Data Field  Length  Note
--------  ----------  ------  -----------------------------------------------
0 (x00)   SM127LEN    2       Record length (maximum size of 32,756). Must be
                              the logical record length, including the RDW.
2 (x02)   SM127SEG    2       Segment descriptor. Initialize the field with
                              zeros.
4 (x04)   SM127FLG    1       System indicator. Turn on bit 1 (X'40'),
                              indicating a record with subtypes.
5 (x05)   SM127RTY    1       Record type. Must be 127.
6 (x06)   SM127TME    4       Time the record was written. You do not need to
                              supply this field if you use the SMFWTM or
                              SMFEWTM macro.
10 (x0A)  SM127DTE    4       Date the record was written. You do not need to
                              supply this field if you use the SMFWTM or
                              SMFEWTM macro.
14 (x0E)  SM127SID    4       System ID. You do not need to supply this field
                              if you use the SMFWTM or SMFEWTM macro.
18 (x12)  SM127SSI    4       Subsystem ID. Must be 'CDP '.
22 (x16)  SM127STY    2       Record subtype. Must be 1000.
24 (x18)  SM127SRC    2       User payload type. Use a number between 128
                              and 255.
26 (x1A)  SM127SRS    2       User payload subtype. Use this field to further
                              identify your payload application structured
                              data.
28 (x1C)  (payload)   -       Start of user payload data.
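
For reference, the following assembler mapping is a minimal sketch of the layout above. The SM127 DSECT name and the SM127UDT label for the start of the payload are illustrative and are not part of any IBM-supplied macro.

SM127    DSECT                     SMF 127/1000 user record (sketch)
SM127LEN DS    H                   Record length, including the RDW
SM127SEG DS    H                   Segment descriptor (zeros)
SM127FLG DS    X                   System indicator (X'40' = subtypes)
SM127RTY DS    X                   Record type (127)
SM127TME DS    XL4                 Time record written (1/100 seconds)
SM127DTE DS    PL4                 Date record written (0cyydddF)
SM127SID DS    CL4                 System ID
SM127SSI DS    CL4                 Subsystem ID ('CDP ')
SM127STY DS    H                   Record subtype (1000)
SM127SRC DS    H                   User payload type (128-255)
SM127SRS DS    H                   User payload subtype
SM127UDT DS    0X                  Start of user payload data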

 

Refer to the MVS System Management Facilities (SMF) manual for more details on writing SMF records.
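
As a minimal, hedged sketch of the write itself, assuming the record has been built at the label SM127REC, that WTMERR is your error-handling routine, and that your program meets the authorization requirements described in that manual:

         SMFWTM SM127REC           Write the record to SMF
         LTR   R15,R15             Return code 0 means record written
         BNZ   WTMERR              Otherwise handle the write failure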

 

Step 2: Create a PDS for user-defined CDPz definitions.

 

If you do not already have one, create a partitioned data set (PDS) to serve as the user concatenation library for the user record, update, and template definitions. For more information about how to create the data set, see Creating a System Data Engine data stream definition in the Knowledge Center.
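
For example, the following minimal job allocates such a library; the data set name and attributes are placeholders, so match the attributes of the SHBODEFS library at your site:

//ALLOCDEF JOB (),'ALLOC DEFS',MSGCLASS=X,CLASS=A,NOTIFY=&SYSUID
//ALLOC    EXEC PGM=IEFBR14
//DEFS     DD   DSN=USERID.LOCAL.DEFS,DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(TRK,(5,5,10)),
//         DCB=(DSORG=PO,RECFM=FB,LRECL=80,BLKSIZE=0)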

 

Step 3: Define the RECORD definitions for the layout of your SMF records.

 

You need to define record definitions to CDPz for the SMF records containing your application’s structured data so that CDPz can process the data correctly. The record definition must match the physical layout of the SMF records. Create a member in the data set created in step 2 to define the RECORD definitions for the layout of your SMF records.

 

In this example, we create a member named USRRS127 to define the record definition for structured data using SMF record type 127 and subtype 1000. Note that SMF 127/1000 has two additional fields, SM127SRC and SM127SRS, following the standard SMF record header. If you are using a different SMF record type (with or without subtypes), be sure to define the record definition with the correct SMF header.

 

USERID.LOCAL.DEFS(USRRS127)

/**********************************************************************/ 
/*                                                                    */ 
/* SMF Record Type 127 SubType 1000 for User Data                     */ 
/*                                                                    */ 
/**********************************************************************/ 

SET SMF_ABC_RECTYPE = '127'  ;                                          
SET SMF_ABC_RECSTYP = '1000' ;                                          
SET SMF_ABC_SRCID   = '128'  ;                       ==> Note 1          

DEFINE RECORD ABC_01                                 ==> Note 2         
  VERSION 'CDP.110'                                                     
  IN LOG SMF                                                            
  IDENTIFIED BY SM127RTY = &SMF_ABC_RECTYPE                              
            AND SM127STY = &SMF_ABC_RECSTYP                             
            AND SM127SRC = &SMF_ABC_SRCID            ==> Note 3         
            AND SM127SRS = 1                         ==> Note 3          
  FIELDS (                                                                                                                                          
  ---------------------------------------------------------------------  
  ---   Standard SMF record header                                        
  ---------------------------------------------------------------------  
      SM127LEN  LENGTH  2  BINARY,          -- Record length      
      SM127SEG  LENGTH  2  BINARY,          -- Segment descriptor 
      SM127FLG  LENGTH  1  HEX,             -- System indicator   
      SM127RTY  LENGTH  1  BINARY,          -- Record Type        
      SM127TME  LENGTH  4  TIME(1/100S),    -- Time               
      SM127DTE  LENGTH  4  DATE(0CYYDDDF),  -- Date               
      SM127SID  LENGTH  4  CHAR,            -- System ID          
      SM127SSI  LENGTH  4  CHAR,            -- Subsystem ID       
      SM127STY  LENGTH  2  BINARY,          -- Record subtype     
  ---------------------------------------------------------------------  
  ---   CDP fields for payload type/subtype                               
  ---------------------------------------------------------------------  
      SM127SRC  LENGTH  2  BINARY,          -- Payload Type        ==> Note 3
      SM127SRS  LENGTH  2  BINARY,          -- Payload Subtype     ==> Note 3                                                                          
  ---------------------------------------------------------------------  
  ---   SMF User Data                                                     
  ---------------------------------------------------------------------  
      fld1 ,                                         ==> Note 4           
      fld2 ,                                         ==> Note 4          
      ......                                         ==> Note 4          
      fldn                                           ==> Note 4          
      );                                                                   

Notes:

 

  1. Use a number between 128 and 255 to identify your user data payload type.
  2. Choose a meaningful name for the record. The name must be different from the name of any other record definition.
  3. Use SM127SRC and SM127SRS to uniquely identify the SMF record.
  4. These are the fields for user data in the SMF user record. Fields are separated by commas.

 

You can define multiple record definitions in the same member. Use a different SM127SRS value for each record.

 

For the language reference of the DEFINE RECORD statement, see DEFINE RECORD statement in the Knowledge Center.

 

Step 4: Define the UPDATE definitions for your SMF records.

 

Create a member in the data set created in step 2 to define the UPDATE definitions that collect the SMF user records. In this example, we create a member named USRUS127 to define the update definitions for the record defined in step 3.

 

USERID.LOCAL.DEFS(USRUS127)

SET IBM_FILE = 'ABC01';                ==> Note 1  

DEFINE UPDATE ABC_01                   ==> Note 2  
  VERSION 'CDP.110'                                 
  FROM ABC_01                          ==> Note 3  
  TO &IBM_UPDATE_TARGET                            
  &IBM_CORRELATION                                 
  AS &IBM_FILE_FORMAT SET(ALL);                    

Notes:

 

  1. Set a different name for IBM_FILE for each update definition. The name must be a valid DD name.
  2. The update definition name must be unique among all update definitions. It can be the same as or different from the record definition name, as long as it is not used by any other update definition.
  3. The FROM clause identifies the source of the update definition, which is the name of the user record definition.

 

You can define multiple user update definitions in the same member. You can also use the WHERE clause to select certain records for collection; a sketch follows.
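
This hedged sketch collects only the records written on one system. The ABC_01F update name, the ABC01F file name, and the SM127SID filter value are illustrative; verify the clause placement against the DEFINE UPDATE statement reference.

SET IBM_FILE = 'ABC01F';

DEFINE UPDATE ABC_01F
  VERSION 'CDP.110'
  FROM ABC_01
  WHERE SM127SID = 'SYS1'
  TO &IBM_UPDATE_TARGET
  &IBM_CORRELATION
  AS &IBM_FILE_FORMAT SET(ALL);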

 

For the language reference of the DEFINE UPDATE statement, see DEFINE UPDATE statement in the Knowledge Center.

 

Step 5: (Optional) Define the TEMPLATE definitions for the UPDATE definitions.

 

If you want to filter the fields that are streamed from the SMF user records, add a DEFINE TEMPLATE statement after the update definition, in the same member as the update definition. Make sure that the template name is the same as the corresponding update definition name.

 

In the template definitions, you must include the SM127TME and SM127DTE fields from the standard SMF record header; these two fields are required for timestamp resolution when you ingest the data into your analytics platform.
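
For example, here is a hedged sketch of a template for the ABC_01 update definition from step 4, streaming the two required timestamp fields plus two of the hypothetical user fields; verify the exact clause syntax against the DEFINE TEMPLATE statement reference.

DEFINE TEMPLATE ABC_01 FOR ABC_01
  ORDER (SM127TME, SM127DTE, fld1, fld2)
  AS &IBM_FILE_FORMAT;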

 

For the language reference of the DEFINE TEMPLATE statement, see DEFINE TEMPLATE statement in the Knowledge Center.

 

Step 6: Validate the syntax of the RECORD, UPDATE, and, optionally, TEMPLATE definitions.

 

Use the following example job to verify the syntax of the record, update, and template definitions.

 

//HBOJBCOL JOB (),'DUMMY',MSGCLASS=X,MSGLEVEL=(,0),
//         CLASS=A,NOTIFY=&SYSUID                 
//*                                                           
//HBOSMFCB EXEC PGM=HBOPDE,REGION=0M,PARM='SHOWINPUT=YES'     
//STEPLIB  DD   DISP=SHR,DSN=HBOvrm.SHBOLOAD                   ==> Note 1
//HBOOUT   DD   SYSOUT=*                                       
//HBODUMP  DD   SYSOUT=*                                      
//HBOIN    DD   DISP=SHR,DSN=HBOvrm.SHBODEFS(HBOCCSV)          ==> Note 1
//         DD   DISP=SHR,DSN=HBOvrm.SHBODEFS(HBOCCORY)         ==> Note 1
//         DD   DISP=SHR,DSN=HBOvrm.SHBODEFS(HBOLLSMF)         ==> Note 1
//         DD   DISP=SHR,DSN=USERID.LOCAL.DEFS(USRRS127)       ==> Note 2
//         DD   DISP=SHR,DSN=USERID.LOCAL.DEFS(USRUS127)       ==> Note 2
//         DD   *                                             
COLLECT SMF                                                   
WITH STATISTICS                                               
BUFFER SIZE 1 M;                                               
//*                                                           
//HBOLOG   DD   DUMMY                                          

 

Notes:

 

  1. Change HBOvrm to the high-level qualifier for the IBM Common Data Provider for z Systems SMP/E target data set.
  2. These two statements specify the members for the user record, update and template definitions. USERID.LOCAL.DEFS is the USER concatenation library. USRRS127 is the member that contains the user record definitions, and USRUS127 is the member that contains the user update and template definitions. Replace these values based on your configuration. Make sure that the user record definition member is included before the user update definition member.

 

Important: Make sure that the definitions are error-free by running the validation job before you create the user data streams in the Configuration Tool.

 

If there are no syntax errors, you see messages like the following in HBOOUT.

 

HBO0125I ABC_01 was successfully defined. 

 

HBO0201I Update ABC_01 was successfully defined.

 

If there are syntax errors, correct the errors according to the messages in the output file that is defined by HBOOUT.

 

Step 7: Validate data collection with the RECORD, UPDATE, and, optionally, TEMPLATE definitions.

 

You can use a batch SDE job to collect data from an SMF log data set that contains your SMF user records. You can then validate the correctness of the resulting data by reviewing the output data set.

 

Use the following example job to verify the data collected with your record, update, and template definitions.

 

//HBOJBCOL JOB (),'DUMMY',MSGCLASS=X,MSGLEVEL=(,0),
//         CLASS=A,NOTIFY=&SYSUID                 
//*                                                           
//HBOSMFCB EXEC PGM=HBOPDE,REGION=0M,PARM='ALLHDRS=YES'     
//STEPLIB  DD   DISP=SHR,DSN=HBOvrm.SHBOLOAD                   ==> Note 1
//HBOOUT   DD   SYSOUT=*                                      
//HBODUMP  DD   SYSOUT=*                                      
//HBOIN    DD   DISP=SHR,DSN=HBOvrm.SHBODEFS(HBOCCSV)          ==> Note 1
//         DD   DISP=SHR,DSN=HBOvrm.SHBODEFS(HBOCCORY)         ==> Note 1
//         DD   DISP=SHR,DSN=HBOvrm.SHBODEFS(HBOLLSMF)         ==> Note 1
//         DD   DISP=SHR,DSN=USERID.LOCAL.DEFS(USRRS127)       ==> Note 2
//         DD   DISP=SHR,DSN=USERID.LOCAL.DEFS(USRUS127)       ==> Note 2
//         DD   *                                             
COLLECT SMF                                                   
WITH STATISTICS                                               
BUFFER SIZE 1 M;                                               
//*                                                           
//ABC01    DD   DSN=USERID.ABC01.CSV,                          ==> Note 3
//         DISP=(NEW,CATLG,DELETE),SPACE=(CYL,(10,10)),        
//         DCB=(RECFM=V,LRECL=32756)                           

 

Notes:

 

  1. Change HBOvrm to the high-level qualifier for the IBM Common Data Provider for z Systems SMP/E target data set.
  2. These two statements specify the members for the user record, update and template definitions. USERID.LOCAL.DEFS is the USER concatenation library. USRRS127 is the member that contains the user record definitions, and USRUS127 is the member that contains the user update and template definitions. Replace these values based on your configuration. Make sure that the user record definition member is included before the user update definition member.
  3. This statement specifies the output data set in which the SDE stores the resulting data. Make sure that the DD name matches the value of the IBM_FILE variable in the SET IBM_FILE statement ahead of your update definition. The resulting data set is a CSV file; you can download it to your workstation and use a spreadsheet application to validate the data collected for each field.

 

Important: Make sure that you validate the data thoroughly by running the batch SDE job before you create the user data streams in the Configuration Tool.

 

Step 8: Create user data streams in the Configuration Tool.

 

Create a user System Data Engine data stream in the Configuration Tool for each of the update definitions that are created in previous steps. Make sure that the data stream name is the same as the update definition name, and that you specify the member for the record definition before the member for the update and template definitions in the SHBODEFS data set members field.

 

For more information, see Creating a System Data Engine data stream definition in the Knowledge Center.

 

Step 9: Update the Logstash configuration for the new data streams.

 

If you are ingesting data into the Elastic Stack, create a field name annotation configuration file and a timestamp resolution configuration file in the Logstash configuration directory for each data stream.

 

A. Field name annotation configuration file: the file is named H_data_stream_name.lsh, for example, H_ABC_01.lsh.

 

# CDPz ELK Ingestion
#
# Field Annotation for stream zOS-ABC_01
#

 

filter {
   if [sourceType] == "zOS-ABC_01" {
      csv{ columns => [ "Correlator", "SM127LEN", "SM127SEG", "SM127FLG", "SM127RTY", "SM127TME", "SM127DTE", "SM127SID", "SM127SSI", "SM127STY", "SM127SRC", "SM127SRS", "fld1", "fld2", "fldn" ]
         separator => "," }
   }
}

 


Notes:

 

  1. The value of sourceType must match the data source type of the data stream. The naming convention is zOS-data_stream_name, for example, zOS-ABC_01.
  2. Replace fld1, fld2, fldn with the field names defined in your record definition. If you have a template definition associated with the update definition, change the entire column list to match the fields and order in the template definition. Keep Correlator as the first column in the list.

 

B. Timestamp resolution configuration file: the file is named N_data_stream_name.lsh, for example, N_ABC_01.lsh.  

 

# CDPz ELK Ingestion
#
# Timestamp Extraction for stream zOS-ABC_01
#

 

filter {
   if [sourceType] == "zOS-ABC_01" {
      mutate{ add_field => {
         "[@metadata][timestamp]" => "%{SM127DTE} %{SM127TME}"
        }}

 

      date{ match => [
             "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss:SS"
        ]}
   }
}

 


Notes:

 

  1. The value of sourceType must match the data source type of the data stream. The naming convention is zOS-data_stream_name, for example, zOS-ABC_01. 

 

Restart Logstash after you create the files for all data streams.

 

Refer to the Logstash documentation for more information about the configuration files.

 

Step 10: Update Splunk to support the new data streams.

 

If you are ingesting data into Splunk, define the layout of the data streams to the Splunk server by creating the props.conf file in the Splunk_Home/etc/apps/ibm_cdpz_buffer/local directory with the following content. If props.conf already exists, append the following lines to that file.

 

#
# ABC_01
#

 

[zOS-ABC_01]
 TIMESTAMP_FIELDS = SM127DTE, SM127TME, timezone
 TIME_FORMAT= %F %H:%M:%S:%2Q %z
 FIELD_NAMES = "sysplex","system","hostname","","","sourcename","timezone","Correlator", "SM127LEN", "SM127SEG", "SM127FLG", "SM127RTY", "SM127TME", "SM127DTE", "SM127SID", "SM127SSI", "SM127STY", "SM127SRC", "SM127SRS", "fld1", "fld2", "fldn"
 INDEXED_EXTRACTIONS = csv
 KV_MODE = none
 NO_BINARY_CHECK = true
 SHOULD_LINEMERGE = false
 category = Structured
 disabled = false
 pulldown_type = true

 


Notes:

 

  1. The stanza name must match the data source type of the data stream. The naming convention is zOS-data_stream_name, for example, zOS-ABC_01.
  2. Replace fld1, fld2, fldn with the field names defined in your record definition. If you have a template definition associated with the update definition, change the entire column list to match the fields and order in the template definition. Keep Correlator as the first column in the list.

 

In the Splunk user interface, you must also configure the file-to-data-source-type mapping for the new data stream. The file that the Data Receiver saves is named CDP-zOS-data_stream_name-*.CDP; for example, for the data stream ABC_01, the file is named CDP-zOS-ABC_01-*.CDP.

 

Restart the Splunk server after you make the changes.

 

Refer to the Splunk documentation for more information.

 

Step 11: Create or update the policy for SDE and Data Streamer in the Configuration Tool.

 

In the Configuration Tool, create or update the policy to add the new System Data Engine data streams so that the SMF user records can be processed and streamed by the IBM Common Data Provider for z Systems.

 

  1. In the Configuration Tool primary window, create a new policy or select the policy that you want to update.
  2. Click the DATA STREAM button in the Policy Profile Edit window.
  3. Find and select the data streams from the list in the Select data stream window.
  4. Assign a subscriber for each new data stream.
  5. In the Policy Profile Edit window, click SDE to specify values for the USER Concatenation and CDP Concatenation fields, and click OK. Fill in the USER Concatenation field with the name of your user concatenation library; following the previous examples, this is USERID.LOCAL.DEFS.
  6. Click Save to save the policy.

 

Important: Each time that the associated record definition or update definition is changed, you must edit and save the policy in the Configuration Tool so that the changes are reflected in the policy files.

 

For more information about how to update a policy, see Updating a policy in the Knowledge Center.

 

Step 12: Restart the System Data Engine and the Data Streamer.
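
Restart the System Data Engine and the Data Streamer started tasks so that the updated policy and definitions take effect. For example, from the z/OS console, where sde_proc and streamer_proc are placeholders for the started-task procedure names used at your installation:

P sde_proc
P streamer_proc
S sde_proc
S streamer_proc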

 

 

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSTQCD","label":"IBM Z Common Data Provider"},"Component":"","Platform":[{"code":"PF035","label":"z\/OS"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB35","label":"Mainframe SW"}}]

Document Information

Modified date:
28 April 2020

UID

ibm10718855