InfoSphere DataStage parallel jobs: Understanding the content of the APT_DUMP_SCORE report

Technote (FAQ)


Question

How do I read and interprete the results from the APT_DUMP_SCORE report?

Answer

The APT_DUMP_SCORE report documents activity within the InfoSphere Information Server parallel engine. The parallel engine is a runtime engine that uses a combination of interpreted and compiled code to extract, transformation and load of data. The InfoSphere DataStage Designer enables you to specify the data flow and processing steps in a job, without having to specify the degree and nature of parallel processing which happens during job run time.

A configuration file (APT_CONFIG_FILE) specifies the nature and amount of parallelism, and the specific resources that are used to run a job. When a job is to be executed, the data flow information in the compiled job is combined with the configuration file to produce a detailed execution plan called the score. The score is useful in analyzing job performance and diagnosing problems in jobs. If the user sets the environment variable APT_DUMP_SCORE, a text representation of the score (a report) is written to the job’s log.

The job score that populates the report includes information about all operators in the job, including operators that are automatically inserted by the Designer client. The score shows how many nodes each operator runs on, and the anticipated flow of the job as it goes from operator to operator through internal data sets.




To see the score in the job log, enable APT_DUMP_SCORE (by defining it and supplying a value) either on the job level, project level, or the system level. Enabling APT_DUMP_SCORE at the system level results in all parallel jobs providing a dump score report, and is ideal for development and quality assurance environments.

This document provides a start in understanding how your job works, and further analyzing job issues.

The following score example is a fairly small job. When you enable APT_DUMP_SCORE and then run a job, you might typically see in the log the following text:


main_program: This step has 10 datasets:
ds0: {op0[1p] (sequential PacifBaseMCES)
eOther(APT_ModulusPartitioner { key={ value=MBR_SYS_ID }
})<>eCollectAny
op1[4p] (parallel RemDups.IndvIDs_in_Sort)}
ds1: {op1[4p] (parallel RemDups.IndvIDs_in_Sort)
[pp] eSame=>eCollectAny
op2[4p] (parallel RemDups)}
ds2: {op2[4p] (parallel RemDups)
[pp] eSame=>eCollectAny
op6[4p] (parallel buffer(0))}
ds3: {op3[1p] (sequential PacifGalaxyMember)
eOther(APT_ModulusPartitioner { key={ value=MBR_SYS_ID }
})<>eCollectAny
op4[4p] (parallel IndvIdJoin.toIndvIdJoin_Sort)}
ds4: {op4[4p] (parallel IndvIdJoin.toIndvIdJoin_Sort)
eOther(APT_HashPartitioner { key={ value=MBR_SYS_ID }
})#>eCollectAny
op5[4p] (parallel inserted tsort operator {key={value=MBR_SYS_ID, subArgs={asc}}}(0) in IndvIdJoin)}
ds5: {op5[4p] (parallel inserted tsort operator {key={value=MBR_SYS_ID, subArgs={asc}}}(0) in IndvIdJoin)
[pp] eSame=>eCollectAny
op7[4p] (parallel APT_JoinSubOperatorNC in IndvIdJoin)}
ds6: {op6[4p] (parallel buffer(0))
[pp] eSame=>eCollectAny
op7[4p] (parallel APT_JoinSubOperatorNC in IndvIdJoin)}
ds7: {op7[4p] (parallel APT_JoinSubOperatorNC in IndvIdJoin)
[pp] eAny=>eCollectAny
op8[4p] (parallel APT_TransformOperatorImplV22S14_ETLTek_HP37FMember_PMR64262_Test1_SplitTran2 in SplitTran2)}
ds8: {op8[4p] (parallel APT_TransformOperatorImplV22S14_ETLTek_HP37FMember_PMR64262_Test1_SplitTran2 in SplitTran2)
eSame=>eCollectAny
op9[4p] (parallel buffer(1))}
ds9: {op9[4p] (parallel buffer(1))
>>eCollectOther(APT_SortedMergeCollector { key={ value=MBR_SYS_ID,
subArgs={ asc }
}
})
op10[1p] (sequential APT_RealFileExportOperator in HP37_OvaWestmember_extract_dat)}
It has 11 operators:
op0[1p] {(sequential PacifBaseMCES)
on nodes (
node1[op0,p0]
)}
op1[4p] {(parallel RemDups.IndvIDs_in_Sort)
on nodes (
node1[op1,p0]
node2[op1,p1]
node3[op1,p2]
node4[op1,p3]
)}
op2[4p] {(parallel RemDups)
on nodes (
node1[op2,p0]
node2[op2,p1]
node3[op2,p2]
node4[op2,p3]
)}
op3[1p] {(sequential PacifGalaxyMember)
on nodes (
node2[op3,p0]
)}
op4[4p] {(parallel IndvIdJoin.toIndvIdJoin_Sort)
on nodes (
node1[op4,p0]
node2[op4,p1]
node3[op4,p2]
node4[op4,p3]
)}
op5[4p] {(parallel inserted tsort operator {key={value=MBR_SYS_ID, subArgs={asc}}}(0) in IndvIdJoin)
on nodes (
node1[op5,p0]
node2[op5,p1]
node3[op5,p2]
node4[op5,p3]
)}
op6[4p] {(parallel buffer(0))
on nodes (
node1[op6,p0]
node2[op6,p1]
node3[op6,p2]
node4[op6,p3]
)}
op7[4p] {(parallel APT_JoinSubOperatorNC in IndvIdJoin)
on nodes (
node1[op7,p0]
node2[op7,p1]
node3[op7,p2]
node4[op7,p3]
)}
op8[4p] {(parallel APT_TransformOperatorImplV22S14_ETLTek_HP37FMember_PMR64262_Test1_SplitTran2 in SplitTran2)
on nodes (
node1[op8,p0]
node2[op8,p1]
node3[op8,p2]
node4[op8,p3]
)}
op9[4p] {(parallel buffer(1))
on nodes (
node1[op9,p0]
node2[op9,p1]
node3[op9,p2]
node4[op9,p3]
)}
op10[1p] {(sequential APT_RealFileExportOperator in HP37_OvaWestmember_extract_dat)
on nodes (
node2[op10,p0]
)}
It runs 35 processes on 4 nodes.


The dump score contains two sections -- the data sets (DS) and the operators (OP).
Data sets - The data sets that are listed in the score are the same type of data sets that you create with the Data Set stage -- in this context, they are temporary memory and/or disk storage during the job's run.
Operators - Operators are individual parallel engine stages that you might see on the user interface.

In a typical job flow, operators are end-points, and data sets are the links between the operators. (An exception is when data sets are used to actually output to a file.)

Each and every link on the job design is potentially a data set. However, unlike the data set stage which you put in your resource disk group by using the specified node pool within the job's configuration file (APT_CONFIG_FILE), these data sets are in memory. These temporary data sets are only placed in the scratch disk space when an imposed limit is reached. A limit can be imposed due to environment settings, or due to physical memory limitations.

Each operator listed in the score spawns a number of processes that are dependent on:

  • the job's established configuration file (APT_CONFIG_FILE)
  • constrained by the node pool settings
  • the operator configuration in the parallel engine code
  • Several environment variables, such as APT_DISABLE_COMBINATION, being set/unset.

First, let us focus on the operators, which are listed after the data sets in the score:

op0[1p] {(sequential PacifBaseMCES)
on nodes (
node1[op0,p0]
)}
op1[4p] {(parallel RemDups.IndvIDs_in_Sort)
on nodes (
node1[op1,p0]
node2[op1,p1]
node3[op1,p2]
node4[op1,p3]
)}

In the proceeding example, the two operators are: op0 and op1. The operator name is prefixed with the code name "op" and appended with an incremental numeric value starting with zero (0). Next to the operator name, there is an enclosed bracket with a value that is followed by the letter "p", for example, "[1p]". The value indicates the number of partitions given to that operator by the engine. For the first operator, only one (1) partition is provided, and the second operator is given four (4) partitions.

Within the curly brackets, the execution mode ("parallel" or "sequential") and the name of that operator is provided. The operator name is based on the name shown on the parallel canvas in the Designer client. The operator name is not the same as the operator type.

In the proceeding example, the first operator is listed as "PacifBaseMCES" and is the stage name in its entirety. However, the second operator, is listed as "remDups.IndvIDs_in_Sort". The stage name"IndvIDs" is renamed to indicate that the sort process triggered by the remove duplicate stage occured.

Following each operator name are the specific nodes that the operators are tagged to run on. In the proceeding example, node1 is for the first operator, and node1, node2, node3, and node4 are for the second operator. The name of nodes are defined in your configuration file (APT_CONFIG_FILE).


Now let us focus on the data sets:

ds0: {op0[1p] (sequential PacifBaseMCES)
eOther(APT_ModulusPartitioner { key={ value=MBR_SYS_ID }
})<>eCollectAny
op1[4p] (parallel RemDups.IndvIDs_in_Sort)}
ds1: {op1[4p] (parallel RemDups.IndvIDs_in_Sort)
[pp] eSame=>eCollectAny
op2[4p] (parallel RemDups)}

The name of the data set is provided first. Within the curly brackets, there are three stages:
  • the source of the data set - operator 0, sequential PacifBaseMCES
  • the activity of the data set - operator 1, parallel RemDups.IndvIDs_in_Sort
  • the target of the data set - operator 2, parallel RemDups
The source and target are usually operators, although you might see a specific file name provided that indicates that the operator is referencing and reading from a physical data set.

The first data set, ds0, partitions the data from the first operator ("op0" running in 1 partition). The data set uses APT_ModulusPartitioner (sometimes referred to as modulus) to partition. (Note: APT means Advanced Parallel Technology.) The modulus partitioning is using the key field MBR_SYS_ID in this scenario. The partitioning data is being sent to the second operator (op1 running in 4 partitions). Logically, this means that the data is being partitioned in 4 partitions using the modulus method.

The second data set, "ds1", reads from the second operator ("op1" running 4 partitioning). The second data set uses the eSame method to partition and sends the data over to the third operator ("op2" running 4 partitioning). One special note: the value "[pp]" means preserved partitioning. Preserved partitioning is an option that is usually set by default in the GUI. If data must be repartitioned, the "[pp]" flag is overridden and a warning message is triggered.

In the example for the first dataset, you see "eOther" and "eCollectAny".These are input and target read methods. The second method indicates the method that the receiving operator uses to collect the data.

In this example, "eOther" is the originating or input method for op0. It is an indication that something else is being imposed outside the expected partitioning option (and that you need to observe the string within the parenthesis -- APT_ModulusPartitioner in this example -- Modulus partitioning is imposed.)

"eCollectAny" is the target read method. Any records that are fed to this data set are collected in a round robin manner. The round robin behavior is less significant than the behavior that occurs for input partitioning method, which is eOther(APT_ModulusPartitioner) for ds0.

In the first example in this document, where the operator and stage uses the APT_SortedMergeCollector for ds9, the "eCollectOther"method indicates where actual partitioning occurs and is usually specified when you are referencing a sequential flat file. Shown again, in part, is the example:

ds8: {op8[4p] (parallel APT_TransformOperatorImplV22S14_ETLTek_HP37FMember_PMR64262_Test1_SplitTran2 in SplitTran2)
eSame=>eCollectAny
op9[4p] (parallel buffer(1))}
ds9: {op9[4p] (parallel buffer(1))
>>eCollectOther(APT_SortedMergeCollector { key={ value=MBR_SYS_ID,
subArgs={ asc }

The symbols between the originating partitioning method and the target read method translates to the parallelism of the partitioning. The following is the list of the symbols, and their definition:

-> Sequential to Sequential
<> Sequential to Parallel
=> Parallel to Parallel ( SAME)
#> Parallel to Parallel ( NOT SAME)
>> Parallel to Sequential
> No Source or Target

In summary:

Operator "op0" runs first in sequential mode on node "node1", and send data to data set "ds0". Data set "ds0" is partitioned using the modulus partitioning method data provided from sequential to parallel (4 ways) where it is sent to operator "op1" running in parallel mode on "node1", "node2", "node3", and "node4". Operator "op1" then handles the collected data, and sends the results to data set "ds1". Data set "ds1" just provides data in the same partitioning order for "op2" operator as it was for "op1" operator.

This is the basic concept of how the parallel engine works at runtime, and what is shown in the dump score report.

* * *

The dump score report also can tell you when the parallel engine inserted an operator based on its internal analysis of each operator's requirements. For example, Join stages require that the data be sorted, but the GUI does not require that you supply the sort details. The engine is intelligent enough to realize that a sort is required, and supplies the requirement itself. You can see the behavior in the following example:


op5[4p] {(parallel inserted tsort operator {key={value=MBR_SYS_ID, subArgs={asc}}}(0) in IndvIdJoin)
on nodes (
node1[op5,p0]
node2[op5,p1]
node3[op5,p2]
node4[op5,p3]
)}


In the example, tsort is the name of the sort operator being used. As part of this insertion, the data is repartitioned based on the same key as the hash partitioning:


ds4: {op4[4p] (parallel IndvIdJoin.toIndvIdJoin_Sort)
eOther(APT_HashPartitioner { key={ value=MBR_SYS_ID }
})#>eCollectAny
op5[4p] (parallel inserted tsort operator {key={value=MBR_SYS_ID, subArgs={asc}}}(0) in IndvIdJoin)}


All of this partitioning and sorting provided in the proceeding example is for the Sort stage:

* * *

ds5: {op5[4p] (parallel inserted tsort operator {key={value=MBR_SYS_ID, subArgs={asc}}}(0) in IndvIdJoin)
[pp] eSame=>eCollectAny
op7[4p] (parallel APT_JoinSubOperatorNC in IndvIdJoin)}

[...]

op7[4p] {(parallel APT_JoinSubOperatorNC in IndvIdJoin)
on nodes (
node1[op7,p0]
node2[op7,p1]
node3[op7,p2]
node4[op7,p3]
)}

* * *

If you have a keen eye, you might have already noticed a problem with this particular dump score report: one of the two input links for that sort stage ("op7") is partitioned using modulus order ("ds0"), while the other input link is partitioned using hash partitioning ("ds4"). The hash partitioning overrode the initial modulus partitioning request ("ds3").


The first modulus insertion was overridden because the engine detected that the job design did not supply the required fields. Quite frequently, the key fields are supplied in the wrong order, or the job is using different key fields that break the compatibility of the data order requirements for the downstream stages. It is important to review the APT_DUMP_SCORE report and confirm that your valid job design is interpreted correctly by the parallel engine. Ensure that your intended design is correctly implemented.

* * *

One additional detail from the above dump score you may be wondering about

op6[4p] {(parallel buffer(0))
on nodes (
node1[op6,p0]
node2[op6,p1]
node3[op6,p2]
node4[op6,p3]
)}


Buffer operators are specifically for instances where the downstream operator is at risk of getting overloaded with data while it is processing. Buffer operators are an attempt to produce a buffer zone where two things happen:
  1. The buffer operator communicates with the upstream operator to slow down its sending of data.
  2. The buffer operator holds on to the data until the downstream operator is ready for the next block of data.

If you are wondering why your job is running slower than other jobs, seeing the number of buffers in effects provides a clue. Buffer operators serve a valuable purpose: the prevention of race conditions between operators. Disabling buffering can cause severe and hard-to-analyze problems. However, better job design can reduce the amount of buffering that occurs.

* * *

The following detail is not seen above, but is possible:

op1[2p] {(parallel APT_CombinedOperatorController:
(APT_TransformOperatorImplV0S1_TrafoTest1_Transformer_1 in Transformer_1)
(Peek_2)
) on nodes (
node1[op1,p0]
node2[op1,p1]
)}


Data sets take up memory. As part of optimization, jobs try to combine multiple operators that handle data in the same way (for example, without any requirement to change the partition or sort order for the data flow). Data is immediately handed off to the next operator when processing is completed in the prior operator with significantly less memory impact.

In the immediately proceeding example, there are two operators that are combined: a Transformer and a Peek, running on two partitions.

When the job log indicates an error occurred in "APT_CombinedOperator", the APT_DUMP_SCORE report can help you identify which of the combined operators is suspect. To better pinpoint the culprit, enable the environment variable APT_DISABLE_COMBINATION. The environment variable can help you to identify the stage with the error.


Self-training exercises:
  1. What does the dump score of your job looks like when you run it using 1 node configuration file compared to a multiple nodes configuration file?
  2. What happens to the dump score when you enable APT_DISABLE_COMBINATION on a job?
  3. How does the APT_DUMP_SCORE determine multiple inputs and outputs for the jobs, such as multiple references, and multiple output files?
  4. What happens to the dump score if, instead of using the internal sort option of the join stage, the customer uses a separate sort stage?
  5. What happens if the following environment variables are set: APT_NO_PART_INSERTION? APT_NO_SORT_INSERTION? For 8.7 and higher - APT_NO_PARTSORT_OPTIMIZATION?
  6. What happens to the symbol between the partitioning method and eCollectAny when you change the partitioning methods for a single link? How does it look like at the target data set?
  7. What happens when you add a sequential file for input and output with no other operators in between?

Cross reference information
Segment Product Component Platform Version Edition
Information Management InfoSphere Information Server AIX, HP-UX, Linux, Solaris, Windows 8.7, 8.5, 8.1, 8.0, 7.5

Rate this page:

(0 users)Average rating

Add comments

Document information


More support for:

InfoSphere Information Server

Software version:

7.5, 8.0, 8.1, 8.5, 8.7

Operating system(s):

AIX, HP-UX, Linux, Solaris, Windows

Software edition:

Enterprise

Reference #:

1595704

Modified date:

2013-04-09

Translate my page

Machine Translation

Content navigation