How can I estimate the amount of virtual storage required for translating large documents?

Technote (FAQ)


Question

How can I estimate the amount of virtual storage required for translating large documents?
What is the memory utilization formula for WebSphere Data Interchange?

Answer

This formula applies to Send and Receive maps only. The answer depends on whether Pageable Translation, Incremental Translation, both, or neither is in effect. Pageable and Incremental Translation are individual enhancements that dramatically improve memory utilization when translating large transactions. The following formula can be used to estimate the high-water storage mark for inbound or outbound translation of large files WITHOUT the use of either of these features. See the related links for more information about these enhancements and for the adjustments to the formula when they are enabled.


Virtual Storage = (x + y) + (z * 120 bytes) + 4MB overhead

Description of variables x, y, and z:

(x) the size of the largest transaction image in application format (if there is more than one transaction set within the EDI interchange, consider only the size of the largest transaction's application image)
(y) the size of the EDI interchange image
(z) the number of records in the largest application image (x)

Note: WDI (WebSphere Data Interchange) is a 31-bit application and can address approximately 1.5 GB of total virtual storage for the above variables combined.
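
As a quick illustration, here is a minimal sketch (in Python) of the formula above; the input values are hypothetical examples, not measurements from a real translation.

```python
# Sketch of the high-water virtual storage estimate for WDI translation.
MB = 1024 * 1024

def estimate_virtual_storage(x_bytes, y_bytes, z_records):
    """x_bytes:   size of the largest transaction image in application format
       y_bytes:   size of the EDI interchange image
       z_records: number of records in the largest application image"""
    return (x_bytes + y_bytes) + (z_records * 120) + 4 * MB

# Example: 50 MB application image, 5 MB interchange, 100,000 records.
estimate = estimate_virtual_storage(50 * MB, 5 * MB, 100_000)
print(f"Estimated high-water storage: {estimate / MB:.1f} MB")  # ~70 MB
# Compare against the ~1.5 GB 31-bit addressing limit noted above.
```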

Tips for deriving the variable values:

Receive Map Processing:

"y" is in the input, so its size is a given. If you have an input file with multiple EDI interchanges, only consider the size of the largest interchange.
"z" takes a little thought. When dealing with large transactions, there is typically one main detail loop that repeats over and over. Non-repeating sections (header or trailer) are insignificant. Usually a consistent record (or set of records) is output on behalf of one detail loop iteration; thus, you can easily derive the record count given the number of detail loop iterations.
"x" is the hardest to determine, because it is part of the output. Since it would be desirable to estimate memory usage before executing the translator, try one of the following two techniques for estimating "x":
1. As mentioned for the "z" estimation above, when dealing with large transactions, there is typically one main detail loop that repeats over and over. Non-repeating sections (header or trailer) are insignificant. If a consistent record (or set of records) is output on behalf of one detail loop iteration, multiply the record size (or combined record sizes) by the number of detail loop iterations. Remember that WebSphere Data Interchange strictly uses the application data format for obtaining storage lengths. Even unused fields at the end of a record are padded to accommodate the full "as defined" length. (A worked sketch of this technique appears after these tips.)
2. Use ratios. Run a scaled-down version of the input so that the job completes and produces output. Use a variable block sequential output file and check the amount of DASD utilized. (We find using 47 KB/track works fine for an estimate.) Be sure to use variable block and specify an LRECL long enough to hold the longest Application Data Format record. Determine the ratio of "x" to "y" and replace "x" with a factor of "y". Typically, "x" is many times larger than "y", because "x" is fixed-length data whereas "y" is variable-length, delimited data. A factor of 10 for x/y is not unusual. Ratios will only be accurate if the scaled-down test reflects the same input pattern and there are enough iterations to make any non-repeating data insignificant.
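
Below is a minimal sketch (in Python) of technique 1 for a receive map; the loop count, record lengths, and interchange size are hypothetical assumptions, not values from a real translation.

```python
# Hypothetical receive-map estimate using technique 1: derive "x" and "z"
# from the main detail loop. All input values below are assumptions.
MB = 1024 * 1024

detail_loop_iterations = 200_000   # iterations of the main detail loop
record_lengths = [500, 300]        # full "as defined" record lengths, padding included

z = detail_loop_iterations * len(record_lengths)   # records in the application image
x = detail_loop_iterations * sum(record_lengths)   # application image size in bytes
y = 16 * MB                        # size of the input EDI interchange (a given on receive)

virtual_storage = (x + y) + (z * 120) + 4 * MB
print(f"x = {x / MB:.1f} MB, z = {z:,} records")
print(f"Estimated high-water storage: {virtual_storage / MB:.1f} MB")
```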

Send Map Processing:

"x" is the input, so its size is a given. Be sure to consider that WebSphere Data Interchange strictly uses the application data format for obtaining storage lengths. Even unused fields defined at the end of a structure are padded to accommodate the full "as defined" length. In other words, it's possible that the input file's record length is shorter than the length actually defined in the data format. If so, the physical input size could be significantly less than the storage DataInterchange actually allocates. Also, if more than one transaction is in the input, only consider the largest image.
"z" takes a little thought. When dealing with large transactions, typically, one main detail loop repeats over and over. Non-repeating sections (header or trailer) are insignificant. Usually a consistent record (or set of records) is output on behalf of one detail loop iteration; thus, you can easily derive the record count given the number of detail loop iterations.
"y" is the hardest to determine because it's the output. Since it would be desirable to estimate memory usage before executing the translator, try using ratios.
Run a scaled-down version of the input so that the job completes and produces output. Do not segment the output (keep it wrapped) and use a fixed block sequential file. Check the amount of DASD utilized. (We find using 47 KB/track works fine for an estimate.)
Determine the ratio of "y" to "x" and replace "y" with a factor of "x". Typically, "x" is many times larger than "y", because "x" is fixed-length data whereas "y" is variable-length, delimited data. A factor of 0.1 for y/x is not unusual. Ratios will only be accurate if the scaled-down test reflects the same input pattern and there are enough iterations to make any non-repeating data insignificant. (A worked sketch of this ratio technique appears below.)
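
Here is a minimal sketch (in Python) of the ratio technique for a send map; the scaled-down run sizes, full-size input length, and record count are hypothetical assumptions.

```python
# Hypothetical send-map estimate using the ratio technique.
# All input values below are assumptions, not real measurements.
MB = 1024 * 1024

# Observed from a scaled-down run (output DASD usage at ~47 KB/track):
scaled_x = 10 * MB              # application-format input of the test run
scaled_y = 1 * MB               # EDI output produced by the test run
ratio = scaled_y / scaled_x     # a y/x factor of about 0.1 is not unusual

# Full-size job: "x" is known from the input; estimate "y" by ratio.
x = 400 * MB
y = x * ratio
z = 800_000                     # records in the largest application image

virtual_storage = (x + y) + (z * 120) + 4 * MB
print(f"Estimated y: {y / MB:.1f} MB")
print(f"Estimated high-water storage: {virtual_storage / MB:.1f} MB")
```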


Related information

What is Incremental Translation
What is Pageable Translation


Document information

More support for: WebSphere Data Interchange
Software version: 3.2, 3.3
Operating system(s): AIX, Windows, z/OS
Reference #: 1378594
Modified date: 2013-03-27
