What are the considerations related to using TSM for ERP in environments where deduplication is used?
When backing up data to a device that eliminates duplicate blocks (for example, TSM server-side deduplication or a hardware device that offers such functionality), the configuration of Tivoli Storage Manager for Enterprise Resource Planning (TSM for ERP) may affect the results of the deduplication process.
To ensure optimal deduplication results, changes to the data between two subsequent backups should be kept to an absolute minimum. This means that in addition to the changes made by the SAP application (which are the subject of the backup), the backup application (in this case, TSM for ERP) should introduce no, or minimal, changes as well.
The following recommendations are specific to the backint interface, that is, when the parameter "backup_dev_type" is set to "util_file" or "util_file_online" in the BR*Tools profile "init<SID>.sap".
The first goal is to preserve the order of data blocks. When the multiplexing feature of TSM for ERP is used (by setting the parameter "MULTIPLEXING" in the TSM for ERP profile to a value greater than 1), multiple data files are read in parallel and blocks from these files are multiplexed within a single data stream that is sent to the TSM server.
While this adapts slow disks on the SAP system to fast tape drives on the TSM Server, it may change the order of data blocks that are sent to the TSM Server for subsequent backups. To ensure the same order of blocks for each backup, this feature should be disabled (by setting "MULTIPLEXING" to a value of "1").
Instead of using the multiplexing feature, you can attempt to achieve a similar backup throughput by increasing the number of sessions. The number of sessions is controlled by the parameters "MAX_SESSIONS" and "SESSIONS" in the server stanzas associated with your backup.
Keep in mind that MULTIPLEXING specifies the number of files per session. So, if you previously had this configuration:
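A sketch of what such a TSM for ERP profile fragment might look like (the parameter names are from this article; the exact profile layout and file name, typically "init<SID>.utl", may vary by product version):

```
MULTIPLEXING 4
MAX_SESSIONS 3
SESSIONS     3
```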
TSM for ERP would read 12 data files in parallel (4 times 3) and store the data in 3 parallel sessions on the TSM Server. Therefore, when disabling the multiplexing feature, the number of sessions should be increased to at least 6 or 8 to achieve a similar throughput. For example:
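A sketch of the adjusted profile fragment, assuming 8 sessions are chosen as the starting point (again, the exact profile layout may vary by product version):

```
MULTIPLEXING 1
MAX_SESSIONS 8
SESSIONS     8
```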
Note: Though one may assume that 12 sessions would be required based on the previous example, in many cases 6 to 8 sessions are sufficient to achieve similar throughput. Therefore, the recommendation is to start with the lower number and to make additional adjustments as needed.
Another item that can impact deduplication is the block header inserted by TSM for ERP. Generally, the TSM for ERP block header has the same structure and inserts the same data at a fixed point in the stream. However, when the data in a database file shifts, the headers appear at different points in the final stream, which in turn impacts the deduplication of this data. Since each buffer has a block header, it is recommended to increase the buffer size used by TSM for ERP in order to reduce the number of headers that need to be inserted. A larger buffer size reduces the number of buffers required, which in turn reduces the number of headers that could be impacted. The size of the buffers is defined by the "BUFFSIZE" parameter. It is recommended to set the size in the range of 512 KB to 1 MB.
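For example, a buffer size of 1 MB could be set with a profile entry like the following (the value shown assumes "BUFFSIZE" is specified in bytes; verify the expected unit in the documentation for your product version):

```
BUFFSIZE 1048576
```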