Tivoli Storage Manager V6 server might crash if log is not sized appropriately

Flash (Alert)


Abstract

The Tivoli Storage Manager V6 server might crash with a message that indicates the server is out of log space if the active log and archive log directories are not sized adequately for the transaction load for the server.

Content

The Tivoli Storage Manager V6 server might crash if the active log space is exhausted. This document provides guidance for sizing the logs for a V6 server. This guidance applies to the V6.1, V6.2, and V6.3 releases of the version 6 server.


Key concepts

Upgrading from a V5 server to V6
Deploying a new V6 server
Effects of deduplication usage on recovery log requirements

Key concepts

With V6.1, V6.2, or V6.3 the recovery log now consists of two primary storage locations: the active log and the archive log. There are additional, optional log storage locations, but the focus of this document is the two required storage locations.

The active log is used to store current in-flight transactions for the server. For example, if the server has 10 backup-archive client sessions performing backups, the transactions used by those sessions are represented in the active log and used to track changes to the server database such as the insertion, deletion, or update to records for tables within the server database. The active log must be large enough to hold the largest concurrent workload that the server will encounter. Or put another way, it must be able to store the data that represents in-flight transactions for the largest concurrent workload that the server will support.

The archive log is used as a storage repository for log files that were previously active and no longer contain references for in-flight transactions. So, after a log file in the active log directory is no longer referenced by an in-flight transaction, that file is eligible to be moved to the archive log directory where it is then retained. The movement of log files from the active log directory to the archive log directory is done automatically.

The active log uses the space available to it based on the defined active log size. As new transactions are started in response to server activities, the head of the log moves forward and these new transaction records are added at the head of the active log (the most recent activity). The tail of the active log is continually truncated as the oldest in-flight transactions complete, which allows active log files to become inactive and eligible to be archived to the archive log directory. The active log must have sufficient disk space to hold all transactions from the head of the active log to the tail of the active log, as well as entries that are eligible to be moved to the archive log but have not yet been moved.

The archive log size is not maintained at a predefined size like the active log. The archive log stores all inactive log files based on retention policies that the server manages. These policies are not configurable by administrators. Archive log files are automatically pruned (deleted) after they are no longer needed. This pruning is a function of full database backup cycles.

Full database backups are considered in this processing of the archive log. When any database backup is performed, the database backup contains the actual database pages that are allocated and in use for the database. The database backup also contains the active log and archive log information that is necessary to represent transactional consistency for the database data that is in that database backup. Because the V6 server operates in ROLLFORWARD recovery mode, the archive log files must be kept to preserve all the transactional changes to the server database from the time of the last full database backup. In the rest of this document, a full database backup cycle represents both of the following:

  • The time between one full database backup and the next.
  • All the transactional changes that were recorded from the time that the previous full database backup was done, until the time that the next full database backup is done.

The Tivoli Storage Manager server requires that archive log files representing the last two full database backup cycles be retained in the archive log space at all times. In addition to these two cycles, the archive log must also accommodate the current transaction load: the files from the active log that become inactive and are eligible to be moved to the archive log. This increases the archive log space requirement from two times the space needed to store the transactional data for a single database backup cycle to as much as three times that space.

Use the following information to help you set the size of the active log and the archive log space that is assigned to a V6 server. The examples and discussion give some basic values that can be used for estimation purposes. Keep in mind that larger values might need to be considered in your environment.

Upgrading from a V5 server to V6

Before V6, the recovery log was a single log storage structure that stored transaction data representing modifications such as the insertion, update, or deletion of records for tables in the server database. How the recovery log behaved was governed by the log mode for the server. In NORMAL mode, the log simply stored in-flight transactions; when the oldest in-flight transaction completed (committed or aborted), the recovery log could recover that space through an operation referred to as log truncation. In ROLLFORWARD mode, the recovery log stored all transaction records up to the point that a database backup was performed. This allowed the server to be recovered, using the 'DSMSERV RESTORE DB' utility, to any point in time that could be represented by an existing database backup plus the transaction information stored in the recovery log.

Active log

Start with an active log size that is twice the size of the recovery log for the existing V5 server. For example, if the V5 recovery log is 10 GB, start with an active log size of 20 GB for the upgraded V6 server.

Archive log

To estimate the size for the archive log, first estimate the current transaction workload for a single full database backup cycle for the V5 server. Then use a multiplier to set the initial size for the upgraded V6 server.

  1. Before the next regularly scheduled, full database backup, issue the command:

    SHOW LOGV

    From the results, record the log head LSN (HeadLsn) value, which has the form: XXXXX.YYY.ZZZZ

    For example, this might be a value like: 140312.105.1011

  2. Let the full database backup run and then allow the server to operate normally.

  3. At the next scheduled full database backup, again issue the command:

    SHOW LOGV

    Record the new log head LSN (HeadLsn) value. For example, this might now be a value like: 158600.88.16

  4. To estimate the transaction activity for this server between full database backups, find the difference between the two XXXXX values in the log head LSN values. Using the example values in the preceding steps, you calculate:

    158600 - 140312 = 18288

    This value is in megabytes (MB), so the transaction workload for a single database backup cycle for this example is approximately 18 GB.

  5. Multiply the estimated transaction workload by 3 to obtain a conservative estimate of the space needed for the archive log. Using the example values from the previous step:

    18 GB * 3 = 54 GB
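The LSN arithmetic in these steps can be sketched as a short script. This is an illustrative calculation only; the HeadLsn values are the example figures from the steps above, and the function name is hypothetical, not part of any TSM tooling.

```python
def archive_log_estimate_mb(head_lsn_before: str, head_lsn_after: str,
                            multiplier: int = 3) -> int:
    """Estimate archive log space (MB) from two HeadLsn readings.

    HeadLsn has the form XXXXX.YYY.ZZZZ, where the leading XXXXX field
    advances in megabytes. The difference between two readings taken one
    full database backup cycle apart approximates the transaction workload
    for that cycle; multiply by 3 for a conservative archive log estimate.
    """
    before_mb = int(head_lsn_before.split(".")[0])
    after_mb = int(head_lsn_after.split(".")[0])
    return (after_mb - before_mb) * multiplier

# Using the example HeadLsn values from the steps above:
print(archive_log_estimate_mb("140312.105.1011", "158600.88.16"))  # 18288 MB x 3 = 54864 MB, about 54 GB
```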



Deploying a new V6 server

Make a first estimate using the basic approach, then consider the additional variations that might apply to your plans for the new server.

When deploying a new version 6.1, 6.2, or 6.3 Tivoli Storage Manager server, estimate the size of the recovery log based on the expected number of files to be stored nightly by the server. The number of files stored for a typical nightly schedule window is just one way to estimate this. As part of the estimate, consider the total number of files to be backed up, archived, and migrated (by HSM clients) to server storage in a typical night. The estimate requires knowledge about expected workloads, which the administrator or team implementing the server must supply.

Basic approach

Estimate the log space needed to handle all transactions that might be in progress at one time. The calculation is:

number of clients performing operations x files stored per transaction x log space needed per file

  • Number of clients performing operations: The typical number of client nodes that are expected to perform backup, archive, or space management operations every night. Example value: 300
  • Files stored per transaction: The server option setting for TXNGROUPMAX. The default value is 4096. Example value: 4096
  • Log space needed per file: The value used in this example is based on results of tests performed under laboratory conditions. The value of 3053 bytes per file in a transaction represents the log bytes needed when backing up files from a Windows server where the file names vary in length from 12 to 120 bytes. Tests consisted of clients performing backup operations to a DISK storage pool, because these pools have increased log overhead and use compared to sequential media storage pools. A value larger than 3053 bytes might need to be considered if the data being stored has file names that are longer than 12 - 120 bytes. Example value: 3053 bytes
  • Active log - recommended starting size: (300 clients x 4096 files per transaction x 3053 bytes per file) / 1,073,741,824 bytes = 3.5 GB
    Note: This and other calculations in this document use the conversion 1 GB = 1,073,741,824 bytes.
  • Archive log - recommended starting size: Because archive logs must be stored across 3 backup cycles, multiply the estimate for the active log by 3: 3.5 GB x 3 = 10.5 GB

The total concurrent workload in this example (the client transactions in progress at one time) is

300 clients x 4096 files per client transaction = 1,228,800
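The basic approach can be expressed as a small sketch using the example values above. The function names and defaults are illustrative; 3053 bytes per file and a TXNGROUPMAX of 4096 are the estimation values this document suggests.

```python
GIB = 1_073_741_824  # 1 GB, as used for conversions throughout this document

def active_log_gb(clients: int, files_per_txn: int = 4096,
                  bytes_per_file: int = 3053) -> float:
    """Recommended starting active log size (GB) for concurrent client operations."""
    return clients * files_per_txn * bytes_per_file / GIB

def archive_log_gb(active_gb: float, cycles: int = 3) -> float:
    """Archive log estimate: the active log estimate times the retained backup cycles."""
    return active_gb * cycles

active = active_log_gb(300)
print(round(active, 1), round(archive_log_gb(active), 1))  # 3.5 10.5
```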

Effects of clients using multiple sessions

If clients set the client option RESOURCEUTILIZATION to a value that is greater than the default, then the concurrent workload for the server increases. The calculation then becomes:

number of clients performing operations x sessions per client x files stored per transaction x log space needed per file

  • Number of clients performing operations: The typical number of client nodes that are expected to perform backup, archive, or space management operations every night. Example values: 300; 1000
  • Possible sessions per client: The client option RESOURCEUTILIZATION is set greater than the default so that each client runs with up to a maximum of 3 sessions in parallel. Example values: 3; 3
  • Files stored per transaction: The server option setting for TXNGROUPMAX. The default value is 4096. Example values: 4096; 4096
  • Log space needed per file: Suggested value based on laboratory test results. Example values: 3053 bytes; 3053 bytes
  • Active log - recommended starting size: (300 x 3 x 4096 x 3053) / 1,073,741,824 = 10.5 GB; (1000 x 3 x 4096 x 3053) / 1,073,741,824 = 35 GB
  • Archive log - recommended starting size: Because archive logs must be stored across 3 backup cycles, multiply the estimate for the active log by 3: 10.5 GB x 3 = 31.5 GB; 35 GB x 3 = 105 GB
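With multiple sessions per client, the same sketch gains a sessions multiplier. The values assumed here (3 sessions, 4096 files per transaction, 3053 bytes per file) are the example figures used above; the function name is illustrative.

```python
GIB = 1_073_741_824  # 1 GB conversion used in this document

def active_log_gb(clients: int, sessions_per_client: int,
                  files_per_txn: int = 4096, bytes_per_file: int = 3053) -> float:
    # Concurrent sessions multiply the in-flight transaction load linearly.
    return clients * sessions_per_client * files_per_txn * bytes_per_file / GIB

for clients in (300, 1000):
    active = active_log_gb(clients, sessions_per_client=3)
    print(clients, round(active, 1), round(active * 3, 1))  # active log, then archive log (x3 cycles)
```

The printed archive values differ very slightly from the worked figures above because the document rounds the active log estimate before multiplying by 3.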

Effects of backups that use simultaneous write operations

If client backup operations use storage pools that are configured for simultaneous write, the amount of log space that is needed per file becomes larger. The log space that is required for each file increases by about 200 bytes per copy storage pool that is used for the simultaneous write operation. For example, if the data is stored to two copy storage pools in addition to the primary storage pool, the log bytes estimate per file is increased by 400 bytes. If you use the suggested basic value of 3053 bytes per file, then the total value becomes 3453 bytes.

  • Number of clients performing operations: The typical number of client nodes that are expected to perform backup, archive, or space management operations every night. Example value: 300
  • Files stored per transaction: The server option setting for TXNGROUPMAX. The default value is 4096. Example value: 4096
  • Log space needed per file: Suggested value based on laboratory test results (3053 bytes per file), plus 200 bytes per copy storage pool being written to in simultaneous write operations. Example value: 3453 bytes
  • Active log - recommended starting size: (300 clients x 4096 files per transaction x 3453 bytes per file) / 1,073,741,824 bytes = 4.0 GB
  • Archive log - recommended starting size: Because archive logs must be stored across 3 backup cycles, multiply the estimated value for the active log by 3: 4 GB x 3 = 12 GB
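The per-file adjustment for simultaneous write can be sketched as follows. The function name is hypothetical; the 200-bytes-per-copy-pool figure is the estimate given above.

```python
GIB = 1_073_741_824  # 1 GB conversion used in this document

def bytes_per_file_with_copies(base_bytes: int = 3053, copy_pools: int = 0,
                               bytes_per_copy: int = 200) -> int:
    # Each copy storage pool in a simultaneous-write operation adds about
    # 200 bytes of log space per file stored.
    return base_bytes + copy_pools * bytes_per_copy

per_file = bytes_per_file_with_copies(copy_pools=2)   # primary pool plus two copy pools
active_gb = 300 * 4096 * per_file / GIB
print(per_file, round(active_gb, 1))  # 3453 4.0
```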

Effects of server operations on log space

The examples and discussion so far assume that the client data-storage operations are done in isolation. Migration of data in server storage, deduplication (identification processes), reclamation, expiration, and even other administrative tasks such as administrative commands or SQL from administrative clients might be run concurrently with client data-storage operations. Any of these server operations that occur during the time for the estimated client workload can increase the active log space that is required.

For example, migration of files from the DISK storage pool to a sequential-media storage pool (DEVTYPE=FILE) uses approximately 110 bytes of log space per file migrated. If you have 300 clients that each back up 100,000 files every night, and all the files are initially stored on DISK and then migrated to a sequential-media storage pool, estimate the active log space for the data migration as follows:

300 clients x 100,000 files/client x 110 bytes = 3.1 GB

Add this value to the estimate for the active log space that was based on client data-storage operations.
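The migration estimate can be sketched the same way, assuming the approximate figure of 110 log bytes per migrated file given above; the function name is illustrative.

```python
GIB = 1_073_741_824  # 1 GB conversion used in this document

def migration_log_gb(clients: int, files_per_client: int,
                     bytes_per_migrated_file: int = 110) -> float:
    # DISK-to-sequential-media migration uses roughly 110 log bytes per file.
    return clients * files_per_client * bytes_per_migrated_file / GIB

print(round(migration_log_gb(300, 100_000), 1))  # 3.1
```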

Effects of extreme variation within the typical workload

The estimates to this point assume that the client workloads are somewhat homogeneous, in particular in the duration of transactions. Problems with the active log space can occur if you have both a large number of transactions that complete quickly, and also transactions that take much longer by comparison. If the mix of client workloads and the relative amount of time needed for specific transactions to complete is somewhat heterogeneous, you might need to increase the active log size to compensate for the timing differences.

Effects of TYPE=FULL database backup on the size of the archive log

As discussed above, the archive log is pruned when a full database backup is performed. As a result, when you estimate the space needed for the archive log, you also must consider the frequency of full database backups. For example, if FULL database backups are only performed weekly, then the archive log space must be able to contain the archive log information for a full week.


  • Number of clients performing operations: The typical number of client nodes that are expected to perform backup, archive, or space management operations every night. Example value: 300
  • Files stored per transaction: The server option setting for TXNGROUPMAX. The default value is 4096. Example value: 4096
  • Log space needed per file: Suggested value based on laboratory test results (3053 bytes per file), plus 200 bytes per copy storage pool being written to in simultaneous write operations. Example value: 3453 bytes
  • Active log - recommended starting size: (300 clients x 4096 files per transaction x 3453 bytes per file) / 1,073,741,824 bytes = 4.0 GB
  • Archive log - recommended starting size with a daily full database backup: Because archive logs must be stored across 3 backup cycles, multiply the estimated value for the active log by 3: 4 GB x 3 = 12 GB
  • Archive log - recommended starting size with a weekly full database backup: Multiply the estimated value for the active log by 3, and then multiply by the number of days between full database backups: (4 GB x 3) x 7 = 84 GB
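The effect of backup frequency can be folded into the archive log estimate as a simple multiplier. The figures are the example values used above; the function name is hypothetical.

```python
def archive_log_gb(active_gb: float, cycles: int = 3,
                   days_between_full_backups: int = 1) -> float:
    # The archive log must cover 3 full database backup cycles, and each
    # cycle spans the number of days between full database backups.
    return active_gb * cycles * days_between_full_backups

print(archive_log_gb(4))                               # daily full backups: 12 GB
print(archive_log_gb(4, days_between_full_backups=7))  # weekly full backups: 84 GB
```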

Effects of deduplication usage on recovery log requirements

Whether you are upgrading a V5 server to V6.1, V6.2, or V6.3 or installing a new V6.1, V6.2 or V6.3 server, you need to consider the effects of deduplication when you estimate the recovery log requirements. The effects depend on the following factors:

  • The amount of deduplicated data
  • The timing and number of the duplicate-identification processes (IDENTIFY processes)
  • The sizes of the files being deduplicated

The effect of deduplication on the active log and archive log space that is needed for the server varies depending on the deduplication ratio of the data. A higher percentage of data that can be deduplicated means that more log space is required. Approximately 1500 bytes of active log space is required per file chunk that is identified by the duplicate identification process. For example, if the number of duplicate chunks that are identified by the duplicate identification process is 250,000, then the active log space needed for this operation is estimated to be 358 MB.

For example, if 300 clients back up 100,000 files each night, then 30,000,000 files is the nightly workload. If among those 30,000,000 files, there are 60,000,000 deduplicable chunks, the total archive log space requirement for deduplication is estimated to be 84 GB.

In this scenario, the active log impact of the 60,000,000 extents depends on the TXNGROUPMAX server option setting. The IDENTIFY process operates on aggregates (groups) of files based on how many files were stored in a given transaction. If the average number of extents per file is 2 (60,000,000 / 30,000,000) and the number of files in a transaction is 4096, then there are 8192 extents per aggregate, which results in about 12 MB of active log space used per transaction of 4096 files with 8192 extents.

The next consideration relative to the active log size needed for IDENTIFY processing is how many processes are being run. If there are 10 IDENTIFY processes running in parallel, then the concurrent load on the active log is 12 MB * 10 or 120 MB.
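The per-transaction IDENTIFY load can be sketched using the approximate figure of 1500 active log bytes per identified extent cited above; the function name is illustrative.

```python
MIB = 1_048_576          # 1 MB in bytes
BYTES_PER_EXTENT = 1500  # approximate active log bytes per identified duplicate extent

def identify_log_mb(extents_per_transaction: int, parallel_processes: int = 1) -> float:
    # Each running IDENTIFY process holds roughly one transaction's worth
    # of log records, so parallel processes multiply the concurrent load.
    return extents_per_transaction * BYTES_PER_EXTENT * parallel_processes / MIB

print(round(identify_log_mb(8192)))  # about 12 MB per transaction
```

For 10 parallel processes, `identify_log_mb(8192, 10)` gives roughly 117 MB, in line with the approximately 120 MB figure above.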

Finally, consider the effect of very large files on the active log usage by IDENTIFY processing. For example, suppose a client backs up an 800-GB object, such as an image backup of a file system. This object can have a very high number of duplicate chunks; for example, the files included in the image might also have been backed up using incremental backup operations. For this discussion, assume that the number of duplicate chunks is 1.2 million extents. The 1.2 million extents in this one very large file represent a single transaction for an IDENTIFY process, and the transaction requires an estimated 1.7 GB of active log space.

The 1.7 GB of active log space might be easily attainable in isolation. But if many other activities that affect the active log happen at the same time, such as other IDENTIFY processes that are only processing 8192 extents per transaction, the active log can become constrained for space. The small transactions are intermixing with the large transaction that is identifying the extents for the 800 GB single object. If a storage pool that is enabled for deduplication has a mix of data with many relatively small files (file sizes in the 10's and 100's of KB) and a small number of very large objects that have a high percentage of duplicate chunks, then plan to increase the active log size by a factor of two. The issue is not only the raw space that is needed but the timing and duration of the transactions that require that space while other activities are concurrently processing. So, if the estimates based on other factors recommend a 25 GB active log size (23.3 GB plus 1.7 GB for deduplication of a large object), then if deduplication is also running, the active log size becomes 50 GB and the archive log then is 150 GB.

Consider the following example for log size estimates with deduplication enabled and a duplicate chunk size of 700 KB:

  • Size of largest single object to deduplicate: The granularity of deduplication processing is at the file level, so the largest single file to deduplicate represents the largest transaction and the corresponding load on the active and archive logs. Example values: 800 GB; 4 TB
  • Average size of deduplicated chunks: The deduplication algorithms use a variable-block approach, so not all deduplicated chunks for a given file are the same size. For estimation purposes, an average chunk size is used. Example values: 700 KB; 700 KB
  • Duplicate chunks for a given file: Using the average chunk size, the total number of duplicate chunks for the object. Example values: (800 GB / 700 KB) = 1,198,372 chunks; (4 TB / 700 KB) = 6,135,667 chunks
  • Active log space required to deduplicate a single object in a single IDENTIFY process: The estimated active log space needed for this transaction. Example values: 1.7 GB; 8.6 GB
  • Total active log size required: Considering other aspects of the server workload in addition to deduplication, multiply the existing estimate by two. In this case, the active log space required to deduplicate a single large object is added to the prior estimate for the active log size. Example values: (23.3 GB + 1.7 GB) x 2 = 50 GB; (23.3 GB + 8.6 GB) x 2 = 63.8 GB
  • Archive log size: The active log estimate times three. Example values: 50 GB x 3 = 150 GB; 63.8 GB x 3 = 191.4 GB

Consider the following example for log size estimates with deduplication enabled and a duplicate chunk size of 256 KB:

  • Size of largest single object to deduplicate: The granularity of deduplication processing is at the file level, so the largest single file to deduplicate represents the largest transaction and the corresponding load on the active and archive logs. Example values: 800 GB; 4 TB
  • Average size of deduplicated chunks: The deduplication algorithms use a variable-block approach, so not all deduplicated chunks for a given file are the same size. For estimation purposes, an average chunk size is used. Example values: 256 KB; 256 KB
  • Duplicate chunks for a given file: Using the average chunk size, the total number of duplicate chunks for the object. Example values: (800 GB / 256 KB) = 3,276,800 chunks; (4 TB / 256 KB) = 16,777,216 chunks
  • Active log space required to deduplicate a single object in a single IDENTIFY process: The estimated active log space needed for this transaction. Example values: 4.5 GB; 23.4 GB
  • Total active log size required: Considering other aspects of the server workload in addition to deduplication, multiply the existing estimate by two. In this case, the active log space required to deduplicate a single large object is added to the prior estimate for the active log size. Example values: (23.3 GB + 4.5 GB) x 2 = 55.6 GB; (23.3 GB + 23.4 GB) x 2 = 93.4 GB
  • Archive log size: The active log estimate times three. Example values: 55.6 GB x 3 = 166.8 GB; 93.4 GB x 3 = 280.2 GB
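The chunk counts and per-object log estimates in the two examples can be reproduced with a short sketch. The 1500-bytes-per-chunk figure and the object and chunk sizes come from the discussion above; the function names are illustrative.

```python
GIB = 1_073_741_824     # 1 GB
KIB = 1024              # 1 KB
BYTES_PER_CHUNK = 1500  # approximate active log bytes per identified duplicate chunk

def chunks_for_object(object_bytes: int, avg_chunk_bytes: int) -> int:
    # Number of duplicate chunks for an object at the average chunk size.
    return object_bytes // avg_chunk_bytes

def identify_log_gb(object_bytes: int, avg_chunk_bytes: int) -> float:
    # Active log space for a single IDENTIFY transaction over one object.
    return chunks_for_object(object_bytes, avg_chunk_bytes) * BYTES_PER_CHUNK / GIB

# Object size (GB) and average chunk size (KB) pairs from the examples:
for size_gb, chunk_kb in ((800, 700), (4096, 700), (800, 256), (4096, 256)):
    obj, chunk = size_gb * GIB, chunk_kb * KIB
    print(size_gb, chunk_kb, chunks_for_object(obj, chunk),
          round(identify_log_gb(obj, chunk), 1))
```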

As shown by the examples using 700 KB and 256 KB for the average duplicate chunk size, the active log space needed varies depending on how readily the data can be deduplicated and the size of the duplicate chunks that are present. For estimation purposes, use the average duplicate chunk size of 256 KB, because this results in a larger active log space estimate, which is less likely to result in operational issues for the server.

Product Alias/Synonym

TSM
ADSM
ITSM


Document information


More support for:

Tivoli Storage Manager
Server

Software version:

6.1, 6.2, 6.3

Operating system(s):

AIX, HP-UX, Linux, Solaris, Windows

Software edition:

Enterprise, Standard

Reference #:

1389352

Modified date:

2010-03-19
