In HADR environments, it is recommended to have the LOGINDEXBUILD configuration parameter turned on. However, this setting increases the log space usage during index building such as with online index reorg. This technote explains some of the extra logging requirements related to building indexes with LOGINDEXBUILD turned on so that you can better plan your transaction log space in an HADR environment.
LOGINDEXBUILD is a database configuration parameter which is turned off by default. It can also be set at a table-by-table basis to override the database configuration setting via the ALTER TABLE command with the LOG INDEX BUILD OFF option.
With the parameter turned off, when an index is built the logging done is minimal. That means that if forward recovery must replay through log files that contains the records for an index build, it will not actually perform the index build at that time. Instead, the index will be flagged as invalid and it will be rebuilt at a later time depending on the setting for the parameter INDEXREC.
In HADR environments, it is recommended to turn LOGINDEXBUILD on because HADR replication is based on the logging. If a failover happens with LOGINDEXBUILD turned off, it can add an undesirable delay for the index to be built before the standby becomes available.
When LOGINDEXBUILD is turned on, there is more logging that happens during an index build than with it off. This raises the question about how much logging is done during the index build of an online reorg with LOGINDEXBUILD turned on.
Accurately estimating the log space requirements for an index build during an online reorg is very difficult for the following reasons:
1) Log space is defined at the database level. As such, it is shared by all applications, not just the application doing the online index reorg. Therefore, we cannot make any log estimations in a general sense because we do not know about the log space being used by the other applications that are running concurrently with the online index reorg.
2) Since there can be concurrent updates to the same table that the online index reorg is running, it can impact the behaviour of the online index reorg itself and can result in more logging than if the reorg was done without updates happening against the table.
This document presents the background information about estimating log space usage during online index reorg with LOGINDEXBUILD turned on, and provides a formula that can be used as an initial guideline for defining log space, however actual workload testing should be done to narrow down the setting that will work best.
Log space estimation:
When an index is being built with full index logging turned on, there are several types of log records that will get written to the logs. A single log record type, called 'IndexCreatePage' log record for the purpose of this document, is the main log record that consumes the majority of the log space during an index build.
There are other log records that get written as well, however these are small in comparison to the IndexCreatePage record such that for this estimation, we'll assume that those other log records are neglible in size.
For each index page in the index, the index build will write the IndexCreatePage log record. We can present a rough formula that the amount of log space needed for the index build will be:
LogSpaceNeeded = NumIndexPages * IndexPageSize
where NumIndexPages is the total number of index pages of all indexes on the table. This value can be obtained by doing a db2 inspect on the table, or using db2pd with the -tcbstats option.
This formula could be construed as an overestimate of the log space needed. The reason for this being an overestimate is that we do not log the entire contents of the index page to the log file. The following items in the index page do not get logged in the IndexCreatePage log record:
- Page overhead (bufferpool page header)
- Free space in the index page
If a more accurate formula is needed, we need to subtract the items in the index page that do not get logged in the log record. The formula would now become:
LogSpaceNeeded = NumIndexPages * BytesLoggedPerIndexPage
= NumIndexPages * (IndexPageSize - Overhead)
Where Overhead is the free space and other overhead in the index page.
The calculation to determine the index page overhead and free space cannot be presented via a simple formula because these overheads depend on the PCTFREE setting, the type of index page (root, leaf, intermediate), and other overheads that are not easily calculated.
Another important note is that if the tablespace is a DMS tablespace, then there are some log records dealing with the space management that might contribute to the log space usage on top of the index log space usage mentioned above. For DMS tablespaces, you might want to add on the following amount of log space to your calaculation:
(140 * NumExtents) + 4156
NumExtents = IndexSize / ExtentSize + 1 if IndexSize % ExtentSize > 0,
NumExtents = IndexSize / ExtentSize, if IndexSize % ExtentSize == 0
The formula in bold is likely an overestimate of the log space required because it does not subtract the unused space in the index page. However, the purpose of doing reorg online is that there is likely other workload going on in the database at the same time and this will increase the logging requirements. Therefore, the potential overestimation of the log space with the formula may be balanced out by other logging happening from other applications that run concurrently.
Above we have hinted that a more complex and accurate formula might be created that takes into account index page free space and other overheads, but it is not practical to do so and the feasability of using such a formula in the "real world" is questionable.
You can use the formula in bold above as a starting point when defining your log space in this configuration. The best bet would be to use the formula to start with, add in the amount of log space that you think your other applications will be requiring, and then do some workload testing with database and application snapshot monitoring in place to view the log space usage. Then adjust your settings, and tune it further until you have the setting that works best for your workload requirements.