Evaluating data deduplication results

You can evaluate the effectiveness of Tivoli® Storage Manager data deduplication by examining various queries or reports. Actual data reduction results can show whether the expected storage savings are achieved. You can also evaluate other key operational factors, such as database utilization, to ensure that they are consistent with expectations.

Before you begin

Consider the following factors when you are evaluating data deduplication results:

Therefore, it is important to collect results at regular intervals so that the values you record are valid.
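One way to collect results at regular intervals is to append each reading to a timestamped log. The following Python sketch is only an illustration: the function names `format_sample` and `record_sample`, the CSV layout, and the idea of logging a single percentage per storage pool are assumptions and are not part of Tivoli Storage Manager.

```python
import csv
from datetime import datetime, timezone


def format_sample(pool_name, pct_reduction, when=None):
    """Build one log row: ISO timestamp, pool name, percent reduction."""
    # Hypothetical layout; choose whatever columns suit your reporting.
    when = when or datetime.now(timezone.utc)
    return [when.isoformat(timespec="seconds"), pool_name, f"{pct_reduction:.1f}"]


def record_sample(log_path, pool_name, pct_reduction, when=None):
    """Append one reading to a CSV log file and return the row written."""
    row = format_sample(pool_name, pct_reduction, when)
    # newline="" lets the csv module control line endings on every platform.
    with open(log_path, "a", newline="") as handle:
        csv.writer(handle).writerow(row)
    return row
```

Calling `record_sample` from a scheduled job, for example once a day after reclamation, would build a history that can be compared across intervals.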

Procedure

Use the following commands and tools to help you evaluate data deduplication effectiveness:
Action Explanation
Use the QUERY STGPOOL server command to quickly check deduplication results. The Duplicate Data Not Stored field shows the actual reduction of data, in megabytes or gigabytes, and the percentage of reduction of the storage pool. For example, issue the following command:
query stgpool format=detailed
If the query is run before reclamation of the storage pool, the Duplicate Data Not Stored value is not accurate because it does not reflect the most recent data reduction. If reclamation has not yet taken place, issue the following command to show the amount of data to be removed:
show deduppending backuppool-file
Where backuppool-file is the name of the deduplicated storage pool.
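If you record the Duplicate Data Not Stored value over time, it can help to convert the field into numbers. The following Python sketch assumes that the field value looks like `4,065 M (42%)`, that is, an amount with comma thousands separators, a unit of M or G, and a percentage in parentheses. That format, the function name `parse_not_stored`, and the sample values are assumptions; check them against the output of your own server before relying on the pattern.

```python
import re

# Assumed field layout: "<amount> <M|G> (<percent>%)", for example "4,065 M (42%)".
_FIELD = re.compile(r"^\s*([\d,]+(?:\.\d+)?)\s*([MG])\s*\((\d+)%\)\s*$")


def parse_not_stored(value):
    """Return (megabytes, percent) parsed from a Duplicate Data Not Stored value."""
    match = _FIELD.match(value)
    if match is None:
        # Anything outside the assumed layout is rejected rather than guessed at.
        raise ValueError(f"unrecognized field value: {value!r}")
    amount = float(match.group(1).replace(",", ""))
    # Normalize gigabytes to megabytes so that readings are comparable.
    megabytes = amount * (1024 if match.group(2) == "G" else 1)
    return megabytes, int(match.group(3))
```

Values that do not match the assumed layout raise `ValueError`, so a change in output format is noticed instead of being recorded as a wrong number.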
Use the QUERY OCCUPANCY server command. This command shows the logical amount of storage per file space when a file space is backed up to a deduplicated storage pool.
Examine the Tivoli Storage Manager client backup reports to see the data reduction for a backup operation that is run with client-side data deduplication and compression. The backup reports are available upon the completion of backup operations.

Over time, if the backup reports repeatedly show little to no data reduction after many backups, consider redirecting the client node to a non-deduplication storage pool if one is available. This way, the client does not waste time processing data that is not a good candidate for data deduplication.
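The decision described above can be sketched as a simple check over the reduction percentages taken from successive backup reports. The function name `should_redirect` and the defaults of 10 backups and a 5% threshold are illustrative assumptions, not IBM recommendations; choose values that match your own expectations for "many backups" and "little to no data reduction".

```python
def should_redirect(reductions_pct, min_backups=10, threshold_pct=5.0):
    """Return True when the most recent backups all show negligible reduction.

    reductions_pct: reduction percentages from successive backup reports,
    oldest first. min_backups and threshold_pct are assumed defaults.
    """
    if len(reductions_pct) < min_backups:
        return False  # not enough history to judge yet
    recent = reductions_pct[-min_backups:]
    # Redirect only if every one of the recent backups is below the threshold.
    return all(pct < threshold_pct for pct in recent)
```

Requiring every recent backup to fall below the threshold keeps a single poor result from triggering a redirect.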

Run the deduplication report script to show information about the effectiveness of data deduplication. The report provides details of deduplication-related utilization of the Tivoli Storage Manager database. You can also use it to gather diagnostic information when the deduplication results are not consistent with your expectations.

You can obtain the script and usage instructions for the script at http://www.ibm.com/support/docview.wss?uid=swg21596944.

What to do next

For more information, see Effective Planning and Use of IBM® Tivoli Storage Manager V6 Deduplication at http://www.ibm.com/developerworks/mydeveloperworks/wikis/home/wiki/Tivoli Storage Manager/page/Effective Planning and Use of IBM Tivoli Storage Manager V6 Deduplication.