Downloadable files
Abstract
Comprestimator is a command-line, host-based utility that estimates the expected compression rate for block devices.
Download Description
Overview
The Comprestimator utility uses mathematical and statistical algorithms to sample and analyze devices quickly and efficiently. The utility also reports its accuracy level by showing the maximum error range of the results, based on the formulas it uses. The utility runs on a host that has access to the devices to be analyzed and performs only read operations, so it has no effect on the data stored on the device. The following sections describe how to install Comprestimator on a host and use it to analyze devices on that host. Depending on the environment configuration, Comprestimator is often run on more than one host in order to analyze additional data types.
It is important to understand block device behavior when analyzing traditional (fully allocated) volumes. Traditional volumes that were created without initially zeroing the device may contain traces of old data at the block device level. Such data is not accessible or viewable at the file system level. When Comprestimator analyzes such volumes, the expected compression results reflect the compression rate that will be achieved for all the data at the block device level, including the traces of old data. This simulates the volume mirroring process of the analyzed device into a compressed volume. Later, when volume mirroring is actually used to compress the data on the storage system, it processes and compresses all data on the device (both active data and traces of old data). After that, as more active data is stored on the compressed volume, traces of old data are gradually overwritten by the new data written to the volume. As more active data accumulates on the device, the compression rate adjusts to reflect the actual savings achieved for the active data. This block device behavior is limited to traditional volumes and does not occur when analyzing thinly provisioned volumes.
Regardless of the type of block device being scanned, it is also important to understand a few characteristics of how common file systems manage space. When files are deleted from a file system, the space they occupied is freed and becomes available to the file system. The data on disk is not actually deleted, however; only the file system index and pointers are updated to reflect the change. When Comprestimator analyzes a block device used by a file system, all underlying data on the device is analyzed, regardless of whether the data belongs to files that were already deleted from the file system. For example, you can fill a 100GB file system to 100% used and then delete all the files, making it 0% used. When scanning the block device that stores the file system in this example, Comprestimator (or any other utility, for that matter) will still access the data that belongs to the deleted files.
To reduce the impact of the block device and file system behavior described above, it is highly recommended to use Comprestimator on volumes that contain as much active data as possible rather than on volumes that are mostly empty. This increases the accuracy level and reduces the risk of analyzing old data that is already deleted but may still have traces on the device.
Note: Comprestimator may run for a long time (up to a few hours) when scanning a relatively empty device. The utility randomly selects and reads 256KB samples from the device. If a sample is empty (that is, full of null values), it is skipped. A minimum number of samples with actual data is required in order to provide an accurate estimate. When a device is mostly empty, many of the random samples are empty, so the utility runs longer until it has collected enough non-empty samples. If more than 95% of the samples are empty, the scan is aborted.
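The sampling behavior described in the note above can be illustrated with a minimal Python sketch. This is not Comprestimator's actual implementation: the function name scan, the read_sample callback, the minimum of 100 non-empty samples, and the 1000-draw warm-up before the 95% check is applied are all assumptions chosen for illustration. Only the 256KB sample size and the 95% limit come from the text.

```python
import random

SAMPLE_SIZE = 256 * 1024        # 256KB samples, as stated in the note
MAX_EMPTY_FRACTION = 0.95       # scan is aborted above this share of empty samples
MIN_DATA_SAMPLES = 100          # hypothetical; the real minimum is not documented here
MIN_DRAWS_BEFORE_ABORT = 1000   # hypothetical warm-up before the 95% check applies


def scan(read_sample, device_size, rng,
         min_data_samples=MIN_DATA_SAMPLES,
         min_draws_before_abort=MIN_DRAWS_BEFORE_ABORT):
    """Draw random samples until enough non-empty ones are collected.

    read_sample(offset, size) must return `size` bytes read from the device.
    Returns (data_samples, total_draws). Raises RuntimeError when the device
    looks mostly empty.
    """
    slots = device_size // SAMPLE_SIZE
    zero = bytes(SAMPLE_SIZE)   # reference buffer full of null values
    data, draws, empty = [], 0, 0
    while len(data) < min_data_samples:
        offset = rng.randrange(slots) * SAMPLE_SIZE
        chunk = read_sample(offset, SAMPLE_SIZE)
        draws += 1
        if chunk == zero:
            # Empty sample: skip it, which is why sparse devices take longer.
            empty += 1
            if draws >= min_draws_before_abort and empty / draws > MAX_EMPTY_FRACTION:
                raise RuntimeError("more than 95% of samples are empty; scan aborted")
            continue
        data.append(chunk)
    return data, draws
```

On a device where half of the 256KB regions are empty, roughly twice as many reads as required samples would be expected, which matches the longer runtimes described for sparsely used devices.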
Your primary resource for sizing and implementing:
Real-time Compression in SAN Volume Controller and Storwize V7000 Redpaper - http://www.redbooks.ibm.com/Redbooks.nsf/RedpieceAbstracts/redp4859.html?Open
Instructions on how to set up IBM Real-time Compression for a 45-day evaluation:
Real-time Compression Evaluation User Guide - http://www-304.ibm.com/support/docview.wss?uid=ssg1S7003988&myns=s028&mynp=OCST3FR7
This version of Comprestimator is supported to run on the following host operating systems:
- Red Hat Enterprise Linux Version 5 (64-bit)
- ESXi 4.0, 5.0
- AIX V6.1, V7.1
- HP-UX 11i v3
- Solaris 10 (SPARC)
- Windows 2003 Server, Windows 2008 Server (32-bit and 64-bit)
Using Comprestimator
To use Comprestimator on a Linux, AIX/VIOS, Solaris, HP-UX, or ESXi server:
- Log into the server using the root account.
- Copy the Comprestimator binary file from the corresponding folder to any folder on the host.
- Make sure the file has execute permissions. To add execute permissions to the binary file, type “chmod +x comprestimator”.
- Obtain the list of device names:
In Linux: Use the “fdisk -l” command.
In AIX/VIOS: Use the “lsdev -Cc disk” command.
In Solaris: Use the “format” command.
In HP-UX: Use the “ioscan -kfnC disk” command.
In ESXi 4.0: Use the “esxcli corestorage device list | grep Dev” command.
In ESXi 5.0: Use the “esxcli storage core device list | grep Dev” command.
- Run Comprestimator with the -d <device_name> flag to analyze a device or a partition.
Note: Comprestimator is designed to scan any block device that is readable by the OS itself. This typically includes devices managed by logical volume managers (LVMs) or partitioned by the OS. However, because compression is applied to physical volumes, it is recommended to estimate compression by running Comprestimator on the same block device/physical volume that will be compressed, and not on a logical volume, which may span several of those volumes. It is therefore highly recommended to always analyze the native block device when using Comprestimator.
Some volume managers "reserve" some of the LUN capacity for internal use. Since Comprestimator reads directly from the block device, some of these IOs may fail. The tool will tolerate up to 1% failed IOs and a scan will be aborted if this threshold is reached.
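The 1% tolerance for failed IOs can be expressed as a simple check. The sketch below is an assumption about how such a threshold might be evaluated, not the tool's actual logic; the function name should_abort and the idea of checking against a running total of reads are hypothetical.

```python
def should_abort(failed_ios, total_ios):
    """Return True once failed reads reach 1% of all reads issued so far.

    Integer arithmetic is used so that exactly 1% is detected reliably.
    """
    if total_ios == 0:
        return False
    return failed_ios * 100 >= total_ios
```

For example, 9 failed reads out of 1000 would be tolerated under this sketch, while 10 out of 1000 would reach the threshold.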
To use Comprestimator on a Windows server:
1. Log into the server using the Administrator account.
2. Run Comprestimator with the -l flag to list the devices.
3. Run Comprestimator with the proper flags to analyze a device or a partition.
Syntax
Using Linux, ESXi, AIX, Solaris and HP-UX:
comprestimator -d <device> [-p <number_of_processes>] [-P] [-e] [-h] [-c <csv_file>] [-v]
Using Windows:
comprestimator [-l | -n <disk_number> | -d "<device>"] [-p <number_of_processes>] [-P] [-e] [-h] [-c <csv_file>] [-v]
| Option or argument | Description |
|---|---|
| -d | Specifies the device name. |
| device | Path of device to analyze. For example: /dev/sdb, /dev/md-2 (in Linux), /vmfs/devices/disks/<id> (in ESX), /dev/hdisk1 (in AIX). |
| disk_number | Only in Windows: the disk number. Identify this number by running Comprestimator with the -l flag first. |
| -l | Lists the disk numbers and names available for use by Comprestimator. The drive numbers reported match the “Disk x” reported in Windows Disk Management. |
| -p | Specifies the number of processes (or threads in Windows) between 1 and 50. The default value is 10. |
| -P | Displays the results using a paragraph format. |
| -c | Export the results to a CSV-formatted output file. |
| csv_file | File name of CSV-formatted output file. |
| -e | Performs an exhaustive scan. Note that using this option will extend the runtime considerably. |
| -h | Display usage information. |
| -v | Display the results of every few samples. Note that this extends the output to a few thousand lines. |
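When Comprestimator is run against many devices, it can help to assemble the command line programmatically. The following Python sketch builds an argument list for the Linux/ESXi/AIX/Solaris/HP-UX syntax shown above, using only the documented flags. The helper name build_command and the default binary path ./comprestimator are assumptions for illustration.

```python
def build_command(device, processes=None, paragraph=False, exhaustive=False,
                  csv_file=None, verbose=False, binary="./comprestimator"):
    """Build an argument list from the documented options.

    `binary` is a hypothetical path; adjust it to where the file was copied.
    """
    if processes is not None and not 1 <= processes <= 50:
        # The option table limits -p to the range 1..50 (default 10).
        raise ValueError("-p must be between 1 and 50")
    cmd = [binary, "-d", device]
    if processes is not None:
        cmd += ["-p", str(processes)]
    if paragraph:
        cmd.append("-P")      # paragraph-format results
    if exhaustive:
        cmd.append("-e")      # exhaustive scan, considerably longer runtime
    if csv_file:
        cmd += ["-c", csv_file]
    if verbose:
        cmd.append("-v")      # results of every few samples
    return cmd


# Example: 20 processes, results exported to out.csv
# build_command("/dev/sdb", processes=20, csv_file="out.csv")
# → ['./comprestimator', '-d', '/dev/sdb', '-p', '20', '-c', 'out.csv']
```

The resulting list can be passed to Python's subprocess.run on a host where the binary is present and the caller has root access.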
Explanation of scan results
| Field | Description |
|---|---|
| Sample# | The number of the current sample reported. |
| Device | The device name used in the scan. |
| Size(GB) | The total size of the device as reported by the operating system, in gigabytes. |
| Compressed Size(GB) | The estimated size of the device if it were compressed using Storwize V7000/SVC Real-time Compression, in gigabytes. |
| Total Savings(GB) | The total estimated savings from thin-provisioning and compression, in gigabytes. |
| Total Savings(%) | The estimated savings from thin-provisioning and compression, as a percentage of the size of the device. This value is calculated as follows, expressed as a percentage: Total Savings(%) = 1-( Compressed Size(GB) / Size(GB) ) |
| Thin Provision Savings(%) | The estimated savings from thin provisioning (areas with zeros are stored using very minimal capacity). |
| Compression Savings(%) | The estimated savings from compression. |
| Compression Accuracy Range(%) | The accuracy of the estimate provided by Comprestimator. The results are estimated from samples of the device and may therefore be lower or higher than the actual compression that would be achieved. The approximate accuracy of the results is represented as a percentage of the total size of the device. For example, if the estimated Compression Savings(%) is 67% and the Compression Accuracy Range(%) is 5%, the actual compression savings, if this device is compressed on Storwize V7000/SVC, are between 62% and 72%. |
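The two calculations described in the table can be worked through in a few lines of Python. This is a sketch of the arithmetic only; the function names are hypothetical, and the 100GB/25GB figures are made-up inputs. The 67% and 5% values are the example from the table.

```python
def total_savings_pct(size_gb, compressed_size_gb):
    """Total Savings(%) = 1 - (Compressed Size(GB) / Size(GB)), as a percentage."""
    return (1 - compressed_size_gb / size_gb) * 100


def savings_range(compression_savings_pct, accuracy_range_pct):
    """Lower and upper bounds of the expected compression savings."""
    return (compression_savings_pct - accuracy_range_pct,
            compression_savings_pct + accuracy_range_pct)


# A 100GB device estimated to compress to 25GB:
# total_savings_pct(100, 25) → 75.0

# The example from the table: 67% savings with a 5% accuracy range:
# savings_range(67, 5) → (62, 72)
```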
For more information and examples refer to the Quick Start Guide provided in the Comprestimator package.
Installation Instructions
The installation instructions are located in the Comprestimator Quick Start Guide provided in the downloadable package.
Download package
| DESCRIPTION | DOCUMENTATION | LABEL | Download Options |
|---|---|---|---|
| Platform: Platform Independent; Version: Independent; English; Byte Size: 5742007; Date: 16 Jan 2013 | | Comprestimator v1.2.0.3 | HTTP |
| Segment | Product | Component | Platform | Version | Edition |
|---|---|---|---|---|---|
| Disk Storage Systems | IBM Storwize V7000 (2076) | 6.4 | IBM Storwize V7000 | 6.4 | |
| Storage Virtualization | SAN Volume Controller | 6.4 | SAN Volume Controller | 6.4 | |
Copyright and trademark information
IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.