High CPU usage by router and poor performance of Domino under VMware
Problem

Your Lotus® Domino® server uses considerably more CPU under VMware ESX 2.5 and 3.0 than when running directly on the same hardware. With a simple mail workload that sends mail to just a few users, CPU usage spikes to almost 100% for the duration of the test under both Windows® and Linux® guest virtual machines (VMs).

Cause

This issue has been investigated and determined to be independent of Domino.

Solution

The majority of the tests outlined in this document were performed on systems with 4 physical CPUs (pCPUs). Each VM had 1 virtual CPU (vCPU) and 1024-2048 MB of RAM assigned. The disks were either local storage (RAID 0 10K RPM SAS drives and RAID 5 10K RPM SCSI drives) or FastT SAN.
To verify the cause of this issue in Windows, open Task Manager, switch to the Performance tab, and make sure "Show Kernel Times" is selected under the View menu.
This option shows how much of the CPU in use is consumed by the OS kernel rather than by the application. In general, kernel time should be as low as possible. When the kernel uses a considerable amount of CPU, it may indicate an unhealthy situation that needs to be investigated. With this option enabled, Task Manager shows the total CPU usage in green and the portion used by the kernel in red.
Running the same mail routing test showed that the majority of the CPU was being used by the OS kernel.
In the Task Manager chart, the area marked "A" shows the CPU usage during the mail routing test, where 90% of the CPU time was spent inside the OS kernel.
A second test was performed, this time outside of Domino. From within the VM, copy a large file (300 MB or more) from the local disk to the same disk, for example:
    copy largefile.iso test.iso
and check the CPU usage during the file copy. In the same Task Manager chart, the area marked "B" shows the CPU usage during such a file copy: CPU usage spiked to 50%, all of it kernel time.
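The file copy test can also be scripted so that the kernel share is recorded as a number rather than read off a chart. The following is a minimal Python sketch, not part of the original test procedure. The function name copy_and_measure is hypothetical. It relies on os.times(), which reports only the CPU time charged to the copying process itself, so it will under-report compared with Task Manager or vmstat, which show system-wide kernel time (including deferred disk writes done by the OS on the process's behalf).

```python
import os
import shutil
import time


def copy_and_measure(src, dst):
    """Copy src to dst and report the CPU time this process was charged for it."""
    before = os.times()
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    after = os.times()

    user = after.user - before.user        # time spent in application code
    kernel = after.system - before.system  # time spent inside the OS kernel
    busy = user + kernel
    return {
        "elapsed_s": elapsed,
        "user_s": user,
        "kernel_s": kernel,
        # Fraction of this process's CPU time that was kernel time.
        "kernel_share": kernel / busy if busy > 0 else 0.0,
    }
```

Calling copy_and_measure("largefile.iso", "test.iso") inside the VM and again on the native host gives two sets of figures that can be compared directly.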
This was consistent across various versions of Windows (2003 Server, 2000 Advanced Server). http://kb.vmware.com/kb/9645697 describes how to tune the LSI driver. The same tests were repeated with the transfer size set to 256 KB and to 1 MB, but this did not significantly change the observed behavior.
Running esxtop directly on the VMware ESX server shows the VM using between 30% and 50% of the CPU.
The same file copy performed outside of VMware shows virtually no CPU usage (in our tests it varied between 1% and 2% on recent hardware and reached up to 6% on older PCs).
Under Linux, run vmstat to monitor CPU usage:
    vmstat 2
and let it run for the duration of the test.
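Where vmstat is not available, or the figures need to be logged from a script, roughly the same percentages can be derived from two samples of the first line of /proc/stat. The sketch below is an illustration under stated assumptions, not the method used in these tests: the function names are hypothetical, and folding irq and softirq time into the "sy" figure is a choice made here to approximate the vmstat column.

```python
def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line, which is the first line of /proc/stat."""
    with open(path) as stat:
        return stat.readline()


def parse_cpu_line(line):
    """Return the first eight tick counters of a 'cpu' line, zero-padded."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    ticks = [int(value) for value in fields[1:9]]
    return ticks + [0] * (8 - len(ticks))


def cpu_percentages(first, second):
    """Return vmstat-style us/sy/id/wa/st percentages between two samples."""
    before = parse_cpu_line(first)
    after = parse_cpu_line(second)
    # Counter order: user nice system idle iowait irq softirq steal
    user, nice, system, idle, iowait, irq, softirq, steal = (
        b - a for a, b in zip(before, after)
    )
    total = user + nice + system + idle + iowait + irq + softirq + steal
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    scale = 100.0 / total
    return {
        "us": (user + nice) * scale,
        "sy": (system + irq + softirq) * scale,  # kernel time
        "id": idle * scale,
        "wa": iowait * scale,
        "st": steal * scale,
    }
```

To mirror "vmstat 2", call read_cpu_line(), wait two seconds, call it again, and pass both lines to cpu_percentages().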
We created a large file with the command:
    dd if=/dev/zero bs=1024 count=365536 > ivd.out
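With bs=1024 and count=365536, the dd command writes 1024 x 365536 = 374,308,864 bytes, or roughly 357 MiB, which is in line with the "300 MB or more" used for the Windows test. For guests where dd is not available, a rough Python equivalent is sketched below. The function name write_zero_file is hypothetical. The file is opened unbuffered so that each block is handed to the OS separately, on the assumption that this resembles the I/O pattern of dd more closely than buffered writes would.

```python
import os


def write_zero_file(path, block_size=1024, count=365536):
    """Write `count` zero-filled blocks of `block_size` bytes and return the file size."""
    block = bytes(block_size)  # a block of zero bytes, like reading /dev/zero
    # buffering=0 passes every block to the OS as its own write call.
    with open(path, "wb", buffering=0) as out:
        for _ in range(count):
            out.write(block)
    return os.path.getsize(path)
```

Calling write_zero_file("ivd.out") with the default arguments produces a file of the same size as the one created by the dd command.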
During the file creation, system CPU usage jumped to 50-99%, depending on the version and flavor of Linux (Red Hat, SuSe and others):
    # vmstat 2
    procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
     r  b swpd   free   buff   cache si so bi    bo  in  cs us  sy id wa st
     0  0   60  34776 217616 1070128  0  0  4    43  44   7  6   5 89  0  0
     0  0   60  35024 217616 1070128  0  0  0    18 264 345  0   4 96  0  0
     0  0   60  35024 217616 1070128  0  0  0     0 264 342  1   2 97  0  0
     1  0   60 692400 217628  415996  0  0  0     0 265 343  1  24 75  0  0
     2  0   60 636352 217684  471884  0  0  0     0 265 346  1  99  0  0  0
     1  0   60 614652 217704  493696  0  0  0 34896 305 249  1  99  0  0  0
     2  0   60 559348 217760  548736  0  0  2     0 261 345  1  99  0  0  0
    18  1   60 503672 217816  602988  0  0  0 61582 487 372  0 100  0  0  0
     3  0   60 478748 217840  629316  0  0  0    80 265 358  1  99  0  0  0
     1  0   60 422824 217896  683892  0  0  2     0 265 388  1  99  0  1  0
     5  1   60 394180 217924  712248  0  0  0 54632 330 206  1  99  0  0  0
     5  1   60 338752 217980  768444  0  0  0    70 263 366  6  92  0  2  0
     0  0   60 331188 217988  775732  0  0  0     0 266 360  2  10 84  4  0
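When vmstat output is captured to a file for a longer test, the "sy" column can be summarized with a short script instead of being read by eye. The following Python sketch is one way to do it, assuming the 17-column layout shown in the output above. The function names and the 90% threshold are illustrative choices, not part of the original test.

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat(text):
    """Return one dict per data row, skipping the two header lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 17 fields and all of them are integers.
        if len(fields) == len(VMSTAT_COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return rows


def summarize_system_cpu(rows, threshold=90):
    """Report the peak 'sy' value and how many samples reached the threshold."""
    sy = [row["sy"] for row in rows]
    return {
        "samples": len(sy),
        "peak_sy": max(sy, default=0),
        "at_or_above_threshold": sum(1 for value in sy if value >= threshold),
    }
```

Applied to the output above, this reports a peak of 100 and 8 of the 13 samples at or above 90% system time.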
We then copied the file created previously:
    cp ivd.out ivd2.out
and once again vmstat showed system CPU usage spiking to 99%:
    procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
     r  b swpd   free   buff   cache si so bi    bo  in  cs us  sy id wa st
     1  0   60 221076 218032  885716  0  0  0 21072 357 355  0  91  9  0  0
     1  0   60  64960 218180 1036840  0  0  0 30794 416 408  1  99  0  0  0
     1  2   60  46432 213520 1063412  0  0  0 61826 532 409  1  99  0  0  0
     1  3   60  42216 213524 1067496  0  0  0 61040 481 360  0  35  0 65  0
     1  0   60  45688 213524 1067496  0  0  0  4048 320 355  1  15 64 21  0
     1  0   60  45936 213524 1067496  0  0  0  4250 286 342  2   6 91  2  0
We performed an identical test on a native Linux machine running SuSe 9.3 and cp used only 5% of the CPU.
We saw similar results in testing with local storage and SAN storage.
In some situations CPU usage was slightly better under VMware ESX 3.0. However, the gain was marginal and not enough to be considered a solution to the issue.
In a virtualized environment, CPU performance data collected from within a VM may not be 100% reliable. In these tests a single VM was running, with a single vCPU assigned to it, on a 4-CPU system. Under these conditions the likelihood that performance data collected from within the VM is incorrect is very low. We also ran esxtop on the host machine during the tests and verified that it reported a similar CPU utilization pattern, which supports the evidence collected from within the VM.
Domino depends on the OS and hardware to perform disk I/O operations. If these operations consume excessive CPU and leave virtually no CPU cycles for Domino to perform any additional computation, the performance of the whole server will suffer and response time for users can increase to unacceptable levels.
Just as a file copy uses significantly less CPU when performed directly on the hardware than under VMware ESX, the same Domino mail workload showed lower CPU usage and better overall performance and response time when Domino was running directly on the hardware.
Because this issue occurs independently of Domino, evaluate the expected load for the server before migrating it to VMware.
When running directly on the hardware, the OS offloads I/O activity to the hardware controller (for example, a SCSI card installed in the server). This is a very efficient operation that requires few CPU cycles. Under VMware, each I/O operation must be translated in software, which consumes CPU. The more I/O is performed, the more CPU is used by the system (kernel time).
For a description of the support policy of IBM Lotus products, refer to the technote titled "VMware product support information for IBM Lotus Domino-based server products" (#1106182).
IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
Product categories: Software, Messaging Applications, Advanced Messaging, Lotus Domino, Lotus Domino Server
Operating system(s): Linux, Windows 2000, Windows 2003
Software version: 6.0, 6.5, 7.0
Reference #: 1252786
IBM Group: Software Group
Modified date: 2007-04-22