Slow performance in Controller due to inability to use available CPU cores in SQL server
Users experience performance issues in Controller. During the diagnosis, the root cause is found that the database (SQL) server has plenty of CPU cores (e.g. 16) but only some (e.g. 4) are being utilised.
I.T. administrator logs onto the SQL server, and looks in Task Manager, inside tab "Performance". Inside there, there are many (e.g. 16) separate CPU graphs, indicating that Windows can successfully see many (e.g. 16) CPU cores.
However, during the 'busy' periods of time, only some (e.g. 4) of these graphs actually are active.
- In addition, the overall CPU usage percentage figure will only ever reach a limit (e.g. 4/16 = 25%), and never reach the full potential (100%).
There are lots of possible causes for performance problems.
- For more examples, see separate IBM Technotes #1365257 & 1347497.
This IBM Technote specifically relates to the scenario where the SQL server does not utilise all the available CPU cores.
Again, there are several possible causes including:
- Scenario #1 - SQL server has more than 4 CPU sockets, and customer has reached the limit of their edition of SQL.
- Different versions of SQL have different limitations.
- For example, SQL Server 2008 R2 is licensed per processor socket, and not per logical CPU basis.
- SQL 2008 R2 Standard edition is limited to 4 CPU sockets, but each socket could (for example) be quad-core, therefore could happily utilise 16 CPU cores in total.
- Scenario #2 - SQL server has 4 or fewer CPU sockets (e.g. 4 sockets) and each one is multi-core (e.g. 4 CPU cores in total) which means that Task Manager sees lots of CPU cores (e.g. 16 cores in total) however:
- The SQL server is running on virtual (non-physical) hardware
- The host platform (e.g. ESX Server) has not been configured to tell the image ("guest") that its hardware is based on multi-core CPUs. Therefore, the 'guest' image thinks that it is based on many (e.g. 16) separate physical sockets and not (in fact) the true picture which is more like 4 CPU sockets with each one quad-core.
- Scenario #3 - SQL server is trying to utilise all CPU cores, but its automatic CPU affinity process is failing due to the fact that the server has been created from an image.
- For example, in one real-life customer case, the SQL server only ever used 1 CPU core. This was caused by the fact that the customer installed SQL 2005 onto a 'KVM' (host based on Red Hat Linux) guest image. Afterwards, this image was 'cloned' and hosted on a different host. Because the CPUids were different, the SQL server was confused and therefore the 'automatic' CPU affinity masking process was failing.
Resolving the problem
Upgrade your edition of Microsoft SQL server (for example from "Standard" to "Enterprise" or even "DataCenter" editions.
Reconfigure your virtual host platform (e.g. ESX server) so that the guest images are aware that the CPUs are multi-core.
- For example:
- if using VMWare ESX, utilise the setting "cpuid.coresPerSocket".
- if using KVM, utilise the setting "topology sockets" inside the guest XML file.
- For more information, see separate IBM Technote #1365257.
In one real-life example, the solution was to disable Microsoft's ' automatic' CPU affinity system, and instead manually inform SQL to utilise all CPUs (e.g. all 4 CPUs that the system had installed).
- Logon to the SQL server as an administrator
- Launch "SQL Server Management Studio"
- Right-click on the server, and choose "properties"
- Click tab "Processors"
- Untick the box "Automatically set processor affinity mask for all processors"
- Tick the "Processor Affinity" box next to all of your CPUs
- Obtain a short period of downtime, and restart your SQL server (to ensure that the changes are actioned).