This white paper provides an implementation scenario and troubleshooting information for the Load Generation Agent of IBM Rational Performance Tester (RPT).
Authors: Rajesh Avanthi and Neeta Stanley
Performance testing is important for several reasons. Chief among them: measurement is the first step that leads to control and, eventually, to improvement.
- If you cannot measure something, you cannot understand it
- If you cannot understand it, you cannot control it
- If you cannot control it, you cannot improve it
Meaningful performance analysis depends on accurate load test results. When hardware resources are overloaded, test results can become skewed. Rational Performance Tester (RPT) offers agent health indicators and load-test resource management to help keep results accurate.
This white paper provides an implementation scenario and troubleshooting information for the Load Generation Agent of RPT. For a better understanding of the initial configuration of this agent component, irrespective of the deployment architecture of the environment, see the referenced technical article on IBM developerWorks.
It also explains how to diagnose communication problems between the RPT workbench and the RPT agent machines.
Old Agent Controller versus new Load Generation Agent
Starting in RPT V8.3, the Load Generation Agent is shipped along with the old Agent Controller component. The main difference is the communication infrastructure. The load generation engine is the same.
The key differences:
- The Load Generation Agent communicates to the workbench by way of HTTP. There is a Jetty server running on the workbench. If a firewall is involved, only one port has to be opened on the workbench to allow agents to reach the workbench.
- The old Agent Controller required that three ports be opened on the agent to allow the workbench to contact it. Additionally, a range of ephemeral ports had to be opened on the workbench to allow the Agent Controller to make connections back to the workbench. This was difficult where heavy firewall restrictions were imposed.
- The Load Generation Agent polls the workbench for work.
- With the Agent Controller, the workbench had to connect to each agent. If you had 30 agents and one was not ready, you might wait 20 minutes to find out that the schedule launch would fail. With the Load Generation Agent, you know the readiness of each agent instantly.
- With the Load Generation Agent, you can launch all agents on a schedule in parallel. In one test, startup for 70 agents took less than two minutes.
- The TPTP Test Execution Harness cannot launch Agent Controller agents in parallel and takes 30-60 seconds to launch each agent, so parallel launch is a big gain for large load tests.
- With the Load Generation Agent, the file transfer deployment is smarter: it avoids redeploying every test asset and runtime component, which results in faster launch times.
- With the Agent Controller, all runtime and test assets were deployed every time.
- The Load Generation Agent uses TLS/SSL for secure workbench-to-agent communication, if required. This support is specified in preferences on the workbench.
- With the Agent Controller you had to specify secure communication in the configuration data of each agent.
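The launch-time comparison in the list above can be checked with simple arithmetic. The following sketch uses only the figures quoted in the text (70 agents, 30-60 seconds per sequential launch); the function name is illustrative and not part of RPT.

```python
# Figures quoted in the text: 70 agents, 30-60 seconds per sequential launch.
AGENTS = 70
PER_AGENT_SECONDS = (30, 60)


def sequential_launch_minutes(agents, seconds_per_agent):
    """Total launch time in minutes when agents are started one after another."""
    return agents * seconds_per_agent / 60


best, worst = (sequential_launch_minutes(AGENTS, s) for s in PER_AGENT_SECONDS)
print(f"Sequential launch of {AGENTS} agents: {best:.0f} to {worst:.0f} minutes")
```

Under these figures, a sequential launch takes 35 to 70 minutes, compared with the startup time of less than two minutes reported for a parallel launch.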
For an efficient setup of the workbench machine and the agent machine, the configuration in the "MajorDomo.config" file must be validated. The Load Generation Agent configuration requires only the workbench host name and port number (default 7080). If you want to change the poll interval, you must edit the MajorDomo.config file on each agent. A default V8.5 configuration file is described below.
Note: If there are DNS issues, Majordomo might not be able to contact a workbench. Check the Majordomo log file for status; it should indicate the communication state with all supported workbenches.
Structure of the Majordomo configuration file:
- <MajordomoConfig xmlns="http://www.example.org/MajordomoConfiguration">
- <debug>false</debug>
If set to "true" instead of "false", the Majordomo service writes trace data from its continuous polling of the workbench to C:\windows\temp\system\majordomo.log. This is useful if a workbench does not show an agent in contact when it should. Turning on debug might provide information on the cause of the connection problem. Changes to this file take effect immediately and do not require stopping and restarting the Majordomo service.
- <hostName><name of workbench machine></hostName>
Represents the host name or IP address of the workbench. Using the workbench host name is preferable.
The canonical host name setting forces the host name that the agent reports when it contacts the workbench. Sometimes the name as resolved by DNS is not what you require; rather than sorting out the Domain Name Server, you can use this setting to force a different name that is more amenable to the workbench.
- <ipAddress>IP Number</ipAddress>
The IP address setting forces the agent to identify itself to the workbench using the specified IP address. Some systems have multiple NICs, and the primary one might not be known to the workbench because it is on a private network. There are other ways to specify the correct primary NIC, but this setting is a quick way to do it.
- <pollInterval>Specify a Value</pollInterval>
The default is 10 seconds. The workbench host and port parameters are the same as with V8.3. Changes made to the configuration file are automatically picked up at the next poll interval without having to stop and restart the agent.
The TCP port number to be used for communication. The default is 7080. When you change this port number, make sure that RPT is kept in sync with the new port by using the menu option Window > Preferences > Test and Server.
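The settings described above can be sanity-checked with a short script before an agent is started. The following is a minimal sketch, not an IBM tool. The element names hostName, pollInterval, and debug come from the text; the `port` element name, the flat layout of the sample file, and the sample values are assumptions made for illustration, and the real file may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the layout of the real file may differ.
SAMPLE_CONFIG = """<MajordomoConfig xmlns="http://www.example.org/MajordomoConfiguration">
  <debug>false</debug>
  <hostName>workbench01</hostName>
  <port>7080</port>
  <pollInterval>10</pollInterval>
</MajordomoConfig>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_settings(xml_text):
    """Collect leaf element text by local tag name, ignoring nesting."""
    root = ET.fromstring(xml_text)
    settings = {}
    for element in root.iter():
        if len(element) == 0 and element.text and element.text.strip():
            # Keep the first occurrence if a name appears more than once.
            settings.setdefault(local_name(element.tag), element.text.strip())
    return settings


def check_settings(settings):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not settings.get("hostName"):
        problems.append("hostName is missing")
    port = settings.get("port", "7080")  # 7080 is the documented default
    if not port.isdigit() or not 1 <= int(port) <= 65535:
        problems.append(f"port {port!r} is not a valid TCP port")
    poll = settings.get("pollInterval", "10")  # 10 seconds is the documented default
    if not poll.isdigit() or int(poll) <= 0:
        problems.append(f"pollInterval {poll!r} is not a positive number of seconds")
    return problems
```

Calling `check_settings(read_settings(SAMPLE_CONFIG))` returns an empty list for the sample. Because the reader ignores nesting, it tolerates layout differences, but it only reports the first workbench entry if several are configured.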
Consider two physical machines that are RPT workbenches, with machine IDs 997 and 998. Additionally, there are two virtual machines that are RPT agents, with machine IDs 944 and 945.
In any given scenario, any of the physical or virtual machines can be treated as a workbench or an agent machine. Assume that in this environment you encounter communication errors indicating either that the agent is not ready to take the triggered user load or that the agent lost contact during test schedule execution.
On machine 997 and machine 945, running a schedule on the local host does not start the run. Assume that you ran these tests to check the accessibility between the physical and virtual machines:
- Machine 997 acting as controller and machine 945 acting as agent: No agents listed or connected.
- Machine 945 acting as controller and machine 997 acting as agent: Agent listed with status "Ready", connection established.
- Machine 997 acting as controller and machine 944 acting as agent: No agents listed.
- Machine 944 acting as controller and machine 945 acting as agent: Agents listed with status "Ready".
RPT workbench agent communication flow
In an ideal scenario, a schedule triggered from the RPT workbench communicates with the agents in this sequence:
- Schedule Executor indicates interest in running on agent for specified engine.
- Majordomo polls server looking for work.
- Server queries Schedule Executor for work for this agent.
- Schedule Executor returns execution information, which server returns to agent.
- Majordomo launches process.
- Majordomo sends message to server that is forwarded to Schedule Executor indicating process has been launched.
- Majordomo collects stdout and stderr output from launched process and forwards to server.
- Server forwards stdout and stderr output to Schedule Executor.
- Majordomo detects process end and sends a message to the server indicating the process has ended along with the return code.
- Server forwards process end message to Schedule Executor. Existing Schedule Executor code handles what is either a normal process termination or abnormal process termination.
- In the case of abnormal process termination, the stdout and stderr output can be logged to provide further insight into the reason for the termination.
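The sequence above can be sketched as a small in-memory simulation. This is an illustrative model only: the class and method names are invented for the sketch and do not correspond to RPT classes, and the process launch is replaced by a caller-supplied function so that nothing is actually executed.

```python
class ScheduleExecutor:
    """Holds work registered for agents and records messages sent back to it."""

    def __init__(self):
        self.pending = {}   # agent name -> command to launch
        self.events = []    # (agent, kind, payload) messages forwarded by the server

    def register_interest(self, agent, command):
        self.pending[agent] = command

    def work_for(self, agent):
        return self.pending.pop(agent, None)

    def notify(self, agent, kind, payload):
        self.events.append((agent, kind, payload))


class Server:
    """Stands in for the workbench server that agents poll."""

    def __init__(self, executor):
        self.executor = executor

    def poll(self, agent):
        # The server asks the Schedule Executor whether this agent has work.
        return self.executor.work_for(agent)

    def forward(self, agent, kind, payload):
        self.executor.notify(agent, kind, payload)


def agent_poll_once(agent, server, run_process):
    """One Majordomo-style poll: ask for work, run it, report back."""
    command = server.poll(agent)
    if command is None:
        return False  # nothing to do until the next poll
    server.forward(agent, "launched", command)
    output, return_code = run_process(command)
    server.forward(agent, "output", output)
    server.forward(agent, "ended", return_code)
    return True
```

The point the sketch makes is that the agent initiates every exchange: the workbench never opens a connection to the agent, which is why only the workbench port needs to be reachable.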
Here are some checks with which you can start:
Check 1: On Agent Machine
Verify that the Load Generation Agent is installed on the agent machines. From the Start menu, select Start > Programs > IBM Installation Manager > View Installed Packages.
Verify these details:
Package: IBM® Rational® Performance Tester Agent
Feature: Load Generation Agent
Installing the Load Generation Agent also installs the Majordomo service.
Check 2: On Agent machines
Verify that the Majordomo service is installed and running: Start > Run > Services.msc > MajordomoService
Check 3: On Agent machine
- Open the MajorDomo.config file from SDP directory: "C:\Program Files (x86)\IBM\SDP\Majordomo".
- Verify that the host name and the port number for the workbench machine are provided correctly:
<MajordomoConfig xmlns="http://www.example.org/MajordomoConfiguration">
This file enables this agent machine to be used by multiple RPT workbench machines to share user load. If the host name does not resolve, try using the IP address, or add an entry for the workbench to the "drivers/etc/hosts" file.
NOTE: "HOSTNAME" can be replaced with the host name or IP address of the RPT workbench machine.
Check 4: On Workbench machine
The RPT workbench UI has an "Agent Status" button on the top row. Click it to see whether the agents are ready:
Generally, when the Load Generation Agent (Majordomo) connects to the RPT workbench, it sends a request to the workbench like this:
GET /agent/whatisthybidding?canonicalHostName=ncrptwin7&ip=10.18.x.x&version=8300&engineListSize=0
The IP value in this example (10.18.x.x) is determined by the agent; the details are operating system dependent. Upon receiving this request, the workbench does a reverse look-up on the provided IP and uses the resulting name in the Agent Status dialog. If the reverse look-up fails, the Agent Status dialog will show the IP address.
For example, if the workbench received the previous request and you have an entry like this in the hosts file, then RPT shows "MyFavAgent" as the agent name in the Check Agents dialog:
10.18.x.x MyFavAgent
When a schedule specifies playing back on a location, RPT gets the IP address corresponding to the location by resolving the "Host Name" specified in the location. RPT then compares that IP address to the list of IP addresses of the agents that have contacted it. When a match is found, that agent is used to run the test.
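The name and address handling described above can be sketched in a few lines. This is a simplified model of the described behavior, not RPT code: the function names are invented, and the look-up functions are passed in as parameters so that the logic can be tried without touching real DNS.

```python
import socket


def display_name(agent_ip, reverse_lookup=socket.gethostbyaddr):
    """Name shown for an agent: the reverse look-up result, else the IP itself."""
    try:
        return reverse_lookup(agent_ip)[0]
    except OSError:
        return agent_ip  # reverse look-up failed, so fall back to the address


def find_agent(location_host, connected_agent_ips, resolve=socket.gethostbyname):
    """Return the connected agent IP that the location resolves to, or None."""
    try:
        location_ip = resolve(location_host)
    except OSError:
        return None  # the location host name does not resolve
    return location_ip if location_ip in connected_agent_ips else None
```

A `None` result from `find_agent` corresponds to the situation where an agent is in contact with the workbench but the schedule still cannot use it, because the location host name resolves to a different address than the one the agent reported.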
To check connectivity while a schedule is running, provided that the agents are listed as ready, open a command prompt on the RPT workbench and run a continuous ping to the agent IP address in parallel. For example:
ping -t <agent IP address>
Check 5: Alternative way to check the "Test Connection" status from the RPT workbench to the agent machines
From the Test Navigator, you find the agent location entries as seen here:
Click to open any of the agent entries and you get this screen to set the agent and IP address:
Click Test Connection to test the connection:
There are options to edit here: General Properties and Eclipse Workbench Properties:
Another approach to test the trusted communication between the RPT workbench and the agent machine is to perform these steps:
- On the RPT agent machine, open a supported browser.
- Open this URL: http://<Workbench IP address>:7080/hello
A successful response confirms that the workbench and agent can communicate over the specified port.
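The browser check above can also be scripted, which is convenient on agent machines without a desktop session. The following is a minimal sketch that assumes only what the text states, namely that the workbench answers HTTP requests for /hello on its agent port (7080 by default). The function names are illustrative, and the content of the response is not inspected because the text does not describe it.

```python
import urllib.error
import urllib.request


def hello_url(workbench_host, port=7080):
    """Build the connectivity-check URL described in the text."""
    return f"http://{workbench_host}:{port}/hello"


def workbench_reachable(workbench_host, port=7080, timeout=5.0):
    """Return True if the workbench answers the /hello request at all."""
    try:
        with urllib.request.urlopen(hello_url(workbench_host, port), timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # an HTTP error status still proves the port is reachable
    except (urllib.error.URLError, OSError):
        return False  # refused, timed out, blocked, or name not resolved
```

Run it on the agent machine with the same host name or IP address that appears in the Majordomo configuration file, so that the check exercises the same name resolution path as the agent.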
Check 6: Check if the agents are working correctly to share the load
Consider a test scenario that distributes 50% of the load to each of two agents. Create a schedule for 10 users and divide the load equally between the agents:
- Set the name of the schedule to "Proxy".
- Set User Load to 10.
- Set two user groups to 50% each by percentage.
Here is an example of the first user group set to 50%:
- Assign a second user group to the second agent.
- Run the script.
Note: Make a note of the time of execution.
- Now navigate to the agent deployment user directory on the respective agent machines.
- Open the location D:\RPT_AGENT
- Look for the deployment root directory.
- Open the User directory to find the data.
The alphanumeric directory is created for each run and will have different timestamps. Note the timestamp at the time of execution to locate the exact execution data directory. You will find numerous file types listed from the execution data.
- Look for the ‘KERNELIO.dat’ file.
- Open the KERNELIO.dat file and locate the "Total Users" parameter to confirm that 50% of the load was started on this agent. (Total users = 5)
- Perform the same for the other agents to verify the load started on each of the machines.
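The expected value for each agent follows directly from the schedule settings. The following sketch computes it and scans a block of text for the "Total Users" value. The exact layout of KERNELIO.dat is not described here, so the pattern is an assumption that simply looks for the first integer after the words "Total Users"; the function names are illustrative.

```python
import re


def expected_users(total_users, percentages):
    """Users each agent should start, given the per-group percentages."""
    return [round(total_users * pct / 100) for pct in percentages]


def find_total_users(text):
    """Return the first integer following 'Total Users' in the text, or None."""
    match = re.search(r"total\s+users\D*?(\d+)", text, re.IGNORECASE)
    return int(match.group(1)) if match else None
```

For the schedule above, `expected_users(10, [50, 50])` gives `[5, 5]`, which is the value to compare against what each agent reports. For splits that do not divide evenly, the rounding used here may not match how RPT assigns users.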
For more information on the usage of the engine room and analyzing the KERNELIO.dat file, see this IBM developerWorks article.
Additional troubleshooting information
If you experience agent communication problems, you can verify these points:
- Close all RPT-related services and processes on the agent and workbench machines. Then restart them.
- Start RPT with the -clean option, which reloads the plug-ins. For example: C:\IBM\RPT>eclipse -clean
- Before running the RPT schedule, click the "Agent Status" icon and verify the status.
- Add the debug switch (<debug>true</debug>) in the Majordomo configuration file. Then check the Majordomo log file, which is generally created in the temporary folder.
In another given scenario, the servicelog.log contains this error message:
Error: Unable to create the server socket.
An error was returned from TransportLayer(1000)::startTransportLayer errNum = -1
If this error occurs, check these points:
- Use netstat or a similar tool to make sure that ports 10002, 10005, and 10006 are not in use.
- Make sure there are no ACServer.exe, RAServer.exe, or ACWinService.exe processes running.
- Remove the serviceconfig.xml and servicelog.log files from this directory:
- Start RPT and try to play back a schedule.
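The port check in the list above can also be done without netstat. The following is a minimal sketch that probes the three ports named in the text by attempting a local TCP connection. It assumes that a listener on the machine is reachable through the loopback address; a service bound only to another interface would not be detected, so netstat remains the more complete check. The function names are illustrative.

```python
import socket

AGENT_CONTROLLER_PORTS = (10002, 10005, 10006)  # ports named in the text


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 only when the connection succeeds.
        return probe.connect_ex((host, port)) != 0


def ports_in_use(ports=AGENT_CONTROLLER_PORTS):
    """List the ports from the given collection that already have a listener."""
    return [port for port in ports if not port_is_free(port)]
```

An empty list from `ports_in_use()` means that none of the three ports has a listener on the loopback address.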
If the connectivity problem persists, run the RPTAgentTest.jar utility file attached to this white paper. This utility performs real time polling between the workbench and agent machine. Perform these steps:
- Exit from the RPT workbench.
- Copy the attached RPTAgentTest.jar file to a convenient location on your RPT workbench computer, such as "C:\Temp" or "/tmp".
- Start "AgentListener" with a command like:
Windows: "C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar AgentListener
Linux: /opt/IBM/SDP/jdk/jre/bin/java -cp /tmp/RPTAgentTest.jar AgentListener
- To exit from AgentListener, press CTRL-C on the keyboard.
Here is a sample of the output data:
C:\> "C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar AgentListener
Listener on port 7080 started on ncdual (192.x.x.1)
Agent sleet:33026 (192.x.x.1) connected: 07:14:54.934
Agent rptrtb2:22799 (192.x.x.1) connected: 07:14:57.996
Agent example.ibm.com:44656 (192.x.x.2) connected: 07:15:00.668
Agent sleet:33028 (192.x.x.1) connected: 07:15:04.965
The first line identifies name "ncdual" and IP address 192.x.x.1 for the computer where AgentListener is running. By default, AgentListener listens at port 7080. You can change this by providing a different port as the final argument on the command line, for example:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar AgentListener 7081
Each polling request to the workbench results in output like:
Agent sleet:33026 (192.x.x.1) connected: 07:14:54.934
The initial line of the per-poll output specifies the name "sleet", originating port "33026", and IP address 192.x.x.1 of the agent as determined from the connection.
This line also includes the timestamp of the request (07:14:54.934). The canonicalHostName and IP lines specify the agent name and IP address as provided by the agent in the request. Normally these match the name and IP address in the initial line; mismatches are a potential source of problems. The version of the agent is shown in the "version" line.
For RPT V8.5 and later agents, like rptrtb2 in this example, there is additional information about the agent architecture (arch), polling interval in milliseconds (poll), and available memory in bytes (ram).
Expect to see a poll request from an agent approximately every 10 seconds by default. For example, the "sleet" agent polls at 07:14:54.934 and then again at 07:15:04.965. If an agent takes longer than the specified polling interval, which is 10 seconds in RPT V8.3 and customizable in RPT V8.5, it can indicate that the agent is having trouble communicating with some other workbench.
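When many agents are polling, the gaps between polls are easier to check with a script than by eye. The following sketch parses the "connected" lines of the AgentListener output shown above and computes the time between consecutive polls for each agent. The sample is copied from this document; the function names and the two-second tolerance are assumptions, and timestamps that cross midnight are not handled.

```python
import re
from collections import defaultdict
from datetime import datetime

LINE = re.compile(
    r"^Agent (?P<name>[^:\s]+):\d+ \(.*?\) connected: (?P<time>\d{2}:\d{2}:\d{2}\.\d{3})"
)

SAMPLE = """Agent sleet:33026 (192.x.x.1) connected: 07:14:54.934
Agent rptrtb2:22799 (192.x.x.1) connected: 07:14:57.996
Agent example.ibm.com:44656 (192.x.x.2) connected: 07:15:00.668
Agent sleet:33028 (192.x.x.1) connected: 07:15:04.965"""


def poll_gaps(output):
    """Map each agent name to the seconds between its consecutive polls."""
    times = defaultdict(list)
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match["time"], "%H:%M:%S.%f")
            times[match["name"]].append(stamp)
    return {
        name: [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]
        for name, stamps in times.items()
    }


def slow_agents(output, interval=10.0, tolerance=2.0):
    """Agents with at least one gap longer than interval + tolerance seconds."""
    return sorted(
        name
        for name, gaps in poll_gaps(output).items()
        if any(gap > interval + tolerance for gap in gaps)
    )
```

For the sample, the only agent with two polls is "sleet", with a gap of 10.031 seconds, so `slow_agents(SAMPLE)` returns an empty list at the default 10-second interval.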
NOTE: In addition to the new AgentListener class, RPTAgentTest.jar also contains other classes that can be useful in supporting RPT.
RPTAgentTester:
This is used for diagnosing problems with pre-RPT V8.3 agents.
- Enter this command, replacing <agent> with the host name (or IP address) of the agent to be tested:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar RPTAgentTester <agent>
- RPTAgentTester prompts for the local computer name and then the IP address. To use the defaults, press Enter for each.
- RPTAgentTester then simulates the initial traffic to the specified agent as if it were an RPT workbench.
- If all goes well, you receive the prompt "Type [Enter] to close connections and end test". Press Enter at this point to end the utility.
When RPTAgentTester is run on a computer like "nc-opteron" in this example, with ejb1 running RPT V22.214.171.124 as the agent, the output is:
If your environment does not have ports 25000-25003 open, then you can specify a different starting port as a second argument for RPTAgentTester. For example, to specify ports 9000-9003, use:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar RPTAgentTester <agent> 9000
SocketTestClient & SocketTestServer:
These components are used for testing connectivity between computers. "SocketTestServer" runs a listener at a specified port on the computer; "SocketTestClient" attempts to connect to a specified computer and port. For example, if you must verify that an agent can successfully establish a connection to a workbench computer, you can:
- Copy the RPTAgentTest.jar to both the workbench and agent.
- On the workbench, run SocketTestServer since the agent is connecting to the workbench. The "7000" part is the port. You can specify any available port:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar SocketTestServer 7000
- If all goes well, you should see this message:
Listener started on host <host name>/192.168.x.x
Waiting for connection at port 7000
- On the agent, run SocketTestClient and specify the IP address of the workbench and the same port as used with the server:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar SocketTestClient 10.13.x.x 7000
- You then see messages like:
Connected to host 192.168.x.x at port 7000
Connection time: Tue Nov 9 10:01:25 EST 2010
"Hello" message sent to server: Tue Nov 9 10:01:25 EST 2010
Type [Enter] to close connection
- When you run SocketTestClient on the agent, you also get this message on the RPT workbench:
New connection received: Tue Nov 9 10:01:25 EST 2010
From port <port number> of client <agent>
Waiting for "Hello" from client
Received "Hello" from client; waiting for client to end session ...
- Note the <agent> variable reported in this message. It is the name or address that you can use to reference the agent.
- At this point, a connection has been established from the agent to the RPT workbench. No further traffic is sent over this connection until you press Enter on the agent. Press Enter to make sure that the connection closes. After you press Enter on the agent, the agent provides this message:
Final message sent to server: Tue Nov 9 10:08:25 EST 2010
Closed port <port number>
On the RPT workbench, the message is:
Final message received from client: Tue Nov 9 10:08:25 EST 2010
Closed port 7000
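The exchange that SocketTestServer and SocketTestClient perform can be sketched with standard sockets. This is not the code in RPTAgentTest.jar: it is a simplified illustration in which both ends run in one process over the loopback address, and the function names are invented for the sketch. Between two machines, the listener half would run on the workbench and the connecting half on the agent.

```python
import socket
import threading


def run_server(listener, received):
    """Accept one connection and store the first message it sends."""
    connection, _ = listener.accept()
    with connection:
        received.append(connection.recv(1024).decode("utf-8"))


def hello_round_trip(host="127.0.0.1", port=0):
    """Start a one-shot listener, connect to it, send "Hello", return what arrived."""
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((host, port))  # port 0 lets the operating system pick a free port
        listener.listen(1)
        actual_port = listener.getsockname()[1]
        server = threading.Thread(target=run_server, args=(listener, received))
        server.start()
        with socket.create_connection((host, actual_port), timeout=5) as client:
            client.sendall(b"Hello")
        server.join(timeout=5)
    return received[0] if received else None
```

As with the utilities above, a successful round trip shows only that a TCP connection can be opened in one direction on the chosen port; it says nothing about the reverse direction or about other ports.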
showInterfaces:
This lists the available network interfaces and IP addresses of the computer:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar showInterfaces
This is useful when diagnosing problems with IP aliasing.
GetIP:
This displays the "default" IP address of the computer:
"C:\Program Files\IBM\SDP\jdk\jre\bin\java" -cp C:\Temp\RPTAgentTest.jar GetIP
THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS DOCUMENT, IT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS DOCUMENT OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS DOCUMENT IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS OR SOFTWARE.
All source code and/or binaries attached to this document are referred to here as "the Program". IBM is not providing program services of any kind for the Program. IBM is providing the Program on an "AS IS" basis without warranty of any kind. IBM WILL NOT BE LIABLE FOR ANY ACTUAL, DIRECT, SPECIAL, INCIDENTAL, OR INDIRECT DAMAGES OR FOR ANY ECONOMIC CONSEQUENTIAL DAMAGES (INCLUDING LOST PROFITS OR SAVINGS), EVEN IF IBM, OR ITS RESELLER, HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.