Intermittent poor performance when MDBs get messages from MQ queues

Technote (troubleshooting)


You have message-driven beans (MDBs) driven by messages from WebSphere MQ queues. Intermittently, the MDBs experience a delay in which retrieving each message can take up to sixteen seconds. There is no gradual degradation: the MDBs run normally, then experience these delays, and then resume normal processing. The delays usually last 30 seconds or less before the MDBs return to retrieving messages within a few milliseconds.


A review of the MQ and MQ JMS traces taken at the time of the delays shows that the messages were on the queue and available while the MDBs were experiencing a delay. The MDBs browse the messages before issuing an MQGET to remove a message from the queue. The delayed messages have a priority of 5 in the MQ message descriptor (MQMD), while most of the messages being browsed have a lower priority of 0 or 4.
The delay occurs because the thread issuing the MQGET is using the MQGMO_BROWSE_NEXT option and has already browsed to a later point in the queue than the point where the priority 5 messages were committed. Each browsing thread from the MDBs has its own browse cursor, and the cursor's position in the queue dictates the next message that the thread is able to get. Messages committed at a position earlier in the queue than the browse cursor's current position are not gettable by that thread until it issues a new MQGET call using the MQGMO_BROWSE_FIRST option. That call resets the browse cursor to the start of the queue, which makes the higher priority messages visible to the browsing thread.

The delayed messages are successfully delivered after the browsing thread uses MQGMO_BROWSE_FIRST. The delay is accounted for by the application making MQGMO_BROWSE_NEXT + MQGMO_WAIT calls at a time when the browse cursor has already moved past the priority 5 messages.

The following example illustrates how a browse cursor works and why these delays occur. Suppose that a priority queue contains the following sequence of messages:
Message A priority 4
Message B priority 4
Message C priority 3

Message B was put onto the queue after Message A, so it appears second. Messages at the same priority are stored in arrival sequence. Now suppose the queue is opened by a single application, which then issues a get with MQGMO_BROWSE_FIRST. The application's browse cursor is pointed at Message A, and MQ returns a copy of Message A in the application buffer. Because this was a browse call, Message A remains on the queue.

The queue now looks like this:
Message A priority 4 <-- application browse cursor
Message B priority 4
Message C priority 3

Now suppose that a second application opens the queue for output and puts a new message (D) onto the queue at priority 5. In an MQ priority queue, 5 is a higher priority than 4, so this message is added at the start of the queue.

The queue now looks like this:
Message D priority 5
Message A priority 4 <-- application browse cursor
Message B priority 4
Message C priority 3

When the browsing application issues its next get with MQGMO_BROWSE_NEXT, its browse cursor moves down to the next message on the queue after the current position, and the application is given a copy of Message B.

The queue now looks like this:
Message D priority 5
Message A priority 4
Message B priority 4 <-- application browse cursor
Message C priority 3

Even though Message D is earlier in the queue and has not yet been seen by the application, MQGMO_BROWSE_NEXT is designed to move down to the next message in the queue. The browse continues in this manner until the browsing application reaches the end of the queue. At that point, the application receives reason code 2033 (MQRC_NO_MSG_AVAILABLE) and knows that it has reached the end. It then issues a get with MQGMO_BROWSE_FIRST again, which resets the browse cursor to the top of the queue, and only now does it get a copy of Message D.

The performance delay in this case was accounted for by the time it took to browse the remainder of the queue before re-issuing a get with the MQGMO_BROWSE_FIRST option. During that time, the higher priority message sat at the top of the queue without being processed until the browse first call was issued.
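
The walkthrough above can be reproduced with a small simulation. This is only a toy model written for illustration: the classes PriorityQueueModel and BrowseCursor and their methods are hypothetical names, not part of any MQ API, and the model ignores syncpoints, waits, and multiple threads. It shows Message D being skipped by successive browse-next calls and reached only after a browse-first reset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # compare by identity so the cursor tracks one specific message
class Message:
    name: str
    priority: int


class PriorityQueueModel:
    """Toy model of a priority-ordered queue (not a real MQ API)."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def put(self, message: Message) -> None:
        # Higher priority goes first; equal priority keeps arrival order.
        index = 0
        while (index < len(self.messages)
               and self.messages[index].priority >= message.priority):
            index += 1
        self.messages.insert(index, message)


class BrowseCursor:
    """Toy model of one application's browse cursor (hypothetical class)."""

    def __init__(self, queue: PriorityQueueModel) -> None:
        self.queue = queue
        self.current: Optional[Message] = None

    def browse_first(self) -> Optional[Message]:
        # Models MQGMO_BROWSE_FIRST: reset the cursor to the top of the queue.
        messages = self.queue.messages
        self.current = messages[0] if messages else None
        return self.current

    def browse_next(self) -> Optional[Message]:
        # Models MQGMO_BROWSE_NEXT: move to the message after the cursor.
        # Returning None stands in for reason code 2033.
        messages = self.queue.messages
        if self.current is None:
            return self.browse_first()
        position = messages.index(self.current)
        if position + 1 >= len(messages):
            return None
        self.current = messages[position + 1]
        return self.current


queue = PriorityQueueModel()
for name, priority in [("A", 4), ("B", 4), ("C", 3)]:
    queue.put(Message(name, priority))

cursor = BrowseCursor(queue)
seen = [cursor.browse_first().name]      # browses Message A
queue.put(Message("D", 5))               # lands ahead of the cursor

message = cursor.browse_next()
while message is not None:               # browses B, then C, then end of queue
    seen.append(message.name)
    message = cursor.browse_next()

seen.append(cursor.browse_first().name)  # the reset finally reaches D
print(seen)  # ['A', 'B', 'C', 'D']
```

In this model the queue order after the put is D, A, B, C, yet the browsing application sees D last, which is the behaviour behind the delay.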

Resolving the problem

Several strategies are available for avoiding the symptom.

One simple option is to change the queue to be a First In First Out (FIFO) queue. You can do this by issuing the following command in runmqsc, where QNAME is the name of your queue: ALTER QLOCAL(QNAME) MSGDLVSQ(FIFO). Messages are then delivered in arrival sequence instead of in priority order.
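
As a rough sketch of why arrival sequence helps, the toy model below compares where a new message lands under the two delivery sequences. The function names are hypothetical, and the model assumes the queue used the given delivery sequence from the start and that only one application is browsing. It is not MQ code.

```python
def insert_position(queue, priority, delivery_sequence):
    """Return the index where a new message lands in this toy model."""
    if delivery_sequence == "FIFO":
        return len(queue)  # arrival sequence: always at the end
    # PRIORITY: after the last message of equal or higher priority
    index = 0
    while index < len(queue) and queue[index][1] >= priority:
        index += 1
    return index


def browsed_before_reset(delivery_sequence):
    """Names returned by browse-next calls before the end of the queue is hit."""
    queue = [("A", 4), ("B", 4), ("C", 3)]
    cursor = 0  # the application has already browsed Message A
    position = insert_position(queue, 5, delivery_sequence)
    queue.insert(position, ("D", 5))
    if position <= cursor:
        cursor += 1  # the cursor stays on Message A
    return [name for name, _ in queue[cursor + 1:]]


fifo_seen = browsed_before_reset("FIFO")
priority_seen = browsed_before_reset("PRIORITY")
print(fifo_seen)      # ['B', 'C', 'D']
print(priority_seen)  # ['B', 'C']
```

With arrival sequence, the new message is always placed after the browse cursor, so the browse reaches it without first hitting the end of the queue and resetting.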

From an application server perspective, there are a couple of options available to help ensure messages are processed in a more timely manner.

1. Reconfigure the application server to use non-ASF mode. Non-ASF mode changes the mechanism that the application server uses to get messages: rather than browsing the queue, it simply issues MQGET calls. This means that MDBs are always given the message that is currently at the top of the queue. More details about non-ASF mode, including how to configure the application server to use it, can be found in the developerWorks article titled "When to use ASF and non-ASF modes to process messages in WebSphere Application Server".

2. Change the custom property eoqtimeout. This property sets the length of time that the queue agent waits, after reaching the end of the queue during a scan, before going back to the top of the queue and beginning a new scan. Information about how to change this timeout value can be found in the technote titled "How to modify the Generic JVM Arguments and Java system properties for a WAS server".
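
To illustrate option 1, the sketch below models a consumer that gets messages destructively from the top of a priority queue instead of browsing. The class DestructiveGetQueue is a hypothetical name used only for this illustration. It models queue ordering alone and says nothing about how the application server implements non-ASF mode.

```python
import heapq
from itertools import count


class DestructiveGetQueue:
    """Toy priority queue whose get() always removes the message at the top."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker: arrival order within a priority

    def put(self, name, priority):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name))

    def get(self):
        # Returns None when the queue is empty.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = DestructiveGetQueue()
for name, priority in [("A", 4), ("B", 4), ("C", 3)]:
    queue.put(name, priority)

delivered = [queue.get()]   # Message A
queue.put("D", 5)           # higher priority message arrives
delivered += [queue.get(), queue.get(), queue.get()]
print(delivered)  # ['A', 'D', 'B', 'C']
```

Because there is no browse cursor to move past it, Message D is delivered on the very next get in this model.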

Related information

The browse cursor
MQGMO options

Product Alias/Synonym

WebSphere MQ WMQ MQSeries

Document information

More support for:

WebSphere MQ

Operating system(s):

AIX, HP-UX, Linux, Solaris, Windows
