Balancing producers and consumers in publish/subscribe networks

An important concept in asynchronous messaging performance is balance. Unless message consumers are balanced with message producers, there is the danger that a backlog of unconsumed messages might build up and seriously affect the performance of multiple applications.

In a point-to-point messaging topology, the relationship between message consumers and message producers is readily understood. You can get estimates of message production and consumption, queue by queue, channel by channel. If there is a lack of balance, the bottlenecks are readily identified and then remedied.

It is harder to work out whether publishers and subscribers are balanced in a publish/subscribe topology. Start from each subscription, and work back to the queue managers having publishers on the topic. Calculate the number of publications flowing to each subscriber from each queue manager.
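The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the queue manager names, topic strings, subscription names, and rates are hypothetical, the function `inflow_per_subscription` is not part of any product API, and subscriptions are matched to publishers by exact topic string (wildcards are ignored).

```python
from collections import defaultdict

def inflow_per_subscription(publishers, subscriptions):
    """Return {subscription name: {publishing queue manager: msgs/sec}}.

    publishers:    iterable of (queue manager, topic, msgs/sec) tuples
    subscriptions: iterable of (subscription name, topic) tuples
    Assumption: topics are matched by exact string, with no wildcards.
    """
    # Sum the publication rate for each (topic, publishing queue manager) pair.
    rate_by_topic = defaultdict(lambda: defaultdict(float))
    for qmgr, topic, rate in publishers:
        rate_by_topic[topic][qmgr] += rate

    # Start from each subscription and work back to the publishers on its topic.
    inflow = {}
    for name, topic in subscriptions:
        inflow[name] = dict(rate_by_topic.get(topic, {}))
    return inflow

# Hypothetical estimates of publication rates, gathered per queue manager.
publishers = [
    ("QM1", "Price/Fruit", 50.0),
    ("QM1", "Price/Fruit", 25.0),
    ("QM2", "Price/Fruit", 10.0),
    ("QM2", "Price/Veg", 5.0),
]
subscriptions = [
    ("SUB.A", "Price/Fruit"),
    ("SUB.B", "Price/Veg"),
]

result = inflow_per_subscription(publishers, subscriptions)
```

Comparing each subscription's total inflow with the rate at which its application consumes messages gives a first indication of whether that subscriber is balanced with its publishers.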

Each publication that matches a subscription on a remote queue manager (based on proxy subscriptions) is put to a transmission queue. If multiple remote queue managers have proxy subscriptions for that publication, multiple copies of the message are put to a transmission queue, each targeted for a different sender channel.
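The fan-out described above can be modelled roughly as follows. This is a simplified sketch, not the queue manager's actual matching logic: the function name, queue manager names, and topic strings are hypothetical, and proxy subscriptions are assumed to match by exact topic string.

```python
def transmission_copies(publishing_qmgr, topic, proxy_subscriptions):
    """Return the remote queue managers that each receive one copy.

    proxy_subscriptions: iterable of (remote queue manager, topic) tuples
    known to the publishing queue manager.
    Assumption: exact topic-string matching, with no wildcards.
    """
    # A set ensures one copy per remote queue manager, however many
    # matching proxy subscriptions that queue manager has registered.
    targets = {
        qmgr
        for qmgr, sub_topic in proxy_subscriptions
        if sub_topic == topic and qmgr != publishing_qmgr
    }
    return sorted(targets)

# Hypothetical proxy subscriptions held on QM1.
proxies = [
    ("QM2", "Price/Fruit"),
    ("QM3", "Price/Fruit"),
    ("QM3", "Price/Fruit"),
    ("QM4", "Price/Veg"),
]

copies = transmission_copies("QM1", "Price/Fruit", proxies)
```

In this model, the length of the returned list is the number of message copies put to transmission queues for a single publication, which is the multiplier to apply when estimating channel load.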

In a publish/subscribe cluster, those publications are targeted at the SYSTEM.INTER.QMGR.PUBS queue on the remote queue managers that host the subscriptions. In a hierarchy, each publication is targeted at the SYSTEM.BROKER.DEFAULT.STREAM queue, or any other stream queues listed in the SYSTEM.QPUBSUB.QUEUE.NAMELIST on the remote queue managers. Each queue manager processes messages arriving on that queue and delivers them to the correct subscriptions on that queue manager.

For this reason, monitor the load at the following points where bottlenecks might arise:
  • Monitor the load at the individual subscription queues.
    • This bottleneck implies that the subscribing application is not consuming the publications as quickly as they are being published.
  • Monitor the load at the SYSTEM.INTER.QMGR.PUBS queue or the stream queues.
    • This bottleneck implies that the queue manager is receiving publications from one or more remote queue managers faster than it can distribute them to the local subscriptions.
    • If this bottleneck is seen on a topic host queue manager in a cluster that uses topic host routing, consider making additional queue managers topic hosts, so that the publication workload is balanced across them. However, this affects the message ordering across publications. See Topic host routing using multiple topic hosts for a single topic.
  • Monitor the load at the channels between the publishing queue manager and the subscribing queue managers, which are fed by the transmission queues on the publishing queue manager.
    • This bottleneck implies either that one or more channels are not running, or that messages are being published to the local queue manager faster than the channels can deliver them to the remote queue manager.
    • When you use a publish/subscribe cluster, consider defining additional cluster receiver channels on the target queue manager. This allows the publication workload to be balanced across them. However, this affects the message ordering across publications. Also consider moving to a multiple cluster transmission queue configuration, because this can improve performance in certain circumstances.
  • If the publishing application is using a queued publish/subscribe interface, monitor the load at (a) the SYSTEM.BROKER.DEFAULT.STREAM queue, and any other stream queues listed in the SYSTEM.QPUBSUB.QUEUE.NAMELIST; and (b) the SYSTEM.BROKER.DEFAULT.SUBPOINT queue, and any other subpoint queues listed in the SYSTEM.QPUBSUB.SUBPOINT.NAMELIST.
    • This bottleneck implies that messages are being put by local publishing applications faster than the local queue manager can process the messages.
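
One way to act on the monitoring points above is to sample the current depth of each queue at regular intervals and flag the queues whose depth keeps growing. The following sketch assumes the depth samples have already been collected by some other means; the function name, the `min_growth` threshold, the queue name APP.SUB.QUEUE, and the sample values are all hypothetical.

```python
def growing_queues(samples, min_growth=0):
    """Return {queue name: depth growth} for queues whose depth rose.

    samples: {queue name: [depth at t0, depth at t1, ...]} in time order.
    A queue is flagged when its last sample exceeds its first sample by
    more than min_growth messages.
    """
    flagged = {}
    for queue, depths in samples.items():
        if len(depths) < 2:
            continue  # Not enough samples to see a trend.
        growth = depths[-1] - depths[0]
        if growth > min_growth:
            flagged[queue] = growth
    return flagged

# Hypothetical depth samples taken at equal intervals.
samples = {
    "SYSTEM.INTER.QMGR.PUBS": [0, 3, 1, 0],
    "APP.SUB.QUEUE": [10, 250, 900, 2100],
    "SYSTEM.BROKER.DEFAULT.STREAM": [0, 0, 0, 0],
}

backlog = growing_queues(samples)
```

Here only the subscription queue shows a growing backlog, which points to a subscribing application that is not keeping up, rather than to the inter-queue-manager or stream queues.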