IBM Support

IBM FileNet Workflow System / Process Engine environment record caching

Question & Answer


Question

How does the Workflow System handle user and group membership information?

Answer

The IBM Content Platform Engine's Workflow System (known as the Process Engine in the 5.0 and earlier releases) gets information about users and groups by calling into the IBM Content Platform Engine's Content Engine APIs. The Content Engine APIs return the requested user and group information, calling into the LDAP directory server as necessary.

The Workflow System stores (caches) information about each user that logs on and about each user and group that is referenced by a workflow or security setting in the Workflow System. This user and group information is saved as "environment records" in the Workflow System's database.

Without this caching, the Workflow System would have to call into the Content Engine APIs much more often than it does when it uses its cache. The cache helps improve the Workflow System's performance by avoiding calls to other servers to get information about users and groups.

How the Workflow System Caches User and Group Information

The Workflow System might need a particular user's Domain Name or email address, or the Workflow System might need to know the list of users who are members of a particular group (e.g., for workflow routing or security purposes). The Workflow System stores this information in "environment records."

The Workflow System maintains a cache of "environment records" (user and group information) in the VWUser table in the Workflow System's database. This is for performance reasons, as it's much faster to get information from the Workflow System's database than it is to call into the Content Engine APIs which, in turn, may need to make multiple calls into LDAP to get the required information.

The Workflow System also maintains a memory cache of environment records based on what is in the Workflow System's database because it's faster to use what's in memory than to read the database.

An environment record in the database includes a timestamp that the Workflow System maintains. This timestamp records when the information for that user or group was last refreshed from the Content Engine.

An environment record in the memory cache includes a timestamp that the Workflow System maintains. This timestamp records when the information for that user or group was last refreshed from the Workflow System database.

An environment record, once read into the memory cache or placed into the database, is considered "not stale" for a period of time specified by the "Cached Entry Timeout" setting. For example, with default settings, the Workflow System will use group membership and other user information for four hours before it considers the information to be "stale" and in need of being refreshed.

You can configure how long an environment record (in the memory cache or in the Workflow System's database) is considered valid by adjusting the "Cached Entry Timeout" setting on the "Advanced" tab of the "Workflow System" in ACCE (or, for PE 5.0, on the Advanced tab of the "Process Engine" node using the Process Task Manager).

When the Workflow System needs user or group information, it first checks its memory cache. If the required record is found in its memory cache and its timestamp is recent enough so that it is "not stale," that information will be used.

If the required environment record is not currently in the memory cache, or if the environment record in the memory cache is stale, the Workflow System will generally read the environment record from the Workflow System's database. See discussion below for the flushGranularity option which can inhibit reading the database in some circumstances.

If the required environment record is not found in the database, the Workflow System will immediately call into the Content Engine APIs to get the required information, which will then be used to create the environment record in the database and in the memory cache.

If the required environment record exists in the database but is found to be stale, the Workflow System may or may not call into the Content Engine APIs to get updated information. See the discussions below for the CacheUseExpiredIfNecessary and flushGranularity options. Depending upon those options, either the older, stale group membership information will be used, or the Workflow System will immediately call into the Content Engine APIs to get the required information.
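The lookup order described above (memory cache, then database, then Content Engine) can be sketched roughly as follows. This is an illustration only, not the product's implementation: the names EnvRecord, lookup, and fetch_from_ce are hypothetical, the flushGranularity check is left out, and the handling of the memory copy when stale database data is tolerated is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class EnvRecord:
    name: str
    members: Tuple[str, ...]
    refreshed_at: float  # seconds; when this copy was last refreshed


def is_stale(rec: EnvRecord, now: float, timeout_hours: float) -> bool:
    return (now - rec.refreshed_at) >= timeout_hours * 3600


def lookup(name: str, now: float,
           memory: Dict[str, EnvRecord], database: Dict[str, EnvRecord],
           fetch_from_ce: Callable[[str], Tuple[str, ...]],
           timeout_hours: float = 4.0,
           use_expired_if_necessary: bool = True):
    """Return (record, source), trying memory, then database, then the CE."""
    cached = memory.get(name)
    if cached is not None and not is_stale(cached, now, timeout_hours):
        return cached, "memory"

    stored = database.get(name)
    if stored is None:
        # First reference to this user or group: immediate call into the CE.
        stored = EnvRecord(name, fetch_from_ce(name), now)
        database[name] = stored
        source = "content-engine (new record)"
    elif is_stale(stored, now, timeout_hours):
        if use_expired_if_necessary:
            # Assumption: the memory copy is not re-stamped here, so the
            # next access rereads the database and picks up any refresh.
            return stored, "database (stale, tolerated)"
        # Stale data is not tolerated: "on demand" call into the CE.
        stored = EnvRecord(name, fetch_from_ce(name), now)
        database[name] = stored
        source = "content-engine (on demand)"
    else:
        source = "database"

    # The memory copy is stamped with the time it was read from the database.
    memory[name] = replace(stored, refreshed_at=now)
    return memory[name], source
```

In this sketch, the only paths that reach fetch_from_ce are a record that is missing from the database and a stale database record when stale data is not tolerated.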

In general, environment records in the Workflow System's database should almost never be stale. This is because the Workflow System's vwusrsync daemon, running in the background, refreshes the environment records automatically with a goal that no environment record in the database is ever stale. So, by the time an environment record in the memory cache becomes stale, the vwusrsync daemon should have updated the database with newer, non-stale information. Simply reading the database therefore generally gets the Workflow System the updated user or group information that it needs, and workflow processing continues.

When the Workflow System has to make a call into the Content Engine APIs because it's found a stale environment record in the database, this is an "on demand" update. The whole point of caching user and group information in the Workflow System is to avoid the relatively expensive and disruptive "on demand" calls into the Content Engine (and LDAP) while processing a running workflow.

As long as the vwusrsync daemon is running and able to keep up with its background "environment record refresh" work, a relatively expensive "on demand" call into the Content Engine to get user and group information is rare. That is, as a workflow is being processed, up-to-date information for a user or group should always be in the database where it can be efficiently accessed when the memory cache copy expires. An "on demand" access to the Content Engine normally happens only for the very first time a new group or user is referenced by the Workflow System.

When an on demand call is made because of stale persistent data, recent releases of the Workflow System (CPE 5.2.1.2, CPE 5.2.0.4, and PE 5.0.0.9) will log an "On demand call for user/group: <name>" message into the Workflow System log (e.g., pesvr_system.log). If there are many of these informational records in the log (other than in the first few hours after the Workflow System is initiated), that may mean that vwusrsync is having trouble keeping the environment records up to date.

Balancing Use of Stale Environment Records With Response Times

The basic "Is the cached environment record stale?" question is answered by comparing the current time with the timestamp in the environment record. If the difference between those two times is less than the Cached Entry Timeout setting, the data in the cached environment record is considered to be good; otherwise it's stale. This check pertains to an environment record in the memory cache as well as to an environment record in the database.
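As a minimal sketch of that comparison (the function and parameter names are illustrative, not product identifiers):

```python
from datetime import datetime, timedelta


def is_stale(refreshed_at: datetime, now: datetime,
             cached_entry_timeout_hours: float = 4.0) -> bool:
    """A record is good while (now - timestamp) is less than the timeout."""
    if cached_entry_timeout_hours == 0:
        # Per the Cached Entry Timeout setting, zero means that records
        # in the memory cache never expire.
        return False
    return (now - refreshed_at) >= timedelta(hours=cached_entry_timeout_hours)


refreshed = datetime(2000, 1, 1, 13, 0)                     # refreshed at 1:00 PM
print(is_stale(refreshed, datetime(2000, 1, 1, 16, 59)))    # False
print(is_stale(refreshed, datetime(2000, 1, 1, 17, 0)))     # True
```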

However, before the Workflow System treats an environment record in memory as stale, it makes two "extra" checks. Two configuration settings can let the Workflow System use stale environment records rather than do extra work to get fresher data: flushGranularity and CacheUseExpiredIfNecessary.

flushGranularity

If some other environment record in the memory cache has recently been declared stale, so that the Workflow System had to refresh it (either by reading the environment record from the database or by making an "on demand" call into the Content Engine to get up-to-date user or group information), and that previous refresh happened within the last <flushGranularity> seconds (2 seconds by default), the Workflow System will use the current, stale environment record as if it were not stale (even though its timestamp says it is stale). Essentially, with the flushGranularity option enabled, the Workflow System will sometimes accept and use stale environment records to avoid "too many" hits in a short time period on the Content Engine APIs for group and user information.

But the flushGranularity option also inhibits the Workflow System from reading in the current environment record from its database, where the updated data may already be present because of the vwusrsync daemon.

Setting the flushGranularity option to zero tells the Workflow System not to skip reading the Workflow System database (or calling the Content Engine APIs) just because there was a recent call to read an environment record or call the Content Engine APIs. Setting flushGranularity to zero is recommended.
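The flushGranularity decision can be pictured with a small sketch. The function below is hypothetical and only models the rule as described here: a stale record in memory is used anyway if another record was refreshed within the last flushGranularity seconds.

```python
from typing import Optional


def use_memory_copy_as_is(record_is_stale: bool, now: float,
                          last_refresh_time: Optional[float],
                          flush_granularity: float = 2.0) -> bool:
    """Return True if the in-memory record should be used without a refresh."""
    if not record_is_stale:
        return True
    if flush_granularity <= 0 or last_refresh_time is None:
        # Zero disables the shortcut: a stale record always triggers a
        # read of the database (or a call into the Content Engine).
        return False
    return (now - last_refresh_time) < flush_granularity


# Another record was refreshed 1 second ago: the stale copy is used anyway.
print(use_memory_copy_as_is(True, now=100.0, last_refresh_time=99.0))   # True
# With flushGranularity set to 0, the stale copy is always refreshed.
print(use_memory_copy_as_is(True, now=100.0, last_refresh_time=99.0,
                            flush_granularity=0))                       # False
```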

CacheUseExpiredIfNecessary

Another option that lets the Workflow System use stale group information in order to provide better response times is the CacheUseExpiredIfNecessary option. This option was added in CPE 5.2.1.2, CPE 5.2.0.4, and PE 5.0.0.9, where it is enabled by default. The option is in PE 5.0.0.8 as well, but is not enabled by default there. With this option, the Workflow System reads the database to get more up-to-date group information (assuming the vwusrsync daemon has been updating the database). The Workflow System then uses what it finds in the database (even if it's stale) and assumes that vwusrsync will be updating the information in the database soon. This is an improvement over the older flushGranularity mechanism because it lets the Workflow System read its database to get updated group information. The Workflow Server can carry on with out-of-date group membership information, but it will still refresh the group membership information as soon as vwusrsync has updated the database. Leaving CacheUseExpiredIfNecessary enabled is recommended.


How a Workflow System Keeps User and Group Information Up To Date

The Workflow System uses a background daemon called vwusrsync that automatically refreshes the environment records in the Workflow System's database with the goal that the records in its database never actually become stale. The vwusrsync daemon's goal is to make sure that if any Workflow System server reads an environment record from the Workflow System's database, it will find data that is not stale. This means that a Workflow System, while processing a work item, should not have to wait for user or group membership information to be refreshed.

The vwusrsync daemon is automatically initiated when a Workflow System is started. By default, every 120 seconds, it wakes up and refreshes a subset of the environment records.

The number of records refreshed on each “refresh interval” is calculated by the Workflow System. The calculation uses the number of environment records, the number of Workflow Systems, the cached entry timeout setting (how long environment records last before they expire), and the cache sync interval setting (how often the daemon wakes up to refresh some environment records).

The number of records to be refreshed each time a vwusrsync daemon wakes up is calculated as
( (CacheSyncInterval * environmentRecordCount / numberOfWorkflowServers) / (CachedTimeOutHours * 3600) ) + 1.

For example, let's assume you have:

  • 3 Workflow Servers
  • 3000 environment records in the Workflow System's VWUser table
  • Cached Entry Timeout at the default of 4 hours
  • CacheSyncInterval at the default of 120 seconds

With the above numbers, every 120 seconds, a Workflow Server's vwusrsync daemon will wake up and refresh nine environment records in the VWUser table. The "nine" is calculated, using the above formula, as ((120 * 3000 / 3) / (4 * 3600)) + 1 = 9.3, or 9 environment records every 120 seconds.

At this refresh rate, the three vwusrsync daemons (one on each Workflow Server) will be working at a rate such that each environment record in the VWUser table will be updated every four hours, so environment records in the database will never become stale.
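The arithmetic in this example can be checked with a short sketch. The function names are illustrative, and truncating the result to a whole number follows the "9.3, or 9" wording above.

```python
def records_per_wakeup(cache_sync_interval_s: int, record_count: int,
                       server_count: int, cached_timeout_hours: float) -> int:
    """Batch size refreshed each time one vwusrsync daemon wakes up."""
    per_server = cache_sync_interval_s * record_count / server_count
    return int(per_server / (cached_timeout_hours * 3600) + 1)


batch = records_per_wakeup(120, 3000, 3, 4)
wakeups_per_timeout = (4 * 3600) // 120      # wake-ups in one 4-hour window
farm_total = batch * 3 * wakeups_per_timeout

print(batch)        # 9
print(farm_total)   # 3240, enough to cover all 3000 records in four hours
```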

On PE 5.0 releases, things are slightly different, but the end result is the same. In PE 5.0, the number of servers is not directly included in the calculation. But PE 5.0 adds a rule that says "don't refresh environment records that are not at least 80% of the way to being stale." So, in the previous example, if we had a PE 5.0 farm of three Process Engines, vwusrsync on each server would try to refresh up to about 25 users every two minutes. But because all three daemons are refreshing the same set of old environment records, a vwusrsync daemon may wake up and find that it only needs to refresh nine or so environment records rather than 25, because the threshold rule says that the other "oldest records" are not yet old enough to require a refresh (the vwusrsync daemons on the other Process Engine servers have already taken care of them).


Configuration Settings

The following settings let you configure how the user and group caching works in the Workflow System. These settings can be defined and adjusted on the "Advanced" tab of the Workflow System in ACCE (or for PE 5.0, on the “Advanced” tab of the "Process Engine" node in the Process Task Manager).

Cached Entry Timeout - number of hours an environment record (in memory or in the database) is considered current and valid. This defaults to four hours. Setting Cached Entry Timeout to zero tells the Workflow System that the data in the memory cache never expires and the Workflow System will use any user and group data that it finds in the memory cache (no matter how old) without refreshing it by accessing the database or by communicating to the Content Engine. Note that setting Cached Entry Timeout to zero does not disable the vwusrsync daemon. If the Cached Entry Timeout is set to zero (never expires), the vwusrsync daemon will use four hours when calculating how quickly to refresh the environment records in the database.

CacheUseExpiredIfNecessary - This option was added in these releases: CPE 5.2.1.2, CPE 5.2.0.4, and PE 5.0.0.9. CacheUseExpiredIfNecessary defaults to true. The default behavior of this new option is to provide better response times at the expense of using slightly out of date group membership information if necessary. This option can be disabled by setting it to “False”, “F”, or “0”.
    If an older release is being used and CacheUseExpiredIfNecessary is not part of that release and the Workflow System encounters an expired environment record, the database will be read and then, if necessary, the calls will be made into the Content Engine APIs to get the updated information on that group. This can be expensive and disruptive when groups contain many members. End users can notice the resulting slow response times.

    But when CacheUseExpiredIfNecessary is enabled, the Workflow System's environment record caching behavior changes so that a stale environment record in memory will trigger the Workflow System to read its database (which is generally kept up to date by vwusrsync) to get an updated environment record. But with CacheUseExpiredIfNecessary enabled, even if the environment record in the database is stale, the Workflow System will use the stale database information rather than incurring the costs and response time impacts that are seen when the "on demand" calls into the Content Engine and LDAP are used to refresh group membership information.

    Generally, these "on demand" calls into the Content Engine to get updated group membership information happen only if the vwusrsync daemon was unable to keep the environment records in the database up to date in the time allowed by the Cached Entry Timeout setting.

    With CacheUseExpiredIfNecessary enabled, the vwusrsync daemon will continue to do its background refreshing of the user and group information in the Workflow System's database. And the Workflow System's runtime will still notice that the memory cache group information is stale and re-read the group membership data from the database (a relatively fast and inexpensive operation). So once the vwusrsync daemon gets the Workflow System's database updated, the new group membership information will be seen by the Workflow System the next time it's accessed. That is, the group membership is still automatically refreshed and brought into play as soon as possible. But, with the CacheUseExpiredIfNecessary enabled, the Workflow System server will tolerate and use older (slightly stale) group membership information and carry on with its work rather than forcing users to wait until the group membership is updated.
    This can delay how quickly LDAP group membership changes are seen by the CPE server.

CacheSyncInterval – number of seconds the vwusrsync daemon will sleep before it updates the next batch of environment records in the Workflow System database. This defaults to 120 seconds. This is also the setting that can completely disable the vwusrsync daemon. If Cache Sync Interval is set to zero, the vwusrsync daemon will be completely disabled and no background refreshing of the user and group environment records in the database will occur. If Cache Sync Interval is non-zero, the vwusrsync daemon will wake up every Cache Sync Interval seconds and refresh some of the oldest user and group environment records. If Cached Entry Timeout is set to zero (meaning cached data in memory never expires) and Cache Sync Interval is non-zero (meaning the vwusrsync daemon will be running to refresh the database), the vwusrsync daemon will refresh the database at a rate that is calculated using "4 hours" as the value for Cached Entry Timeout. Usually, there is not a good reason to change this setting from the default of 120 seconds.
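The interaction between Cache Sync Interval and Cached Entry Timeout described above can be summarized in a small sketch. The function and dictionary keys are hypothetical and only restate the rules in this section.

```python
from typing import Optional

DEFAULT_TIMEOUT_HOURS = 4  # value vwusrsync uses when the timeout is zero


def vwusrsync_plan(cache_sync_interval_s: int,
                   cached_entry_timeout_hours: float) -> Optional[dict]:
    """Return how the daemon would run, or None if it is disabled."""
    if cache_sync_interval_s == 0:
        return None  # no background refreshing of the database at all
    effective = cached_entry_timeout_hours or DEFAULT_TIMEOUT_HOURS
    return {"wake_every_s": cache_sync_interval_s,
            "timeout_hours_used": effective}


print(vwusrsync_plan(120, 4))   # defaults
print(vwusrsync_plan(120, 0))   # memory never expires; daemon still uses 4 hours
print(vwusrsync_plan(0, 4))     # None: daemon disabled
```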

CacheSyncFixupEmail – This option says whether or not the Workflow System will fix up the e-mail addresses in the environment records to match the email addresses found in the directory server. This is the same functionality that is in vwtool’s “env fixup...” command. Setting this attribute to a value of "True", "T", or “1” signifies that the email addresses should be automatically fixed up. Values of “False”, “F”, or “0” signify no fix up of email addresses. Default is False. So, by default, email addresses being “remembered” by the Workflow System in the user environment records will not be fixed up by the vwusrsync daemon. If the vwusrsync daemon is inhibited from running, the Cache Sync Fixup Email setting will have no effect.

flushGranularity - This setting allows the Workflow System to sometimes use stale environment records in the Workflow System server's memory. The Workflow System will use stale user or group environment records in memory if, within the previous flushGranularity seconds, it has had to refresh another stale environment record by reading its database or doing an "on demand" call into the Content Engine. The purpose of the flushGranularity option is to attenuate the calls into the Workflow System database or into the Content Engine server for the cases where the Workflow System finds a large set of environment records in memory that are all expired and all need refreshing. A non-zero flushGranularity setting lessens the overhead on the database and, more importantly, on the Content Engine server at the expense of forcing the Workflow System to accept and work with stale user/group information. Note that with the vwusrsync daemon running and keeping up, the Workflow System should almost never find a stale user/group environment record in its database. This attribute defaults to two seconds, but, as long as vwusrsync has not been disabled, the flushGranularity option should generally be explicitly set to zero, which lets the Workflow System always read the database to get updated environment records. The vwusrsync daemon itself will always access the Content Engine when it needs to refresh an environment record in the database, no matter how flushGranularity is configured.


How Long Will It Take For An LDAP Group Membership Change To Be Visible in the Workflow System?

It depends on the state of the caches. On a Workflow System server that is running with the default parameters, it can take up to eight hours before an LDAP change is visible to Workflow System users, although the average time is probably closer to four hours.

For example, let's look at a "worst case" example, assuming the default Cached Entry Timeout of 4 hours, and assuming that the vwusrsync daemon is keeping up with its background refresh work:

1:00 PM - The vwusrsync daemon refreshes the Workflow System's database for the Admins group. So the database will still have the current (original) group membership information for four more hours.

1:05 PM - Group Membership for Admins group in LDAP is changed.

4:55 PM - The Workflow System needs the Admins group information and the PE's memory cache record has just expired, so it reads the Admins group information from the database. The database still contains the old group membership information from the 1:00 PM update and the database's information is still good for another 5 minutes. So the Workflow System puts that copy of that environment record in the memory cache and the memory cache copy is given a new, "good for four hours" timestamp.

5:00 PM - The database information for the Admins group is now going stale and the vwusrsync daemon refreshes the database with the new group membership information, picking up the LDAP change that happened at 1:05 PM. So the group membership information for the Admins group is now up to date in the Workflow System database, but the memory cache still contains the older group information that was read in 5 minutes ago and will be good for the next 4 hours or so.

8:50 PM - An access to the Admins group membership information finds the old environment record in the memory cache, and that copy has another five minutes before it goes stale. So the Workflow System uses the old group membership information and is not yet seeing the LDAP change that was made at 1:05 PM.

9:00 PM - An access to the Admins group membership information will now find that the copy in the memory cache is stale. The Workflow System then reads a new copy from the database. At this point, the LDAP group membership change that was made at 1:05 PM is now seen in the memory cache of the Workflow System.
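The timeline above can be replayed with a few lines of date arithmetic. The calendar date is arbitrary, and the variable names are illustrative only.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(hours=4)           # default Cached Entry Timeout
day = datetime(2000, 1, 1)             # arbitrary date for the example

db_refresh = day.replace(hour=13)                   # 1:00 PM, old membership
ldap_change = day.replace(hour=13, minute=5)        # 1:05 PM
memory_reread = day.replace(hour=16, minute=55)     # 4:55 PM, reads old data

db_next_refresh = db_refresh + TIMEOUT              # 5:00 PM, new membership
memory_expiry = memory_reread + TIMEOUT             # 8:55 PM

# The change becomes visible on the first access after both the database
# has been refreshed and the memory copy has expired.
visible_from = max(db_next_refresh, memory_expiry)
print(visible_from - ldap_change)                   # 7:50:00
```

In this worst case, the change is hidden for just under eight hours, which is roughly twice the Cached Entry Timeout.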


Manually updating group membership information for a single group

If a group's membership list has been changed in LDAP but the Workflow System is not yet seeing the changes because of its caching, the new group membership information can be brought into play by using vwtool's env cache command. If this is being done on a CPE 5.2.x system, the command has the form env cache <groupname> <server name> because vwtool needs to know which server's memory cache to update. If this is being done on a PE 5.0 system, the command has the form env cache <groupname> because the PE 5.0 vwtool program is running on a particular server.

As of CPE 5.2.1.2, CPE 5.2.0.4, and PE 5.0.0.9, this command will call into CE/LDAP to get the current group membership information, update the Workflow System's database, and then refresh the Workflow System's memory cache with the new group information. This brings the new group membership into use immediately.

If you enter just env cache <groupname> on CPE 5.2.x and omit the <server name> part, vwtool will prompt with the names of the servers.

On older releases, this command would update the Workflow System's database to contain the new, current group membership information. But on older releases, vwtool did not force the Workflow System to refresh its memory cache, so the updated group membership information wouldn't necessarily be brought into immediate use. On older releases, vwtool's flushusercache command could then be used to cause the Workflow System's entire environment record memory cache to be flushed. That would cause the Workflow System to re-read the now-updated information from the database and thus see the new group membership information.

Why isn't group membership being updated?

In recent releases (CPE 5.2.1.2, CPE 5.2.0.4, and PE 5.0.0.8) when the vwusrsync daemon finds that an environment record references a user or group that no longer exists in the LDAP server, it sets the timestamp on that environment record to a date far in the future (in the year 9999). This marks the record so that the vwusrsync daemon will no longer try to update the information in that record; vwusrsync ignores any environment record with the timestamp set to the year 9999. This avoids unnecessary overhead on the CE and LDAP servers and allows the vwusrsync daemon to spend its time refreshing the users and groups that are currently in LDAP. And it allows a site to keep old environment records around without significant overhead in case they're being referenced by other in-flight workflows.

A vwtool env v p <group name> command will show the timestamp for a group in the Workflow System's database. If the year in the timestamp is 9999, that means that the group was marked as no longer being visible in LDAP and so the Workflow Server is no longer attempting to refresh its group membership information.
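The year-9999 convention amounts to a simple check on the record's timestamp. The sketch below is illustrative only; the record names and the helper function are hypothetical, and the timestamps are made-up sample values.

```python
from datetime import datetime

MISSING_FROM_LDAP_YEAR = 9999


def is_flagged_missing(timestamp: datetime) -> bool:
    """True if the record was marked as no longer visible in LDAP."""
    return timestamp.year == MISSING_FROM_LDAP_YEAR


# Hypothetical timestamps, as might be read from env v p output.
records = {
    "Admins": datetime(2018, 6, 17, 13, 0),
    "OldProjectGroup": datetime(9999, 12, 31),
}

skipped = sorted(name for name, ts in records.items() if is_flagged_missing(ts))
print(skipped)   # ['OldProjectGroup']
```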

If the timestamp year is 9999, that implies that (at some point) the Content Engine's interface to the LDAP server could not see or find that group. Can the CE APIs see that group now?

Use vwtool to verify that Content Engine is seeing a particular group

As of CPE 5.2.1.3, vwtool's sectool x <group name> will call into the CE APIs and expand the specified group and show the group membership information. If that information does not show the group along with the expected group membership, it's likely that there is a configuration issue in the Content Engine (or the app server itself) as to how LDAP is being accessed.

If the sectool command sees the group, you can refresh it manually using vwtool.

Manually updating group membership for a specific group

There are two times when this might be desired:
  1. A group has had its environment record flagged as "deleted from LDAP" (with the expiration date in the year 9999) but is now visible in LDAP again, and you'd like to explicitly update the group membership information for that group and reinstate automatic updating of the members for that group.
  2. A recent change has been made to LDAP group membership, and you'd like to get the new group membership information into play as quickly as possible rather than waiting for vwusrsync and the normal Workflow System cache updates.

The vwtool utility can be used to do an env cache <group name> command to refresh the group membership information on all of the servers. Specifically, on PE 5.0, on each of the PE Servers, run vwtool and execute the env cache <group name> command. And on CPE 5.2.x, run vwtool and execute the env cache <group name> <server name> command for each server. This causes the Workflow System to immediately refresh the group membership information for that group in the database and in the Workflow System's memory cache. This command also reinstates the automatic vwusrsync refreshing of that particular group by resetting the environment record's timestamp in the database to "now."

Then, use vwtool to run the env v p <group name> command again. You should see that the group membership information is as expected and that the timestamp is no longer in the year 9999. Check that the memory cache has been updated with the expected/current group membership information as well:
For PE 5.0, use: env v c <group name>
For CPE 5.2.x, use: env v c <group name> <server name>
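
As a convenience, the sequence of vwtool commands described above can be listed per server with a short sketch. It only assembles the command strings quoted in this document; it does not run vwtool, the server names are hypothetical, and how vwtool treats group names that contain spaces is not covered here.

```python
from typing import List, Optional, Sequence


def refresh_checklist(group: str,
                      servers: Optional[Sequence[str]] = None) -> List[str]:
    """List the vwtool commands to enter, in order, for one group.

    With no servers, the PE 5.0 forms are produced (to be repeated on
    each PE server); with servers, the CPE 5.2.x forms are produced.
    """
    if not servers:
        return [f"env cache {group}", f"env v p {group}", f"env v c {group}"]
    commands = [f"env cache {group} {server}" for server in servers]
    commands.append(f"env v p {group}")
    commands += [f"env v c {group} {server}" for server in servers]
    return commands


# Hypothetical server names for a two-server CPE 5.2.x farm.
for line in refresh_checklist("Admins", ["cpe1", "cpe2"]):
    print(line)
```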

Check the Workflow System (Process Engine) server logs to see if there are any problems updating the group information.

If none of the above helps explain why a group's membership information is not as expected, check the server logs (e.g., pesvr_system.log) to see if any errors or exceptions are being seen from vwusrsync. If the server logs don't shed any light on the issue, more information can be gathered by enabling the "environment cache" and "security" traces using vwtool and letting vwusrsync attempt to refresh the group. This could require leaving those traces enabled for <Cached Entry Timeout> hours.

Manually updating all environment records

A vwtool env i f m y command can be used to check the environment records of all users and groups. For any users or groups that were flagged with the expiration date in the year 9999 but are now found in LDAP, vwtool will refresh their information and reinstate the automatic vwusrsync refreshing for them. Note that env i f m y can take hours if you have multiple thousands of users and groups.


Some Configuration Recommendations

Out-of-the-box behavior - no changes to the caching settings
    The "out-of-the-box" behavior is that the Workflow System will automatically keep all user environment records up-to-date such that, in the database, none of them are over four hours old. The updates to the environment records are nicely spread out over a four hour time period. LDAP changes to group information are usually brought into the Workflow System's memory cache in around 4 hours, but could take as long as 8 hours to appear. Response times should be optimal because the Workflow System will tolerate stale group membership information rather than forcing disruptive "on demand" calls into the CE and LDAP if vwusrsync's background refreshing of group membership information is a little behind schedule.
You can set flushGranularity to 0 (recommended)
    Leaving flushGranularity at its default setting of 2 seconds can inhibit the Workflow System from reading its database to get updated group information. Reading the database is a cheap operation with negligible performance impact, and it's generally worth doing because it can bring in updated group information. So it is recommended to explicitly set flushGranularity to 0. A non-zero setting can cause confusion where it's clear that the VWUser table contains updated group membership information but that information wasn't brought into the memory cache (possibly because of this option). Usually, explicitly setting flushGranularity to 0 is the best choice.
You can disable the CacheUseExpiredIfNecessary option
    You can disable the Workflow System's ability to tolerate slightly expired group membership information by explicitly disabling this option. If you disable this feature and the vwusrsync daemon is unable to update the group membership information in a timely manner, you may see severe response time degradation that users of the system will notice. With the option enabled, response times will remain good, and vwusrsync will refresh the stale group membership information shortly, and the updated group membership information will be automatically brought into play after that. Any time the Workflow System uses stale data because of having CacheUseExpiredIfNecessary enabled, that fact is logged in the server logs. Usually, leaving the CacheUseExpiredIfNecessary option enabled is the best choice.

You can set the Cached Entry Timeout to a longer time period
    This will lower the overhead associated with keeping environment records up to date. This lets the Workflow System refresh the records at a slower rate so that the CE and LDAP servers are not impacted as heavily by the calls to fetch that information. This can be useful if the LDAP group membership lists are not being changed very often.

You can set the Cached Entry Timeout to a shorter time period
    This will let the Workflow System see more up-to-date user and group information at the expense of higher overheads caused by more frequent calls into the Content Engine (and LDAP Server) to get the needed information. This can be useful when there are frequent changes to LDAP group membership lists that must be brought into play more quickly.
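
To get a rough feel for this trade-off, the refresh formula given earlier in this document can be evaluated for several Cached Entry Timeout values. This is only a sketch under the earlier example's assumptions (3 Workflow Servers, 3000 environment records, a 120-second Cache Sync Interval); the function names are illustrative.

```python
def records_per_wakeup(sync_interval_s: int, record_count: int,
                       server_count: int, timeout_hours: float) -> int:
    """Batch size per vwusrsync wake-up, from the formula in this document."""
    per_server = sync_interval_s * record_count / server_count
    return int(per_server / (timeout_hours * 3600) + 1)


def farm_refreshes_per_hour(sync_interval_s: int, record_count: int,
                            server_count: int, timeout_hours: float) -> int:
    """Approximate record refreshes per hour across all servers."""
    wakeups_per_hour = 3600 // sync_interval_s
    batch = records_per_wakeup(sync_interval_s, record_count,
                               server_count, timeout_hours)
    return batch * server_count * wakeups_per_hour


for hours in (1, 4, 12, 24):
    print(hours, farm_refreshes_per_hour(120, 3000, 3, hours))
```

Under these assumptions, shortening the timeout from 4 hours to 1 hour raises the background refresh work from about 810 to about 3060 record refreshes per hour, while lengthening it to 24 hours lowers it to about 180.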

You can set the Cached Entry Timeout to zero and the Cache Sync Interval to zero.
    Setting Cached Entry Timeout to zero tells the Workflow System that environment records in memory never expire. Setting Cache Sync Interval to zero completely disables vwusrsync's automatic, background refreshing of the environment records.

    If a required environment record is not currently in the memory cache, it will be read from the persistent store on disk, and if the required environment record is not yet in the persistent store, a call will be made into the Content Engine APIs so as to create the environment record in the persistent cache.

    If you set Cached Entry Timeout to zero, the Workflow System will never see updated user or group information until vwtool's "environment" command is used to manually update the Workflow System's environment records in the database. This lets a site have complete manual control over when and how often the environment records are refreshed.

    The main reason you might configure a Workflow System to work this way is that you require zero overhead costs associated with environment record updates during the main processing time periods and you can schedule manual updates to the Workflow System's environment caches at other non-critical times. This is not generally recommended but it can be done.


Document Information

Modified date:
17 June 2018

UID

swg21501780