IBM Support

DCE IPC Resource Usage

Troubleshooting


Problem

DCE IPC Resource Usage

Resolving The Problem


Way-background: CDS client processes

There are two programs that are needed on a CDS client system (i.e. on any DCE system) to access CDS: cdsadv ("the advertiser") and cdsclerk ("the clerk"). There should always be exactly one cdsadv process running; it starts at DCE startup time along with the other DCE daemons like dced. There will be zero or more cdsclerk processes running at any point in time.

Each UNIX user who has used CDS recently has a cdsclerk devoted to him/her. If UNIX user abc logs into UNIX, creates 6 windows, and logs into DCE as 6 different principals in different windows and starts doing things like dce_login and cdsls, there will still be just one cdsclerk for this set of windows -- a cdsclerk goes with a UNIX login ID, not a DCE dce_login ID. You'll often see a cdsclerk for UNIX root and a few others for human UNIX users in dce.ps output; in order to see which cdsclerk is which, use /bin/ps -ef | egrep 'PID|cds' rather than dce.ps (dce.ps doesn't show complete argument vectors, and you need to see the -U argument in order to know which UNIX user a particular cdsclerk belongs to).

cdsadv starts cdsclerk processes as necessary, and the clerks die after a period of inactivity (20 minutes, I think). So, clerks come and go and they help different UNIX users access CDS; a single cdsadv is responsible for starting clerks. Both cdsadv and cdsclerks use the same CDS cache, which is stored on disk and is also kept in memory. In memory, it's accessed via the UNIX "shared memory" facility, which is what this document is all about...

Note that not all versions of DCE work this way. Specifically, HP has done away with cdsclerk and IPC usage by CDS, so none of this document applies to the current version of DCE (1.4.x or higher) on HP-UX.



Background: what IPC is

Semaphores and shared memory are two of UNIX's so-called "inter-process communication" (IPC) features. The third IPC feature, messages, is not used by DCE.

Shared memory allows multiple processes to see the same piece of memory, sort of like how multiple threads in a single process can all see global variables. Semaphores allow the multiple processes to synchronize their access to the global memory, like mutexes allow multiple threads to synchronize access to global variables. Semaphores and shared memory are both managed by the kernel (they have to be since they're used by several different processes), whereas global variables and mutexes are managed inside a single process.

To create a piece of shared memory (a so-called "shared memory segment"), a process uses the shmget() system call. A process passes a "shared memory key" to shmget() and it gets back a "shared memory ID" (much like to open a file you pass in a file name and get back a file descriptor -- you get to pick the name and the kernel picks the descriptor number -- here the program gets to pick the key and the kernel picks the ID). A process can "attach" a shared memory segment via a call to shmat() -- this associates some program variable with the shared memory. Then the process can just manipulate that variable like any other, the only difference being that the variable is potentially visible to other programs. There are permissions associated with shared memory segments to determine who can read and write them, just like there are with files in the filesystem. A program can "detach" the shared memory segment when it no longer needs to use it by calling shmdt(), and the last process to use a shared memory segment can "destroy" it via a call to shmctl(). You can of course read the man pages to learn more...

As for semaphores, they're organized into "semaphore sets"; each semaphore set has a key that you choose and an ID that the kernel picks, as with shared memory. A semaphore set consists of one or more individually usable semaphores, which can be locked and unlocked separately. (A semaphore set is sort of like an array of mutexes.)

Typically a semaphore set is used to lock and unlock parts of a shared memory segment. There is a set of semaphores, not just one, in case you want to lock and unlock different parts of the shared memory segment separately. For example, CDS uses shared memory to allow all cdsclerks to see the cache. The cache has a header and a data section, and the semaphore set has two semaphores, one to lock the header and one to lock the data.

A semaphore set is created via semget(), its semaphores are locked and unlocked via calls to semop(), and the set is destroyed via a call to semctl() with the IPC_RMID command.

The ipcs command can be used to examine IPC resources on a system; ipcrm can be used to remove unused IPC resources. That shouldn't normally be necessary, since software that creates IPC resources should also delete them when they're no longer needed.



How CDS uses IPC

When cdsadv starts, it tries to create a shared memory segment with key 1a4c (hexadecimal), by calling shmget(). If there's already a segment with key 1a4c, then shmget() fails; in this case the advertiser tries key 1a4d, 1a4e, and so on until it succeeds. (So you can see that if there is a leftover shared memory segment from the last time CDS ran, it doesn't get reused -- it just sticks around and clutters up the system.) cdsadv writes the shared memory ID that it gets from shmget() into the file /opt/dcelocal/etc/cdscache.shmid so other processes (cdsclerks) can find it.
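
The key-probing behaviour described above can be sketched as a small simulation. This is not the cdsadv source: pick_cds_key, the keys_in_use set, and the max_tries bound are hypothetical stand-ins for repeated shmget() calls that fail when a key already exists.

```python
CDS_BASE_KEY = 0x1A4C  # first key cdsadv tries, per the text above


def pick_cds_key(keys_in_use, base=CDS_BASE_KEY, max_tries=16):
    """Return the first key at or above `base` that is not already taken.

    `keys_in_use` stands in for the kernel's table of existing shared
    memory keys; `max_tries` is an arbitrary bound chosen for this sketch.
    """
    for offset in range(max_tries):
        candidate = base + offset
        if candidate not in keys_in_use:
            return candidate
    raise RuntimeError("no free key found in the probed range")


# Two stale segments left over from earlier runs push the new one to 1a4e.
leftovers = {0x1A4C, 0x1A4D}
print(hex(pick_cds_key(leftovers)))  # 0x1a4e
print(hex(pick_cds_key(set())))      # 0x1a4c
```

The point of the sketch is that leftovers are skipped rather than reused, which is why stale segments accumulate across restarts.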

cdsadv attaches the newly-created shared memory segment via a call to shmat(). That makes it available for use within the program, as a variable called CAfixed_p (of type "pointer to structure CacheFixed", where CacheFixed is the structure that defines the layout of the CDS cache). The magic of shared memory is that accesses to CAfixed_p are now references to this shared memory segment, although inside cdsadv they're programmed like any other variable accesses. The advertiser reads the existing CDS cache, if it exists on disk in /opt/dcelocal/var/adm/directory/cds/cds_cache.NUMBER, into CAfixed_p; otherwise it just initializes a new cache in CAfixed_p. The cdsadv will periodically dump the contents of CAfixed_p back to the disk version of the cache, but for minute-to-minute operations, the in-memory cache that lives in the shared memory segment is what's used.

cdsadv calls semget() to create a semaphore set that has the same key as its shared memory segment has. So, the CDS shared memory and semaphore will always have keys (as seen in ipcs output) of 1a4c or 1a4c + something; the IDs as reported by ipcs will vary depending on the whim of the kernel.

So now cdsadv has created a new shared memory segment and written its ID to the cdscache.shmid file; it's loaded the on-disk cache into the shared memory, and it's created a semaphore set that has the same key as the shared memory (but probably a different ID). The semaphore set consists of two semaphores, one to lock and unlock the part of the cache that is the header, and the other to lock and unlock the data part of the cache.

When a clerk starts up, it can read cdscache.shmid to learn the ID of the shared memory segment and attach the shared memory to its own CAfixed_p variable, so the cdsadv and all cdsclerks have a variable called CAfixed_p that all point to the same memory. The clerk can lock and unlock the semaphores in the semaphore set that has the same key as the shared memory segment when it reads and writes data from or to its CAfixed_p, in order to synchronize its access with other clerks.

The last thing you need to know is that there's a so-called SEM_UNDO feature available with the semop() system call (recall that semop() is used to lock and unlock semaphores). The SEM_UNDO feature is used when locking a semaphore; it tells the kernel to keep track of the fact that a particular process has locked this semaphore and if the process dies, the kernel will automagically unlock the semaphore. This means that if somebody uses kill -9 to kill a clerk when it has a semaphore locked, the semaphore won't just stay locked -- it gets unlocked as the process dies. SEM_UNDO is an optional feature of the semop() call. CDS always uses SEM_UNDO when locking semaphores.
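
To make the SEM_UNDO bookkeeping concrete, here is a toy model of what the kernel does. It is only a sketch: ToySemaphore and its methods are invented for illustration and are not a real API, and a real semop() would block rather than raise when the semaphore is already locked.

```python
class ToySemaphore:
    """A toy binary semaphore with per-process undo bookkeeping."""

    def __init__(self):
        self.value = 1   # 1 = unlocked, 0 = locked
        self.undo = {}   # pid -> adjustment to apply if that process exits

    def lock(self, pid, sem_undo=True):
        if self.value == 0:
            raise RuntimeError("already locked (a real semop() would block)")
        self.value -= 1
        if sem_undo:
            # Remember how to reverse this operation for this process.
            self.undo[pid] = self.undo.get(pid, 0) + 1

    def unlock(self, pid, sem_undo=True):
        self.value += 1
        if sem_undo:
            self.undo[pid] = self.undo.get(pid, 0) - 1

    def process_exit(self, pid):
        # On exit (even after kill -9) the pending adjustment is applied.
        self.value += self.undo.pop(pid, 0)


with_undo = ToySemaphore()
with_undo.lock(pid=301)
with_undo.process_exit(301)      # clerk dies while holding the lock
print(with_undo.value)           # 1: the lock was released for it

without_undo = ToySemaphore()
without_undo.lock(pid=301, sem_undo=False)
without_undo.process_exit(301)
print(without_undo.value)        # 0: the semaphore stays locked
```

The second case is the situation CDS avoids by always asking for SEM_UNDO.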



ipcs

The ipcs program is used to examine IPC resources. Typical output looks like this:



# ipcs -a
IPC status from <running system> as of Fri Oct 11 11:10:32 1996
Message Queue facility not in system.
T  ID  KEY         MODE         OWNER  GROUP  CREATOR  CGROUP  NATTCH  SEGSZ   CPID  LPID  ATIME    DTIME    CTIME
Shared Memory:
m  0   0x00001a4c  --rw-------  root   wheel  root     wheel   2       512000  294   301   9:58:20  9:58:19  9:58:17
T  ID  KEY         MODE         OWNER  GROUP  CREATOR  CGROUP  NSEMS   OTIME     CTIME
Semaphores:
s  9   0x00001a4c  --ra-------  root   wheel  root     wheel   2       11:08:20  9:58:17

# ps -ef | egrep 'PID|cds'
UID   PID  PPID  C   STIME     TTY  TIME  COMD
root  294  1     80  09:58:16  ?    0:03  cdsadv -s
root  301  294   80  09:58:19  ?    0:08  /opt/dcelocal/bin/cdsclerk -U root -w FATAL:STDERR:-;FILE:/opt/dcelocal/var/svc

# cat /opt/dcelocal/etc/cdscache.shmid
0
845042297

# ls -l /opt/dcelocal/var/adm/directory/cds
total 1096
prw------- 1 root wheel 0 Oct 11 09:58 cdsAdver
prw-rw-rw- 1 root wheel 0 Oct 11 09:58 cdsLib
-rw------- 1 root transarc 512000 Oct 10 23:54 cds_cache.0000000013
-rw------- 1 root transarc 10 Oct 10 23:54 cds_cache.version
-rw------- 1 root transarc 108 Oct 10 23:54 cds_cache.wan
drwxr-x--x 2 root transarc 512 Apr 10 1996 cdsadv
drwxr-x--x 2 root transarc 512 Apr 10 1996 cdsclerk
prw------- 1 root wheel 0 Oct 11 09:58 cdsclerk_301_root
-rw------- 1 root transarc 0 Oct 9 15:39 cdsclerk_7838_root
-rw-r--r-- 1 root transarc 32768 Jun 18 08:33 clerk_mgmt_acl_v1.dat

This tells you a lot (if you've read man ipcs):
        • The CDS shared memory ID is 0 (from the cdscache.shmid file); this matches ipcs -a, where shared memory ID 0 has key 1a4c
        • Semaphore set with ID 9 also has key 1a4c; it's the CDS set. NSEMS of 2 shows that there are two semaphores in this set, as expected
        • The MODE columns show who can manipulate these IPC resources -- only root can. This is why the CDS IPC stuff will not show up in ipcs output if you're not root (try it).
        • For the shared memory, NATTCH of 2 shows two processes attached to this segment -- the cdsadv and the single cdsclerk that's running; SEGSZ shows the segment size is 512000 bytes, which matches the cache size on disk.
        • CPID, the creating process ID, of the shared memory segment is that of cdsadv; LPID, the last process to attach or detach the segment, is that of the clerk.
        • CTIME is the shared memory segment creation time; it's close to the time that ps says cdsadv started. ATIME, the time of last attach, is close to the time that root's cdsclerk started
        • The second line in cdscache.shmid is the shm creation time in UNIX seconds-since-1970 format; if you count seconds (or call ctime()), you'll see that 845042297 is Fri Oct 11 09:58:17 1996, which matches CTIME
        • Semaphore CTIME matches shared memory CTIME
        • Semaphore OTIME is the time of the last operation on this semaphore set, that is, the last time that someone locked or unlocked a CDS semaphore; it's pretty recent
        • No leftover junk IPC resources from prior incarnations of CDS, cool. If there were leftovers -- if, for example, CDS was using key 1a4e and you saw that 1a4c and 1a4d were still in existence -- then you could safely use the ipcrm program to get rid of the old, unused 1a4c and 1a4d shared memory and semaphores if and only if the other attributes of those IPC resources matched typical CDS settings. It's slightly possible that some other software could be using 1a4c as a key and you wouldn't want to delete shared memory or semaphores out from under it. The NATTCH column for the shared memory is a strong clue -- a value of 0 means no process is attached to a particular shared memory segment, which means it may well be garbage. Also the last-access times and last-process-ID-to-use-the-thing values can help determine what's in use and what's not.

      This is all healthy, except for the cdsclerk_7838_root file in /opt/dcelocal/var/adm/directory/cds. Files named like cdsclerk_NUMBER_USER are UNIX named pipes that correspond to cdsclerk processes with process ID NUMBER running for user USER -- for example, cdsclerk_301_root corresponds to the current root cdsclerk, the one that has PID 301. If a clerk dies gracelessly, its UNIX pipe is left lying there and it turns into a 0-length regular file (the first character in the ls line is "-" rather than "p" for pipe). The "p" cdsclerk lines should correspond to cdsclerk processes in dce.ps.
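
The pipe-versus-regular-file check can be automated. The following is a rough sketch, not a supported tool: classify_clerk_entries and the regular expression are my own invention, and the input is simply the ls -l sample shown earlier.

```python
import re

# The ls -l sample from this document, pasted in as text.
LS_OUTPUT = """\
total 1096
prw------- 1 root wheel 0 Oct 11 09:58 cdsAdver
prw-rw-rw- 1 root wheel 0 Oct 11 09:58 cdsLib
-rw------- 1 root transarc 512000 Oct 10 23:54 cds_cache.0000000013
drwxr-x--x 2 root transarc 512 Apr 10 1996 cdsclerk
prw------- 1 root wheel 0 Oct 11 09:58 cdsclerk_301_root
-rw------- 1 root transarc 0 Oct 9 15:39 cdsclerk_7838_root
-rw-r--r-- 1 root transarc 32768 Jun 18 08:33 clerk_mgmt_acl_v1.dat
"""

CLERK_NAME = re.compile(r"^cdsclerk_(\d+)_(\w+)$")


def classify_clerk_entries(ls_text):
    """Split cdsclerk_PID_USER entries into live pipes and stale leftovers."""
    live, stale = [], []
    for line in ls_text.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # skips the "total" line and anything malformed
        mode, name = fields[0], fields[-1]
        match = CLERK_NAME.match(name)
        if not match:
            continue
        entry = (int(match.group(1)), match.group(2))
        # "p" as the first mode character marks a named pipe.
        (live if mode.startswith("p") else stale).append(entry)
    return live, stale


live, stale = classify_clerk_entries(LS_OUTPUT)
print("live clerks: ", live)    # [(301, 'root')]
print("stale files: ", stale)   # [(7838, 'root')]
```

The PIDs in the live list can then be compared by hand against the ps output.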

      DTS and others

      For what it's worth, DTS creates a shared memory segment of size 88 bytes, and the key is usually 1. It creates two 1-semaphore sets, with keys usually (shared memory key + 1) and (shared memory key + 2). /opt/dcelocal/var/adm/time/dts_shared_memory_id records the shared-memory ID on disk. The DTS shared memory segment is used to hold a block of DTS control data (a variable of type SharedState, which is typedef'd to a "struct State").

      You should know that other software (e.g. Encina, Oracle, NIS+, and other random applications) also uses IPC. In general there's no way to know how a particular piece of software uses IPC (e.g. what key(s) it uses, how many semaphores, what the shared memory is for, etc.) without source code or documentation of some sort.



      Solaris /etc/system

      There are about a dozen Solaris kernel parameters that affect various aspects of IPC resources. These parameters can be left at default values, or they can be set in /etc/system. A change to /etc/system requires a reboot in order to take effect. The sysdef command shows the current settings of these IPC parameters near the end of its output. Parameters of interest are (this is from Sun's System Administration Guide):



      PARAMETER NAME      DEFAULT   MEANING
      seminfo_semmni *    10        Max. number of semaphore IDs ("s" lines in ipcs)
      seminfo_semmap *    10        Max. entries in the free-semaphore-block map
                                    (set this to the same value as seminfo_semmni)
      seminfo_semmns *    60        Max. number of semaphores (sum of NSEMS column)
      seminfo_semmnu *    30        Max. number of processes using SEM_UNDO feature

      seminfo_semmsl      25        Max. number of semaphores per ID (max. NSEMS)
      seminfo_semopm      10        Max. number of operations in a semop call
      seminfo_semume      10        Max. number of SEM_UNDO locks per process
      seminfo_semvmx      32767     Max. value of a semaphore
      seminfo_semaem      16384     Max. SEM_UNDO adjustment value

      shminfo_shmmni *    100       Max. number of shared memory IDs ("m" lines)

      shminfo_shmmax      1048576   Max. size of shm segment in bytes (max SEGSZ)
      shminfo_shmmin      1         Min. size of shm segment in bytes (min SEGSZ)
      shminfo_shmseg      6         Max. number of shm segments per process

      * indicates a parameter that you may need to adjust; non-* entries have
      default values that should work properly with DCE (although you may need
      to adjust them for other software, see below).

      Note that messages in fatal.log may indicate necessary changes, for example:

      1996-10-10-03:05:34.219-04:00I20.272 cdsclerk(671) FATAL cds cache calock.c
      445 0x00000005 Routine semop(2) failed: No space left on device.

      So some cdsclerk called semop() to lock or unlock a semaphore and it failed. What could be wrong -- what in the world does "No space left on device" mean? Well, look at the end of the man page for semop:

      ENOSPC The limit on the number of individual
      processes requesting an SEM_UNDO would be
      exceeded.

      ENOSPC is the UNIX code that corresponds to "no space left on device", see /usr/include/sys/errno.h. So, semop() failed because CDS always requests SEM_UNDO and too many processes have requested that feature. Now, cdsadv and cdsclerk will use SEM_UNDO so a system with, say, 10 active clerks will account for 11 processes using SEM_UNDO; and of course other processes may also use that feature. The seminfo_semmnu parameter needs to be increased.

      The NATTCH field of the ipcs -mo command will tell you how many processes are currently attached to each shared memory segment; each of these processes will actively use SEM_UNDO. You could monitor NATTCH during periods of heavy DCE usage in order to help determine how to set seminfo_semmnu.
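
The arithmetic behind that sizing is simple enough to write down. This is a back-of-the-envelope sketch; the function names are made up, and other_undo_processes is whatever you estimate for non-DCE software on the same machine.

```python
def sem_undo_users(active_clerks, other_undo_processes=0):
    """Processes holding SEM_UNDO state: one cdsadv, each clerk, plus others."""
    return 1 + active_clerks + other_undo_processes


def semmnu_is_sufficient(semmnu, active_clerks, other_undo_processes=0):
    """True if the limit covers the estimated number of SEM_UNDO users."""
    return sem_undo_users(active_clerks, other_undo_processes) <= semmnu


print(sem_undo_users(10))                # 11, the example in the text
print(semmnu_is_sufficient(30, 10))      # True with the default of 30
print(semmnu_is_sufficient(30, 25, 10))  # False: 36 users exceed 30
```

Feeding in the peak NATTCH value observed with ipcs -mo gives a reasonable lower bound for the clerk count.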

      Modifications to /etc/system

      In the past, we've found that customers who use e.g. Oracle and/or NIS+ on Solaris need to turn up some of these limits. Typically we've recommended this:



      set semsys:seminfo_semmns=100
      set semsys:seminfo_semmnu=50
      set semsys:seminfo_semmsl=50 (I don't understand why this is needed.)

      These settings aren't definitive; you have to understand the IPC needs of the various software that runs on your system in detail, and set your kernel parameters appropriately. In the end, it's like allocating disk space or swap space or any other UNIX resource: the administrator needs to understand the demands that each of his software systems makes, and configure his system accordingly. In a nutshell, the IPC requirements for CDS are these:

        • One cdsadv process per system, and one cdsclerk process per UNIX user who makes use of DCE, all of which use the following:
            • One 500 KB shared memory segment
            • One semaphore set, consisting of 2 semaphores
        • SEM_UNDO is always used in every semop() call
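
As a sanity check, the requirements above can be compared against the default limits from the parameter table. This sketch considers CDS alone and ignores every other IPC consumer on the system; cds_demand and exceeded are illustrative names, and the 512000-byte segment size is taken from the ipcs sample earlier in this document.

```python
# Defaults copied from the parameter table in this document.
SOLARIS_DEFAULTS = {
    "seminfo_semmni": 10,
    "seminfo_semmns": 60,
    "seminfo_semmnu": 30,
    "seminfo_semmsl": 25,
    "shminfo_shmmni": 100,
    "shminfo_shmmax": 1048576,
}


def cds_demand(active_clerks, segment_bytes=512000):
    """IPC resources needed by CDS alone (other software is ignored)."""
    return {
        "seminfo_semmni": 1,                  # one semaphore set
        "seminfo_semmns": 2,                  # two semaphores in total
        "seminfo_semmsl": 2,                  # both live in the same set
        "seminfo_semmnu": 1 + active_clerks,  # cdsadv plus every clerk
        "shminfo_shmmni": 1,                  # one shared memory segment
        "shminfo_shmmax": segment_bytes,      # size seen in the ipcs sample
    }


def exceeded(demand, limits=SOLARIS_DEFAULTS):
    """Names of parameters whose limit is lower than the demand."""
    return sorted(name for name, need in demand.items() if need > limits[name])


print(exceeded(cds_demand(active_clerks=10)))  # []
print(exceeded(cds_demand(active_clerks=40)))  # ['seminfo_semmnu']
```

Only the SEM_UNDO count grows with the number of clerks; everything else CDS needs is fixed.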

      Don't get carried away

      Note that only semmnu can cause problems after cdsadv creates the CDS IPC resources (assuming that none of the limits in /etc/system have been decreased from their defaults). That is, if cdsadv comes up and then cdsadv or cdsclerk die later, the only /etc/system IPC parameter that could be responsible is semmnu; and if semmnu is the problem then the fatal.log will contain a message like the one above, mentioning semop() and ENOSPC. Any other CDS client failure mode is unlikely to be the result of IPC limitations. Of course, IPC resource limits could cause cdsadv to fail to start -- that's a real possibility and you should always consider it if cdsadv won't come up. But if the advertiser starts, then IPC resources are only likely to cause problems later if one of the following occurs:

        • so many processes use SEM_UNDO that the semmnu limit is exceeded (could happen on a system with lots of cdsclerks)
        • someone uses ipcrm or the programmatic equivalent to delete the shared memory or semaphores out from under CDS

      Miscellaneous semop() failures

      DCE 1.1 defect 18315 is a problem with CDS not checking the return code from calls to semop(). Prior to the fix, if a cdsadv or a cdsclerk had trouble locking or unlocking a semaphore, it would just keep trying -- this led to some "cdsclerk spins and chews up lots of CPU time" problem reports. The defect fix is to check the error return and die with a core dump if semop() sees an error. This fix went into DCE 1.1 / Solaris 2.5 patch 3.

      As an example, suppose someone deletes the CDS semaphore set (via the ipcrm command, for example) while CDS is running. Without the 18315 fix, cdsclerks will repeatedly try and try again to lock the non-existent semaphore. With the fix, they will notice that semop() is failing with the EINVAL error code, and will produce this message in fatal.log, then die with a core dump:



      1996-10-14-10:42:52.789-04:00I----- cdsclerk(7580) FATAL cds cache calock.c
      445 0x0000000e Routine semop(2) failed: Invalid argument.

      Note that the man page tells you that EINVAL from semop() can mean that you tried to use an invalid semaphore ID -- that's what's happening here, the CDS semaphore ID has become invalid because it was deleted and it no longer exists. Also notice that the permissions on the CDS semaphore set are such that only UNIX root can access it.
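
The two semop() failures discussed in this document can be summarized in a small lookup. This is just a mnemonic sketch: explain_semop_failure and the hint strings are mine, the errno names come from the man page excerpts above, and the message text returned by os.strerror() depends on the platform.

```python
import errno
import os

# Diagnoses taken from the two fatal.log examples in this document.
DIAGNOSIS = {
    errno.ENOSPC: "too many processes use SEM_UNDO; consider raising seminfo_semmnu",
    errno.EINVAL: "semaphore ID is invalid; the set may have been removed (e.g. with ipcrm)",
}


def explain_semop_failure(err):
    """Return the errno name, the system's message, and a hint if we have one."""
    name = errno.errorcode.get(err, "UNKNOWN")
    hint = DIAGNOSIS.get(err, "not covered by this document")
    return f"{name} ({os.strerror(err)}): {hint}"


print(explain_semop_failure(errno.ENOSPC))
print(explain_semop_failure(errno.EINVAL))
```

Anything other than these two codes deserves a look at the semop man page.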

      CDS IPC cleanup

      CDS didn't use to be very good about removing its IPC resources if it died gracelessly. Actually it still isn't very good about it, so we've modified the DCE start/stop script to do some cleanup after the fact. This is defect 17047, which has been in DCE 1.1 patches for a while now. In addition to the usual cdscache.shmid file that is standard for CDS, CDS on Solaris now also writes a file named cdscache.ipcid, in /opt/dcelocal/etc. This file holds the shared-memory and semaphore IDs used by CDS, and the DCE stop script looks at it and removes the CDS IPC stuff if CDS itself didn't do so. Without this fix, CDS can strand "orphan" IPC resources if it dies and is restarted repeatedly without rebooting the machine. If that happens, you'll see numerous shared memory segments and semaphore sets in ipcs -a output, with keys like 1a4c, 1a4d, 1a4e, and so on. With the fix, you should only ever see the 1a4c key (assuming DCE is always stopped via /etc/init.d/dce stop rather than by just manually killing processes).
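
If you need to hunt for orphans by hand, the selection logic looks roughly like this. It is a sketch only: the segment records are made-up sample data rather than real ipcs output, orphan_candidates and probe_range are hypothetical, and the result is a list of candidates to inspect, not something to feed blindly to ipcrm.

```python
CDS_BASE_KEY = 0x1A4C


def orphan_candidates(segments, current_shmid, probe_range=16):
    """Return segments that look like leftovers from earlier CDS runs.

    `segments` is a list of dicts with "id", "key" and "nattch" gathered
    from ipcs by some other means; `probe_range` is an arbitrary bound.
    """
    candidates = []
    for seg in segments:
        in_cds_range = CDS_BASE_KEY <= seg["key"] < CDS_BASE_KEY + probe_range
        if in_cds_range and seg["id"] != current_shmid and seg["nattch"] == 0:
            candidates.append(seg)
    return candidates


# Hypothetical state after CDS has died and been restarted twice.
segments = [
    {"id": 0, "key": 0x1A4C, "nattch": 0},    # stale
    {"id": 101, "key": 0x1A4D, "nattch": 0},  # stale
    {"id": 202, "key": 0x1A4E, "nattch": 3},  # in use: cdsadv plus two clerks
    {"id": 7, "key": 0x1, "nattch": 1},       # some other software's segment
]
for seg in orphan_candidates(segments, current_shmid=202):
    print(seg["id"], hex(seg["key"]))
```

Here current_shmid would be the first line of cdscache.shmid, and the loop prints the two stale entries (IDs 0 and 101).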

      DTS has a similar file, /opt/dcelocal/var/adm/time/dts_ipc_id, which is used by dce stop in a similar manner.

Document Information

Modified date:
23 August 2018

UID

swg21112231