dfsstat

Including some advice about investigating heavily loaded DFS servers

dfsstat is a DFS utility program that reports on various counters that are maintained by the DFS kernel extensions. It used to be unsupported and was available only via download from a special "unsupported tools" page, but we're making it part of the official DFS product on some platforms. Specifically, it is included in DFS 3.1 PTF 3 for AIX 4.3, and will be available for all other platforms in DFS 3.1 PTF 4.

You can run dfsstat on client or server systems. It is most commonly run without any arguments, as we do in the example below. You can type dfsstat -help to see a list of options.

dfsstat was originally intended to be a tool for service personnel (support and development), so the meaning of some of the counters won't be apparent unless you have some internals knowledge of DFS. Usually, you will run dfsstat only as directed by your Support rep as a means of gathering data about a particular problem. We are making it part of the official DFS product to speed up diagnosis in cases where we want customers to run dfsstat; it's obviously easier to run if the binary is already on your system and doesn't have to be downloaded from the web.

Sample output is as follows (this example is from a DFS server machine):

  # /opt/dcelocal/bin/dfsstat
  KERNEL RPC
  ----------
  ccalls     scalls     txpkts     rxpkts     retrans    rxdups     oo_pkts
  6385       12646      35635      36974      0          0          0          
  rxfacks    txfacks
  4811       9794       
  DFS CLIENT
  ----------
  vn_lkups   vn_rdir    vn_gattr   vn_sattr   vn_read    vn_write   vn_map
  124968     5474       9480       1035       5423       6945       20         
  rd_faults  wr_faults  pr_faults  cachehits  inflight   rd_waits
  2710       2035       0          2710       0          0          
  lookups    fstatus    fdata      readdir    gettokens
  1599       16         0          223        2          
  sstatus    sdata      reltokens  revokes
  1041       1138       611        78         
  DFS SERVER
  ----------
  lookup      lkuproot    fstatus     sstatus     fdata       sdata
  3092 24%    3  0%       20  0%      2447 19%    21  0%      2098 16%    
  readdir     mkdir       rmdir       create      rmfile      rename
  446  3%     163  1%     156  1%     930  7%     1616 12%    600  4%     
  link        symlink     fetchacl    storeacl    gettoken    reletoken
  400  3%     400  3%     0  0%       0  0%       63  0%      72  0%      
  sctx        gettime     setparam    bulkkalive  bulkfetchVV bulkfstatus
  23  0%      17  0%      0  0%       0  0%       0  0%       0  0%       
  totalcalls: 12567       

  -------- FXD TokenProcs counter --------
  Total number of threads : n_threads = 2 
  Number of idle threads  : n_idle    = 2
  Number of calls queued  : n_queued  = 0
  Max. number of calls that could be queued = 400

  -------- FXD MainProcs counter ---------
  Total number of threads : n_threads = 8
  Number of idle threads  : n_idle    = 8
  Number of calls queued  : n_queued  = 0
  Max. number of calls that could be queued = 400

  #

(On a DFS client machine, the "DFS SERVER" stats will be reported as all zeroes.)

The meanings of the fields are as follows. Note that we can't promise to document all of the output fully, since much of it depends on DFS internals knowledge, and Support can't provide on-demand analyses of output from customer machines; but for whatever it's worth, here's what the fields mean:

KERNEL RPC:
-----------
ccalls
        The number of RPC client calls made.

scalls
        The number of RPC calls received.  Note that even a
        client-only DFS system can have scalls, since it exports a
        token revocation service to which the DFS file server
        issues token revokes.

txpkts
        The number of RPC packets transmitted including retransmits.

rxpkts
        The number of RPC packets received including rxdups and
        oo_pkts.

retrans
        The number of packets retransmitted.  This should
        be compared against txpkts.  Ratios less than 1 percent
        are excellent.  Ratios above 5 percent should be investigated.
        Ratios above 10 percent are undesirable.  Look for
        causes of network packet loss or server overload.

rxdups
        The number of duplicate packets received, out of the total
        packets received.  Duplicate packets indicate network loss
        of the originally sent packet or loss of an acknowledgment.
        Investigate possible causes of network loss for rxdups to
        rxpkts ratios above 10 percent.

oo_pkts
        The number of out-of-order packets received.  This
        can be a sign of network loss or heavy network traffic.
        UDP does not guarantee in-order delivery of packets.
        Ratios of oo_pkts to rxpkts above a few percent should
        probably be investigated.
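
        If you want to compute the three loss-related ratios above
        (retrans, rxdups, and oo_pkts) automatically, a minimal
        sketch in shell and awk follows; it assumes the KERNEL RPC
        column layout shown in the sample output above:

          /opt/dcelocal/bin/dfsstat | awk '
              $1 == "ccalls" { hdr = 1; next }   # KERNEL RPC header line
              hdr == 1 {                         # the values line below it
                  txpkts = $3; rxpkts = $4; retrans = $5
                  rxdups = $6; oo_pkts = $7; hdr = 0
              }
              END {
                  if (txpkts > 0)
                      printf "retrans/txpkts: %.1f%%\n", 100*retrans/txpkts
                  if (rxpkts > 0) {
                      printf "rxdups/rxpkts:  %.1f%%\n", 100*rxdups/rxpkts
                      printf "oo_pkts/rxpkts: %.1f%%\n", 100*oo_pkts/rxpkts
                  }
              }'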

rxfacks
        The number of fragment acknowledgments received.  This is
        associated with DFS data reads and writes which use the
        RPC pipe mechanism to stream data between the server and
        client.

txfacks
        The number of fragment acknowledgments transmitted.  This is
        associated with DFS data reads and writes which use the
        RPC pipe mechanism to stream data between the server and
        client.


DFS CLIENT:
-----------
vn_lkups
        The number of lookup vnode operations performed in DFS.
        Typically this relates to system calls which take a file
        pathname as input.  Examples are open() and stat().

vn_rdir
        The number of readdir vnode operations performed in DFS.

vn_gattr
        The number of getattr vnode operations performed in DFS.
        Many system calls which work with files will get file
        attributes.

vn_sattr
        The number of setattr vnode operations performed in DFS.
        System calls like chmod() and utimes() will use this vnode
        operation.

vn_read
        The number of read vnode operations performed in DFS.
        This relates to the read system call.  Note that mapped
        file I/O will not use the read system call.  AIX uses
        mapped files for binaries.  Also, the AIX C compiler and
        linker make use of mapped files.

vn_write
        The number of write vnode operations performed in DFS.
        This relates to the write system call.  Note that
        mapped file stores will not go through the write vnode operation.

vn_map
        The number of map vnode operations.  This is related to
        the shmat() and mmap() system calls.

rd_faults
        The number of read page faults issued by the Virtual
        Memory Manager (VMM) to DFS.  Note that DFS is integrated
        with the AIX VMM. DFS creates memory segments for files
        and then performs data I/O on the segment.  This allows
        DFS to support mapped files, and adds a layer of
        fast "memory" caching above the DFS client cache.
        When DFS data is not in VM memory, the VMM must
        fault to a DFS page fault handler which either gets the
        data from the DFS client cache, or retrieves it from a
        DFS server.  One interesting statistic to examine is
        the ratio of rd_faults to vn_reads which for some
        environments can give an indication of how a data working
        set fits into system memory (RAM).  If the ratio of
        rd_faults to vn_reads is high, it may indicate that
        adding system RAM could improve performance.  The
        usage of mapped files must also be taken into account.

wr_faults
        The number of write faults issued by the VMM.  This is
        the VMM calling DFS to give it pages of dirty data, so that
        DFS can store the data in the DFS cache and possibly write
        it to the DFS server as well.  wr_faults are driven by
        vn_writes, mapped file stores, and VMM page replacement.
        If the number of wr_faults is significantly greater than
        the number of vn_writes, this may indicate a high amount
        of page replacement activity due to thrashing; increasing
        system memory may result in better performance.
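
        As a rough check, you can compute both the rd_faults-to-vn_read
        and wr_faults-to-vn_write ratios from a single dfsstat run.
        This is only a sketch, and it assumes the DFS CLIENT column
        layout shown in the sample output above; remember that
        mapped-file I/O also drives faults, so interpret the results
        with care:

          /opt/dcelocal/bin/dfsstat | awk '
              $1 == "vn_lkups"  { grab = "vn";  next }   # vnode-op header
              $1 == "rd_faults" { grab = "flt"; next }   # fault header
              grab == "vn"  { vn_read = $5; vn_write = $6; grab = "" }
              grab == "flt" { rd_faults = $1; wr_faults = $2; grab = "" }
              END {
                  if (vn_read > 0)
                      printf "rd_faults/vn_read:   %.2f\n", rd_faults/vn_read
                  if (vn_write > 0)
                      printf "wr_faults/vn_write:  %.2f\n", wr_faults/vn_write
              }'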

pr_faults
        The number of protection faults issued by the VMM.

cachehits
        The number of rd_faults that were serviced by data from
        the DFS client cache.

inflight
        The number of rd_faults where the requested data has already been
        requested and is currently arriving from a DFS server.  Inflight
        data can result from sequential page faults on the same DFS
        "chunk" which is usually several pages in size, or read ahead
        which can be triggered by the VMM or the DFS client based
        on file access patterns.

rd_waits
        The number of wait loops a rd_fault takes before the requested
        inflight data has arrived.  As data is streaming in from a DFS
        server, the page fault path will be notified periodically
        so it can check whether enough data has arrived to satisfy
        the fault.

lookups
        The number of file lookup RPC calls made to a DFS file
        server. This should be compared to the vn_lkups stat
        to get an idea of how many lookups are resolved in the
        DFS client's name lookup cache.  In some environments
        increasing the name lookup cache with the dfsd -namecachesize
        option can reduce lookup RPCs.  The -stat option should
        be increased equally with the -namecachesize option.
        For example: dfsd -namecachesize 2000 -stat 2000.
        On AIX the default values are based on the amount of
        system memory, with typical values being around 400 for
        a 32 MB system.  For single-user workstations the
        defaults are usually sufficient for a modest name cache
        hit ratio.  Multiuser systems may benefit from increased
        name caches and status caches.
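
        Similarly, you can estimate the name cache hit ratio from the
        dfsstat output.  A rough sketch, again assuming the column
        layout in the sample above (not every vn_lkup corresponds
        one-to-one to a lookup RPC, so treat the result as an
        approximation):

          /opt/dcelocal/bin/dfsstat | awk '
              $1 == "vn_lkups" { grab = "vn";  next }
              $1 == "lookups"  { grab = "rpc"; next }
              grab == "vn"  { vn_lkups = $1; grab = "" }
              grab == "rpc" { lookups  = $1; grab = "" }
              END {
                  if (vn_lkups > 0)
                      printf "approx. name cache hit ratio: %.1f%%\n",
                             100 * (vn_lkups - lookups) / vn_lkups
              }'

        With the numbers in the sample output (124968 vn_lkups and
        1599 lookup RPCs), that works out to roughly 98.7 percent.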

fstatus
        The number of fetch status RPC calls made to a DFS file
        server.  This call is used to get file attributes.  Most
        DFS RPC calls return file attributes with the result. fetchstatus
        RPC calls may be made when permissions need to be
        calculated for a new user accessing a file whose name may already
        be cached.  Increasing the status cache may reduce fetchstatus
        RPCs.

fdata
        The number of fetch data RPC calls made to a DFS file server.
        fetch data RPC calls are made when the requested data is in
        neither VM memory nor the DFS client cache, and must therefore
        be retrieved from the DFS file server.

readdir
        The number of readdir RPC calls made to a DFS file server.
        Compare this against vn_rdir.

gettokens
        The number of gettoken RPC calls made to a DFS file server.
        Tokens are the internal mechanism DFS uses to maintain
        cache coherency between clients and servers. Most DFS RPC
        calls return token rights.  gettoken RPC calls are usually
        required when there have been data collisions or directory
        content changes that required revocation of tokens from
        a client, or when a client needs to renew a token that is
        about to expire.

sstatus
        The number of storestatus RPC calls made to a DFS file server.
        storestatus RPCs are used to store file attributes to the
        DFS file server.

sdata
        The number of storedata RPC calls made to a DFS file server.
        storedata RPCs are used to store file data to the DFS
        file server.

reltokens
        The number of tokens released internally by the DFS client.
        This stat should be ignored.

revokes
        The number of token revocation requests that the DFS client
        has received from servers.

DFS SERVER:
-----------
lookup
        The number of lookup RPC calls received by a DFS server.

lkuproot
        The number of lookup root RPC calls received by a DFS server.
        This RPC is made by DFS clients when they first cross a
        DFS mount point and then periodically thereafter.

fstatus
        The number of fetch status RPC calls received by a DFS server.

sstatus
        The number of store status RPC calls received by a DFS server.

fdata
        The number of fetch data RPC calls received by a DFS server.

sdata
        The number of store data RPC calls received by a DFS server.

readdir
        The number of read directory RPC calls received by a DFS server.

mkdir
        The number of directory create RPC calls received by a DFS server.

rmdir
        The number of directory remove RPC calls received by a DFS server.

create
        The number of file create RPC calls received by a DFS server.

rmfile
        The number of file remove RPC calls received by a DFS server.

rename
        The number of rename RPC calls received by a DFS server.

link
        The number of hard link RPC calls received by a DFS server.

symlink
        The number of symbolic link RPC calls received by a DFS server.

fetchacl
        The number of fetch ACL RPC calls received by a DFS server.

storeacl
        The number of store ACL RPC calls received by a DFS server.

gettoken
        The number of get token RPC calls received by a DFS server.

reletoken
        The number of release token RPC calls received by a DFS server.

sctx
        The number of set context RPC calls received by a DFS server.
        DFS clients make set context RPC calls to set up a
        "DFS connection" to a file server.  A connection represents
        a DCE principal at a client.  Connections may periodically
        be renewed or re-activated when they become stale.

gettime
        The number of get time RPC calls received by a DFS server.
        DFS clients use get time calls as "keep-alives" during
        idle periods to keep cache coherency state active
        at DFS file servers.  Normal RPC calls also act as keep-alives.
        Idle client systems typically send a keep-alive about every
        90 seconds when there are "active" tokens at the client.

setparam
        The number of setparameter RPC calls received by a DFS server.

bulkkalive
        The number of bulk keep alive calls received by a DFS server.
        Replication servers make this RPC to DFS file servers which
        hold replicas.

bulkfetchVV
        The number of bulk fetch version calls received by a DFS server.
        Replication servers make this RPC to DFS file servers which
        hold replicas.

totalcalls
        The total number of RPC calls received by a DFS server.

The last two sections, which report statistics about FXD procs counters, can be used to help detect situations where a DFS server is having load problems. To understand them, you have to know a little bit about the DCE RPC facility. DCE RPC servers have pools of pre-created threads, dedicated to handling various types of incoming RPCs. The DFS file server has two such thread pools, the so-called "tokenprocs" pool and the "mainprocs" pool. The sizes of these pools can be controlled by arguments to the fxd command that starts the DFS server. Defaults are 2 tokenprocs and 8 mainprocs, as above; they can be increased to a maximum of 10 tokenprocs and 24 mainprocs. Each thread pool also has a queue of size 400 to handle overflow.

The n_threads, n_idle, and n_queued counters, as you can see above, combine to give you an idea of how busy the server is and how close it is to its maximum capacity. If you ever reach the server's full capacity (i.e., if n_idle is zero and n_queued is equal to the max. number of calls that could be queued), then subsequent incoming RPC requests will be ignored until space frees up in the server's queue. To clients this will look like the server is down, so a saturated server can cause apparent DFS outages at clients. Some customers run dfsstat on their servers every few minutes via a background script as a means of monitoring server load; a sketch of such a script appears below. If you detect an overloaded server, the first thing you can do is increase the fxd mainprocs and tokenprocs parameters, by modifying the fxd line in /opt/dcelocal/etc/cfgarg.dat as follows and then rebooting:

  fxd: -mainprocs 24 -tokenprocs 10 -admingroup subsys/dce/dfs-admin
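
As mentioned above, some customers monitor server load by running dfsstat
from a background script.  Here is a minimal sketch of such a script
(shell and awk); it assumes the FXD counter layout shown in the sample
output earlier, and the log path and polling interval are arbitrary
choices:

  #!/bin/sh
  # Poll dfsstat every five minutes and log a warning whenever an fxd
  # thread pool has no idle threads (a sign that the server is at or
  # near its maximum capacity).
  LOG=/var/adm/dfsload.log
  while :
  do
      /opt/dcelocal/bin/dfsstat | awk -v stamp="`date`" '
          /TokenProcs/ { pool = "tokenprocs" }
          /MainProcs/  { pool = "mainprocs" }
          /n_idle/     { idle = $NF }
          /n_queued/   { if (idle == 0)
                             printf "%s: %s pool at capacity, n_queued=%s\n",
                                    stamp, pool, $NF }
      ' >> "$LOG"
      sleep 300
  done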

If increasing the thread pools doesn't help, then your server is just plain maxed out, and you'll need to consider moving busy filesets to other DFS servers. This of course assumes that the load is "legitimate" and not the result of some rogue client(s) needlessly hammering the DFS server. If you see unexpectedly high load on a DFS server, you could use tracing tools (network packet tracing or DFS tracing, or both) to see which clients are accessing the server most frequently, and then use DFS tracing on those clients to see what they're doing. You could also look at the read/write counts shown in "fts lsft" output to determine which filesets are getting the most activity, and then try to track down the users of those filesets. These are the general procedures for investigating loaded DFS servers.