Reading directories

To optimize directory reading, v_readdir() is designed to return as many entries as possible on each call.

If more than one call is needed to read an entire directory, the VFS server must maintain directory positioning between calls, as described in this topic.

The v_readdir() output buffer is mapped by the DIRENT structure, and its format is defined as follows:

The buffer contains a variable number of variable-length directory entries. Only full entries are placed in the buffer, up to the buffer size specified, and the number of entries is returned on the interface.
Each directory entry that is returned in the buffer has the following format:
1. 2-byte Entry_length. This length field includes itself.
2. 2-byte Name_length, which is the length of the following Member_name subfield.
3. Member_name. A character field of length Name_length. This name is not null-terminated.
4. File-system-specific data. If (Name_length + 4) = Entry_length, this field is not present. Whenever the field is present, however, it starts with the file's serial number, st_ino, in 4 bytes. This field is not part of POSIX, but it is supported for special-use programs that are dealing with particular file systems that they know about.
The entries can be packed together, and the length fields are not aligned on any particular boundary.

An example of an entry for the name abc would be X'0007 0003 818283' or X'000B 0003 818283 00001234' with a file serial number of X'1234' also returned.
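The entry layout above can be illustrated with a short parser. This is a minimal sketch, not part of the interface: the function name parse_dirents is hypothetical, the length fields are read as big-endian to match the hexadecimal example, and the name is decoded with Python's cp037 EBCDIC codec, which is an assumption (it agrees with the example bytes for lowercase letters, but the code page actually in use may differ).

```python
import struct

def parse_dirents(buf, count):
    """Walk `count` packed entries in `buf`; return (name, st_ino or None) pairs.

    Hypothetical helper for illustration only.
    """
    entries = []
    offset = 0
    for _ in range(count):
        # Entry_length and Name_length: 2 bytes each, big-endian, unaligned.
        entry_len, name_len = struct.unpack_from(">HH", buf, offset)
        name_start = offset + 4
        raw_name = buf[name_start:name_start + name_len]  # not null-terminated
        ino = None
        if entry_len > name_len + 4:
            # File-system-specific data is present; it starts with st_ino.
            (ino,) = struct.unpack_from(">I", buf, name_start + name_len)
        # cp037 is an assumed EBCDIC code page for this sketch.
        entries.append((raw_name.decode("cp037"), ino))
        offset += entry_len  # Entry_length includes its own 2 bytes
    return entries

# The two example entries for the name abc, packed back to back.
buf = bytes.fromhex("0007" "0003" "818283" "000B" "0003" "818283" "00001234")
entries = parse_dirents(buf, 2)
```

Because entries are packed without alignment, the parser advances by Entry_length rather than assuming any fixed stride.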

Entries for "." and ".." may or may not be returned by the PFS that owns the directory.

In order for successive calls to v_readdir() to proceed through a directory from the point at which the last one left off, the VFS server must specify the directory position at which the operation is to start. There are two different ways this can be done:

Cursor technique. The cursor that is returned in the UIO contains PFS-specific information that locates the next directory entry. The VFS server is required to preserve the UIO cursor and the entire output buffer from the last v_readdir(), and present both of these on the next v_readdir().
The PFS may use the cursor as an offset into a simple linear directory file, ignoring the buffer; or it may use it as the offset, within the previous output buffer, of the last entry returned. The latter approach is used by a PFS with a tree-structured directory, where the previous entry name is used as a key to search for the next entry. That is, the last returned name, a text string of 1 to 255 bytes, is really the "cursor" for the directory position.
Index technique. The index that is set in the UIO by the VFS server determines which entry to start reading from. To read through a directory, the VFS server starts at one and maintains the index by adding the number of entries that are returned to the previous index. The directory is treated as a one-based array, where the first entry has index 1, the second entry has index 2, and so on.
This technique is slower than the cursor technique, but it is useful when a VFS server does not maintain state information from one call to the next. The index can be passed back to the client, which must return it with the next request to continue reading the same directory for a particular end user.
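The index bookkeeping can be sketched as a loop. The sketch below does not call v_readdir(); fake_readdir is a hypothetical stand-in that serves a made-up in-memory directory, and the names DIRECTORY and read_all are likewise illustrative assumptions. Only the arithmetic (start at 1, add the number of entries returned) comes from the description above.

```python
# Made-up directory contents, used only for this illustration.
DIRECTORY = ["alpha", "beta", "gamma", "delta", "epsilon"]

def fake_readdir(index, max_entries):
    """Hypothetical stand-in for one v_readdir() call.

    Returns up to max_entries names starting at the one-based index.
    """
    start = index - 1  # one-based index -> zero-based list position
    return DIRECTORY[start:start + max_entries]

def read_all(max_entries):
    """Read the whole directory with the index technique."""
    names = []
    index = 1  # the first entry has index 1
    while True:
        batch = fake_readdir(index, max_entries)
        if not batch:  # zero entries returned: directory exhausted
            break
        names.extend(batch)
        index += len(batch)  # new index = previous index + entries returned
    return names, index

names, final_index = read_all(2)
```

Because the index is the only state carried between calls, a server that keeps no per-client state could hand final_index (or any intermediate value) to the client and receive it back on the next request.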

The UIO contains both the cursor and the index fields that are used with these continuation techniques. The interpretation of these two fields is summarized in the following table:


Index	Cursor	Action
0	0	Start reading from the first entry.
0	M	Use the cursor value to resume reading.
N	0	Start reading from entry N.
N	M	Start reading from entry N.

Note: 0=zero; N and M are nonzero values.

A nonzero index overrides the cursor; when both are zero, reading starts from the front of the directory.
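The precedence between the two fields can be written as a small decision function. This is only an illustration of the table; the function name start_position and the tuple it returns are hypothetical and do not correspond to any field or service of the interface.

```python
def start_position(index, cursor):
    """Return how a read would be positioned for a given index/cursor pair.

    Hypothetical helper mirroring the table: a nonzero index always wins.
    """
    if index != 0:
        return ("index", index)    # rows N/0 and N/M: start at entry N
    if cursor != 0:
        return ("cursor", cursor)  # row 0/M: resume from the cursor
    return ("first", None)         # row 0/0: start from the first entry
```

For example, start_position(0, 0) selects the first entry, while start_position(5, 99) ignores the cursor and starts at entry 5.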

The end of the directory stream is indicated in two different ways:

A Return_value of 0 entries is returned. This happens when the previous v_readdir() exhausted the directory.
A null name entry is returned as the last entry in the output buffer. A null name entry has an Entry_length of 4 and a Name_length of 0—that is, X'00040000'.
This happens when the current v_readdir() exhausts the directory and there are at least 4 bytes left in the output buffer.
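A caller might combine the two end-of-directory checks as follows. This is a sketch under stated assumptions: the function name directory_exhausted is hypothetical, the caller is assumed to pass only the filled portion of the output buffer, and the returned count is tested only for zero, because whether it includes the null name entry is not stated here.

```python
import struct

NULL_NAME_ENTRY = bytes.fromhex("00040000")  # Entry_length 4, Name_length 0

def directory_exhausted(entry_count, filled):
    """Return True if this call shows that the directory has been fully read.

    `filled` is assumed to hold only the bytes the call placed in the buffer.
    """
    if entry_count == 0:
        return True  # the previous call already exhausted the directory
    offset = 0
    last = b""
    while offset + 4 <= len(filled):
        (entry_len,) = struct.unpack_from(">H", filled, offset)
        if entry_len < 4:
            break  # malformed entry; stop rather than loop forever
        last = filled[offset:offset + entry_len]
        offset += entry_len
    # A trailing null name entry marks the end of the directory.
    return last == NULL_NAME_ENTRY

one_entry = bytes.fromhex("00070003818283")
```

When the check returns False, the caller issues another v_readdir(); the next call then either returns more entries or reports zero entries.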

The Move With Destination Key machine instruction or the osi_copyout or osi_uiomove services must be used to write to the user's buffer.

The end of the directory stream is indicated by the PFS in two different ways:

A Return_value of 0 entries is returned. This must be supported by the PFS for cases in which a vn_readdir is issued and the position is already at the end of the directory.
A null name entry is returned in the output buffer. A null name entry has an Entry_length of 4 and a Name_length of 0, that is, X'00040000'.
This would be the last entry in the buffer, when the directory end has been encountered on a call and there are at least 4 bytes left in the buffer.

A PFS that supports the null name entry saves the caller an extra call: a small directory can be read in a single operation, because the caller can detect that a second call is unnecessary.
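On the producing side, the rules for filling the buffer (whole entries only, plus a null name entry when the end is reached and at least 4 bytes remain) can be sketched as below. The function name pack_entries and its return tuple are hypothetical, the cp037 encoding is an assumption, and no file-system-specific data is appended. The count returned here covers name entries only; whether the interface's count includes the null name entry is not addressed by this sketch.

```python
import struct

def pack_entries(names, buf_size):
    """Pack whole entries into at most buf_size bytes.

    Returns (buffer, name_entries_packed, reached_end). Hypothetical helper.
    """
    out = bytearray()
    packed = 0
    for name in names:
        raw = name.encode("cp037")  # assumed EBCDIC code page
        # Entry_length = 2 + 2 + name bytes, since no extra data is added.
        entry = struct.pack(">HH", 4 + len(raw), len(raw)) + raw
        if len(out) + len(entry) > buf_size:
            # Only full entries are placed in the buffer; stop here.
            return bytes(out), packed, False
        out += entry
        packed += 1
    if buf_size - len(out) >= 4:
        # End reached with room to spare: append the null name entry.
        out += struct.pack(">HH", 4, 0)
    return bytes(out), packed, True

small = pack_entries(["abc"], 16)
```

If the last name entry fits but fewer than 4 bytes remain, no null name entry is written, and the caller learns of the end only when the next call returns zero entries.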

Note: POSIX allows open() and read() from a directory, but it only specifies that these operations do not fail with an error. The PFS cannot tell whether a vn_open is from an open() or from an opendir(), but read() results in a vn_rdwr while readdir() results in a vn_readdir. The PFS is free to support vn_rdwr as a traditional UNIX system would, or to just return zero bytes on every operation. The X/Open Portability Guide, Version 4, Issue 2 allows the EISDIR error to be returned for read(). The LFS ensures that only reading is allowed.