This flow is discussed as an addition to an existing PFS design
that already handles synchronous blocking and non-blocking socket
operations.
- BPX1AIO/BPX4AIO (asyncio) is called with an Aiocb structure. The
Aiocb contains all the information that is needed to do the specific
function.
- The LFS builds an Async I/O Request Block (RqBlk). The PFS has
signified support via the Pfsi_Asyio PFSinit output bit. The regular
vnode operation for the function is invoked in the PFS with:
+ The osi_asy1 bit turned on to indicate Async I/O Part 1.
+ The osi_asytok field holding the LFS_AsyTok token.
- Part 1 in the PFS:
- The PFS builds its own Request Block. The LFS_AsyTok is saved
for later use with osi_sched(). The PFS's PFS_AsyTok is passed back
to the LFS via osi_upda(). This identifies the request to the PFS
in Part 2 and to vn_cancel. Basic preliminary parameter and state
checking can be done here.
- The user's read buffers are not referenced during Part 1 unless
osi_ok2compimd=ON (see the Variations in this topic). This allows
the user to defer read buffer allocation to just before Part 2. The
requested length for reads is available, even if the buffers are not.
- The PFS queues the request to await the desired event. This is
essentially the same thing that is normally done for blocking requests.
Instead of calling osi_wait(), as it would at this point for a blocking
request, the PFS returns to the LFS with the Return_value, Return_code,
and Reason_code (RRR) from queueing the asynchronous I/O. For a successfully
queued request, the Return_value is 0, and any output from the operation
is deferred until Part 2. Important PFS structures are preserved as
necessary over this return and the subsequent reentry to the PFS for
Part 2.
The variations are as follows:
- If the operation fails during Part 1, the normal path is taken
and, instead of the request being queued, the failure is returned.
This includes both queueing failures and failures of the function
that is being requested.
- If the operation can be completed immediately and osi_ok2compimd=ON,
the PFS can proceed as it would normally and complete the operation
synchronously. osi_compimd is turned ON to tell the LFS that this
has happened.
- If osi_ok2compimd=OFF, the PFS must make the call to osi_sched
from within this vnode operation, and proceed from Part 2 as if the
data were not immediately available. This bit is only OFF for read/write
type operations. If the PFS does not need to be recalled for Part
2 (for instance, with a short write), it can skip the call to osi_upda.
It is all right to transfer the responsibility for calling osi_sched
to some other thread, making the call asynchronously and returning
to the LFS, as long as you do not wait for network input.
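The Part 1 flow and its variations can be sketched in C. All of the type names, fields, and the stub osi services below are hypothetical stand-ins for illustration; the real structures, names, and signatures are defined by the z/OS PFS interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PFS request block and the OSI structure;
 * not the real z/OS mappings. */
typedef struct {
    void *lfs_asytok;   /* LFS_AsyTok saved for the later osi_sched() */
    int   queued;       /* request is parked awaiting its event       */
    int   rv;           /* Return_value                               */
} PfsReq;

struct osi {
    int   osi_asy1;        /* Part 1 indicator                 */
    int   osi_ok2compimd;  /* immediate completion is permitted */
    int   osi_compimd;     /* PFS completed synchronously      */
    void *osi_asytok;      /* LFS_AsyTok on entry to Part 1    */
};

/* Stub osi_upda(): records the PFS's own token, which identifies the
 * request to the PFS in Part 2 and to vn_cancel. */
static void *registered_pfs_tok;
static void osi_upda(void *pfs_asytok) { registered_pfs_tok = pfs_asytok; }

static int data_ready;   /* pretend socket state */

/* Part 1 of an async read: save the LFS token, register the PFS token,
 * then either complete immediately (if allowed and possible) or queue
 * the request and return 0 without waiting. */
static int vn_rdwr_part1(struct osi *osi, PfsReq *req)
{
    req->lfs_asytok = osi->osi_asytok;   /* saved for osi_sched() later */
    osi_upda(req);                       /* PFS_AsyTok for Part 2/cancel */

    if (osi->osi_ok2compimd && data_ready) {
        osi->osi_compimd = 1;            /* tell the LFS it is done now */
        return req->rv = 42;             /* e.g. bytes read             */
    }
    req->queued = 1;    /* await the event; no osi_wait() here */
    return req->rv = 0; /* success: output deferred to Part 2  */
}
```

The key contrast with a blocking request is visible in the final branch: the request is queued exactly as it would be before an osi_wait(), but control returns to the LFS instead.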
- The LFS returns to the caller with AioRC=EINPROGRESS; or, if it
has failed or completed immediately, cleans up and returns the operation's
results.
- The original caller continues. All structures and data buffers
must persist throughout the operation.
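The lifetime rule above can be illustrated with a hypothetical caller-side sketch: the Aiocb and data buffer are heap-allocated so they survive after the initiating call returns, and are freed only from the completion notification. The names, fields, and the EINPROGRESS stand-in value are illustrative, not the real BPX1AIO mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical Aiocb shape for illustration only. */
typedef struct {
    char *buffer;   /* must remain allocated for the whole operation */
    int   length;
    int   aio_rv;   /* results filled in at completion */
    int   aio_rc;
} Aiocb;

#define SIM_EINPROGRESS 115   /* illustrative value, not the real errno */

/* Start an async read: the call returns while the I/O is still in
 * progress, so nothing referenced by the Aiocb may be stack-local. */
static Aiocb *start_async_read(int length)
{
    Aiocb *cb = malloc(sizeof *cb);
    cb->buffer = malloc((size_t)length);  /* persists across the return */
    cb->length = length;
    cb->aio_rv = 0;
    cb->aio_rc = SIM_EINPROGRESS;         /* caller continues running   */
    return cb;
}

/* Completion notification (exit- or ECB-driven): only now is it safe
 * for the user program to free the Aiocb and its buffer. */
static void on_complete(Aiocb *cb, int bytes)
{
    cb->aio_rv = bytes;
    cb->aio_rc = 0;
    free(cb->buffer);
    free(cb);
}
```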
- Event occurrence in the PFS:
- At some point data arrives for the socket, or buffers become available,
and the request can be completed.
- The PFS notices, or responds to, this condition as it normally
does. Instead of calling osi_post(), as it would at this point for
a blocked request, it calls osi_sched() with the saved LFS_AsyTok
to drive Part 2.
- For read type operations, the passed Return_Value contains the
length of the data that is available to be read in Part 2. This is
an optional performance enhancement that some applications may take
advantage of. If the length is not easily known, 0 should be passed.
- The rest of the action happens on the SRB, because user data cannot
generally be moved from the thread that calls osi_post/osi_sched.
The variations are as follows:
- If the request fails asynchronously, the PFS can report this on
the call to osi_sched() by passing the failing three R's. There will
be no Part 2 if the passed Return_value is -1, so the PFS has to clean
everything up from here.
- Alternatively, the PFS can save the results, pass success to osi_sched(),
and report the failure from Part 2. This is sometimes more convenient
when the event handler is in a separate address space and the PFS
has resources to clean up in the kernel address space. The only time
osi_sched() fails is if the passed LFS_AsyTok is no longer valid,
which may represent a logic error in the PFS. osi_sched() succeeds
even after the user has terminated, but the PFS sees vn_cancel instead
of Part 2.
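The event-time choices above can be sketched with a stub osi_sched(). Again, the types and signatures are hypothetical stand-ins; the real osi_sched() interface is defined by the z/OS PFS interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request block holding the LFS_AsyTok saved in Part 1. */
typedef struct {
    void *lfs_asytok;
} PfsReq;

static int sched_calls, sched_rv, srb_scheduled;

/* Stub osi_sched(): records the reported three R's and, unless the PFS
 * reported failure (Return_value -1), "schedules" the Part 2 SRB. In
 * the real service, failure occurs only for a stale LFS_AsyTok. */
static int osi_sched(void *lfs_asytok, int rv, int rc, int rsn)
{
    (void)lfs_asytok; (void)rc; (void)rsn;
    sched_calls++;
    sched_rv = rv;
    if (rv != -1)
        srb_scheduled = 1;   /* Part 2 will run on an SRB */
    return 0;
}

/* Data arrival: instead of osi_post() for a blocked request, report the
 * read length now available (or 0 if not easily known). */
static void on_socket_data(PfsReq *req, int bytes_available)
{
    osi_sched(req->lfs_asytok, bytes_available, 0, 0);
}

/* Asynchronous failure: pass the failing three R's. There will be no
 * Part 2, so the PFS must release its own resources here. */
static void on_socket_error(PfsReq *req, int rc, int rsn)
{
    osi_sched(req->lfs_asytok, -1, rc, rsn);
    /* free the PFS request block, buffers, etc. */
}
```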
- The LFS schedules an SRB into the user's address space and returns
to the PFS. The SRB runs asynchronously to the caller of osi_sched().
- The SRB runs in the user's address space, so that the user's data
buffers can be referenced from "home" while in cross-memory mode.
This also gets the user's address space swapped in if necessary. The
LFS is recalled to get into the kernel address space.
- The LFS reconstructs the original vnode request structures. The
same vnode operation is invoked in the PFS as for Part 1, with:
+ The osi_asy2 bit turned on to indicate Async I/O Part 2.
+ The osi_asytok field holding the PFS_AsyTok value from osi_upda().
The variations are as follows:
- If osi_upda was not called during Part 1, the PFS is not called
for Part 2.
- Part 2 in the PFS:
- This is running on an SRB instead of the more usual TCB, and the
PFS has to be able to handle this mode.
- From the PFS_AsyTok, the PFS is able to pick up from where it
left off at the end of Part 1, when it returned to the LFS instead
of waiting. Necessary information that is related to the completing
operation is obtained in a manner similar to that in which it is obtained
after coming back from osi_wait().
- Data is moved between the user's and the PFS's buffers for read/write
types of operations; or the operation is completed as appropriate.
- The normal cross-memory environment has been recreated, with the
user's buffers in home and the PFS's buffers in primary; or it is
otherwise addressable as arranged by the PFS.
- The normal move-with-key instructions are used to protect against
unauthorized access to storage. The osi copy services are available.
- For unauthorized callers in a TSO address space, the LFS has stopped
the user from running authorized TSO commands while async I/O is outstanding.
This avoids an obscure integrity problem, with user key storage being
modified from a system SRB.
- The PFS returns to the LFS with the results of the operation and
the normal output for this particular vnode operation, such as the
vnode_token from vn_accept. The operation is over at this point, as
far as the PFS is concerned.
The variations are as follows:
- If the operation fails during Part 2, this is reported back. An
earlier failure may have been deferred to Part 2 by the PFS.
- For very large writes, the PFS may not want to commit all of its
buffers to one caller. It may instead loop, sending smaller segments
and waiting in between for more buffers. If this is the case, the
PFS remains in control and does not return from Part 2 until the whole
operation is complete; that is, the remainder of the operation
is synchronous, and the PFS blocks as necessary, as it normally does
in this loop. osi_wait is convenient here, as it accommodates SRB
callers. Essentially, osi_sched() is only called when the first set
of buffers become available and the effect is to offload the work
from the user's task or SRB to a system SRB. The operation is still
asynchronous to the user. This ties up the SRB, but it is considered
to be a situation of relatively small frequency.
- Because SRBs are not interrupted with signals, osi_waits during
Part 2 normally do not return as they do in the EINTR cases. If the
user's process terminates, signal-enabled osi_waits return as if they
had been signaled.
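The core of Part 2 (the deferred-failure report and the data move for a read) can be sketched as follows. The types are hypothetical, and memcpy stands in for the move-with-key instructions or osi copy services that real code must use to protect user-key storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical PFS request block, located via the PFS_AsyTok, holding
 * state preserved from Part 1 and from event time. */
typedef struct {
    char pfs_buf[64];   /* data staged by the PFS at event time      */
    int  pfs_len;
    int  deferred_rc;   /* a failure saved earlier, reported here    */
} PfsReq;

/* Part 2 of an async read, running on an SRB in the user's address
 * space: report any deferred failure, or move the data to the user's
 * buffer and return the length as the Return_value. */
static int vn_rdwr_part2(PfsReq *req, char *user_buf, int user_len,
                         int *rc_out)
{
    if (req->deferred_rc) {        /* failure deferred from event time */
        *rc_out = req->deferred_rc;
        return -1;
    }
    int n = req->pfs_len < user_len ? req->pfs_len : user_len;
    memcpy(user_buf, req->pfs_buf, (size_t)n); /* stand-in for keyed move */
    *rc_out = 0;
    return n;                      /* Return_value: bytes moved */
}
```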
- On return to the LFS, signals are sent and unauthorized exits
are queued to the user's TCB (not shown).
- The LFS returns to the SRB.
- On return to the SRB, authorized exits are called and ECBs are
posted. When the user program is notified that the I/O has completed,
either on the SRB or the user's TCB, it can free the Aiocb and buffers.
The operation is over, as far as the LFS is concerned, either at the
end of the SRB or after an unauthorized exit has run on the user's
TCB.