This topic describes how to design an application protocol so that
the partner program can divide the receive stream into individual
messages.
Some socket applications are simple: the receiver continues to
receive data until the sender closes the socket, as in a simple
file transfer application. Most applications are not that simple
and require the stream to be divided into a number of distinct
messages.
A message exchanged between two socket programs must embed information
so that the receiver can decide how many bytes to expect from the
sender and (optionally) what to do with the received message.
A few common techniques are used to embed information about the
length of a message into the stream, as follows:
- The message type identifier technique
If your messages are fixed length, you can assign a message ID to
each message type. Each message type has a predefined length that
is known to your client and server programs. If you place the
message ID at the start of each message, the receiving program can
determine the length of the message from its first few bytes. This
is illustrated in Figure 1:
Figure 1. Layout of a message between a TPI client and a TPI server
*---------------------------------------------------------------*
* Layout of a message between TPI client and TPI server *
*---------------------------------------------------------------*
 01  tpi-message.
     05  tpi-message-id           pic x.
         88  tpi-request-add      value '1'.
         88  tpi-request-update   value '2'.
         88  tpi-request-query    value '3'.
         88  tpi-request-delete   value '4'.
         88  tpi-query-reply      value 'A'.
         88  tpi-response         value 'B'.
     05  tpi-constant             pic x(4).
         88  tpi-identifier       value 'TPI '.
Each message ID is associated with a fixed length known
to your application.
- The record descriptor word (RDW) technique
If your messages are variable length, you can implement a length
field at the beginning of each message. Normally, you would implement
the length as a halfword binary field with the value encoded in
network byte order, as shown in Figure 2, but you can also implement
it as a text field.
Figure 2. Transaction request message segment
*---------------------------------------------------------------*
* Transaction Request Message segment *
*---------------------------------------------------------------*
 01  TRM-message.
     05  TRM-message-length       pic 9(4) Binary Value 20.
     05  filler                   pic x(2) Value low-value.
     05  TRM-identifier           pic x(8) Value '*TRNREQ*'.
     05  TRM-trancode             pic x(8) Value '?????'.
- The end-of-message marker technique
A third technique, most often seen in C programs, is to send a
null-terminated string, that is, a string of bytes terminated by a
byte of binary 0. The receiving program reads whatever data is on
the stream and then loops through the received buffer, separating
the records at each point where a null byte is found. When the
received records have been processed, the program issues a new read
for the next block of data on the stream. A record that is split
across two reads must be held until the read that delivers its
terminating byte.
If your messages contain only character data, you can designate any
non-display byte value as your end-of-message marker. Although this
technique is most often seen in C programs, it can be used with any
programming language.
- The TCP/IP buffer flushing technique
This technique is based on the observed behavior of the TCP protocol,
where a send() call followed by a recv() call forces the sending TCP
protocol layer to flush its buffers and forward whatever data might
exist on the stream to the receiving TCP protocol layer. You can use
this method to implement a half-duplex, flip-flop application
protocol, where your two partner programs acknowledge the receipt of
each message with, for example, a 1-byte application acknowledgment
message.
Figure 3 shows the TCP buffer flush technique.
Figure 3. The TCP buffer flush technique
In Figure 3, the client sends an 80-byte message. The server has
issued a recv() call for 1000 bytes, but receives only the 80 bytes
(RETCODE=80). The weakness of this technique is that there is no
guarantee that the server receives the full 80-byte message on a
single recv() call. It might receive only 30 bytes, and with this
technique it has no way of knowing that another 50 bytes are still
to come. The smaller the messages are, the less likely it is that
the server receives only part of a message.
Note: This technique is widely used, but you should use it only in
controlled environments, or in programs where you use non-blocking
socket calls to implement your own timeout logic.
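The first three techniques can be sketched in a few lines each. The following is a minimal illustration in Python rather than COBOL, using only the standard socket and struct modules. The function names, the FIXED_LENGTHS table and its lengths, and the assumption that the halfword length counts the whole message (as in Figure 2) are illustrative choices, not part of the original samples. The helper recv_exact() also shows the loop needed when a recv() call returns fewer bytes than requested.

```python
import socket
import struct

# Hypothetical table for the message type identifier technique:
# message ID byte -> total length of that message type, ID included.
FIXED_LENGTHS = {b"A": 5, b"B": 9}


def recv_exact(sock, count):
    """Call recv() repeatedly until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # zero bytes returned: the partner closed the socket
            raise ConnectionError("socket closed in mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_by_type(sock):
    """Message type identifier: the first byte selects a known fixed length."""
    message_id = recv_exact(sock, 1)
    return message_id + recv_exact(sock, FIXED_LENGTHS[message_id] - 1)


def recv_by_length(sock):
    """Length prefix: a halfword in network byte order holds the total length
    (assumed here to include the 2-byte length field itself)."""
    header = recv_exact(sock, 2)
    (total,) = struct.unpack("!H", header)
    return header + recv_exact(sock, total - 2)


def recv_until_marker(sock, pending, marker=b"\x00"):
    """End-of-message marker: return (message, leftover bytes).

    `pending` carries bytes already read that belong to later messages."""
    while marker not in pending:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before end-of-message marker")
        pending += chunk
    message, _, leftover = pending.partition(marker)
    return message, leftover
```

With the marker technique, the caller passes the leftover bytes back in on the next call, so that a record split across two reads is reassembled correctly.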
The message type identifier and the record descriptor word techniques
require that the receiving program be able to learn the content of
the first bytes in the message before it reads the entire message.
If this is a problem for your application, use the peek flag on
the recv() call.
A recv() call with the peek flag on does not remove the data from
the TCP buffers, but copies the number of bytes you requested into
the application buffer you specified on the recv() call.
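The effect of the peek flag can be seen in a small experiment. This is a sketch in Python using the standard socket module and a local socket pair in place of a real client and server; the message content is made up for the illustration.

```python
import socket

# A connected pair of local sockets stands in for client and server.
sender, receiver = socket.socketpair()
sender.sendall(b"ATPI payload")

# MSG_PEEK copies up to 5 bytes into the application buffer
# without removing them from the receive buffer.
peeked = receiver.recv(5, socket.MSG_PEEK)

# A normal recv() afterwards still sees the stream from its first byte.
consumed = receiver.recv(1024)

print(peeked, consumed)
sender.close()
receiver.close()
```

The bytes returned by the peek are also the leading bytes returned by the later recv(), because the peek left them in place.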
For example, if your message length field or message ID field is
located within the first 5 bytes of each message, issue the following
recv() call:
*---------------------------------------------------------------*
* Peek buffer and length fields for RECV call *
*---------------------------------------------------------------*
 01  soket-recv                   pic x(16) value 'RECV'.
 01  recv-flag-peek               pic 9(8) binary value 2.
 01  recv-peek-len                pic 9(8) binary value 5.
 01  recv-peek-buffer.
     05  message-id               pic x value space.
         88  tpi-query-reply      value 'A'.
         88  tpi-response         value 'B'.
     05  message-constant         pic x(4).
         88  tpi-identifier       value 'TPI'.
 01  socket-descriptor            pic 9(4) binary value 0.
 01  errno                        pic 9(8) binary value 0.
 01  retcode                      pic s9(8) binary value 0.
*---------------------------------------------------------------*
* Peek at first 5 bytes of client data *
*---------------------------------------------------------------*
     call 'EZASOKET' using soket-recv
                           socket-descriptor
                           recv-flag-peek
                           recv-peek-len
                           recv-peek-buffer
                           errno
                           retcode.
     if retcode < 0 then
        - process error -
     end-if.
     if retcode = 0 then
        - process client closed socket -
     end-if.
     if not tpi-identifier then
        - translate recv-peek-buffer from ASCII to EBCDIC -
     end-if.
The recv() call blocks until some bytes have been received or the
sender closes its socket. The example above is incomplete because
you cannot be sure that you actually received the 5 bytes requested;
the call might return with only 1 byte received. To handle this
situation, repeat the recv() call until all 5 bytes have been
received.
If the other half of the connection closes the socket, the recv()
call returns 0 in the retcode field.
With the peek flag, the data is only copied into your application
program buffer; it remains in the TCP buffers and is still available
to a subsequent recv() call, on which you can specify the full length
of the message you now know to expect.
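The complete sequence (peek until the header is complete, then read the whole message) might look like the following Python sketch. The header layout is an assumption made for the illustration: a halfword total length in network byte order in the first 2 of the 5 peeked bytes, as in Figure 2, with every message at least 5 bytes long. The names peek_exact and recv_message, the timeout, and the short sleep between peeks are also illustrative; a repeated peek returns immediately once any data is present, so some pause or timeout logic is needed to avoid a busy loop.

```python
import socket
import struct
import time

HEADER_LEN = 5  # number of leading bytes to peek at, as in the text above


def peek_exact(sock, count, timeout=5.0):
    """Peek until `count` bytes are visible.

    Returns b"" if the partner closed the socket with nothing pending."""
    deadline = time.monotonic() + timeout
    while True:
        data = sock.recv(count, socket.MSG_PEEK)
        if not data:
            return b""  # zero bytes returned: partner closed the socket
        if len(data) == count:
            return data
        if time.monotonic() > deadline:
            raise TimeoutError("header still incomplete")
        time.sleep(0.01)  # wait briefly for the rest of the header


def recv_message(sock):
    """Peek at the header, then consume the full message it describes."""
    header = peek_exact(sock, HEADER_LEN)
    if not header:
        return b""
    # Assumed layout: first halfword is the total message length.
    (total,) = struct.unpack("!H", header[:2])
    chunks = []
    remaining = total
    while remaining > 0:
        chunk = sock.recv(remaining)  # this recv() removes the data
        if not chunk:
            raise ConnectionError("socket closed in mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Because the peek does not consume anything, the second loop reads the message from its first byte, header included.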