Recent TCP/IP Enhancements
Jamie Farmer and Evan Jennings, IBM TPF Development
As the TPF community relies more heavily on TCP/IP,
it needs better control over IP traffic and over the applications
themselves. TPF has delivered new TCP/IP enhancements that
provide this control. Highlights of these enhancements
follow.
Traffic limiting (APAR PJ28901)
allows you to limit the amount of inbound traffic for a specific
TPF server application. TPF delivered the connection limiting
APAR (PJ28493) on
program update tape (PUT) 17. This enhancement allowed you to
limit the number of active TCP connections. Although connection
limiting allowed you to limit the amount of resources that are
used by applications, traffic limiting extends this functionality
even further. Traffic limiting works for both TCP and UDP sockets,
and allows you to limit inbound traffic for an application, acting
as a resource manager for the system. This prevents the system
from being flooded by a given application or socket and improves
TPF intrusion detection services. Traffic limiting is used
primarily for long-lived connections, while connection limiting
works best for short-lived connections.
To limit the amount of inbound traffic for an application,
you must define the application in the TPF Network Services Database
(NSD). To define an application to the NSD, code it in
the /etc/services file. Once the application is defined in the
NSD, you can code the applrate parameter, the socrate
parameter, or both for that application. Although both
parameters limit the amount of inbound traffic for an application,
their effects are very different.
The applrate parameter specifies the maximum inbound
rate in messages per second for all sockets that are used by this
application. For example, let's assume that you have a TCP server
application and the application currently has five active connections
to it. If an applrate of 100 was coded for this application,
100 messages per second (at most) would be presented to the application
for all five connections combined.
The socrate parameter specifies the maximum inbound
rate in messages per second for each socket that is used by this
application. Let's assume that you have a TCP server application
and the application currently has five active connections to it.
If a socrate of 20 was coded on the application, 20 messages
per second (at most) for each active socket connection would be
presented to the application.
The socrate parameter can be specified only for a TCP
server application because UDP is a connectionless protocol in which
all remote clients use the same UDP server socket. For TCP sockets,
the two parameters can be used together to further monitor
and limit the amount of traffic for a given application.
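The way the two limits combine for a TCP server can be sketched as follows. This is only an illustrative model, not TPF code: the structure names, the fixed table of eight sockets, and the convention that 0 means "no limit coded" are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOCKETS 8 /* hypothetical table size for this sketch */

/* Hypothetical limits for one application; 0 means "not coded". */
struct limits {
    unsigned applrate; /* messages per second, all sockets combined */
    unsigned socrate;  /* messages per second, each socket */
};

/* Messages already presented to the application in the current second. */
struct counters {
    unsigned app_total;               /* summed over all sockets */
    unsigned per_socket[MAX_SOCKETS]; /* one count per socket */
};

/* Return 1 if one more message may be presented on socket s, else 0. */
int may_present(const struct limits *lim, const struct counters *c, size_t s)
{
    assert(s < MAX_SOCKETS);
    if (lim->applrate != 0 && c->app_total >= lim->applrate)
        return 0; /* application-wide limit reached */
    if (lim->socrate != 0 && c->per_socket[s] >= lim->socrate)
        return 0; /* this socket's own limit reached */
    return 1;
}

/* Record that one message was presented on socket s. */
void record(struct counters *c, size_t s)
{
    assert(s < MAX_SOCKETS);
    c->app_total++;
    c->per_socket[s]++;
}
```

With an applrate of 100 and a socrate of 20, a single busy socket in this model is held to 20 messages in the second while the other sockets can still be read, and no combination of sockets exceeds 100.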
If a traffic limit is reached
within a 1-second time interval, any read-type API call
is delayed until the current time interval
expires if the socket is in blocking mode, or returns the SOCWOULDBLOCK
error code if the socket is in nonblocking mode. When the traffic limit
is reached, the TPF application is prevented from reading more
messages. New messages that are received will be buffered in the
socket receive buffer. If the buffer fills up with new messages
that are waiting to be read by the application (for TCP), the
remote end is prevented from sending more data. For UDP, new input
messages will be discarded if the socket receive buffer is full.
This is standard UDP behavior and can occur without traffic limiting
applied. Implementing traffic limits for an application requires
no application code changes because it is the system that slows
the rate at which data is presented to the application.
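The behavior described above can be modeled in a few lines. The following is a simplified simulation under stated assumptions, not the TPF implementation: the receive buffer is counted in whole messages rather than bytes, and every type and function name is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum proto { PROTO_TCP, PROTO_UDP };

/* Hypothetical model of one socket under a traffic limit. */
struct sock_model {
    enum proto proto;
    unsigned limit;      /* messages allowed per 1-second interval */
    unsigned read_count; /* messages read in the current interval */
    unsigned queued;     /* messages waiting in the receive buffer */
    unsigned capacity;   /* receive buffer size, in messages */
    unsigned dropped;    /* UDP messages discarded on a full buffer */
};

enum arrive_result { ARRIVE_QUEUED, ARRIVE_DROPPED, ARRIVE_SENDER_HELD };
enum read_result { READ_OK, READ_EMPTY, READ_LIMITED };

/* A new inbound message arrives from the network. */
enum arrive_result on_arrival(struct sock_model *s)
{
    assert(s != NULL);
    if (s->queued < s->capacity) {
        s->queued++;
        return ARRIVE_QUEUED;
    }
    if (s->proto == PROTO_UDP) {
        s->dropped++; /* standard UDP behavior on a full buffer */
        return ARRIVE_DROPPED;
    }
    return ARRIVE_SENDER_HELD; /* TCP: remote end cannot send more */
}

/* One read attempt. READ_LIMITED stands for "delay until the interval
 * expires" in blocking mode or SOCWOULDBLOCK in nonblocking mode. */
enum read_result try_read(struct sock_model *s)
{
    assert(s != NULL);
    if (s->read_count >= s->limit)
        return READ_LIMITED;
    if (s->queued == 0)
        return READ_EMPTY;
    s->queued--;
    s->read_count++;
    return READ_OK;
}

/* The 1-second interval expired: reading may resume. */
void on_interval_expired(struct sock_model *s)
{
    assert(s != NULL);
    s->read_count = 0;
}
```

In this model the application logic never changes. Only the point at which try_read() stops returning READ_OK moves, which mirrors the statement that no application code changes are required.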
A number of NSD display enhancements
are included with the traffic limiting enhancement. For example,
you can now display message rates online for a given application.
You also can display statistical information about connection
limiting and traffic limiting. With these new displays, you
can view the high-water mark for connections or traffic rate
and determine how many times a connection or traffic limit was
reached for a given application. This information is important when
tuning your traffic limiting values.
TPF congestion control and
avoidance (APAR PJ29144)
is another important enhancement.
With this APAR, the TPF system supports the congestion control
algorithm defined in Request for Comments (RFC) 2581. Congestion
control is a reactive algorithm that dynamically adjusts the rate
at which data is sent to reduce the amount of network congestion
and packet loss. The term reactive algorithm refers to
the fact that once congestion is detected in the network (packets
lost, retransmission invoked), the TPF system will slow the rate
at which it sends data to the network on a given socket. The congestion
control algorithm also incorporates the slow-start algorithm.
Slow-start processing is a way to control network congestion by
slowly introducing packets to the network when a socket is started
(slow-start processing also occurs when there is network congestion).
Along with congestion control, APAR PJ29144 also includes a
congestion avoidance algorithm. This is a proactive algorithm
that analyzes round-trip times (RTTs) of packets flowing on individual
sockets. The term proactive algorithm refers to the fact
that when the likelihood of congestion is detected, the TPF system
will slow the rate at which data is sent to the network. This
action is taken before any packets are lost. The combination of
congestion control and avoidance results in better end-to-end
throughput.
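For readers unfamiliar with RFC 2581, the window arithmetic it defines can be sketched as shown below. This is a minimal illustration of the RFC's slow-start, additive-increase, and timeout rules only. It is not the TPF implementation and does not model the RTT-based avoidance algorithm described above. The structure, the function names, and the 1400-byte segment size are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SMSS 1400u /* assumed sender maximum segment size, in bytes */

struct cc_state {
    unsigned cwnd;     /* congestion window, in bytes */
    unsigned ssthresh; /* slow-start threshold, in bytes */
};

/* Start with a one-segment window and a high threshold. */
void cc_init(struct cc_state *s)
{
    assert(s != NULL);
    s->cwnd = SMSS;
    s->ssthresh = 65535u;
}

/* An ACK for new data arrived. */
void cc_on_ack(struct cc_state *s)
{
    assert(s != NULL);
    if (s->cwnd < s->ssthresh) {
        s->cwnd += SMSS; /* slow start: one segment per ACK */
    } else {
        /* Additive increase: about one segment per round trip. */
        unsigned inc = SMSS * SMSS / s->cwnd;
        s->cwnd += (inc == 0) ? 1u : inc;
    }
}

/* A retransmission timeout fired; flight_size is the number of
 * bytes sent but not yet acknowledged. */
void cc_on_timeout(struct cc_state *s, unsigned flight_size)
{
    assert(s != NULL);
    unsigned half = flight_size / 2u;
    s->ssthresh = (half > 2u * SMSS) ? half : 2u * SMSS;
    s->cwnd = SMSS; /* restart from slow start */
}
```

The sketch shows why the algorithm is called reactive: the window shrinks only in cc_on_timeout(), after a loss has already been detected.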
TPF also enhanced the implementations of the congestion control
and avoidance algorithms. For example, initial slow-start processing
is enabled for all TCP sockets by default. However, for certain
applications (such as short-lived connections over local or high-speed
networks), you may not want slow-start processing to be enabled
when the socket is started. TPF development created an ioctl()
option called TPF_NOSLOWSTART to turn off the initial slow-start
processing for a given socket. TPF also enhanced the congestion
avoidance algorithm to accommodate sockets with very low RTTs
(for example, host-to-host traffic in a data center) and sockets
with much higher RTTs (for example, traffic across a wide area
network).
With APAR PJ29144 applied to the system, testing showed that
end-to-end throughput of a TPF socket was increased significantly.
In the comparison test that follows, the time to completion and
number of retransmits were measured with and without the congestion
control and avoidance APAR applied. In the test, 10,000 messages
of 1400 bytes each were sent from one TPF system as fast as possible,
through the network, and back into another TPF system. The following
results were obtained:
                   | Without Congestion Control | With Congestion Control
                   | and Avoidance              | and Avoidance
Time to Completion | 156 sec                    | 45 sec
Retransmits        | 5,820                      | 12
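To put these results in perspective, the arithmetic behind them can be written out. The helper functions below are hypothetical and use only the figures stated in the test description. They count application payload only and ignore protocol headers and retransmitted bytes.

```c
#include <assert.h>

/* Figures taken from the comparison test described in the text. */
#define MESSAGES      10000.0
#define MESSAGE_BYTES 1400.0

/* Payload bytes delivered per second for a given completion time. */
double payload_rate(double seconds)
{
    assert(seconds > 0.0);
    return MESSAGES * MESSAGE_BYTES / seconds;
}

/* How many times faster the second run finished than the first. */
double speedup(double before_sec, double after_sec)
{
    assert(after_sec > 0.0);
    return before_sec / after_sec;
}

/* Average number of retransmits per message sent. */
double retransmits_per_message(double retransmits)
{
    return retransmits / MESSAGES;
}
```

With the values from the table, speedup(156, 45) is about 3.5, payload throughput rises from roughly 90,000 to roughly 311,000 bytes per second, and the average falls from about 0.58 retransmits per message to about 0.001.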
The next enhancement is a TPF API to read a complete TCP
message (APAR PJ29118).
The TCP architecture has no concept of a message. Application
data usually contains a header in front of the message that contains
the length of the message. In cases like this, it is up to the
application to parse out the message, which involves issuing two
or more reads for each message: one or more reads for the length
of the message and one or more reads for the message itself. For
activate_on_receipt (AOR) processing, this can result in multiple
ECBs being created to read in one message. Not only is this process
inefficient, it has been a common cause of errors in application
programming.
TPF development has created the following two APIs to alleviate
the problems of reading a single TCP message:
tpf_read_TCP_message
activate_on_receipt_of_TCP_message
The format of the message being received is passed to the API;
for example, where in the message header the message length
field resides. The system then gets the length of the message,
read the entire message, and pass the full message to the application
once it is received. Partial messages are never passed to the
application.
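To illustrate the work that these APIs take over, the following sketch shows the multiple-read pattern an application would otherwise code by hand using the standard POSIX read() call. It does not show the signatures of the new TPF APIs, which are not given here. The 2-byte big-endian length header and the function names are assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Read exactly n bytes from fd, looping over short reads.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    assert(buf != NULL || n == 0);
    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: retry */
            return -1;
        }
        if (got == 0)
            return -1; /* end of stream in the middle of a message */
        p += got;
        n -= (size_t)got;
    }
    return 0;
}

/* Read one complete message: a hypothetical 2-byte big-endian length
 * header followed by that many bytes of data. Returns the data length,
 * or -1 on error or if the message does not fit in out (in which case
 * the stream is left mid-message and should be closed). */
long read_message(int fd, unsigned char *out, size_t cap)
{
    unsigned char hdr[2];
    size_t len;

    if (read_full(fd, hdr, sizeof hdr) != 0)
        return -1;
    len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > cap)
        return -1;
    if (read_full(fd, out, len) != 0)
        return -1;
    return (long)len;
}
```

Every message costs at least two read() calls here, and each call may return less than was requested. That is the per-message overhead and source of coding errors that the text describes.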
This enhancement not only makes application programming easier,
it is a performance improvement as well. The application issues
fewer APIs per message, which increases the efficiency of the
application. This will also cause less dispatching of ECBs and,
for AOR, it might reduce the number of ECBs that are created.
For those who have applications that use Systems Network Architecture
(SNA) LU 6.2, converting these applications to TCP/IP has just
become much easier with the new read message API. LU 6.2 also
has structured data messages, meaning that LU 6.2 messages contain
a header that contains the length of the message. One of the most
difficult tasks in converting LU 6.2 applications to TCP/IP was
reconciling how the two protocols read data. Now, the LU 6.2
receive_and_wait API can be replaced with the new read message API.
With the addition of these APARs, TPF continues to enhance
its TCP/IP stack to better control TCP/IP applications, enhance
TCP/IP application design and coding, and increase the performance
of the stack itself.