If you have a few years of experience in the Linux ecosystem, and you’re interested in sharing that experience with the community, have a look at our Contribution Guidelines.

1. Overview

In TCP, TIME_WAIT is one of the states a connection passes through during termination. The side that initiates the termination stays in this state for a while so that the shutdown is reliable: it makes sure the peer receives the acknowledgment of its own termination request.

In this article, we’re going to see when this TIME_WAIT state occurs during the connection lifetime, why this state is required, and some workarounds to avoid this state.

2. When Does the TIME_WAIT State Occur?

Figure: TCP connection termination

Let’s consider a scenario in which two programs, A and B, communicate through a TCP socket connection. Program A calls close on its socket and sends a FIN packet to program B to terminate the connection. The side that initiates the termination is said to perform an active close. Program A is now in the FIN_WAIT_1 state.

Program B, which receives the FIN packet, performs the passive close: it sends an ACK packet back to program A and enters the CLOSE_WAIT state. After receiving that ACK packet, program A goes into the FIN_WAIT_2 state, where it waits for a FIN packet from program B.

When program A receives a FIN packet from program B, it sends a final ACK packet to program B and enters the TIME_WAIT state. However, program A has no way of knowing whether that ACK packet actually reached the TCP layer on program B’s side.

Therefore, in the TIME_WAIT state, program A waits a reasonable amount of time to see whether program B retransmits the FIN packet to indicate that it never received an ACK packet from program A. If this happens, then program A must be able to retransmit the final ACK packet.
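
To make the sequence above concrete, here is a minimal sketch of the active closer's state transitions as a small C state machine. The enum and function names are our own illustrative choices, not part of any real TCP API, and the sketch deliberately leaves out cases such as a simultaneous close:

```c
/* States visited by the side that performs the active close. */
typedef enum {
    ST_ESTABLISHED,
    ST_FIN_WAIT_1,
    ST_FIN_WAIT_2,
    ST_TIME_WAIT,
    ST_CLOSED
} tcp_state;

/* Events that drive the active closer (illustrative names). */
typedef enum {
    EV_APP_CLOSE,    /* application calls close(), so a FIN is sent */
    EV_RECV_ACK,     /* the peer acknowledged our FIN */
    EV_RECV_FIN,     /* the peer sent its FIN, we reply with an ACK */
    EV_2MSL_TIMEOUT  /* the 2*MSL timer expired */
} tcp_event;

/* Return the state that follows state s when event e occurs. */
tcp_state next_state(tcp_state s, tcp_event e)
{
    switch (s) {
    case ST_ESTABLISHED:
        return e == EV_APP_CLOSE ? ST_FIN_WAIT_1 : s;
    case ST_FIN_WAIT_1:
        return e == EV_RECV_ACK ? ST_FIN_WAIT_2 : s;
    case ST_FIN_WAIT_2:
        return e == EV_RECV_FIN ? ST_TIME_WAIT : s;
    case ST_TIME_WAIT:
        /* A retransmitted FIN means the final ACK was lost: the ACK
         * is sent again and the socket stays in TIME_WAIT. Only the
         * timer moves it to CLOSED. */
        return e == EV_2MSL_TIMEOUT ? ST_CLOSED : s;
    default:
        return s;
    }
}
```

Feeding the events EV_APP_CLOSE, EV_RECV_ACK, and EV_RECV_FIN in order takes the sketch from ST_ESTABLISHED to ST_TIME_WAIT, which mirrors the walk-through of program A.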

3. Why Is TIME_WAIT Required?

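The time for which program A stays in the TIME_WAIT state is twice the Maximum Segment Lifetime (2*MSL). The MSL is the maximum time a TCP segment can exist in the network before being discarded. RFC 793 sets the MSL to 2 minutes, which gives a nominal TIME_WAIT duration of 4 minutes. Implementations often use shorter values; Linux, for instance, keeps a socket in TIME_WAIT for 60 seconds.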

There are two primary purposes for the TIME_WAIT state. Firstly, it prevents delayed segments from one connection from being accepted by a later connection that uses the same source address, source port, destination address, and destination port. Secondly, it ensures reliable termination of the TCP connection. If the final ACK from A is lost, the peer stays in the LAST_ACK state and retransmits its FIN. Without the TIME_WAIT state, A would no longer have any record of the connection, while the remote end would still consider it valid. If the remote end then received a SYN for a new connection on the same address and port pair, it would reject the request by sending an RST.

4. Working Around the TIME_WAIT State

A user may want to avoid the TIME_WAIT state for a couple of reasons. One is to be able to create a new connection on the same address and port immediately after a program crashes. Another is resource usage: each socket in TIME_WAIT keeps occupying kernel memory and a small amount of CPU time until the timeout expires. On a busy server where many connections are created and closed, some workarounds may be required.

4.1. Restarting init.d Daemon

init.d is a subdirectory of /etc in the Linux file system. Its content varies depending on the applications installed on the system. Usually, it holds a number of scripts for the various services on the system.

The use of these scripts is to control (start, stop, reload, restart) the respective services during the booting of the system or while the system is running. A user with root privilege can control these scripts:

$ /etc/init.d/<script> <options>

Here, <script> is the name of the script of the service we want to control, for instance, networking, and <options> is an action such as start, stop, restart, or reload. When connections are stuck in the TIME_WAIT state, one option is to restart the network services by running the networking script with the restart option. Keep in mind that this disrupts all network traffic on the host, not just the affected connections.

4.2. Using SO_REUSEADDR

Let’s reconsider the connection termination scenario as discussed earlier.

Here, if the initiator of the connection termination is a client, then the TIME_WAIT state does not create major issues as the client is usually assigned an ephemeral port number. When a client restarts before the TIME_WAIT period has elapsed, it is merely assigned another port number.

However, if a server initiates the active close in a TCP connection, then this TIME_WAIT period may create a problem. A server binds its socket to a predefined port number with the bind call. Because of this, the address/port combination bound to the socket isn’t available for reuse. Therefore, if a server tries to bind the same port number before the TIME_WAIT period expires, the bind call fails with an EADDRINUSE error. We can observe this scenario if we try to run a server program immediately after it crashes.

One workaround to this problem is to allow the server socket to bind to an address that is still in use. We can do this before calling bind by setting a socket option with setsockopt. When calling setsockopt, we set the level parameter to SOL_SOCKET, the optname parameter to SO_REUSEADDR, and optval to point to a nonzero integer.

SO_REUSEADDR tells the kernel that, when it validates the address in the bind call, it should allow reuse of a local address that still has connections in the TIME_WAIT state. On Linux, bind still fails if another socket is actively listening on the same address and port.
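
As a rough illustration of the setsockopt call described in this section, the sketch below creates a listening socket with SO_REUSEADDR enabled before bind. The function name make_reusable_listener, the wildcard bind address, and the backlog of 16 are assumptions made for this example only:

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP listening socket on the given port with SO_REUSEADDR set.
 * Returns the socket descriptor, or -1 on error.
 * (Hypothetical helper; the backlog value is arbitrary.)
 */
int make_reusable_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* The option must be set before bind() for it to take effect. */
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would call something like make_reusable_listener(8080) at startup; with the option set, a restart shortly after a crash should no longer fail with EADDRINUSE just because old connections are still in TIME_WAIT.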

4.3. Using SO_LINGER

The SO_LINGER socket option specifies the operation of the close function for TCP. By default, close returns immediately, and if there is any data remaining to send in a socket buffer, then the system tries to deliver the data to the receiver.

We can change this default behavior using the SO_LINGER socket option. The structure that controls the mode of operations for SO_LINGER is:

struct linger
{
   int   l_onoff;        /* 0 = off, nonzero = on */
   int   l_linger;       /* linger time in seconds */
};

We pass this structure as the optval argument to setsockopt with the appropriate values set. When we set l_onoff to nonzero and l_linger to zero (a zero timeout), TCP aborts the connection when the close system call is invoked.

In this case, instead of the normal four-packet connection termination we saw in section 2, TCP discards any data remaining in the socket send buffer and sends an RST to the peer. Because the connection is aborted rather than closed gracefully, the socket skips the TIME_WAIT state.
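
The zero-timeout linger setting can be sketched as a small helper. The name set_abortive_close is our own, and the sketch assumes fd is an already created TCP socket:

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Configure fd so that a later close() aborts the connection with an
 * RST instead of the normal FIN handshake.
 * Returns 0 on success, -1 on error (as setsockopt does).
 * (Hypothetical helper name.)
 */
int set_abortive_close(int fd)
{
    struct linger lg;
    lg.l_onoff = 1;   /* enable SO_LINGER */
    lg.l_linger = 0;  /* zero timeout: abort on close() */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

We would call set_abortive_close(fd) at some point before close(fd). Since any unsent data is thrown away and the peer typically sees a connection reset error, this is best reserved for cases where losing that data is acceptable.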

5. Conclusion

In this article, we explored the occurrence of the TIME_WAIT state during the TCP socket connection lifetime, the requirements of the TIME_WAIT state, and some workarounds to avoid this state.
