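TCP has a built-in keepalive mechanism. It works as follows:
An idle period is defined. If no data is exchanged during that period, TCP keepalive is activated and a probe segment is sent at a fixed interval. Each probe carries very little data. If several consecutive probes go unanswered, the TCP connection is considered dead and the kernel reports the error to the application. If a probe is answered, the connection is kept.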
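The timing described above can be modeled in a few lines. This is only a simplified sketch of the schedule, not how the kernel implements it; the function name `keepalive_timeline` and the sample values are illustrative assumptions.

```python
def keepalive_timeline(idle, interval, probes):
    """Model a keepalive sequence in which no probe is ever answered.

    Returns (probe_times, dead_at), in seconds since the last data exchange.
    """
    # The first probe goes out once the connection has been idle long enough;
    # each later probe follows one interval after the previous one.
    probe_times = [idle + i * interval for i in range(probes)]
    # The connection is declared dead one interval after the last probe.
    dead_at = idle + probes * interval
    return probe_times, dead_at


if __name__ == "__main__":
    # Illustrative values: 10 s idle period, 2 s between probes, 3 probes.
    times, dead_at = keepalive_timeline(idle=10, interval=2, probes=3)
    print(times, dead_at)
```

Any reply from the peer would reset the sequence, which the model leaves out for brevity.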
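The Linux kernel exposes parameters for the keepalive time, the number of keepalive probes, and the interval between probes. The values below are the defaults, and they are global settings: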
$ sudo sysctl -a | grep tcp_keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200
By default, TCP keepalive takes a long time to start: 2 hours, i.e. 7200 seconds.
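The same three values can also be read from a program. The sketch below assumes a Linux system, where each sysctl is exposed as a file under `/proc/sys/net/ipv4`; the helper name `read_keepalive_sysctls` and its `base` parameter are made up for this example.

```python
from pathlib import Path


def read_keepalive_sysctls(base="/proc/sys/net/ipv4"):
    """Return the three global keepalive settings as a dict of ints."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    # Each sysctl file holds a single integer followed by a newline.
    return {name: int((Path(base) / name).read_text().strip()) for name in names}


if __name__ == "__main__":
    for name, value in read_keepalive_sysctls().items():
        print(f"net.ipv4.{name} = {value}")
```

With the default values, probing starts after 7200 seconds and an unresponsive peer is given up on after a further 9 × 75 seconds, i.e. 7875 seconds in total.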
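Note that an application that wants to use TCP keepalive must set the SO_KEEPALIVE option through the socket interface. Without it, keepalive is not used for that socket. In practice, many applications implement their own heartbeat with a much shorter interval, which leaves TCP's own keepalive of limited use.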
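A self-implemented heartbeat usually boils down to two timers: when to send the next heartbeat, and how long the peer may stay silent. The sketch below shows only that bookkeeping, with no network code; the class name `HeartbeatMonitor`, its parameters, and the default values are assumptions for illustration, not taken from any particular application.

```python
import time


class HeartbeatMonitor:
    """Track application-level heartbeats for one connection."""

    def __init__(self, interval=5.0, max_missed=3, clock=time.monotonic):
        self.interval = interval      # seconds between heartbeats we send
        self.max_missed = max_missed  # silent intervals tolerated before giving up
        self._clock = clock           # injectable so the logic can be tested
        self._last_seen = clock()
        self._last_sent = clock()

    def on_receive(self):
        """Call whenever any data, heartbeat or not, arrives from the peer."""
        self._last_seen = self._clock()

    def should_send(self):
        """Return True when it is time to send the next heartbeat."""
        now = self._clock()
        if now - self._last_sent >= self.interval:
            self._last_sent = now
            return True
        return False

    def is_dead(self):
        """Return True once the peer has been silent for max_missed intervals."""
        return self._clock() - self._last_seen >= self.interval * self.max_missed
```

An event loop would call `should_send()` and `is_dead()` on each tick, and `on_receive()` from its read handler. Because the interval is chosen by the application, a dead peer can be detected in seconds rather than hours.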
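Linux also allows the keepalive time, probe count, and probe interval to be set for an individual TCP connection, as per-socket settings. The Python code looks like this: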
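import socket

# sock is an existing TCP socket; the tcp_keep* values are caller-supplied
# settings, and a falsy value leaves the system default untouched.
if tcp_keepalive:
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # enable keepalive
if tcp_keepcnt:
    sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, tcp_keepcnt)  # max probes
if tcp_keepidle:
    sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, tcp_keepidle)  # idle seconds
if tcp_keepintvl:
    sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, tcp_keepintvl)  # probe gap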
See man 7 tcp for the details:
TCP_KEEPCNT (since Linux 2.4)
The maximum number of keepalive probes TCP should send before dropping the connection. This option should not be used in code intended to be portable.
TCP_KEEPIDLE (since Linux 2.4)
The time (in seconds) the connection needs to remain idle before TCP starts sending keepalive probes, if the socket option SO_KEEPALIVE has been set on this socket. This option should not be used in code intended to be portable.
TCP_KEEPINTVL (since Linux 2.4)
The time (in seconds) between individual keepalive probes. This option should not be used in code intended to be portable.
The man page is explicit on this point: these per-socket options are unlikely to be portable.
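One way to live with that limitation is to apply each option only when the running platform defines it. The following is a minimal sketch under that assumption; the helper name `set_keepalive` and its parameters are hypothetical, and it only checks whether Python's `socket` module exposes the constant.

```python
import socket


def set_keepalive(sock, idle=None, interval=None, count=None):
    """Enable keepalive and apply per-socket tuning where the platform allows.

    Returns the names of the tuning options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = []
    wanted = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in wanted:
        option = getattr(socket, name, None)
        # Skip options the caller did not ask for or this platform lacks.
        if value is None or option is None:
            continue
        sock.setsockopt(socket.IPPROTO_TCP, option, value)
        applied.append(name)
    return applied


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(set_keepalive(s, idle=60, interval=10, count=5))
```

The returned list lets the caller see which settings took effect and fall back to the system defaults, or to an application-level heartbeat, for the rest.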
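In typical application development, keeping a TCP connection alive is not left to the keepalive in the system protocol stack. Applications implement it themselves, which secures both flexibility and portability and also makes the code easier to understand. Others online have put this very well: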
Link to this article: https://cs.pynote.net/net/tcp/202303232/