A Detailed Guide to TCP Programming (Python Edition)

Last Updated: 2023-04-25 09:51:26 Tuesday

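This article tries to summarize the problems and pitfalls that come up most often in TCP programming.

Companion article: A Detailed Guide to UDP Programming

Tips for Creating a TCP Socket

The most basic way to create a TCP socket: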

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

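In fact, you do not need to spell out the arguments. The defaults (AF_INET and SOCK_STREAM) already give you a TCP socket:

s = socket.socket()

Creating a TCP socket and connecting it to a server normally takes two lines, but a single call can do both:

s = socket.create_connection(('localhost', 54321))

Or: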

with socket.create_connection((ip, port)) as s:
    ...

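Python's socket module also provides a convenient interface for creating a TCP server (Python 3.8+). According to the source, it enables the SO_REUSEADDR option automatically on POSIX platforms (discussed in detail later), and it accepts keyword-only parameters such as backlog and reuse_port, which is quite handy:

s = socket.create_server((ip, port), backlog=5, reuse_port=False)
sock, addr = s.accept()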
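To see the two helpers working together, here is a minimal sketch, assuming Python 3.8 or later. The loopback address, the ephemeral port 0, and the helper names `echo_once` and `roundtrip` are illustrative choices of mine, not part of the socket module.

```python
import socket
import threading


def echo_once(server):
    """Accept a single connection and echo back whatever arrives first."""
    conn, _addr = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def roundtrip(payload):
    """Start a one-shot echo server, send payload to it, return the reply."""
    # Port 0 asks the OS for any free port; getsockname() reveals which one.
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        # create_connection() creates the socket and connects in one call.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(payload)
            reply = client.recv(1024)
        worker.join()
    return reply
```

Calling `roundtrip(b"ping")` should hand the same bytes back for a small payload like this one.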

Multithreaded TCP Server Example

import socket
import threading

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 12345))
s.listen()

def handle_tcp(sock, addr):
    print("new connection from %s:%s" % addr)
    sock.send(b'Welcome!')

    while True:
        data = sock.recv(1024)
        if not data:
            break
        sock.send(b'Hello, %s!' % data)
    sock.close()

while True:
    sock, addr = s.accept()
    t = threading.Thread(target=handle_tcp, args=(sock, addr))
    t.start()

Multiprocess TCP Server Example

Replace the threading module with multiprocessing and you get a multiprocess TCP server:

import socket
import multiprocessing as mp

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 12345))
s.listen()

def handle_tcp(sock, addr):
    print("new connection from %s:%s" % addr)
    sock.send(b'Welcome!')

    while True:
        data = sock.recv(1024)
        if not data:
            break
        sock.send(b'Hello, %s!' % data)
    sock.close()

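while True:
    sock, addr = s.accept()
    t = mp.Process(target=handle_tcp, args=(sock, addr))
    t.start()
    # The child process holds its own copy of the connection. Close the
    # parent's copy so that the child's close() really ends the connection.
    sock.close()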

The listen Interface

A note on listen: it takes a backlog parameter that sets the length of the queue of connections waiting to be accepted:

socket.listen([backlog])

Enable a server to accept connections. If backlog is specified, it must be at least 0 (if it is lower, it is set to 0); it specifies the number of unaccepted connections that the system will allow before refusing new connections. If not specified, a default reasonable value is chosen. Changed in version 3.5: The backlog parameter is now optional.

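The backlog parameter affects the lengths of TCP's two queues (the half-open SYN queue and the accept queue); see the article "TCP's Two Queues" for details. To check the backlog value actually in effect, use the ss command.

listen does not block; accept is the call that blocks!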
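The difference is easy to observe with a timeout on the listening socket. This is a small sketch, assuming nobody connects to the freshly chosen port; the function name `accept_or_timeout` is made up for the illustration.

```python
import socket


def accept_or_timeout(timeout=0.2):
    """Return 'timed out' if accept() found nothing to accept in time."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        s.listen(5)            # returns at once; it only marks the socket passive
        s.settimeout(timeout)  # bound the wait so the demo cannot hang
        try:
            conn, _addr = s.accept()   # this is the call that waits
        except socket.timeout:
            return "timed out"
        conn.close()
        return "accepted"
```

Without the timeout, the accept call would simply wait until a client shows up.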

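TCP Server Example with Pre-created Processes or Threads

A single socket object is put into the listening state in one place, but several processes or threads can block in accept on it at the same time. The Linux kernel wakes one waiting process per incoming connection: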

import os
import socket
import multiprocessing as mp
# import threading

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 12345))
s.listen()

def accept_handle_tcp():
    while True:
        sock, addr = s.accept()
        print('pid %d' % os.getpid())
        print('new connection from %s:%s' % addr)
        data = sock.recv(1024)
        print('received:', data)
        sock.send(b'Hello, %s!' % data)
        sock.close()

for i in range(10):
    t = mp.Process(target=accept_handle_tcp, args=())
    # t = threading.Thread(target=accept_handle_tcp, args=())
    t.start()

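Pre-creating the processes or threads lets the server respond to requests faster, much like moving from CGI to FastCGI.

There are not multiple sockets here, only one! When a connection request arrives, the OS decides which process or thread handles it. Since Linux 2.6, the kernel no longer shows the thundering herd effect on accept.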

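SO_REUSEADDR

The TCP server examples above all set the SO_REUSEADDR socket option. The only purpose there is to let the server restart immediately after it stops. Without the option, the restart fails with OSError: [Errno 98] Address already in use.

SO_REUSEADDR has two effects:

(1) It allows binding to a socket address that is in the TIME_WAIT state

The side that actively closes a TCP connection enters the TIME_WAIT state, and during that time the address (ip+port) cannot be used by default. A socket with SO_REUSEADDR set can use it.

Without SO_REUSEADDR, the kernel treats socketA in TIME_WAIT as a valid socket that is still bound to its ip and port. Any attempt by another socketB to bind the same ip and port therefore fails until socketA is really released.

(2) It changes how binding to the wildcard address behaves

Binding to the wildcard address means binding to 0.0.0.0, which binds the socket to all network interface addresses of the host.

The table below lists all possible combinations: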

SO_REUSEADDR       socketA        socketB       Result
---------------------------------------------------------------------
  ON/OFF       192.168.0.1:21   192.168.0.1:21    Error (EADDRINUSE)
  ON/OFF       192.168.0.1:21      10.0.0.1:21    OK
  ON/OFF          10.0.0.1:21   192.168.0.1:21    OK
   OFF             0.0.0.0:21   192.168.1.0:21    Error (EADDRINUSE)
   OFF         192.168.1.0:21       0.0.0.0:21    Error (EADDRINUSE)
   ON              0.0.0.0:21   192.168.1.0:21    OK
   ON          192.168.1.0:21       0.0.0.0:21    OK
  ON/OFF           0.0.0.0:21       0.0.0.0:21    Error (EADDRINUSE)

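With SO_REUSEADDR, socketB can still bind port 21 on a more specific IP address after socketA has bound 0.0.0.0:21. This is the second precise meaning of SO_REUSEADDR. It is not really reuse; it is more like a better address management policy. Note that the table describes BSD-style semantics. Linux is stricter once one of the sockets is a listening TCP socket, so test the exact combination you rely on.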
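The first row of the table (same ip, same port) can be checked in a few lines. This is a sketch under my own assumptions: the first socket is already listening, the second sets no options, and `second_bind_is_refused` is a name invented for the demo.

```python
import errno
import socket


def second_bind_is_refused():
    """Bind two sockets to the same ip:port; True if the second gets EADDRINUSE."""
    first = socket.socket()
    second = socket.socket()
    try:
        first.bind(("127.0.0.1", 0))   # let the OS choose a free port
        first.listen()
        addr = first.getsockname()
        try:
            second.bind(addr)          # exactly the same address and port
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()
```

The other rows depend on the platform, so it is worth running a similar check on the target system before relying on them.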

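accept No Longer Causes a Thundering Herd

The effect discussed in this section is known in English as the thundering herd.

In short, the thundering herd effect occurs when multiple processes (or threads) are blocked, asleep, waiting for the same event. When the event occurs, all of them are woken up, but in the end only one process (thread) can win control of the event and handle it. The others are woken up for nothing, fail to get control, and go back to sleep. The term usually refers to the performance wasted by these useless wakeups.

A simple analogy: throw one grain into a flock of pigeons and every pigeon is startled and rushes for it, but only one pigeon can get the grain. The rest go back to wandering and wait for the next grain. The pigeons are the processes (threads), and the grain is the event waiting to be handled.

The TCP server example above creates multiple processes (threads) that all block in accept. When a TCP connection arrives, is there a thundering herd?

Answer: not anymore!

Since Linux 2.6, the thundering herd on accept has been solved. Roughly speaking, when the kernel receives a client connection, it wakes only the first process (thread) in the wait queue.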

SO_REUSEPORT

SO_REUSEPORT is what most people would expect SO_REUSEADDR to be. Basically, SO_REUSEPORT allows you to bind an arbitrary number of sockets to exactly the same source address and port as long as all prior bound sockets also had SO_REUSEPORT set before they were bound. If the first socket that is bound to an address and port does not have SO_REUSEPORT set, no other socket can be bound to exactly the same address and port, regardless if this other socket has SO_REUSEPORT set or not, until the first socket releases its binding again. Unlike in case of SO_REUSEADDR the code handling SO_REUSEPORT will not only verify that the currently bound socket has SO_REUSEPORT set but it will also verify that the socket with a conflicting address and port had SO_REUSEPORT set when it was bound.

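SO_REUSEPORT is the real address reuse! Different sockets that set this option can bind to exactly the same address. Linux requires the processes or threads that bind the same address to run under the same user, to prevent port hijacking.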

SO_REUSEPORT does not imply SO_REUSEADDR. This means if a socket did not have SO_REUSEPORT set when it was bound and another socket has SO_REUSEPORT set when it is bound to exactly the same address and port, the bind fails, which is expected, but it also fails if the other socket is already dying and is in TIME_WAIT state. To be able to bind a socket to the same addresses and port as another socket in TIME_WAIT state requires either SO_REUSEADDR to be set on that socket or SO_REUSEPORT must have been set on both sockets prior to binding them. Of course it is allowed to set both, SO_REUSEPORT and SO_REUSEADDR, on a socket.

There is not much more to say about SO_REUSEPORT other than that it was added later than SO_REUSEADDR, that's why you will not find it in many socket implementations of other systems, which "forked" the BSD code before this option was added, and that there was no way to bind two sockets to exactly the same socket address in BSD prior to this option. (Almost every TCP/IP stack implementation derives from the BSD code, or at least keeps the same interface.)

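The multithreaded (or multiprocess) examples above all listen in one thread and create a new thread after accept. Another example listens in one process and pre-creates several processes that call accept, with the OS scheduling among the accepting processes. SO_REUSEPORT adds one more way to build a TCP server.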

The new socket option SO_REUSEPORT allows multiple sockets on the same host to bind to the same port, and is intended to improve the performance of multithreaded network server applications running on top of multicore systems.

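SO_REUSEPORT lets multiple processes or threads bind to the same port (address), which improves server performance. It addresses the following:

Sockets in multiple processes or threads can listen/accept on the same address (multiple listeners)
Each process or thread owns its own socket (multiple sockets)
Load balancing is done at the kernel level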
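The "multiple sockets, one address" point can be verified inside a single process. This is a minimal sketch, assuming a platform that defines `socket.SO_REUSEPORT` (Linux 3.9+ and most BSDs); the helper names `reuseport_listener` and `two_listeners_same_port` are invented for the demo.

```python
import socket


def reuseport_listener(address):
    """Create a listening TCP socket with SO_REUSEPORT set before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind(address)
        s.listen()
    except OSError:
        s.close()
        raise
    return s


def two_listeners_same_port():
    """Return the addresses of two independent sockets sharing one port."""
    if not hasattr(socket, "SO_REUSEPORT"):
        return None   # e.g. Windows: the option does not exist there
    first = reuseport_listener(("127.0.0.1", 0))
    try:
        # The second socket binds exactly the address the first one got.
        second = reuseport_listener(first.getsockname())
        try:
            return first.getsockname(), second.getsockname()
        finally:
            second.close()
    finally:
        first.close()
```

Both sockets report the same address, which would raise EADDRINUSE if either of them had skipped the option.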

A new approach to creating a TCP server:

import os
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s.bind(('', 12345))
s.listen()

while True:
    sock, addr = s.accept()
    print('pid %d' % os.getpid())
    print('new connection from %s:%s' % addr)
    data = sock.recv(1024)
    print('received:', data)
    sock.send(b'Hello, %s!' % data)
    sock.close()

How many processes the TCP server above runs is decided on the command line:

$ python3 tcp_server.py &
[1] 3580
$ python3 tcp_server.py &
[2] 3581
$ python3 tcp_server.py &
[3] 3582

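Question: in my tests with 3 server processes, only 2 could serve connections at the same time. Is this because the Linux kernel has to keep one process blocked in accept? A plausible explanation (not verified here) is different: the kernel picks a listening socket by hashing the connection's addresses and ports, not by checking which process is idle. Two connections can therefore land on the same process, and since each process in this example serves one connection at a time, the second one waits in that socket's accept queue.

In the past, multiple worker processes were created by fork. With SO_REUSEPORT, fork is no longer required for several processes to listen on the same port. Each process has its own socket fd; when a new connection is established, the kernel wakes only one process to accept it and keeps the wakeups balanced. The model is simple and easy to maintain. Process management is decoupled from application logic, and scaling the number of processes is handed to the programmer or administrator, who can start and stop processes as needed, which adds flexibility. This gives a fine-grained way to scale horizontally: you can tune how many workers are appropriate and whether state is shared, and you can reduce each process's resource dependencies. It suits stateless server architectures best.

TCP server versions can coexist, for testing or for smooth upgrades! The server processes can run different versions, which makes an upgrade very smooth.

`SO_REUSEADDR` vs `SO_REUSEPORT`

An application that sets SO_REUSEADDR avoids being unable to reuse a port because TCP's TIME_WAIT state lasts too long, which shows up mainly in the short window when an application is stopped and restarted. Multiple processes/threads can only share the same socket object.
SO_REUSEPORT lets multiple processes/threads that belong to the same user (to prevent port hijacking), each with its own socket, share one address, while the kernel balances incoming connections among them on behalf of the application.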

Multicast Address

Most people ignore the fact that multicast addresses exist, but they do exist. While unicast addresses are used for one-to-one communication, multicast addresses are used for one-to-many communication. Most people got aware of multicast addresses when they learned about IPv6 but multicast addresses also existed in IPv4, even though this feature was never widely used on the public Internet.

The meaning of SO_REUSEADDR changes for multicast addresses as it allows multiple sockets to be bound to exactly the same combination of source multicast address and port. In other words, for multicast addresses SO_REUSEADDR behaves exactly as SO_REUSEPORT for unicast addresses. Actually, the code treats SO_REUSEADDR and SO_REUSEPORT identically for multicast addresses, that means you could say that SO_REUSEADDR implies SO_REUSEPORT for all multicast addresses and the other way round.

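Socket Options on Windows

Socket options on Windows are not the same as on Linux!

Windows only knows the SO_REUSEADDR option, there is no SO_REUSEPORT. Setting SO_REUSEADDR on a socket in Windows behaves like setting SO_REUSEPORT and SO_REUSEADDR on a socket in BSD.

Windows has only SO_REUSEADDR, and on Windows the option behaves much like SO_REUSEPORT on Linux. In other words, Windows gives you by default what SO_REUSEADDR provides on Linux. (In my own tests, a server could be restarted on the same address over and over without any problem.)

To prevent port hijacking, Linux requires all processes that share a port to run under the same user. Windows instead adds the SO_EXCLUSIVEADDRUSE option: once a program sets it, other programs cannot reuse the port.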

Setting SO_EXCLUSIVEADDRUSE on a socket makes sure that if the binding succeeds, the combination of source address and port is owned exclusively by this socket and no other socket can bind to them, not even if it has SO_REUSEADDR set.

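So on Windows, after creating a socket, use SO_REUSEADDR to allow other processes to bind the same port, and use SO_EXCLUSIVEADDRUSE to forbid it. The default is not to allow it, just without an explicit restriction.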
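One way to keep server code portable is to pick the option by platform. The following is only a sketch of that idea: `make_listener` is a hypothetical helper, and the policy it encodes (exclusive use on Windows, SO_REUSEADDR elsewhere) is my assumption about what a typical server wants.

```python
import socket
import sys


def make_listener(address):
    """Listening TCP socket with a platform-appropriate bind option."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if sys.platform == "win32":
            # Windows: restart already works; forbid others from sharing the port.
            if hasattr(socket, "SO_EXCLUSIVEADDRUSE"):
                s.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
        else:
            # POSIX: allow rebinding while old connections sit in TIME_WAIT.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(address)
        s.listen()
    except OSError:
        s.close()
        raise
    return s
```

For example, `make_listener(("127.0.0.1", 0))` returns a socket that is ready for accept on a free port.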

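The socket.create_server Interface

Python's socket module provides a convenient interface for creating a TCP server. According to the source, it enables SO_REUSEADDR automatically on POSIX platforms, and it also lets you set backlog, reuse_port, and other parameters, which is quite handy: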

s = socket.create_server((ip,port), backlog=5, reuse_port=True)
sock, addr = s.accept()

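Using ThreadingTCPServer

ThreadingTCPServer comes from the socketserver module of the Python standard library. I have built several servers with it; it is simple and pleasant to use, hence this summary.

ThreadingTCPServer is a threaded TCP server: on the server side, each TCP connection initiated by a client is handled by its own Python thread. In other words, ThreadingTCPServer is a multithreaded framework model.

Creating the server: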

import socketserver

# Useful configuration; these are all class variables.
socketserver.ThreadingTCPServer.allow_reuse_address = True
socketserver.ThreadingTCPServer.allow_reuse_port = True  # Python 3.11+
socketserver.ThreadingTCPServer.daemon_threads = True

# cm.TCP_PORT and myTcpHandler come from the author's own project:
# substitute your port number and your BaseRequestHandler subclass.
with socketserver.ThreadingTCPServer(
        ('0.0.0.0', cm.TCP_PORT),
        myTcpHandler) as tcp_server:
    tcp_server.serve_forever()

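0.0.0.0 means listening on all IPv4 addresses; it can also be written as ''
allow_reuse_address corresponds to SO_REUSEADDR
allow_reuse_port corresponds to SO_REUSEPORT
daemon_threads = True makes the per-connection threads daemon threads, so they do not keep the process alive on exit

myTcpHandler inherits from socketserver.BaseRequestHandler. The skeleton of that class is: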

class BaseRequestHandler:

    """Base class for request handler classes.

    This class is instantiated for each request to be handled.  The
    constructor sets the instance variables request, client_address
    and server, and then calls the handle() method.  To implement a
    specific service, all you need to do is to derive a class which
    defines a handle() method.

    The handle() method can find the request as self.request, the
    client address as self.client_address, and the server (in case it
    needs access to per-server information) as self.server.  Since a
    separate instance is created for each request, the handle() method
    can define other arbitrary instance variables.

    """

    def __init__(self, request, client_address, server):
        self.request = request
        self.client_address = client_address
        self.server = server
        self.setup()
        try:
            self.handle()
        finally:
            self.finish()

    def setup(self):
        pass

    def handle(self):
        pass

    def finish(self):
        pass

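As the code shows, each TCP connection thread runs setup first, then handle, and finally finish, whether or not handle raised an exception. Our own code mainly overrides handle and finish, and the subclass can of course add methods of its own.

That is all it takes to get a multithreaded TCP server up and running.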
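To make the pieces concrete, here is a small self-contained sketch of a handler plus a client round trip. `GreetingHandler`, `Server`, and `demo` are hypothetical names, and the greeting logic simply mirrors the earlier examples in this article.

```python
import socket
import socketserver
import threading


class GreetingHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: answers every received chunk with a greeting."""

    def handle(self):
        # self.request is the connected TCP socket for this client.
        while True:
            data = self.request.recv(1024)
            if not data:            # the client closed its sending side
                break
            self.request.sendall(b"Hello, " + data + b"!")


class Server(socketserver.ThreadingTCPServer):
    # Configure a subclass instead of patching the library class globally.
    allow_reuse_address = True
    daemon_threads = True


def demo(name=b"tcp"):
    """Run the server in a background thread and talk to it once."""
    with Server(("127.0.0.1", 0), GreetingHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(server.server_address, timeout=5) as c:
            c.sendall(name)
            c.shutdown(socket.SHUT_WR)   # signal "no more data"
            chunks = []
            while True:
                chunk = c.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        server.shutdown()                # stop serve_forever()
    return b"".join(chunks)
```

Setting the class variables on a subclass keeps the configuration local, which matters when one program runs several servers with different settings.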

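Private Protocols over TCP

TCP is a byte stream without message boundaries. When sending messages over TCP, there are several options for deciding whether a message has been received completely:

Close the socket to mark the end of the data stream
Implement a private protocol; the simplest one sends the message length first
Create a file-like object and use readline, although this is not generally applicable and suits only some scenarios

Private Protocol Example

The two functions below implement a simple private transfer protocol over TCP by prepending 8 bytes that carry the length to the data being sent: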

from functools import partial


def tsend_all(s, msg, autoencode=True):
    """Send the whole msg, prefixed with an 8-byte length.
    s should be a TCP socket.
    msg == [b]'' (empty) is allowed, as is an all-space msg like [b]'    '.
    The length is 8 bytes in big-endian order, counts the prefix itself,
    and is prepended to the msg automatically.
    This wraps socket.send, but instead of returning the number of bytes
    sent it returns None on success and raises on failure. Unlike
    socket.sendall, the timeout you set applies to each socket.send call.
    """
    body = msg.encode() if autoencode else msg
    # Take the length from the encoded bytes, not the str, so that
    # non-ASCII text gets a correct prefix.
    bmsg = (len(body) + 8).to_bytes(8, 'big') + body
    msglen = len(bmsg)  # msglen includes the prefix
    i = j = 0
    while i < msglen:
        j = s.send(bmsg[i:])
        if j == 0:
            raise ConnectionError('socket connection broken (tsend_all) %d'%i)
        i += j


def trecv_all(s, autodecode=True):
    """Return the received msg without the 8-byte length prefix.
    s should be a TCP socket.
    The returned msg may be [b]'' (empty) or all spaces like [b]'    '.
    This wraps socket.recv, but returns the whole msg according to the
    length carried in the first 8 bytes.
    """
    bmsg = b''
    while len(bmsg) < 8:
        chunk = s.recv(8-len(bmsg))
        if len(chunk) == 0:
            raise ConnectionError('socket connection broken (trecv_all) 1')
        bmsg += chunk
    msglen = int.from_bytes(bmsg, 'big')
    while len(bmsg) < msglen:
        chunk = s.recv(msglen-len(bmsg))
        if len(chunk) == 0:
            raise ConnectionError('socket connection broken (trecv_all) 2')
        bmsg += chunk
    return bmsg[8:].decode() if autodecode else bmsg[8:]


tsendb_all = partial(tsend_all, autoencode=False)
trecvb_all = partial(trecv_all, autodecode=False)

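Both functions fully account for the stream nature of TCP: recv on a TCP socket may return fewer bytes than the peer sent, and send may report fewer bytes than the full length of the data. Eight bytes for the length may be more than needed, but relative to the data the overhead is small, and with 8 bytes you can safely transfer files of any size!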
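For comparison, the same idea can be written more compactly with struct. This is a variant sketch and not the functions above: the names `send_frame`, `recv_exact`, and `recv_frame` are mine, the length here counts only the payload (not the prefix), and sending relies on socket.sendall.

```python
import socket
import struct

HEADER = struct.Struct("!Q")   # 8-byte unsigned length, network byte order


def send_frame(sock, payload):
    """Send payload prefixed with its length (header not counted)."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-frame")
        buf += chunk
    return bytes(buf)


def recv_frame(sock):
    """Read one length-prefixed frame and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

Two frames sent back to back come out as two separate messages on the other side, even though TCP itself delivers them as one undivided stream. An empty payload is a valid frame as well.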

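tsend_all can replace socket.sendall, but be clear about these details:

While socket.sendall calls send repeatedly internally, it does not reset the socket timeout, so the timeout limits the whole call
socket.sendall returns None on success, the same as tsend_all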

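Pitfalls of Using readline

If a socket receives data through readline on a buffered file object and later switches to recv, the data returned by recv will be incomplete, because part of it is already sitting in the file object's buffer! In this situation, call makefile with buffering=0 (which requires binary mode).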
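A short sketch of the safe combination follows. The message layout (one text header line followed by a fixed-size body) and the name `read_header_then_body` are assumptions made up for the illustration.

```python
import socket


def read_header_then_body(sock, body_len):
    """Read one line with readline(), then raw bytes with recv()."""
    # buffering=0 is only allowed in binary mode. The file object then reads
    # straight from the socket and keeps no read-ahead buffer of its own.
    with sock.makefile("rb", buffering=0) as f:
        header = f.readline()
    body = b""
    while len(body) < body_len:
        chunk = sock.recv(body_len - len(body))
        if not chunk:          # peer closed before the body was complete
            break
        body += chunk
    return header, body
```

The price of the unbuffered file object is that readline pulls data from the socket in very small reads, so this fits short headers better than bulk data.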

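Maximum Number of Connections of a TCP Server

Let me correct a misconception of my own: although the largest port number is 65535, that number has nothing to do with the maximum number of connections a TCP server can hold!

accept does return a socket object for talking to the client, but that socket does not occupy a port number of its own. If you inspect it with getsockname, it always reports the address the TCP server was bound to! In other words, all the sockets a TCP server uses to talk to its clients have the same name, namely the bound address. Port numbers are therefore not what limits the number of connections; other factors are, such as file descriptors or memory.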
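This is easy to confirm on the loopback interface. The sketch below assumes Python 3.8+ for `socket.create_server`; the function name `compare_names` and the dictionary keys are invented for the demo.

```python
import socket


def compare_names():
    """Collect the local and peer names of a listening and an accepted socket."""
    with socket.create_server(("127.0.0.1", 0)) as server:
        with socket.create_connection(server.getsockname()) as client:
            conn, _peer = server.accept()
            with conn:
                return {
                    "listening": server.getsockname(),
                    "accepted": conn.getsockname(),
                    "peer": conn.getpeername(),
                    "client": client.getsockname(),
                }
```

The "listening" and "accepted" entries are identical, while "peer" matches the client's own address. What distinguishes connections is the full pair of endpoints, not the server port alone.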

Do not close the socket that is doing accept; doing so makes that socket fail. They share the same address, which may be the reason.

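When send Returns an Error

Remember:

send can fail for many different reasons, and the code will probably have to examine the error further
If send fails but the socket is still connected, take special care not to lose the data that was not sent successfully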
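The second point can be sketched as a helper that hands the unsent tail back to the caller. This is only an illustration under my own assumptions: `send_keep_rest` is a hypothetical name, and I treat just BlockingIOError and timeouts as transient, letting every other error propagate.

```python
import socket


def send_keep_rest(sock, data):
    """Send as much of data as possible; return the part that was NOT sent."""
    view = memoryview(data)
    sent = 0
    try:
        while sent < len(view):
            sent += sock.send(view[sent:])
    except (BlockingIOError, socket.timeout):
        # Transient condition: the connection is still usable, so give the
        # unsent tail back to the caller instead of silently dropping it.
        pass
    return bytes(view[sent:])
```

The caller keeps the returned bytes and retries later, for example once the socket is reported writable again. Errors such as a reset connection still raise, because retrying there makes no sense.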

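close or shutdown

Pitfall 1

What is recorded in this section can matter a lot during testing!

In one test I found that calling close sent an RST segment, so the connection ended without the usual FIN exchange and no TIME_WAIT state appeared. After reading up, testing, and thinking it over, I arrived at one scenario that sends RST directly: the application connects, never calls send or recv, and then calls close. In the capture below, the server's 8-byte message had already arrived but was never read, and closing a socket that still has unread data in its receive buffer is a well-known cause of RST instead of FIN.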

$ sudo tcpdump -i ens33 -nn -vv tcp and port 12345
tcpdump: listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
12:54:43.453866 IP (tos 0x0, ttl 64, id 31047, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.16.101.50631 > 192.168.16.104.12345: Flags [S], cksum 0x0f89 (correct), seq 2774345498, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
12:54:43.453949 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.16.104.12345 > 192.168.16.101.50631: Flags [S.], cksum 0xee20 (correct), seq 306974476, ack 2774345499, win 64240, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
12:54:43.455200 IP (tos 0x0, ttl 64, id 31048, offset 0, flags [DF], proto TCP (6), length 40)
    192.168.16.101.50631 > 192.168.16.104.12345: Flags [.], cksum 0x27e3 (correct), seq 1, ack 1, win 513, length 0
12:54:43.456012 IP (tos 0x0, ttl 64, id 14577, offset 0, flags [DF], proto TCP (6), length 48)
    192.168.16.104.12345 > 192.168.16.101.50631: Flags [P.], cksum 0x8f86 (correct), seq 1:9, ack 1, win 502, length 8
12:54:43.505651 IP (tos 0x0, ttl 64, id 31049, offset 0, flags [DF], proto TCP (6), length 40)
    192.168.16.101.50631 > 192.168.16.104.12345: Flags [.], cksum 0x27db (correct), seq 1, ack 9, win 513, length 0
12:54:44.783433 IP (tos 0x0, ttl 64, id 31050, offset 0, flags [DF], proto TCP (6), length 40)
    192.168.16.101.50631 > 192.168.16.104.12345: Flags [R.], cksum 0x29d8 (correct), seq 1, ack 9, win 0, length 0

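The packets captured with tcpdump above correspond to a client that connects to the server, receives the server's message, and then closes directly.

When a connection is closed with RST, the sender does not wait for buffered data to go out; it discards whatever is in the buffer and sends the RST. The receiver does not need to acknowledge the RST with an ACK.

If the socket calls shutdown first, a FIN segment is sent. In my tests, a later close then did not send RST.

A server-side socket that does not occupy a port of its own (one returned by accept) enters TIME_WAIT, as the TCP specification requires, if it closes actively. In that state it uses no port resource but still consumes kernel memory; the file descriptor itself was already released by close.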

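Pitfall 2

Sometimes, perhaps even most of the time under normal conditions, calling close does not send FIN immediately. close only means that this socket object can no longer perform other operations; the underlying protocol stack keeps working!

I was able to reproduce this in testing. Two sockets connected over the public Internet kept exchanging data; when one of them was closed, the capture showed no FIN segment. The socket resources were not really released until one end timed out and sent FIN, while the other end was stuck in CLOSE_WAIT.

This finally explained code I had seen in some libraries: they almost always call shutdown before close to make sure the four-way close starts immediately.

My tests show that shutdown(socket.SHUT_RD) does not send FIN; only shutting down the write side does: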

socket.shutdown(socket.SHUT_RDWR)
# OR
socket.shutdown(socket.SHUT_WR)
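Put together, the shutdown-then-close pattern might look like the sketch below. `graceful_close` and its `drain_timeout` parameter are hypothetical, and draining the peer's remaining data before close is my own addition, meant to avoid closing with unread data in the receive buffer.

```python
import socket


def graceful_close(sock, drain_timeout=1.0):
    """Shut down the write side first, wait briefly for EOF, then close."""
    try:
        sock.shutdown(socket.SHUT_WR)    # on TCP this sends FIN right away
        sock.settimeout(drain_timeout)
        while sock.recv(4096):           # discard data until the peer's EOF
            pass
    except OSError:
        pass                             # not connected, reset, or timed out
    finally:
        sock.close()
```

The peer still receives everything that was sent before the shutdown, followed by an end-of-stream indication.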

Reading the source shows that socket.close does not necessarily close the socket for real:

# in cpython/Lib/socket.py
def _real_close(self, _ss=_socket.socket):
    # This function should not reference any globals. See issue #808164.
    _ss.close(self)

def close(self):
    # This function should not reference any globals. See issue #808164.
    self._closed = True
    if self._io_refs <= 0:
        self._real_close()

def detach(self):
    """detach() -> file descriptor

    Close the socket object without closing the underlying file descriptor.
    The object cannot be used after this call, but the file descriptor
    can be reused for other purposes.  The file descriptor is returned.
    """
    self._closed = True
    return super().detach()

The shutdown interface, in contrast, comes from the lower-level _socket, which is a C extension module.
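The effect of `_io_refs` can be observed directly: while a file object created by makefile is alive, close only marks the socket as closed. This is a small sketch on a local socket pair, and the function name `close_with_open_file` is made up for the demo.

```python
import socket


def close_with_open_file():
    """Show that close() is deferred while a makefile() object is alive."""
    a, b = socket.socketpair()
    f = a.makefile("rb")         # increments a._io_refs
    a.close()                    # only marks the socket as closed
    fd_after_close = a.fileno()  # still a valid descriptor at this point
    f.close()                    # last reference gone: the real close runs now
    fd_after_file_close = a.fileno()
    eof = b.recv(1)              # b'' means the peer has really closed
    b.close()
    return fd_after_close, fd_after_file_close, eof
```

The first value is a real descriptor number and the second is -1. A forgotten file object is therefore one concrete way for close to appear to do nothing on the wire.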

Link to this article: https://cs.pynote.net/net/tcp/202202062/
