A Deep Dive into the bind and connect Interfaces

In socket programming, bind assigns a fixed local address to a socket. A TCP server normally has to call bind first and then listen.

One well-known fact: if you omit the IP when calling bind, or write it as 0.0.0.0, the socket is bound to all interfaces.

>>> import socket
>>> s = socket.socket()
>>> s.bind(('',12345))
>>> s.getsockname()
('0.0.0.0', 12345)

>>> import socket
>>> s = socket.socket()
>>> s.bind(('',0))
>>> s.getsockname()
('0.0.0.0', 41623)
>>>
>>> s2 = socket.socket()
>>> s2.bind(('',0))
>>> s2.getsockname()
('0.0.0.0', 46913)

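When the port is 0, bind automatically picks an available port, effectively at random, from the allowed (ephemeral) range and assigns it to the socket.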

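Normally a TCP client does not call bind and calls connect directly. The server then sees the client using a seemingly random port, because the bind step was carried out automatically as part of connect.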
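To make the implicit bind visible, here is a minimal sketch. It assumes a throwaway listener on the loopback address 127.0.0.1, which exists only for the demo. The client never calls bind, yet getsockname() reports a concrete local port after connect, and the server sees the same address as its peer.

```python
import socket

# Throwaway loopback listener on a kernel-chosen port (demo-only setup).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# No bind() here: connect() makes the kernel choose the local IP and port.
client.connect(server.getsockname())

conn, peer = server.accept()
local = client.getsockname()
print("client local address:", local)
print("peer seen by server: ", peer)

conn.close()
client.close()
server.close()
```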

Unlike in case of ports, a socket can really be bound to "any address" which means "all source IP addresses of all local interfaces". If the socket is connected later on, the system has to choose a specific source IP address, since a socket cannot be connected and at the same time be bound to any local IP address. Depending on the destination address and the content of the routing table, the system will pick an appropriate source address and replace the "any" binding with a binding to the chosen source IP address.

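For bind, ports and IPs behave differently: a socket can be bound to only one port, but it really can be bound to all IPs. Once a connection is actually established, the system settles on one specific IP address.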
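The replacement of the "any" binding can be observed directly. The sketch below again assumes a demo-only loopback listener. The client binds to the wildcard address with port 0, and getsockname() is compared before and after connect. On the systems I am aware of, the port stays the same while the IP changes from 0.0.0.0 to the chosen source address.

```python
import socket

# Demo-only loopback listener on a kernel-chosen port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.bind(("", 0))            # wildcard IP, kernel-chosen port
before = client.getsockname()   # IP is still the wildcard 0.0.0.0

client.connect(server.getsockname())
after = client.getsockname()    # IP is now a specific source address

print("before connect:", before)
print("after connect: ", after)

client.close()
server.close()
```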

The connect interface

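We use connect with TCP all the time, where it starts the three-way handshake with the peer. Note that a UDP socket can call connect as well, in which case it fixes the address of the communication peer.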
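A small sketch of connect on a UDP socket, assuming two sockets on the loopback address (the names sender and receiver are only illustrative). No handshake takes place. connect just records the default peer, so the sender can use send() without passing an address each time.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)        # avoid blocking forever if the datagram is lost
receiver_addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.connect(receiver_addr)   # records the peer address; nothing is sent yet
peer = sender.getpeername()
sender_addr = sender.getsockname()

sender.send(b"ping")            # no destination argument needed after connect
data, source = receiver.recvfrom(1024)
print("peer recorded by connect:", peer)
print("received", data, "from", source)

sender.close()
receiver.close()
```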

Most people know that bind() may fail with the error EADDRINUSE, however, when you start playing around with address reuse, you may run into the strange situation that connect() fails with that error as well. How can this be? How can a remote address, after all that's what connect adds to a socket, be already in use? Connecting multiple sockets to exactly the same remote address has never been a problem before, so what's going wrong here?

A connection is defined by a tuple of five values, remember? And I also said, that these five values must be unique otherwise the system cannot distinguish two connections any longer, right? Well, with address reuse, you can bind two sockets of the same protocol to the same source address and port. That means three of those five values are already the same for these two sockets. If you now try to connect both of these sockets also to the same destination address and port, you would create two connected sockets, whose tuples are absolutely identical. This cannot work, at least not for TCP connections (UDP connections are no real connections anyway). If data arrived for either one of the two connections, the system could not tell which connection the data belongs to. At least the destination address or destination port must be different for either connection, so that the system has no problem identifying which connection incoming data belongs to.

So if you bind two sockets of the same protocol to the same source address and port and try to connect them both to the same destination address and port, connect() will actually fail with the error EADDRINUSE for the second socket you try to connect, which means that a socket with an identical tuple of five values is already connected.

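In the kernel, a connection is identified by a 5-tuple: (protocol, source IP, source port, destination IP, destination port). With address reuse, the protocol, source IP, and source port are all the same. If both sockets then connect to the same destination IP and destination port, the second connect fails with EADDRINUSE, because the kernel can no longer tell the two connections apart.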
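The 5-tuple collision can be reproduced with a short sketch. It rests on some assumptions: a demo-only loopback listener, a hypothetical helper reusable_tcp_socket(), and a platform that lets two non-listening TCP sockets share a local address when both set SO_REUSEADDR (plus SO_REUSEPORT where it exists). The exact errno also appears to be platform-dependent. BSD-style stacks report EADDRINUSE as described above, while Linux, as far as I know, reports EADDRNOTAVAIL for the same situation, so the sketch records whichever one is raised.

```python
import errno
import socket

def reusable_tcp_socket():
    """Hypothetical helper: a TCP socket with address reuse enabled."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):  # not defined on every platform
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    return s

# Demo-only loopback listener; it provides the shared destination IP and port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(5)
dest = server.getsockname()

# Two sockets bound to the same source IP and port via address reuse.
a = reusable_tcp_socket()
a.bind(("127.0.0.1", 0))
src = a.getsockname()
b = reusable_tcp_socket()
b.bind(src)

a.connect(dest)            # first 5-tuple: accepted by the kernel
second_errno = None
try:
    b.connect(dest)        # identical 5-tuple: expected to be rejected
except OSError as exc:
    second_errno = exc.errno

print("second connect failed with:", errno.errorcode.get(second_errno))

for s in (a, b, server):
    s.close()
```

Pointing b at a different destination port (for example a second listener) should let both connects succeed, since the two 5-tuples would then differ.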