Raw Socket Usage Summary

Last Updated: 2023-06-18 06:25:00 Sunday

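Raw sockets make many advanced networking tasks possible, such as assembling low-level packets yourself and sending or receiving them, capturing packets, and measuring the traffic of a specific address.

Creating a raw socket requires root privileges (for example via sudo; on Linux, strictly speaking, the CAP_NET_RAW capability)!

One novel way to look at it: an ordinary socket also captures packets, but only at the application layer. What it delivers is the application payload with the IP header and the TCP/UDP header already stripped.

Almost anything can be captured, but each specific problem needs its own specific solution. Asking ChatGPT first is a reasonable starting point.

Capturing raw IP packets

The payload of an IP packet is TCP, UDP, ICMP and so on, so the protocol given when creating the socket is one of these.

Below is an example that captures UDP packets: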

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_UDP)
while True:
    data, addr = s.recvfrom(65535)
    # data is a raw IP packet (IP header included) that carries a UDP datagram
    ...

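The code above captures every UDP packet addressed to the local machine's IP addresses.

The first three options below are the most commonly used for capturing IPv4 packets:

socket.IPPROTO_UDP   # 17
socket.IPPROTO_TCP   # 6
socket.IPPROTO_ICMP  # 1
socket.IPPROTO_IGMP  # 2
socket.IPPROTO_IP    # 0, unsupported for raw sockets on Linux

What is captured is always a raw IP packet, so your own parsing code has to start at the IP header.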
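
Since parsing has to start at the IP header, a small parser helps. The following is a minimal sketch, assuming an IPv4 packet as returned by recvfrom on a raw socket; the function names parse_ipv4_header and parse_udp are made up for this illustration and are not part of any library:

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Parse the fixed 20-byte part of an IPv4 header (hypothetical helper)."""
    if len(packet) < 20:
        raise ValueError("packet too short for an IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,  # IHL is counted in 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,                   # 1 = ICMP, 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def parse_udp(packet: bytes) -> dict:
    """Parse the UDP header that follows the IPv4 header (hypothetical helper)."""
    off = parse_ipv4_header(packet)["header_len"]
    sport, dport, length, _cksum = struct.unpack("!HHHH", packet[off:off + 8])
    # The UDP length field covers the 8-byte UDP header plus the payload.
    return {"sport": sport, "dport": dport, "length": length,
            "payload": packet[off + 8:off + length]}
```

In the UDP capture loop, parse_udp(data) would be called on each received packet. IP options are skipped through header_len, but IP fragments are not reassembled in this sketch.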

Below is a piece of ICMP send/receive code from the pingscan project:

def rpingc(ip, cnt):
    """Return (number of ping replies received, number of requests sent)."""
    # PROTO_ICMP, ICMP_IDENT, RECV_BUFFER, icmp_pkt() and ck_recv_pkt()
    # are defined elsewhere in the pingscan project.
    rsock = socket.socket(socket.AF_INET, socket.SOCK_RAW, PROTO_ICMP)
    #
    # After connect(), only ICMP packets whose source IP matches the
    # connected address are received; the port number is irrelevant.
    # A packet from another source can still slip in, because there is
    # a gap between socket creation and connect().
    #
    rsock.connect((ip, 0))
    rsock.settimeout(2)
    rtv = 0
    tnum = 0
    sent = 0  # stays 0 when cnt == 0, so the return value is always defined
    for i in range(cnt):
        sent = i + 1
        try:
            # send and receive synchronously
            rsock.sendto(icmp_pkt(ICMP_IDENT, i), (ip, 0))
            ippkt, addr = rsock.recvfrom(RECV_BUFFER)
            if ck_recv_pkt(ippkt, ip, i):
                rtv += 1
        except socket.timeout:
            if (tnum := tnum + 1) == 2:  # give up after 2 timeouts
                break
    rsock.close()
    return rtv, sent
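
The pingscan helper that builds the ICMP packet is not shown here. As an illustration only, an ICMP Echo Request could be assembled roughly like this; inet_checksum and build_echo_request are hypothetical names, the default payload is arbitrary, and this is not the project's actual code:

```python
import struct


def inet_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with one zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"pingscan") -> bytes:
    """Build an ICMP Echo Request (type 8, code 0); hypothetical helper."""
    # The checksum is computed with the checksum field itself set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    cksum = inet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, cksum, ident, seq) + payload
```

A receiver can validate a message by running inet_checksum over the whole ICMP message, which yields 0 when the checksum is correct. On the receive side the raw socket returns the IP header too, so the ICMP message starts after the IP header.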

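Capturing packets sent by the local machine

tcpdump can capture outgoing packets.

After reading a lot of material and testing some code, I found that packets sent by the local machine can only be captured at layer 2, with the protocol set to ETH_P_ALL.

Also, Python's socket module did not predefine the ETH_* constants at the time of writing (socket.ETH_P_ALL was only added in Python 3.12), so the value is defined by hand: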

import socket

ETH_P_ALL = 0x0003

s = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
s.bind(('ens33',0))
while True:
    data, addr = s.recvfrom(65535)
    print(addr)
    print(addr[4].hex(':'))
    # destination mac
    print(data[:6].hex(':'), end=' --> ')
    # source mac
    print(data[6:12].hex(':'))
    # check whether the frame carries an IPv4 packet (EtherType 0x0800)
    if data[12:14] == b'\x08\x00':
        # source ip
        print(socket.inet_ntoa(data[26:30]), end=' --> ')
        # destination ip
        print(socket.inet_ntoa(data[30:34]))

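Here bind takes an interface name, which means that only packets on that interface are captured.

With socket.PF_PACKET, addr is no longer an (ip, port) tuple but a more complex tuple with five elements: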

('ens33', 2054, 1, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
ff:ff:ff:ff:ff:ff --> 48:7d:2e:c4:c8:81
('ens33', 2054, 1, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
ff:ff:ff:ff:ff:ff --> 48:7d:2e:c4:c8:81
('ens33', 2054, 4, 1, b'\x00\x0c)l \x00')
00:0c:29:6c:20:00
48:7d:2e:c4:c8:81 --> 00:0c:29:6c:20:00
('ens33', 2048, 0, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
00:0c:29:6c:20:00 --> 48:7d:2e:c4:c8:81
192.168.16.1 --> 224.0.0.1
('ens33', 2048, 4, 1, b'\x00\x0c)l \x00')
00:0c:29:6c:20:00
01:00:5e:00:00:16 --> 00:0c:29:6c:20:00
192.168.16.104 --> 224.0.0.22
('ens33', 2048, 1, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
ff:ff:ff:ff:ff:ff --> 48:7d:2e:c4:c8:81
192.168.16.1 --> 192.168.16.255
('ens33', 2054, 1, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
ff:ff:ff:ff:ff:ff --> 48:7d:2e:c4:c8:81
('ens33', 2054, 1, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
ff:ff:ff:ff:ff:ff --> 48:7d:2e:c4:c8:81
('ens33', 2054, 1, 1, b'H}.\xc4\xc8\x81')
48:7d:2e:c4:c8:81
ff:ff:ff:ff:ff:ff --> 48:7d:2e:c4:c8:81
('ens33', 2054, 4, 1, b'\x00\x0c)l \x00')
00:0c:29:6c:20:00
48:7d:2e:c4:c8:81 --> 00:0c:29:6c:20:00

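ens33 is the interface on which the packet was captured.

2048 = 0x0800. This two-byte value corresponds to the type/length field of the Ethernet II frame; 0x0800 means an IP packet, and 2054 = 0x0806 means an ARP packet.

The integer in the middle is pkttype - Optional integer specifying the packet type. The possible values are listed below; 4 (PACKET_OUTGOING) marks frames sent by the local machine: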

>>> for it in dir(socket):
...   if it.find('PACKET_') != -1:
...     print(it, getattr(socket, it))
... 
PACKET_BROADCAST 1
PACKET_FASTROUTE 6
PACKET_HOST 0
PACKET_LOOPBACK 5
PACKET_MULTICAST 2
PACKET_OTHERHOST 3
PACKET_OUTGOING 4

Next is hatype - Optional integer specifying the ARP hardware address type. (This is the type of hardware used by the local network that carries the ARP message. Ethernet is the common hardware type, and its value is 1.)

Last is addr - Optional bytes-like object specifying the hardware physical address, whose interpretation depends on the device. It is a MAC address; judging from the output above, it is always the source MAC.
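
To make the five-element tuple easier to read, it can be turned into a one-line description. This is only a sketch: describe_ll_addr and PKTTYPE_NAMES are names invented here, and the numeric values are copied from the PACKET_* listing above:

```python
# pkttype values as printed from the socket module above (Linux).
PKTTYPE_NAMES = {
    0: "HOST", 1: "BROADCAST", 2: "MULTICAST", 3: "OTHERHOST",
    4: "OUTGOING", 5: "LOOPBACK", 6: "FASTROUTE",
}


def describe_ll_addr(addr: tuple) -> str:
    """Format the (ifname, proto, pkttype, hatype, addr) tuple of a packet socket."""
    ifname, proto, pkttype, hatype, hwaddr = addr
    return "%s proto=0x%04x pkttype=%s hatype=%d mac=%s" % (
        ifname, proto, PKTTYPE_NAMES.get(pkttype, str(pkttype)),
        hatype, hwaddr.hex(":"))  # bytes.hex(sep) needs Python 3.8+
```

Calling it on the first tuple from the output above should give a line that names the interface, the EtherType 0x0806, the BROADCAST packet type and the MAC 48:7d:2e:c4:c8:81. Checking addr[2] == 4 is one way to keep only outgoing frames.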

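As the IP address offsets in the test code show, the layer-2 header takes up only 14 bytes. Following the Ethernet II frame structure, these 14 bytes are the destination MAC, the source MAC, and the type/length field.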
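
The 14-byte layout can be unpacked in one step with struct. A minimal sketch, assuming an untagged Ethernet II frame (no VLAN tag); parse_ethernet is a hypothetical helper name:

```python
import struct

ETHERTYPE_NAMES = {0x0800: "IPv4", 0x0806: "ARP"}


def parse_ethernet(frame: bytes) -> dict:
    """Split an Ethernet II frame into its 14-byte header fields and payload."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet II header")
    # 6 bytes destination MAC, 6 bytes source MAC, 2 bytes type (big-endian)
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "type_name": ETHERTYPE_NAMES.get(ethertype, "other"),
        "payload": frame[14:],  # for IPv4 this starts with the IP header
    }
```

With this split, the source IP of an IPv4 frame sits at payload[12:16], which is the same position as data[26:30] in the capture loop above.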

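This kind of socket is reportedly often used to write packet capture programs.

Capturing layer-2 Ethernet II frames

The technique for capturing layer-2 frames is the same as the one used above for capturing packets sent by the local machine, but with a different protocol value; in that case only received frames are captured.

Sockets of the PF_PACKET protocol family are widely used.

ETH_P_ALL itself is defined in /usr/include/linux/if_ether.h: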

#define ETH_P_ALL       0x0003

ETH_P_ALL is a two-byte value, 0x0003.

Others:

#define ETH_P_LOOP 0x0060 /* Ethernet Loopback packet */

#define ETH_P_PUP 0x0200 /* Xerox PUP packet */

#define ETH_P_PUPAT 0x0201 /* Xerox PUP Addr Trans packet */

#define ETH_P_IP 0x0800 /* Internet Protocol packet */

#define ETH_P_X25 0x0805 /* CCITT X.25 */

#define ETH_P_ARP 0x0806 /* Address Resolution packet */

#define ETH_P_BPQ 0x08FF /* G8BPQ AX.25 Ethernet Packet [ NOT AN

#define ETH_P_IEEEPUP 0x0a00 /* Xerox IEEE802.3 PUP packet */

#define ETH_P_IEEEPUPAT 0x0a01 /* Xerox IEEE802.3 PUP Addr Trans packet*/

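In Ethernet encapsulation, a type field of 0x0800 in the data link frame indicates that the data is an IP packet, 0x0806 indicates an ARP packet, and so on.

ETH_P_RARP 0x8035 receives only RARP frames sent to the local MAC.

Most of these are of little use; probably only 0x0800 (IP) and 0x0806 (ARP) are commonly used.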
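
Restricting a packet socket to one of these EtherTypes might look like the sketch below. It assumes Linux and root privileges; open_packet_socket is a hypothetical helper, and the two constants are defined by hand with the values from if_ether.h:

```python
import socket

ETH_P_IP = 0x0800   # IPv4
ETH_P_ARP = 0x0806  # ARP


def open_packet_socket(ifname: str, ethertype: int) -> socket.socket:
    """Open a Linux packet socket that receives frames of one EtherType.

    Needs root privileges. The protocol must be passed in network byte
    order, hence socket.htons().
    """
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ethertype))
    s.bind((ifname, 0))  # protocol 0 keeps the one given to socket()
    return s
```

For example, open_packet_socket('ens33', ETH_P_ARP) would be expected to deliver only received ARP frames on ens33, so less filtering is needed in user code than with ETH_P_ALL.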

PACKET socket

See man packet; it is very detailed!

Packet sockets are used to receive or send raw packets at the device driver (OSI Layer 2) level. They allow the user to implement protocol modules in user space on top of the physical layer.

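The socket.PF_PACKET used in the code above is a packet socket.

Traffic statistics

Once packets can be captured, counting the traffic of a specific IP address or port becomes simple. Two things need to be decided:

  1. whether to count traffic in both directions;
  2. whether to count the IP header and the TCP or UDP header.

Different answers to these two questions lead to different code.

If only received traffic needs to be counted, capturing raw IP packets is enough; whether to count the IP header or the TCP/UDP header is handled in your own code. Capturing all Ethernet II frames costs somewhat more, because every frame is captured and has to be analyzed in code.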
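
A per-source-IP byte counter for raw IPv4 packets could be sketched as follows. counted_bytes and account are hypothetical names, and the include_headers switch corresponds to the second question above; the sketch assumes well-formed packets and does no fragment handling:

```python
import socket
from collections import Counter


def counted_bytes(packet: bytes, include_headers: bool) -> int:
    """Bytes to account for one raw IPv4 packet."""
    total_len = int.from_bytes(packet[2:4], "big")  # IP total length field
    if include_headers:
        return total_len
    ihl = (packet[0] & 0x0F) * 4          # IP header length in bytes
    proto = packet[9]
    if proto == socket.IPPROTO_TCP:
        l4 = (packet[ihl + 12] >> 4) * 4  # TCP data offset, in 32-bit words
    elif proto == socket.IPPROTO_UDP:
        l4 = 8                            # the UDP header is always 8 bytes
    else:
        l4 = 0                            # other protocols: whole IP payload
    return max(total_len - ihl - l4, 0)


def account(stats: Counter, packet: bytes, include_headers: bool = False) -> None:
    """Add one packet to the per-source-IP byte counters."""
    src = socket.inet_ntoa(packet[12:16])
    stats[src] += counted_bytes(packet, include_headers)
```

In a capture loop, account(stats, data) would be called for every packet returned by recvfrom, and stats could be printed periodically. Counting by destination address or by port follows the same pattern with a different key.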

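Promiscuous Mode

Normally a NIC only accepts frames whose destination address is its own. By default it passes only frames addressed to the local machine (including broadcast frames) up to upper-layer programs and drops everything else. In promiscuous mode the NIC accepts every frame that reaches it, regardless of destination address, including frames not addressed to the local machine.

NIC operating modes

By default a NIC works in broadcast mode and direct mode, meaning it only receives broadcast frames and frames addressed to itself. (A NIC can be in several modes at the same time.)

Note that not all traffic on the LAN necessarily passes through your NIC. In a switched network, for example, the switch associates MAC addresses with ports and forwards at layer 2, so in general the non-broadcast frames of other hosts never reach your NIC, unless ARP spoofing is used!

A NIC can be put into promiscuous mode with the ifconfig command, for example ifconfig eth0 promisc (and ifconfig eth0 -promisc to turn it off again).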
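
Whether the flag is currently set can be checked without special privileges by reading the interface flags from sysfs. This is a sketch under the assumption of a Linux system that exposes /sys/class/net/<ifname>/flags; the function names are hypothetical:

```python
IFF_PROMISC = 0x100  # same flag bit as in <net/if.h>


def flags_have_promisc(flags_text: str) -> bool:
    """Interpret the hex text of a sysfs flags file, e.g. '0x1103'."""
    return bool(int(flags_text.strip(), 16) & IFF_PROMISC)


def is_promisc(ifname: str) -> bool:
    """Read the interface flags from sysfs and test the promiscuous bit."""
    with open("/sys/class/net/%s/flags" % ifname) as f:
        return flags_have_promisc(f.read())
```

Calling is_promisc('eth0') before and after the ifconfig command is a quick way to confirm that the mode change took effect.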

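Setting promiscuous mode on Linux and capturing layer-2 frames (C)

The code below sets and clears promiscuous mode on an interface and captures layer-2 frames: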

$ cat cap.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_ether.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <net/ethernet.h>
#include <arpa/inet.h>


/**
 * Set (set != 0) or clear (set == 0) promiscuous mode on the
 * interface identified by if_name.
 **/
int set_promisc(const char *if_name, int sockfd, int set) {
    struct ifreq ifr;

    /* bounded copy: ifr_name holds at most IFNAMSIZ bytes */
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, if_name, IFNAMSIZ - 1);
    if (ioctl(sockfd, SIOCGIFFLAGS, &ifr) != 0) {
        printf("Get interface flags failed.\n");
        return -1;
    }

    /* set or clear the promiscuous flag */
    if (set != 0)
        ifr.ifr_flags |= IFF_PROMISC;
    else
        ifr.ifr_flags &= ~IFF_PROMISC;

    if (ioctl(sockfd, SIOCSIFFLAGS, &ifr) != 0) {
        printf("Set interface flags failed.\n");
        return -1;
    }

    return 0;
}


int main(void) {
    int sockfd;
    int ret = 0;
    unsigned char buffer[1518] = {0};
    unsigned char *eth_head = NULL;

    if ((sockfd=socket(PF_PACKET,SOCK_RAW,htons(ETH_P_ALL))) < 0) {
        printf ("create socket failed\n");
        return -1;
    }

    // set interface name here
    if (set_promisc("eth0",sockfd,1) != 0) {
        printf ("Failed to set interface promisc mode\n");
        return -1;
    }

    int i = 4;
    while (i--)
    {
        memset(buffer, 0, sizeof(buffer));
        ret = recvfrom(sockfd, buffer, sizeof(buffer), 0, NULL, NULL);
        printf("#### received frame length : %d\n", ret);

        eth_head = buffer;
        printf("---- Frame Start ----\n");

        /* get source and destination mac address */
        printf("destination mac %02X:%02X:%02X:%02X:%02X:%02X  <--  "
               "source mac %02X:%02X:%02X:%02X:%02X:%02X\n", eth_head[0],
                  eth_head[1], eth_head[2], eth_head[3], eth_head[4],
                  eth_head[5], eth_head[6], eth_head[7], eth_head[8],
                  eth_head[9], eth_head[10], eth_head[11]);
        printf("eth_type 0x%02x%02x\n", eth_head[12], eth_head[13]);

        /* ARP protocol flag */
        if (0x08 == eth_head[12] && 0x06 == eth_head[13]) {
            printf("ARP source ip:%u.%u.%u.%u,destination ip:%u.%u.%u.%u;\n",
                      eth_head[28], eth_head[29], eth_head[30], eth_head[31],
                      eth_head[38], eth_head[39], eth_head[40], eth_head[41]);
        }

        /* IPv4 protocol flag */
        else if (0x08 == eth_head[12] && 0x00 == eth_head[13]) {
            if (0x45 == eth_head[14]) {
                printf("IPv4 source ip:%u.%u.%u.%u,destination ip:%u.%u.%u."
                          "%u;\n", eth_head[26], eth_head[27], eth_head[28],
                          eth_head[29], eth_head[30], eth_head[31],
                          eth_head[32], eth_head[33]);
            }
            else {
                printf("p_head:%02x\n", eth_head[14]);
            }
        }

        printf("---- Frame End ----\n");
    }

    // unset promisc
    set_promisc("eth0", sockfd, 0);
    return 0;
}

Sample output:

#### received frame length : 60
---- Frame Start ----
destination mac 00:16:3F:00:67:7B  <--  source mac EE:FF:FF:FF:FF:FF
eth_type 0x0800
IPv4 source ip:49.77.232.115,destination ip:172.16.6.90;
---- Frame End ----
#### received frame length : 118
---- Frame Start ----
destination mac EE:FF:FF:FF:FF:FF  <--  source mac 00:16:3F:00:67:7B
eth_type 0x0800
IPv4 source ip:172.16.6.90,destination ip:49.77.232.115;
---- Frame End ----
#### received frame length : 102
---- Frame Start ----
destination mac EE:FF:FF:FF:FF:FF  <--  source mac 00:16:3F:00:67:7B
eth_type 0x0800
IPv4 source ip:172.16.6.90,destination ip:49.77.232.115;
---- Frame End ----
#### received frame length : 118
---- Frame Start ----
destination mac EE:FF:FF:FF:FF:FF  <--  source mac 00:16:3F:00:67:7B
eth_type 0x0800
IPv4 source ip:172.16.6.90,destination ip:49.77.232.115;
---- Frame End ----
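
The byte offsets 28..31 and 38..41 used for ARP in the C code follow from the ARP layout for IPv4 over Ethernet: the 28-byte ARP body starts right after the 14-byte Ethernet header. The same parsing in Python, as a sketch with a hypothetical helper name parse_arp:

```python
import socket
import struct


def parse_arp(frame: bytes) -> dict:
    """Parse an ARP packet (IPv4 over Ethernet) from an Ethernet II frame."""
    if len(frame) < 42:
        raise ValueError("frame too short for Ethernet + ARP")
    if frame[12:14] != b"\x08\x06":
        raise ValueError("not an ARP frame")
    # htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
    (htype, ptype, hlen, plen, oper,
     sha, spa, tha, tpa) = struct.unpack("!HHBBH6s4s6s4s", frame[14:42])
    return {
        "oper": oper,                          # 1 = request, 2 = reply
        "sender_mac": sha.hex(":"),
        "sender_ip": socket.inet_ntoa(spa),    # frame[28:32]
        "target_mac": tha.hex(":"),
        "target_ip": socket.inet_ntoa(tpa),    # frame[38:42]
    }
```

The sender IP occupies frame[28:32] and the target IP frame[38:42], which matches the eth_head indices printed by the C program.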

Setting promiscuous mode on Linux with Python

$ cat promisc.py
import ctypes
import fcntl
import socket
import time


ETH_P_ALL    = 0x0003 # all protocols
SIOCGIFFLAGS = 0x8913 # get interface flags
SIOCSIFFLAGS = 0x8914 # set interface flags
IFF_PROMISC  = 0x100


class ifreq(ctypes.Structure):
    # Only the name and flags members are used; the padding brings the
    # structure up to the 40 bytes of struct ifreq on 64-bit Linux.
    _fields_ = [("ifname", ctypes.c_char * 16),
                ("ifflags", ctypes.c_short),
                ("_pad", ctypes.c_char * 22)]


sk = socket.socket(socket.PF_PACKET,
                   socket.SOCK_RAW,
                   socket.htons(ETH_P_ALL))


def set_promisc(ifname, enable=True):
    ifr = ifreq()
    ifr.ifname = ifname.encode()
    fcntl.ioctl(sk, SIOCGIFFLAGS, ifr)  # read the current flags
    if enable:
        ifr.ifflags |= IFF_PROMISC  # add the promiscuous flag
    else:
        ifr.ifflags &= ~IFF_PROMISC  # clear the promiscuous flag
    fcntl.ioctl(sk, SIOCSIFFLAGS, ifr)  # write the flags back


print('set promisc')
set_promisc('eth0')
time.sleep(10)
print('unset promisc')
set_promisc('eth0', False)
sk.close()

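Sending raw IP packets

Sending Ethernet II frames

Capturing IP packets in both directions on Windows (Python)

Some socket parameters differ when writing network programs in Python on Windows. The code below captures IP packets in both directions on Windows: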

import os
import socket


if __name__ == "__main__":
    # host to listen on (the local machine's IP)
    host = '192.168.16.101'

    # create a raw socket and bind it to the public interface
    if os.name == "nt":
        socket_protocol = socket.IPPROTO_IP
    else:
        # Assumption: fall back to ICMP elsewhere, because IPPROTO_IP is
        # not supported for raw sockets on Linux (see above).
        socket_protocol = socket.IPPROTO_ICMP

    sniffer = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket_protocol)
    sniffer.bind((host, 0))

    # include the IP header in captured packets
    sniffer.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)

    # on Windows, an IOCTL is needed to enable promiscuous mode
    if os.name == "nt":
        sniffer.ioctl(socket.SIO_RCVALL, socket.RCVALL_ON)

    try:
        while True:
            # read one packet
            data = sniffer.recvfrom(65535)[0]

            protocol = data[9]
            srcip = socket.inet_ntoa(data[12:16])
            destip = socket.inet_ntoa(data[16:20])

            # print the protocol and the IP addresses of both peers
            print("protocol: %d %s -> %s" % (protocol, srcip, destip))

    # CTRL-C
    except KeyboardInterrupt:
        # on Windows, turn promiscuous mode off again
        if os.name == "nt":
             sniffer.ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF)
        raise

Capturing Ethernet II frames on Windows

libpcap

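For higher performance or more powerful capture capabilities, use libpcap together with BPF filter rules. The BPF rules are compiled into bytecode and executed in the kernel.

Permalink: https://cs.pynote.net/net/202205161/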
