Windows Driver Development

Last Updated: 2023-05-29 10:00:12 Monday


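I have been fortunate to work on the development and maintenance of several Windows drivers, so a summary is well worth writing. My Windows driver career started with a virtual microphone driver; there is a very basic PDF about it in the life section.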

Knowledge

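Development and Debugging Environment

Visual Studio is just a shell; any version of the SDK can be installed into it...

To switch to a different WDK version, simply install the new version.

The target system does not need the WDK for every kind of debugging; viewing driver output with DebugView, for example, works without it. For breakpoint debugging with WinDbg, I installed the WDK on the target (strictly, what the target needs is kernel debugging enabled in its boot configuration, which provisioning the target with the WDK does for you). Once connected, the host side prints something like this: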

Microsoft (R) Windows Debugger Version 10.0.22621.382 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Using NET for debugging
Opened WinSock 2.0
Waiting to reconnect...
Connected to target 192.168.16.101 on port 50019 on local IP 192.168.16.103.
You can get the target MAC address by running .kdtargetmac command.

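devcon and sc

The devcon tool comes with the WDK; it sits in one of the WDK installation directories...

Installing, removing, finding, and updating a driver: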

devcon install xxxx.inf root\xxxx
devcon remove root\xxxx
devcon find root\xxxx
devcon update xxxx.inf root\xxxx

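devcon find takes a hardware ID, which you can look up in the device properties. Some IDs, however, begin with * and take no root\ prefix, which I do not fully understand... The bat script below checks whether the *cldvirtaudiodevice driver is already installed; if it is, the script echoes a message and exits instead of installing it again.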

cd /d %~dp0
@echo off
rem find sets errorlevel 0 when devcon reports that no device matches the ID
devcon find *cldvirtaudiodevice | find "No matching"
if %errorlevel% == 0 (
    devcon.exe install cvadriver.inf "*cldvirtaudiodevice"
) else (
    echo #### Virtual audio device already exists. ####
)
@echo on

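sc, a command built into Windows, can also install, start, stop, and delete drivers. I have not fully worked out how sc and devcon differ. Some drivers must be installed from an inf file, and sc cannot do that. sc seems to care only about the sys file, which suits software drivers (those that need no PnP or power management).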

> sc create aname type= kernel binPath= path\to\sysfile.sys
> sc start aname
> sc stop aname
> sc delete aname
> sc query aname

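Entering Windows Test Mode

Installing an unsigned Windows driver requires test mode. Installing the WDK on the target, as described above, switches the target into test mode automatically. On an ordinary Windows system without the WDK, enable test mode from a cmd window opened as administrator: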

C:\>bcdedit /set nointegritychecks on
C:\>bcdedit /set testsigning on

Reboot. (testsigning on may already cover nointegritychecks on; with testsigning on alone I could also install an unsigned driver. Remember to turn off Secure Boot!)

DebugView

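The target does not need the WDK at all (installing it is for breakpoint debugging with WinDbg). Once Windows is in test mode, you can install unsigned drivers for testing, and the main debugging tool is then DebugView, which shows the driver's print output. From personal experience, DebugView64 tends to crash on its own, while the 32-bit DebugView is very stable!

The Capture [Global] Win32 options can be switched off; they capture output from user-mode processes. To capture kernel-mode output, enable Capture Kernel. With Verbose enabled as well, editing the registry should not be necessary, and the driver's output shows up. Reportedly this sometimes fails, and the registry still has to be changed as shown below: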

reg_debugview.png

Reboot.

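WinDbg [Preview]

WinDbg can debug the kernel with breakpoints and open dump files for analysis (and it does much more).

The Microsoft Store offers WinDbg Preview, which has a much slicker interface, is more capable, and downloads symbols automatically.

Symbol path setting: SRV*C:\Temp*http://msdl.microsoft.com/download/symbols (symbols are downloaded from the address on the far right and cached in C:\Temp; the address can apparently be omitted, in which case SRV* uses the default server). WinDbg Preview needs no such setting because it downloads symbols automatically. You can also set the environment variable _NT_SYMBOL_PATH to the value above, which helps other tools that need symbols.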

System Logs

Clues to application problems can sometimes be found in the Windows system logs as well.

BlueScreen

Kernel-mode code, on the other hand, being implicitly trusted, cannot recover from an unhandled exception. Such an exception causes the system to crash with the infamous Blue screen of death (BSOD) (newer versions of Windows have more diverse colors for the crash screen). The BSOD may first appear to be a form of punishment, but it’s essentially a protection mechanism. The rationale behind it is that allowing the code to continue execution could cause irreversible damage to Windows (such as deleting important files or corrupting the registry) that may cause the system to fail to boot. It’s better, then, to stop everything immediately to prevent potential damage.

INF Files

Microsoft's official documentation is the best reference:

https://learn.microsoft.com/en-us/windows-hardware/drivers/install/inf-version-section

Class      ClassGuid                             Description
1394       6BDD1FC1-810F-11D0-BEC7-08002BE2092F  IEEE 1394 host controllers
CDROM      4D36E965-E325-11CE-BFC1-08002BE10318  CD-ROM drives
DiskDrive  4D36E967-E325-11CE-BFC1-08002BE10318  Disk drives
Display    4D36E968-E325-11CE-BFC1-08002BE10318  Display adapters
FDC        4D36E969-E325-11CE-BFC1-08002BE10318  Floppy disk controllers
HDC        4D36E96A-E325-11CE-BFC1-08002BE10318  Hard disk controllers
HIDClass   745a17a0-74d3-11d0-b6fe-00a0c90f57da  Human interface devices
Keyboard   4D36E96B-E325-11CE-BFC1-08002BE10318  Keyboards
Modem      4d36e96d-e325-11ce-bfc1-08002be10318  Modems
Monitor    4d36e96e-e325-11ce-bfc1-08002be10318  Monitors
Mouse      4d36e96f-e325-11ce-bfc1-08002be10318  Mice
Net        4d36e972-e325-11ce-bfc1-08002be10318  Network adapters
Ports      4d36e978-e325-11ce-bfc1-08002be10318  Ports (COM & LPT)
Printer    4d36e979-e325-11ce-bfc1-08002be10318  Printers
System     4d36e97d-e325-11ce-bfc1-08002be10318  System devices
TapeDrive  6D807884-7D21-11CF-801C-08002BE10318  Tape drives
USB        36FC9E60-C465-11CF-8056-444553540000  USB

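%windir%: the Windows system directory

Installing an INF File by Right-Click

Supporting this installation method places requirements on the inf file: it must contain a DefaultInstall section, which is the entry point for a right-click install (in MSDN's words: "An INF file's DefaultInstall section is accessed if a user selects the "Install" menu item after right-clicking on the INF file name."). Unfortunately, the inf files in many driver packages do not provide such a section, so a right-click install does not do what you expect ("Providing a DefaultInstall section is optional. If an INF file does not include a DefaultInstall section, selecting "Install" after right-clicking on the file name causes an error message to be displayed.", from MSDN).

So you might think: if the inf file lacks the section, why not insert one myself? That kind of lateral thinking deserves credit, but the installation will probably still fail. Before doubting your own edits and changing the inf file any further, read this passage from MSDN: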

"Remarks DefaultInstall sections must not be used for device installations. Use DefaultInstall sections only for the installation of class filter drivers, class co-installers, file system filters, and kernel driver services that are not associated with a device node (devnode). Note The INF file of a driver package must not contain an INF DefaultInstall section if the driver package is to be digitally signed. For more information about signing driver packages, see Driver Signing."

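In other words, DefaultInstall is not meant for device installation (and isn't installing a device the whole point of installing a driver?). It applies only to class filter drivers, class co-installers, file system filter drivers, and kernel driver services (legacy drivers such as WinIO.sys), none of which involve a device node (devnode). Put more plainly: an inf file that contains a [Manufacturer] section (the section listing the device IDs the driver matches) should not contain [DefaultInstall] at all.

Drivers and the Registry (regedit)

\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum appears to enumerate every device in the system. For example, after devcon install jmic.inf root\jmic, the device shows up under ROOT\MEDIA. Its Service value refers to a key under \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services (every driver is a kernel service; its Services key holds ImagePath, the path of the driver binary, along with Start and Type). Its ClassGUID value refers to the class information under \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class.

Enum entries sometimes also carry UpperFilters and LowerFilters values; these hold the service names of FiDO (filter device object) drivers, and there may be several.

Installing FiDO Drivers

Sysinternals Tools

For Windows developers and users, some of the Sysinternals tools are priceless. The company has been acquired by Microsoft, and searching for sysinternals leads to Microsoft's official page. A few of the tools are excellent: desktops, procexp, winobj, debugview...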

Using C++ in the Windows Kernel

With kernel code, Microsoft started officially supporting C++ with Visual Studio 2012 and WDK 8. C++ is not mandatory, of course, but it has some important benefits related to resource cleanup, with a C++ idiom called Resource Acquisition Is Initialization (RAII). (C++ syntax helps programmers make fewer mistakes.)
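The RAII idiom can be illustrated without any kernel headers. The following is a minimal user-mode sketch, not WDK code: FakeMutex and AutoLock are hypothetical names, and FakeMutex merely counts calls where a real driver would acquire and release an actual kernel lock.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel lock; it only tracks the lock depth.
struct FakeMutex {
    int depth = 0;
    void Lock()   { ++depth; }
    void Unlock() { --depth; }
};

// RAII wrapper: the constructor acquires, the destructor releases.
template <typename TLock>
class AutoLock {
public:
    explicit AutoLock(TLock& lock) : lock_(lock) { lock_.Lock(); }
    ~AutoLock() { lock_.Unlock(); }
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;
private:
    TLock& lock_;
};

// Returns the lock depth seen inside the critical section.
int DepthInsideCriticalSection(FakeMutex& m, bool earlyExit) {
    AutoLock<FakeMutex> guard(m);   // Lock() runs here
    if (earlyExit) {
        return m.depth;             // Unlock() still runs on this path
    }
    return m.depth;                 // ... and on this one
}
```

Whichever return path is taken, the destructor releases the lock, which is exactly the class of cleanup mistake RAII removes.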

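Most of the Windows kernel itself is still C!

DbgPrint and KdPrint

DbgPrint is a function. The drawback of calling it directly is that Release builds still make the call, which adds some overhead. KdPrint is a macro that wraps DbgPrint (which is why it takes two pairs of parentheses) and expands to nothing in Release builds. In general, then, prefer KdPrint: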

#include <ntddk.h>


void Unload(_In_ PDRIVER_OBJECT DriverObject) {
    UNREFERENCED_PARAMETER(DriverObject);
    KdPrint(("byebye wdm driver, %s\n", "abcde"));
}


extern "C" NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath) {
    UNREFERENCED_PARAMETER(RegistryPath);

    KdPrint(("hello wdm driver, %d\n", 12345));
    DriverObject->DriverUnload = Unload;

    return STATUS_SUCCESS;
}

DriverEntry must have C linkage; in a cpp file, prefix it with extern "C".

To print a UNICODE_STRING with KdPrint, use %wZ:

// %wZ is for UNICODE_STRING objects
KdPrint(("Original registry path: %wZ\n", RegistryPath));
KdPrint(("Copied registry path: %wZ\n", &g_RegistryPath));
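Why KdPrint needs two pairs of parentheses becomes clear once you see how such a macro can be written. Below is a user-mode sketch under stated assumptions: FakeDbgPrint, KDPRINT_CHECKED, and KDPRINT_FREE are hypothetical stand-ins, with printf playing the role of the kernel debug output.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

static int g_printCalls = 0;   // how many times the stand-in printer actually ran

// Hypothetical stand-in for DbgPrint: an ordinary variadic function.
int FakeDbgPrint(const char* format, ...) {
    ++g_printCalls;
    va_list args;
    va_start(args, format);
    int written = std::vprintf(format, args);
    va_end(args);
    return written;
}

// Debug-build flavour: the single macro argument is the whole parenthesized
// argument list, so it can be pasted directly after the function name.
#define KDPRINT_CHECKED(args) FakeDbgPrint args

// Release-build flavour: the argument list is discarded, so no call is compiled.
#define KDPRINT_FREE(args)
```

The inner parentheses turn the format string and its arguments into one macro argument, which lets a macro without variadic support forward any number of values.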

DbgPrintEx and KdPrintEx

The output from DbgPrint(Ex) is limited to 512 bytes. Any remaining bytes are lost.
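One way to live with this limit is to split long messages before printing them. The helper below is a user-mode sketch; SplitForDebugPrint is a hypothetical name, and the chunk size of 511 characters assumes that one of the 512 bytes is needed for the terminating NUL, which is a conservative guess rather than a documented figure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this sketch: 512 bytes per call, one reserved for the NUL.
constexpr std::size_t kMaxChunkChars = 511;

// Splits a message into pieces that each fit into one debug-print call.
std::vector<std::string> SplitForDebugPrint(const std::string& message,
                                            std::size_t maxChars = kMaxChunkChars) {
    std::vector<std::string> chunks;
    if (maxChars == 0) {
        return chunks;   // nothing sensible can be produced
    }
    for (std::size_t pos = 0; pos < message.size(); pos += maxChars) {
        chunks.push_back(message.substr(pos, maxChars));
    }
    return chunks;
}
```

In a driver, each chunk would then go through its own KdPrint call.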

UNICODE_STRING

typedef struct _UNICODE_STRING {
  USHORT Length;
  USHORT MaximumLength;
  PWSTR  Buffer;
} UNICODE_STRING, *PUNICODE_STRING;
typedef const UNICODE_STRING *PCUNICODE_STRING;
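Two details of this structure are easy to get wrong: Length and MaximumLength count bytes, not characters, and Buffer is not guaranteed to be NUL-terminated. The user-mode sketch below mirrors the layout with portable types; SketchUnicodeString and its helpers are hypothetical names that imitate the behaviour of RtlInitUnicodeString without overflow checking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SketchUnicodeString {
    std::uint16_t Length;         // bytes in use, terminator not counted
    std::uint16_t MaximumLength;  // bytes available in Buffer
    char16_t*     Buffer;         // UTF-16 text, not necessarily NUL-terminated
};

// Points dst at an existing NUL-terminated string without copying it.
void SketchInitUnicodeString(SketchUnicodeString* dst, char16_t* src) {
    std::size_t chars = 0;
    if (src != nullptr) {
        while (src[chars] != u'\0') ++chars;
    }
    dst->Length = static_cast<std::uint16_t>(chars * sizeof(char16_t));
    dst->MaximumLength = (src != nullptr)
        ? static_cast<std::uint16_t>(dst->Length + sizeof(char16_t))
        : 0;
    dst->Buffer = src;
}

// Number of UTF-16 code units described by the string.
std::size_t SketchCharCount(const SketchUnicodeString& s) {
    return s.Length / sizeof(char16_t);
}
```

Code that walks Buffer should therefore loop over SketchCharCount(s) elements and never rely on finding a terminator.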

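Device Names and Symbolic Link Names

Windows devices are named in the form \Device\[device name]. For example, the device names of the C: and D: disk partitions are \Device\HarddiskVolume1 and \Device\HarddiskVolume2. A device can also be created without specifying a name.

If IoCreateDevice is not given a name, the device is unnamed unless the driver asks for an autogenerated one (the FILE_AUTOGENERATED_DEVICE_NAME characteristic), in which case the I/O manager assigns a number as the name, for example \Device\00000001.

Names of the form \Device\[device name] are somewhat awkward to use. A symbolic link can be thought of as an alias for a device. The device name is visible only to other kernel-mode drivers, whereas the alias can be used by user-mode applications. The C drive, for example, is a symbolic link named C: whose real device object is \Device\HarddiskVolume1 (the number may differ between systems). So when writing a driver, we usually create a symbolic link for user-mode programs to use.

winobj shows the devices and symbolic links in the system: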

sym_winobj.png

As the winobj view shows, one device can have several symbolic links.

Definitions in driver code:

#define NT_DEVICE_NAME  L"\\Device\\VMouse"      // device
#define DOS_DEVICE_NAME L"\\DosDevices\\VMouse"  // symlink
// or
#define DOS_DEVICE_NAME L"\\??\\VMouse"          // symlink

\DosDevices is itself a symbolic link to \??, so \\DosDevices\\XXXX is really just \\??\\XXXX.

In user-mode code:

#define   VIRTUAL_DEVICE_NAME   L"\\\\.\\VMouse"
// or
#define   VIRTUAL_DEVICE_NAME   LR"(\\.\VMouse)"  // C++11
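The relationship between the user-mode name and the driver-side symbolic link can be written down as a simple prefix rewrite. The function below is a user-mode sketch with a hypothetical name, Win32DevicePathToNt; it only models the naming convention and does not call any Windows API.

```cpp
#include <cassert>
#include <string>

// Rewrites a Win32 device path to the object-manager path of the symbolic
// link that the driver created. Other paths are returned unchanged.
std::string Win32DevicePathToNt(const std::string& win32Path) {
    const std::string win32Prefix = "\\\\.\\";   // Win32 device namespace prefix
    const std::string ntPrefix = "\\??\\";       // directory holding the symlinks
    if (win32Path.compare(0, win32Prefix.size(), win32Prefix) == 0) {
        return ntPrefix + win32Path.substr(win32Prefix.size());
    }
    return win32Path;
}
```

So the name a user-mode program passes to CreateFile for VMouse corresponds to the symlink defined by DOS_DEVICE_NAME in the driver.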

DriverObject and DeviceObject

One DriverObject can own several DeviceObjects:

driver_device_object.png

Although a driver object may look like a good candidate for clients to talk to, this is not the case. The actual communication endpoints for clients are device objects. Device objects are instances of the semi-documented DEVICE_OBJECT structure. Without device objects, there is no one to talk to. This means that at least one device object should be created by the driver and given a name, so that it may be contacted by clients.

Development experience:

From Microsoft's documentation:

The control device object is the only type of device object that can safely be named, because it is the only device object that is not attached to a driver stack.

A user mode application cannot access the filter driver without a device name since a call to IoRegisterDeviceInterface is not valid for control device objects. If a non-NULL value is passed in the DeviceName parameter, this value becomes the name of the control device object.

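How to Add a Control Device in a Miniport

A miniport can create a control device, that is, a device that does not join the device stack, give it a symlink, set its attributes, and use it to talk to user mode. Assuming user mode interacts with the driver only through DeviceIoControl, the miniport's DriverEntry needs to register at least three major functions: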

DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = for_DeviceIoControl;
DriverObject->MajorFunction[IRP_MJ_CREATE] = for_CreateClose;
DriverObject->MajorFunction[IRP_MJ_CLOSE] = for_CreateClose;

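If CREATE and CLOSE are not registered, what I ran into was a blue screen when user mode called CreateFile (another FDO had a symlink generated automatically by the system, which I had dug up and used). The reason is that the CreateFile path then enters the CREATE handler registered by the port class driver, and we do not know what that handler does internally. If the miniport registers CREATE itself, a user-mode CreateFile call lands in the miniport's CREATE handler, where we only need to return success: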

NTSTATUS
for_CreateClose(
    _In_ PDEVICE_OBJECT     pDeviceObject,
    _Inout_ PIRP            pIrp
)
{
    PAGED_CODE();
    ASSERT(pDeviceObject != NULL);
    ASSERT(pIrp != NULL);

    if (pDeviceObject == pcm_device) {
        pIrp->IoStatus.Status = STATUS_SUCCESS;
        pIrp->IoStatus.Information = 0;
        IoCompleteRequest(pIrp, IO_NO_INCREMENT);
        return STATUS_SUCCESS;
    }

    return PcDispatchIrp(pDeviceObject, pIrp);
}

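Remember the pDeviceObject == pcm_device check. The DeviceIoControl handler needs the same check. If the request is not for the control device created by the miniport, hand the IRP to the port class driver.

LIST_ENTRY and CONTAINING_RECORD

I have seen a similar technique in the Linux kernel, and an English blog post I read says it is used everywhere...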

typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink;
    struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY;

struct MyDataItem {
    // some data members
    LIST_ENTRY Link;
    // more data members
};

//
// CONTAINING_RECORD macro
// Given the address of a member (field) and the type of the enclosing
// structure, returns a pointer to the enclosing structure.
// This definition is illustrative and was not copied from Microsoft's headers,
// which already provide the macro.
#define CONTAINING_RECORD(address, type, field) (\
    (type*)((char*)(address) - (ULONG_PTR)(&((type*)0)->field)))

MyDataItem* GetItem(LIST_ENTRY* pEntry) {
    return CONTAINING_RECORD(pEntry, MyDataItem, Link);
}

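The third argument of CONTAINING_RECORD is the field (name). In an actual call it is the name of the LIST_ENTRY member inside the structure: Link in the example above!

With this technique, arbitrary user-defined structures can be chained into a doubly linked list; all that is needed is the address of the Link field and the type of the structure.

The Windows kernel also provides a set of functions for manipulating this kind of list.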
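To make the technique concrete, here is a self-contained user-mode sketch of an intrusive doubly linked list. The names (ListEntry, InitListHead, InsertTail, RemoveHead, SKETCH_CONTAINING_RECORD) are hypothetical and only imitate what the kernel list routines do; offsetof stands in for the null-pointer arithmetic of the original macro.

```cpp
#include <cassert>
#include <cstddef>

struct ListEntry {           // same shape as LIST_ENTRY
    ListEntry* Flink;        // forward link
    ListEntry* Blink;        // backward link
};

// Recovers the enclosing structure from the address of its embedded link.
#define SKETCH_CONTAINING_RECORD(address, type, field) \
    (reinterpret_cast<type*>(reinterpret_cast<char*>(address) - offsetof(type, field)))

// An empty list is a head whose links point back at itself.
inline void InitListHead(ListEntry* head) { head->Flink = head->Blink = head; }
inline bool ListIsEmpty(const ListEntry* head) { return head->Flink == head; }

inline void InsertTail(ListEntry* head, ListEntry* entry) {
    ListEntry* last = head->Blink;
    entry->Flink = head;
    entry->Blink = last;
    last->Flink = entry;
    head->Blink = entry;
}

// Precondition: the list is not empty.
inline ListEntry* RemoveHead(ListEntry* head) {
    ListEntry* first = head->Flink;
    ListEntry* next = first->Flink;
    head->Flink = next;
    next->Blink = head;
    return first;
}

struct MyDataItem {
    int Value;
    ListEntry Link;          // the list only ever sees this member
    int Extra;
};

// Walks the list and adds up Value of every enclosing MyDataItem.
int SumValues(ListEntry* head) {
    int sum = 0;
    for (ListEntry* e = head->Flink; e != head; e = e->Flink) {
        sum += SKETCH_CONTAINING_RECORD(e, MyDataItem, Link)->Value;
    }
    return sum;
}
```

The list code never needs to know about MyDataItem; only the caller, which knows the type and the field name, converts a link back into its item.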

IRQL

Every hardware interrupt is associated with a priority, called Interrupt Request Level (IRQL) (not to be confused with an interrupt physical line known as IRQ), determined by the HAL. Each processor’s context has its own IRQL, just like any register. IRQLs may or may not be implemented by the CPU hardware, but this is essentially unimportant. IRQL should be treated just like any other CPU register. The basic rule is that a processor executes the code with the highest IRQL.

When user-mode code is executing, the IRQL is always zero (PASSIVE_LEVEL). This is one reason why the term IRQL is not mentioned in any user-mode documentation - it’s always zero and cannot be changed. Most kernel-mode code runs with IRQL zero as well.

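An ISR (Interrupt Service Routine) at a higher IRQL can preempt code at a lower IRQL. User-mode code has "no dignity" at all and can be preempted by an interrupt at any moment. (But user mode is free...)

DISPATCH_LEVEL (2) - this is where things change radically. The scheduler cannot wake up on this CPU. Paged memory access is not allowed - such access causes a system crash. Since the scheduler cannot interfere, waiting on kernel objects is not allowed (causes a system crash if used). (At this level only hardware interrupts of a higher level can preempt the CPU; paged memory cannot be accessed and waiting is not allowed... spin locks run at this level.)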

In kernel mode, the IRQL can be raised with the KeRaiseIrql function and lowered back with KeLowerIrql.

IRQL is an attribute of a processor. Priority is an attribute of a thread. Thread priorities only have meaning at IRQL < 2. Once an executing thread raised IRQL to 2 or higher, its priority does not mean anything anymore - it has theoretically an infinite quantum - it will continue execution until it lowers the IRQL to below 2.

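An interrupt controller (such as the APIC) allows a priority to be set for each hardware interrupt, but Windows does not use the controller's priorities. Instead it defines its own software priority scheme, the interrupt request level (IRQL). On Intel x86 systems, Windows uses 0 to 31 for the IRQL; the larger the value, the higher the priority. A running processor always has a current IRQL. If an interrupt arrives whose source IRQL is equal to or lower than the current level, the interrupt is masked until the processor's IRQL drops. IRQL=0 means an ordinary thread and is called PASSIVE_LEVEL; it has the lowest priority and can be interrupted by anything at a higher level. IRQL=1 is for asynchronous procedure calls (APC) and is called APC_LEVEL; it is only just above PASSIVE_LEVEL, so queuing an APC object to a thread can interrupt that thread. IRQL=2 means the processor is doing one of two things: scheduling threads (for example, choosing a new thread), or handling the second, less urgent half of a hardware interrupt, which Windows calls a deferred procedure call (DPC). IRQL 2 is therefore known as the DISPATCH/DPC level, or simply DISPATCH_LEVEL. Levels 3 to 26 are device IRQLs, and 27 to 31 are special hardware interrupts such as the clock interrupt, the power interrupt, and inter-processor interrupts; all of them are hardware interrupts.

The DPC is an important concept. Its IRQL is DISPATCH_LEVEL, above PASSIVE_LEVEL and APC_LEVEL, so it takes precedence over any thread-related function and also blocks thread scheduling. At the same time it is below every hardware interrupt, so it masks none of them. It is called a "deferred" procedure call because it is typically used for work that is less urgent than the current high-priority task. A hardware interrupt service routine, for example, can push its less urgent work into a DPC object, shortening the time the processor spends at a high IRQL.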
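The masking rule described above (an interrupt at or below the current IRQL waits until the level drops) can be modelled in a few lines. This is a deliberately simplified user-mode simulation with hypothetical names (SketchCpu, Raise, Lower, Interrupt); it models one processor and ignores everything else the real kernel does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

constexpr int kPassiveLevel = 0;
constexpr int kDispatchLevel = 2;

class SketchCpu {
public:
    int Irql() const { return irql_; }

    // In the spirit of KeRaiseIrql: returns the previous level for restoring.
    int Raise(int newIrql) {
        int old = irql_;
        irql_ = std::max(irql_, newIrql);   // raising never lowers the level
        return old;
    }

    // In the spirit of KeLowerIrql: restore, then deliver what is unmasked now.
    void Lower(int oldIrql) {
        irql_ = std::min(irql_, oldIrql);
        DeliverPending();
    }

    // Returns true if serviced at once, false if masked and left pending.
    bool Interrupt(int sourceIrql) {
        if (sourceIrql <= irql_) {
            pending_.push_back(sourceIrql);
            return false;
        }
        Service(sourceIrql);
        return true;
    }

    const std::vector<int>& Serviced() const { return serviced_; }

private:
    void Service(int sourceIrql) {
        int old = Raise(sourceIrql);        // the handler runs at the source's IRQL
        serviced_.push_back(sourceIrql);
        Lower(old);
    }

    // Services pending interrupts above the current level, highest first.
    void DeliverPending() {
        for (;;) {
            auto highest = std::max_element(pending_.begin(), pending_.end());
            if (highest == pending_.end() || *highest <= irql_) return;
            int irql = *highest;
            pending_.erase(highest);
            Service(irql);
        }
    }

    int irql_ = kPassiveLevel;
    std::vector<int> pending_;
    std::vector<int> serviced_;
};
```

Raising to DISPATCH_LEVEL, delivering interrupts at levels 1, 2, and 5, and then lowering again shows level 5 preempting immediately while the other two wait for the IRQL to drop.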

Spin Lock

The Spin Lock is just a bit in memory that is used with atomic test-and-set operations via an API. When a CPU tries to acquire a spin lock, and that spin lock is not currently free (the bit is set), the CPU keeps spinning on the spin lock, busy waiting for it to be released by another CPU (remember, putting the thread into a waiting state cannot be done at IRQL DISPATCH_LEVEL or higher).
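The "bit plus atomic test-and-set" description maps directly onto std::atomic_flag. The class below is a user-mode sketch with a hypothetical name, SketchSpinLock; unlike the kernel routines it does not touch the IRQL, so it shows only the spinning part.

```cpp
#include <atomic>
#include <cassert>

class SketchSpinLock {
public:
    // One atomic test-and-set: true means the bit was clear and is now ours.
    bool TryAcquire() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    // Busy-waits until the current owner clears the bit.
    void Acquire() {
        while (!TryAcquire()) {
            // spin
        }
    }

    void Release() {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

A second TryAcquire on a held lock fails, which is the condition under which another CPU would keep spinning inside Acquire.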

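According to what I have read, both the spin lock's own memory and the resource it protects must live in non-paged pool, because acquiring a spin lock raises the IRQL to DISPATCH_LEVEL, where paged pool cannot be accessed. Spinning itself only matters on multi-core or multi-CPU systems: on a single-core CPU there is no other processor to wait for, so acquiring the lock does not spin and only raises the IRQL. (The acquire routine raises to DISPATCH_LEVEL first, so that the busy wait on that CPU core cannot be interrupted by the scheduler.)

Several sources online state that by default (with no #pragma directives) the kernel loader places all code and global data in non-paged memory!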

If several CPUs try to acquire the same spin lock at the same time, which CPU gets the spin lock first? Normally, there is no order - the CPU with fastest electrons wins :). The kernel does provide an alternative, called Queued spin locks that serve CPUs on a FIFO basis.
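FIFO ordering can be illustrated with a ticket lock. To be clear, this is a different and simpler technique than the kernel's queued spin locks, chosen here only because it shows first-come, first-served hand-over in a few lines; SketchTicketLock is a hypothetical user-mode class.

```cpp
#include <atomic>
#include <cassert>

class SketchTicketLock {
public:
    // Each arriving CPU draws the next ticket number.
    unsigned TakeTicket() {
        return next_.fetch_add(1, std::memory_order_relaxed);
    }

    // A ticket holder may enter only when its number is being served.
    bool IsMyTurn(unsigned ticket) const {
        return serving_.load(std::memory_order_acquire) == ticket;
    }

    void Acquire() {
        unsigned ticket = TakeTicket();
        while (!IsMyTurn(ticket)) {
            // spin until every earlier ticket has been released
        }
    }

    // Hands the lock to the holder of the next ticket.
    void Release() {
        serving_.fetch_add(1, std::memory_order_release);
    }

private:
    std::atomic<unsigned> next_{0};
    std::atomic<unsigned> serving_{0};
};
```

Because tickets are served strictly in the order they were drawn, no waiter can be overtaken by a later arrival.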

Paged or Non-paged Memory

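Kernel-mode memory is either paged or non-paged, the difference being whether it can be paged out. All user-mode memory is pageable! (User mode ranks low, but it is free...)

At IRQL >= 2, code may only access non-paged memory!

The compiler directives below state explicitly, at compile time, whether the code or data that follows them is loaded into paged or non-paged memory: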

#pragma code_seg()  // default, non-paged
#pragma code_seg("INIT")
#pragma code_seg("PAGE")
#pragma data_seg()  // default, non-paged
#pragma data_seg("INIT")
#pragma data_seg("PAGE")

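I went through Microsoft's microphone driver sample code again, and nearly every function definition is preceded by a compiler directive specifying paged or non-paged.

DriverEntry should be placed in the INIT section.

Here is another way of declaring this that I have seen: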

#ifdef ALLOC_PRAGMA
#pragma alloc_text(INIT,DriverEntry)
#pragma alloc_text(PAGE,AddDevice)
#pragma alloc_text(PAGE,Unload)
#pragma alloc_text(PAGE,Pnp)
#pragma alloc_text(PAGE,Power)
#pragma alloc_text(PAGE,Create)
#pragma alloc_text(PAGE,Close)
#pragma alloc_text(PAGE,DeviceControl)
#pragma alloc_text(PAGE,InternalControl)
#pragma alloc_text(PAGE,SystemControl)
#pragma alloc_text(PAGE,CompleteIrp)
#endif

Using NonPagedPoolNx

As a best practice, drivers for Windows 8 and later versions of Windows should allocate most or all of their nonpaged memory from the no-execute (NX) nonpaged pool. By allocating memory from NX nonpaged pool, a kernel-mode driver improves security by preventing malicious software from executing instructions in this memory.

The HLK tests include a check for this.

The POOL_FLAG_NON_PAGED macro already implies NX, but the older ExAllocatePoolWithTag interface has to be given NonPagedPoolNx.

IRP

An IRP is a structure that is allocated from non-paged pool typically by one of the “managers” in the Executive (I/O Manager, Plug & Play Manager, Power Manager), but can also be allocated by the driver, perhaps for passing a request to another driver. Whichever entity allocating the IRP is also responsible for freeing it. An IRP is never allocated alone. It’s always accompanied by one or more I/O Stack Location structures (IO_STACK_LOCATION).

When a driver receives an IRP, it gets a pointer to the IRP structure itself, knowing it’s followed by a set of I/O stack location, one of which is for the driver’s use. To get the correct I/O stack location, a driver calls IoGetCurrentIrpStackLocation (actually a macro).

irp_io_stack_location.png

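The parameters of the request are somehow “split” between the main IRP structure and the current IO_STACK_LOCATION. (The input parameters are spread across the IRP and the IO_STACK_LOCATION. My guess is that the name combines IO and STACK because the structure carries input parameters, but it is really just a block of non-paged memory that follows the IRP.)

A device node is layered; each layer has one driver object and a set of device objects. IRPs travel as shown in the figure below: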

irp_flow.png

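My current understanding: the FDO in the figure above, for example, has one Driver Object behind it but possibly several Device Objects. These devices differ in nature, and some of them cannot be opened from user mode even if they have a symlink. (This is also covered earlier in this article.)

That is how an IRP flows... from the top down (until IoCompleteRequest), and then back to where it started...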
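The way an IRP carries one stack location per layer can be modelled in user mode. The sketch below uses hypothetical names (SketchIrp, NextLocation, CurrentLocation, CopyCurrentToNext, CallDriver) and a plain vector in place of the memory that follows a real IRP; the index handling imitates the idea that the sender fills in the next location before passing the request down.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kMjDeviceControl = 0x0e;   // value of IRP_MJ_DEVICE_CONTROL

struct SketchStackLocation {
    int MajorFunction = 0;
    unsigned IoControlCode = 0;
};

struct SketchIrp {
    int Status = 0;                          // shared by all layers
    std::vector<SketchStackLocation> Stack;  // one slot per driver layer
    std::size_t Current;                     // slot of the layer handling the IRP
    // Starts one past the top slot: no driver owns a location yet.
    explicit SketchIrp(std::size_t layers) : Stack(layers), Current(layers) {}
};

// Slot that the next-lower driver will see, or nullptr at the bottom.
SketchStackLocation* NextLocation(SketchIrp& irp) {
    return irp.Current > 0 ? &irp.Stack[irp.Current - 1] : nullptr;
}

// Slot owned by the driver that is handling the IRP right now.
SketchStackLocation* CurrentLocation(SketchIrp& irp) {
    return irp.Current < irp.Stack.size() ? &irp.Stack[irp.Current] : nullptr;
}

// Forwards the request unchanged to the layer below.
bool CopyCurrentToNext(SketchIrp& irp) {
    SketchStackLocation* cur = CurrentLocation(irp);
    SketchStackLocation* next = NextLocation(irp);
    if (cur == nullptr || next == nullptr) return false;
    *next = *cur;
    return true;
}

// Moves the IRP one layer down; returns false if already at the bottom.
bool CallDriver(SketchIrp& irp) {
    if (irp.Current == 0) return false;
    --irp.Current;
    return true;
}
```

Each layer reads only its own slot, which is why a driver that forwards a request first has to set up the slot of the driver below it.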

The Plug & Play (P&P) manager, in this case, is responsible for loading the appropriate drivers, starting from the bottom (the stack is built from the bottom up). There are at least two layers - PDO and FDO, but there could be more if filters are involved.

WHLK Testing

https://learn.microsoft.com/en-us/windows-hardware/test/hlk/

Permalink: https://cs.pynote.net/sf/win/202211221/
