Last Updated: 2024-05-13 01:40:48 Monday
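The strength of a dynamic language is its extreme flexibility, but a serious, large project cares more about safety, reliability, readable and maintainable code, and refactoring with confidence. Typing is the way to add static type annotations to a dynamically typed language such as Python. With typing information in place, Python code becomes safer and more reliable.
Python 3 introduced Function Annotations (PEP 3107) from the very beginning. They have a lot in common with typing: both are stored in the __annotations__ attribute, and the code looks essentially the same. The difference is that Function Annotations define only the syntax and place no requirements on the content; anything is allowed, and interpreting it is left entirely to third-party tools. For example: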
def compile(source: "something compilable",
            filename: "where the compilable thing comes from",
            mode: "is this a single statement or a suite?"):
    ...
Every annotation in this example is a string; a third-party tool can extract them and use them as help text for the function.
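As a minimal sketch of what such a tool could do, the snippet below reads the strings back from __annotations__ and turns them into help text. The names compile_source and describe are made up for this illustration and are not part of any library.

```python
def compile_source(source: "something compilable",
                   filename: "where the compilable thing comes from",
                   mode: "is this a single statement or a suite?" = "exec"):
    """Placeholder body; only the annotations matter here."""
    return None


def describe(func):
    """Build help text from a function's annotations (hypothetical helper)."""
    lines = [f"{func.__name__}:"]
    for name, note in func.__annotations__.items():
        lines.append(f"  {name}: {note}")
    return "\n".join(lines)


help_text = describe(compile_source)
print(help_text)
```

The interpreter stores the annotations without interpreting them, which is exactly the freedom PEP 3107 leaves to tools.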
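Type Hints (PEP 484) reuse the Function Annotation syntax and simply restrict the content to type information. This is where the typing module comes from.
Variable Annotations (PEP 526) define how to annotate the types of variables.
Once functions and variables carry type information, a third-party tool such as mypy can easily run a static check before the program runs and find potential bugs. A pain point of dynamic languages is that some type-mismatch bugs only show up when execution actually reaches the offending code; if the tests happen to miss that part, the bug leaks out. Static type checking finds this kind of bug: it catches bugs in code without running it! Type hints are not enforced by the interpreter and add essentially no runtime overhead.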
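A small sketch of the problem described above: the bug sits in a branch that ordinary inputs never reach, so it only fails when that branch finally runs. The functions average and report are invented for this illustration; a checker such as mypy would be expected to flag the str + float expression without executing anything.

```python
def average(values: list[float]) -> float:
    """Return the arithmetic mean of values."""
    return sum(values) / len(values)


def report(values: list[float]) -> str:
    if not values:
        # Rarely executed branch hiding a type bug: str + float.
        return "no data, default is " + 0.0
    return "average is " + str(average(values))


print(report([1.0, 2.0, 3.0]))      # the common path works

try:
    report([])                      # the rare path fails only when executed
except TypeError as exc:
    print("found only at runtime:", exc)
```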
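One opinion I came across: in a serious Python project, typing is as important as unit tests! Code that is hard to type well, just like code that is hard to unit test well, may be a hint that it needs refactoring!
Several tools can statically check Python code; mypy is the best choice in most cases.
Install mypy: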
$ pip3 install mypy
Official site: http://mypy-lang.org/
Must-read introduction, Type hints cheat sheet (Python 3): https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html#cheat-sheet-py3
Most of what follows comes from this cheat sheet, with some of my own notes added.
# This is how you declare the type of a variable
age: int = 1
# You don't need to initialize a variable to annotate it
a: int # Ok (no value at runtime until assigned)
# Doing so is useful in conditional branches
child: bool
if age < 18:
    child = True
else:
    child = False
# For most types, just use the name of the type.
# Note that mypy can usually infer the type of a variable from its value,
# so technically these annotations are redundant
x: int = 1
x: float = 1.0
x: bool = True
x: str = "test"
x: bytes = b"test"
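When a value assigned to a variable does not match its annotated type, mypy reports an error. So although mypy says it can infer variable types automatically, annotating a variable where it is first introduced is still a very good habit!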
# For collections on Python 3.9+, the type of the collection item is in brackets
x: list[int] = [1]
x: set[int] = {6, 7}
# For mappings, we need the types of both keys and values
x: dict[str, float] = {"field": 2.0} # Python 3.9+
# For tuples of fixed size, we specify the types of all the elements
x: tuple[int, str, float] = (3, "yes", 7.5) # Python 3.9+
# For tuples of variable size, we use one type and ellipsis
# Variable length, but every element must be an int
x: tuple[int, ...] = (1, 2, 3) # Python 3.9+
# On Python 3.8 and earlier, the name of the collection type is
# capitalized, and the type is imported from the 'typing' module
from typing import List, Set, Dict, Tuple
x: List[int] = [1]
x: Set[int] = {6, 7}
x: Dict[str, float] = {"field": 2.0}
x: Tuple[int, str, float] = (3, "yes", 7.5)
x: Tuple[int, ...] = (1, 2, 3)
Container types use [] to declare the type of their elements, which keeps the annotation distinct from creating an object.
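To make the distinction concrete, here is a short sketch (assuming Python 3.9+) that inspects a subscripted container type with typing.get_origin and typing.get_args. The alias name IntList is made up for the example.

```python
from typing import get_args, get_origin

IntList = list[int]             # a description of a type, not a list
values: IntList = [1, 2, 3]     # an actual list object

print(get_origin(IntList))      # the container type
print(get_args(IntList))        # the element types
print(get_args(dict[str, float]))
print(get_args(tuple[int, ...]))
```

Subscripting with [] never builds a container; it only records which element types the annotation asks for.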
from typing import Union, Optional
# On Python 3.10+, use the | operator when something could be one of a few types
x: list[int|str] = [3, 5, "test", "fun"] # Python 3.10+
# On earlier versions, use Union
x: list[Union[int, str]] = [3, 5, "test", "fun"]
# Use Optional[X] for a value that could be None
# Optional[X] is the same as X|None or Union[X, None]
x: Optional[str] = "something" if some_condition() else None
# Mypy understands a value can't be None in an if-statement
if x is not None:
    print(x.upper())
# If a value can never be None due to some invariants, use an assert
assert x is not None
print(x.upper())
# A plain variable that may hold several types:
y: int|str|None
y = 123
y = '123'
y = None
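This is characteristic of a dynamic language: a variable is just a reference to an object, and when it is rebound to a different object its type changes too. A variable can therefore have several types over its lifetime. New code should clearly use |. A function parameter with no annotation is treated as Any, while mypy infers the type of an unannotated variable from the value first assigned to it.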
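Code that receives a value of several possible types usually has to tell them apart before using it. The sketch below does that with an is None test and isinstance, which a type checker can follow branch by branch. It uses Union so that it also runs before Python 3.10, where int | str | None is the equivalent spelling; the function describe is invented for the example.

```python
from typing import Union


def describe(value: Union[int, str, None]) -> str:
    """Handle each member of the union in its own branch."""
    if value is None:
        return "nothing"
    if isinstance(value, int):
        # In this branch a checker can treat value as int.
        return f"int {value + 1}"
    # Only str is left at this point.
    return f"str {value.upper()}"


print(describe(123))
print(describe("abc"))
print(describe(None))
```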
from typing import Any
# Variable-length tuple whose elements can be of any type
t: tuple[Any, ...]
Any can often be left out: a bare tuple, for instance, is treated the same as tuple[Any, ...].
from typing import Callable, Iterator, Union, Optional
# This is how you annotate a function definition
def stringify(num: int) -> str:
    return str(num)
# And here's how you specify multiple arguments
def plus(num1: int, num2: int) -> int:
    return num1 + num2
# If a function does not return a value, use None as the return type
# Default value for an argument goes after the type annotation
def show(value: str, excitement: int = 10) -> None:
    print(value + "!" * excitement)
# Note that arguments without a type are dynamically typed (treated as Any)
# and that functions without any annotations are not checked
# Annotating only the return value does make mypy check the body, but the
# unannotated arguments are still Any, so misuse of them goes unreported
def untyped(x):
    x.anything() + 1 + "string"  # no errors
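Function parameters are annotated the same way as variables; the return value uses the literal -> symbol. If a function carries no annotations at all, mypy does not check it. If only the return value is annotated, the body is checked, but the unannotated parameters are Any, so the obvious mistake above still slips through. Do we really want to discover it at runtime!
With --strict or similar options, mypy does report that the function lacks annotations (Function is missing a type annotation), but that message is about the missing annotations, not about the bug inside the body.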
# This is how you annotate a callable (function) value
x: Callable[[int, float], float] = f
def register(callback: Callable[[str], int]) -> None: ...
# A generator function that yields ints is secretly just a function that
# returns an iterator of ints, so that's how we annotate it
def gen(n: int) -> Iterator[int]:
    i = 0
    while i < n:
        yield i
        i += 1
# You can of course split a function annotation over multiple lines
def send_email(address: Union[str, list[str]],
               sender: str,
               cc: Optional[list[str]],
               bcc: Optional[list[str]],
               subject: str = '',
               body: Optional[list[str]] = None
               ) -> bool:
    ...
# Mypy understands positional-only and keyword-only arguments
# Positional-only arguments can also be marked by using a name starting with
# two underscores
def quux(x: int, /, *, y: int) -> None:
    pass
quux(3, y=5) # Ok
quux(3, 5) # error: Too many positional arguments for "quux"
quux(x=3, y=5) # error: Unexpected keyword argument "x" for "quux"
# This says each positional arg and each keyword arg is a "str"
def call(self, *args: str, **kwargs: str) -> str:
    reveal_type(args)  # Revealed type is "tuple[str, ...]"
    reveal_type(kwargs)  # Revealed type is "dict[str, str]"
    request = make_request(*args, **kwargs)
    return self.do_api_query(request)
# When the argument is a dict:
def test(d: dict[int,str]) -> int: ...
With Callable, Any can no longer be omitted: wherever it is needed, it has to be written out explicitly.
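A brief sketch of Callable with Any spelled out. Callable[..., Any] accepts a callable with any parameter list, while Callable[[Any], Any] asks for exactly one argument of any type. The helper names call_with_log and apply_twice are made up for this illustration.

```python
from typing import Any, Callable


def call_with_log(func: Callable[..., Any], *args: Any) -> Any:
    """Call func with args; the ellipsis accepts any parameter list."""
    print(f"calling {func.__name__} with {args}")
    return func(*args)


def apply_twice(func: Callable[[Any], Any], value: Any) -> Any:
    """Apply a one-argument callable to value two times."""
    return func(func(value))


print(call_with_log(max, 3, 7))
print(apply_twice(lambda v: v * 2, 3))
```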
class BankAccount:
    # The "__init__" method doesn't return anything, so it gets return
    # type "None" just like any other method that doesn't return anything
    def __init__(self, account_name: str, initial_balance: int = 0) -> None:
        # mypy will infer the correct types for these instance variables
        # based on the types of the parameters.
        self.account_name = account_name
        self.balance = initial_balance
    # For instance methods, omit type for "self"
    def deposit(self, amount: int) -> None:
        self.balance += amount
    def withdraw(self, amount: int) -> None:
        self.balance -= amount
# User-defined classes are valid as types in annotations
account: BankAccount = BankAccount("Alice", 400)
def transfer(src: BankAccount, dst: BankAccount, amount: int) -> None:
    src.withdraw(amount)
    dst.deposit(amount)
# Functions that accept BankAccount also accept any subclass of BankAccount!
class AuditedBankAccount(BankAccount):
    # You can optionally declare instance variables in the class body
    # Without the annotation, a bare name here would be an error
    # (a NameError when the class body runs)
    # When an instance later creates an attribute with this name, a type
    # error seems to be caught only with --strict
    audit_log: list[str]
    # This is a class variable with a default value
    auditor_name: str = "The Spanish Inquisition"
    def __init__(self, account_name: str, initial_balance: int = 0) -> None:
        super().__init__(account_name, initial_balance)
        self.audit_log: list[str] = []
    def deposit(self, amount: int) -> None:
        self.audit_log.append(f"Deposited {amount}")
        self.balance += amount
    def withdraw(self, amount: int) -> None:
        self.audit_log.append(f"Withdrew {amount}")
        self.balance -= amount
audited = AuditedBankAccount("Bob", 300)
transfer(audited, account, 100)  # type checks!
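The benefit of ClassVar: if a ClassVar is assigned through an instance, mypy can detect it. (An assignment through self inside a member function may go undetected, for example when the method is unannotated and mypy therefore skips its body.)
from typing import ClassVar
# You can use the ClassVar annotation to declare a class variable
class Car:
    seats: ClassVar[int] = 4
    passengers: ClassVar[list[str]]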
# If you want dynamic attributes on your class, have it
# override "__setattr__" or "__getattr__"
class A:
    # This will allow assignment to any A.x, if x is the same type as "value"
    # (use "value: Any" to allow arbitrary types)
    def __setattr__(self, name: str, value: int) -> None: ...
    # This will allow access to any A.x, if x is compatible with the return type
    def __getattr__(self, name: str) -> int: ...
a = A()
a.foo = 42 # Works
a.bar = 'Ex-parrot' # Fails type checking
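Returning to ClassVar: the sketch below shows, at runtime, why assigning a class variable through an instance is worth catching. The assignment does not change the shared value; it quietly creates an instance attribute that shadows it. A type checker is expected to reject the marked line when the attribute is declared as ClassVar; the variable names first and second are invented for the example.

```python
from typing import ClassVar


class Car:
    seats: ClassVar[int] = 4      # shared by every Car


first = Car()
second = Car()

# A type checker is expected to reject this assignment through an instance.
# At runtime it creates a new attribute on "first" only.
first.seats = 2

print(first.seats, second.seats, Car.seats)

Car.seats = 5                     # the intended way to change the shared value
print(first.seats, second.seats, Car.seats)
```

After the class-level assignment, second follows the new shared value while first keeps its private copy, which is the kind of divergence the ClassVar check is meant to prevent.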
The base function of a singledispatch must accept the first-parameter types of all the registered implementations.
from functools import singledispatch
from typing import Union
@singledispatch
def make_toc(lines: Union[list[str],str]) -> str:
    """Return the TOC contents."""
    return _make_toc(lines)[1]
@make_toc.register
def _(strlines: str) -> str:
    return _make_toc(strlines.split('\n'))[1]
This code comes from the toc4github project, which automatically generates a TOC for a GitHub README.md.
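Since the excerpt above depends on a helper that is not shown, here is a self-contained sketch of the same pattern. The function count_words is invented for this illustration: the base implementation is annotated with the union of all accepted first-parameter types, and the str implementation is registered through its own annotation.

```python
from functools import singledispatch
from typing import Union


@singledispatch
def count_words(text: Union[list[str], str]) -> int:
    """Base implementation: treat the argument as a list of lines."""
    return sum(len(line.split()) for line in text)


@count_words.register
def _(text: str) -> int:
    # Selected for str arguments; delegates to the list implementation.
    return count_words(text.split("\n"))


print(count_words(["a b", "c"]))
print(count_words("a b\nc d e"))
```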
A generator can be annotated in two ways, with Generator or with Iterator. The former can specify YieldType, SendType and ReturnType; the latter can specify only YieldType.
import socket
from typing import Generator, Iterator
# cx, dx and SK_IO_CHUNK_LEN are not shown in this excerpt
class trafix():
    """ traffic exchanging class """
    def send_sk_nonblock_gen(self, sk: socket.socket) \
            -> Generator[int, tuple[bytes|None,int], None]:
        """ socket nonblocking send generator """
        data = b''
        while True:
            bmsg, sid = yield len(data)
            if bmsg is not None:
                if self.x: bmsg = cx(bmsg)
                data += (len(bmsg)+8).to_bytes(4,'little') \
                            + sid.to_bytes(4,'big') \
                            + bmsg
            try:
                while True:
                    if len(data) == 0:
                        break
                    if (i:=sk.send(data[:SK_IO_CHUNK_LEN])) == -1:
                        raise ConnectionError('send_sk_nonblock_gen send -1')
                    data = data[i:]
            except BlockingIOError:
                continue
    def recv_sk_nonblock_gen(self, sk: socket.socket) \
            -> Iterator[tuple[int|None,bytes,bytes]]:
        """ socket nonblocking recv generator,
            yield sid,type,msg """
        data = b''
        while True:
            try:
                d = sk.recv(SK_IO_CHUNK_LEN)
                if len(d) == 0:
                    raise ConnectionError('recv_sk_nonblock_gen recv 0')
                data += d
                while (dlen:=len(data)) > 4:
                    mlen = int.from_bytes(data[:4], 'little')
                    if dlen >= mlen:
                        sid = int.from_bytes(data[4:8], 'big')
                        msg = dx(data[8:mlen]) if self.x else data[8:mlen]
                        yield sid, msg[:1], msg[1:]
                        data = data[mlen:]
                    else:
                        break
            except BlockingIOError:
                yield None, b'\x00', b''
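The socket code above needs a live connection to run, so here is a smaller sketch of the same two annotation styles. The functions running_total and countdown are made up for this example: the first uses all three Generator parameters (it yields int, receives int through send, and returns a str), the second only yields and is annotated with Iterator.

```python
from typing import Generator, Iterator


def running_total() -> Generator[int, int, str]:
    """Yield the running total, receive numbers via send, return a summary."""
    total = 0
    while total < 10:
        received = yield total
        total += received
    return f"stopped at {total}"


def countdown(n: int) -> Iterator[int]:
    """Only the yielded type matters here, so Iterator is enough."""
    while n > 0:
        yield n
        n -= 1


gen = running_total()
first = next(gen)            # run to the first yield
second = gen.send(4)         # total becomes 4
try:
    gen.send(6)              # total reaches 10, the generator returns
except StopIteration as stop:
    result = stop.value      # the ReturnType value

print(first, second, result)
print(list(countdown(3)))
```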
If you are not sure what type an object actually has, you can check with reveal_type and reveal_locals, which mypy provides.
$ cat tt.py
a = [1, 'abc']
reveal_type(a)
b = (1, None)
reveal_type(b)
$ mypy tt.py
tt.py:4: note: Revealed type is "builtins.list[builtins.object]"
tt.py:7: note: Revealed type is "Tuple[builtins.int, None]"
Success: no issues found in 1 source file
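The bare reveal_type is not a runtime function; only mypy understands it, so these calls must be removed from the source file in the end. (Since Python 3.11, typing.reveal_type is available as an importable runtime counterpart.)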
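One possible way to leave a reveal_type call in a file while debugging, without crashing the program, is to guard it with typing.TYPE_CHECKING. This is only a sketch: the constant is False at runtime, so the block never executes, while type checkers treat it as true and still analyze the call. The variables pairs and lookup are invented for the example.

```python
from typing import TYPE_CHECKING

pairs = [(1, "one"), (2, "two")]
lookup = dict(pairs)

if TYPE_CHECKING:
    # Seen by the type checker only; this block never runs.
    reveal_type(lookup)

print(lookup[2])
```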
Below is an example of reveal_locals:
$ cat tt.py
def func():
    a: int = 1
    b: str = '123'
    c = [a,b] # no type hint
    reveal_locals()
$ mypy tt.py
tt.py:7: note: Revealed local types are:
tt.py:7: note: a: builtins.int
tt.py:7: note: b: builtins.str
tt.py:7: note: c: Any
Success: no issues found in 1 source file
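Type annotations can live in separate stub files, as they do for the Python standard library and many popular third-party libraries. These are .pyi files; the i can be read as the first letter of interface.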
Mypy uses the typeshed repository of type stubs (type definitions for a module in the style of a header file) to provide type data for both the Python standard library and dozens of popular libraries like requests, six, and sqlalchemy. Importantly, mypy is designed for gradually adding types; if type data for an import isn't available, it just treats that import as being consistent with anything. In other words, if an imported library has no typing information, mypy simply lets it pass.
Reference: https://github.com/python/typeshed
Here is an excerpt from a pyi file:
class Exif(MutableMapping[int, Any]):
    endian: Incomplete
    bigtiff: bool
    def load(self, data: bytes) -> None: ...
    def load_from_fp(self, fp, offset: Incomplete | None = None) -> None: ...
    def tobytes(self, offset: int = 8) -> bytes: ...
    def get_ifd(self, tag: int): ...
    def hide_offsets(self) -> None: ...
    def __len__(self) -> int: ...
    def __getitem__(self, tag: int) -> Any: ...
    def __contains__(self, tag: object) -> bool: ...
    def __setitem__(self, tag: int, value: Any) -> None: ...
    def __delitem__(self, tag: int) -> None: ...
    def __iter__(self) -> Iterator[int]: ...
A stub uses ... in place of each function body. The same style can be used to write down the signature of a Python function interface.
For example, suppose you want to make sure all functions within your codebase are using static typing and make mypy report an error if you add a dynamically-typed function by mistake. You can make mypy do this by running mypy with the --disallow-untyped-defs flag.
I recommend --strict, which includes --disallow-untyped-defs.
With --strict enabled, mypy can be told to skip a function that has no typing, like this:
def func(*args, **kwargs):  # type: ignore
    ...
Individual lines can be skipped as well, for example:
# server parameters
param = {'host': smtp,
         'port': port,
         'timeout': timeout}
# create server
if port in (25, 465, 587):
    if port == 465:
        server = smtplib.SMTP_SSL(**param)  # type: ignore
mypy cannot handle this ** unpacking of a dict with mixed value types correctly, hence the ignore.
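An alternative that may avoid the ignore comment is to annotate the dict as dict[str, Any], so the checker stops trying to match one inferred value type against every parameter. This is a sketch under assumptions: open_connection is a hypothetical stand-in for a constructor such as smtplib.SMTP_SSL, and the host name is a placeholder. The trade-off is that the unpacked values are then not checked at all.

```python
from typing import Any


def open_connection(host: str, port: int, timeout: float = 10.0) -> str:
    """Hypothetical stand-in for a constructor such as smtplib.SMTP_SSL."""
    return f"{host}:{port} (timeout {timeout}s)"


# The values have mixed types (str, int, float); declaring the value type
# as Any tells the checker not to match them against each parameter.
param: dict[str, Any] = {'host': 'smtp.example.com',
                         'port': 465,
                         'timeout': 5.0}

server = open_connection(**param)
print(server)
```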
Give a long type annotation a short alias.
import socket
sk_t = socket.socket # sk_t is a type now
Vector = list[float]
# or more explicitly
from typing import TypeAlias
Vector: TypeAlias = list[float]
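A short sketch of aliases in use (assuming Python 3.9+ for list[float]). Aliases can also be built from other aliases; Matrix, scale and transpose are names invented for this illustration.

```python
Vector = list[float]
Matrix = list[Vector]        # an alias built from another alias


def scale(scalar: float, vector: Vector) -> Vector:
    """Multiply every component by scalar."""
    return [scalar * num for num in vector]


def transpose(matrix: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


print(scale(2.0, [1.0, 2.5]))
print(transpose([[1.0, 2.0], [3.0, 4.0]]))
```

The alias is an ordinary name bound to the type, so a signature written with Vector means exactly the same as one written with list[float].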
Python 3.11 added the Self type, which neatly resolves the awkwardness of having to refer to the class being defined inside its own definition.
from typing import Self
class tt:
    def __init__(self, a: Self):
        pass
    def fn(self, *argv) -> Self:
        ...
        return self
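Self pays off most with methods that return self for chaining, because a subclass then keeps its own type through the chain. The sketch below uses invented Shape and Circle classes. Only to stay importable on interpreters older than 3.11, it imports Self under TYPE_CHECKING and quotes the annotation; on Python 3.11+ a plain from typing import Self with unquoted annotations works the same way.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from typing import Self   # needs Python 3.11+ for the checker


class Shape:
    def __init__(self) -> None:
        self.scale = 1.0

    def set_scale(self, scale: float) -> "Self":
        self.scale = scale
        return self


class Circle(Shape):
    def __init__(self) -> None:
        super().__init__()
        self.radius = 0.0

    def set_radius(self, radius: float) -> "Self":
        self.radius = radius
        return self


# set_scale is defined on Shape, yet the chain continues with a Circle method.
circle = Circle().set_scale(0.5).set_radius(2.7)
print(type(circle).__name__, circle.scale, circle.radius)
```

Had set_scale been annotated as returning Shape, a checker would be expected to reject the set_radius call that follows it.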
Permalink: https://cs.pynote.net/sf/python/202211041/