Magic Methods of Python

Last Updated: 2023-08-20 14:15:50 Sunday

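The magic methods of a Python class are also called special methods or dunder methods:

dunder: double underscore before and after the name, hence "dunder method".

Most magic methods are interfaces that the Python interpreter calls on your behalf. You can think of the interpreter as a framework: plugging your own classes into that framework, so that they take full advantage of what the interpreter offers, is what being Pythonic means.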

One of the biggest advantages of using Python's magic methods is that they provide a simple way to make objects behave like built-in types. That means you can avoid ugly, counter-intuitive, and nonstandard ways of performing basic operations.

Operators and some global functions are just syntactic sugar for magic methods in Python.

Object Creating, Init, Del

__new__&__init__

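Object creation calls __new__ first

When an object is created, the first interface the interpreter calls is not the familiar __init__ but __new__. __new__ creates the object, and __init__ then performs any necessary initialization. Most Python code never overrides __new__. If you do override it, it looks roughly like this: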

class aa():

    def __new__(cls):
        print('__new__')
        return object.__new__(cls)

    def __init__(self):
        print('__init__')


a1 = aa()
print(a1)

Output:

__new__
__init__
<__main__.aa object at 0x7f61db9cc7f0>

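__new__ is a static method that Python special-cases, so it needs no decorator; it receives the class as its first argument. The object is created by the line object.__new__(cls). After __new__ returns an instance of cls, the interpreter goes on to call __init__ to initialize it (if __init__ is defined). That is why __init__ does not return anything (returning a value other than None raises TypeError); it is simply called implicitly. (object.__new__ takes no arguments besides cls; __new__ of other classes may take more.)

The first parameter of __init__ is self, so the interpreter must have created that object before calling __init__: it calls __new__ first, __new__ returns the object that becomes self, and finally the interpreter hands the object back to user code.

Do not raise exceptions in __init__

Raising an exception in __init__ is probably not a good design.

__init__ is optional

If the class defines no __init__, no user-defined initializer runs. In that case __new__ naturally acts as a so-called factory function: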

class aa():

    def __new__(cls, num):
        obj =  object.__new__(cls)
        if num < 0:
            obj.num = 0
        else:
            obj.num = num
        return obj


a1 = aa(-123)
print(a1.num)
a2 = aa(123)
print(a2.num)

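__new__ and __init__ must accept the same argument list, because the interpreter passes the constructor arguments to both.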
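
A minimal sketch of that rule, using a hypothetical Account class that is not part of the original article: the constructor arguments are handed first to __new__ and then to __init__, so both signatures have to accept them.

```python
class Account:
    def __new__(cls, owner, balance=0):
        # __new__ receives the same arguments as the constructor call.
        print('__new__ got', owner, balance)
        return super().__new__(cls)

    def __init__(self, owner, balance=0):
        # __init__ receives them again, after the instance exists.
        self.owner = owner
        self.balance = balance


acct = Account('alice', balance=10)
print(acct.owner, acct.balance)  # alice 10
```

If the two signatures disagree, the call fails with a TypeError in whichever method cannot accept the arguments.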

What overriding __new__ is for

A passage found on Stack Overflow describes some of the situations where __new__ is used:

New-style classes introduced a new class method __new__() that lets the class author customize how new class instances are created. By overriding __new__() a class author can implement patterns like the Singleton Pattern, return a previously created instance (e.g., from a free list), or to return an instance of a different class (e.g., a subclass). However, the use of __new__ has other important applications. For example, in the pickle module, __new__ is used to create instances when unserializing objects. In this case, instances are created, but the __init__ method is not invoked.

Another use of __new__ is to help with the subclassing of immutable types. By the nature of their immutability, these kinds of objects can not be initialized through a standard __init__() method. Instead, any kind of special initialization must be performed as the object is created; for instance, if the class wanted to modify the value being stored in the immutable object, the __new__ method can do this by passing the modified value to the base class __new__ method.
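
The immutable-type case can be sketched as follows. Celsius is a hypothetical float subclass invented for illustration; the point is only that the stored value has to be adjusted in __new__, because by the time __init__ runs the float is already fixed.

```python
class Celsius(float):
    """Hypothetical float subclass that clamps values at absolute zero."""

    def __new__(cls, value):
        # The float value is fixed at creation, so adjust it here,
        # then pass the modified value to the base class __new__.
        return super().__new__(cls, max(value, -273.15))


t = Celsius(-500)
print(t)                     # -273.15
print(isinstance(t, float))  # True
```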

Singleton Pattern

Only one instance of the class exists at run time. Python's built-in None is an example.

The singleton pattern is one of the simplest design patterns. Sometimes we need only one instance of a class, for example a single DB connection shared by multiple objects, because creating a separate DB connection for every object may be costly. Similarly, an application can have a single configuration manager or error manager that handles all problems instead of creating multiple managers.

class xyz():

    meOnly = None

    def __new__(cls):
        if cls.meOnly:
            return cls.meOnly
        else:
            cls.meOnly = object.__new__(cls)
            return cls.meOnly

    def __init__(self):
        self.x = 1
        self.y = 2
        self.z = 3


x = xyz()
print(id(x))
y = xyz()
print(id(y))
print(x.x)
print(y.y)
if x is y:
    print('x is y')

Running the code above prints:

$ python3 singleton.py
35731976
35731976
1
2
x is y

__del__

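The statement del x only decrements the object's reference count by 1. When the reference count drops to 0, the object is reclaimed, and that is when __del__ is called.

If __new__ and __init__ formed the constructor of the object, __del__ is the destructor. It can be quite useful for objects that might require extra cleanup upon deletion, like sockets or file objects.

The interpreter does not guarantee that an object's __del__ is ever called. For example, if the object is still alive when the interpreter exits (say, through os._exit), some system resources may be left unreleased. So override __del__ with great care. __del__ is no substitute for the good habit of releasing resources explicitly.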

Customizing Attribute Access

__getattr__,__setattr__&__delattr__

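When accessing instance attributes... Note that __getattr__ is called only when normal attribute lookup fails, whereas __setattr__ and __delattr__ are called on every assignment and deletion.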

class xyz():

    def __setattr__(self, name, value):
        print('__setattr__', name, value)
        self.__dict__[name] = value

    def __getattr__(self, name):
        print('__getattr__', name)
        try:
            rt = self.__dict__[name]
        except KeyError:
            raise AttributeError(name) from None
        return rt

    def __delattr__(self, name):
        print('__delattr__', name)
        self.__dict__.pop(name)


x = xyz()
x.a = 1
setattr(x, 'b', 2)
print(x.a, x.b)
del x.a
print(hasattr(x,'a'))  # False
print(getattr(x,'b'))  # 2
print(getattr(x,'a',111))  # 111

__getattribute__

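When accessing any attribute...

Rarely needed. The difference from __getattr__: __getattribute__ is called unconditionally on every attribute access, whereas __getattr__ is called only as a fallback when the normal lookup raises AttributeError.

Test code: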

class xyz():

    label = 'xyz'

    def __getattribute__(self, name):
        print('__getattribute__')
        return super().__getattribute__(name)

    def __getattr__(self, name):
        print('__getattr__', name)
        rt = self.__dict__[name]
        return rt

    def show(self):
        print('show')


x = xyz()
print(x.label)
x.show()

__dir__

Called by dir(object). It should return an iterable of attribute names; dir() converts the result to a sorted list.
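
As a small illustration, here is a hypothetical Config class (the names are made up) whose __dir__ advertises only its public attributes. Note that dir() sorts whatever the method returns.

```python
class Config:
    def __init__(self):
        self._secret = 'hidden'
        self.host = 'localhost'
        self.port = 8080

    def __dir__(self):
        # Advertise only the public names; dir() sorts the result.
        return ['port', 'host']


cfg = Config()
print(dir(cfg))  # ['host', 'port']
```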

__get__,__set__,__delete__&__set_name__

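When accessing class attributes...

This group of magic methods implements Python's descriptor protocol. What is a descriptor?

As I understand it, a descriptor generally identifies a resource, and the resource is operated on through the descriptor, much like a file descriptor on Linux or a Handle on Windows. A Python descriptor is a class variable; when it is accessed through an instance, the corresponding magic method is called.

Normally, when you assign to an attribute through an object and the attribute does not exist yet, the interpreter creates the attribute on that object and stores the value. But if the name is a class variable that is a descriptor defining __set__, the interpreter does not create an instance attribute; it calls the descriptor's magic methods instead.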

https://docs.python.org/3/howto/descriptor.html

Descriptors are a powerful, general purpose protocol. They are the mechanism behind properties, methods, static methods, class methods, and super(). They are used throughout Python itself. Descriptors simplify the underlying C code and offer a flexible set of new tools for everyday Python programs.
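
A minimal sketch of the protocol, assuming a hypothetical validating descriptor named Positive and an Order class that uses it (neither appears in the original). Assigning through the instance is routed to __set__ instead of creating a plain instance attribute called quantity.

```python
class Positive:
    """Hypothetical data descriptor that rejects non-positive values."""

    def __set_name__(self, owner, name):
        # Called once, when the owning class is created.
        self.storage = '_' + name

    def __get__(self, obj, objtype=None):
        if obj is None:            # accessed on the class itself
            return self
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError('must be positive')
        setattr(obj, self.storage, value)


class Order:
    quantity = Positive()          # class variable holding the descriptor

    def __init__(self, quantity):
        self.quantity = quantity   # routed to Positive.__set__


order = Order(3)
print(order.quantity)  # 3
print(vars(order))     # {'_quantity': 3}
```

Storing the value under a different name ('_quantity') is a design choice of this sketch; it keeps the descriptor from recursing into itself.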

Representing Classes

__str__&__repr__

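These two magic methods are called by str() and repr() respectively. The string returned by __str__ is aimed at end users; the string returned by __repr__ is aimed at developers and, where possible, should be usable to recreate the object.

In the interpreter's interactive mode, entering an object displays the result of repr. If the object does not define __repr__, you get something like this: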

>>> class xyz: pass
... 
>>> x = xyz()
>>> x
<__main__.xyz object at 0x7fd2a0fd3880>  # default __repr__

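print(), on the other hand, calls str by default.

If you implement only one of the two, implement __repr__, because it is the fallback for str when __str__ is not defined.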

class xyz:    
    def __repr__(self):
        return 'i am xyz'

Test:

>>> x = xyz()
>>> x
i am xyz
>>> str(x)
'i am xyz'
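
When both methods are defined, each caller picks its own. The Point class below is a hypothetical example, not from the original, showing the two audiences side by side.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # Developer-facing: looks like the expression that rebuilds the object.
        return f'Point({self.x!r}, {self.y!r})'

    def __str__(self):
        # User-facing: short and readable.
        return f'({self.x}, {self.y})'


p = Point(1, 2)
print(str(p))   # (1, 2)
print(repr(p))  # Point(1, 2)
print([p])      # [Point(1, 2)], containers use repr for their items
```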

__format__

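Called when the object is formatted with a format spec, as in {:spec}, whether by an f-string, str.format(), or format().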

class xyz:

    def __str__(self):
        return '__str__'

    def __repr__(self):
        return '__repr__'

    def __format__(self, spec):
        print("in __format__")
        return spec


x = xyz()
print(f"{x:j}")
print(f"{x:&j}")
print("{:999}".format(x))
print("{:@@@}".format(x))
print(f"{x!s}")  # call __str__
print(f"{x!r}")  # call __repr__

__len__

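Called by len(object).

__bool__

Called by bool(object) and whenever the object is tested for truth. If __bool__ is not defined, Python falls back to __len__.

In Python 2, this interface was named __nonzero__.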
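
A short sketch of how truth testing picks between the two methods, using hypothetical Basket and AlwaysOn classes: without __bool__, the result follows __len__; with it, __bool__ wins.

```python
class Basket:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class AlwaysOn(Basket):
    def __bool__(self):
        # Takes priority over __len__ when both are defined.
        return True


print(bool(Basket([])))    # False, falls back to __len__() == 0
print(bool(Basket([1])))   # True
print(bool(AlwaysOn([])))  # True, __bool__ wins
```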

__sizeof__

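Called by sys.getsizeof(object). For objects tracked by the garbage collector, getsizeof adds the collector's header overhead to the value returned here (16 bytes in the author's environment) to produce the final result.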

__hash__

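Called by hash(object). This magic method should return an int. When the object is used as a dict key, that int determines where the key is stored; keys whose hashes collide are then told apart with __eq__.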

Note that this usually entails implementing __eq__ as well. Live by the following rule: a == b implies hash(a) == hash(b).

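Instances of user-defined classes are hashable by default, so there is usually no need to define __hash__ yourself. The exception is a class that defines __eq__ without __hash__: its __hash__ is set to None and its instances become unhashable.

As a rule of thumb, hashable objects can be regarded as immutable objects.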
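
The rule a == b implies hash(a) == hash(b) can be sketched like this. Version and NoHash are hypothetical classes for illustration; the idea is to hash exactly the fields that __eq__ compares.

```python
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return (self.major, self.minor) == (other.major, other.minor)

    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.major, self.minor))


class NoHash:
    # Defining __eq__ without __hash__ sets __hash__ to None.
    def __eq__(self, other):
        return True


releases = {Version(3, 11): 'stable'}
print(releases[Version(3, 11)])  # stable
print(NoHash.__hash__)           # None
```

Because the dict lookup uses an equal but distinct Version object, it only succeeds when both __eq__ and __hash__ agree.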

__index__

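Called when the object is used as an index, as in seq[object], and more generally wherever Python needs to convert the object to an exact integer.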

class xyz():

    def __init__(self):
        self.a = 1

    def __index__(self):
        return self.a


x = xyz()
a = [1,2,3]
print(a[x])
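
Beyond plain indexing, __index__ is also consulted for slice bounds and by built-ins such as hex() and bin(). A brief sketch with a hypothetical Offset class:

```python
class Offset:
    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


two = Offset(2)
data = ['a', 'b', 'c', 'd']
print(data[two])   # c
print(data[:two])  # ['a', 'b']
print(hex(two))    # 0x2
print(bin(two))    # 0b10
```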

Context Manager

__enter__&__exit__

In Python, the with statement corresponds to context management:

Context Managers are Python’s resource managers. In most cases, we use files as resources (a simple resource). We often don’t care about closing the files at the end of execution. This is a bad coding practice and also this causes issues when too many files are opened, or when the program is terminated with failure as the resource is not properly released. Context managers are the rescue for this issue by automatically managing resources. In Python, the with keyword is used.

Typical operations on file or mutex resources:

with open(filename) as f:
    f.read()
...
import threading
mutex = threading.Lock()
with mutex:
    ...

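The with statement guarantees that the resource is properly released once execution leaves the with block.

Defining __enter__ and __exit__

By implementing __enter__ and __exit__ in your own class, you make it usable with the with statement.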

# a simple file writer object
class MessageWriter(object):
    def __init__(self, file_name):
        self.file_name = file_name

    def __enter__(self):
        self.file = open(self.file_name, 'w')
        return self.file

    def __exit__(self, *args): 
        self.file.close()


# using with statement with MessageWriter
with MessageWriter('my_file.txt') as f:
    f.write('hello world')

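The object is created first. The with statement then calls __enter__ and assigns the returned object to f. Inside the with block the resource is operated on through f, and __exit__ is called on leaving the block.

The parameters of __exit__ and exceptions in the with block

Besides self, __exit__ takes three parameters, and it can handle exceptions raised in the with block directly.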

class Divide:
    def __init__(self, num1, num2):
        self.num1 = num1
        self.num2 = num2

    def __enter__(self):
        print("Inside __enter__")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        print("Inside __exit__")
        print("Exception type:", exc_type)
        print("Exception value:", exc_value)
        print("Traceback:", traceback)
        return True

    def do_divide(self):
        print(self.num1 / self.num2)


a = Divide(1,2)
with a as f:
    f.do_divide()

a = Divide(1,0)
with a as f:
    f.do_divide()
    raise ValueError

Output:

Inside __enter__
0.5
Inside __exit__
Exception type: None
Exception value: None
Traceback: None
Inside __enter__
Inside __exit__
Exception type: <class 'ZeroDivisionError'>
Exception value: division by zero
Traceback: <traceback object at 0x7f5e19a05740>

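The full parameter list of __exit__ is shown in the code above. If no exception is raised while the with block runs, all three parameters are None; if one is raised, they carry the exception's type, value, and traceback. We can handle the exception entirely inside __exit__ and then return True. If __exit__ does not return a true value, the exception propagates out of the with block.

Entering several objects with a single with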

>>> with open('tox.ini') as f, open('ping.py') as g:
...   f.read()
...   g.read()
...

with and generators

See: Generator Summary
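
One common way the two meet is contextlib.contextmanager from the standard library, which turns a generator function into a context manager. This is a sketch of that general technique, not a reproduction of the referenced article; the tracked function and the events list are hypothetical names.

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    events.append('enter ' + name)     # plays the role of __enter__
    try:
        yield name.upper()             # the value bound by "as"
    finally:
        events.append('exit ' + name)  # plays the role of __exit__


with tracked('job') as label:
    events.append('body ' + label)

print(events)  # ['enter job', 'body JOB', 'exit job']
```

The try/finally around the yield is what makes the cleanup run even if the with block raises.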

Comparative Operators

__cmp__ was removed in Python 3: it is gone due to redundancy with the other comparison magic methods.

__eq__&__ne__

class xyz:
    def __init__(self, num):
        self.num = num

    def __eq__(self, other):
        return self.num == other.num

    def __ne__(self, other):
        return self.num != other.num



a = xyz(1)
b = xyz(2)
c = xyz(1)

print(a == b)  # False
print(a == c)  # True
print(a != b)  # True
print(a != c)  # False

__gt__,__ge__,__lt__&__le__

class xyz:
    def __init__(self, num):
        self.num = num

    def __gt__(self, other):
        print('self id', id(self))
        return self.num > other.num

    def __ge__(self, other):
        print('self id', id(self))
        return self.num >= other.num


a = xyz(1)
b = xyz(2)
print(a > b)
print(a < b)

c = xyz(1)
print(a >= c)
print(a <= c)

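The code above defines only __gt__ and __ge__, but that is enough: when a < b is evaluated, the interpreter calls b.__gt__(a), the reflected comparison. Running the code prints: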

self id 139894170461296
False
self id 139894170463648
True
self id 139894170461296
True
self id 139894170465520
True

@total_ordering

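Defining every magic method for equality and ordering is redundant. Python provides a decorator, functools.total_ordering: you implement __eq__ plus one of __lt__, __le__, __gt__, or __ge__, and the rest are derived for you.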

from functools import total_ordering


@total_ordering
class xyz:
    def __init__(self, num):
        self.num = num

    def __gt__(self, other):
        print('__gt__')
        return self.num > other.num

    def __eq__(self, other):
        print('__eq__')
        return self.num == other.num


a = xyz(1)
b = xyz(2)
c = xyz(3)
print(a < b)
print(a >= c)
print(a <= c)
print(a != b)

Conversions

__pos__&__neg__

class strnum(str):
    def __new__(cls, num):
        return str.__new__(cls, num)

    def __pos__(self):
        return '+'+self

    def __neg__(self):
        return '-'+self


a = strnum(12345)
print(+a)  # +12345
print(-a)  # -12345

__abs__

Called by abs(object).

__round__

Called by round(object[, n]).

__floor__,__ceil__&__trunc__

Called by math.floor, math.ceil, and math.trunc respectively.

import math


class xyz():

    def __abs__(self):
        return 1

    def __round__(self, n=0):
        return 2

    def __floor__(self):
        return 3

    def __ceil__(self):
        return 4

    def __trunc__(self):
        return 5


x = xyz()
print(abs(x))
print(round(x))
print(math.floor(x))
print(math.ceil(x))
print(math.trunc(x))

__int__,__float__&__complex__

class xyz():

    def __int__(self):
        return 9

    def __float__(self):
        return 9.9

    def __complex__(self):
        return 1+2j

    # Python 2 only: Python 3's hex() ignores __hex__ and uses __index__.
    def __hex__(self):
        return '0xFF'


x = xyz()
print(int(x))
print(float(x))
print(complex(x))

Bitwise Operators

__invert__

__and__,__or__&__xor__

__lshift__&__rshift__

class xyz():

    def __invert__(self):
        return 1

    def __and__(self, other):
        return 2

    def __or__(self, other):
        return 3

    def __xor__(self, other):
        return 4

    def __lshift__(self, other):
        return 5

    def __rshift__(self, other):
        return 6


x = xyz()
y = xyz()
print(~x)
print(x & y)
print(x | y)
print(x ^ y)
print(x << 2)
print(x >> 2)

Reflected Bitwise Operators

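Reflected means the operands are swapped: the normal form is object op other, the reflected form is other op object. The reflected method is tried when the left operand does not support the operation.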

__rand__,__ror__&__rxor__

__rlshift__&__rrshift__

class xyz():

    def __rand__(self, other):
        return 2

    def __ror__(self, other):
        return 3

    def __rxor__(self, other):
        return 4

    def __rlshift__(self, other):
        return 5

    def __rrshift__(self, other):
        return 6


x = xyz()
y = xyz()
print(1 & y)
print(1 | y)
print(1 ^ y)
print(100 << x)
print(100 >> x)
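
The hand-off between the normal and the reflected method is driven by NotImplemented. The sketch below uses a hypothetical Flags class: for 0b100 | Flags(...), int.__or__ returns NotImplemented, so the interpreter falls back to Flags.__ror__.

```python
class Flags:
    def __init__(self, bits):
        self.bits = bits

    def __or__(self, other):
        if isinstance(other, Flags):
            return Flags(self.bits | other.bits)
        if isinstance(other, int):
            return Flags(self.bits | other)
        return NotImplemented  # let Python try the other operand

    def __ror__(self, other):
        # Reached for "int | Flags" because int.__or__ returns NotImplemented.
        return self.__or__(other)


print((Flags(0b01) | 0b10).bits)   # 3
print((0b100 | Flags(0b01)).bits)  # 5
```

If both sides return NotImplemented, the interpreter raises TypeError, which is usually the desired outcome for unsupported operand types.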

Augmented Assignment Bitwise Operators

__iand__,__ior__&__ixor__

__ilshift__&__irshift__

class xyz():

    def __iand__(self, other):
        return 2

    def __ior__(self, other):
        return 3

    def __ixor__(self, other):
        return 4

    def __ilshift__(self, other):
        return 5

    def __irshift__(self, other):
        return 6


x = xyz(); x &= 1; print(x)
x = xyz(); x |= 1; print(x)
x = xyz(); x ^= 1; print(x)
x = xyz(); x <<= 1; print(x)
x = xyz(); x >>= 1; print(x)

Arithmetic Operators

__add__&__sub__

__mul__&__matmul__

__floordiv__&__truediv__

__mod__&__pow__

__divmod__

class xyz():

    def __add__(self, other):
        return 1

    def __sub__(self, other):
        return 2

    def __mul__(self, other):
        return 3

    def __floordiv__(self, other):
        return 4

    def __truediv__(self, other):
        return 5

    def __mod__(self, other):
        return 6

    def __pow__(self, other):
        return 7

    def __divmod__(self, other):
        return 6,8

    def __matmul__(self, other):
        return 9


x = xyz()
print(x + 1)
print(x - 1)
print(x * 1)
print(x // 1)
print(x / 1)
print(x % 4)
print(x**2)
print(divmod(x,2))
print(x @ 7)

Reflected Arithmetic Operators

__radd__&__rsub__

__rmul__&__rmatmul__

__rfloordiv__&__rtruediv__

__rmod__&__rpow__

__rdivmod__

class xyz():

    def __radd__(self, other):
        return 1

    def __rsub__(self, other):
        return 2

    def __rmul__(self, other):
        return 3

    def __rfloordiv__(self, other):
        return 4

    def __rtruediv__(self, other):
        return 5

    def __rmod__(self, other):
        return 6

    def __rpow__(self, other):
        return 7

    def __rdivmod__(self, other):
        return 6,8

    def __rmatmul__(self, other):
        return 9


x = xyz()
print(1 + x)
print(1 - x)
print(1 * x)
print(8 // x)
print(8 / x)
print(8 % x)
print(8**x)
print(divmod(8,x))
print(7 @ x)

Augmented Assignment Arithmetic Operators

__iadd__&__isub__

__imul__&__imatmul__

__ifloordiv__&__itruediv__

__imod__&__ipow__

class xyz():

    def __iadd__(self, other):
        return 1

    def __isub__(self, other):
        return 2

    def __imul__(self, other):
        return 3

    def __ifloordiv__(self, other):
        return 4

    def __itruediv__(self, other):
        return 5

    def __imod__(self, other):
        return 6

    def __ipow__(self, other):
        return 7

    def __imatmul__(self, other):
        return 9


x = xyz(); x += 1; print(x)
x = xyz(); x -= 1; print(x)
x = xyz(); x *= 1; print(x)
x = xyz(); x //= 1; print(x)
x = xyz(); x /= 1; print(x)
x = xyz(); x **= 1; print(x)
x = xyz(); x %= 1; print(x)
x = xyz(); x @= 7; print(x)

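Python appears to follow this rule: when an augmented operator is applied to a mutable object, no new object is created and the object is modified in place. When the type defines no in-place method, x op= y falls back to x = x op y.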
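
The built-in list and tuple types illustrate the difference. This is a sketch using only built-ins; the variable names are arbitrary.

```python
nums = [1, 2]
alias = nums
nums += [3]            # list.__iadd__ extends the list in place
print(alias)           # [1, 2, 3]
print(nums is alias)   # True

point = (1, 2)
saved = point
point += (3,)          # tuple has no __iadd__, so this is point = point + (3,)
print(saved)           # (1, 2)
print(point is saved)  # False
```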

Callable Object

__call__

class xyz():

    def __call__(self, a, b, c):
        return a+b+c


x = xyz()
print(x(1,2,3))

Function in Python is the first class citizen!

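This means that functions, like any other object, can be assigned to variables and passed as arguments. Function objects defined in Python come with a __call__ attribute, so the interpreter can call them directly.

Functional programming

Take map: its first argument is a function, which is applied to each item of its second argument. The term for functions like map is higher-order function. This is one case of functional programming.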
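
A small sketch of these ideas, with hypothetical helper names (apply_twice, Adder, square): map accepts any callable, whether a plain function or an object that defines __call__.

```python
def apply_twice(func, value):
    # A higher-order function: it takes another function as an argument.
    return func(func(value))


def square(x):
    return x * x


class Adder:
    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n


print(list(map(square, [1, 2, 3])))     # [1, 4, 9]
print(list(map(Adder(10), [1, 2, 3])))  # [11, 12, 13]
print(apply_twice(square, 3))           # 81
print(hasattr(square, '__call__'))      # True
```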

Copying Object

__copy__&__deepcopy__

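Called by the standard library's copy.copy and copy.deepcopy.

By default every object can be copied with these two functions. Implementing the two magic methods overrides the default behavior; how best to implement them has not been studied in depth here.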
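
One possible shape for the two methods, shown as a rough sketch rather than a recommended recipe. Inventory is a hypothetical class whose copies deliberately start with an empty cache; __deepcopy__ receives the memo dict that copy.deepcopy uses to track objects it has already copied.

```python
import copy


class Inventory:
    def __init__(self, items, cache=None):
        self.items = items
        self.cache = cache if cache is not None else {}

    def __copy__(self):
        # Shallow copy: share items, but start with an empty cache.
        return Inventory(self.items)

    def __deepcopy__(self, memo):
        # Deep copy: duplicate items too, passing memo along.
        return Inventory(copy.deepcopy(self.items, memo))


inv = Inventory([['apple'], ['pear']], cache={'hits': 5})
shallow = copy.copy(inv)
deep = copy.deepcopy(inv)
print(shallow.items is inv.items)  # True
print(deep.items is inv.items)     # False
print(deep.items == inv.items)     # True
print(shallow.cache, deep.cache)   # {} {}
```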

Making Sequence

__getitem__,__setitem__&__delitem__

Called when [] is used:

__contains__

Called when in and not in are used.

class xyz():

    def __init__(self):
        self.a = 1
        self.b = 2
        self.c = 3
        self.d = 4
        self._map = {0:'a',1:'b',2:'c',3:'d'}

    def __len__(self):
        return len(self._map)

    def __getitem__(self, idx):
        if idx not in self._map:
            raise IndexError('Out of Index')
        # getattr/setattr avoid running eval/exec on constructed strings
        return getattr(self, self._map[idx])

    def __setitem__(self, idx, val):
        setattr(self, self._map[idx], val)

    def __delitem__(self, idx):
        """ set to zero """
        setattr(self, self._map[idx], 0)

    def __contains__(self, val):
        if val in (self.a, self.b, self.c, self.d):
            return True
        return False


x = xyz()
print(len(x))
print(x[2])
x[2] = 999
print(x[2])
del x[2]
print(x[2])
print(1 in x)
print(999 not in x)

Iterator

__iter__

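Called by iter(object); returns an iterator.

__next__

Called by next(object). An object that supports __next__ is an iterator; the full iterator protocol also expects its __iter__ to return the iterator itself.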

class joy:

    def __init__(self, max):
        self.max = max

    def __iter__(self):

        class joy_iterator:

            def __init__(it):
                # function's local class,
                # self can be accessed directly.
                it.num = [i for i in range(self.max)]
                it.len = len(it.num)
                it.idx = 0

            def __next__(it):
                if it.idx >= it.len:
                    raise StopIteration()
                rtv = it.num[it.idx]
                it.idx += 1
                return rtv

        return joy_iterator()


a = joy(2)
for i in a:
    print(i)

b = joy(4)
for i in b:
    print(i)

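The defining trait of an iterator: it can be traversed only once!

An iterable, by contrast, can be traversed repeatedly: on each traversal the interpreter obtains a new iterator created from the iterable.

The example above is simpler to implement with a generator.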
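
A sketch of that simplification, assuming the same behavior as the joy class above (the class is renamed Joy here only to keep the two versions apart): writing __iter__ as a generator function removes the need for a separate iterator class.

```python
class Joy:
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        # A generator function: each call returns a fresh iterator.
        for i in range(self.max):
            yield i


a = Joy(3)
print(list(a))   # [0, 1, 2]
print(list(a))   # [0, 1, 2], the iterable can be traversed again

it = iter(a)
print(list(it))  # [0, 1, 2]
print(list(it))  # [], the iterator itself is exhausted
```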

Link to this article: https://cs.pynote.net/sf/python/202304291/
