Magic Methods of Python

Last Updated: 2023-08-20 14:15:50 Sunday

-- TOC --

Object Creating, Init, Del
- __new__&__init__
- __del__
Customizing Attribute Access
Representing Classes
- __str__&__repr__
- __format__
- __len__
- __bool__
- __sizeof__
- __hash__
- __index__
Context Manager
- __enter__&__exit__
Comparative Operators
- __eq__&__ne__
- __gt__,__ge__,__lt__&__le__
Conversions
Bitwise Operators
Arithmetic Operators
Callable Object
- __call__
Copying Object
- __copy__&__deepcopy__
Making Sequence
- __getitem__,__setitem__&__delitem__
- __contains__
Iterator
- __iter__
- __next__

Python classes can define magic methods, also called special methods or dunder methods.

dunder: double underscore before and after the name, hence "dunder method".

Most magic methods are interfaces that the Python interpreter itself calls. You can think of the interpreter as a framework: you plug your own class into it by implementing these methods, and the class then gets to use everything the interpreter offers. That is what being Pythonic means.

One of the biggest advantages of using Python's magic methods is that they provide a simple way to make objects behave like built-in types. That means you can avoid ugly, counter-intuitive, and nonstandard ways of performing basic operations.

Operators and some built-in functions are just syntactic sugar for magic methods in Python.

Object Creating, Init, Del

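`__new__`&`__init__`

Object creation calls __new__ first

When an object is created, the first interface the interpreter calls is not the familiar __init__ but __new__. __new__ creates the object and __init__ performs whatever initialization is needed. Ordinary Python code rarely overrides __new__. If you do override it, it looks roughly like this: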

class aa():

    def __new__(cls):
        print('__new__')
        return object.__new__(cls)

    def __init__(self):
        print('__init__')


a1 = aa()
print(a1)

Output:

__new__
__init__
<__main__.aa object at 0x7f61db9cc7f0>

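__new__ is a static method that takes the class as its first argument; it is special-cased, so no decorator is needed. The line object.__new__(cls) creates the object. After __new__ returns an instance of cls, the interpreter goes on to call __init__ to initialize it (if __init__ is defined). __init__ therefore must not return anything other than None; it is simply called implicitly. (object.__new__ takes no extra arguments; __new__ of other classes may.)

The first parameter of __init__ is self, so the interpreter must have created self before calling __init__: it calls __new__ first, __new__ returns the object that becomes self, and the interpreter finally hands that object back to user code.

Do not raise exceptions in __init__

Raising an exception in __init__ is probably not a good design:

- Code that creates the object has to be wrapped in try/except, which is a bit ugly.
- If the exception is raised, user code never receives the object, so the variable it was meant to be bound to may stay undefined, which is a latent risk.
- Even though construction did not complete, __del__ is still called on the half-initialized object as soon as it is discarded.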
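
That __del__ still runs after a failed __init__ can be observed with a short sketch. Fragile and events are hypothetical names introduced only for illustration, and the exact moment __del__ runs is a CPython detail (reference counting), so gc.collect() is called to be safe:

```python
import gc

events = []  # hypothetical log used only for this illustration


class Fragile:
    def __init__(self):
        # __new__ has already created the object when this raises
        raise ValueError("init failed")

    def __del__(self):
        events.append("__del__")


try:
    obj = Fragile()
except ValueError:
    events.append("caught")

gc.collect()  # not needed on CPython, harmless elsewhere
print(events)
```

The name obj is never bound here, which is the "undefined variable" risk from the second point.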

__init__ is optional

Without __init__, the interpreter simply has nothing to call. __new__ then naturally acts as a factory function:

class aa():

    def __new__(cls, num):
        obj =  object.__new__(cls)
        if num < 0:
            obj.num = 0
        else:
            obj.num = num
        return obj


a1 = aa(-123)
print(a1.num)
a2 = aa(123)
print(a2.num)

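__new__ and __init__ must accept the same arguments, because the interpreter passes the arguments of the constructor call to both.

What overriding __new__ is for

A passage found on Stack Overflow describes some scenarios for using __new__: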

New-style classes introduced a new class method __new__() that lets the class author customize how new class instances are created. By overriding __new__() a class author can implement patterns like the Singleton Pattern, return a previously created instance (e.g., from a free list), or to return an instance of a different class (e.g., a subclass). However, the use of __new__ has other important applications. For example, in the pickle module, __new__ is used to create instances when unserializing objects. In this case, instances are created, but the __init__ method is not invoked.

Another use of __new__ is to help with the subclassing of immutable types. By the nature of their immutability, these kinds of objects can not be initialized through a standard __init__() method. Instead, any kind of special initialization must be performed as the object is created; for instance, if the class wanted to modify the value being stored in the immutable object, the __new__ method can do this by passing the modified value to the base class __new__ method.
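
The immutable-type case can be sketched as follows. UpperStr is a hypothetical class made up for this illustration; the point is only that the stored value has to be adjusted in __new__, since it cannot be changed afterwards:

```python
class UpperStr(str):
    """Hypothetical str subclass that stores its text upper-cased."""

    def __new__(cls, text):
        # The value must be changed here: once str.__new__ returns,
        # the string content can no longer be modified.
        return super().__new__(cls, text.upper())


s = UpperStr("hello")
print(s)                    # HELLO
print(isinstance(s, str))   # True
```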

Singleton Pattern

A class has exactly one instance for the whole run of the program, like Python's built-in None.

The singleton pattern is one of the simplest design patterns. Sometimes we need only one instance of a class, for example a single DB connection shared by multiple objects, because creating a separate DB connection for every object may be costly. Similarly, an application can have a single configuration manager or error manager that handles all problems instead of creating multiple managers.

class xyz():

    meOnly = None

    def __new__(cls):
        if cls.meOnly:
            return cls.meOnly
        else:
            cls.meOnly = object.__new__(cls)
            return cls.meOnly

    def __init__(self):
        self.x = 1
        self.y = 2
        self.z = 3


x = xyz()
print(id(x))
y = xyz()
print(id(y))
print(x.x)
print(y.y)
if x is y:
    print('x is y')

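The code above produces the output below. Note that __init__ still runs on every xyz() call, so the shared instance is re-initialized each time: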

$ python3 singleton.py
35731976
35731976
1
2
x is y

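`__del__`

The statement del x only removes the name and decrements the object's reference count. When the count reaches 0 the object is reclaimed, and that is when __del__ is called.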

If __new__ and __init__ formed the constructor of the object, __del__ is the destructor. It can be quite useful for objects that might require extra cleanup upon deletion, like sockets or file objects.

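The interpreter does not guarantee that __del__ will be called. For example, if the object is still alive when the interpreter exits (say, via os._exit), some system resources may never be released. So be careful when overriding __del__. It is no substitute for good programming practice: release resources explicitly.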

Customizing Attribute Access

`__getattr__`,`__setattr__`&`__delattr__`

When accessing instance attributes...

class xyz():

    def __setattr__(self, name, value):
        print('__setattr__', name, value)
        self.__dict__[name] = value

    def __getattr__(self, name):
        print('__getattr__', name)
        try:
            rt = self.__dict__[name]
        except KeyError:
            raise AttributeError() from None
        return rt

    def __delattr__(self, name):
        print('__delattr__', name)
        self.__dict__.pop(name)


x = xyz()
x.a = 1
setattr(x, 'b', 2)
print(x.a, x.b)
del x.a
print(hasattr(x,'a'))  # False
print(getattr(x,'b'))  # 2
print(getattr(x,'a',111))  # 111

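- __setattr__ has to work through the object's __dict__ (or delegate to super().__setattr__); assigning with self.name = value inside it would call __setattr__ again and recurse forever.
- __getattr__ is called only when normal lookup fails, so the __dict__ lookup above always ends in AttributeError. Reading a missing attribute as self.name inside __getattr__ would call it again and recurse forever.
- Three commonly used built-ins: getattr, setattr and hasattr.
- getattr is the safer way to read an attribute, because it accepts a default to return when the attribute does not exist.
- hasattr catches AttributeError internally and returns False in that case.
- Accessing other methods does not trigger __getattr__.
- Accessing a class variable does not trigger __getattr__ either, since normal lookup finds it.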

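`__getattribute__`

When accessing any attribute...

Rarely needed. How it differs from __getattr__:

- Even reading self.__dict__ inside it recurses forever; you must go through super().
- Once __getattribute__ is defined, __getattr__ is called only if __getattribute__ calls it explicitly or raises AttributeError.
- Accessing class variables and other methods also goes through this interface.

Test code: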

class xyz():

    label = 'xyz'

    def __getattribute__(self, name):
        print('__getattribute__')
        return super().__getattribute__(name)

    def __getattr__(self, name):
        print('__getattr__', name)
        rt = self.__dict__[name]
        return rt

    def show(self):
        print('show')


x = xyz()
print(x.label)
x.show()

`__dir__`

Called by dir(object). It should return an iterable of attribute names; dir() converts it to a sorted list.
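
A small sketch of customizing dir(). Config and its attributes are hypothetical names chosen for illustration; a list is returned here, which is the most common choice:

```python
class Config:
    """Hypothetical class whose dir() lists only its public settings."""

    def __init__(self):
        self.port = 8080
        self.host = "localhost"
        self._secret = "hidden"

    def __dir__(self):
        # dir() sorts whatever names are returned here.
        return [name for name in self.__dict__ if not name.startswith("_")]


print(dir(Config()))  # ['host', 'port']
```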

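`__get__`,`__set__`,`__delete__`&`__set_name__`

When accessing class attributes...

This group of magic methods implements Python's descriptor protocol. What is a descriptor?

My understanding: in general, a descriptor identifies a resource, and the resource is operated on through the descriptor. Examples are the file descriptor on Linux and the corresponding Handle on Windows. A Python descriptor is a class variable whose magic methods are called when it is accessed through an instance.

Normally, when an attribute is assigned through an instance and the attribute does not exist yet, the interpreter creates it on that instance. But if the name belongs to a class variable that is a descriptor defining __set__, the interpreter does not create an instance attribute and calls the descriptor's magic method instead.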

https://docs.python.org/3/howto/descriptor.html

Descriptors are a powerful, general purpose protocol. They are the mechanism behind properties, methods, static methods, class methods, and super(). They are used throughout Python itself. Descriptors simplify the underlying C code and offer a flexible set of new tools for everyday Python programs.
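
A minimal sketch of the protocol, assuming a validation use case. Positive and Order are hypothetical classes invented for this example, not taken from the HowTo:

```python
class Positive:
    """Hypothetical data descriptor that only accepts positive numbers."""

    def __set_name__(self, owner, name):
        # Called once at class creation with the attribute name.
        self.private = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:          # accessed on the class itself
            return self
        return getattr(obj, self.private)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("must be positive")
        setattr(obj, self.private, value)


class Order:
    quantity = Positive()        # class variable holding the descriptor

    def __init__(self, quantity):
        self.quantity = quantity  # goes through Positive.__set__


o = Order(3)
print(o.quantity)    # 3
print(vars(o))       # {'_quantity': 3}
```

The instance never gets a quantity entry of its own; the value lives under the private name chosen in __set_name__.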

Representing Classes

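`__str__`&`__repr__`

These two magic methods are called by str() and repr() respectively. The string returned by the former is aimed at end users; the one returned by the latter is more technical and, where possible, can be used to recreate the object.

In the interpreter's interactive mode, typing an object name echoes its repr. If the class does not define __repr__, you get something like this: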

>>> class xyz: pass
... 
>>> x = xyz()
>>> x
<__main__.xyz object at 0x7fd2a0fd3880>  # default __repr__

print calls __str__ by default.

If you implement only one of the two, implement __repr__, because __repr__ is the fallback when __str__ is not defined.

class xyz:    
    def __repr__(self):
        return 'i am xyz'

Test:

>>> x = xyz()
>>> x
i am xyz
>>> str(x)
'i am xyz'
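
A repr that can recreate the object might look like the sketch below. Point is a hypothetical class used only for illustration, and eval is shown just to demonstrate the round trip; it should never be applied to untrusted strings:

```python
class Point:
    """Hypothetical class with a repr that can rebuild the object."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        return f"({self.x}, {self.y})"


p = Point(1, 2)
print(repr(p))     # Point(1, 2)
print(p)           # (1, 2)
q = eval(repr(p))  # rebuild from the repr; only safe for trusted strings
print(q.x, q.y)    # 1 2
```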

`__format__`

Called when a format spec {:spec} is applied in string formatting, i.e. by format(), str.format() and f-strings.

class xyz:

    def __str__(self):
        return '__str__'

    def __repr__(self):
        return '__repr__'

    def __format__(self, spec):
        print("in __format__")
        return spec


x = xyz()
print(f"{x:j}")
print(f"{x:&j}")
print("{:999}".format(x))
print("{:@@@}".format(x))
print(f"{x!s}")  # call __str__
print(f"{x!r}")  # call __repr__

`__len__`

Called by len(object).

`__bool__`

Called by bool(object).

In Python 2 this interface was named __nonzero__.
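
The two methods interact: when a class defines __len__ but not __bool__, truth testing falls back to __len__. A short sketch with hypothetical classes Basket and AlwaysOn:

```python
class Basket:
    """Hypothetical container: truthiness falls back to __len__."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class AlwaysOn(Basket):
    def __bool__(self):
        # __bool__ takes precedence over __len__ when both exist
        return True


print(bool(Basket([])))      # False
print(bool(Basket([1])))     # True
print(bool(AlwaysOn([])))    # True
```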

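`__sizeof__`

Called by sys.getsizeof(object). For objects tracked by the garbage collector, sys.getsizeof adds the size of the GC header to the returned value (16 bytes on a typical 64-bit CPython build) to produce the final result.

`__hash__`

Called by hash(object); this magic method should return an int. When the object is used as a dict key, the int returned by __hash__ selects the slot in the hash table, and __eq__ then confirms the match.

Note that this usually entails implementing __eq__ as well. Live by the following rule: a == b implies hash(a) == hash(b).

Instances of user-defined classes are hashable by default (based on identity), with no need to define __hash__. The exception: a class that defines __eq__ without __hash__ becomes unhashable.

As a rule of thumb, hashable built-in objects are immutable.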
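
The rule "a == b implies hash(a) == hash(b)" is easiest to keep by hashing exactly the data that __eq__ compares. A sketch with a hypothetical Color class:

```python
class Color:
    """Hypothetical value object: equal colors hash the same."""

    def __init__(self, r, g, b):
        self.rgb = (r, g, b)

    def __eq__(self, other):
        if not isinstance(other, Color):
            return NotImplemented
        return self.rgb == other.rgb

    def __hash__(self):
        # Hash the same data that __eq__ compares.
        return hash(self.rgb)


names = {Color(255, 0, 0): "red"}
print(names[Color(255, 0, 0)])                         # red
print(hash(Color(0, 0, 1)) == hash(Color(0, 0, 1)))    # True
```

The lookup succeeds with a different but equal Color object, which would not work with the default identity-based hash.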

`__index__`

Called when the object is used as an index, i.e. [object].

class xyz():

    def __init__(self):
        self.a = 1

    def __index__(self):
        return self.a


x = xyz()
a = [1,2,3]
print(a[x])

Context Manager

`__enter__`&`__exit__`

The with statement is Python's syntax for context management.

Context Managers are Python's resource managers. In most cases, we use files as resources (a simple resource). We often don't care about closing the files at the end of execution. This is a bad coding practice, and it causes issues when too many files are opened, or when the program terminates with a failure and the resource is not properly released. Context managers come to the rescue by managing resources automatically. In Python, the with keyword is used.

Common operations on file or mutex resources:

with open(filename) as f:
    f.read()
...
import threading
mutex = threading.Lock()
with mutex:
    ...

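The with statement guarantees that the resource is properly released once execution leaves the with block.

Defining __enter__ and __exit__

Implement __enter__ and __exit__ in your own class and it will support the with statement too.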

# a simple file writer object
class MessageWriter(object):
    def __init__(self, file_name):
        self.file_name = file_name

    def __enter__(self):
        self.file = open(self.file_name, 'w')
        return self.file

    def __exit__(self, *args): 
        self.file.close()


# using with statement with MessageWriter
with MessageWriter('my_file.txt') as f:
    f.write('hello world')

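The object is created first; the with statement then calls __enter__ and binds the returned object to f. Inside the with block the resource is used through f, and __exit__ is called when the block is left.

The parameters of __exit__ and exceptions in the with block

Besides self, __exit__ takes 3 parameters, and it can handle an exception raised in the with block directly.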

class Divide:
    def __init__(self, num1, num2):
        self.num1 = num1
        self.num2 = num2

    def __enter__(self):
        print("Inside __enter__")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        print("Inside __exit__")
        print("Exception type:", exc_type)
        print("Exception value:", exc_value)
        print("Traceback:", traceback)
        return True

    def do_divide(self):
        print(self.num1 / self.num2)


a = Divide(1,2)
with a as f:
    f.do_divide()

a = Divide(1,0)
with a as f:
    f.do_divide()
    raise ValueError

Output:

Inside __enter__
0.5
Inside __exit__
Exception type: None
Exception value: None
Traceback: None
Inside __enter__
Inside __exit__
Exception type: <class 'ZeroDivisionError'>
Exception value: division by zero
Traceback: <traceback object at 0x7f5e19a05740>

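The code above shows all the parameters of __exit__. When no exception is raised in the with block, all 3 are None; when one is raised, they carry its type, value and traceback. In the second run, do_divide raises ZeroDivisionError, so raise ValueError is never reached. We can handle the exception entirely inside __exit__ and return True. If __exit__ does not return a true value, the exception propagates out of the with block.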
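
Returning True unconditionally hides every error, so in practice __exit__ usually suppresses only the exceptions it knows how to handle. A sketch under that assumption (Suppress is a hypothetical class written for this example; the standard library offers contextlib.suppress for the same purpose):

```python
class Suppress:
    """Hypothetical context manager that swallows one exception type."""

    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None and issubclass(exc_type, self.exc_type):
            self.caught = exc_value
            return True      # handled here, do not propagate
        return False         # no exception, or one we do not handle


with Suppress(ZeroDivisionError) as s:
    1 / 0
print(type(s.caught).__name__)   # ZeroDivisionError

try:
    with Suppress(ZeroDivisionError):
        raise KeyError("k")
except KeyError:
    print("KeyError propagated")
```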

Entering multiple objects with a single with

>>> with open('tox.ini') as f, open('ping.py') as g:
...   f.read()
...   g.read()
...

with and generators

See: generator summary

Comparative Operators

__cmp__ was removed in Python 3: __cmp__ is gone due to redundancy with other magic methods.

`__eq__`&`__ne__`

class xyz:
    def __init__(self, num):
        self.num = num

    def __eq__(self, other):
        return self.num == other.num

    def __ne__(self, other):
        return self.num != other.num



a = xyz(1)
b = xyz(2)
c = xyz(1)

print(a == b)  # False
print(a == c)  # True
print(a != b)  # True
print(a != c)  # False

`__gt__`,`__ge__`,`__lt__`&`__le__`

class xyz:
    def __init__(self, num):
        self.num = num

    def __gt__(self, other):
        print('self id', id(self))
        return self.num > other.num

    def __ge__(self, other):
        print('self id', id(self))
        return self.num >= other.num


a = xyz(1)
b = xyz(2)
print(a > b)
print(a < b)

c = xyz(1)
print(a >= c)
print(a <= c)

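The code above defines only __gt__ and __ge__, but that is enough: when a<b is evaluated and a has no __lt__, the interpreter calls the reflected b.__gt__(a). The output is: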

self id 139894170461296
False
self id 139894170463648
True
self id 139894170461296
True
self id 139894170465520
True

@total_ordering

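Defining every magic method for equality and ordering is redundant. Python provides a decorator, total_ordering: implement __eq__ plus one of __lt__, __le__, __gt__ or __ge__, and the rest are derived.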

from functools import total_ordering


@total_ordering
class xyz:
    def __init__(self, num):
        self.num = num

    def __gt__(self, other):
        print('__gt__')
        return self.num > other.num

    def __eq__(self, other):
        print('__eq__')
        return self.num == other.num


a = xyz(1)
b = xyz(2)
c = xyz(3)
print(a < b)
print(a >= c)
print(a <= c)
print(a != b)
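
A class decorated this way also works with sorted(), which relies on __lt__. The sketch below uses a hypothetical Version class and returns NotImplemented for foreign types so that the interpreter can try the other operand:

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical class ordered by a (major, minor) tuple."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


versions = [Version(1, 10), Version(1, 2), Version(0, 9)]
print([v.key for v in sorted(versions)])   # [(0, 9), (1, 2), (1, 10)]
print(Version(1, 2) >= Version(1, 2))      # True, derived by total_ordering
```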

Conversions

`__pos__`&`__neg__`

class strnum(str):
    def __new__(cls, num):
        return str.__new__(cls, num)

    def __pos__(self):
        return '+'+self

    def __neg__(self):
        return '-'+self


a = strnum(12345)
print(+a)  # +12345
print(-a)  # -12345

import math


class xyz():

    def __abs__(self):
        return 1

    def __round__(self, n=0):
        return 2

    def __floor__(self):
        return 3

    def __ceil__(self):
        return 4

    def __trunc__(self):
        return 5


x = xyz()
print(abs(x))
print(round(x))
print(math.floor(x))
print(math.ceil(x))
print(math.trunc(x))

`__int__`,`__float__`&`__complex__`

class xyz():

    def __int__(self):
        return 9

    def __float__(self):
        return 9.9

    def __complex__(self):
        return 1+2j

    def __index__(self):
        # Python 3 has no __hex__; hex(x), oct(x) and bin(x) use __index__
        return 255


x = xyz()
print(int(x))
print(float(x))
print(complex(x))

Bitwise Operators

class xyz():

    def __invert__(self):
        return 1

    def __and__(self, other):
        return 2

    def __or__(self, other):
        return 3

    def __xor__(self, other):
        return 4

    def __lshift__(self, other):
        return 5

    def __rshift__(self, other):
        return 6


x = xyz()
y = xyz()
print(~x)
print(x & y)
print(x | y)
print(x ^ y)
print(x << 2)
print(x >> 2)

Reflected Bitwise Operators

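Reflected means the operands are swapped: the normal form is object op other, the reflected form is other op object. The reflected method is tried when the left operand does not support the operation.

`__rand__`,`__ror__`&`__rxor__`

`__rlshift__`&`__rrshift__`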

class xyz():

    def __rand__(self, other):
        return 2

    def __ror__(self, other):
        return 3

    def __rxor__(self, other):
        return 4

    def __rlshift__(self, other):
        return 5

    def __rrshift__(self, other):
        return 6


x = xyz()
y = xyz()
print(1 & y)
print(1 | y)
print(1 ^ y)
print(100 << x)
print(100 >> x)

Augmented Assignment Bitwise Operators

`__iand__`,`__ior__`&`__ixor__`

`__ilshift__`&`__irshift__`

class xyz():

    def __iand__(self, other):
        return 2

    def __ior__(self, other):
        return 3

    def __ixor__(self, other):
        return 4

    def __ilshift__(self, other):
        return 5

    def __irshift__(self, other):
        return 6


x = xyz(); x &= 1; print(x)
x = xyz(); x |= 1; print(x)
x = xyz(); x ^= 1; print(x)
x = xyz(); x <<= 1; print(x)
x = xyz(); x >>= 1; print(x)

Arithmetic Operators

class xyz():

    def __add__(self, other):
        return 1

    def __sub__(self, other):
        return 2

    def __mul__(self, other):
        return 3

    def __floordiv__(self, other):
        return 4

    def __truediv__(self, other):
        return 5

    def __mod__(self, other):
        return 6

    def __pow__(self, other):
        return 7

    def __divmod__(self, other):
        return 6,8

    def __matmul__(self, other):
        return 9


x = xyz()
print(x + 1)
print(x - 1)
print(x * 1)
print(x // 1)
print(x / 1)
print(x % 4)
print(x**2)
print(divmod(x,2))
print(x @ 7)

Reflected Arithmetic Operators

class xyz():

    def __radd__(self, other):
        return 1

    def __rsub__(self, other):
        return 2

    def __rmul__(self, other):
        return 3

    def __rfloordiv__(self, other):
        return 4

    def __rtruediv__(self, other):
        return 5

    def __rmod__(self, other):
        return 6

    def __rpow__(self, other):
        return 7

    def __rdivmod__(self, other):
        return 6,8

    def __rmatmul__(self, other):
        return 9


x = xyz()
print(1 + x)
print(1 - x)
print(1 * x)
print(8 // x)
print(8 / x)
print(8 % x)
print(8**x)
print(divmod(8,x))
print(7 @ x)

Augmented Assignment Arithmetic Operators

class xyz():

    def __iadd__(self, other):
        return 1

    def __isub__(self, other):
        return 2

    def __imul__(self, other):
        return 3

    def __ifloordiv__(self, other):
        return 4

    def __itruediv__(self, other):
        return 5

    def __imod__(self, other):
        return 6

    def __ipow__(self, other):
        return 7

    def __imatmul__(self, other):
        return 9


x = xyz(); x += 1; print(x)
x = xyz(); x -= 1; print(x)
x = xyz(); x *= 1; print(x)
x = xyz(); x //= 1; print(x)
x = xyz(); x /= 1; print(x)
x = xyz(); x **= 1; print(x)
x = xyz(); x %= 1; print(x)
x = xyz(); x @= 7; print(x)

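There seems to be a rule in Python: when an augmented operator is applied to a mutable object, no new object is created and the object is modified in place. More precisely, this is a convention: mutable types implement __iadd__ and friends to modify self and return it, and when the method is missing, x += y falls back to x = x + y.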
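
The difference is easy to observe by keeping a second reference to the object. In this sketch Bag is a hypothetical mutable class, compared against the built-in immutable tuple:

```python
class Bag:
    """Hypothetical mutable container with an in-place +=."""

    def __init__(self):
        self.items = []

    def __iadd__(self, other):
        self.items.append(other)
        return self          # returning self keeps the same object bound


b = Bag()
alias = b
b += "apple"
print(b is alias, alias.items)   # True ['apple']

t = (1, 2)
alias_t = t
t += (3,)                        # tuple has no __iadd__: falls back to __add__
print(t is alias_t, t)           # False (1, 2, 3)
```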

Callable Object

`__call__`

class xyz():

    def __call__(self, a, b, c):
        return a+b+c


x = xyz()
print(x(1,2,3))

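Functions are first-class citizens in Python!

This means a function, like any other object, can be assigned to a variable and passed as an argument. Function objects defined in Python come with a __call__ attribute, which is why the interpreter can call them directly.

Functional programming

Take map: its first argument is a function, which is applied to each item of the second argument. Functions like map are called higher-order functions. This is one case of functional programming.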
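
Since map only needs something callable, an object with __call__ works just as well as a plain function. A small sketch, where Scale is a hypothetical class introduced for illustration:

```python
class Scale:
    """Hypothetical callable object that multiplies by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


double = Scale(2)
print(callable(double))              # True
print(list(map(double, [1, 2, 3])))  # [2, 4, 6]
print(list(map(len, ["a", "bcd"])))  # [1, 3]
```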

Making Sequence

`__getitem__`,`__setitem__`&`__delitem__`

a[i] = v # set
a[i] # get
del a[i] # del

`__contains__`

Called when in or not in is used.

class xyz():

    def __init__(self):
        self.a = 1
        self.b = 2
        self.c = 3
        self.d = 4
        self._map = {0:'a',1:'b',2:'c',3:'d'}

    def __len__(self):
        return len(self._map)

    def __getitem__(self, idx):
        if idx not in self._map:
            raise IndexError('Out of Index')
        # getattr/setattr replace eval/exec on a built-up string
        return getattr(self, self._map[idx])

    def __setitem__(self, idx, val):
        setattr(self, self._map[idx], val)

    def __delitem__(self, idx):
        """ set to zero """
        setattr(self, self._map[idx], 0)

    def __contains__(self, val):
        return val in (self.a, self.b, self.c, self.d)


x = xyz()
print(len(x))
print(x[2])
x[2] = 999
print(x[2])
del x[2]
print(x[2])
print(1 in x)
print(999 not in x)

Iterator

class joy:

    def __init__(self, max):
        self.max = max

    def __iter__(self):

        class joy_iterator:

            def __init__(it):
                # function's local class,
                # self can be accessed directly.
                it.num = [i for i in range(self.max)]
                it.len = len(it.num)
                it.idx = 0

            def __next__(it):
                if it.idx >= it.len:
                    raise StopIteration()
                rtv = it.num[it.idx]
                it.idx += 1
                return rtv

        return joy_iterator()


a = joy(2)
for i in a:
    print(i)

b = joy(4)
for i in b:
    print(i)

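The defining trait of an iterator: it can be traversed only once!

An iterable, on the other hand, can be traversed repeatedly; each traversal makes the interpreter obtain a fresh iterator created from the iterable.

The example above is simpler to implement with a generator.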
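
A generator-based version might look like the sketch below. The class is renamed Joy and its parameter limit here (both choices are mine, to avoid shadowing the built-in max); writing __iter__ as a generator function removes the need for a separate iterator class:

```python
class Joy:
    """Generator-based take on the iterable example."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # A generator function returns a fresh iterator on every call.
        for i in range(self.limit):
            yield i


a = Joy(3)
print(list(a))   # [0, 1, 2]
print(list(a))   # [0, 1, 2]  the iterable can be traversed again

it = iter(a)
print(list(it))  # [0, 1, 2]
print(list(it))  # []  the iterator is exhausted after one pass
```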

Permalink: https://cs.pynote.net/sf/python/202304291/

-- EOF --
