Python 3.10: new dataclass features

Dataclasses is a feature of the Python programming language which I designed and implemented in Python 3.7. It allows you to easily add a number of useful features, such as generated __init__, __repr__, and __eq__ methods, to any Python class that you write.

Python 3.10 beta 1 was released last week. I've added two significant features to dataclasses in this release: support for __slots__ and support for keyword-only __init__ parameters.

__slots__

I've long resisted adding support for __slots__ to dataclasses. The reason is simple: dataclasses has always worked under the model of “it's just adding methods to a normal class”. There are no metaclasses involved, no required inheritance or base classes, and no Abstract Base Classes (ABCs). If your class wants to use these features, then you can do so freely, without interference from dataclasses.

But that model breaks down with __slots__. __slots__ must be set at class creation time, and the @dataclass decorator only gets called after the class has already been created. So adding __slots__ requires that a new class be created once @dataclass has computed the __slots__ value.

So that's what I did in Python 3.10. In the case where slots=True is specified, and only in that case, @dataclass will create a new class with the computed __slots__ and return that to the caller. This isn't exactly new functionality. My dataclasses backport for Python 3.6 has a helper decorator called @add_slots that does the same thing. And attrs has supported it for a long time.

While this does go against dataclasses' initial design principle, I feel that this feature is sufficiently useful that it's worth bending the rules in this one specific case. Adding support for __slots__ was one of the most commonly requested features for dataclasses.

keyword-only __init__ parameters

Another commonly requested feature for dataclasses is to support keyword-only __init__ parameters. There are two reasons people request this feature:

  • When a dataclass has many fields, specifying them by position can become unreadable. It also requires that for backward compatibility, all new fields are added to the end of the dataclass. This isn't always desirable.
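  • When a dataclass inherits from another dataclass, and the base class has fields with default values, then all of the fields in the derived class must also have defaults. Keyword-only fields are exempt from this rule, since a keyword-only parameter without a default may follow parameters that have defaults.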

There are three ways to mark fields as keyword-only __init__ parameters. The simplest way is to specify kw_only=True to @dataclass:

from dataclasses import dataclass

@dataclass(kw_only=True)
class Point1:
    x: int
    y: int

In this example, both the x and y parameters to the generated __init__ are keyword-only. The generated __init__ is:

def __init__(self, *, x:int, y:int):

Another way is to pass kw_only=True to the field() call used to initialize a field:

from dataclasses import dataclass, field

@dataclass
class Point2:
    x: int
    y: int = field(kw_only=True)

In this example, the generated __init__ is:

def __init__(self, x:int, *, y:int):

And the third way is to use the special marker type KW_ONLY. This is a module-level value in the dataclasses module. Any fields appearing after a field of type KW_ONLY will be keyword-only in the generated __init__.

from dataclasses import KW_ONLY, dataclass

@dataclass
class Point3:
    x: int
    _: KW_ONLY
    y: int

In this case, the generated __init__ looks the same as for Point2.

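There are a number of issues related to parameter re-ordering when using keyword-only parameters: in the generated __init__, keyword-only parameters are moved after all of the regular parameters, while the order of the fields themselves is unchanged. For a detailed discussion, see the official dataclasses documentation for Python 3.10.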