
Centralize all de_json's into TelegramObject #5189

@harshil21


What kind of feature are you missing? Where do you notice a shortcoming of PTB?

Something I've wanted to do for a while is reduce duplication of de_json throughout the library. As of today, de_json is typically doing 4 things:

  • Converting unix timestamps to datetimes
  • Converting dict -> TelegramObject's obviously
  • Handling Sequence inputs with TO.de_list
  • Deciding which class to create based on a field like type (e.g. ChatMember classes and the like). I'll refer to these classes as de_json delegators.

Describe the solution you'd like

I propose handling de_json by doing type introspection of classes. With that, one can know what kind of transformation (e.g. a lambda) to apply to each field.

Type introspection, however, is expensive to perform on every de_json call. Combined with per-class caching, though, this approach should have ~no runtime cost compared to what we have today.
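To make the introspection idea concrete, here is a minimal sketch of how a converter could be derived from a field's annotation. It is not taken from the PoC: `User`, `Message`, `converter_for` and `build_plan` are hypothetical names, and dataclasses stand in for TelegramObject subclasses purely to keep the example short.

```python
import dataclasses
import datetime as dt
import typing
from typing import Optional, Union


@dataclasses.dataclass
class User:  # hypothetical stand-in for a TelegramObject subclass
    id: int


@dataclasses.dataclass
class Message:  # hypothetical stand-in as well
    date: dt.datetime
    from_user: Optional[User] = None
    text: Optional[str] = None


def converter_for(annotation):
    """Return a converter for one annotation, or None if the raw value is fine."""
    # Unwrap Optional[X] into X.
    if typing.get_origin(annotation) is Union:
        args = [a for a in typing.get_args(annotation) if a is not type(None)]
        if len(args) == 1:
            annotation = args[0]
    if annotation is dt.datetime:
        # Unix timestamp -> timezone-aware datetime (UTC assumed here).
        return lambda ts: dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc)
    if dataclasses.is_dataclass(annotation):
        # Nested dict -> nested object.
        return lambda d: annotation(**d)
    return None


def build_plan(cls):
    """Map field name -> converter, keeping only fields that need one."""
    plan = {}
    for name, annotation in typing.get_type_hints(cls).items():
        conv = converter_for(annotation)
        if conv is not None:
            plan[name] = conv
    return plan
```

`build_plan(Message)` only contains entries for `date` and `from_user`, since `text` needs no conversion. Building this mapping is the expensive part, which is why it would be done once per class rather than on every call.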

Implementation details

The primary cache

The basic idea is to introduce a class variable, __DE_JSON_PLAN__, which is going to be the per-class dict cache of field names -> transformations. This will be populated on the first call to de_json for that particular class. de_json will then simply iterate through the data and apply the transformation if it's needed.
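A rough sketch of the lazily populated per-class cache could look like the following. `Base`, `Chat`, `Message` and `_plan` are hypothetical, and reading the hints from `__init__` is an assumption of this sketch rather than something the PoC is known to do.

```python
import datetime as dt
import typing


def _from_timestamp(ts):
    # Unix timestamp -> timezone-aware datetime (UTC assumed here).
    return dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc)


class Base:
    """Hypothetical stand-in for TelegramObject."""

    @classmethod
    def _plan(cls):
        # Look in cls.__dict__ so a subclass never reuses its parent's plan.
        plan = cls.__dict__.get("__DE_JSON_PLAN__")
        if plan is None:
            hints = typing.get_type_hints(cls.__init__)
            hints.pop("return", None)
            plan = {}
            for name, tp in hints.items():
                if tp is dt.datetime:
                    plan[name] = _from_timestamp
                elif isinstance(tp, type) and issubclass(tp, Base):
                    plan[name] = tp.de_json
            cls.__DE_JSON_PLAN__ = plan  # populated once, on first use
        return plan

    @classmethod
    def de_json(cls, data):
        plan = cls._plan()
        # Apply a transformation only where the plan has one.
        kwargs = {k: plan[k](v) if k in plan else v for k, v in data.items()}
        return cls(**kwargs)


class Chat(Base):
    def __init__(self, id: int):
        self.id = id


class Message(Base):
    def __init__(self, message_id: int, date: dt.datetime, chat: Chat):
        self.message_id = message_id
        self.date = date
        self.chat = chat
```

After the first `Message.de_json(...)` call, both `Message` and `Chat` carry their own `__DE_JSON_PLAN__`, so later calls skip introspection entirely.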

Handling de_json delegators

For these classes, we'll still have to manually define a mapping as another class variable (__DE_JSON_DISPATCH__). This tells de_json which field name to delegate on, e.g. status or type, and provides the mapping of that value -> class. On every de_json call, this mapping is used to decide which class is going to be instantiated.
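As an illustration only, a delegator could be wired up roughly like this. The `(field, mapping)` tuple shape is an assumption of this sketch, and the classes are simplified stand-ins, not the real ChatMember hierarchy.

```python
class ChatMember:
    """Simplified stand-in; the base class acts as the delegator."""

    # (field name, {field value: class}); filled in below, once the
    # subclasses exist. The tuple shape is an assumption of this sketch.
    __DE_JSON_DISPATCH__ = None

    def __init__(self, status, user_id):
        self.status = status
        self.user_id = user_id

    @classmethod
    def de_json(cls, data):
        # Only a class that defines its own dispatch table delegates,
        # so subclasses do not bounce back to the base class.
        dispatch = cls.__dict__.get("__DE_JSON_DISPATCH__")
        if dispatch is not None:
            field, mapping = dispatch
            target = mapping.get(data.get(field), cls)
            if target is not cls:
                return target.de_json(data)
        return cls(**data)


class ChatMemberOwner(ChatMember):
    def __init__(self, status, user_id, is_anonymous=False):
        super().__init__(status, user_id)
        self.is_anonymous = is_anonymous


class ChatMemberMember(ChatMember):
    pass


ChatMember.__DE_JSON_DISPATCH__ = (
    "status",
    {"creator": ChatMemberOwner, "member": ChatMemberMember},
)
```

Calling `ChatMember.de_json({"status": "creator", "user_id": 1})` yields a `ChatMemberOwner`, while an unknown status falls back to the base class in this sketch.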

Supporting deprecated fields from the API

de_json currently also removes a few deprecated fields like can_send_media_messages and places them into api_kwargs. With this implementation, a subclass can simply define another class variable, __REMOVED_API_FIELDS__, which will hold that information. de_json will move the data to api_kwargs.
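A small sketch of how this could work, assuming `__REMOVED_API_FIELDS__` is a set of field names. `Permissions` is a hypothetical class used only for illustration.

```python
class Permissions:
    """Hypothetical class that once had a can_send_media_messages field."""

    # Assumed shape: a set of field names that are no longer attributes.
    __REMOVED_API_FIELDS__ = frozenset({"can_send_media_messages"})

    def __init__(self, can_send_messages=None, *, api_kwargs=None):
        self.can_send_messages = can_send_messages
        self.api_kwargs = dict(api_kwargs or {})

    @classmethod
    def de_json(cls, data):
        data = dict(data)  # copy, so the caller's dict is left untouched
        # Move removed fields out of the constructor arguments.
        api_kwargs = {
            name: data.pop(name)
            for name in cls.__REMOVED_API_FIELDS__
            if name in data
        }
        return cls(**data, api_kwargs=api_kwargs)
```

With this, `Permissions.de_json({"can_send_messages": True, "can_send_media_messages": False})` keeps the removed field available under `api_kwargs` without passing it to `__init__`.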

Describe alternatives you've considered

Using another (de)serialization library, as mentioned in #2698. Adding another third-party dependency, however, is not very appealing, especially in this day and age.

Additional context

I have a draft PoC up at #5186. Initial benchmarks show a performance change ranging from -15% to +20%. See that PR's description for details on the benchmark.

Any thoughts / alternative approaches are welcome!
