Centralize all de_json's into TelegramObject #5189
Description
What kind of feature are you missing? Where do you notice a shortcoming of PTB?
Something I've wanted to do for a while is reduce duplication of de_json throughout the library. As of today, de_json is typically doing 4 things:
- Converting unix timestamps to datetimes
- Converting dict -> TelegramObject's obviously
- Handling Sequence inputs with TO.de_list
- Deciding which class to create depending on a field like type (e.g. the ChatMember classes and the like). I'll hereby refer to these kinds as de_json delegators.
Describe the solution you'd like
I propose handling de_json by doing type introspection of classes. With that, one can know what kind of transformation (i.e. lambdas) to apply for every field.
Type introspection, however, is an expensive operation to run on every de_json call. But if you combine it with per-class caching, there would be ~no runtime cost to this approach compared to what we have today.
Implementation details
The primary cache
The basic idea is to introduce a class variable, __DE_JSON_PLAN__, which is going to be the per-class dict cache of field names -> transformations. This will be populated on the first call to de_json for that particular class. de_json will then simply iterate through the data and apply the transformation if it's needed.
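To make the idea concrete, here is a rough sketch of how the plan could be built and cached. It is not the code from the PoC: the TelegramObject, User and Message classes below are toy stand-ins, _transform_for is a hypothetical helper, and reading type hints from `__init__` is just one assumed way of doing the introspection.

```python
import datetime as dt
from collections.abc import Sequence
from typing import Any, Callable, Optional, get_args, get_origin, get_type_hints


def _transform_for(hint: Any) -> Optional[Callable[[Any], Any]]:
    """Hypothetical helper: converter for one annotation, or None to pass through."""
    args = get_args(hint)
    if type(None) in args:  # Optional[X]: plan for X, None values are skipped later
        inner = [a for a in args if a is not type(None)]
        return _transform_for(inner[0]) if len(inner) == 1 else None
    origin = get_origin(hint)
    if origin is not None:  # a generic alias such as Sequence[X]
        if isinstance(origin, type) and issubclass(origin, Sequence) and args:
            item = _transform_for(args[0])
            if item is not None:
                return lambda seq: tuple(item(v) for v in seq)
        return None
    if hint is dt.datetime:  # unix timestamp -> timezone-aware datetime
        return lambda ts: dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc)
    if isinstance(hint, type) and issubclass(hint, TelegramObject):
        return hint.de_json  # nested dict -> nested object
    return None


class TelegramObject:
    """Toy stand-in for the real base class, only to show the caching idea."""

    @classmethod
    def de_json(cls, data: dict[str, Any]) -> "TelegramObject":
        # Read cls.__dict__ directly so a subclass never reuses its parent's plan.
        plan = cls.__dict__.get("__DE_JSON_PLAN__")
        if plan is None:  # first call for this class: introspect once, then cache
            plan = {}
            for name, hint in get_type_hints(cls.__init__).items():
                transform = _transform_for(hint)
                if name != "return" and transform is not None:
                    plan[name] = transform
            cls.__DE_JSON_PLAN__ = plan
        kwargs = {
            k: plan[k](v) if k in plan and v is not None else v
            for k, v in data.items()
        }
        return cls(**kwargs)


class User(TelegramObject):
    def __init__(self, id: int, first_name: str):
        self.id, self.first_name = id, first_name


class Message(TelegramObject):
    def __init__(self, message_id: int, date: dt.datetime,
                 via_bot: Optional[User] = None,
                 new_chat_members: Sequence[User] = ()):
        self.message_id, self.date = message_id, date
        self.via_bot, self.new_chat_members = via_bot, new_chat_members


msg = Message.de_json({
    "message_id": 7,
    "date": 0,
    "via_bot": {"id": 1, "first_name": "Bot"},
    "new_chat_members": [{"id": 2, "first_name": "Ann"}],
})
```

Only fields that need a conversion end up in the plan, so plain int/str fields are passed through untouched and the per-call work is a dict lookup per key.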
Handling de_json delegators
For these classes, we'll still have to manually define a mapping as another class variable (__DE_JSON_DISPATCH__). This would be telling de_json what field name to delegate on, e.g. status or type, and the mapping of that value -> class. On every de_json call, this mapping is used to decide which class is going to be instantiated.
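A minimal sketch of what the dispatch could look like. The (field name, mapping) tuple is an assumed shape for __DE_JSON_DISPATCH__, and the classes are simplified stand-ins rather than the real ChatMember hierarchy.

```python
from typing import Any


class ChatMember:
    """Toy stand-in for a delegating base class."""

    # Assumed shape: (field to inspect, {field value: class to instantiate}).
    # Filled in below, once the subclasses exist.
    __DE_JSON_DISPATCH__ = None

    def __init__(self, status: str):
        self.status = status

    @classmethod
    def de_json(cls, data: dict[str, Any]) -> "ChatMember":
        # Only the class that defines the mapping delegates; calling de_json
        # on a concrete subclass builds that subclass directly.
        dispatch = cls.__dict__.get("__DE_JSON_DISPATCH__")
        target = cls
        if dispatch is not None:
            field, mapping = dispatch
            # Unknown values fall back to the base class.
            target = mapping.get(data.get(field), cls)
        return target(**data)


class ChatMemberOwner(ChatMember):
    def __init__(self, status: str, is_anonymous: bool):
        super().__init__(status)
        self.is_anonymous = is_anonymous


class ChatMemberLeft(ChatMember):
    pass


ChatMember.__DE_JSON_DISPATCH__ = (
    "status",
    {"creator": ChatMemberOwner, "left": ChatMemberLeft},
)

owner = ChatMember.de_json({"status": "creator", "is_anonymous": False})
left = ChatMember.de_json({"status": "left"})
```

In the full proposal the chosen class would then run its own cached plan; that step is left out here to keep the dispatch visible.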
Supporting deprecated fields from the API
de_json currently also removes a few deprecated fields like can_send_media_messages and places them into api_kwargs. With this implementation, a subclass can simply define another class variable, __REMOVED_API_FIELDS__, which will hold that information. de_json will move the data to api_kwargs.
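Roughly, that could look like the sketch below. ChatPermissions here is a cut-down toy class, and using a tuple of names for __REMOVED_API_FIELDS__ is an assumption about how the variable would be laid out.

```python
from typing import Any, Optional


class ChatPermissions:
    """Toy stand-in with a single regular field."""

    # Fields the API may still send but that the class no longer accepts.
    __REMOVED_API_FIELDS__ = ("can_send_media_messages",)

    def __init__(self, can_send_messages: Optional[bool] = None, *,
                 api_kwargs: Optional[dict[str, Any]] = None):
        self.can_send_messages = can_send_messages
        self.api_kwargs = dict(api_kwargs or {})

    @classmethod
    def de_json(cls, data: dict[str, Any]) -> "ChatPermissions":
        kwargs = dict(data)  # copy, so the caller's dict is not mutated
        # Move removed fields out of the constructor arguments into api_kwargs.
        api_kwargs = {
            name: kwargs.pop(name)
            for name in cls.__REMOVED_API_FIELDS__
            if name in kwargs
        }
        return cls(**kwargs, api_kwargs=api_kwargs)


perms = ChatPermissions.de_json(
    {"can_send_messages": True, "can_send_media_messages": False}
)
```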
Describe alternatives you've considered
Using another (de)serialization library, as mentioned in #2698. Adding another third-party dependency, however, is not very appealing, especially in this day and age.
Additional context
I have a draft PoC up at #5186. Initial benchmarks show a performance change ranging from -15% to +20%. See that PR's description for details on the benchmark.
Any thoughts / alternative approaches are welcome!