Why is compressed JSON usually smaller than compressed Msgpack? #328

@CNSeniorious000

I benchmarked JSON and Msgpack on some real data. Although raw Msgpack is smaller than raw JSON, after compression the Msgpack-encoded data is always larger than the JSON-encoded data. I've tested brotli, lzma, and blosc in Python.

This is a common use case. Here is a reproducible example:

>>> a = { ... }  # the embeddings API response from OpenAI

>>> len(msgpack.encode(a))
13951

>>> len(json.encode(a))
19506

>>> len(compress(msgpack.encode(a)))
9620

>>> len(compress(json.encode(a)))
6409

I wonder why this happens, and I am thinking that maybe it is not worth using Msgpack for web responses (because almost every browser supports compression nowadays)? No offence, I was a big fan of Msgpack and used to use it everywhere.


I found that this was already discussed in #203, but I've also tested Msgpack on string data (like OpenAI's chat completion response), and compressed JSON is still a bit smaller. I am confused. Isn't length-prefixed data better than delimiter-separated data?
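
For anyone who wants to poke at the float-heavy case without the real payload, here is a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the data is a synthetic list of 1536 small floats rounded to 9 decimals (not an actual OpenAI response), `pack_floats_like_msgpack` is a hypothetical hand-rolled helper that follows the MessagePack spec layout for `array 16` and `float 64` rather than calling a real encoder, and `zlib` stands in for the compressors named above.

```python
import json
import random
import struct
import zlib


def pack_floats_like_msgpack(values):
    """Hypothetical stand-in for a MessagePack encoder (flat float list only).

    Layout per the MessagePack spec: an array 16 header (0xdc + uint16 length),
    then each value as float 64 (0xcb + 8 big-endian IEEE 754 bytes).
    """
    out = bytearray(b"\xdc" + struct.pack(">H", len(values)))
    for v in values:
        out += b"\xcb" + struct.pack(">d", v)
    return bytes(out)


def sizes(values):
    """Return raw and zlib-compressed sizes for both encodings."""
    as_json = json.dumps(values, separators=(",", ":")).encode("utf-8")
    as_binary = pack_floats_like_msgpack(values)
    return {
        "json_raw": len(as_json),
        "binary_raw": len(as_binary),
        "json_zlib": len(zlib.compress(as_json, 9)),
        "binary_zlib": len(zlib.compress(as_binary, 9)),
    }


# Assumption: synthetic data shaped roughly like an embedding vector.
rng = random.Random(0)
embedding = [round(rng.uniform(-0.05, 0.05), 9) for _ in range(1536)]

result = sizes(embedding)
print(result)
```

On data like this the binary form starts out smaller but ends up larger after compression, which matches the pattern in the numbers above. One possible factor is that the 8 payload bytes of each double look close to random to a byte-oriented compressor, while the decimal text draws from only about a dozen distinct characters. This sketch says nothing about the string-heavy case.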
