[NO-REVIEW] Use Zstandard for compressing singlefile assemblies#123542
rzikm wants to merge 2 commits into dotnet:main
Tagging subscribers to this area: @agocke, @elinor-fung
@VSadov, @vitek-karas, @janvorli do you know what would be the easiest way to test/benchmark this kind of change?
For benchmarking, you may want to pick an app with more managed code, such as a self-contained ASP.NET web API. We want to measure the binary size and the startup time (make the app exit immediately, then measure how long the process took to run).
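The startup measurement suggested above (run an app that exits immediately and time the whole process) could be sketched roughly as follows. This is a minimal sketch, not the harness used in this PR: the function name `time_process_runs` and the placeholder command are assumptions for illustration.

```python
import subprocess
import sys
import time


def time_process_runs(command, runs=10):
    """Run `command` `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises on a non-zero exit code, so a crashing app
        # is not mistaken for a fast startup.
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the path to the
    # published single-file app that exits right after startup.
    samples = time_process_runs([sys.executable, "-c", "pass"], runs=5)
    print(f"min={min(samples):.4f}s max={max(samples):.4f}s")
```

Collecting every sample, rather than only an average, keeps the raw distribution available, which matters if startup times turn out to be noisy.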
Pull request overview
This is an experimental pull request that replaces DEFLATE compression with Zstandard compression for bundled managed assemblies in single-file published applications. The PR is marked as "[NO-REVIEW]" indicating it's a prototype or work-in-progress.
Changes:
- Native decompression code updated from zlib to Zstandard in both bundle extraction and PE image layout modules
- Managed compression code updated to use ZstandardStream (for .NET) instead of DeflateStream
- Build system updated to reference Zstandard headers instead of zlib headers
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/native/corehost/bundle/extractor.h | Added zstd.h include and ZSTD_DCtx member for reusable decompression context with proper cleanup in destructor |
| src/native/corehost/bundle/extractor.cpp | Replaced zlib decompression logic with Zstandard streaming decompression, removed pal_zlib.h include |
| src/installer/managed/Microsoft.NET.HostModel/Microsoft.NET.HostModel.csproj | Added reference to System.IO.Compression.Zstandard for .NET builds |
| src/installer/managed/Microsoft.NET.HostModel/Bundle/Bundler.cs | Updated compression logic to use ZstandardStream with SmallestSize compression level for .NET builds, keeping DeflateStream for .NET Framework |
| src/coreclr/vm/peimagelayout.cpp | Replaced zlib decompression with Zstandard single-shot decompression for compressed PE images |
| src/coreclr/vm/CMakeLists.txt | Updated include directory from System.IO.Compression.Native to external/zstd/lib |
I finally managed to measure some numbers; please take a look at the data in the top post. Is the difference enough for us to consider taking the change? We can also explore LZMA compression once it gets implemented later in the release cycle.
I think the interesting part is the tail end |
The bundler is using Deflate with CompressionLevel.SmallestSize, so I think it already is using the highest setting. |
This is an experiment that replaces DEFLATE with Zstandard in single-file published applications for compression of bundled managed assemblies.
Edit: Figuring out how to test this was quite complicated, so I am documenting the steps below for future reference.
Resulting csproj file:
NuGet.config
The test application was a simple ASP.NET hello world with the weather API that exits immediately after startup.
Startup speed impact
I temporarily added extra logging to the apphost to verify that at least some images are decompressed; there appear to have been about 20 of them. Startup times had a distinctly multimodal distribution (probably due to filesystem caching); the times below show the lowest mode:
I would expect similar differences for the higher modes. I could not measure any difference across the compression level options, so I did not include more rows in the table above.
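One way to pick out the lowest mode from multimodal timing samples is to sort them and cut at the first large gap. The sketch below is an assumption about how this could be done, not the method used for the numbers in this PR; the function name `lowest_mode` and the 20% gap threshold are hypothetical.

```python
import statistics


def lowest_mode(samples, relative_gap=0.2):
    """Return the median of the lowest cluster of `samples`.

    Samples are sorted, and the lowest cluster ends at the first point where
    a value exceeds its predecessor by more than relative_gap * predecessor.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    cluster = [ordered[0]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > relative_gap * previous:
            break  # first large gap: everything after belongs to higher modes
        cluster.append(current)
    return statistics.median(cluster)
```

The threshold would need tuning to the actual data; a histogram of the samples is a useful sanity check before trusting the cut.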
Build time and binary size impact:
To measure build times and actual binary sizes, I timed the execution of `dotnet publish ...` after deleting the `bin` and `obj` folders (keeping the nuget_cache intact).

Configurations measured, in table order: Compression disabled, Zstandard (Quality=1) through Zstandard (Quality=5), Deflate, Zstandard (Quality=6) through Zstandard (Quality=22).

It looks like Quality values between 5 and 11 start to give gains in both startup time and binary size, but the gains are rather small. We can squeeze out a bit more by training a dictionary on the usual DLL files (runtime + aspnet?), but this shaves off only about 1 MB more at most, so it is likely not worth the extra logistics of embedding the decompression dictionary in the single-file exe.
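The build-time and size measurement described above (delete `bin` and `obj`, then time the publish) could be scripted along these lines. This is a sketch under assumptions: the helper names `clean_and_time_build` and `file_size_mb` are hypothetical, and the publish command is passed in as a parameter because its exact arguments are not given here.

```python
import shutil
import subprocess
import time
from pathlib import Path


def clean_and_time_build(project_dir, build_command):
    """Delete bin/ and obj/ under project_dir, run build_command there,
    and return the elapsed wall-clock time in seconds."""
    project = Path(project_dir)
    for folder in ("bin", "obj"):
        # ignore_errors=True so a missing folder is not treated as a failure.
        shutil.rmtree(project / folder, ignore_errors=True)
    start = time.perf_counter()
    subprocess.run(build_command, cwd=project, check=True)
    return time.perf_counter() - start


def file_size_mb(path):
    """Return the size of the file at `path` in MiB."""
    return Path(path).stat().st_size / (1024 * 1024)
```

To use it, pass the full `dotnet publish` command line from the experiment as `build_command` (as a list of arguments), then call `file_size_mb` on the produced single-file executable.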