[Rpm-maint] [rpm-software-management/rpm] Examine Compressed Headers (Issue #2220)
Daniel Alley
notifications at github.com
Wed Dec 21 06:33:20 UTC 2022
Here's a script I threw together in 30 minutes to get a rough estimate of how useful compressing the headers would be:
```Python
#!/usr/bin/env python
import os
import sys

import createrepo_c as cr
import lz4.frame
import zstd

# Per-package measurements, keyed by the path of the RPM file.
results = {}

with os.scandir(sys.argv[1]) as entries:
    for entry in entries:
        # Only regular files with an .rpm extension are considered.
        if not entry.is_file() or not entry.path.endswith(".rpm"):
            continue

        pkg = cr.package_from_rpm(
            entry.path,
            location_href=str(entry.path),
            checksum_type=cr.SHA256,
        )
        header_size_bytes = pkg.rpm_header_end - pkg.rpm_header_start

        # Read the raw header bytes straight out of the package file.
        with open(entry.path, "rb") as f:
            f.seek(pkg.rpm_header_start)
            header = f.read(header_size_bytes)

        # Compress the header with each library's default settings.
        zstd_header = zstd.ZSTD_compress(header)
        lz4_header = lz4.frame.compress(header)

        results[str(entry.path)] = {
            "header_size": header_size_bytes,
            "package_size": pkg.size_package,
            "archive_size": pkg.size_archive,
            "header_size_zstd": len(zstd_header),
            "header_size_lz4": len(lz4_header),
        }

total_size_headers = 0
total_size_packages = 0
total_size_archives = 0
total_size_lz4 = 0
total_size_zstd = 0

print("Results for {} packages".format(len(results)))

for package, data in results.items():
    total_size_headers += data["header_size"]
    total_size_packages += data["package_size"]
    total_size_archives += data["archive_size"]
    total_size_lz4 += data["header_size_lz4"]
    total_size_zstd += data["header_size_zstd"]

print("Average header size as proportion of package total: {:.2f}%".format(total_size_headers / total_size_packages * 100))
# Note: the two figures below are the *compressed* header size as a share of
# the total package size, not the reduction relative to the uncompressed header.
print("Average header savings for LZ4 compressed headers: {:.2f}%".format(total_size_lz4 / total_size_packages * 100))
print("Average header savings for ZSTD compressed headers: {:.2f}%".format(total_size_zstd / total_size_packages * 100))
```
Run it like so:
```
[dalley at thinkpad devel]$ python compressed_header_test.py ~/devel/repos/fixture/
Results for 35 packages
Average header size as proportion of package total: 64.90%
Average header savings for LZ4 compressed headers: 46.56%
Average header savings for ZSTD compressed headers: 33.52%
```
(These sample results should be ignored entirely: the packages are effectively empty hello-world fixtures, not even remotely real-world.)
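One thing to keep in mind when reading the output: the two "savings" lines report the compressed header size as a share of the whole package, so the actual reduction has to be derived by comparing them with the first line. A minimal sketch of that arithmetic follows; the function name `relative_savings` is made up for illustration and is not part of the script above.

```python
def relative_savings(original: float, compressed: float) -> float:
    """Percentage saved by compression, relative to the original size.

    Works on byte counts or on two "percent of package total" figures,
    as long as both arguments use the same unit.
    """
    if original <= 0:
        raise ValueError("original must be positive")
    return (1 - compressed / original) * 100


if __name__ == "__main__":
    # Shares of package total taken from the sample run above
    # (toy fixture packages, so not meaningful in themselves).
    header_share = 64.90
    lz4_share = 46.56
    zstd_share = 33.52

    print("LZ4 header reduction: {:.2f}%".format(relative_savings(header_share, lz4_share)))
    print("ZSTD header reduction: {:.2f}%".format(relative_savings(header_share, zstd_share)))
```

The same function can be fed `total_size_headers` and `total_size_lz4` (or `total_size_zstd`) directly if you would rather work from the raw byte totals.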
--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2220#issuecomment-1360910971