[Rpm-maint] [rpm-software-management/rpm] support reproducible automatic rebuilds (PR #2880)

Jan Zerebecki notifications at github.com
Fri Feb 2 15:17:56 UTC 2024


> > If a build e.g. embeds SOURCE_DATE_EPOCH in the output, then the output changes every time such a rebuild happens, which can be very often.
> 
> It only changes if you change SOURCE_DATE_EPOCH, and if you took the SOURCE_DATE_EPOCH from the changelog then it only changes if you change the changelog, and at that point it's no longer the same. By my logic anyhow. It's really hard to constructively comment on what you don't understand.

If you do not change SOURCE_DATE_EPOCH but do change some inputs that may change the outputs, then tools that depend on the file modification timestamp always increasing when the content changes (like rsync without --checksum) fail.
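A minimal sketch of that failure mode, assuming a simplified model of rsync's default quick check (a file is skipped when size and modification time both match). `FileState`, `quick_check_skips` and `checksum_skips` are hypothetical names for illustration, not rsync internals:

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    content: bytes
    mtime: int  # seconds since the Unix epoch


def quick_check_skips(src: FileState, dst: FileState) -> bool:
    """Simplified quick check: skip the transfer when size and mtime match."""
    return len(src.content) == len(dst.content) and src.mtime == dst.mtime


def checksum_skips(src: FileState, dst: FileState) -> bool:
    """Simplified --checksum behaviour: skip only when the content matches."""
    return hashlib.sha256(src.content).digest() == hashlib.sha256(dst.content).digest()


# Same SOURCE_DATE_EPOCH (so same mtime), same size, different compiler output.
old_build = FileState(b"hello built by gcc 1.1", mtime=1)
new_build = FileState(b"hello built by gcc 1.2", mtime=1)

stale_copy_kept = quick_check_skips(new_build, old_build)
detected_by_checksum = not checksum_skips(new_build, old_build)
```

In this model the quick check keeps the stale copy, while the content comparison notices the change.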

If you do change SOURCE_DATE_EPOCH and the upstream build script embeds it into the output, then the output also always changes.

So if Fedora did the manual changelog bump any time any build dependency changes, and if they set SOURCE_DATE_EPOCH (and thus the build time) from the changelog, then it would have the same problem. They could use the OLD_SOURCE_DATE_EPOCH mechanism in this PR to fix it. But the information needed to set it is currently not in the changes file, as the association of version and revision to dates is not machine readable there. However, it is available in Koji or in the package source git repo (the date of the last commit that changed something other than a changes file), and in the revision if one knows the details of how those get used in Fedora.

openSUSE does not change the changelog for the rebuild, but that makes no difference here, as we set SOURCE_DATE_EPOCH (also from the changes file, just done outside of rpm) and OLD_SOURCE_DATE_EPOCH appropriately for rpm to use.

> A test-case outside any complicated build-system machineries would perhaps help understand this on a more concrete level. 

Steps to reproduce:
* package gcc: version 1.1
* package hello: BuildDepends gcc; changelog epoch 1; but its build script includes SOURCE_DATE_EPOCH in the binary
* build package hello
* update package gcc to version 1.2
* rebuild package hello with new gcc: this is bad, as the mtime of the files in the package didn't change but their content did; on a system where such rpms get installed, rsync will create inconsistent copies of the filesystem.
* bump package hello changelog to epoch 2
* build package hello. mtimes changed, good. content of files is changed, this is good, as new gcc creates a different build result.
* update package gcc to version 1.2.1 which only changes documentation.
* bump package hello changelog to epoch 3
* build package hello. content changed, but only because SOURCE_DATE_EPOCH changed; gcc would otherwise have produced the same output. this is bad, as it causes people to download and upgrade unnecessarily.
* build package hello with gcc 1.2 and OLD_SOURCE_DATE_EPOCH set to 1. changed mtimes, changed content, good.
* build package hello with gcc 1.2.1 and OLD_SOURCE_DATE_EPOCH set to 1. changed mtimes, otherwise unchanged content, good. comparison to previous build output says unchanged, so discard this build.
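The steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not rpm's behaviour: `build_hello` and `GCC_CODEGEN` are hypothetical, the build script is assumed to embed OLD_SOURCE_DATE_EPOCH when it is set (and SOURCE_DATE_EPOCH otherwise), file mtimes are assumed to always follow SOURCE_DATE_EPOCH, and the SOURCE_DATE_EPOCH values 2 and 3 used for the last two builds are made up for illustration.

```python
import hashlib
from typing import NamedTuple, Optional


class Build(NamedTuple):
    content_digest: str
    mtime: int


# Hypothetical compiler model: 1.2.1 only changes documentation,
# so it is assumed to generate the same code as 1.2.
GCC_CODEGEN = {"1.1": "codegen-a", "1.2": "codegen-b", "1.2.1": "codegen-b"}


def build_hello(gcc_version: str,
                source_date_epoch: int,
                old_source_date_epoch: Optional[int] = None) -> Build:
    # Assumption: the embedded date comes from OLD_SOURCE_DATE_EPOCH when set.
    embedded = source_date_epoch if old_source_date_epoch is None else old_source_date_epoch
    payload = f"{GCC_CODEGEN[gcc_version]}|built-at={embedded}"
    digest = hashlib.sha256(payload.encode()).hexdigest()
    # Assumption: file mtimes always follow SOURCE_DATE_EPOCH.
    return Build(digest, source_date_epoch)


first = build_hello("1.1", 1)
silent_rebuild = build_hello("1.2", 1)    # content changed, mtime did not: bad
bumped = build_hello("1.2", 2)            # content and mtime changed: good
doc_only_bump = build_hello("1.2.1", 3)   # content changed only via the epoch: bad

with_old_a = build_hello("1.2", 2, old_source_date_epoch=1)
with_old_b = build_hello("1.2.1", 3, old_source_date_epoch=1)

# Same content as the previous build output, so this rebuild can be discarded.
discard_rebuild = with_old_b.content_digest == with_old_a.content_digest
```

In this model the last comparison reports unchanged content even though the mtimes differ, which is the condition under which the message proposes discarding the build.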

> Also, this lumps a whole lot of changes into one which is further bad for understanding.
> Split this up into per-change commits, adding docs and test-cases for each. That would be required for acceptance anyhow, and should help seeing the individual bits for what they are without needing to know how some buildsystem somewhere processes stuff. Like already said, for example the bit about erroring out on missing changelog is something that makes perfect sense on its own.

I will split it.

> And OTOH I see there's an added check for SOURCE_DATE_EPOCH in the past, which is also quite unrelated to this all (AFAICS), and which I disagree with: ability to set the time into future is useful for testing purposes.

I noticed it had the wrong condition. The problem it solves is also solved by using set_mtime_to_source_date_epoch. So I removed it for now.
But it would be needed if one were to use clamp_mtime_to_source_date_epoch. If the system clock is before SOURCE_DATE_EPOCH, then the build will create mtimes that will not be clamped, as they are older than SOURCE_DATE_EPOCH, thus defeating the intent of using this setting.
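A minimal sketch of the difference, assuming that clamping means "cap mtimes at SOURCE_DATE_EPOCH" and setting means "always use SOURCE_DATE_EPOCH". The functions below are hypothetical models named after the settings mentioned above, not rpm's implementation, and the timestamps are made up:

```python
def clamp_mtime(file_mtime: int, source_date_epoch: int) -> int:
    """Model of clamping: only mtimes newer than SOURCE_DATE_EPOCH are lowered."""
    return min(file_mtime, source_date_epoch)


def set_mtime(file_mtime: int, source_date_epoch: int) -> int:
    """Model of setting: every mtime becomes SOURCE_DATE_EPOCH."""
    return source_date_epoch


SOURCE_DATE_EPOCH = 1_700_000_000

# Build with a clock after SOURCE_DATE_EPOCH: clamping normalizes the mtime.
normal_clock = SOURCE_DATE_EPOCH + 3600
clamped_normal = clamp_mtime(normal_clock, SOURCE_DATE_EPOCH)

# Two builds with clocks before SOURCE_DATE_EPOCH: clamping leaves the
# differing build-time mtimes untouched, so the outputs are not reproducible.
slow_clock_a = SOURCE_DATE_EPOCH - 500
slow_clock_b = SOURCE_DATE_EPOCH - 100
clamped_a = clamp_mtime(slow_clock_a, SOURCE_DATE_EPOCH)
clamped_b = clamp_mtime(slow_clock_b, SOURCE_DATE_EPOCH)

# Setting the mtime gives the same result regardless of the clock.
set_a = set_mtime(slow_clock_a, SOURCE_DATE_EPOCH)
set_b = set_mtime(slow_clock_b, SOURCE_DATE_EPOCH)
```

Under this model, a check that SOURCE_DATE_EPOCH is not ahead of the system clock only matters for the clamping variant.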

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/pull/2880#issuecomment-1924084432

Message ID: <rpm-software-management/rpm/pull/2880/c1924084432 at github.com>