[Rpm-maint] [rpm-software-management/rpm] The debuginfo Rube Goldberg machine (Discussion #3188)
Panu Matilainen
notifications at github.com
Fri Jun 28 10:13:18 UTC 2024
Working on an old piece of code like rpm is much like city infrastructure renewals: you try to expect the unexpected and plan accordingly, but every now and then you'll still get surprised when you break the asphalt: "what are all these pipes, they don't exist in any drawing?". And consequently, work gets delayed to sort it all out. Several times.
Perhaps the main headline feature in [rpm 4.20](https://rpm.org/wiki/Releases/4.20.0) is the [declarative buildsystem](https://github.com/rpm-software-management/rpm/issues/1087) support in the spec files. This was a feature I first dreamed up around 2012, which alone suggests there was quite a bit of plumbing to sort out before it could happen. One of the more critical support features for that was the ability to [append and prepend](https://github.com/rpm-software-management/rpm/pull/2728) to existing spec sections. In order to support *that*, the previously very special `%prep` section with its built-in `%setup` and `%patch` pseudo-macros needed to be [turned into](https://github.com/rpm-software-management/rpm/pull/2730) a normal scriptlet, and in order to do that, the pseudo-macros needed to be turned into real macros, and in order to do *that*, the macro engine needed a rather thorough rework, also over several years and countless changes like #1406 and #1434. The first concrete step towards declarative builds was the [introduction of %autosetup](6c5214950e5885c33c498969ca256c9550f5936b) in 2012, complemented with [%patchlist](https://github.com/rpm-software-management/rpm/pull/679) in 2019. And so on. It was a lot of work spread over more than a decade, but these factors were reasonably well known ahead of time. But this is all just backdrop to the thing that *did* catch us by surprise, right at the finish line. If you do things with rpm, you have probably encountered it a few times: debuginfo packages.
That story begins somewhere around 2002. I wasn't deeply involved with rpm at that time, so the early parts are based on info gathered and deduced from commits to rpm and redhat-rpm-config and various fragments I've heard/read over the years, and so may contain inaccuracies. But AIUI, the toolchain people at Red Hat were tasked with making debugging released binary builds meaningful. I don't know whether the task was specifically to achieve this without major changes to rpm itself, but that's how it took place: it was practically all implemented with macro voodoo + a helper script and a binary, none of which needed to be inside rpm. Much of it ended up in the rpm repository sooner or later, but it didn't *need* to be there. It's no mean feat, really, but it also did require some quite, uh, creative solutions.
Of course in the intervening 22 years *a lot* happened. In particular, around 2017 Mark Wielaard practically rewrote the underlying debugedit tool and introduced some in-rpm code for better integration, and Michael Schroeder and Richard Biener added support for debuginfo sub-packages, which also needed in-rpm code. And then in 2021 debugedit and the helper script were split out to an external project because people outside the rpm ecosystem got interested in them. To the great relief of us rpm maintainers: debugedit deals with deep ELF format internals, and we never really knew what to do with it anyhow. In all that flux, the one thing that didn't change is the one thing almost certainly intended as a temporary hack only: the way debuginfo packages are actually enabled. It also never entered the rpm codebase at all. And that's what we ran into head-on, 22 years later, basically on the eve of the 4.20 alpha release.
In broad strokes, debuginfo packages live as template macros which are used to generate the spec preamble for them, and then a script invoked from the `%install` post template to edit and collect the files. This could all be done quite neatly in the generic spec scriptlet template infrastructure, except for one thing: how do you inject something into nearly every single spec preamble, without actually modifying them? I believe the brilliant-awful macro hack was originally by Elliot Lee in redhat-rpm-config, for accomplishing something else. The people adding debuginfo support saw the trick and ran away with it. Since 2002, redhat-rpm-config has contained this macro definition:
```
%install %{?_enable_debug_packages:%{?buildsubdir:%{debug_package}}}\
%%install\
%{nil}
```
This is the entry to our little Rube Goldberg machine. There's an incredible amount of powerful magic embedded in those three lines.
You need to be familiar with the rpm spec syntax and macros to properly follow this, but `%install` marks the beginning of the shell scriptlet where the packager tells rpm which content to put in the resulting binary package. So, `%install` is just a section opener string, hard-coded inside the spec parser for that purpose, and doesn't "do" anything by itself. However, the above macro turns this innocent section marker into something else entirely. In English: if the `%_enable_debug_packages` macro is defined, then if `%setup` was used in the spec (`%buildsubdir` is a side-effect of that), expand the contents of the `%{debug_package}` macro here, and then add back the `%install` section marker as if nothing happened.
`%debug_package` is defined something like this:
```
%debug_package \
%ifnarch noarch\
%global __debug_package 1\
%_debuginfo_template\
%{?_debugsource_packages:%_debugsource_template}\
%endif\
%{nil}
```
`%_debuginfo_template` is the spec preamble definition of a debuginfo package, something like this (trimmed for brevity):
```
%_debuginfo_template \
%package debuginfo\
Summary: Debug information for package %{name}\
%description debuginfo\
This package provides debug information for package %{name}.\
%files debuginfo -f debugfiles.list\
%{nil}
```
So the `%install` macro override emits all that into the spec preamble section inside a `%ifnarch noarch` conditional to prevent it from firing on arch independent packages (which aren't expected to contain ELF files), and then emits that original `%install` to let you proceed with whatever it is your package does in there. It's really quite clever, but at the same time, awful. But it gets weirder from there. Notice how there's a `%global __debug_package 1` inside the `%ifnarch` block? You'd think that it doesn't get defined on noarch packages, but it does. The macro engine doesn't know anything about `%if` and the like: to it, they are just things that look like macros but are undefined, so they fall through untouched, whereas `%global` is a real macro primitive that gets executed right there during expansion. The multiline result then gets passed to the spec parser, which processes the `%if`s and the other content.
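To make the two-stage processing concrete, here is a rough sketch of what the spec parser would receive in place of a plain `%install` line once the macro engine is done with the override. It assumes a hypothetical package named `hello`, with `%_enable_debug_packages` and `%buildsubdir` defined and `%_debugsource_packages` undefined; the exact whitespace is a guess.

```
%ifnarch noarch

%package debuginfo
Summary: Debug information for package hello
%description debuginfo
This package provides debug information for package hello.
%files debuginfo -f debugfiles.list

%endif

%install
```

The `%global __debug_package 1` line is absent from the output because the macro engine already executed it while expanding, regardless of architecture. Only the parser, one stage later, gets to decide whether the text between `%ifnarch` and `%endif` is kept.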
At the other end of `%install`, the rest of the magic is embedded inside the `%__spec_install_post` template macro, something like the following, which gets appended to the end of `%install` behind the scenes during the actual build of a package:
```
%__spec_install_post\
%{?__debug_package:%{__debug_install_post}}\
%{__arch_install_post}\
%{__os_install_post}\
%{nil}
```
Note how it tests for the `%__debug_package` definition to avoid triggering on noarch packages. But we just concluded above that it always gets defined! Yet somehow debuginfo packages are not generated for noarch packages, so it must work somehow? Well, it doesn't. The debuginfo step in `%__spec_install_post` actually fires for noarch packages, but it silently falls through as there's nothing for it to do on a normal noarch package. It can, however, leave behind tell-tale `debugfiles.list` etc. files in the build directory if you go looking. So how does it not fail with errors then? The catch is that the `%ifnarch noarch` block in the `%debug_package` macro works for the spec preamble part, so the debuginfo *package* is never created, and so rpm doesn't go looking for it, and the `*.list` files end up just being some junk in the directory that rpm doesn't care about.
That's why I call it a Rube Goldberg machine: it may not be intentionally complicated, but it sure is complicated and precarious.
Now, what does this all have to do with our declarative buildsystems? Well, the related append and prepend options mean `%install` can occur multiple times in the spec with `-a` or `-p` options, and you can probably see how that wouldn't go too well with this. But, couldn't you just turn the `%install` macro override into a parametric macro which only emits the debug stuff when no arguments are passed and look away for another twenty years? Well, maybe, but the madness has to stop somewhere.
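For illustration, a spec fragment using the new features might look roughly like this. It is a minimal sketch, not taken from any real package: the `autotools` buildsystem choice and the cleanup command are just assumptions for the example.

```
BuildSystem: autotools

%install -a
# appended after the install steps generated by the buildsystem
rm -f %{buildroot}%{_libdir}/*.la
```

Here the buildsystem supplies the main install steps and `%install -a` only adds to them. With the old distro-level override in place, a line like `%install -a` would run into a non-parametric `%install` macro that knows nothing about `-a` and tries to inject the debuginfo preamble on every occurrence.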
The real rub was that this `%install` override only exists in distros and not in rpm upstream, so we only really ran into it when it was far too late in the release process to start reworking something like that. Technically, I knew it existed but had blissfully forgotten, and certainly didn't realize the implications when adding append/prepend modes. In any case, this blocked the use of our headline feature, so we scrambled for a few weeks to get the debuginfo enablement logic properly and fully upstreamed. The existing frail machinery, together with tens of thousands of packages built on top of and sometimes around it, each in their own sometimes peculiar ways, was always a terrifying thing to modify, and doubly so under time pressure.
The end result in 4.20 utilizes some of the new dynamic spec generation features Florian Festi has been working on. It was no walk in the park though: it took us several weeks of experimenting over multiple pull-requests to get it to the point where it currently is. Among other fun, there was a bug which caused `%_target_cpu` and various other macros + variables to disagree with the rest of the spec on specs with BuildArch when an explicit `--target` was not passed, causing surprises (like getting debuginfo packages when you don't expect them) with dynamically generated content. And when I finally made rpm automatically reload the platform configuration if `--target` is not specified to address that, we discovered that `mock` always passes something like `--target $(uname -m)` to rpmbuild, even for noarch packages. Except inside `koji`, where it always passes `--target noarch`. And so on. We also intended to enable debuginfo packages for packages without `%setup`, but that turned out to cause too much breakage.
What is present in 4.20 resembles the old madness way too much for my liking, but various details of the old implementation have leaked to thousands of specs in ways that make changing them impossible or nearly so. At least all the machinery is now upstream under our eyes where we can hopefully simplify and streamline it gradually over time.
To those who made it this far: this is hopefully the start of an ongoing series of blog posts about rpm development, "tales from the trenches" and whatnot.
--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/3188