[Rpm-maint] [rpm-software-management/rpm] Standardize on OCI images for test-suite, even locally (Issue #2643)

Michal Domonkos notifications at github.com
Fri Sep 1 15:46:40 UTC 2023


## Background

The `rpmtests` script, once built, is designed to be run as root and exercises the RPM installation in the root filesystem. When an individual test requires write access to the root filesystem (e.g. to install some packages), a lightweight container ("snapshot") using a combination of [Bubblewrap](https://github.com/containers/bubblewrap) and [OverlayFS](https://docs.kernel.org/filesystems/overlayfs.html) (implemented as `snapshot()` in `atlocal.in`) is created and RPM is executed in it. This is to prevent the individual tests from affecting each other.
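
To make the snapshot idea concrete, here is a minimal dry-run sketch. It is not the real `snapshot()` from `atlocal.in`: the function names and directory layout are made up for illustration, and the functions only print the commands such a helper might run instead of executing them.

```shell
# Hypothetical helpers for illustration; not the real snapshot() from atlocal.in.

# Print the OverlayFS mount options for a snapshot.
#   $1 = read-only lower layer (/ or an image directory)
#   $2 = scratch directory holding the writable upper layer and the work dir
#        (OverlayFS requires both to be on the same filesystem)
overlay_opts() {
    printf 'lowerdir=%s,upperdir=%s/upper,workdir=%s/work\n' "$1" "$2" "$2"
}

# Print the commands that set up a snapshot and run a command in it.
#   $1 = lower layer, $2 = scratch directory, remaining args = command
snapshot_cmds() {
    lower=$1 scratch=$2
    shift 2
    printf 'mkdir -p %s/upper %s/work %s/merged\n' "$scratch" "$scratch" "$scratch"
    printf 'mount -t overlay overlay -o %s %s/merged\n' \
        "$(overlay_opts "$lower" "$scratch")" "$scratch"
    # bwrap makes the merged tree the new root, with fresh /dev and /proc.
    printf 'bwrap --bind %s/merged / --dev /dev --proc /proc %s\n' "$scratch" "$*"
}
```

Calling `snapshot_cmds / /tmp/snap rpm -q rpm` prints the three steps for a snapshot of the root filesystem. All writes made by the command end up in the upper directory, so discarding the scratch directory discards the test's changes.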

When hacking on RPM, you typically don't want to install the build artifacts into the host filesystem, so `make check` wraps the `rpmtests` script in a container of its own. This container runs on top of an OS image that mirrors the host and contains only the necessary runtime and test dependencies, in the versions matching the development headers used during the build. RPM is `make install`-ed on top of that filesystem to produce the final image. The parent container with `rpmtests` is typically spawned using the snapshot functionality built into the test-suite, with the lower layer being the image directory instead of `/`.

To build such an image, a host-native `mktree` script ("backend") is invoked by `make check`. It is supposed to use whatever the native package management tooling on that platform (Linux distribution) is, e.g. `dnf --installroot` on Fedora or the `zypper` equivalent on openSUSE. This is ideally done as an unprivileged user, with the use of Linux `namespaces(7)`.

Currently, we only include `mktree.fedora`, but the idea was to gradually add more such backends for the platforms where RPM is typically built and developed, at least for openSUSE and perhaps (experimentally) for Debian/Ubuntu too.
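
As a rough illustration of the Fedora case, the sketch below prints the kind of command such a backend might run. It is not the contents of `mktree.fedora`: the function name is hypothetical, and a real script likely needs extra ID mappings and DNF options beyond what is shown.

```shell
# Hypothetical sketch, not the contents of mktree.fedora.
# Print a command that bootstraps a Fedora tree into directory $1 for
# release $2, installing the packages given as the remaining arguments.
# "unshare -r" maps the calling user to root inside a new user namespace,
# so no sudo is involved.
mktree_fedora_cmd() {
    root=$1 releasever=$2
    shift 2
    printf 'unshare -r dnf -y --installroot=%s --releasever=%s install %s\n' \
        "$root" "$releasever" "$*"
}
```

For example, `mktree_fedora_cmd /tmp/tree 38 glibc bash` prints a single `unshare -r dnf ...` line that can be inspected before running it by hand.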

## Problem

It turns out that, while this is quite easy to do on Fedora with DNF and `unshare(1)`, it is not as easy on the other distros I've tried. On openSUSE, Zypper doesn't seem to like being run through `unshare(1)` and would likely require `sudo` instead. I've observed the same with Debian and `debootstrap`. While not a dealbreaker per se, we should really try to avoid `sudo` for something as trivial as a `make check`.

Another drawback of this approach is that we suddenly find ourselves in the business of maintaining distro-specific scripts where each needs its own set of tricks and workarounds, such as having to inject a bunch of RPM macros to DNF to make it [behave](https://github.com/rpm-software-management/rpm/blob/7d017eef51bd3138aad0a7859e76e7bfafd3951a/tests/mktree.fedora#L27) properly for our purposes. This does not scale well and just makes our life harder in the long run as the packaging stacks come and go.

In fact, [mkosi](https://github.com/systemd/mkosi) does almost what we need as it abstracts all these distro-specific details away from the user. In the [latest version](https://github.com/systemd/mkosi/releases/tag/v15.1), it even works without root privileges and apparently can now run the build script natively, meaning that the local build directory produced by CMake could possibly be used.

However, there are still some other limitations preventing us from considering `mkosi`, such as the [inability](https://github.com/systemd/mkosi/issues/248) to run within a container, thus making it less portable. That is something we need for our CI environment where we typically have a limited choice of operating systems (Ubuntu 20.04 LTS in GitHub Actions currently) and thus may or may not have all the runtime dependencies for the latest development snapshot of RPM available in the official distribution repositories.

Also, the core philosophy of `mkosi` is more like "wrap the build system in a container and act as the primary interface for building/testing" whereas we'd prefer it the other way around, i.e. "seamlessly integrate into our existing build system and reuse the local build artifacts".

Hence, `mktree` was born as a poor man's version of `mkosi` tailored to our use case: almost dependency-free, with some shamelessly stolen ideas and naming conventions from `mkosi`, but with the issues mentioned above.

## Solution

As it happens, there already is a standardized way of distributing container images, namely [OCI](https://opencontainers.org/) images, more commonly known as Podman or Docker images. The public registries contain a lot of different Linux distributions, certainly those that we care about. And most developer workstations likely have Podman or Docker installed already.

In fact, we already use these through `mktree.podman` and our `Dockerfile`. This backend currently acts as a fallback for non-Fedora distros and is our go-to backend in the CI. As opposed to `mktree.fedora`, `mktree.podman` uses Podman/Docker to also spawn the test-suite container from the image. This is obviously the most natural way once you're already using Podman/Docker to build the image. However, it doesn't allow for "hot swapping" the RPM installation with the latest bits when testing RPM interactively in `make shell`, due to the nature of Podman/Docker image layering. This is only supported in `mktree.fedora` where we manage these layers as plain directories ourselves.

The only downside of these premade images is that, in the case of Fedora or openSUSE, they, well, contain a stock installation of RPM already, and we would like to get rid of it before planting our own. That is what we do in our `Dockerfile`, by simply self-destructing RPM with `rpm --nodeps -e rpm ...`. This is ugly as it creates an additional layer internally in Podman/Docker, but hey, it works. In the future, we could always publish our own OCI image for RPM testing (basically the one that we currently build with `mktree.fedora`), but that's really out of scope here :laughing:

So, why not just make `mktree.podman` the default backend everywhere, add host-specific `Dockerfile`s, and drop `mktree.fedora`? Well, it's currently not optimized for iterative `make check` use since it doesn't reuse the local build directory (instead, it builds RPM itself as part of the `Dockerfile` and thus in a container, much like `mkosi`).

Conveniently, Podman allows for mounting an image into a directory on the host, which means we can simply use that directory instead of bootstrapping our own with DNF/Zypper/etc.

Docker sadly doesn't support mounting images, but it does support extracting them into a destination directory, so we can fall back to that for Docker, with the small CPU and disk space penalty involved (it's still much faster than using a package manager to install packages).
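
The two ways of turning an image into a directory could look roughly like the dry-run sketch below. The function name is hypothetical and the commands are only printed, not executed; how `mktree.podman` would actually wire this up is left open.

```shell
# Hypothetical helper: print the commands that expose image $1 as a
# directory tree, using runtime $2 ("podman" or "docker"). For docker,
# the image is extracted into directory $3.
image_to_dir_cmds() {
    image=$1 runtime=$2 dest=$3
    case $runtime in
    podman)
        # Rootless podman can only mount images inside "podman unshare";
        # the command prints the mount point, which is only visible within
        # that namespace, so follow-up work has to happen there too.
        printf 'podman unshare podman image mount %s\n' "$image"
        ;;
    docker)
        # No image mount in docker: create a stopped container, export its
        # filesystem as a tar stream, unpack it, then remove the container.
        printf 'cid=$(docker create %s)\n' "$image"
        printf 'docker export "$cid" | tar -x -C %s\n' "$dest"
        printf 'docker rm "$cid"\n'
        ;;
    *)
        echo "unsupported runtime: $runtime" >&2
        return 1
        ;;
    esac
}
```

The Podman branch costs no extra disk space because the image layers are used in place, while the Docker branch pays for one full copy of the image contents.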

## Summary

So, to summarize, the goal of this ticket is to adjust `mktree.podman` as follows:

1) Use a `Dockerfile` matching the host, e.g. `Dockerfile.fedora`, and pass the distro version as an argument (somehow)
2) Layer the RPM installation on top using the local build directory, instead of doing it from scratch in the `Dockerfile`
3) Use `snapshot()` to spawn the `rpmtests` container, much like `mktree.fedora`
4) Remove `mktree.fedora`

In other words, replace `mktree.<OSNAME>` with `Dockerfile.<OSNAME>`, thus delegating the details of image creation to the appropriate tools.
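
One possible way to handle step 1 is to read the host's `os-release` file, as in the sketch below. The function name, the build argument name `VERSION`, and the image tag `rpm-tests` are assumptions made for illustration; the command is printed rather than executed.

```shell
# Hypothetical sketch of step 1: pick Dockerfile.<ID> based on an os-release
# file and pass VERSION_ID as a build argument. The build-arg name "VERSION"
# and the tag "rpm-tests" are made up for illustration.
#   $1 = path to the os-release file (defaults to /etc/os-release)
dockerfile_build_cmd() {
    osrelease=${1:-/etc/os-release}
    # os-release is a list of shell-compatible KEY=value assignments, so it
    # can be sourced; a subshell keeps its variables from leaking out.
    (
        . "$osrelease" || exit 1
        printf 'podman build -f Dockerfile.%s --build-arg VERSION=%s -t rpm-tests .\n' \
            "$ID" "$VERSION_ID"
    )
}
```

On a host whose `os-release` contains `ID=fedora` and `VERSION_ID=38`, this selects `Dockerfile.fedora` and passes `38` as the version. Note that `VERSION_ID` is optional in `os-release`, so a real script would need a fallback for distros that don't set it.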

This change isn't targeting 4.19: it's too late for that, and it isn't relevant to the distributed tarball anyway. It's more of a feature/optimization for developers working off of a git checkout.


