[Rpm-ecosystem] varlink for libdnf discussion

Neal Gompa ngompa13 at gmail.com
Thu May 3 17:07:09 UTC 2018

On Wed, May 2, 2018 at 1:28 PM Colin Walters <walters at verbum.org> wrote:

> Let's talk about

> It relates a lot to the libdnf efforts and higher level design. And this
> discussion is also probably generally interesting for other librpm-based
> systems like zypper, as those also intersect with PackageKit today.

> The whole topic of API for package management is a messy and complex one.
> However, I've been thinking about varlink more, and I like the idea (although
> I confess to not really having tried to use or implement it in anger, so my
> opinion could change after that =) )

> The TL;DR of what I'm thinking here is to consider dropping the SWIG effort
> from libdnf, and focus on varlink. The question of implementation language
> (C/C++/whatever) then becomes a lot more orthogonal, as it's decoupled from
> how language bindings work.

> In terms of varlink advantages: some aspects of libdnf are dynamic/event-driven;
> I'm thinking specifically of package downloads, for example. That type of thing
> is very natural and nice in a varlink-style system. In contrast, C++ has no
> real standard for this - it's why both Qt and GObject have signals.

> There are clear advantages in terms of robustness to forking off a separate
> process for package management; librpm will directly call chroot() in-process,
> which is pretty hostile to anything else you have going on. (rpm-ostree does
> not do this today for client-side layering, btw.)
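
The isolation argument above can be sketched with a plain fork: process-global
state that librpm mutates (chroot() in the real case; os.chdir() used here as an
unprivileged stand-in) dies with the forked worker and never leaks into the
parent. This is an illustrative sketch, not rpm-ostree's actual code:

```python
import os

def run_transaction(root):
    # A real worker would call os.chroot(root) (which requires privileges);
    # os.chdir() stands in to show process-global state staying in the child.
    os.chdir(root)
    # ... librpm transaction work would happen here ...

before = os.getcwd()
pid = os.fork()
if pid == 0:
    # Child: mutate process-global state, then exit without returning.
    run_transaction("/tmp")
    os._exit(0)
os.waitpid(pid, 0)
# The parent's working directory is untouched by the child's chdir().
print(os.getcwd() == before)  # True
```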

> Further, the enormous size of Fedora metadata today causes a lot of memory
> use in libsolv - this was one of the drivers behind having rpm-ostree do
> https://github.com/projectatomic/rpm-ostree/pull/606
> which works today, but it would have worked better from the start if the
> daemon could just stay around and all of the heavy lifting was in a subprocess.

> To repeat one of the varlink rationales: every programming language can speak
> JSON. I suspect we don't need high performance in most cases - the only
> exception I can think of offhand is a `dnf search foo` type thing; but even
> then I suspect there are a lot more things to optimize in searching than in
> JSON parsing.

> varlink doesn't apply to everything; to state the obvious, there's still a
> role for shared libraries (and language bindings to them) and still a role
> for DBus, etc. But specifically for package management, I think varlink could
> make a lot of sense.
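
To make the "everything speaks JSON" rationale concrete: varlink frames each
call as a single JSON object terminated by a NUL byte, which any language with
a JSON library can produce and parse. A minimal sketch of that framing (the
io.example.dnf interface name is made up for illustration):

```python
import json

def encode_call(method, parameters, more=False):
    """Encode a varlink-style call: one JSON object terminated by a NUL byte."""
    msg = {"method": method, "parameters": parameters}
    if more:
        msg["more"] = True  # ask the server for a stream of replies, not one
    return json.dumps(msg).encode() + b"\0"

def decode_messages(buf):
    """Split a byte stream into decoded JSON messages on NUL boundaries."""
    return [json.loads(part) for part in buf.split(b"\0") if part]

# Hypothetical interface and method name, for illustration only.
wire = encode_call("io.example.dnf.Resolve", {"names": ["bash"]})
print(decode_messages(wire)[0]["method"])  # io.example.dnf.Resolve
```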

I wanted to give a lot of thought to this before I responded, because I
maintain a few projects that would be directly affected by this proposal.
Among other things, I'm the maintainer of dnfdaemon, and I co-maintain
dnfdragora. In Mageia, we've also been working on our migration from urpmi
to DNF, and our tooling is mostly in Perl, rather than Python.

My initial thought was that it's absolutely nuts. You're more or less
proposing the model used by Docker and Kubernetes/OpenShift, where libraries
aren't actually libraries, but daemons that everyone is forced to operate
through subshells and IPC.

On further reflection, I still consider it crazy... for libdnf. However, I
think it would make sense for splitting "dnf" in two. Something that has
been brought up more than a couple of times in the Fedora development IRC
channel and on the mailing list is the idea of merging dnfdaemon into DNF
itself. Among other things, this would provide a means to "harden" against
events such as the terminal dying when Xorg gets into a bad state during a
live upgrade.

This model actually works fairly well for dnfdragora. During its development,
there were many cases where the UI would break and quit abruptly, but because
the daemon was independently handling the transaction, the package management
action still succeeded. It doesn't happen very often anymore, but it's still
nice to have, and it makes dnfdragora safer than dnf itself in certain
circumstances. However, dnfdaemon is very brittle, because we rely on hooking
into "private" APIs of dnf to expose enough functionality to be useful; the
"public" API isn't enough. One way to fix this is to merge dnfdaemon into dnf
itself and rewire python-dnfdaemon so that it's a client for the daemonized dnf.
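
A rewired python-dnfdaemon could then be a thin wrapper that forwards requests
over a socket and lets the daemon own the transaction. The following sketch is
entirely hypothetical (the wire format, the "Install" method, and the toy
in-process daemon are illustrative, not the real dnfdaemon API):

```python
import json
import socket
import threading

class DnfDaemonClient:
    """Hypothetical thin client for a daemonized dnf."""

    def __init__(self, sock):
        self.sock = sock  # an already-connected stream socket

    def call(self, method, **parameters):
        # One NUL-terminated JSON request, one NUL-terminated JSON reply.
        self.sock.sendall(json.dumps(
            {"method": method, "parameters": parameters}).encode() + b"\0")
        reply = b""
        while not reply.endswith(b"\0"):
            chunk = self.sock.recv(4096)
            if not chunk:
                raise ConnectionError("daemon closed the connection")
            reply += chunk
        return json.loads(reply[:-1])

def toy_daemon(sock):
    """Stand-in for the daemon side: acknowledge whatever it receives."""
    request = b""
    while not request.endswith(b"\0"):
        request += sock.recv(4096)
    method = json.loads(request[:-1])["method"]
    sock.sendall(json.dumps({"result": "ok", "method": method}).encode() + b"\0")

client_sock, daemon_sock = socket.socketpair()
t = threading.Thread(target=toy_daemon, args=(daemon_sock,))
t.start()
reply = DnfDaemonClient(client_sock).call("Install", names=["bash"])
t.join()
print(reply["result"])  # ok
```

Because the transaction runs on the daemon side of the socket, the client (a
UI like dnfdragora, or a terminal) can die without aborting it.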

Now, there are still problems with this model that I'm unclear whether a
varlink-enhanced dnf would solve:
* Can we survive events that cause all daemons to need to be restarted?
D-Bus doesn't handle this gracefully. I don't know if dbus-broker makes
this better...
* Can we operate in minimal contexts (where we don't have management
services or even daemons)?
* Can we deal well with requests for large quantities of information? One of
the bottlenecks in dnfdaemon that there's no great answer for is how to
deal with requesting _all available packages to install_.
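
One plausible answer to that last question is streamed replies: varlink lets a
server flag each reply as continuing, so a client can consume "all available
packages" page by page instead of buffering one enormous response. A sketch of
the server side (the package names and page size are illustrative):

```python
import json

def stream_package_list(packages, page_size=2):
    """Hypothetical server side: emit the package list as a sequence of
    varlink-style replies, each flagged 'continues' until the last one,
    so a client can render results incrementally."""
    for start in range(0, len(packages), page_size):
        page = packages[start:start + page_size]
        last = start + page_size >= len(packages)
        yield json.dumps({"parameters": {"packages": page},
                          "continues": not last})

available = ["bash", "coreutils", "dnf", "rpm", "zsh"]
replies = list(stream_package_list(available))
print(len(replies))                          # 3
print(json.loads(replies[-1])["continues"])  # False
```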

To be clear, I think we should _not_ abandon having native bindings for
libdnf, because those are critical for building purposeful applications.

真実はいつも一つ!/ Always, there's only one truth!
